{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Most examples work across multiple plotting backends; this example is also available for:\n", "\n", "* [Bokeh - radial_heatmap](../bokeh/radial_heatmap.ipynb)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "\n", "import holoviews as hv\n", "hv.extension(\"matplotlib\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Declaring data\n", "\n", "### NYC Taxi Data\n", "\n", "Let's dive into a concrete example, namely the New York City taxi data ([For-Hire Vehicle (“FHV”) records](http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml)). The following data contains hourly pickup counts for the entire year of 2016. \n", "\n", "**Considerations**: Thinking about taxi pickup counts, we might expect higher taxi usage during business hours. In addition, public holidays should be clearly distinguishable from regular business days. Furthermore, we might expect high taxi pickup counts during Friday and Saturday nights.\n", "\n", "**Design**: In order to model the above ideas, we assign days with an hourly split to the *radial segments* and the week of the year to the *annulars*. This will allow us to detect daily/hourly periodicity and weekly trends. 
To get you more familiar with the mapping of segments and annulars, take a look at the following radial heatmap:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# load example data\n", "df_nyc = pd.read_csv(\"../../../assets/nyc_taxi.csv.gz\", parse_dates=[\"Pickup_date\"])\n", "\n", "# create relevant time columns\n", "df_nyc[\"Day & Hour\"] = df_nyc[\"Pickup_date\"].dt.strftime(\"%A %H:00\")\n", "df_nyc[\"Week of Year\"] = df_nyc[\"Pickup_date\"].dt.strftime(\"Week %W\")\n", "df_nyc[\"Date\"] = df_nyc[\"Pickup_date\"].dt.strftime(\"%Y-%m-%d\")\n", "\n", "heatmap = hv.HeatMap(df_nyc, [\"Day & Hour\", \"Week of Year\"], [\"Pickup_Count\", \"Date\"])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Plot" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**At first glance**: First, let's take a closer look at the mentioned segments and annulars. **Segments** correspond to *hours of a given day* whereas **annulars** represent entire *weeks*. If a hover tool is available (as in the Bokeh backend), it quickly gives you an idea of how segments and annulars are organized. **Color** encodes the pickup values, with blue being low and red being high.\n", "\n", "**Plot improvements**: The above plot clearly shows systematic patterns; however, the default plot options are not ideal. Therefore, before we dive into the results, let's improve the readability of the plot:\n", "\n", "- **Remove annular ticks**: The information about the week of the year is not very important. Therefore, we hide it via `yticks=None`.\n", "- **Custom segment ticks**: Right now, segment labels are given as day and hour. We don't need the hourly information, but we want every day to be labeled. We can pass a tuple of labels via `xticks=(\"Friday\", ..., \"Thursday\")`.\n", "- **Add segment markers**: Moreover, we want to help the viewer distinguish each day more clearly. 
Hence, we can provide marker lines via `xmarks=7`.\n", "- **Rotate heatmap**: The week starts with Monday and ends with Sunday. Accordingly, we want to rotate the plot so that Sunday and Monday are at the top. This can be done via `start_angle=np.pi*19/14`. The default order is defined by the global sort order present in the data. The default starting angle is at 12 o'clock.\n", "\n", "Let's see the result of these modifications:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "heatmap.opts(\n", " radial=True, fig_size=300, yticks=None, xmarks=7, ymarks=3, start_angle=np.pi*19/14, \n", " xticks=(\"Friday\", \"Saturday\", \"Sunday\", \"Monday\", \"Tuesday\", \"Wednesday\", \"Thursday\"))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After tweaking the plot defaults, we're comfortable with the given visualization and can focus on the story the plot tells us.\n", "\n", "**There are many interesting findings in this visualization:**\n", "\n", "1. Taxi pickup counts are high between 7-9am and 5-10pm on weekdays, which matches business hours as expected. In contrast, on weekends there is not much going on until 11am. \n", "2. Friday and Saturday nights clearly stand out with the highest pickup densities, as expected. \n", "3. Public holidays can be easily identified. For example, taxi pickup counts are comparatively low around Christmas and Thanksgiving.\n", "4. Weather phenomena also influence taxi service. There is a very dark blue stripe at the beginning of the year, starting on Saturday, January 23rd and lasting until Sunday, January 24th. This coincides with one of the [biggest blizzards](https://www.weather.gov/okx/Blizzard_Jan2016) in the history of NYC." ] } ], "metadata": { "language_info": { "name": "python", "pygments_lexer": "ipython3" } }, "nbformat": 4, "nbformat_minor": 2 }