{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.9" }, "colab": { "name": "2021-06-25-altair-plot-part-4.ipynb", "provenance": [] } }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "J_ndgt-jp7La" }, "source": [ "# Altair Plot Part 4 - Multi-View Composition\n", "> A series of Altair plot examples\n", "\n", "- toc: true\n", "- badges: true\n", "- comments: true\n", "- categories: [altair, visualization]\n", "- image:" ] }, { "cell_type": "markdown", "metadata": { "id": "TCXpDyvdn_XX" }, "source": [ "When visualizing a number of different data fields, we might be tempted to use as many visual encoding channels as we can: `x`, `y`, `color`, `size`, `shape`, and so on. However, as the number of encoding channels increases, a chart can rapidly become cluttered and difficult to read. 
An alternative to \"over-loading\" a single chart is to instead _compose multiple charts_ in a way that facilitates rapid comparisons.\n", "\n", "In this notebook, we will examine a variety of operations for _multi-view composition_:\n", "\n", "- _layer_: place compatible charts directly on top of each other,\n", "- _facet_: partition data into multiple charts, organized in rows or columns,\n", "- _concatenate_: position arbitrary charts within a shared layout, and\n", "- _repeat_: take a base chart specification and apply it to multiple data fields.\n", "\n", "We'll then look at how these operations form a _view composition algebra_, in which the operations can be combined to build a variety of complex multi-view displays.\n", "\n", "_This notebook is part of the [data visualization curriculum](https://github.com/uwdata/visualization-curriculum)._" ] }, { "cell_type": "code", "metadata": { "id": "dIig-LFMn1DY" }, "source": [ "import pandas as pd\n", "import altair as alt" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "gaYNlguioJYK" }, "source": [ "## Weather Data\n", "\n", "We will be visualizing weather statistics for the U.S. cities of Seattle and New York. Let's load the dataset and peek at the first and last 10 rows:" ] }, { "cell_type": "code", "metadata": { "id": "r3obq-Ksn8W5" }, "source": [ "weather = 'https://cdn.jsdelivr.net/npm/vega-datasets@1/data/weather.csv'" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 359 }, "id": "NFKYk5waoSVd", "outputId": "49003cdc-647a-4a87-8c9b-293b3f7e153f" }, "source": [ "df = pd.read_csv(weather)\n", "df.head(10)" ], "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
" ], "text/plain": [ " location date precipitation temp_max temp_min wind weather\n", "0 Seattle 2012-01-01 0.0 12.8 5.0 4.7 drizzle\n", "1 Seattle 2012-01-02 10.9 10.6 2.8 4.5 rain\n", "2 Seattle 2012-01-03 0.8 11.7 7.2 2.3 rain\n", "3 Seattle 2012-01-04 20.3 12.2 5.6 4.7 rain\n", "4 Seattle 2012-01-05 1.3 8.9 2.8 6.1 rain\n", "5 Seattle 2012-01-06 2.5 4.4 2.2 2.2 rain\n", "6 Seattle 2012-01-07 0.0 7.2 2.8 2.3 rain\n", "7 Seattle 2012-01-08 0.0 10.0 2.8 2.0 sun\n", "8 Seattle 2012-01-09 4.3 9.4 5.0 3.4 rain\n", "9 Seattle 2012-01-10 1.0 6.1 0.6 3.4 rain" ] }, "metadata": { "tags": [] }, "execution_count": 3 } ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 359 }, "id": "myLKf7LWS2T4", "outputId": "d72e8414-f30b-47e8-8ac4-dde69462dbf4" }, "source": [ "df.tail(10)" ], "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
" ], "text/plain": [ " location date precipitation temp_max temp_min wind weather\n", "2912 New York 2015-12-22 4.8 15.6 11.1 3.8 fog\n", "2913 New York 2015-12-23 29.5 17.2 8.9 4.5 fog\n", "2914 New York 2015-12-24 0.5 20.6 13.9 4.9 fog\n", "2915 New York 2015-12-25 2.5 17.8 11.1 0.9 fog\n", "2916 New York 2015-12-26 0.3 15.6 9.4 4.8 drizzle\n", "2917 New York 2015-12-27 2.0 17.2 8.9 5.5 fog\n", "2918 New York 2015-12-28 1.3 8.9 1.7 6.3 snow\n", "2919 New York 2015-12-29 16.8 9.4 1.1 5.3 fog\n", "2920 New York 2015-12-30 9.4 10.6 5.0 3.0 fog\n", "2921 New York 2015-12-31 1.5 11.1 6.1 5.5 fog" ] }, "metadata": { "tags": [] }, "execution_count": 4 } ] }, { "cell_type": "markdown", "metadata": { "id": "TKyMsDCoPxiC" }, "source": [ "We will create multi-view displays to examine weather within and across the cities." ] }, { "cell_type": "markdown", "metadata": { "id": "KicCeq0Gpm_j" }, "source": [ "## Layer" ] }, { "cell_type": "markdown", "metadata": { "id": "gHAnMxaFoSmN" }, "source": [ "One of the most common ways of combining multiple charts is to *layer* marks on top of each other. If the underlying scale domains are compatible, we can merge them to form _shared axes_. If either of the `x` or `y` encodings is not compatible, we might instead create a _dual-axis chart_, which overlays marks using separate scales and axes." 
] }, { "cell_type": "markdown", "metadata": { "id": "Jc4hle6Npoij" }, "source": [ "### Shared Axes" ] }, { "cell_type": "markdown", "metadata": { "id": "nChT_olsQ8vX" }, "source": [ "Let's start by plotting the minimum and maximum average temperatures per month:" ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 331 }, "id": "iwgqOsfoyVe_", "outputId": "77cb9541-dc66-45ac-f06f-7d05ed093534" }, "source": [ "alt.Chart(weather).mark_area().encode(\n", " alt.X('month(date):T'),\n", " alt.Y('average(temp_max):Q'),\n", " alt.Y2('average(temp_min):Q')\n", ")" ], "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.Chart(...)" ] }, "metadata": { "tags": [] }, "execution_count": 5 } ] }, { "cell_type": "markdown", "metadata": { "id": "Bl1XIaeqSrzl" }, "source": [ "_The plot shows us temperature ranges for each month over the entirety of our data. However, this is pretty misleading as it aggregates the measurements for both Seattle and New York!_\n", "\n", "Let's subdivide the data by location using a color encoding, while also adjusting the mark opacity to accommodate overlapping areas:" ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 331 }, "id": "PbNkncWsRKrV", "outputId": "fb6da544-e1e7-477e-fedc-e1f0a3dc61b4" }, "source": [ "alt.Chart(weather).mark_area(opacity=0.3).encode(\n", " alt.X('month(date):T'),\n", " alt.Y('average(temp_max):Q'),\n", " alt.Y2('average(temp_min):Q'),\n", " alt.Color('location:N')\n", ")" ], "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.Chart(...)" ] }, "metadata": { "tags": [] }, "execution_count": 6 } ] }, { "cell_type": "markdown", "metadata": { "id": "j30cu2YmTW8g" }, "source": [ "_We can see that Seattle is more temperate: warmer in the winter, and cooler in the summer._\n", "\n", "In this case we've created a layered chart without any special features by simply subdividing the area marks by color. While the chart above shows us the temperature ranges, we might also want to emphasize the middle of the range.\n", "\n", "Let's create a line chart showing the average temperature midpoint. We'll use a `calculate` transform to compute the midpoints between the minimum and maximum daily temperatures:" ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 331 }, "id": "zHfuCUbyzW7q", "outputId": "257e2570-49f0-4502-aebb-b8b28aae3897" }, "source": [ "alt.Chart(weather).mark_line().transform_calculate(\n", " temp_mid='(+datum.temp_min + +datum.temp_max) / 2'\n", ").encode(\n", " alt.X('month(date):T'),\n", " alt.Y('average(temp_mid):Q'),\n", " alt.Color('location:N')\n", ")" ], "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.Chart(...)" ] }, "metadata": { "tags": [] }, "execution_count": 7 } ] }, { "cell_type": "markdown", "metadata": { "id": "fQ8IMV3IUfYg" }, "source": [ "_Aside_: note the use of `+datum.temp_min` within the calculate transform. As we are loading the data directly from a CSV file without any special parsing instructions, the temperature values may be internally represented as string values. Adding the `+` in front of the value forces it to be treated as a number.\n", "\n", "We'd now like to combine these charts by layering the midpoint lines over the range areas. Using the syntax `chart1 + chart2`, we can specify that we want a new layered chart in which `chart1` is the first layer and `chart2` is a second layer drawn on top:" ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 331 }, "id": "DLOAw_LBzcuW", "outputId": "dc6a8c55-45cf-4029-fa68-29ea0214589e" }, "source": [ "tempMinMax = alt.Chart(weather).mark_area(opacity=0.3).encode(\n", " alt.X('month(date):T'),\n", " alt.Y('average(temp_max):Q'),\n", " alt.Y2('average(temp_min):Q'),\n", " alt.Color('location:N')\n", ")\n", "\n", "tempMid = alt.Chart(weather).mark_line().transform_calculate(\n", " temp_mid='(+datum.temp_min + +datum.temp_max) / 2'\n", ").encode(\n", " alt.X('month(date):T'),\n", " alt.Y('average(temp_mid):Q'),\n", " alt.Color('location:N')\n", ")\n", "\n", "tempMinMax + tempMid" ], "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.LayerChart(...)" ] }, "metadata": { "tags": [] }, "execution_count": 8 } ] }, { "cell_type": "markdown", "metadata": { "id": "CrA84n7W2mUi" }, "source": [ "_Now we have a multi-layer plot! However, the y-axis title (though informative) has become a bit long and unruly..._\n", "\n", "Let's customize our axes to clean up the plot. If we set a custom axis title within one of the layers, it will automatically be used as a shared axis title for all the layers:" ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 331 }, "id": "LUyPrsgf2N-R", "outputId": "b261942a-7c09-4aab-c296-c0516e6343b6" }, "source": [ "tempMinMax = alt.Chart(weather).mark_area(opacity=0.3).encode(\n", " alt.X('month(date):T', title=None, axis=alt.Axis(format='%b')),\n", " alt.Y('average(temp_max):Q', title='Avg. Temperature °C'),\n", " alt.Y2('average(temp_min):Q'),\n", " alt.Color('location:N')\n", ")\n", "\n", "tempMid = alt.Chart(weather).mark_line().transform_calculate(\n", " temp_mid='(+datum.temp_min + +datum.temp_max) / 2'\n", ").encode(\n", " alt.X('month(date):T'),\n", " alt.Y('average(temp_mid):Q'),\n", " alt.Color('location:N')\n", ")\n", "\n", "tempMinMax + tempMid" ], "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.LayerChart(...)" ] }, "metadata": { "tags": [] }, "execution_count": 9 } ] }, { "cell_type": "markdown", "metadata": { "id": "jb405Vkd3dFh" }, "source": [ "_What happens if both layers have custom axis titles? Modify the code above to find out..._\n", "\n", "Above we used the `+` operator, a convenient shorthand for Altair's `layer` method. We can generate an identical layered chart using the `layer` method directly:" ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 331 }, "id": "aPnJVekc5iOE", "outputId": "07beb43c-9dd8-4cbc-9bb0-4fda3216ace8" }, "source": [ "alt.layer(tempMinMax, tempMid)" ], "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.LayerChart(...)" ] }, "metadata": { "tags": [] }, "execution_count": 10 } ] }, { "cell_type": "markdown", "metadata": { "id": "_QwjqCdm6crv" }, "source": [ "Note that the order of inputs to a layer matters, as subsequent layers will be drawn on top of earlier layers. _Try swapping the order of the charts in the cells above. What happens? (Hint: look closely at the color of the `line` marks.)_" ] }, { "cell_type": "markdown", "metadata": { "id": "WzJf0GAe1pug" }, "source": [ "### Dual-Axis Charts" ] }, { "cell_type": "markdown", "metadata": { "id": "C5hcdEdPZddR" }, "source": [ "_Seattle has a reputation as a rainy city. Is that deserved?_\n", "\n", "Let's look at precipitation alongside temperature to learn more. First, let's create a base plot that shows average monthly precipitation in Seattle:" ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 331 }, "id": "kD70HRtQaN-e", "outputId": "f98c00a3-a8a6-485f-d61d-7ea40bcb6f97" }, "source": [ "alt.Chart(weather).transform_filter(\n", " 'datum.location == \"Seattle\"'\n", ").mark_line(\n", " interpolate='monotone',\n", " stroke='grey'\n", ").encode(\n", " alt.X('month(date):T', title=None),\n", " alt.Y('average(precipitation):Q', title='Precipitation')\n", ")" ], "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.Chart(...)" ] }, "metadata": { "tags": [] }, "execution_count": 11 } ] }, { "cell_type": "markdown", "metadata": { "id": "nlcsyvEwaZjm" }, "source": [ "To facilitate comparison with the temperature data, let's create a new layered chart. Here's what happens if we try to layer the charts as we did earlier:" ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 331 }, "id": "eezzWWPR1tJ6", "outputId": "f22513cc-d5b7-4c7f-9e95-f73ff3dec020" }, "source": [ "tempMinMax = alt.Chart(weather).transform_filter(\n", " 'datum.location == \"Seattle\"'\n", ").mark_area(opacity=0.3).encode(\n", " alt.X('month(date):T', title=None, axis=alt.Axis(format='%b')),\n", " alt.Y('average(temp_max):Q', title='Avg. Temperature °C'),\n", " alt.Y2('average(temp_min):Q')\n", ")\n", "\n", "precip = alt.Chart(weather).transform_filter(\n", " 'datum.location == \"Seattle\"'\n", ").mark_line(\n", " interpolate='monotone',\n", " stroke='grey'\n", ").encode(\n", " alt.X('month(date):T'),\n", " alt.Y('average(precipitation):Q', title='Precipitation')\n", ")\n", "\n", "alt.layer(tempMinMax, precip)" ], "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.LayerChart(...)" ] }, "metadata": { "tags": [] }, "execution_count": 12 } ] }, { "cell_type": "markdown", "metadata": { "id": "jaRBJcj8bGsY" }, "source": [ "_The precipitation values use a much smaller range of the y-axis than the temperatures!_\n", "\n", "By default, layered charts use a *shared domain*: the values for the x-axis or y-axis are combined across all the layers to determine a shared extent. This default behavior assumes that the layered values have the same units. However, this doesn't hold up for this example, as we are combining temperature values (degrees Celsius) with precipitation values (inches)!\n", "\n", "If we want to use different y-axis scales, we need to specify how we want Altair to *resolve* the data across layers. In this case, we want to resolve the y-axis `scale` domains to be `independent` rather than use a `shared` domain. The `Chart` object produced by a layer operator includes a `resolve_scale` method with which we can specify the desired resolution:" ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 331 }, "id": "c2ZZBtzecCff", "outputId": "770e1942-479c-4ac4-bbac-94b853c39239" }, "source": [ "tempMinMax = alt.Chart(weather).transform_filter(\n", " 'datum.location == \"Seattle\"'\n", ").mark_area(opacity=0.3).encode(\n", " alt.X('month(date):T', title=None, axis=alt.Axis(format='%b')),\n", " alt.Y('average(temp_max):Q', title='Avg. Temperature °C'),\n", " alt.Y2('average(temp_min):Q')\n", ")\n", "\n", "precip = alt.Chart(weather).transform_filter(\n", " 'datum.location == \"Seattle\"'\n", ").mark_line(\n", " interpolate='monotone',\n", " stroke='grey'\n", ").encode(\n", " alt.X('month(date):T'),\n", " alt.Y('average(precipitation):Q', title='Precipitation')\n", ")\n", "\n", "alt.layer(tempMinMax, precip).resolve_scale(y='independent')" ], "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.LayerChart(...)" ] }, "metadata": { "tags": [] }, "execution_count": 13 } ] }, { "cell_type": "markdown", "metadata": { "id": "tZ7Pkv4GdG2f" }, "source": [ "_We can now see that autumn is the rainiest season in Seattle (peaking in November), complemented by dry summers._\n", "\n", "You may have noticed some redundancy in our plot specifications above: both use the same dataset and the same filter to look at Seattle only. If you want, you can streamline the code a bit by providing the data and filter transform to the top-level layered chart. The individual layers will then inherit the data if they don't have their own data definitions:" ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 331 }, "id": "3XY7hQQrarSk", "outputId": "7360b4b6-1b15-4bc8-87f4-79d40bf6152f" }, "source": [ "tempMinMax = alt.Chart().mark_area(opacity=0.3).encode(\n", " alt.X('month(date):T', title=None, axis=alt.Axis(format='%b')),\n", " alt.Y('average(temp_max):Q', title='Avg. Temperature °C'),\n", " alt.Y2('average(temp_min):Q')\n", ")\n", "\n", "precip = alt.Chart().mark_line(\n", " interpolate='monotone',\n", " stroke='grey'\n", ").encode(\n", " alt.X('month(date):T'),\n", " alt.Y('average(precipitation):Q', title='Precipitation')\n", ")\n", "\n", "alt.layer(tempMinMax, precip, data=weather).transform_filter(\n", " 'datum.location == \"Seattle\"'\n", ").resolve_scale(y='independent')" ], "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.LayerChart(...)" ] }, "metadata": { "tags": [] }, "execution_count": 14 } ] }, { "cell_type": "markdown", "metadata": { "id": "XeEefzafoSnA" }, "source": [ "While dual-axis charts can be useful, _they are often prone to misinterpretation_, as the different units and axis scales may be incommensurate. Where feasible, consider transformations that map different data fields to shared units, for example showing [quantiles](https://en.wikipedia.org/wiki/Quantile) or relative percentage change." ] }, { "cell_type": "markdown", "metadata": { "id": "2IefKqajprJd" }, "source": [ "## Facet" ] }, { "cell_type": "markdown", "metadata": { "id": "eyadekmFptag" }, "source": [ "*Faceting* involves subdividing a dataset into groups and creating a separate plot for each group. In earlier notebooks, we learned how to create faceted charts using the `row` and `column` encoding channels. We'll first review those channels and then show how they are instances of the more general `facet` operator.\n", "\n", "Let's start with a basic histogram of maximum temperature values in Seattle:" ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 372 }, "id": "sYjs5JvDcZ6R", "outputId": "c25f5b7a-ca1f-465b-fa1d-416f3a304f2b" }, "source": [ "alt.Chart(weather).mark_bar().transform_filter(\n", " 'datum.location == \"Seattle\"'\n", ").encode(\n", " alt.X('temp_max:Q', bin=True, title='Temperature (°C)'),\n", " alt.Y('count():Q')\n", ")" ], "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.Chart(...)" ] }, "metadata": { "tags": [] }, "execution_count": 15 } ] }, { "cell_type": "markdown", "metadata": { "id": "Za775__ccf4e" }, "source": [ "_How does this temperature profile change based on the weather of a given day – that is, whether there was drizzle, fog, rain, snow, or sun?_\n", "\n", "Let's use the `column` encoding channel to facet the data by weather type. We can also use `color` as a redundant encoding, using a customized color range:" ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 271 }, "id": "Q88OIqK5tm35", "outputId": "68b98e04-57c1-4ed4-960e-5919d0f597a4" }, "source": [ "colors = alt.Scale(\n", " domain=['drizzle', 'fog', 'rain', 'snow', 'sun'],\n", " range=['#aec7e8', '#c7c7c7', '#1f77b4', '#9467bd', '#e7ba52']\n", ")\n", "\n", "alt.Chart(weather).mark_bar().transform_filter(\n", " 'datum.location == \"Seattle\"'\n", ").encode(\n", " alt.X('temp_max:Q', bin=True, title='Temperature (°C)'),\n", " alt.Y('count():Q'),\n", " alt.Color('weather:N', scale=colors),\n", " alt.Column('weather:N')\n", ").properties(\n", " width=150,\n", " height=150\n", ")" ], "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.Chart(...)" ] }, "metadata": { "tags": [] }, "execution_count": 16 } ] }, { "cell_type": "markdown", "metadata": { "id": "lNspGpzQdlI8" }, "source": [ "_Unsurprisingly, those rare snow days center on the coldest temperatures, followed by rainy and foggy days. Sunny days are warmer and, despite Seattle stereotypes, are the most plentiful. Though as any Seattleite can tell you, the drizzle occasionally comes, no matter the temperature!_" ] }, { "cell_type": "markdown", "metadata": { "id": "PxQ1VwQjXJZt" }, "source": [ "In addition to `row` and `column` encoding channels *within* a chart definition, we can take a basic chart definition and apply faceting using an explicit `facet` operator.\n", "\n", "Let's recreate the chart above, but this time using `facet`. We start with the same basic histogram definition, but remove the data source, filter transform, and column channel. We can then invoke the `facet` method, passing in the data and specifying that we should facet into columns according to the `weather` field. The `facet` method accepts both `row` and `column` arguments. The two can be used together to create a 2D grid of faceted plots.\n", "\n", "Finally we include our filter transform, applying it to the top-level faceted chart. While we could apply the filter transform to the histogram definition as before, that is slightly less efficient. Rather than filter out \"New York\" values within each facet cell, applying the filter to the faceted chart lets Vega-Lite know that we can filter out those values up front, prior to the facet subdivision." 
] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 35 }, "id": "zbH63h3VWoxB", "outputId": "c0a8bcd2-08ec-4496-f572-7e476e7cb643" }, "source": [ "colors = alt.Scale(\n", " domain=['drizzle', 'fog', 'rain', 'snow', 'sun'],\n", " range=['#aec7e8', '#c7c7c7', '#1f77b4', '#9467bd', '#e7ba52']\n", ")\n", "\n", "alt.Chart().mark_bar().encode(\n", " alt.X('temp_max:Q', bin=True, title='Temperature (°C)'),\n", " alt.Y('count():Q'),\n", " alt.Color('weather:N', scale=colors)\n", ").properties(\n", " width=150,\n", " height=150\n", ").facet(\n", " data=weather,\n", " column='weather:N'\n", ").transform_filter(\n", " 'datum.location == \"Seattle\"'\n", ")" ], "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.FacetChart(...)" ] }, "metadata": { "tags": [] }, "execution_count": 17 } ] }, { "cell_type": "markdown", "metadata": { "id": "p-2HTw75YmxV" }, "source": [ "Given all the extra code above, why would we want to use an explicit `facet` operator? For basic charts, we should certainly use the `column` or `row` encoding channels if we can. However, using the `facet` operator explicitly is useful if we want to facet composed views, such as layered charts.\n", "\n", "Let's revisit our layered temperature plots from earlier. Instead of plotting data for New York and Seattle in the same plot, let's break them up into separate facets. The individual chart definitions are nearly the same as before: one area chart and one line chart. The only difference is that this time we won't pass the data directly to the chart constructors; we'll wait and pass it to the facet operator later. We can layer the charts much as before, then invoke `facet` on the layered chart object, passing in the data and specifying `column` facets based on the `location` field:" ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 404 }, "id": "DXPaA6CylJQE", "outputId": "29cd36bb-babc-4b4b-8820-75175feb7d23" }, "source": [ "tempMinMax = alt.Chart().mark_area(opacity=0.3).encode(\n", " alt.X('month(date):T', title=None, axis=alt.Axis(format='%b')),\n", " alt.Y('average(temp_max):Q', title='Avg. 
Temperature (°C)'),\n", " alt.Y2('average(temp_min):Q'),\n", " alt.Color('location:N')\n", ")\n", "\n", "tempMid = alt.Chart().mark_line().transform_calculate(\n", " temp_mid='(+datum.temp_min + +datum.temp_max) / 2'\n", ").encode(\n", " alt.X('month(date):T'),\n", " alt.Y('average(temp_mid):Q'),\n", " alt.Color('location:N')\n", ")\n", "\n", "alt.layer(tempMinMax, tempMid).facet(\n", " data=weather,\n", " column='location:N'\n", ")" ], "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.FacetChart(...)" ] }, "metadata": { "tags": [] }, "execution_count": 18 } ] }, { "cell_type": "markdown", "metadata": { "id": "dH3laH2hZ_Er" }, "source": [ "The faceted charts we have seen so far use the same axis scale domains across the facet cells. This default of using *shared* scales and axes aids accurate comparison of values. However, in some cases you may wish to scale each chart independently, for example if the range of values in the cells differs significantly.\n", "\n", "Similar to layered charts, faceted charts also support _resolving_ to independent scales or axes across plots. Let's see what happens if we call the `resolve_axis` method to request `independent` y-axes:" ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 35 }, "id": "tdkplJytl4vh", "outputId": "602817e3-0824-44ef-dc5a-3e2ced485cfa" }, "source": [ "tempMinMax = alt.Chart().mark_area(opacity=0.3).encode(\n", " alt.X('month(date):T', title=None, axis=alt.Axis(format='%b')),\n", " alt.Y('average(temp_max):Q', title='Avg. Temperature (°C)'),\n", " alt.Y2('average(temp_min):Q'),\n", " alt.Color('location:N')\n", ")\n", "\n", "tempMid = alt.Chart().mark_line().transform_calculate(\n", " temp_mid='(+datum.temp_min + +datum.temp_max) / 2'\n", ").encode(\n", " alt.X('month(date):T'),\n", " alt.Y('average(temp_mid):Q'),\n", " alt.Color('location:N')\n", ")\n", "\n", "alt.layer(tempMinMax, tempMid).facet(\n", " data=weather,\n", " column='location:N'\n", ").resolve_axis(y='independent')" ], "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.FacetChart(...)" ] }, "metadata": { "tags": [] }, "execution_count": 19 } ] }, { "cell_type": "markdown", "metadata": { "id": "J9UK1nYKa2hM" }, "source": [ "_The chart above looks largely unchanged, but the plot for Seattle now includes its own axis._\n", "\n", "What if we instead call `resolve_scale` to resolve the underlying scale domains?" ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 416 }, "id": "VjRyVRVfbCll", "outputId": "3b140df2-76f9-4f6d-df74-b38cb7d6e9f5" }, "source": [ "tempMinMax = alt.Chart().mark_area(opacity=0.3).encode(\n", " alt.X('month(date):T', title=None, axis=alt.Axis(format='%b')),\n", " alt.Y('average(temp_max):Q', title='Avg. Temperature (°C)'),\n", " alt.Y2('average(temp_min):Q'),\n", " alt.Color('location:N')\n", ")\n", "\n", "tempMid = alt.Chart().mark_line().transform_calculate(\n", " temp_mid='(+datum.temp_min + +datum.temp_max) / 2'\n", ").encode(\n", " alt.X('month(date):T'),\n", " alt.Y('average(temp_mid):Q'),\n", " alt.Color('location:N')\n", ")\n", "\n", "alt.layer(tempMinMax, tempMid).facet(\n", " data=weather,\n", " column='location:N'\n", ").resolve_scale(y='independent')" ], "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.FacetChart(...)" ] }, "metadata": { "tags": [] }, "execution_count": 20 } ] }, { "cell_type": "markdown", "metadata": { "id": "m6snDsbnbFho" }, "source": [ "_Now we see facet cells with different axis scale domains. In this case, using independent scales seems like a bad idea! The domains aren't very different, and one might be fooled into thinking that New York and Seattle have similar maximum summer temperatures._\n", "\n", "To borrow a cliché: just because you *can* do something, doesn't mean you *should*..." ] }, { "cell_type": "markdown", "metadata": { "id": "sDyHxrLXpubO" }, "source": [ "## Concatenate" ] }, { "cell_type": "markdown", "metadata": { "id": "kq4iE1w9pxMl" }, "source": [ "Faceting creates [small multiple](https://en.wikipedia.org/wiki/Small_multiple) plots that show separate subdivisions of the data. However, we might wish to create a multi-view display with different views of the *same* dataset (not subsets) or views involving *different* datasets.\n", "\n", "Altair provides *concatenation* operators to combine arbitrary charts into a composed chart. The `hconcat` operator (shorthand `|` ) performs horizontal concatenation, while the `vconcat` operator (shorthand `&`) performs vertical concatenation." ] }, { "cell_type": "markdown", "metadata": { "id": "c_w_BsgFhta8" }, "source": [ "Let's start with a basic line chart showing the average maximum temperature per month for both New York and Seattle, much like we've seen before:" ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 355 }, "id": "KAkDG69Xhnr_", "outputId": "d97fa576-5791-4912-cf3a-934d12c685d4" }, "source": [ "alt.Chart(weather).mark_line().encode(\n", " alt.X('month(date):T', title=None),\n", " alt.Y('average(temp_max):Q'),\n", " color='location:N'\n", ")" ], "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.Chart(...)" ] }, "metadata": { "tags": [] }, "execution_count": 21 } ] }, { "cell_type": "markdown", "metadata": { "id": "6-6CLOxgh88q" }, "source": [ "_What if we want to compare not just temperature over time, but also precipitation and wind levels?_\n", "\n", "Let's create a concatenated chart consisting of three plots. We'll start by defining a \"base\" chart definition that contains all the aspects that should be shared by our three plots. We can then modify this base chart to create customized variants, with different y-axis encodings for the `temp_max`, `precipitation`, and `wind` fields. We can then concatenate them using the pipe (`|`) shorthand operator:" ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 357 }, "id": "crZTVb-FpHDk", "outputId": "6394136a-5be8-45b4-f610-744cf4cf70cb" }, "source": [ "base = alt.Chart(weather).mark_line().encode(\n", " alt.X('month(date):T', title=None),\n", " color='location:N'\n", ").properties(\n", " width=240,\n", " height=180\n", ")\n", "\n", "temp = base.encode(alt.Y('average(temp_max):Q'))\n", "precip = base.encode(alt.Y('average(precipitation):Q'))\n", "wind = base.encode(alt.Y('average(wind):Q'))\n", "\n", "temp | precip | wind" ], "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.HConcatChart(...)" ] }, "metadata": { "tags": [] }, "execution_count": 22 } ] }, { "cell_type": "markdown", "metadata": { "id": "YyJN3OIdinC-" }, "source": [ "Alternatively, we could use the more explicit `alt.hconcat()` function in lieu of the pipe `|` operator. _Try rewriting the code above to use `hconcat` instead._\n", "\n", "Vertical concatenation works similarly to horizontal concatenation. _Using the `&` operator (or the `alt.vconcat` function), modify the code to use a vertical ordering instead of a horizontal ordering._\n", "\n", "Finally, note that horizontal and vertical concatenation can be combined. _What happens if you write something like `(temp | precip) & wind`?_\n", "\n", "_Aside_: Note the importance of those parentheses... what happens if you remove them? Keep in mind that these overloaded operators are still subject to [Python's operator precedence rules](https://docs.python.org/3/reference/expressions.html#operator-precedence), and so vertical concatenation with `&` will take precedence over horizontal concatenation with `|`!\n", "\n", "As we will revisit later, concatenation operators let you combine any and all charts into a multi-view dashboard!" ] }, { "cell_type": "markdown", "metadata": { "id": "Dmt0ai7NpyJp" }, "source": [ "## Repeat" ] }, { "cell_type": "markdown", "metadata": { "id": "thajPm3Mp0yV" }, "source": [ "The concatenation operators above are quite general, allowing arbitrary charts to be composed. Nevertheless, the example above was still a bit verbose: we have three very similar charts, yet have to define them separately and then concatenate them.\n", "\n", "For cases where only one or two variables are changing, the `repeat` operator provides a convenient shortcut for creating multiple charts.
Given a *template* specification with some free variables, the repeat operator will then create a chart for each specified assignment to those variables.\n", "\n", "Let's recreate our concatenation example above using the `repeat` operator. The only aspect that changes across charts is the choice of data field for the `y` encoding channel. To create a template specification, we can use the *repeater variable* `alt.repeat('column')` as our y-axis field. This code simply states that we want to use the variable assigned to the `column` repeater, which organizes repeated charts in a horizontal direction. (As the repeater provides the field name only, we have to specify the field data type separately as `type='quantitative'`.)\n", "\n", "We then invoke the `repeat` method, passing in data field names for each column:" ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 357 }, "id": "Pp3X3Aa0oc28", "outputId": "3c27963c-cd25-42a5-ef61-99e22f4508e5" }, "source": [ "alt.Chart(weather).mark_line().encode(\n", " alt.X('month(date):T', title=None),\n", " alt.Y(alt.repeat('column'), aggregate='average', type='quantitative'),\n", " color='location:N'\n", ").properties(\n", " width=240,\n", " height=180\n", ").repeat(\n", " column=['temp_max', 'precipitation', 'wind']\n", ")" ], "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.RepeatChart(...)" ] }, "metadata": { "tags": [] }, "execution_count": 23 } ] }, { "cell_type": "markdown", "metadata": { "id": "WeO93wdcm-kl" }, "source": [ "Repetition is supported for both columns and rows. _What happens if you modify the code above to use `row` instead of `column`?_\n", "\n", "We can also use `row` and `column` repetition together! One common visualization for exploratory data analysis is the [scatter plot matrix (or SPLOM)](https://en.wikipedia.org/wiki/Scatter_plot#Scatterplot_matrices). Given a collection of variables to inspect, a SPLOM provides a grid of all pairwise plots of those variables, allowing us to assess potential associations.\n", "\n", "Let's use the `repeat` operator to create a SPLOM for the `temp_max`, `precipitation`, and `wind` fields. We first create our template specification, with repeater variables for both the x- and y-axis data fields. We then invoke `repeat`, passing in arrays of field names to use for both `row` and `column`. Altair will then generate the [cross product (or, Cartesian product)](https://en.wikipedia.org/wiki/Cartesian_product) to create the full space of repeated charts:" ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 630 }, "id": "pEzPoVN_mtJr", "outputId": "e0589555-41c5-4cb4-dd78-7653ecc841b9" }, "source": [ "alt.Chart().mark_point(filled=True, size=15, opacity=0.5).encode(\n", " alt.X(alt.repeat('column'), type='quantitative'),\n", " alt.Y(alt.repeat('row'), type='quantitative')\n", ").properties(\n", " width=150,\n", " height=150\n", ").repeat(\n", " data=weather,\n", " row=['temp_max', 'precipitation', 'wind'],\n", " column=['wind', 'precipitation', 'temp_max']\n", ").transform_filter(\n", " 'datum.location == \"Seattle\"'\n", ")" ], "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.RepeatChart(...)" ] }, "metadata": { "tags": [] }, "execution_count": 24 } ] }, { "cell_type": "markdown", "metadata": { "id": "zwemGUsXp1m8" }, "source": [ "_Looking at these plots, there does not appear to be a strong association between precipitation and wind, though we do see that extreme wind and precipitation events occur in similar temperature ranges (~5-15° C). However, this observation is not particularly surprising: if we revisit our histogram at the beginning of the facet section, we can plainly see that the days with maximum temperatures in the range of 5-15° C are the most commonly occurring._\n", "\n", "*Modify the code above to get a better understanding of chart repetition. Try adding another variable (`temp_min`) to the SPLOM. What happens if you rearrange the order of the field names in either the `row` or `column` parameters for the `repeat` operator?*\n", "\n", "_Finally, to really appreciate what the `repeat` operator provides, take a moment to imagine how you might recreate the SPLOM above using only `hconcat` and `vconcat`!_" ] }, { "cell_type": "markdown", "metadata": { "id": "bNGvvh6dp2Ba" }, "source": [ "## A View Composition Algebra" ] }, { "cell_type": "markdown", "metadata": { "id": "MxKKfCjX44Dn" }, "source": [ "Together, the composition operators `layer`, `facet`, `concat`, and `repeat` form a *view composition algebra*: the various operators can be combined to construct a variety of multi-view visualizations.\n", "\n", "As an example, let's start with two basic charts: a histogram and a simple line (a single `rule` mark) showing a global average." 
] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 331 }, "id": "JG_OLPpR5yUN", "outputId": "6776832d-c756-4936-e40d-954c18d3630b" }, "source": [ "basic1 = alt.Chart(weather).transform_filter(\n", " 'datum.location == \"Seattle\"'\n", ").mark_bar().encode(\n", " alt.X('month(date):O'),\n", " alt.Y('average(temp_max):Q')\n", ")\n", "\n", "basic2 = alt.Chart(weather).transform_filter(\n", " 'datum.location == \"Seattle\"'\n", ").mark_rule(stroke='firebrick').encode(\n", " alt.Y('average(temp_max):Q')\n", ")\n", "\n", "basic1 | basic2" ], "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.HConcatChart(...)" ] }, "metadata": { "tags": [] }, "execution_count": 25 } ] }, { "cell_type": "markdown", "metadata": { "id": "w2LRQzyi6SVs" }, "source": [ "We can then combine the two charts using a `layer` operator, and then `repeat` that layered chart to show histograms with overlaid averages for multiple fields:" ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 181 }, "id": "oi06I8yk0EUj", "outputId": "8f74c396-ae7d-462f-a156-427c14eda87f" }, "source": [ "alt.layer(\n", " alt.Chart().mark_bar().encode(\n", " alt.X('month(date):O', title='Month'),\n", " alt.Y(alt.repeat('column'), aggregate='average', type='quantitative')\n", " ),\n", " alt.Chart().mark_rule(stroke='firebrick').encode(\n", " alt.Y(alt.repeat('column'), aggregate='average', type='quantitative')\n", " )\n", ").properties(\n", " width=200,\n", " height=150\n", ").repeat(\n", " data=weather,\n", " column=['temp_max', 'precipitation', 'wind']\n", ").transform_filter(\n", " 'datum.location == \"Seattle\"'\n", ")" ], "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.RepeatChart(...)" ] }, "metadata": { "tags": [] }, "execution_count": 26 } ] }, { "cell_type": "markdown", "metadata": { "id": "cjW4vHVtp8mP" }, "source": [ "Focusing only on the multi-view composition operators, the model for the visualization above is:\n", "\n", "```\n", "repeat(column=[...])\n", "|- layer\n", " |- basic1\n", " |- basic2\n", "```\n", "\n", "Now let's explore how we can apply *all* the operators within a final [dashboard](https://en.wikipedia.org/wiki/Dashboard_%28business%29) that provides an overview of Seattle weather. We'll combine the SPLOM and faceted histogram displays from earlier sections with the repeated histograms above:" ] }, { "cell_type": "code", "metadata": { "id": "8_ayEGf9oSni", "outputId": "fa08c27c-01c2-4ee3-f76a-960ea7bbb96d" }, "source": [ "splom = alt.Chart().mark_point(filled=True, size=15, opacity=0.5).encode(\n", " alt.X(alt.repeat('column'), type='quantitative'),\n", " alt.Y(alt.repeat('row'), type='quantitative')\n", ").properties(\n", " width=125,\n", " height=125\n", ").repeat(\n", " row=['temp_max', 'precipitation', 'wind'],\n", " column=['wind', 'precipitation', 'temp_max']\n", ")\n", "\n", "dateHist = alt.layer(\n", " alt.Chart().mark_bar().encode(\n", " alt.X('month(date):O', title='Month'),\n", " alt.Y(alt.repeat('row'), aggregate='average', type='quantitative')\n", " ),\n", " alt.Chart().mark_rule(stroke='firebrick').encode(\n", " alt.Y(alt.repeat('row'), aggregate='average', type='quantitative')\n", " )\n", ").properties(\n", " width=175,\n", " height=125\n", ").repeat(\n", " row=['temp_max', 'precipitation', 'wind']\n", ")\n", "\n", "tempHist = alt.Chart(weather).mark_bar().encode(\n", " alt.X('temp_max:Q', bin=True, title='Temperature (°C)'),\n", " alt.Y('count():Q'),\n", " alt.Color('weather:N', scale=alt.Scale(\n", " domain=['drizzle', 'fog', 'rain', 'snow', 'sun'],\n", " range=['#aec7e8', '#c7c7c7', '#1f77b4', '#9467bd', '#e7ba52']\n", " ))\n", ").properties(\n", " 
width=115,\n", " height=100\n", ").facet(\n", " column='weather:N'\n", ")\n", "\n", "alt.vconcat(\n", " alt.hconcat(splom, dateHist),\n", " tempHist,\n", " data=weather,\n", " title='Seattle Weather Dashboard'\n", ").transform_filter(\n", " 'datum.location == \"Seattle\"'\n", ").resolve_legend(\n", " color='independent'\n", ").configure_axis(\n", " labelAngle=0\n", ")" ], "execution_count": null, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.VConcatChart(...)" ] }, "metadata": { "tags": [] }, "execution_count": 27 } ] }, { "cell_type": "markdown", "metadata": { "id": "2KeY0T5G79KO" }, "source": [ "The full composition model for this dashboard is:\n", "\n", "```\n", "vconcat\n", "|- hconcat\n", "| |- repeat(row=[...], column=[...])\n", "| | |- splom base chart\n", "| |- repeat(row=[...])\n", "| |- layer\n", "| |- dateHist base chart 1\n", "| |- dateHist base chart 2\n", "|- facet(column='weather')\n", " |- tempHist base chart\n", "```\n", "\n", "_Phew!_ The dashboard also includes a few customizations to improve the layout:\n", "\n", "- We adjust chart `width` and `height` properties to assist alignment and ensure the full visualization fits on the screen.\n", "- We add `resolve_legend(color='independent')` to ensure the color legend is associated directly with the colored histograms by temperature. Otherwise, the legend will resolve to the dashboard as a whole.\n", "- We use `configure_axis(labelAngle=0)` to ensure that no axis labels are rotated. This helps to ensure proper alignment among the scatter plots in the SPLOM and the histograms by month on the right.\n", "\n", "_Try removing or modifying any of these adjustments and see how the dashboard layout responds!_\n", "\n", "This dashboard can be reused to show data for other locations or from other datasets. _Update the dashboard to show weather patterns for New York instead of Seattle._" ] }, { "cell_type": "markdown", "metadata": { "id": "NZzvXj7c4BpD" }, "source": [ "## Summary\n", "\n", "For more details on multi-view composition, including control over sub-plot spacing and header labels, see the [Altair Compound Charts documentation](https://altair-viz.github.io/user_guide/compound_charts.html).\n", "\n", "Now that we've seen how to compose multiple views, we're ready to put them into action. In addition to statically presenting data, multiple views can enable interactive multi-dimensional exploration. 
For example, using _linked selections_ we can highlight points in one view and see the corresponding values highlighted in other views.\n", "\n", "In the next notebook, we'll examine how to author *interactive selections* for both individual plots and multi-view compositions." ] } ] }