{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "

Tutorial 7. Custom Interactivity

" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using hvPlot allows you to generate a number of different types of plot quickly from a standard API by building [HoloViews](https://holoviews.org) objects, as discussed in the previous notebook. These objects are rendered with Bokeh, which offers a number of standard ways to interact with your plot, such as panning and zooming tools.\n", "\n", "Many other modes of interactivity are possible when building an exploratory visualization (such as a dashboard) and these forms of interactivity cannot be achieved using hvPlot alone.\n", "\n", "In this notebook, we will drop down to the HoloViews level of representation to build a visualization directly that consists of linked plots that update when you interactivily select a particular earthquake with the mouse. The goal is to show how more sophisticated forms of interactivity can be built when needed, in a way that's fully compatible with all the examples shown in earlier sections." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First let us load our initial imports:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pathlib\n", "import numpy as np\n", "import pandas as pd\n", "import hvplot.pandas # noqa\n", "from holoviews.element import tiles" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And clean the data before filtering (for magnitude `>7`) and projecting to to Web Mercator as before:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%time\n", "df = pd.read_parquet(pathlib.Path('../data/earthquakes-projected.parq'))\n", "most_severe = df[df.mag >= 7]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Towards the end of the previous notebook we generated a scatter plot of earthquakes\n", "across the earth that had a magnitude `>7` overlaid on top of a map tile source:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "high_mag_quakes = most_severe.hvplot.points(x='easting', y='northing', c='mag', \n", " title='Earthquakes with magnitude >= 7')\n", "esri = tiles.EsriImagery().redim(x='easting', y='northing')\n", "esri * high_mag_quakes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And saw how this object is a HoloViews `Points` object:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(high_mag_quakes)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This object is an example of a HoloViews *Element*, which is an object that can display itself. These elements are *thin* wrappers around your data and the raw input data is always available on the `.data` attribute. For instance, we can look at the `head` of the `most_severe` `DataFrame` as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "high_mag_quakes.data.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will now learn a little more about `HoloViews` elements, including how to build them up from scratch so that we can control every aspect of them.\n", "\n", "### An Introduction to HoloViews Elements\n", "\n", "HoloViews elements are the atomic, visualizable components that can be\n", "rendered by a plotting library such as Bokeh. We don't actually need to use\n", "hvPlot to create these element objects: we can create them directly by\n", "importing HoloViews (and loading the extension if we have not loaded\n", "hvPlot):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import holoviews as hv\n", "hv.extension(\"bokeh\") # Optional here as we have already loaded hvplot.pandas" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can create our own example of a `Points` element. In the next cell we plot 100 points with independent normal distributions in the `x` and `y` directions:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "xs = np.random.randn(100)\n", "ys = np.random.randn(100)\n", "hv.Points((xs, ys))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that the axis labels are 'x' and 'y', the default *dimensions* for\n", "this element type. We can use a different set of dimensions along the x- and y-axis (say\n", "'weight' and 'height') and we can also associate additional `fitness` information with each point if we wish:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "xs = np.random.randn(100)\n", "ys = np.random.randn(100)\n", "fitness = np.random.randn(100)\n", "height_v_weight = hv.Points((xs, ys, fitness), ['weight', 'height'], 'fitness')\n", "height_v_weight" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can look at the printed representation of this object:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(height_v_weight)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here the printed representation shows the *key dimensions* that we specified in square brackets as `[weight,height]` and the additional *value dimension* `fitness` in parentheses as `(fitness)`. The *key dimensions* map to the axes and the *value dimensions* can be visually represented by other visual attributes as we shall see shortly.\n", "\n", "For more information an HoloViews dimensions, see this [user guide](http://holoviews.org/user_guide/Annotating_Data.html)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Exercise\n", "\n", "Visit the [HoloViews reference gallery](http://holoviews.org/reference/index.html) and browse\n", "the available set of elements. Pick an element type and try running\n", "one of the self-contained examples in the following cell." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Setting Visual Options\n", "\n", "The two `Points` elements above look quite different from the one returned by hvPlot showing the earthquake positions. This is because hvPlot makes use of the HoloViews *options system* to customize the visual representation of these element objects.\n", "\n", "Let us color the `height_v_weight` scatter by the fitness value and use a larger\n", "point size:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "height_v_weight.opts(color='fitness', size=8, colorbar=True, aspect='square')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Exercise\n", "\n", "Copy the line above into the next cell and try changing the points to\n", "'blue' or 'green' or another dimension of the data such as 'height' or 'weight'.\n", "\n", "Are the results what you expect?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The `help` system\n", "\n", "You can learn more about the `.opts` method and the HoloViews options\n", "system in the [corresponding user\n", "guide](http://holoviews.org/user_guide/Applying_Customizations.html). To\n", "easily learn about the available options from inside a notebook, you can\n", "use `hv.help` and inspect the 'Style Options'." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Commented as there is a lot of help output!\n", "# hv.help(hv.Scatter) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "At this point, we can have some insight to the sort of HoloViews object\n", "hvPlot is building behind the scenes for our earthquake example:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "esri * hv.Points(most_severe, ['easting', 'northing'], 'mag').opts(color='mag', size=8, aspect='equal')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Exercise\n", "\n", "Try using `hv.help` to inspect the options available for different element types such as the `Points` element used above. Copy the line above into the cell below and pick a `Points` option that makes sense to you and try using it in the `.opts` method.\n", "\n", "
(Hint)
\n", "\n", "If you can't decide on an option to pick, a good choice is `marker`. For instance, try:\n", "\n", " * `marker='+'` \n", " * `marker='d'`.\n", " \n", " HoloViews uses [matplotlib's conventions](https://matplotlib.org/3.1.0/api/markers_api.html) for specifying the various marker types. Try finding out which ones are support by Bokeh.\n", "\n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Custom interactivity for Elements\n", "\n", "When rasterization of the population density data via hvplot was first introduced, we saw that the HoloViews object returned was not an element but a *`DynamicMap`*.\n", "\n", "A `DynamicMap` enables custom interactivity beyond the Bokeh defaults by dynamically generating elements that get displayed and updated as the plot is interacted with.\n", "\n", "There is a counterpart to the `DynamicMap` that does not require a live Python server to be running, called the `HoloMap`. The `HoloMap`\n", "container will not be covered in the tutorial but you can learn more about them in the [containers user guide](http://holoviews.org/user_guide/Dimensioned_Containers.html).\n", "\n", "Now let us build a very simple `DynamicMap` that is driven by a *linked stream* (specifically a `PointerXY` stream) that represents the position of the cursor over the plot:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from holoviews import streams\n", "\n", "ellipse = hv.Ellipse(0, 0, 1) \n", "pointer = streams.PointerXY(x=0, y=0) # x=0 and y=0 are the initialized values\n", "\n", "def crosshair(x, y):\n", " return hv.HLine(y) * hv.VLine(x)\n", "\n", "ellipse * hv.DynamicMap(crosshair, streams=[pointer])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Try moving your mouse over the plot and you should see the crosshair\n", "follow your mouse position.\n", "\n", "The core concepts here are:\n", "\n", "* The plot shows an overlay built with the `*` operator introduced in\n", " the previous notebook.\n", "* There is a callback that returns this overlay that is built according\n", " to the supplied `x` and `y` arguments. A DynamicMap always contains a\n", " callback that returns a HoloViews object such as an `Element` or\n", " `Overlay`\n", "* These `x` and `y` arguments are supplied by the `PointerXY` stream\n", " that reflect the position of the mouse on the plot." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Exercise\n", "\n", "Look up the `Ellipse`, `HLine`, and `VLine` elements in the\n", "[HoloViews reference guide](http://holoviews.org/reference/index.html) and see\n", "if the definitions of these elements align with your initial intuitions.\n", "\n", "#### Exercise (additional)\n", "\n", "If you have time, try running one of the examples in the\n", "'Streams' section of the [HoloViews reference guide](http://holoviews.org/reference/index.html) in the cell below. All the examples in the reference guide should be relatively short and self-contained." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Selecting a particular earthquake with the mouse\n", "\n", "Now we only need two more concepts before we can set up the appropriate\n", "mechanism to select a particular earthquake on the hvPlot-generated\n", "Scatter plot we started with.\n", "\n", "First, we can attach a stream to an existing HoloViews element such as\n", "the earthquake distribution generated with hvplot:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "selection_stream = streams.Selection1D(source=high_mag_quakes)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next we need to enable the 'tap' tool on our Scatter to instruct Bokeh\n", "to enable the desired selection mechanism in the browser." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "high_mag_quakes.opts(tools=['tap'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The Bokeh default alpha of points which are unselected is going to be too low when we overlay these points on a tile source. We can use the HoloViews options system to pick a better default as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "hv.opts.defaults(hv.opts.Points(nonselection_alpha=0.4))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The tap tool is in the toolbar with the icon showing the concentric\n", "circles and plus symbol. If you enable this tool, you should be able to pick individual earthquakes above by tapping on them.\n", "\n", "Now we can make a DynamicMap that uses the stream we defined to show the index of the earthquake selected via the `hv.Text` element:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def labelled_callback(index):\n", " if len(index) == 0:\n", " return hv.Text(x=0,y=0, text='')\n", " first_index = index[0] # Pick only the first one if multiple are selected\n", " row = most_severe.iloc[first_index]\n", " text = '%d : %s' % (first_index, row.place)\n", " return hv.Text(x=row.easting, y=row.northing, text=text).opts(color='white')\n", "\n", "labeller = hv.DynamicMap(labelled_callback, streams=[selection_stream])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This labeller receives the index argument from the Selection1D stream\n", "which corresponds to the row of the original dataframe (`most_severe`)\n", "that was selected. This lets us present the index and place value using\n", "`hv.Text` which we then position at the corresponding latitude and\n", "longitude to label the chosen earthquake.\n", "\n", "Finally, we overlay this labeller `DynamicMap` over the original\n", "plot. Now by using the tap tool you can see the index number of an\n", "earthquake followed by the assigned place name:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "(esri * high_mag_quakes * labeller).opts(hv.opts.Points(tools=['tap', 'hover']))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Exercise\n", "\n", "Pick an earthquake point above and using the displayed index, display the corresponding row of the `most_severe` dataframe using the `.iloc` method in the following cell." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Building a linked earthquake visualizer\n", "\n", "Now we will build a visualization that achieves the following:\n", "\n", "* The user can select an earthquake with magnitude `>7` using the tap\n", " tool in the manner illustrated in the last section.\n", " \n", "* In addition to the existing label, we will add concentric circles to further highlight the\n", " selected earthquake location.\n", "\n", "* *All* earthquakes within 0.5 degrees of latitude and longitude of the\n", " selected earthquake (~50km) will then be used to supply data for two linked\n", " plots:\n", " \n", " 1. A histogram showing the distribution of magnitudes in the selected area.\n", " 2. A timeseries scatter plot showing the magnitudes of earthquakes over time in the selected area." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first step is to generate a concentric-circle marker using a similar approach to the `labeller` above. We can write a function that uses `Ellipse` to mark a particular earthquake and pass it to a `DynamicMap`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def mark_earthquake(index):\n", " if len(index) == 0:\n", " return hv.Overlay([])\n", " first_index = index[0] # Pick only the first one if multiple are selected\n", " row = most_severe.iloc[first_index]\n", " return (hv.Ellipse(row.easting, row.northing, 1.5e6) *\n", " hv.Ellipse(row.easting, row.northing, 3e6)).opts(\n", " hv.opts.Ellipse(color='white', alpha=0.5)\n", " )\n", "\n", "quake_marker = hv.DynamicMap(mark_earthquake, streams=[selection_stream])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can test this component by building an overlay of the `ESRI` tile source, the `>=7` magnitude points and `quake_marked`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "esri * high_mag_quakes.opts(tools=['tap']) * quake_marker" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that you may need to zoom in to your selected earthquake to see the\n", "localized, lower magnitude earthquakes around it." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Filtering earthquakes by location\n", "\n", "We wish to analyze the earthquakes that occur around a particular latitude and longitude. To do this we will define a function that given a latitude and longitude, returns the rows of a suitable dataframe that corresponding to earthquakes within 0.5 degrees of that position:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def earthquakes_around_point(df, lat, lon, degrees_dist=0.5):\n", " half_dist = degrees_dist / 2.0\n", " return df[((df['latitude'] - lat).abs() < half_dist) \n", " & ((df['longitude'] - lon).abs() < half_dist)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As it can be slow to filter our dataframes in this way, we can define the following function that can cache the result of filtering `df` (containing all earthquakes) based on an index pulled from the `most_severe` dataframe:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def index_to_selection(indices, cache={}):\n", " if not indices: \n", " return most_severe.iloc[[]]\n", " index = indices[0] # Pick only the first one if multiple are selected\n", " if index in cache: return cache[index]\n", " row = most_severe.iloc[index]\n", " selected_df = earthquakes_around_point(df, row.latitude, row.longitude)\n", " cache[index] = selected_df\n", " return selected_df " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The caching will be useful as we know both of our planned linked plots (i.e the histogram and scatter over time) make use of the same earthquake selection once a particular index is supplied from a user selection. This particular caching strategy is rather awkward (and leaks memory!) but it simple and will serve for the current example. A better approach to caching will be presented in the [Advanced Dashboards](./08_Advanced_Dashboards.ipynb) section of the tutorial." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Exercise\n", "\n", "Test the `index_to_selection` function above for the index you picked in the previous exercise. Note that the stream supplied a *list* of indices and that the function above only uses the first value given in that list. Do the selected rows look correct?:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Exercise\n", "\n", "Convince yourself that the selected earthquakes are within 0.5$^o$ distance of each other in both latitude and longitude." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
Hint
\n", "\n", "For a given `chosen` index, you can see the distance difference using the following code:\n", "\n", "```python\n", "chosen = 235\n", "delta_long = index_to_selection([chosen]).longitude.max() - index_to_selection([chosen]).longitude.min()\n", "delta_lat = index_to_selection([chosen]).latitude.max() - index_to_selection([chosen]).latitude.min()\n", "print(\"Difference in longitude: %s\" % delta_long)\n", "print(\"Difference in latitude: %s\" % delta_lat)\n", "```\n", "\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Linked plots\n", "\n", "So far we have overlayed the display updates on top of the existing\n", "spatial distribution of earthquakes. However, there is no requirement\n", "that the data is overlaid and we might want to simply attach an entirely\n", "new, derived plot that dynamically updates to the side.\n", "\n", "Using the same principles as we have already seen, we can define a\n", "`DynamicMap` that returns `Histogram` distributions of earthquake\n", "magnitude:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def histogram_callback(index):\n", " title = 'Distribution of all magnitudes within half a degree of selection'\n", " selected_df = index_to_selection(index)\n", " return selected_df.hvplot.hist(y='mag', bin_range=(0,10), bins=20, color='red', title=title)\n", "\n", "histogram = hv.DynamicMap(histogram_callback, streams=[selection_stream])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The only real difference in the approach here is that we can still use\n", "`.hvplot` to generate our elements instead of declaring the HoloViews\n", "elements explicitly. In this example, `.hvplot.hist` is used." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The exact same principles can be used to build the scatter callback and `temporal_distribution` `DynamicMap`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def scatter_callback(index):\n", " title = 'Temporal distribution of all magnitudes within half a degree of selection '\n", " selected_df = index_to_selection(index)\n", " return selected_df.hvplot.scatter('time', 'mag', color='green', title=title)\n", "\n", "temporal_distribution = hv.DynamicMap(scatter_callback, streams=[selection_stream])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Lastly, let us define a `DynamicMap` that draws a `VLine` to mark the time at which the selected earthquake occurs so we can see which tremors may have been aftershocks immediately after that major earthquake occurred:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def vline_callback(index):\n", " if not index:\n", " return hv.VLine(0).opts(alpha=0)\n", " row = most_severe.iloc[index[0]]\n", " return hv.VLine(row.name).opts(line_width=2, color='black')\n", "\n", "temporal_vline = hv.DynamicMap(vline_callback, streams=[selection_stream])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We now have all the pieces we need to build an interactive, linked visualization of earthquake data." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Exercise\n", "\n", "Test the `histogram_callback` and `scatter_callback` callback functions by supplying your chosen index, remembering that these functions require a list argument in the following cell." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Putting it together" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can combine the components we have already built as follows to create a dynamically updating plot together with an associated, linked histogram:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "((esri * high_mag_quakes.opts(tools=['tap']) * labeller * quake_marker)\n", " + histogram + temporal_distribution * temporal_vline).cols(1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We now have a custom interactive visualization that builds on the output of `hvplot` by making use of the underlying HoloViews objects that it generates." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Conclusion\n", "\n", "When exploring data it can be convenient to use the `.plot` API to quickly visualize a particular dataset. By calling `.hvplot` to generate different plots over the course of a session and then linking such plots together, it is possible to gradually build up a mental model of how a particular dataset is structured. \n", "\n", "In the workflow presented here, building such custom interaction is relatively quick and easy and does not involve throwing away prior code used to generate simpler plots. In the spirit of 'short cuts not dead ends', we can use the HoloViews-object output of `hvplot` that we used in our initial exploration to build rich visualizations with custom interaction to explore our data at a deeper level. \n", "\n", "These interactive visualizations not only allow for custom interactions beyond the scope of `hvplot` alone, but they can display visual annotations not offered by the `.plot` API. In particular, we can overlay our data on top of tile sources, generate interactive textual annotations, draw shapes such a circles, mark horizontal and vertical marker lines and much more. Using HoloViews you can build visualizations that allow you to directly interact with your data in a useful and intuitive manner." ] } ], "metadata": { "language_info": { "name": "python", "pygments_lexer": "ipython3" } }, "nbformat": 4, "nbformat_minor": 4 }