{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "

Tutorial 03. Customizing Visual Appearance

" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The previous tutorial focused on specifying elements and simple collections of them. This one explains how the visual appearance can be adjusted to bring out the most salient aspects of your data, or just to make the style match the overall theme of your document. We'll use data in Pandas, and HoloViews, Bokeh, and Matplotlib to display the results:\n", "\n", "
\n", "\n", "\n", "\n", "\n", "
\n", "\n", "HoloViews explicitly makes the distinction between **data** and **plotting options**, which allows annotating the data with semantic metadata before deciding how to visualize the data. It also allows rendering the same object using different plotting libraries, such as Bokeh or Matplotlib.\n", "\n", "\n", "## Preliminaries\n", "\n", "In the [annotating your data section](./02_Annotating_Data.ipynb), ``hv.extension('bokeh')`` was used at the start to load and activate the bokeh plotting extension. In this notebook, we will also briefly use [matplotlib](www.matplotlib.org) that will be loaded, but not yet activated, by listing it second:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import holoviews as hv\n", "from holoviews import opts, dim\n", "hv.extension('bokeh', 'matplotlib')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visualizing diamond data\n", "\n", "Let's find some interesting data to generate elements from, before we consider how to customize them. Here is a dataset of information about 50,000 individual diamonds (including their weight in carats and their cut, quality, and price), which provides some rich opportunities for visualization:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "diamonds = pd.read_csv('../data/diamonds.csv')\n", "diamonds.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One obvious thing to look at is the relationship between the mass of the diamonds given in 'carat' and their 'price'. 
Since the dataset is large, we will sample 5,000 random datapoints from the DataFrame and plot the 'carat' column against the 'price' as a Scatter:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "hv.Scatter(diamonds.sample(5000), 'carat', 'price')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There is clearly structure in this data, but the points are heavily overplotted and squashed into a small plot. To fix these problems, we can start customizing the appearance of this object using the HoloViews [options system](http://holoviews.org/user_guide/Customizing_Plots.html). Later in this tutorial series, we will see an alternative way of avoiding overplotting using Datashader in [Working_with_Large_Datasets](./10_Working_with_Large_Datasets.ipynb). \n", "\n", "\n", "### Setting options with `.opts`\n", "\n", "We noted that the data is too compressed in the x direction. Let us fix that by specifying the ``width`` option, and additionally spread the data out along the y-axis by enabling a log axis using the ``logy`` option. We will also enable the Bokeh 'hover' tool, which reveals more information about each datapoint:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "scatter = hv.Scatter(diamonds.sample(5000), \n", " 'carat', ['price', 'cut']).redim.label(carat='Carat (ct)',\n", " price='Price ($)')\n", "scatter.opts(width=600, logy=True, tools=['hover'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here you can see that it's still a plot of Price vs. 
Carat, but if you hover over a datapoint you can see that the 'cut' information is visible for each point as you visit it.\n", "\n", "The last line of the cell uses the `.opts` method to specify the ``width``, `logy` and `tools` options applied to the [``Scatter``](http://holoviews.org/reference/elements/bokeh/Scatter.html) object.\n", "\n", "In addition to specifying keywords directly on a single element, you can also make use of a convenient, tab-completable *options builder* (see the [user guide](http://holoviews.org/user_guide/Customizing_Plots.html) for details):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "scatter.opts(opts.Scatter(width=600, logy=True, tools=['hover']))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Exercise: Try inspecting some of the tab-completable keywords for Scatter elements\n", "# Note: You can see the available completions by pressing Tab inside opts." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Exercise: Try enabling the boolean show_grid plot option for the scatter above\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Aside: ``hv.help``\n", "\n", "Tab completion helps discover what keywords are available, but you can get more complete help using the ``hv.help`` utility. For instance, to learn more about the options for ``hv.Scatter`` run ``hv.help(hv.Scatter)``:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# hv.help(hv.Scatter)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Further customization\n", "\n", "The options applied earlier instructed HoloViews to build a plot 600 pixels wide, when rendered with the Bokeh plotting extension. 
Now let's specify that the Bokeh glyph should be colored by the 'cut' column using the 'Set1' colormap, and reduce the 'alpha' of the points so that overlapping points are easier to see:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "scatter.opts(color=dim('cut'), alpha=0.5, cmap='Set1')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note how the plot options applied above to ``scatter`` are remembered! The ``.opts`` method allows incremental customization of an object without storing those options on the object itself. Behind the scenes HoloViews has linked the specified keyword options to the ``scatter`` object via a hidden integer id attribute.\n", "\n", "Having used the ``.opts`` method on ``scatter`` again, we have now associated the 'color', 'alpha' and 'cmap' options with it. Unlike the `width` and `logy` options specified earlier, these options are defined by Bokeh and belong to the corresponding [scatter glyph](http://bokeh.pydata.org/en/latest/docs/user_guide/plotting.html#scatter-markers). 
See the HoloViews [user guide](http://holoviews.org/user_guide/Customizing_Plots.html) for more information on the difference between these two types of options (called 'plot' and 'style' options respectively).\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Exercise: Display scatter without any new options to verify it stays colored\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Exercise: Try setting the 'size' style option to 1\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Exercise: Try using an options builder and exploring some of the completions\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Switching to matplotlib\n", "\n", "Let us now view our scatter plot with matplotlib using the ``hv.output`` utility:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "hv.output(scatter, backend='matplotlib')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "All our options are gone! This is because the options are associated with the corresponding plotting extension: if you switch back to 'bokeh', the options will apply again. In general, options have to be specific to backends; e.g. 
the ``size`` style option accepted by Bokeh is called ``s`` in matplotlib.\n", "\n", "Let us briefly make matplotlib the default plotting extension using `hv.output` without specifying an object to customize:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "hv.output(backend='matplotlib') " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can apply the matplotlib specific options:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "selection = scatter.select(carat=(0, 3))\n", "selection.opts(aspect=4, fig_size=400, color='blue', s=4, alpha=0.2)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Exercise: Apply the color and alpha options as above, but to the matplotlib plot\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Changing the output format\n", "\n", "With the matplotlib plotting extension still active, we can use `hv.output` to specify that we want SVG output instead of PNG:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "hv.output(fig='svg') " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can generate an SVG image using a different set of matplotlib options:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "scatter.opts(aspect=4, fig_size=400, xrotation=70, color='green', s=10, marker='^')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Exercise: Verify for yourself that the output above is SVG and not PNG\n", "# You can do this by right-clicking above then selecting 'Open Image in a new Tab' (Chrome) or 'View Image' (Firefox)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Switching back to bokeh\n", "\n", "In previous releases of HoloViews, it was typical to switch to matplotlib in order 
to export to PNG or SVG, because Bokeh did not support these file formats. Since [Bokeh 0.12.6](https://bokeh.github.io/blog/2017/6/13/release-0-12-6/) we can easily use HoloViews to export Bokeh plots to a PNG file, as we will now demonstrate:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "hv.output(backend='bokeh')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By passing ``fmt='png'`` and ``filename='diamonds'`` to ``hv.save`` we can save the output to a PNG file before displaying the plot again:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#%%output fig='png' filename='diamonds'\n", "hv.save(scatter, fmt='png', filename='diamonds')\n", "scatter" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we have requested PNG format using ``fmt='png'`` and directed the output to diamonds.png using ``filename='diamonds'``:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ls *.png" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Bokeh also has some SVG support, but it is not yet exposed in HoloViews." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using ``group`` and ``label``" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The above examples showed how to customize by Element type, but HoloViews offers multiple additional levels of customization that should be sufficient to cover any purpose. 
For our last example, let us split our diamonds dataframe based on the clarity of the diamonds, selecting the lowest and highest clarity:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "low_clarity = diamonds[diamonds.clarity=='I1']\n", "high_clarity = diamonds[diamonds.clarity=='IF']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll now introduce the [``Spikes``](http://holoviews.org/reference/elements/bokeh/Spikes.html) element, and display it with a large width, a log x-axis and some modifications to the xticks. We can specify those options for all following [``Spikes``](http://holoviews.org/reference/elements/bokeh/Spikes.html) elements using the ``opts.defaults`` utility:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "opts.defaults(\n", " opts.Spikes(width=900, logx=True, xticks=8, xrotation=90))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This time we will visualize the same data as a dot graph by combining ``Scatter`` elements with the ``Spikes``, showing the distribution of prices in the low and high clarity groups. 
\n", "\n", "We can do this using the element ``group`` and ``label`` introduced in the [annotating your data](./02_Annotating_Data.ipynb) section as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "overlay = (hv.Spikes( low_clarity, 'price', 'carat', group='Diamonds', label='Low')\n", " * hv.Scatter( low_clarity, 'price', 'carat', group='Diamonds', label='Low')\n", " * hv.Spikes( high_clarity, 'price', 'carat', group='Diamonds', label='High')\n", " * hv.Scatter(high_clarity, 'price', 'carat', group='Diamonds', label='High'))\n", "\n", "overlay.opts(\n", " opts.Spikes('Diamonds.Low', color='blue'),\n", " opts.Spikes('Diamonds.High', color='red'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using the color option to distinguish between the two categories of data we can now see the clear difference between the two groups, showing that diamonds with a low clarity need to have much higher mass in carats to obtain the same price. Similar techniques can be used to provide arbitrarily specific customizations when needed." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Exercise: Remove the call to the .opts method above and observe the effect\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Exercise: Give the 'Low' clarity scatter points a black 'line_color'\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Optional Exercise: Try differentiating the two sets of spikes by group and not label\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Onwards\n", "\n", "We have now seen some of the ways you can customize the appearance of your visualizations. 
You can consult our [Customizing Plots](http://holoviews.org/user_guide/Customizing_Plots.html) user guide to learn about other approaches, including the notebook-specific magic syntax that was used in older versions of HoloViews, as well as how to clear options using the `.opts.clear` method.\n", "\n", "You may also wish to consult the extra [A1 Exploration with Containers](./A1_Exploration_with_Containers.ipynb) tutorial, which gives examples of how the appearance of elements can be customized when viewed in containers. In the next tutorial, [Working with Tabular Data](./04_Working_with_Tabular_Data.ipynb) we will see how to use the flexibility offered by HoloViews when working with tabular data." ] } ], "metadata": { "language_info": { "name": "python", "pygments_lexer": "ipython3" } }, "nbformat": 4, "nbformat_minor": 2 }