{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Creating Interactive Visualizations with Bokeh" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[Bokeh](http://bokeh.pydata.org) is a Python package for creating interactive, browser-based visualizations, and is well suited to \"big data\" applications.\n", "\n", "* Bindings can be (and have been) created for other languages.\n", "\n", "\n", "\n", "Bokeh allows users to create interactive HTML visualizations without writing JavaScript.\n", "\n", "Bokeh is a **language-based** visualization system. This allows for:\n", "\n", "* high-level commands for data binding, transformation, and interaction\n", "* low-level power for deep customization\n", "\n", "Bokeh philosophy: \n", "\n", "> Make a smart choice when it is possible to do so automatically, and expose low-level capabilities when it is not." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# How does Bokeh work?\n", "\n", "Bokeh writes to a custom-built HTML5 Canvas library, which affords it high performance. This allows it to integrate with other web tools, such as Google Maps.\n", "\n", "Bokeh plots are based on visual elements called **glyphs** that are bound to data objects." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Installation\n", "\n", "Bokeh can be installed via either `pip` or `conda` (if using Anaconda):\n", "\n", "    pip install bokeh\n", "    \n", "    conda install bokeh" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## A Simple Example\n", "\n", "First we'll import the `bokeh.plotting` module, which defines the graphical functions and primitives." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import bokeh.plotting as bk" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we'll tell Bokeh to display its plots directly into the notebook. 
This will cause all of the JavaScript and data to be embedded directly into the HTML of the notebook itself. (Bokeh can also output straight to HTML files, or use a server, which we'll look at later.)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
\n", " \n", " Loading BokehJS ...\n", "
" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/javascript": [ "\n", "(function(global) {\n", " function now() {\n", " return new Date();\n", " }\n", "\n", " var force = true;\n", "\n", " if (typeof (window._bokeh_onload_callbacks) === \"undefined\" || force === true) {\n", " window._bokeh_onload_callbacks = [];\n", " window._bokeh_is_loading = undefined;\n", " }\n", "\n", "\n", " \n", " if (typeof (window._bokeh_timeout) === \"undefined\" || force === true) {\n", " window._bokeh_timeout = Date.now() + 5000;\n", " window._bokeh_failed_load = false;\n", " }\n", "\n", " var NB_LOAD_WARNING = {'data': {'text/html':\n", " \"
\\n\"+\n", " \"

\\n\"+\n", " \"BokehJS does not appear to have successfully loaded. If loading BokehJS from CDN, this \\n\"+\n", " \"may be due to a slow or bad network connection. Possible fixes:\\n\"+\n", " \"

\\n\"+\n", " \"\\n\"+\n", " \"\\n\"+\n", " \"from bokeh.resources import INLINE\\n\"+\n", " \"output_notebook(resources=INLINE)\\n\"+\n", " \"\\n\"+\n", " \"
\"}};\n", "\n", " function display_loaded() {\n", " if (window.Bokeh !== undefined) {\n", " document.getElementById(\"057f099e-d583-4124-9b12-e15bb813acb1\").textContent = \"BokehJS successfully loaded.\";\n", " } else if (Date.now() < window._bokeh_timeout) {\n", " setTimeout(display_loaded, 100)\n", " }\n", " }\n", "\n", " function run_callbacks() {\n", " window._bokeh_onload_callbacks.forEach(function(callback) { callback() });\n", " delete window._bokeh_onload_callbacks\n", " console.info(\"Bokeh: all callbacks have finished\");\n", " }\n", "\n", " function load_libs(js_urls, callback) {\n", " window._bokeh_onload_callbacks.push(callback);\n", " if (window._bokeh_is_loading > 0) {\n", " console.log(\"Bokeh: BokehJS is being loaded, scheduling callback at\", now());\n", " return null;\n", " }\n", " if (js_urls == null || js_urls.length === 0) {\n", " run_callbacks();\n", " return null;\n", " }\n", " console.log(\"Bokeh: BokehJS not loaded, scheduling load and callback at\", now());\n", " window._bokeh_is_loading = js_urls.length;\n", " for (var i = 0; i < js_urls.length; i++) {\n", " var url = js_urls[i];\n", " var s = document.createElement('script');\n", " s.src = url;\n", " s.async = false;\n", " s.onreadystatechange = s.onload = function() {\n", " window._bokeh_is_loading--;\n", " if (window._bokeh_is_loading === 0) {\n", " console.log(\"Bokeh: all BokehJS libraries loaded\");\n", " run_callbacks()\n", " }\n", " };\n", " s.onerror = function() {\n", " console.warn(\"failed to load library \" + url);\n", " };\n", " console.log(\"Bokeh: injecting script tag for BokehJS library: \", url);\n", " document.getElementsByTagName(\"head\")[0].appendChild(s);\n", " }\n", " };var element = document.getElementById(\"057f099e-d583-4124-9b12-e15bb813acb1\");\n", " if (element == null) {\n", " console.log(\"Bokeh: ERROR: autoload.js configured with elementid '057f099e-d583-4124-9b12-e15bb813acb1' but no matching script tag was found. 
\")\n", " return false;\n", " }\n", "\n", " var js_urls = [\"https://cdn.pydata.org/bokeh/release/bokeh-0.12.4.min.js\", \"https://cdn.pydata.org/bokeh/release/bokeh-widgets-0.12.4.min.js\"];\n", "\n", " var inline_js = [\n", " function(Bokeh) {\n", " Bokeh.set_log_level(\"info\");\n", " },\n", " \n", " function(Bokeh) {\n", " \n", " document.getElementById(\"057f099e-d583-4124-9b12-e15bb813acb1\").textContent = \"BokehJS is loading...\";\n", " },\n", " function(Bokeh) {\n", " console.log(\"Bokeh: injecting CSS: https://cdn.pydata.org/bokeh/release/bokeh-0.12.4.min.css\");\n", " Bokeh.embed.inject_css(\"https://cdn.pydata.org/bokeh/release/bokeh-0.12.4.min.css\");\n", " console.log(\"Bokeh: injecting CSS: https://cdn.pydata.org/bokeh/release/bokeh-widgets-0.12.4.min.css\");\n", " Bokeh.embed.inject_css(\"https://cdn.pydata.org/bokeh/release/bokeh-widgets-0.12.4.min.css\");\n", " }\n", " ];\n", "\n", " function run_inline_js() {\n", " \n", " if ((window.Bokeh !== undefined) || (force === true)) {\n", " for (var i = 0; i < inline_js.length; i++) {\n", " inline_js[i](window.Bokeh);\n", " }if (force === true) {\n", " display_loaded();\n", " }} else if (Date.now() < window._bokeh_timeout) {\n", " setTimeout(run_inline_js, 100);\n", " } else if (!window._bokeh_failed_load) {\n", " console.log(\"Bokeh: BokehJS failed to load within specified timeout.\");\n", " window._bokeh_failed_load = true;\n", " } else if (force !== true) {\n", " var cell = $(document.getElementById(\"057f099e-d583-4124-9b12-e15bb813acb1\")).parents('.cell').data().cell;\n", " cell.output_area.append_execute_result(NB_LOAD_WARNING)\n", " }\n", "\n", " }\n", "\n", " if (window._bokeh_is_loading === 0) {\n", " console.log(\"Bokeh: BokehJS loaded, going straight to plotting\");\n", " run_inline_js();\n", " } else {\n", " load_libs(js_urls, function() {\n", " console.log(\"Bokeh: BokehJS plotting callback run at\", now());\n", " run_inline_js();\n", " });\n", " }\n", "}(this));" ] }, "metadata": {}, 
"output_type": "display_data" } ], "source": [ "bk.output_notebook()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we'll import NumPy and create some simple data." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "x = np.linspace(-6, 6, 100)\n", "y = np.random.normal(0.3*x, 1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we'll call Bokeh's `circle()` function to render a red circle at\n", "each of the points in x and y.\n", "\n", "We can immediately interact with the plot:\n", "\n", " * click-and-drag will pan the plot around.\n", " * Shift + mousewheel will zoom in and out\n" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "
\n", "
\n", "
\n", "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fig = bk.figure(plot_width=500, plot_height=500)\n", "fig.circle(x, y, color=\"red\")\n", "bk.show(fig)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Statistical Plots\n", "\n", "Let's try plotting multiple series on the same axes.\n", "\n", "First, we generate some data from an exponential distribution with mean $\\theta=1$." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "from scipy.stats import expon\n", "\n", "theta = 1\n", "\n", "measured = np.random.exponential(theta, 1000)\n", "hist, edges = np.histogram(measured, density=True, bins=50)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, create our figure, which is not displayed until we ask Bokeh to do so explicitly. We will customize the interactive toolbar and the background color." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "# background_fill was deprecated in Bokeh 0.11.0 in favor of background_fill_color\n", "fig = bk.figure(title=\"Exponential Distribution (θ=1)\", tools=\"save\",\n", "                 background_fill_color=\"#E8DDCB\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `quad` glyph displays axis-aligned rectangles with the given attributes." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
GlyphRenderer(
id = '20c9ae6a-394e-4fde-a960-6059fe8943c7', …)
data_source = ColumnDataSource(id='7201ee4a-71c8-4c83-be2d-eb780ff03d6d', ...),
glyph = Quad(id='34a1e24e-b91c-4511-b3cf-d7ac4e12272d', ...),
hover_glyph = None,
js_callbacks = {},
level = 'glyph',
name = None,
nonselection_glyph = Quad(id='f3cff073-2aff-4c63-95cc-1055a14ec073', ...),
selection_glyph = None,
tags = [],
visible = True,
x_range_name = 'default',
y_range_name = 'default')
\n", "\n" ], "text/plain": [ "GlyphRenderer(id='20c9ae6a-394e-4fde-a960-6059fe8943c7', ...)" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fig.quad(top=hist, bottom=0, left=edges[:-1], right=edges[1:], fill_color=\"#036564\", line_color=\"#033649\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, add lines showing the form of the probability density function (PDF) and cumulative distribution function (CDF)." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
GlyphRenderer(
id = 'e5173358-43d3-45df-9378-22fc75bc6bb4', …)
data_source = ColumnDataSource(id='60efc888-ba3b-4345-a3ad-fcfa7f14b636', ...),
glyph = Line(id='10174097-53cf-4018-a134-0f357a65836a', ...),
hover_glyph = None,
js_callbacks = {},
level = 'glyph',
name = None,
nonselection_glyph = Line(id='26370e7e-2a1a-409e-87d5-f3bb0fa273c2', ...),
selection_glyph = None,
tags = [],
visible = True,
x_range_name = 'default',
y_range_name = 'default')
\n", "\n" ], "text/plain": [ "GlyphRenderer(id='e5173358-43d3-45df-9378-22fc75bc6bb4', ...)" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = np.linspace(0, 10, 1000)\n", "fig.line(x, expon.pdf(x, scale=theta), line_color=\"#D95B43\", line_width=8, alpha=0.7, legend=\"PDF\")\n", "fig.line(x, expon.cdf(x, scale=theta), line_color=\"white\", line_width=2, alpha=0.7, legend=\"CDF\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, position the legend and display the complete plot." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "
\n", "
\n", "
\n", "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fig.legend.location = \"top_right\"\n", "\n", "bk.show(fig)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bar Plot Example\n", "\n", "Bokeh's core display model relies on *composing graphical primitives* which are bound to data series. A more sophisticated example demonstrates this idea.\n", "\n", "Bokeh ships with a small set of interesting \"sample data\" in the `bokeh.sampledata` package. We'll load up some historical automobile fuel efficiency data, which is returned as a [Pandas](http://pandas.pydata.org) `DataFrame`." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "from bokeh.sampledata.autompg import autompg" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We first need to reshape the data, by grouping it according to the year of the car, and then by the country of origin (here, USA or Japan)." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>mpg</th><th>cyl</th><th>displ</th><th>hp</th><th>weight</th><th>accel</th><th>yr</th><th>origin</th><th>name</th></tr>\n",
"  </thead>\n", "  <tbody>\n",
"    <tr><th>0</th><td>18.0</td><td>8</td><td>307.0</td><td>130</td><td>3504</td><td>12.0</td><td>70</td><td>1</td><td>chevrolet chevelle malibu</td></tr>\n",
"    <tr><th>1</th><td>15.0</td><td>8</td><td>350.0</td><td>165</td><td>3693</td><td>11.5</td><td>70</td><td>1</td><td>buick skylark 320</td></tr>\n",
"    <tr><th>2</th><td>18.0</td><td>8</td><td>318.0</td><td>150</td><td>3436</td><td>11.0</td><td>70</td><td>1</td><td>plymouth satellite</td></tr>\n",
"    <tr><th>3</th><td>16.0</td><td>8</td><td>304.0</td><td>150</td><td>3433</td><td>12.0</td><td>70</td><td>1</td><td>amc rebel sst</td></tr>\n",
"    <tr><th>4</th><td>17.0</td><td>8</td><td>302.0</td><td>140</td><td>3449</td><td>10.5</td><td>70</td><td>1</td><td>ford torino</td></tr>\n",
"    <tr><th>5</th><td>15.0</td><td>8</td><td>429.0</td><td>198</td><td>4341</td><td>10.0</td><td>70</td><td>1</td><td>ford galaxie 500</td></tr>\n",
"    <tr><th>6</th><td>14.0</td><td>8</td><td>454.0</td><td>220</td><td>4354</td><td>9.0</td><td>70</td><td>1</td><td>chevrolet impala</td></tr>\n",
"    <tr><th>7</th><td>14.0</td><td>8</td><td>440.0</td><td>215</td><td>4312</td><td>8.5</td><td>70</td><td>1</td><td>plymouth fury iii</td></tr>\n",
"    <tr><th>8</th><td>14.0</td><td>8</td><td>455.0</td><td>225</td><td>4425</td><td>10.0</td><td>70</td><td>1</td><td>pontiac catalina</td></tr>\n",
"    <tr><th>9</th><td>15.0</td><td>8</td><td>390.0</td><td>190</td><td>3850</td><td>8.5</td><td>70</td><td>1</td><td>amc ambassador dpl</td></tr>\n",
"  </tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " mpg cyl displ hp weight accel yr origin name\n", "0 18.0 8 307.0 130 3504 12.0 70 1 chevrolet chevelle malibu\n", "1 15.0 8 350.0 165 3693 11.5 70 1 buick skylark 320\n", "2 18.0 8 318.0 150 3436 11.0 70 1 plymouth satellite\n", "3 16.0 8 304.0 150 3433 12.0 70 1 amc rebel sst\n", "4 17.0 8 302.0 140 3449 10.5 70 1 ford torino\n", "5 15.0 8 429.0 198 4341 10.0 70 1 ford galaxie 500\n", "6 14.0 8 454.0 220 4354 9.0 70 1 chevrolet impala\n", "7 14.0 8 440.0 215 4312 8.5 70 1 plymouth fury iii\n", "8 14.0 8 455.0 225 4425 10.0 70 1 pontiac catalina\n", "9 15.0 8 390.0 190 3850 8.5 70 1 amc ambassador dpl" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "grouped = autompg.groupby(\"yr\")\n", "mpg = grouped[\"mpg\"]\n", "mpg_avg = mpg.mean()\n", "mpg_std = mpg.std()\n", "years = np.asarray(list(grouped.groups.keys()))\n", "american = autompg[autompg[\"origin\"]==1]\n", "japanese = autompg[autompg[\"origin\"]==3]\n", "\n", "american.head(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![Fury](https://c1.staticflickr.com/5/4022/4312399177_4f39f17a4b_z.jpg?zz=1)\n", "\n", "For each year, we want to plot the distribution of MPG within that year. As a guide, we will include a box that represents the mean efficiency, plus and minus one standard deviation. We will make these boxes partly transparent." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
GlyphRenderer(
id = '5c601201-74bf-45aa-a27f-69f145c19dcd', …)
data_source = ColumnDataSource(id='0f1d66fb-c067-49fe-ad68-a637d814efb5', ...),
glyph = Quad(id='5072a835-a2d5-48d4-bf6d-5a4ad9b9c9af', ...),
hover_glyph = None,
js_callbacks = {},
level = 'glyph',
name = None,
nonselection_glyph = Quad(id='39d062ac-cfd1-4307-81fb-fe9761fdc069', ...),
selection_glyph = None,
tags = [],
visible = True,
x_range_name = 'default',
y_range_name = 'default')
\n", "\n" ], "text/plain": [ "GlyphRenderer(id='5c601201-74bf-45aa-a27f-69f145c19dcd', ...)" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fig = bk.figure(title='Automobile mileage by year and country')\n", "\n", "fig.quad(left=years-0.4, right=years+0.4, bottom=mpg_avg-mpg_std, top=mpg_avg+mpg_std, fill_alpha=0.4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we overplot the actual data points, using contrasting symbols for American and Japanese cars." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
GlyphRenderer(
id = '2c770e3f-a44d-487b-bcf3-c09030cca8ab', …)
data_source = ColumnDataSource(id='d7117ad9-6a7e-4705-a6b6-9798d19991e9', ...),
glyph = Triangle(id='8559872f-fbcd-46bc-95e7-be5148260930', ...),
hover_glyph = None,
js_callbacks = {},
level = 'glyph',
name = None,
nonselection_glyph = Triangle(id='aa9ec016-642d-44d4-b540-c0cf273013b2', ...),
selection_glyph = None,
tags = [],
visible = True,
x_range_name = 'default',
y_range_name = 'default')
\n", "\n" ], "text/plain": [ "GlyphRenderer(id='2c770e3f-a44d-487b-bcf3-c09030cca8ab', ...)" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Add Japanese cars as circles\n", "fig.circle(x=np.asarray(japanese[\"yr\"]), \n", " y=np.asarray(japanese[\"mpg\"]), \n", " size=8, alpha=0.4, line_color=\"red\", fill_color=None, line_width=2)\n", "\n", "# Add American cars as triangles\n", "fig.triangle(x=np.asarray(american[\"yr\"]), \n", " y=np.asarray(american[\"mpg\"]),\n", " size=8, alpha=0.4, line_color=\"blue\", fill_color=None, line_width=2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can add axis labels by binding them to the `axis_label` attribute of each axis." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "fig.xaxis.axis_label = 'Year'\n", "fig.yaxis.axis_label = 'MPG'" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "
\n", "
\n", "
\n", "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "bk.show(fig)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Linked Brushing\n", "\n", "To link plots together at a data level, we can explicitly wrap the data in a ColumnDataSource. This allows us to reference columns by name." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "variables = autompg.to_dict(\"list\")\n", "variables.update({'yr':autompg[\"yr\"]})\n", "source = bk.ColumnDataSource(variables)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `gridplot` function takes a 2-dimensional list containing elements to be arranged in a grid on the same canvas." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
GlyphRenderer(
id = '8ef03dbe-0b2b-4d4d-b291-3e10fe744ed0', …)
data_source = ColumnDataSource(id='178281ab-52f5-426a-a06a-17653d4837db', ...),
glyph = Circle(id='0522c0ae-4ac0-485f-b18e-259a10f509c9', ...),
hover_glyph = None,
js_callbacks = {},
level = 'glyph',
name = None,
nonselection_glyph = Circle(id='1e18b98b-9a53-4f6c-a7cd-847cd570ea06', ...),
selection_glyph = None,
tags = [],
visible = True,
x_range_name = 'default',
y_range_name = 'default')
\n", "\n" ], "text/plain": [ "GlyphRenderer(id='8ef03dbe-0b2b-4d4d-b291-3e10fe744ed0', ...)" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "plot_config = dict(plot_width=300, plot_height=300, tools=\"box_select,lasso_select,help\")\n", "\n", "left = bk.figure(title=\"MPG by Year\", **plot_config)\n", "left.circle(\"yr\", \"mpg\", color=\"blue\", source=source)\n", "\n", "center = bk.figure(title=\"HP vs. Displacement\", **plot_config)\n", "center.circle(\"hp\", \"displ\", color=\"green\", source=source)\n", "\n", "right = bk.figure(title=\"MPG vs. Displacement\", **plot_config)\n", "right.circle(\"mpg\", \"displ\", size=\"cyl\", line_color=\"red\", source=source)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can use the `select` tool to select points on one plot, and the linked points on the other plots will *automagically* highlight." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "
\n", "
\n", "
\n", "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "p = bk.gridplot([[left, center, right]])\n", "bk.show(p)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visualization of US unemployment rates\n", "\n", "Our first example of an interactive chart involves generating a heat map of US unemployment rates by month and year. This plot will be made interactive by invoking a `HoverTool` that displays information as the pointer hovers over any cell within the plot." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, we import the data with Pandas and manipulate it as needed." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "from bokeh.models import HoverTool\n", "from bokeh.sampledata.unemployment1948 import data\n", "from collections import OrderedDict\n", "\n", "data['Year'] = [str(x) for x in data['Year']]\n", "years = list(data['Year'])\n", "months = [\"Jan\",\"Feb\",\"Mar\",\"Apr\",\"May\",\"Jun\",\"Jul\",\"Aug\",\"Sep\",\"Oct\",\"Nov\",\"Dec\"]\n", "data = data.set_index('Year')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Specify a color map (where do we get color maps, you ask? -- Try [Color Brewer](http://colorbrewer2.org))" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "colors = [\n", " \"#75968f\", \"#a5bab7\", \"#c9d9d3\", \"#e2e2e2\", \"#dfccce\",\n", " \"#ddb7b1\", \"#cc7878\", \"#933b41\", \"#550b1d\"\n", "]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Set up the data for plotting. We will need to have values for every pair of year/month names. Map the rate to a color." 
] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "month = []\n", "year = []\n", "color = []\n", "rate = []\n", "for y in years:\n", "    for m in months:\n", "        month.append(m)\n", "        year.append(y)\n", "        monthly_rate = data[m][y]\n", "        rate.append(monthly_rate)\n", "        color.append(colors[min(int(monthly_rate)-2, 8)])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a `ColumnDataSource` with columns: month, year, color, rate" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [], "source": [ "source = bk.ColumnDataSource(\n", "    data=dict(\n", "        month=month,\n", "        year=year,\n", "        color=color,\n", "        rate=rate,\n", "    )\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a new figure." ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [], "source": [ "fig = bk.figure(plot_width=900, plot_height=400, x_axis_location=\"above\", tools=\"resize,hover\",\n", "                 x_range=years, y_range=list(reversed(months)), title=\"US Unemployment (1948 - 2013)\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Use the `rect` renderer with the following attributes:\n", "- `x_range` is years, `y_range` is months (reversed)\n", "- fill color for the rectangles is the 'color' field\n", "- `line_color` for the rectangles is `None`\n", "- tools are resize and hover tools\n", "- add a nice title, and set the `plot_width` and `plot_height`" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
GlyphRenderer(
id = '298eb6ec-6c51-48eb-9816-b113cfb70e07', …)
data_source = ColumnDataSource(id='c5209ebd-6802-4692-8400-4156fd3afa0d', ...),
glyph = Rect(id='cda00911-2878-440b-bd0b-bddd17b4df2b', ...),
hover_glyph = None,
js_callbacks = {},
level = 'glyph',
name = None,
nonselection_glyph = Rect(id='ae852aac-c476-417c-a2af-4c0903d9e2c5', ...),
selection_glyph = None,
tags = [],
visible = True,
x_range_name = 'default',
y_range_name = 'default')
\n", "\n" ], "text/plain": [ "GlyphRenderer(id='298eb6ec-6c51-48eb-9816-b113cfb70e07', ...)" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fig.rect('year', 'month', 0.95, 0.95, source=source,\n", "         color='color', line_color=None)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Style the plot, including:\n", "- remove the axis and grid lines\n", "- remove the major ticks\n", "- make the tick labels smaller\n", "- set the x-axis orientation to vertical, or angled" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [], "source": [ "fig.grid.grid_line_color = None\n", "fig.axis.axis_line_color = None\n", "fig.axis.major_tick_line_color = None\n", "fig.axis.major_label_text_font_size = \"5pt\"\n", "fig.axis.major_label_standoff = 0\n", "fig.xaxis.major_label_orientation = np.pi/3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Configure the hover tool to display the month, year and rate" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "# Configure the hover tool already attached to the figure via tools=\"resize,hover\";\n", "# a newly constructed HoverTool would not be attached to the plot.\n", "hover = fig.select(dict(type=HoverTool))\n", "hover.tooltips = OrderedDict([\n", "    ('date', '@month @year'),\n", "    ('rate', '@rate'),\n", "])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can display the plot. Try moving your pointer over different cells in the plot." 
] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "
\n", "
\n", "
\n", "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "bk.show(fig)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Similarly, we can provide a geographic heatmap, here using data just from Texas." ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Using data directory: /Users/fonnescj/.bokeh/data\n", "Downloading: CGM.csv (1589982 bytes)\n", " 1589982 [100.00%]\n", "Downloading: US_Counties.zip (3182088 bytes)\n", " 3182088 [100.00%]\n", "Unpacking: US_Counties.csv\n", "Downloading: us_cities.json (713565 bytes)\n", " 713565 [100.00%]\n", "Downloading: unemployment09.csv (253301 bytes)\n", " 253301 [100.00%]\n", "Downloading: AAPL.csv (166698 bytes)\n", " 166698 [100.00%]\n", "Downloading: FB.csv (9706 bytes)\n", " 9706 [100.00%]\n", "Downloading: GOOG.csv (113894 bytes)\n", " 113894 [100.00%]\n", "Downloading: IBM.csv (165625 bytes)\n", " 165625 [100.00%]\n", "Downloading: MSFT.csv (161614 bytes)\n", " 161614 [100.00%]\n", "Downloading: WPP2012_SA_DB03_POPULATION_QUINQUENNIAL.zip (5148539 bytes)\n", " 5148539 [100.00%]\n", "Unpacking: WPP2012_SA_DB03_POPULATION_QUINQUENNIAL.csv\n", "Downloading: gapminder_fertility.csv (64346 bytes)\n", " 64346 [100.00%]\n", "Downloading: gapminder_population.csv (94509 bytes)\n", " 94509 [100.00%]\n", "Downloading: gapminder_life_expectancy.csv (73243 bytes)\n", " 73243 [100.00%]\n", "Downloading: gapminder_regions.csv (7781 bytes)\n", " 7781 [100.00%]\n", "Downloading: world_cities.zip (646858 bytes)\n", " 646858 [100.00%]\n", "Unpacking: world_cities.csv\n", "Downloading: airports.json (6373 bytes)\n", " 6373 [100.00%]\n", "Downloading: movies.db.zip (5067833 bytes)\n", " 5067833 [100.00%]\n", "Unpacking: movies.db\n" ] } ], "source": [ "from bokeh.sampledata import download\n", "download()" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", 
"
\n", "
\n", "
\n", "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from bokeh.sampledata import us_counties, unemployment\n", "from collections import OrderedDict\n", "\n", "# Longitude and latitude values for county boundaries\n", "county_xs=[\n", "    us_counties.data[code]['lons'] for code in us_counties.data\n", "    if us_counties.data[code]['state'] == 'tx'\n", "]\n", "county_ys=[\n", "    us_counties.data[code]['lats'] for code in us_counties.data\n", "    if us_counties.data[code]['state'] == 'tx'\n", "]\n", "\n", "# Color palette from colorbrewer2.org\n", "colors = ['#ffffd4','#fee391','#fec44f','#fe9929','#d95f0e','#993404']\n", "\n", "# Assign colors based on unemployment\n", "county_colors = []\n", "for county_id in us_counties.data:\n", "    if us_counties.data[county_id]['state'] != 'tx':\n", "        continue\n", "    try:\n", "        rate = unemployment.data[county_id]\n", "        idx = min(int(rate/2), 5)\n", "        county_colors.append(colors[idx])\n", "    except KeyError:\n", "        county_colors.append(\"black\")\n", "    \n", "fig = bk.figure(tools=\"pan,wheel_zoom,box_zoom,reset,hover,save\", title=\"Texas Unemployment 2009\")\n", "\n", "# Here are the polygons for plotting\n", "fig.patches(county_xs, county_ys, fill_color=county_colors, fill_alpha=0.7, \n", "            line_color=\"white\", line_width=0.5)\n", "\n", "# Configure the hover tool already attached to the figure via the tools string\n", "hover = fig.select(dict(type=HoverTool))\n", "hover.tooltips = OrderedDict([\n", "    (\"index\", \"$index\"),\n", "    (\"(x,y)\", \"($x, $y)\"),\n", "    (\"fill color\", \"$color[hex, swatch]:fill_color\"),\n", "])\n", "\n", "\n", "bk.show(fig)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# High-level Plots\n", "\n", "The examples so far have been relatively low-level, in that individual elements of plots need to be specified by hand. 
The `bokeh.charts` interface makes it easy to get up and running with a high-level API that tries to make smart layout and design decisions by default.\n", "\n", "To use it, you simply import the chart type you need from `bokeh.charts`:\n", "\n", "* `Bar`\n", "* `BoxPlot`\n", "* `HeatMap`\n", "* `Histogram`\n", "* `Scatter`\n", "* `TimeSeries`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To illustrate, let's create some random data and display it as histograms." ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [], "source": [ "normal = np.random.standard_normal(1000)\n", "student_t = np.random.standard_t(6, 1000)\n", "distributions = pd.DataFrame({'Normal': normal, 'Student-T': student_t})" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/fonnescj/anaconda3/envs/dev/lib/python3.6/site-packages/bokeh/util/deprecation.py:34: BokehDeprecationWarning: bokeh.io.hplot() was deprecated in Bokeh 0.12.0 and will be removed, use bokeh.models.layouts.Row instead.\n", " warn(message)\n" ] }, { "data": { "text/html": [ "\n", "\n", "
\n", "
\n", "
\n", "" ] },