{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from bokeh.charts import HeatMap, bins, output_notebook, show, vplot\n", "from bokeh.sampledata.autompg import autompg as df\n", "from bokeh.sampledata.unemployment1948 import data\n", "\n", "from bokeh.palettes import RdYlGn6, RdYlGn9\n", "\n", "import pandas as pd" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "output_notebook()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Auto MPG Data" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2D Binning\n", "\n", "By default, 2D binning bins each of the dimensions provided and then counts the rows that fall into each bin. The result shows how the source data is distributed across the possible combinations of mpg and displacement." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "hm = HeatMap(df, x=bins('mpg'), y=bins('displ'))\n", "show(hm)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Binning and Aggregating Values\n", "\n", "For each x and y bin, the `stat` is applied to the `values` column. The aggregated values are then binned a second time so that a discrete color can be mapped to each aggregated value." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "hm = HeatMap(df, x=bins('mpg'), y=bins('displ'), values='cyl', stat='mean')\n", "show(hm)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Specifying the Number of Bins" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "hm = HeatMap(df, x=bins('mpg'), y=bins('displ', bins=15),\n", " values='cyl', stat='mean')\n", "show(hm)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "### Mixing Binned and Non-Binned Data" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "hm = HeatMap(df, x=bins('mpg'), y='cyl', values='displ', stat='mean')\n", "show(hm)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The Size of the Glyph Can Be Altered" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "hm = HeatMap(df, y=bins('displ'), x=bins('mpg'), values='cyl', stat='mean',\n", " spacing_ratio=0.9)\n", "\n", "show(hm)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Applying a Custom Palette\n", "The number of discrete colors in the palette determines the number of bins used to assign colors.\n", "\n", "#### Example with 6 Colors" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "hm = HeatMap(df, x=bins('mpg'), y=bins('displ'), stat='mean', values='cyl',\n", " palette=RdYlGn6)\n", "show(hm)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Example with 9 Colors" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "hm = HeatMap(df, x=bins('mpg'), y=bins('displ'), stat='mean', values='cyl',\n", " palette=RdYlGn9)\n", "show(hm)" ] }, { "cell_type": 
"markdown", "metadata": {}, "source": [ "### Viewing Color Bins in Legend" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "hm = HeatMap(df, x=bins('mpg'), y=bins('displ'), values='cyl',\n", " stat='mean', legend='top_right')\n", "show(hm)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Build Fruit Data" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "fruits = {'fruit': ['apples', 'apples', 'apples', 'apples', 'apples',\n", " 'pears', 'pears', 'pears', 'pears', 'pears',\n", " 'bananas', 'bananas', 'bananas', 'bananas', 'bananas'],\n", " 'fruit_count': [4, 5, 8, 12, 4, 6, 5, 4, 8, 7, 1, 2, 4, 8, 12],\n", " 'year': [2009, 2010, 2011, 2012, 2013, 2009, 2010, 2011, 2012, 2013, 2009, 2010,\n", " 2011, 2012, 2013]}\n", "fruits['year'] = [str(yr) for yr in fruits['year']]\n", "fruits_df = pd.DataFrame(fruits)\n", "\n", "fruits_df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Without Dimension Binning\n", "\n", "Each pair of x and y values is treated as the coordinates of a bin. The values column is then binned and mapped to the color attribute, so that discrete colors can be assigned." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "hm = HeatMap(fruits_df, y='year', x='fruit', values='fruit_count', stat=None)\n", "show(hm)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Unemployment Data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Prepare the Data\n", "Current data is in a pivoted form, which is difficult to use when defining plots by specifying columns. We will use the pandas melt function to de-pivot the columns with numerical data we want to use, while keeping the categorical data." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "unempl_data = data.copy()\n", "unempl_data.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Remove the Annual column so that only the monthly values are plotted\n", "del unempl_data['Annual']\n", "\n", "# Convert numerical years to strings\n", "unempl_data['Year'] = unempl_data['Year'].astype(str)\n", "\n", "# De-pivot all columns except Year into two columns:\n", "# one holds the values and the other holds the labels (the month names)\n", "unempl = pd.melt(unempl_data, var_name='Month', value_name='Unemployment', id_vars=['Year'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### De-pivoted Data\n", "We now have the three dimensions that we want to map to attributes of the plot." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "unempl.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "hm = HeatMap(unempl, x='Year', y='Month', values='Unemployment', stat=None,\n", " sort_dim={'x': False}, height=200, responsive=True)\n", "show(hm)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.4.4" } }, "nbformat": 4, "nbformat_minor": 0 }