{ "cells": [ { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "#### New to Plotly?\n", "Plotly's Python library is free and open source! [Get started](https://plotly.com/python/getting-started/) by downloading the client and [reading the primer](https://plotly.com/python/getting-started/).\n", "
You can set up Plotly to work in [online](https://plotly.com/python/getting-started/#initialization-for-online-plotting) or [offline](https://plotly.com/python/getting-started/#initialization-for-offline-plotting) mode, or in [Jupyter notebooks](https://plotly.com/python/getting-started/#start-plotting-online).\n", "
We also have a quick-reference [cheatsheet](https://images.plot.ly/plotly-documentation/images/python_cheat_sheet.pdf) (new!) to help you get started!" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false, "deletable": true, "editable": true }, "source": [ "#### Version Check\n", "Note: Distplots are available in version 1.11.0+
\n", "Run `pip install plotly --upgrade` to update your Plotly version" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "data": { "text/plain": [ "'2.0.2'" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import plotly\n", "plotly.__version__" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "#### Basic Distplot " ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import plotly.plotly as py\n", "import plotly.figure_factory as ff\n", "\n", "import numpy as np\n", "\n", "x = np.random.randn(1000) \n", "hist_data = [x]\n", "group_labels = ['distplot']\n", "\n", "fig = ff.create_distplot(hist_data, group_labels)\n", "py.iplot(fig, filename='Basic Distplot')" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "#### Plot Multiple Datasets" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import plotly.plotly as py\n", "import plotly.figure_factory as ff\n", "\n", "import numpy as np\n", "\n", "# Add histogram data\n", "x1 = np.random.randn(200)-2 \n", "x2 = np.random.randn(200) \n", "x3 = np.random.randn(200)+2 \n", "x4 = np.random.randn(200)+4 \n", "\n", "# Group data together\n", "hist_data = [x1, x2, x3, x4]\n", "\n", "group_labels = ['Group 1', 'Group 2', 'Group 3', 'Group 4']\n", "\n", "# Create distplot with custom bin_size\n", "fig = ff.create_distplot(hist_data, group_labels, 
bin_size=.2)\n", "\n", "# Plot!\n", "py.iplot(fig, filename='Distplot with Multiple Datasets')" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "#### Use Multiple Bin Sizes" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import plotly.plotly as py\n", "import plotly.figure_factory as ff\n", "\n", "import numpy as np\n", "\n", "# Add histogram data\n", "x1 = np.random.randn(200)-2 \n", "x2 = np.random.randn(200) \n", "x3 = np.random.randn(200)+2 \n", "x4 = np.random.randn(200)+4 \n", "\n", "# Group data together\n", "hist_data = [x1, x2, x3, x4]\n", "\n", "group_labels = ['Group 1', 'Group 2', 'Group 3', 'Group 4']\n", "\n", "# Create distplot with custom bin_size\n", "fig = ff.create_distplot(hist_data, group_labels, bin_size=[.1, .25, .5, 1])\n", "\n", "# Plot!\n", "py.iplot(fig, filename='Distplot with Multiple Bin Sizes')" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "#### Customize Rug Text, Colors & Title" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import plotly.plotly as py\n", "import plotly.figure_factory as ff\n", "\n", "import numpy as np\n", "\n", "x1 = np.random.randn(26) \n", "x2 = np.random.randn(26) + .5 \n", "\n", "hist_data = [x1, x2]\n", "\n", "group_labels = ['2014', '2015']\n", "\n", "rug_text_one = ['a', 'b', 'c', 'd', 'e',\n", " 'f', 'g', 'h', 'i', 'j', \n", " 'k', 'l', 'm', 'n', 'o',\n", " 'p', 'q', 'r', 's', 't', \n", " 'u', 'v', 'w', 'x', 'y', 'z'] \n", "\n", "rug_text_two = 
['aa', 'bb', 'cc', 'dd', 'ee',\n", " 'ff', 'gg', 'hh', 'ii', 'jj',\n", " 'kk', 'll', 'mm', 'nn', 'oo',\n", " 'pp', 'qq', 'rr', 'ss', 'tt',\n", " 'uu', 'vv', 'ww', 'xx', 'yy', 'zz']\n", "\n", "rug_text = [rug_text_one, rug_text_two]\n", "\n", "colors = ['rgb(0, 0, 100)', 'rgb(0, 200, 200)']\n", "\n", "# Create distplot with custom rug text and colors\n", "fig = ff.create_distplot(\n", " hist_data, group_labels, bin_size=.2,\n", " rug_text=rug_text, colors=colors)\n", "\n", "fig['layout'].update(title='Customized Distplot')\n", "\n", "# Plot!\n", "py.iplot(fig, filename='Distplot Colors')" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "#### Plot Normal Curve" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import plotly.plotly as py\n", "import plotly.figure_factory as ff\n", "\n", "import numpy as np\n", "\n", "x1 = np.random.randn(200)\n", "x2 = np.random.randn(200) + 2\n", "hist_data = [x1, x2]\n", "\n", "group_labels = ['Group 1', 'Group 2']\n", "\n", "colors = ['#3A4750', '#F64E8B']\n", "\n", "# Create distplot with curve_type set to 'normal'\n", "fig = ff.create_distplot(hist_data, group_labels, bin_size=.5, curve_type='normal', colors=colors)\n", "\n", "# Add title\n", "fig['layout'].update(title='Distplot with Normal Distribution')\n", "\n", "# Plot!\n", "py.iplot(fig, filename='Distplot with Normal Curve')" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true, "deletable": true, "editable": true }, "source": [ "#### Plot Only Curve and Rug" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "execution_count": 7, "metadata": {}, 
"output_type": "execute_result" } ], "source": [ "import plotly.plotly as py\n", "import plotly.figure_factory as ff\n", "\n", "import numpy as np\n", "\n", "x1 = np.random.randn(200) - 1\n", "x2 = np.random.randn(200)\n", "x3 = np.random.randn(200) + 1\n", "\n", "hist_data = [x1, x2, x3]\n", "\n", "group_labels = ['Group 1', 'Group 2', 'Group 3']\n", "colors = ['#333F44', '#37AA9C', '#94F3E4']\n", "\n", "# Create distplot without the histogram (show_hist=False)\n", "fig = ff.create_distplot(hist_data, group_labels, show_hist=False, colors=colors)\n", "\n", "# Add title\n", "fig['layout'].update(title='Curve and Rug Plot')\n", "\n", "# Plot!\n", "py.iplot(fig, filename='Curve and Rug')" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "#### Plot Only Hist and Rug" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import plotly.plotly as py\n", "import plotly.figure_factory as ff\n", "\n", "import numpy as np\n", "\n", "x1 = np.random.randn(200) - 1\n", "x2 = np.random.randn(200)\n", "x3 = np.random.randn(200) + 1\n", "\n", "hist_data = [x1, x2, x3]\n", "\n", "group_labels = ['Group 1', 'Group 2', 'Group 3']\n", "colors = ['#835AF1', '#7FA6EE', '#B8F7D4']\n", "\n", "# Create distplot without the curve (show_curve=False)\n", "fig = ff.create_distplot(hist_data, group_labels, colors=colors, bin_size=.25, show_curve=False)\n", "\n", "# Add title\n", "fig['layout'].update(title='Hist and Rug Plot')\n", "\n", "# Plot!\n", "py.iplot(fig, filename='Hist and Rug')" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "#### Plot Hist and Rug with Different Bin Sizes" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false, "deletable": 
true, "editable": true }, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import plotly.plotly as py\n", "import plotly.figure_factory as ff\n", "\n", "import numpy as np\n", "\n", "x1 = np.random.randn(200) - 2\n", "x2 = np.random.randn(200)\n", "x3 = np.random.randn(200) + 2\n", "\n", "hist_data = [x1, x2, x3]\n", "\n", "group_labels = ['Group 1', 'Group 2', 'Group 3']\n", "colors = ['#393E46', '#2BCDC1', '#F66095']\n", "# Create distplot with a separate bin_size per group and no curve\n", "fig = ff.create_distplot(hist_data, group_labels, colors=colors,\n", " bin_size=[0.3, 0.2, 0.1], show_curve=False)\n", "\n", "# Add title\n", "fig['layout'].update(title='Hist and Rug Plot')\n", "\n", "# Plot!\n", "py.iplot(fig, filename='Hist and Rug Different Bin Size')" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "#### Plot Only Hist and Curve" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import plotly.plotly as py\n", "import plotly.figure_factory as ff\n", "\n", "import numpy as np\n", "\n", "x1 = np.random.randn(200) - 2\n", "x2 = np.random.randn(200)\n", "x3 = np.random.randn(200) + 2\n", "\n", "hist_data = [x1, x2, x3]\n", "\n", "group_labels = ['Group 1', 'Group 2', 'Group 3']\n", "colors = ['#A56CC1', '#A6ACEC', '#63F5EF']\n", "\n", "# Create distplot without the rug (show_rug=False)\n", "fig = ff.create_distplot(hist_data, group_labels, colors=colors,\n", " bin_size=.2, show_rug=False)\n", "\n", "# Add title\n", "fig['layout'].update(title='Hist and Curve Plot')\n", "\n", "# Plot!\n", "py.iplot(fig, filename='Hist and Curve')" ] }, { "cell_type": "markdown", "metadata": { "deletable": 
true, "editable": true }, "source": [ "#### Distplot with Pandas" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import plotly.plotly as py\n", "import plotly.figure_factory as ff\n", "\n", "import numpy as np\n", "import pandas as pd\n", "\n", "df = pd.DataFrame({'2012': np.random.randn(200),\n", " '2013': np.random.randn(200)+1})\n", "py.iplot(ff.create_distplot([df[c] for c in df.columns], df.columns, bin_size=.25),\n", " filename='distplot with pandas')" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "#### Reference " ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Help on function create_distplot in module plotly.figure_factory._distplot:\n", "\n", "create_distplot(hist_data, group_labels, bin_size=1.0, curve_type='kde', colors=None, rug_text=None, histnorm='probability density', show_hist=True, show_curve=True, show_rug=True)\n", " BETA function that creates a distplot similar to seaborn.distplot\n", " \n", " The distplot can be composed of all or any combination of the following\n", " 3 components: (1) histogram, (2) curve: (a) kernel density estimation\n", " or (b) normal curve, and (3) rug plot. Additionally, multiple distplots\n", " (from multiple datasets) can be created in the same plot.\n", " \n", " :param (list[list]) hist_data: Use list of lists to plot multiple data\n", " sets on the same plot.\n", " :param (list[str]) group_labels: Names for each data set.\n", " :param (list[float]|float) bin_size: Size of histogram bins.\n", " Default = 1.\n", " :param (str) curve_type: 'kde' or 'normal'. 
Default = 'kde'\n", " :param (str) histnorm: 'probability density' or 'probability'\n", " Default = 'probability density'\n", " :param (bool) show_hist: Add histogram to distplot? Default = True\n", " :param (bool) show_curve: Add curve to distplot? Default = True\n", " :param (bool) show_rug: Add rug to distplot? Default = True\n", " :param (list[str]) colors: Colors for traces.\n", " :param (list[list]) rug_text: Hovertext values for rug_plot,\n", " :return (dict): Representation of a distplot figure.\n", " \n", " Example 1: Simple distplot of 1 data set\n", " ```\n", " import plotly.plotly as py\n", " from plotly.figure_factory import create_distplot\n", " \n", " hist_data = [[1.1, 1.1, 2.5, 3.0, 3.5,\n", " 3.5, 4.1, 4.4, 4.5, 4.5,\n", " 5.0, 5.0, 5.2, 5.5, 5.5,\n", " 5.5, 5.5, 5.5, 6.1, 7.0]]\n", " \n", " group_labels = ['distplot example']\n", " \n", " fig = create_distplot(hist_data, group_labels)\n", " \n", " url = py.plot(fig, filename='Simple distplot', validate=False)\n", " ```\n", " \n", " Example 2: Two data sets and added rug text\n", " ```\n", " import plotly.plotly as py\n", " from plotly.figure_factory import create_distplot\n", " \n", " # Add histogram data\n", " hist1_x = [0.8, 1.2, 0.2, 0.6, 1.6,\n", " -0.9, -0.07, 1.95, 0.9, -0.2,\n", " -0.5, 0.3, 0.4, -0.37, 0.6]\n", " hist2_x = [0.8, 1.5, 1.5, 0.6, 0.59,\n", " 1.0, 0.8, 1.7, 0.5, 0.8,\n", " -0.3, 1.2, 0.56, 0.3, 2.2]\n", " \n", " # Group data together\n", " hist_data = [hist1_x, hist2_x]\n", " \n", " group_labels = ['2012', '2013']\n", " \n", " # Add text\n", " rug_text_1 = ['a1', 'b1', 'c1', 'd1', 'e1',\n", " 'f1', 'g1', 'h1', 'i1', 'j1',\n", " 'k1', 'l1', 'm1', 'n1', 'o1']\n", " \n", " rug_text_2 = ['a2', 'b2', 'c2', 'd2', 'e2',\n", " 'f2', 'g2', 'h2', 'i2', 'j2',\n", " 'k2', 'l2', 'm2', 'n2', 'o2']\n", " \n", " # Group text together\n", " rug_text_all = [rug_text_1, rug_text_2]\n", " \n", " # Create distplot\n", " fig = create_distplot(\n", " hist_data, group_labels, rug_text=rug_text_all, 
bin_size=.2)\n", " \n", " # Add title\n", " fig['layout'].update(title='Dist Plot')\n", " \n", " # Plot!\n", " url = py.plot(fig, filename='Distplot with rug text', validate=False)\n", " ```\n", " \n", " Example 3: Plot with normal curve and hide rug plot\n", " ```\n", " import plotly.plotly as py\n", " from plotly.figure_factory import create_distplot\n", " import numpy as np\n", " \n", " x1 = np.random.randn(190)\n", " x2 = np.random.randn(200)+1\n", " x3 = np.random.randn(200)-1\n", " x4 = np.random.randn(210)+2\n", " \n", " hist_data = [x1, x2, x3, x4]\n", " group_labels = ['2012', '2013', '2014', '2015']\n", " \n", " fig = create_distplot(\n", " hist_data, group_labels, curve_type='normal',\n", " show_rug=False, bin_size=.4)\n", " \n", " url = py.plot(fig, filename='hist and normal curve', validate=False)\n", " \n", " Example 4: Distplot with Pandas\n", " ```\n", " import plotly.plotly as py\n", " from plotly.figure_factory import create_distplot\n", " import numpy as np\n", " import pandas as pd\n", " \n", " df = pd.DataFrame({'2012': np.random.randn(200),\n", " '2013': np.random.randn(200)+1})\n", " py.iplot(create_distplot([df[c] for c in df.columns], df.columns),\n", " filename='examples/distplot with pandas',\n", " validate=False)\n", " ```\n", "\n" ] } ], "source": [ "help(ff.create_distplot)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false, "deletable": true, "editable": true }, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stderr", "output_type": "stream", "text": [ "/usr/local/lib/python2.7/site-packages/IPython/nbconvert.py:13: ShimWarning:\n", "\n", "The `IPython.nbconvert` package has been deprecated. 
You should import from nbconvert instead.\n", "\n", "/usr/local/lib/python2.7/site-packages/publisher/publisher.py:53: UserWarning:\n", "\n", "Did you \"Save\" this notebook before running this command? Remember to save, always save.\n", "\n" ] } ], "source": [ "from IPython.display import display, HTML\n", "\n", "display(HTML(''))\n", "display(HTML(''))\n", "\n", "!pip install git+https://github.com/plotly/publisher.git --upgrade\n", "import publisher\n", "publisher.publish(\n", " 'distplots.ipynb', 'python/distplot/', 'Python Distplots | plotly',\n", " 'How to make interactive Distplots in Python with Plotly. ',\n", " title = 'Python Distplots | plotly',\n", " name = 'Distplots',\n", " has_thumbnail='true', thumbnail='thumbnail/distplot.jpg', \n", " language='python', page_type='example_index', \n", " display_as='statistical', order=5,\n", " ipynb= '~notebook_demo/23') " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true, "deletable": true, "editable": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.12" } }, "nbformat": 4, "nbformat_minor": 0 }