{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"#### New to Plotly?\n",
"Plotly's Python library is free and open source! [Get started](https://plotly.com/python/getting-started/) by downloading the client and [reading the primer](https://plotly.com/python/getting-started/).\n",
"\n",
"You can set up Plotly to work in [online](https://plotly.com/python/getting-started/#initialization-for-online-plotting) or [offline](https://plotly.com/python/getting-started/#initialization-for-offline-plotting) mode, or in [Jupyter notebooks](https://plotly.com/python/getting-started/#start-plotting-online).\n",
"\n",
"We also have a quick-reference [cheatsheet](https://images.plot.ly/plotly-documentation/images/python_cheat_sheet.pdf) (new!) to help you get started!"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false,
"deletable": true,
"editable": true
},
"source": [
"#### Version Check\n",
"Note: Distplots are available in version 1.11.0+.\n",
"\n",
"Run `pip install plotly --upgrade` to update your Plotly version."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [
{
"data": {
"text/plain": [
"'2.0.2'"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import plotly\n",
"plotly.__version__"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"#### Basic Distplot "
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import plotly.plotly as py\n",
"import plotly.figure_factory as ff\n",
"\n",
"import numpy as np\n",
"\n",
"x = np.random.randn(1000) \n",
"hist_data = [x]\n",
"group_labels = ['distplot']\n",
"\n",
"fig = ff.create_distplot(hist_data, group_labels)\n",
"py.iplot(fig, filename='Basic Distplot')"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"#### Plot Multiple Datasets"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import plotly.plotly as py\n",
"import plotly.figure_factory as ff\n",
"\n",
"import numpy as np\n",
"\n",
"# Add histogram data\n",
"x1 = np.random.randn(200)-2 \n",
"x2 = np.random.randn(200) \n",
"x3 = np.random.randn(200)+2 \n",
"x4 = np.random.randn(200)+4 \n",
"\n",
"# Group data together\n",
"hist_data = [x1, x2, x3, x4]\n",
"\n",
"group_labels = ['Group 1', 'Group 2', 'Group 3', 'Group 4']\n",
"\n",
"# Create distplot with custom bin_size\n",
"fig = ff.create_distplot(hist_data, group_labels, bin_size=.2)\n",
"\n",
"# Plot!\n",
"py.iplot(fig, filename='Distplot with Multiple Datasets')"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"#### Use Multiple Bin Sizes"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import plotly.plotly as py\n",
"import plotly.figure_factory as ff\n",
"\n",
"import numpy as np\n",
"\n",
"# Add histogram data\n",
"x1 = np.random.randn(200)-2 \n",
"x2 = np.random.randn(200) \n",
"x3 = np.random.randn(200)+2 \n",
"x4 = np.random.randn(200)+4 \n",
"\n",
"# Group data together\n",
"hist_data = [x1, x2, x3, x4]\n",
"\n",
"group_labels = ['Group 1', 'Group 2', 'Group 3', 'Group 4']\n",
"\n",
"# Create distplot with a separate bin_size for each group\n",
"fig = ff.create_distplot(hist_data, group_labels, bin_size=[.1, .25, .5, 1])\n",
"\n",
"# Plot!\n",
"py.iplot(fig, filename='Distplot with Multiple Bin Sizes')"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"#### Customize Rug Text, Colors & Title"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import plotly.plotly as py\n",
"import plotly.figure_factory as ff\n",
"\n",
"import numpy as np\n",
"\n",
"x1 = np.random.randn(26) \n",
"x2 = np.random.randn(26) + .5 \n",
"\n",
"hist_data = [x1, x2]\n",
"\n",
"group_labels = ['2014', '2015']\n",
"\n",
"rug_text_one = ['a', 'b', 'c', 'd', 'e',\n",
" 'f', 'g', 'h', 'i', 'j', \n",
" 'k', 'l', 'm', 'n', 'o',\n",
" 'p', 'q', 'r', 's', 't', \n",
" 'u', 'v', 'w', 'x', 'y', 'z'] \n",
"\n",
"rug_text_two = ['aa', 'bb', 'cc', 'dd', 'ee',\n",
" 'ff', 'gg', 'hh', 'ii', 'jj', \n",
" 'kk', 'll', 'mm', 'nn', 'oo',\n",
" 'pp', 'qq', 'rr', 'ss', 'tt', \n",
" 'uu', 'vv', 'ww', 'xx', 'yy', 'zz'] \n",
"\n",
"rug_text = [rug_text_one, rug_text_two]\n",
"\n",
"colors = ['rgb(0, 0, 100)', 'rgb(0, 200, 200)']\n",
"\n",
"# Create distplot with custom bin_size, rug text, and colors\n",
"fig = ff.create_distplot(\n",
" hist_data, group_labels, bin_size=.2,\n",
" rug_text=rug_text, colors=colors)\n",
"\n",
"fig['layout'].update(title='Customized Distplot')\n",
"\n",
"# Plot!\n",
"py.iplot(fig, filename='Distplot Colors')"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"#### Plot Normal Curve "
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import plotly.plotly as py\n",
"import plotly.figure_factory as ff\n",
"\n",
"import numpy as np\n",
"\n",
"x1 = np.random.randn(200) \n",
"x2 = np.random.randn(200) + 2 \n",
"hist_data = [x1, x2]\n",
"\n",
"group_labels = ['Group 1', 'Group 2']\n",
"\n",
"colors = ['#3A4750', '#F64E8B']\n",
"\n",
"# Create distplot with curve_type set to 'normal'\n",
"fig = ff.create_distplot(hist_data, group_labels, bin_size=.5, curve_type='normal', colors=colors)\n",
"\n",
"# Add title\n",
"fig['layout'].update(title='Distplot with Normal Distribution')\n",
"\n",
"# Plot!\n",
"py.iplot(fig, filename='Distplot with Normal Curve')"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true,
"deletable": true,
"editable": true
},
"source": [
"#### Plot Only Curve and Rug"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import plotly.plotly as py\n",
"import plotly.figure_factory as ff\n",
"\n",
"import numpy as np\n",
"\n",
"x1 = np.random.randn(200) - 1 \n",
"x2 = np.random.randn(200)\n",
"x3 = np.random.randn(200) + 1 \n",
"\n",
"hist_data = [x1, x2, x3]\n",
"\n",
"group_labels = ['Group 1', 'Group 2', 'Group 3']\n",
"colors = ['#333F44', '#37AA9C', '#94F3E4']\n",
"\n",
"# Create distplot without the histogram (curve and rug only)\n",
"fig = ff.create_distplot(hist_data, group_labels, show_hist=False, colors=colors)\n",
"\n",
"# Add title\n",
"fig['layout'].update(title='Curve and Rug Plot')\n",
"\n",
"# Plot!\n",
"py.iplot(fig, filename='Curve and Rug')"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"#### Plot Only Hist and Rug"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import plotly.plotly as py\n",
"import plotly.figure_factory as ff\n",
"\n",
"import numpy as np\n",
"\n",
"x1 = np.random.randn(200) - 1 \n",
"x2 = np.random.randn(200)\n",
"x3 = np.random.randn(200) + 1 \n",
"\n",
"hist_data = [x1, x2, x3]\n",
"\n",
"group_labels = ['Group 1', 'Group 2', 'Group 3']\n",
"colors = ['#835AF1', '#7FA6EE', '#B8F7D4']\n",
"\n",
"# Create distplot without the curve (histogram and rug only)\n",
"fig = ff.create_distplot(hist_data, group_labels, colors=colors, bin_size=.25, show_curve=False)\n",
"\n",
"# Add title\n",
"fig['layout'].update(title='Hist and Rug Plot')\n",
"\n",
"# Plot!\n",
"py.iplot(fig, filename='Hist and Rug')"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"#### Plot Hist and Rug with Different Bin Sizes"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import plotly.plotly as py\n",
"import plotly.figure_factory as ff\n",
"\n",
"import numpy as np\n",
"\n",
"x1 = np.random.randn(200) - 2 \n",
"x2 = np.random.randn(200)\n",
"x3 = np.random.randn(200) + 2 \n",
"\n",
"hist_data = [x1, x2, x3]\n",
"\n",
"group_labels = ['Group 1', 'Group 2', 'Group 3']\n",
"colors = ['#393E46', '#2BCDC1', '#F66095']\n",
"# Create distplot with a different bin_size per group and no curve\n",
"fig = ff.create_distplot(hist_data, group_labels, colors=colors, \n",
" bin_size=[0.3, 0.2, 0.1], show_curve=False)\n",
"\n",
"# Add title\n",
"fig['layout'].update(title='Hist and Rug Plot')\n",
"\n",
"# Plot!\n",
"py.iplot(fig, filename='Hist and Rug Different Bin Size')"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"#### Plot Only Hist and Curve"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import plotly.plotly as py\n",
"import plotly.figure_factory as ff\n",
"\n",
"import numpy as np\n",
"\n",
"x1 = np.random.randn(200) - 2 \n",
"x2 = np.random.randn(200)\n",
"x3 = np.random.randn(200) + 2 \n",
"\n",
"hist_data = [x1, x2, x3]\n",
"\n",
"group_labels = ['Group 1', 'Group 2', 'Group 3']\n",
"colors = ['#A56CC1', '#A6ACEC', '#63F5EF']\n",
"\n",
"# Create distplot without the rug (histogram and curve only)\n",
"fig = ff.create_distplot(hist_data, group_labels, colors=colors,\n",
" bin_size=.2, show_rug=False)\n",
"\n",
"# Add title\n",
"fig['layout'].update(title='Hist and Curve Plot')\n",
"\n",
"# Plot!\n",
"py.iplot(fig, filename='Hist and Curve')"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"#### Distplot with Pandas"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import plotly.plotly as py\n",
"import plotly.figure_factory as ff\n",
"\n",
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"df = pd.DataFrame({'2012': np.random.randn(200),\n",
" '2013': np.random.randn(200)+1})\n",
"py.iplot(ff.create_distplot([df[c] for c in df.columns], df.columns, bin_size=.25),\n",
" filename='distplot with pandas')"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"#### Reference "
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Help on function create_distplot in module plotly.figure_factory._distplot:\n",
"\n",
"create_distplot(hist_data, group_labels, bin_size=1.0, curve_type='kde', colors=None, rug_text=None, histnorm='probability density', show_hist=True, show_curve=True, show_rug=True)\n",
" BETA function that creates a distplot similar to seaborn.distplot\n",
" \n",
" The distplot can be composed of all or any combination of the following\n",
" 3 components: (1) histogram, (2) curve: (a) kernel density estimation\n",
" or (b) normal curve, and (3) rug plot. Additionally, multiple distplots\n",
" (from multiple datasets) can be created in the same plot.\n",
" \n",
" :param (list[list]) hist_data: Use list of lists to plot multiple data\n",
" sets on the same plot.\n",
" :param (list[str]) group_labels: Names for each data set.\n",
" :param (list[float]|float) bin_size: Size of histogram bins.\n",
" Default = 1.\n",
" :param (str) curve_type: 'kde' or 'normal'. Default = 'kde'\n",
" :param (str) histnorm: 'probability density' or 'probability'\n",
" Default = 'probability density'\n",
" :param (bool) show_hist: Add histogram to distplot? Default = True\n",
" :param (bool) show_curve: Add curve to distplot? Default = True\n",
" :param (bool) show_rug: Add rug to distplot? Default = True\n",
" :param (list[str]) colors: Colors for traces.\n",
" :param (list[list]) rug_text: Hovertext values for rug_plot,\n",
" :return (dict): Representation of a distplot figure.\n",
" \n",
" Example 1: Simple distplot of 1 data set\n",
" ```\n",
" import plotly.plotly as py\n",
" from plotly.figure_factory import create_distplot\n",
" \n",
" hist_data = [[1.1, 1.1, 2.5, 3.0, 3.5,\n",
" 3.5, 4.1, 4.4, 4.5, 4.5,\n",
" 5.0, 5.0, 5.2, 5.5, 5.5,\n",
" 5.5, 5.5, 5.5, 6.1, 7.0]]\n",
" \n",
" group_labels = ['distplot example']\n",
" \n",
" fig = create_distplot(hist_data, group_labels)\n",
" \n",
" url = py.plot(fig, filename='Simple distplot', validate=False)\n",
" ```\n",
" \n",
" Example 2: Two data sets and added rug text\n",
" ```\n",
" import plotly.plotly as py\n",
" from plotly.figure_factory import create_distplot\n",
" \n",
" # Add histogram data\n",
" hist1_x = [0.8, 1.2, 0.2, 0.6, 1.6,\n",
" -0.9, -0.07, 1.95, 0.9, -0.2,\n",
" -0.5, 0.3, 0.4, -0.37, 0.6]\n",
" hist2_x = [0.8, 1.5, 1.5, 0.6, 0.59,\n",
" 1.0, 0.8, 1.7, 0.5, 0.8,\n",
" -0.3, 1.2, 0.56, 0.3, 2.2]\n",
" \n",
" # Group data together\n",
" hist_data = [hist1_x, hist2_x]\n",
" \n",
" group_labels = ['2012', '2013']\n",
" \n",
" # Add text\n",
" rug_text_1 = ['a1', 'b1', 'c1', 'd1', 'e1',\n",
" 'f1', 'g1', 'h1', 'i1', 'j1',\n",
" 'k1', 'l1', 'm1', 'n1', 'o1']\n",
" \n",
" rug_text_2 = ['a2', 'b2', 'c2', 'd2', 'e2',\n",
" 'f2', 'g2', 'h2', 'i2', 'j2',\n",
" 'k2', 'l2', 'm2', 'n2', 'o2']\n",
" \n",
" # Group text together\n",
" rug_text_all = [rug_text_1, rug_text_2]\n",
" \n",
" # Create distplot\n",
" fig = create_distplot(\n",
" hist_data, group_labels, rug_text=rug_text_all, bin_size=.2)\n",
" \n",
" # Add title\n",
" fig['layout'].update(title='Dist Plot')\n",
" \n",
" # Plot!\n",
" url = py.plot(fig, filename='Distplot with rug text', validate=False)\n",
" ```\n",
" \n",
" Example 3: Plot with normal curve and hide rug plot\n",
" ```\n",
" import plotly.plotly as py\n",
" from plotly.figure_factory import create_distplot\n",
" import numpy as np\n",
" \n",
" x1 = np.random.randn(190)\n",
" x2 = np.random.randn(200)+1\n",
" x3 = np.random.randn(200)-1\n",
" x4 = np.random.randn(210)+2\n",
" \n",
" hist_data = [x1, x2, x3, x4]\n",
" group_labels = ['2012', '2013', '2014', '2015']\n",
" \n",
" fig = create_distplot(\n",
" hist_data, group_labels, curve_type='normal',\n",
" show_rug=False, bin_size=.4)\n",
" \n",
" url = py.plot(fig, filename='hist and normal curve', validate=False)\n",
" \n",
" Example 4: Distplot with Pandas\n",
" ```\n",
" import plotly.plotly as py\n",
" from plotly.figure_factory import create_distplot\n",
" import numpy as np\n",
" import pandas as pd\n",
" \n",
" df = pd.DataFrame({'2012': np.random.randn(200),\n",
" '2013': np.random.randn(200)+1})\n",
" py.iplot(create_distplot([df[c] for c in df.columns], df.columns),\n",
" filename='examples/distplot with pandas',\n",
" validate=False)\n",
" ```\n",
"\n"
]
}
],
"source": [
"help(ff.create_distplot)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true,
"deletable": true,
"editable": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.12"
}
},
"nbformat": 4,
"nbformat_minor": 0
}