{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Implementing PMFs\n",
"\n",
"Copyright 2019 Allen Downey\n",
"\n",
"BSD 3-clause license: https://opensource.org/licenses/BSD-3-Clause"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"%matplotlib inline\n",
"\n",
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"import seaborn as sns\n",
"sns.set_style('white')\n",
"\n",
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import inspect\n",
"\n",
"def psource(obj):\n",
"    \"\"\"Prints the source code for a given object.\n",
"\n",
"    obj: function or method object\n",
"    \"\"\"\n",
"    print(inspect.getsource(obj))"
]
},
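{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Constructor\n",
"\n",
"For comments or questions about this section, see [this issue](https://github.com/AllenDowney/EmpyricalDistributions/issues/1).\n",
"\n",
"The `Pmf` class inherits from `pd.Series`. The `__init__` method is essentially unchanged, but it includes a workaround for what I think is bad behavior: `Series()` and `Series([])` yield different results. When `Pmf` is called with no positional arguments and no `index`, it is built from an empty list, with `dtype` defaulting to `np.float64`."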
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"    def __init__(self, *args, **kwargs):\n",
"        \"\"\"Initialize a Pmf.\n",
"\n",
"        Note: this cleans up a weird Series behavior, which is\n",
"        that Series() and Series([]) yield different results.\n",
"        See: https://github.com/pandas-dev/pandas/issues/16737\n",
"        \"\"\"\n",
"        if args or ('index' in kwargs):\n",
"            super().__init__(*args, **kwargs)\n",
"        else:\n",
"            underride(kwargs, dtype=np.float64)\n",
"            super().__init__([], **kwargs)\n",
"\n"
]
}
],
"source": [
"from empiricaldist import Pmf\n",
"\n",
"psource(Pmf.__init__)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can create an empty `Pmf` and then add elements.\n",
"\n",
"Here's a `Pmf` that represents a six-sided die."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"d6 = Pmf()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"for x in [1, 2, 3, 4, 5, 6]:\n",
"    d6[x] = 1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Initially the probabilities don't add up to 1."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>probs</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1</th><td>1</td></tr>\n",
"    <tr><th>2</th><td>1</td></tr>\n",
"    <tr><th>3</th><td>1</td></tr>\n",
"    <tr><th>4</th><td>1</td></tr>\n",
"    <tr><th>5</th><td>1</td></tr>\n",
"    <tr><th>6</th><td>1</td></tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
"1 1\n",
"2 1\n",
"3 1\n",
"4 1\n",
"5 1\n",
"6 1\n",
"dtype: int64"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d6"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`normalize` adds up the probabilities and divides each one by the total. The return value is the total probability before normalizing."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"    def normalize(self):\n",
"        \"\"\"Make the probabilities add up to 1 (modifies self).\n",
"\n",
"        :return: normalizing constant\n",
"        \"\"\"\n",
"        total = self.sum()\n",
"        self /= total\n",
"        return total\n",
"\n"
]
}
],
"source": [
"psource(Pmf.normalize)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"6"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d6.normalize()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now the `Pmf` is normalized: each of the six outcomes has probability 1/6."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>probs</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1</th><td>0.166667</td></tr>\n",
"    <tr><th>2</th><td>0.166667</td></tr>\n",
"    <tr><th>3</th><td>0.166667</td></tr>\n",
"    <tr><th>4</th><td>0.166667</td></tr>\n",
"    <tr><th>5</th><td>0.166667</td></tr>\n",
"    <tr><th>6</th><td>0.166667</td></tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
"1 0.166667\n",
"2 0.166667\n",
"3 0.166667\n",
"4 0.166667\n",
"5 0.166667\n",
"6 0.166667\n",
"dtype: float64"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d6"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Properties\n",
"\n",
"For comments or questions about this section, see [this issue](https://github.com/AllenDowney/EmpyricalDistributions/issues/2).\n",
"\n",
"In a `Pmf` the index contains the quantities (`qs`) and the values contain the probabilities (`ps`).\n",
"\n",
"These attributes are available as properties that return arrays (same semantics as the Pandas `values` property)."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([1, 2, 3, 4, 5, 6])"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d6.qs"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0.16666667, 0.16666667, 0.16666667, 0.16666667, 0.16666667,\n",
" 0.16666667])"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d6.ps"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Sharing\n",
"\n",
"For comments or questions about this section, see [this issue](https://github.com/AllenDowney/EmpyricalDistributions/issues/3).\n",
"\n",
"Because `Pmf` is a `Series`, you can initialize it with any type that `Series.__init__` can handle.\n",
"\n",
"Here's an example with a dictionary."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>probs</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>a</th><td>1</td></tr>\n",
"    <tr><th>b</th><td>2</td></tr>\n",
"    <tr><th>c</th><td>3</td></tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
"a 1\n",
"b 2\n",
"c 3\n",
"dtype: int64"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d = dict(a=1, b=2, c=3)\n",
"pmf = Pmf(d)\n",
"pmf"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here's an example with two lists."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>probs</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1</th><td>0.25</td></tr>\n",
"    <tr><th>2</th><td>0.25</td></tr>\n",
"    <tr><th>3</th><td>0.25</td></tr>\n",
"    <tr><th>4</th><td>0.25</td></tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
"1 0.25\n",
"2 0.25\n",
"3 0.25\n",
"4 0.25\n",
"dtype: float64"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"qs = [1,2,3,4]\n",
"ps = [0.25, 0.25, 0.25, 0.25]\n",
"d4 = Pmf(ps, index=qs)\n",
"d4"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can copy a `Pmf` like this."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>probs</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1</th><td>0.166667</td></tr>\n",
"    <tr><th>2</th><td>0.166667</td></tr>\n",
"    <tr><th>3</th><td>0.166667</td></tr>\n",
"    <tr><th>4</th><td>0.166667</td></tr>\n",
"    <tr><th>5</th><td>0.166667</td></tr>\n",
"    <tr><th>6</th><td>0.166667</td></tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
"1 0.166667\n",
"2 0.166667\n",
"3 0.166667\n",
"4 0.166667\n",
"5 0.166667\n",
"6 0.166667\n",
"dtype: float64"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d6_copy = Pmf(d6)\n",
"d6_copy"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"However, you have to be careful about sharing. In this example, the copy shares its index and its array of probabilities with the original:"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d6.index is d6_copy.index"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d6.ps is d6_copy.ps"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can avoid sharing with `copy=True`."
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>probs</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1</th><td>0.166667</td></tr>\n",
"    <tr><th>2</th><td>0.166667</td></tr>\n",
"    <tr><th>3</th><td>0.166667</td></tr>\n",
"    <tr><th>4</th><td>0.166667</td></tr>\n",
"    <tr><th>5</th><td>0.166667</td></tr>\n",
"    <tr><th>6</th><td>0.166667</td></tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
"1 0.166667\n",
"2 0.166667\n",
"3 0.166667\n",
"4 0.166667\n",
"5 0.166667\n",
"6 0.166667\n",
"dtype: float64"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d6_copy = Pmf(d6, copy=True)\n",
"d6_copy"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"False"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d6.index is d6_copy.index"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"False"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d6.ps is d6_copy.ps"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can also avoid sharing by calling `copy` explicitly."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>probs</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1</th><td>0.25</td></tr>\n",
"    <tr><th>2</th><td>0.25</td></tr>\n",
"    <tr><th>3</th><td>0.25</td></tr>\n",
"    <tr><th>4</th><td>0.25</td></tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
"1 0.25\n",
"2 0.25\n",
"3 0.25\n",
"4 0.25\n",
"dtype: float64"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d4_copy = d4.copy()\n",
"d4_copy"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"False"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d4.index is d4_copy.index"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"False"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d4.ps is d4_copy.ps"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Displaying PMFs\n",
"\n",
"For comments or questions about this section, see [this issue](https://github.com/AllenDowney/EmpyricalDistributions/issues/4).\n",
"\n",
"`Pmf` provides `_repr_html_`, so a notebook displays it as an HTML table with a single `probs` column."
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"    def _repr_html_(self):\n",
"        \"\"\"Returns an HTML representation of the series.\n",
"\n",
"        Mostly used for Jupyter notebooks.\n",
"        \"\"\"\n",
"        df = pd.DataFrame(dict(probs=self))\n",
"        return df._repr_html_()\n",
"\n"
]
}
],
"source": [
"psource(Pmf._repr_html_)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`Pmf` provides `bar`, which plots the Pmf as a bar chart."
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"    def bar(self, **options):\n",
"        \"\"\"Makes a bar plot.\n",
"\n",
"        options: passed to plt.bar\n",
"        \"\"\"\n",
"        underride(options, label=self.name)\n",
"        plt.bar(self.qs, self.ps, **options)\n",
"\n"
]
}
],
"source": [
"psource(Pmf.bar)"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [],
"source": [
"def decorate_dice(title):\n",
" \"\"\"Labels the axes.\n",
" \n",
" title: string\n",
" \"\"\"\n",
" plt.xlabel('Outcome')\n",
" plt.ylabel('PMF')\n",
" plt.title(title)"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYQAAAESCAYAAAD9gqKNAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjAsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+17YcXAAAVOElEQVR4nO3dbbReZX3n8W9CQo51klQLJQ5isUv8FzsQRzOQQKDYEqMoNW1H67CkGhskVsUaHRsFp6wlBZ0FzpSx8SGtVRR0bEs0wjIE6aiQGBhrZQJOfs5x+TDqOBWQhA4mkId5sffB7eE+OQnJziGH7+cN997Xvu793wu4f+faD9eesmfPHiRJmjrRBUiSHh8MBEkSYCBIkloGgiQJMBAkSS0DQZIEwLSJLkB6vKiq5cDrgenAHuBrwMVJvtfT/uYBf5vk+Hbfv5jkPX3sS9oXBoIEVNWVwFzgpUn+d1VNBV4FfKWqTk3y/T73n+SDfX6/tC8MBD3hVdXTgeXAcUl+ApBkN3BNVT0feAfwhqr6DvBR4LeAZwDXJHlX+x3nApcARwIPAm9L8pUB+3o98BZgK7C5s/5S4Kgkb6yqY4H3t/uYDnwqyeUH/cClUbyGIMGpwP8cCYNRvgAs7Cz/iyRnAKcBb6uqZ1bVCcDlwDlJ/jXwOuD6qnpy94uq6rnApcCZSf4N8NAY9Xwc+EiS5wOnAGdX1Sse++FJ+8ZAkBrTx1g/g+Z6wojPAiT5AfBPwFOBRcDTgFuq6uvAtcBu4Fmjvuu3gPVJftQuf3j0ztoQ+Q3g3e13baIZKTz3MRyTtF88ZSQ1P7onVNWczo/1iBcAGzvLP+183gNMAY4Abkny+yMNVXUc8MMB+5rS+bxzQPsR7TanJXmw/a6jgO37eCzSY+YIQU947V/7VwOfbM/fA1BVS4HfA947zlfcArywqn6t7XcO8D+AJ43abn273dPb5dcMqGUbTUCtaL/rF4ENwMv276ik/WcgSECSdwCfAD5bVXdV1f8CzgYWJPnuOH2/QXPd4FNVdSfwbuC3k/zzqO02A2+nObX0VWBojK88D5hfVZuB24FPJrn2AA5P2idTnP5akgSOECRJLQNBkgQYCJKkloEgSQIO8+cQTj311D3HHnvs+BtKkh5x991335Pk6NHrD+tAOPbYY7n++usnugxJOqxU1cBbqT1lJEkCDARJUstAkCQBBoIkqWUgSJIAA0GS1DIQJEmAgSBJahkIkiTgCRwI2x/eNdEl7JN9rXOyHc/+bjuR/Hf0+PdE/ne0Pw7rqSsOxND0Izh+5Y0TXca4vvOel+zTdpPteGDyHdNkOx6YfMc02Y5nfz1hRwiSpJ9nIEiSAANBktTq5RpCVU0FVgFzgR3AsiTDo7Y5GtgInJRke1UdAbwPmAfMAC5NckMf9UmSHq2vEcISYCjJAmAlcFW3saoWA+uBYzqrzwemJzkdeBnwrJ5qkyQN0FcgLATWASTZRPNXf9du4Gzgvs66xcD3q+pGYDXwuZ5qkyQN0FcgzAK2dpZ3VdUjp6eS3Jzk3lF9jgJOAF4KvBf4655qkyQN0FcgbANmdveTZOc4fe4FbkiyJ8mXgGf3VJskaYC+AmEDcA5AVc0HNu9Dn9s6feYC3+upNknSAH09qbwGWFRVG4EpwNKqWgEMJ1k7Rp/VwAeqalPbZ3lPtUmSBuglEJLs5tE/6FsGbHd85/MO4LV91CNJGp8PpkmSAANBktQyECRJgIEgSWoZCJIkwECQJLUMBEkSYCBIkloGgiQJMBAkSS0DQZIEGAiSpJaBIEkCDARJUstAkCQBBoIkqWUgSJKAnt6YVlVTgVXAXGAHsCzJ8KhtjgY2Aicl2d5Z/2vA7cAx3fWSpH71NUJYAgwlWQCsBK7qNlbVYmA9cMyo9bPabXf0VJckaQx9BcJCYB1Akk3AvFHtu4GzgftGVlTVFODDwDuBB3uqS5I0hr4CYR
awtbO8q6oeOT2V5OYk947q86fAjUnu7KkmSdJe9BUI24CZ3f0k2TlOn1cBf1hVXwTm0JxSkiQdIr1cVAY2AOcCn66q+cDm8TokedbI56r6DvDCnmqTJA3QVyCsARZV1UZgCrC0qlYAw0nW9rRPSdIB6CUQkuwGlo9avWXAdseP0X/geklSf3wwTZIEGAiSpJaBIEkCDARJUstAkCQBBoIkqWUgSJIAA0GS1DIQJEmAgSBJahkIkiTAQJAktQwESRJgIEiSWgaCJAkwECRJLQNBkgT09Ma0qpoKrALmAjuAZUmGR21zNLAROCnJ9qqaDXwCmAUcCaxI8pU+6pMkPVpfI4QlwFCSBcBK4KpuY1UtBtYDx3RWrwBuSfIbwGuAv+ipNknSAH0FwkJgHUCSTcC8Ue27gbOB+zrr/hPwofbzNGB7T7VJkgbo5ZQRzWmfrZ3lXVU1LclOgCQ3A1TVIxskub9dN4fm1NEf91SbJGmAvkYI24CZ3f2MhMHeVNVJwC3AO5N8qafaJEkD9BUIG4BzAKpqPrB5vA5V9Rzgb4Dzkny+p7okSWPo65TRGmBRVW0EpgBLq2oFMJxk7Rh9rgCGgD9vTyVtTfKynuqTJI3SSyAk2Q0sH7V6y4Dtju989sdfkiaQD6ZJkgADQZLUMhAkSYCBIElqGQiSJMBAkCS1DARJEmAgSJJaBoIkCTAQJEktA0GSBBgIkqSWgSBJAgwESVLLQJAkAQaCJKllIEiSgJ7emFZVU4FVwFxgB7AsyfCobY4GNgInJdleVU8CPgH8MvAA8OokP+6jPknSo/U1QlgCDCVZAKwEruo2VtViYD1wTGf164HNSc4ArgEu6ak2SdIAfQXCQmAdQJJNwLxR7buBs4H7BvUBPt+2S5IOkb4CYRawtbO8q6oeOT2V5OYk9+6lzwPA7J5qkyQN0FcgbANmdveTZOd+9JkJ3N9HYZKkwfoKhA3AOQBVNR/YvD99gBcDt/ZTmiRpkF7uMgLWAIuqaiMwBVhaVSuA4SRrx+jzAeBjVXUb8BBwXk+1SZIG6CUQkuwGlo9avWXAdsd3Pj8IvLyPeiRJ4/PBNEkSYCBIklp7DYSquqTz+Wn9lyNJmijjjRB+s/P52j4LkSRNrPECYcoYnyVJk8x4gbBnjM+SpElmvNtOn995luA5nc97kpzWe3WSpENmvEA4+ZBUIUmacOMFwq/spe27B7MQSdLEGi8Qvgh8C/jv7fLIheU9wJd7qkmSNAHGC4R5NHMKPQ/4e+DaJN/uvSpJ0iG310BI8jXga1U1heaZhEuqag6wNsmHDkWBkqRDY5+mrkiyh+b9x19o+yzrsyhJ0qG31xFCVU2neTfBecCzgbXAm5N88xDUJkk6hMYbIfwTcAVwF/AOmlHC8VX1wr4LkyQdWuNdVP4MzbuNTwZ+lZ+/y2h9j3VJkg6x8QLhH4C3AruANyZZ139JkqSJMF4gjFw7mA18HNinQKiqqcAqYC6wA1iWZLjTfgFwIbATuCzJDVX1jHYfU4D7gPPat6hJkg6B8a4hbE/ycJJ7gCP343uXAENJFgArgatGGtrbVi8CTgcWA1dU1QzgLcB/TXImcDfwh/uxP0nSAdqfN6btz/TXC2lHE0k20TzgNuIUYEOSHUm2AsM01yi+Djyl3WYW8PB+7E+SdIDGO2X061V1HU0YjHwGIMl5e+k3C9jaWd5VVdOS7BzQ9gDNKanvA++pqvOAGcCl+3wUkqQDNl4gvKLz+YP78b3bgJmd5altGAxqmwncD3wYeE2Sm6rqJcA1wEv2Y5+SpAMw3tQVX3qM37sBOBf4dFXNBzZ32u4A/qyqhmhGAifSPOfwE342cvghPzt9JEk6BMYbITxWa4BFnRfqLK2qFcBwkrVVdTVwK801jIuTbK+qNwHvr6oj2j5v6Kk2SdIAvQRCkt3A8lGrt3TaVwOrR/X5Bs0EepKkCbA/dxlJkiYxA0GSBBgIkqSWgSBJAgwESVLLQJAkAQaCJK
llIEiSAANBktQyECRJgIEgSWoZCJIkwECQJLUMBEkSYCBIkloGgiQJMBAkSa1e3phWVVOBVcBcYAewLMlwp/0C4EJgJ3BZkhuq6snAB4BnAkcCb0pyRx/1SZIera8RwhJgKMkCYCVw1UhDVc0BLgJOBxYDV1TVDODfA3clOQO4AKieapMkDdBXICwE1gEk2QTM67SdAmxIsiPJVmAYOJkmHB6qqpuAdwE39VSbJGmAvgJhFrC1s7yrqqaN0fYAMBs4CnhKksXA54Are6pNkjRAX4GwDZjZ3U+SnWO0zQTuB+4F1rbrPsfPjyokST3rKxA2AOcAVNV8YHOn7Q7gjKoaqqrZwInAXcBtI32AM4G7e6pNkjRAL3cZAWuARVW1EZgCLK2qFcBwkrVVdTVwK00gXZxke1VdDvxlVX0FeBj4g55qkyQN0EsgJNkNLB+1ekunfTWwelSf+4Df7aMeSdL4fDBNkgQYCJKkloEgSQIMBElSy0CQJAEGgiSpZSBIkgADQZLUMhAkSYCBIElqGQiSJMBAkCS1DARJEmAgSJJaBoIkCTAQJEmtXl6QU1VTgVXAXGAHsCzJcKf9AuBCYCdwWZIbOm1nAtcmOa6P2iRJg/U1QlgCDCVZAKwErhppqKo5wEXA6cBi4IqqmtG2HQe8FZjeU12SpDH0FQgLgXUASTYB8zptpwAbkuxIshUYBk6uqiHgg8Af9VSTJGkv+gqEWcDWzvKuqpo2RtsDwGzg/cCVSX7QU02SpL3oKxC2ATO7+0myc4y2mcBDwBnAn1bVF4GnVtWneqpNkjRALxeVgQ3AucCnq2o+sLnTdgfwZ+0pohnAicAdSWpkg6r6UZJX9lSbJGmAvgJhDbCoqjYCU4ClVbUCGE6ytqquBm6lGaFcnGR7T3VIkvZRL4GQZDewfNTqLZ321cDqvfSf00ddkqSx+WCaJAkwECRJLQNBkgQYCJKkloEgSQIMBElSy0CQJAEGgiSpZSBIkgADQZLUMhAkSYCBIElqGQiSJMBAkCS1DARJEmAgSJJaBoIkCejpjWlVNRVYBcwFdgDLkgx32i8ALgR2ApcluaGqngF8pK1pCvC6JOmjPknSo/U1QlgCDCVZAKwErhppqKo5wEXA6cBi4IqqmgG8G3h/krOAy4EreqpNkjRAX4GwEFgHkGQTMK/TdgqwIcmOJFuBYeBk4K3Aje0204DtPdUmSRqgl1NGwCxga2d5V1VNS7JzQNsDwOwk9wBUVQFX0owyJEmHSF8jhG3AzO5+2jAY1DYTuB+gql4AfAY43+sHknRo9RUIG4BzAKpqPrC503YHcEZVDVXVbOBE4K42DP4ceFGSr/ZUlyRpDH2dMloDLKqqjTR3DC2tqhXAcJK1VXU1cCtNIF2cZHtV/WfgSOBjzVkjkuTCnuqTJI3SSyAk2Q0sH7V6S6d9NbB6VJ+5fdQiSdo3PpgmSQIMBElSy0CQJAEGgiSpZSBIkgADQZLUMhAkSYCBIElqGQiSJMBAkCS1DARJEmAgSJJaBoIkCTAQJEktA0GSBBgIkqSWgSBJAnp6Y1pVTQVWAXOBHcCyJMOd9guAC4GdwGVJbqiqo4DrgCcBPwSWJnmwj/okSY/W1whhCTCUZAGwErhqpKGq5gAXAacDi4ErqmoG8B+A65KcAfwjTWBIkg6RvgJhIbAOIMkmYF6n7RRgQ5IdSbYCw8DJ3T7A54Gze6pNkjRAL6eMgFnA1s7yrqqalmTngLYHgNmj1o+s26u77777nqr67mMtcsZj7XgI1ZoV+7ztZDsemHzHNNmOBybfMU224xnDrwxa2VcgbANmdpantmEwqG0mcH9n/U876/YqydEHpVpJUm+njDYA5wBU1Xxgc6ftDuCMqhqqqtnAicBd3T7Ai4Fbe6pNkjTAlD179hz0L+3cZXQyMAVYSvNjP5xkbXuX0etoAunyJH9XVccAH6MZHdwDnJfk/x304iRJA/USCJKkw48PpkmSAANBktQyECRJQH
+3nT7hVNWpwHuTnDXRtRyoqpoOfAQ4nua27MuSrJ3Qog5AVR0BrAYK2EUzLcq3Jraqg6Oqfhn4B2BRki0TXc+Bqqp/5GfPI307ydKJrOdAVdU7gN8GjgRWJfmrCS5prwyEg6Cq3g6cD0yWu6JeBdyb5Pyq+iWaqUQO20AAzgVIcnpVnQW8D3jZhFZ0ELTB/SGaZ3cOe1U1BDAZ/qgCaP9bO41mmp5fAN42oQXtA08ZHRzfAn53oos4iP4GeFdneedYGx4OknyG5jZnaJ7Q/L8TWM7BdCXwQZrJICeDucAvVNX6qvr79hmmw9limmew1gCfA26Y2HLGZyAcBEn+Dnh4ous4WJL8c5IHqmom8LfAJRNd04FKsrOqPgb8F5pjOqxV1WuAHye5aaJrOYgepAm5xcBy4NqqOpzPYhxFM4/by/nZ8UyZ2JL2zkDQQFV1HPDfgI8nuW6i6zkYkrwaeDawuqqePNH1HKDXAouq6ovAc4Fr2pmED2ffBD6RZE+SbwL3Ak+b4JoOxL3ATUkeShJgO/C4nm7ncE5f9aR9anw98MYkt0x0PQeqqs4Hnp7kCpq/QnfTXFw+bCU5c+RzGwrLk/xo4io6KF4LnAT8UVX9S5oJL//PxJZ0QG4D3lxV76MJtifThMTjloGgQd4JPAV4V1WNXEt4cZLD9eLl9cBfV9WXgenAHyfZPsE16dH+CvhoVd0G7AFe25kU87DTvvjrTJr526YCb0jyuP5DxKkrJEmA1xAkSS0DQZIEGAiSpJaBIEkCDARJUsvbTqVWVT2T5knZX6K5PfVO4E+SPDDG9r8D3J5kskwdoSc4RwgSUFVPopnA7z8mOSvJ6cDtwCf30u3NNA9PSZOCzyFIQFX9W+CsJG8ctX4TzZQK1yVZV1UvAl5JMwHgtW3bQuDtwBKaUfcHknyoqt7abrsT+HKSP6mqS4Fn0cxz81Sad4//Hs2UGq9Osqmq3gScR/Nw1qeSXN3v0UsNRwhS41dpZq0d7dvAmaNXJrkR+DrwB8CvAy8GTqWZ7vg5VXUS8Ip2+TTghKp6adv9p0leRPME9TlJzgXeA7yyqp4D/D5NyCwEllRVHbSjlPbCawhS4wfAKQPWnwB8ubM8aLbKAu5opyV4kGb+mpcDm5I8DFBVt9IEB8DX2n/eD3yj/fwTYAj4VzRTdI/MIfUUmhFFHsMxSfvFEYLU+CzN7KGPhEJVLQN+TPMjPzLr5vM6fXbT/D+0BXheVU2tqulVdTPNqaRTq2paO+Xxme06aE4FjSXA3cAL2hfFfJRmTn2pdwaCRPMOCJo3q11SVRuq6naaU0D/DvhL4C1V9QXg2E63jcA1wPeAdcAGmhkur01yJ/Dpdt0dwHeAz+xDHXfSjA5uq6qv0oxQfnAwjlEajxeVJUmAIwRJUstAkCQBBoIkqWUgSJIAA0GS1DIQJEmAgSBJav1/ElvzagDJ3AYAAAAASUVORK5CYII=\n",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"d6.bar()\n",
"decorate_dice('One die')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`Pmf` inherits `plot` from `Series`."
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAY8AAAESCAYAAAAFYll6AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjAsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+17YcXAAAXsklEQVR4nO3dcbBeVXnv8e8JCUQkkSox0eDVWuBpsWkU0pJApFGCgRQkPVxt64gYmrGh1VFSRxOLV5h6a+gVrL13AjQVKCXKtfagEGpIBEWSGKyBakR9MEpFLCoGSPBCgCTv/WPvA68vJ3nPCmefQ8L3M+P47rXXWu/aM3p+WWu/e+2eVquFJEklRo30ACRJ+x7DQ5JUzPCQJBUzPCRJxQwPSVIxw0OSVGz0SA9A2hdFxELgXGAM0ALuAP4qM+9t6PumAZ/LzFfV331oZi5t4rukwTA8pEIR8XFgKnBaZv44IkYBbwe+FhHHZeZ9TX5/Zl7WZP/SYBgeUoGIOBxYCLwiMx8CyMxdwNURcSywBPiLiPhP4CrgJOC/AVdn5ofrPk4HzgcOBB4F3p+ZXxvgu84FzgO2Apvayi8ADsvMd0fEZOD/1N8xBrg2M/9myC9c6uA9D6nMccB3+4Ojw5eAmW3Hh2Tm64HjgfdHxK9HxJHA3wBzM/N1wLuAvoh4YXtHEfFa4ALgxMz8XeCJ3Yznn4ErMvNY4PeA2RHx1r2/PGlwDA+p3JjdlB9Edf+j3xcAMvMnwM+BFwMnAy8Dbo6I/wBWALuAIzr6OglYnZk/rY//ofPL6sD5feCv6742UM1AXrsX1yQVcdlKKrMBODIiJrX9Ye/3BmB92/FjbZ9bQA9wAHBzZv5R/4mIeAXwXwN8V0/b5x0DnD+grnN8Zj5a93UYsH2Q1yLtNWceUoF6FvH3wGfq+w0ARMR84Ezgoi5d3Ay8KSJ+s243F/gW8IKOeqvreofXx+8cYCzbqMJsUd3XocA64Iyyq5LKGR5SocxcAlwDfCEivh0R3wdmAzMy80dd2n6H6j7HtRHxTeCvgTdn5i876m0CPkC1vPUNYOxuunwbMD0iNgG3A5/JzBXP4vKkQelxS3ZJUilnHpKkYoaHJKmY4SFJKmZ4SJKKPW+e8zjuuONakydP7l5RkvSUu+666xeZOaGz/HkTHpMnT6avr2+khyFJ+5SIGPDn5y5bSZKKGR6SpGKGhySpmOEhSSpmeEiSihkekqRijfxUt36n8zKq9zw/DizIzM0ddSZQvftgSmZuj4jFwCn16UOBSZk5KSIWAX8KPFCf+zPgXqpdTV8KPAKcnZkPIEkaFk3NPOYBYzNzBrAYuLj9ZETMoXpfwcT+ssxcmpmzMnMWcB9wdn3qGOAd/ecyM4FzgU31Kz6vpnoftCRpmDQVHjOBVQCZuQGY1nF+F9X7Dx7sbBgRvcBDmXlTXXQssCQi1kbEks7+gS/WfUmShklT4TEe2Np2vDMinloiy8w1mbllN22XABe2HV8LLATeCMyMiNM6+n8EeNFQDVyS1F1T25NsA8a1HY/KzIHewfwrIuJo4OH++yMR0QP8XWZurY9vBF7X0f844OEhHLskqYumZh7rgLkAETEd2DTIdrOplqH6jQe+HRGH1EHyRmBje//AqcBtQzFoSdLgNBUe1wHbI2I98AngvIhYFBFv7tIugB/2H9Qzjg8BX6YKiLsy89+AS4HXRMRaqvdBXzhAX5Kkhjxv3mHe29vbclddSSoTERszs/NHTz4kKEkqZ3hIkooZHpKkYoaHJKmY4SFJKmZ4SJKKGR6SpGKGhySpmOEhSSpmeEiSihkekqRihockqZjhIUkqZnhIkooZHpKkYoaHJKmY4SFJKmZ4SJKKGR6SpGKGhySpmOEhSSo2uolOI2IUsAyYCjwOLMjMzR11JgDrgSmZuT0iFgOn1KcPBSZl5qSI+BPgfcBO4FvAn2fmroi4E9ha178nM+
c3cS2SpGdqJDyAecDYzJwREdOBi4Ez+k9GxBxgKTCxvywzl9ZlRMRK4IMR8QLgo1QB82hEfAY4LSJW121mNTR+SdIeNLVsNRNYBZCZG4BpHed3AbOBBzsbRkQv8FBm3kQ1azk+Mx+tT48GtlPNaA6OiNURcUsdUJKkYdJUeIzn6SUlgJ0R8dQsJzPXZOaW3bRdAlxY19uVmT8DiIj3AIcAa4BHgY8Dc4CFwIr2/iVJzWrqD+42YFzb8ajM3NGtUUQcDTzcfn+kvn/yt8BRwJmZ2YqIu4HNmdkC7o6ILcDLgB8P5UVIkgbW1MxjHTAXoF5S2jTIdrOBL3aUXQ6MBea1LV+dQ3UfhYh4OdVM5/5nOWZJ0iA1NfO4Djg5ItYDPcD8iFhENVu4fg/tgmpZqjqIOAb4U+A24JaIAPgk8CngqohYC7SAcwYzs5EkDY2eVqs10mMYFr29va2+vr6RHoYk7VMiYmNmdv7oyYcEJUnlDA9JUjHDQ5JUzPCQJBUzPCRJxQwPSVIxw0OSVMzwkCQVMzwkScUMD0lSMcNDklTM8JAkFTM8JEnFDA9JUjHDQ5JUzPCQJBUzPCRJxQwPSVIxw0OSVMzwkCQVMzwkScUMD0lSsdFNdBoRo4BlwFTgcWBBZm7uqDMBWA9MycztEbEYOKU+fSgwKTMnRcTpwP8AdgBXZObyiHgBcA3wUuAR4OzMfKCJa5EkPVNTM495wNjMnAEsBi5uPxkRc4DVwMT+ssxcmpmzMnMWcB9wdkSMAT4BvAn4feBdETEJOBfYlJmvB64Gzm/oOiRJA2gqPGYCqwAycwMwreP8LmA28GBnw4joBR7KzJuA3wI2Z+ZDmfkEsBZ4fXv/wBfrviRJw6Sp8BgPbG073hkRTy2RZeaazNyym7ZLgAt3088jwIs6yvvLJEnDpKnw2AaMa/+ezNzRrVFEHA083HZ/pLOfccDDHeX9ZZKkYdJUeKwD5gJExHRg0yDbzaZahur3XeDIiHhxRBwInAh8rb1/4FTgtqEYtCRpcJoKj+uA7RGxnuqG93kRsSgi3tylXQA/7D/IzCeBRcBNVKFxRWb+BLgUeE1ErAXexdPLXJKkYdDTarVGegzDore3t9XX1zfSw5CkfUpEbMzMzh89+ZCgJKmc4SFJKmZ4SJKKGR6SpGKGhySpmOEhSSpmeEiSihkekqRihockqZjhIUkqZnhIkooZHpKkYoaHJKmY4SFJKmZ4SJKKGR6SpGKGhySpmOEhSSpmeEiSihkekqRihockqZjhIUkqNrqJTiNiFLAMmAo8DizIzM0ddSYA64Epmbk9Ig4ALgGmAQcBF2Tmyoj4Sluz3wSuyszFEXEnsLUuvycz5zdxLZKkZ2okPIB5wNjMnBER04GLgTP6T0bEHGApMLGtzVnAmMw8ISImA28ByMxZdZtXA58FPhoRY9vPSZKGV1PLVjOBVQCZuYFqNtFuFzAbeLCtbA5wX0TcCCwHbuho83fABzPzl1QzmoMjYnVE3FIHlCRpmDQVHuN5ekkJYGdEPDXLycw1mbmlo81hwJHAacBFwJX9JyLid4DxmXlzXfQo8HGqwFkIrGjvX5LUrD2GR0Sc3/b5ZQX9bgPGtX9PZu7o0mYLsDIzW5l5K3BU27m3U81G+t0NXFPXvbtuWzI+SdKz0G3m8ca2zysK+l0HzAWol5Q2DaLN2rY2U4F7286dRL0MVjuH6j4KEfFyqpnO/QXjkyQ9C92Wenp287mb64CTI2J93W5+RCwCNmfm9btpsxy4NCI21G0Wtp2b1LHM9SngqohYC7SAcwYxs5EkDZFu4dHazec9ysxd/Ooff4DvDVDvVW2fH6eaUQzU3+SO4yeAtw12PJKkodUtPI5tmz0c3fa5lZnHNz46SdJzUrfw+J1hGYUkaZ/SLTxeuYdzPxrKgUiS9h3dwuMrwA+Af6+P+2+at4CvNjSm55R/3Xgfn/3Gj0d6GJK0V9467RWceezhQ95vt/CYRnVj+hjgFm
BFZt4z5KOQJO1T9hgemXkHcEdE9FA983F+REwCrs/My4djgCPtzGMPbyS1JWlfNqjtSTKzRbUD7pfqNguaHJQk6bltjzOPiBgDnEq1dHUUcD3w3npLEEnS81S3mcfPgY8B3waWUM0+XhURb2p6YJKk565uN8w/D7yI6nmPV/Orv7Za3eC4JEnPYd3CYyPwl8BO4N2ZuapLfUnS80C3Zav+ex3Tgfc2PxxJ0r6gW3hsz8wnM/MXwIHDMSBJ0nNfyZsES7ZklyTtx7rd83hNRHyaKjj6PwOQmW6JLknPU93C461tny9rciCSpH1Ht+1Jbh2ugUiS9h0l9zwkSQIMD0nSXjA8JEnFDA9JUrFuv7baKxExClgGTAUeBxZk5uaOOhOoNlqckpnbI+IA4BKqF1AdBFyQmSsjohf4X0D/6/w+AtzWrX9JUnOamnnMA8Zm5gxgMXBx+8mImEO1seLEtuKzgDGZeQJwBnBEXX4M8IHMnFX/59Zu/UuSmtVUeMwEVgFk5gaq2US7XcBs4MG2sjnAfRFxI7AcuKEuPxY4JyJui4iLI2L0IPqXJDWoqfAYD2xtO95Z/9EHIDPXZOaWjjaHAUcCpwEXAVfW5WuA9wAnAocAC7v1L0lqVlPhsQ0Y1/49mbmjS5stwMrMbNVLU0fV5Vdk5g/rV+F+AXjdXvYvSRoiTYXHOmAuQERMBzYNos3atjZTgXsjogf4VkQcXtc5ieodI3vTvyRpiDS11HMdcHJErKfaVHF+RCwCNmfm9btpsxy4NCI21G0WZmYrIhYAfRHxGPCdut7Ozv4bug5J0gB6Wq3WSI9hWPT29rb6+vpGehiStE+JiI2Z+YwfJfmQoCSpmOEhSSpmeEiSihkekqRihockqZjhIUkqZnhIkooZHpKkYoaHJKmY4SFJKmZ4SJKKGR6SpGKGhySpmOEhSSpmeEiSihkekqRihockqZjhIUkqZnhIkooZHpKkYoaHJKmY4SFJKja6iU4jYhSwDJgKPA4syMzNHXUmAOuBKZm5PSIOAC4BpgEHARdk5sqIOAn4KPAk8HPgHZn5aERcD7ykLn8sM09t4lokSc/U1MxjHjA2M2cAi4GL209GxBxgNTCxrfgsYExmngCcARxRly8D5mXmicD3gQV1+RHAzMycZXBI0vBqKjxmAqsAMnMD1Wyi3S5gNvBgW9kc4L6IuBFYDtxQl8/KzJ/Vn0cD2yNiInAocENErI2I05q5DEnSQJoKj/HA1rbjnRHx1BJZZq7JzC0dbQ4DjgROAy4Crqzr3g8QEX8IvAG4GjiQajYzD+gFPhERL23mUiRJnZoKj23AuPbvycwdXdpsAVZmZiszbwWO6j8REecB7wdOycztwE+ByzJzR2b+HLgTiCG9AknSbjUVHuuAuQARMR3YNIg2a9vaTAXurT//FfB6YHZm/qKuOxv4bH3+EOC3ge8O4fglSXvQyK+tgOuAkyNiPdADzI+IRcDmzLx+N22WA5dGxIa6zcL63sZHgDuAL0YEwP/NzEsjYk5ddxfwobZgkSQ1rKfVao30GIZFb29vq6+vb6SHIUn7lIjYmJmdP3ryIUFJUjnDQ5JUzPCQJBUzPCRJxQwPSVIxw0OSVMzwkCQVMzwkScUMD0lSMcNDklTM8JAkFTM8JEnFDA9JUjHDQ5JUzPCQJBUzPCRJxQwPSVIxw0OSVMzwkCQVMzwkScUMD0lSsdFNdBoRo4BlwFTgcWBBZm7uqDMBWA9MycztEXEAcAkwDTgIuCAzV0bEdOCTwA5gdWZeOJj+JUnNaWrmMQ8Ym5kzgMXAxe0nI2IOsBqY2FZ8FjAmM08AzgCOqMsvA94GzASOi4hjuvUvSWpWU+ExE1gFkJkbqGYT7XYBs4EH28rmAPdFxI3AcuCGiBgPHJSZP8jMFnATcNIg+pckNaip8BgPbG073hkRTy2RZeaazNzS0eYw4EjgNOAi4Mq6n21tdR4BXtStf0lSs5r6g7sNGNd2PCozd3Rpsw
VYWc8wbo2IowboZxzwMHDwXvQvSRoiTc081gFzAeob3psG0WZtW5upwL2ZuQ14IiJ+IyJ6qJa2btvL/iVJQ6Spmcd1wMkRsR7oAeZHxCJgc2Zev5s2y4FLI2JD3WZhXb4QWAEcQPVrq9sj4t87+2/oOiRJA+hptVojPYZh0dvb2+rr6xvpYUjSPiUiNmbmM36U5EOCkqRihockqZjhIUkqZnhIkooZHpKkYoaHJKmY4SFJKmZ4SJKKGR6SpGKGhySpmOEhSSpmeEiSihkekqRihockqZjhIUkqZnhIkooZHpKkYoaHJKmY4SFJKmZ4SJKKGR6SpGKGhySp2OgmOo2IUcAyYCrwOLAgMzd31JkArAemZOb2iOgB7gO+X1f5GvBJ4Nq2Zq8FFgOXd9bNzCVNXIsk6ZkaCQ9gHjA2M2dExHTgYuCM/pMRMQdYCkxsa/MbwB2ZeXpHX7PqNjOA/wks30NdSdIwaGrZaiawCiAzNwDTOs7vAmYDD7aVHQtMjogvR8S/RUT0n6hnJf8bODczd+6priSpeU3NPMYDW9uOd0bE6MzcAZCZawA6/ubfD3wsM/8lImYC1wC/W587HbgrM3MQdQd01113/SIifvQsr0uSnm9eOVBhU+GxDRjXdjyqPzj24BtAf7isjYjJEdGTmS3g7VT3PwZTd0CZOWFvLkSS9ExNLVutA+YC1Pc8Ng2izUeA99VtpgL3toXBsVQ31wdTV5LUsKZmHtcBJ0fEeqAHmB8Ri4DNmXn9btosBa6JiD+gmlW8E576VdYjHeEwYF1J0vDoabX8B7skqYwPCUqSihkekqRihockqVhTN8z3GxFxHHBRZs4a6bE0LSLGAFcArwIOAj66hx847Bci4gCqXQsC2AnMz8wfjOyohkdEvBTYCJycmd8b6fE0LSLu5Onnz+7JzPkjOZ7hEBFLgDcDBwLLMvNTQ9W34bEHEfEB4Czg/430WIbJ24EtmXlWRLwEuBPYr8OD6gFUMvOEiJgFXELbVjr7q/ofCpcDj430WIZDRIwFeD78I7Bf/b/n44ETgIOB9w9l/y5b7dkPgN6RHsQw+hfgw23H3R7s3Odl5ueBd9WHrwR+NoLDGU4fBy4D/mukBzJMpgIHR8TqiLilfv5sfzeH6hm764AbgJVD2bnhsQeZ+a/AkyM9juGSmb/MzEciYhzwOeD8kR7TcMjMHRHxT1T7p31upMfTtIh4J/BAZt400mMZRo9SBeYcYCGwIiL295WXw6j2FXwLT19zz1B1bnjoV0TEK4AvA/+cmZ8e6fEMl8w8GzgKWB4RLxzp8TTsHKqHeL9C9ZqDqyNi0sgOqXF3A9dkZisz7wa2AC8b4TE1bQtwU2Y+Ue8LuB0Ysm2a9vfkVYGImAisBt6dmTeP9HiGQ0ScBRyemR+j+tfpLqob5/utzDyx/3MdIAsz86cjN6JhcQ4wBfjziHg51eat94/skBq3FnhvRFxCFZQvpAqUIWF4qN2HgF8DPhwR/fc+Ts3M/fmmah9wZUR8FRgDvC8zt4/wmDT0PgVcFRFrgRZwziA2a92nZebKiDgR+DrVKtNf1K+0GBJuTyJJKuY9D0lSMcNDklTM8JAkFTM8JEnFDA9JUjF/qivthYj4daonll9C9RPfbwIfzMxHdlP/D4HbM/P5sh2I9nPOPKRCEfECqg0j/zYzZ2XmCcDtwGf20Oy9VA+mSfsFn/OQCkXEfwdmZea7O8o3UG2D8enMXBURpwB/TLXh5Ir63EzgA8A8qpn/pZl5eUT8ZV13B/DVzPxgRFwAHEG1R9GLgWXAmVTbqJydmRsi4j3A26gefLs2M/++2auXKs48pHKvptpxudM9wImdhZl5I/AfwDuA1wCnAsdRbZd9dERMAd5aHx8PHBkRp9XNH8vMU6iehJ+bmacDS4E/joijgT+iCqSZwLyIiCG7SmkPvOchlfsJ8HsDlB8JfLXteKAdTAP4er1NxKNUew+9BdiQmU8CRMRtVCEDcEf93w
8D36k/PwSMBX6bahv5/n3Ifo1qppJ7cU1SEWceUrkvUO1K+1SARMQC4AGqQOjfrfWYtja7qP7/9j3gmIgYFRFjImIN1XLWcRExut4y+8S6DKrlqN1J4C7gDfVLjq6ien+D1DjDQyqUmb+kegPh+RGxLiJup1qG+hPgH4HzIuJLwOS2ZuuBq4F7gVXAOqpdT1dk5jeBz9ZlXwf+E/j8IMbxTapZx9qI+AbVzOcnQ3GNUjfeMJckFXPmIUkqZnhIkooZHpKkYoaHJKmY4SFJKmZ4SJKKGR6SpGL/H0A3FLDo7VoOAAAAAElFTkSuQmCC\n",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"d6.plot()\n",
"decorate_dice('One die')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Make Pmf from sequence\n",
"\n",
"For comments or questions about this section, see [this issue](https://github.com/AllenDowney/EmpyricalDistributions/issues/5).\n",
"\n",
"The static method `from_seq` makes a `Pmf` object from a sequence of values: it counts how often each value appears, optionally sorts the result by quantity, and optionally normalizes it."
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"    @staticmethod\n",
"    def from_seq(seq, normalize=True, sort=True, **options):\n",
"        \"\"\"Make a PMF from a sequence of values.\n",
"\n",
"        seq: any kind of sequence\n",
"        normalize: whether to normalize the Pmf, default True\n",
"        sort: whether to sort the Pmf by values, default True\n",
"        options: passed to the pd.Series constructor\n",
"\n",
"        :return: Pmf object\n",
"        \"\"\"\n",
"        series = pd.Series(seq).value_counts(sort=False)\n",
"\n",
"        options[\"copy\"] = False\n",
"        pmf = Pmf(series, **options)\n",
"\n",
"        if sort:\n",
"            pmf.sort_index(inplace=True)\n",
"\n",
"        if normalize:\n",
"            pmf.normalize()\n",
"\n",
"        return pmf\n",
"\n"
]
}
],
"source": [
"psource(Pmf.from_seq)"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>probs</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>a</th><td>0.2</td></tr>\n",
"    <tr><th>e</th><td>0.2</td></tr>\n",
"    <tr><th>l</th><td>0.4</td></tr>\n",
"    <tr><th>n</th><td>0.2</td></tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
"a 0.2\n",
"e 0.2\n",
"l 0.4\n",
"n 0.2\n",
"dtype: float64"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pmf = Pmf.from_seq(list('allen'))\n",
"pmf"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>probs</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1</th><td>0.2</td></tr>\n",
"    <tr><th>2</th><td>0.4</td></tr>\n",
"    <tr><th>3</th><td>0.2</td></tr>\n",
"    <tr><th>5</th><td>0.2</td></tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
"1 0.2\n",
"2 0.4\n",
"3 0.2\n",
"5 0.2\n",
"dtype: float64"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pmf = Pmf.from_seq(np.array([1, 2, 2, 3, 5]))\n",
"pmf"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Selection\n",
"\n",
"For comments or questions about this section, see [this issue](https://github.com/AllenDowney/EmpyricalDistributions/issues/6).\n",
"\n",
"`Pmf` overrides `__getitem__` to return 0 for values that are not in the distribution."
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"    def __getitem__(self, qs):\n",
"        \"\"\"Look up qs and return ps.\"\"\"\n",
"        try:\n",
"            return super().__getitem__(qs)\n",
"        except (KeyError, ValueError, IndexError):\n",
"            return 0\n",
"\n"
]
}
],
"source": [
"psource(Pmf.__getitem__)"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.16666666666666666"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d6[1]"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.16666666666666666"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d6[6]"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d6[7]"
]
},
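{
"cell_type": "markdown",
"metadata": {},
"source": [
"The same fallback can be sketched with a plain `pd.Series` subclass. This is a minimal illustration, not the `empiricaldist` implementation; `ZeroDefault` and `die` are hypothetical names introduced here.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"\n",
"class ZeroDefault(pd.Series):\n",
"    # Hypothetical subclass: look up a label, return 0 if it is missing.\n",
"    def __getitem__(self, key):\n",
"        try:\n",
"            return super().__getitem__(key)\n",
"        except (KeyError, ValueError, IndexError):\n",
"            return 0\n",
"\n",
"\n",
"die = ZeroDefault([1 / 6] * 6, index=range(1, 7))\n",
"print(die[6])  # probability of a quantity that is in the index\n",
"print(die[7])  # 7 is not in the index, so the fallback returns 0\n",
"```\n",
"\n",
"Returning 0 rather than raising `KeyError` matches the meaning of a PMF: a quantity that never appears has probability zero."
]
},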
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`Pmf` objects are mutable, but in general the result is not normalized."
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" probs | \n",
"
\n",
" \n",
" \n",
" \n",
" 1 | \n",
" 0.166667 | \n",
"
\n",
" \n",
" 2 | \n",
" 0.166667 | \n",
"
\n",
" \n",
" 3 | \n",
" 0.166667 | \n",
"
\n",
" \n",
" 4 | \n",
" 0.166667 | \n",
"
\n",
" \n",
" 5 | \n",
" 0.166667 | \n",
"
\n",
" \n",
" 6 | \n",
" 0.166667 | \n",
"
\n",
" \n",
" 7 | \n",
" 0.166667 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
"1 0.166667\n",
"2 0.166667\n",
"3 0.166667\n",
"4 0.166667\n",
"5 0.166667\n",
"6 0.166667\n",
"7 0.166667\n",
"dtype: float64"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d6[7] = 1/6\n",
"d6"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1.1666666666666665"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d6.sum()"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1.1666666666666665"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d6.normalize()"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1.0000000000000002"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d6.sum()"
]
},
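{
"cell_type": "markdown",
"metadata": {},
"source": [
"Judging by the output above, `normalize` rescales the probabilities in place and returns the total from before the rescaling. A minimal sketch of that idea with a plain Series follows; `normalized` is a hypothetical helper that returns a copy instead of mutating, so it is an illustration rather than the library's code.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Seven quantities, each with mass 1/6, so the total is 7/6.\n",
"masses = pd.Series([1 / 6] * 7, index=range(1, 8))\n",
"\n",
"\n",
"def normalized(series):\n",
"    # Return a rescaled copy that sums to 1, plus the total before rescaling.\n",
"    total = series.sum()\n",
"    return series / total, total\n",
"\n",
"\n",
"probs, total_before = normalized(masses)\n",
"print(total_before)\n",
"print(probs.sum())\n",
"```\n",
"\n",
"Because of floating-point rounding, the sum after normalizing may differ from 1 in the last digit, as it does in the cell above."
]
},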
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Statistics\n",
"\n",
"For comments or questions about this section, see [this issue](https://github.com/AllenDowney/EmpyricalDistributions/issues/7).\n",
"\n",
"`Pmf` overrides the statistics methods to compute `mean`, `median`, etc.\n",
"\n",
"These functions only work correctly if the `Pmf` is normalized."
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" def mean(self):\n",
" \"\"\"Computes expected value.\n",
"\n",
" :return: float\n",
" \"\"\"\n",
" # TODO: error if not normalized\n",
" # TODO: error if the quantities are not numeric\n",
" return np.sum(self.ps * self.qs)\n",
"\n"
]
}
],
"source": [
"psource(Pmf.mean)"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"4.000000000000001"
]
},
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d6.mean()"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" def var(self):\n",
" \"\"\"Variance of a PMF.\n",
"\n",
" :return: float\n",
" \"\"\"\n",
" m = self.mean()\n",
" d = self.qs - m\n",
" return np.sum(d ** 2 * self.ps)\n",
"\n"
]
}
],
"source": [
"psource(Pmf.var)"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"4.0"
]
},
"execution_count": 42,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d6.var()"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" def std(self):\n",
" \"\"\"Standard deviation of a PMF.\n",
"\n",
" :return: float\n",
" \"\"\"\n",
" return np.sqrt(self.var())\n",
"\n"
]
}
],
"source": [
"psource(Pmf.std)"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"2.0"
]
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d6.std()"
]
},
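{
"cell_type": "markdown",
"metadata": {},
"source": [
"At this point `d6` has seven equally likely quantities (1 through 7), which is why the mean is 4 rather than 3.5. The sketch below reproduces the three statistics with NumPy alone, using local arrays `qs` and `ps` in place of the `Pmf` attributes.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"qs = np.arange(1, 8)     # quantities 1..7\n",
"ps = np.full(7, 1 / 7)   # equal probabilities that sum to 1\n",
"\n",
"mean = np.sum(ps * qs)                # expected value\n",
"var = np.sum((qs - mean) ** 2 * ps)   # probability-weighted squared deviation\n",
"std = np.sqrt(var)\n",
"\n",
"print(mean, var, std)\n",
"```\n",
"\n",
"If `ps` did not sum to 1, `mean` would be scaled by the total, which is why these methods assume a normalized `Pmf`."
]
},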
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Sampling\n",
"\n",
"For comments or questions about this section, see [this issue](https://github.com/AllenDowney/EmpyricalDistributions/issues/8).\n",
"\n",
"`choice` chooses a random values from the Pmf, following the API of `np.random.choice`"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" def choice(self, *args, **kwargs):\n",
" \"\"\"Makes a random sample.\n",
"\n",
" Uses the probabilities as weights unless `p` is provided.\n",
"\n",
" args: same as np.random.choice\n",
" kwargs: same as np.random.choice\n",
"\n",
" :return: NumPy array\n",
" \"\"\"\n",
" underride(kwargs, p=self.ps)\n",
" return np.random.choice(self.qs, *args, **kwargs)\n",
"\n"
]
}
],
"source": [
"psource(Pmf.choice)"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([3, 5, 2, 6, 2, 5, 3, 4, 4, 3])"
]
},
"execution_count": 46,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d6.choice(size=10)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`sample` chooses a random values from the `Pmf`, following the API of `pd.Series.sample`"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" def sample(self, *args, **kwargs):\n",
" \"\"\"Makes a random sample.\n",
"\n",
" Uses the probabilities as weights unless `weights` is provided.\n",
"\n",
" This function returns an array containing a sample of the quantities in this Pmf,\n",
" which is different from Series.sample, which returns a Series with a sample of\n",
" the rows in the original Series.\n",
"\n",
" args: same as Series.sample\n",
" options: same as Series.sample\n",
"\n",
" :return: NumPy array\n",
" \"\"\"\n",
" series = pd.Series(self.qs)\n",
" underride(kwargs, weights=self.ps)\n",
" sample = series.sample(*args, **kwargs)\n",
" return sample.values\n",
"\n"
]
}
],
"source": [
"psource(Pmf.sample)"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([2, 5, 5, 2, 1, 5, 7, 7, 7, 6])"
]
},
"execution_count": 48,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d6.sample(n=10, replace=True)"
]
},
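{
"cell_type": "markdown",
"metadata": {},
"source": [
"Both methods amount to weighted sampling of the quantities. A minimal sketch with NumPy and pandas directly follows; the seed 17 and the names `rng`, `draws`, and `rows` are arbitrary choices for illustration, and `np.random.default_rng` is swapped in for the legacy `np.random.choice` that the library code above calls.\n",
"\n",
"```python\n",
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"qs = np.arange(1, 7)\n",
"ps = np.full(6, 1 / 6)\n",
"\n",
"# NumPy: draw 10 quantities with probabilities ps.\n",
"rng = np.random.default_rng(17)\n",
"draws = rng.choice(qs, size=10, p=ps)\n",
"\n",
"# pandas: sample rows of a Series of quantities, weighted by ps.\n",
"rows = pd.Series(qs).sample(n=10, replace=True, weights=ps, random_state=17)\n",
"\n",
"print(draws)\n",
"print(rows.to_numpy())\n",
"```\n",
"\n",
"`Series.sample` draws without replacement by default, so asking for more values than there are quantities requires `replace=True`."
]
},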
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Arithmetic\n",
"\n",
"For comments or questions about this section, see [this issue](https://github.com/AllenDowney/EmpyricalDistributions/issues/9).\n",
"\n",
"`Pmf` provides `add_dist`, which computes the distribution of the sum.\n",
"\n",
"The implementation uses outer products to compute the convolution of the two distributions."
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" def add_dist(self, x):\n",
" \"\"\"Computes the Pmf of the sum of values drawn from self and x.\n",
"\n",
" x: Distribution, scalar, or sequence\n",
"\n",
" :return: new Pmf\n",
" \"\"\"\n",
" if isinstance(x, Distribution):\n",
" return self.convolve_dist(x, np.add.outer)\n",
" else:\n",
" return Pmf(self.ps, index=self.qs + x)\n",
"\n"
]
}
],
"source": [
"psource(Pmf.add_dist)"
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" def convolve_dist(self, dist, ufunc):\n",
" \"\"\"Convolve two distributions.\n",
"\n",
" dist: Distribution\n",
" ufunc: elementwise function for arrays\n",
"\n",
" :return: new Pmf\n",
" \"\"\"\n",
" if not isinstance(dist, Pmf):\n",
" dist = dist.make_pmf()\n",
"\n",
" qs = ufunc(self.qs, dist.qs).flatten()\n",
" ps = np.multiply.outer(self.ps, dist.ps).flatten()\n",
" series = pd.Series(ps).groupby(qs).sum()\n",
"\n",
" return Pmf(series)\n",
"\n"
]
}
],
"source": [
"psource(Pmf.convolve_dist)"
]
},
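{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see what `convolve_dist` does, here is the same computation unrolled for two six-sided dice using only NumPy and pandas. The arrays `qs` and `ps` stand in for the `qs` and `ps` attributes of a `Pmf`; `sums`, `probs`, and `total` are names chosen for this sketch.\n",
"\n",
"```python\n",
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"qs = np.arange(1, 7)\n",
"ps = np.full(6, 1 / 6)\n",
"\n",
"# Every pair of outcomes: the sum of the quantities and the product of the probabilities.\n",
"sums = np.add.outer(qs, qs).flatten()\n",
"probs = np.multiply.outer(ps, ps).flatten()\n",
"\n",
"# Pairs with the same sum are merged by adding their probabilities.\n",
"total = pd.Series(probs).groupby(sums).sum()\n",
"print(total)\n",
"```\n",
"\n",
"Each entry of the two outer products corresponds to one pair of outcomes, so the approach assumes the two draws are independent."
]
},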
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here's the distribution of the sum of two dice."
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" probs | \n",
"
\n",
" \n",
" \n",
" \n",
" 2 | \n",
" 0.027778 | \n",
"
\n",
" \n",
" 3 | \n",
" 0.055556 | \n",
"
\n",
" \n",
" 4 | \n",
" 0.083333 | \n",
"
\n",
" \n",
" 5 | \n",
" 0.111111 | \n",
"
\n",
" \n",
" 6 | \n",
" 0.138889 | \n",
"
\n",
" \n",
" 7 | \n",
" 0.166667 | \n",
"
\n",
" \n",
" 8 | \n",
" 0.138889 | \n",
"
\n",
" \n",
" 9 | \n",
" 0.111111 | \n",
"
\n",
" \n",
" 10 | \n",
" 0.083333 | \n",
"
\n",
" \n",
" 11 | \n",
" 0.055556 | \n",
"
\n",
" \n",
" 12 | \n",
" 0.027778 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
"2 0.027778\n",
"3 0.055556\n",
"4 0.083333\n",
"5 0.111111\n",
"6 0.138889\n",
"7 0.166667\n",
"8 0.138889\n",
"9 0.111111\n",
"10 0.083333\n",
"11 0.055556\n",
"12 0.027778\n",
"dtype: float64"
]
},
"execution_count": 51,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d6 = Pmf.from_seq([1,2,3,4,5,6])\n",
"\n",
"twice = d6.add_dist(d6)\n",
"twice"
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"6.999999999999998"
]
},
"execution_count": 52,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYQAAAESCAYAAAD9gqKNAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjAsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+17YcXAAAWQUlEQVR4nO3de7RedX3n8XducFyuhEVHCpRBUxfyLViIyzJAICDtkIaLlLQdK8VLjQ0l0yKOYUajwujMoOBUtDJMkMaqUGEcW8AJYQigFYSkgbEqK9Dm45wuo6s6WgRJUEnIbf7Y++DD4Tk5BM4+hxzer3/Ye//25ftwcp7P+e3Lb0/ZtWsXkiRNnegCJEkvDAaCJAkwECRJLQNBkgQYCJKkloEgSQJg+kQXIE2EqroSOLmdPRL4NvBEOz83yRN9N3z+x/33wK8meVtVfQr4fJIvdXEsaU8ZCHpRSnLh0HRVbQTelORr41zD4vE8njQaA0Eapqr+DHg8ySVVdTDwfeA3knylqt4MnJXkjVV1CfD7wHbgW8AFSX4wbF8zgCuB+cA/Az8ENrVtdwFXJfnrqno9cCnNadyfAkuSPFBVJwAfAV4K7AD+U5JVHf8v0IuU1xCkZ7oJOL2dPg34Ac0XOsBvATdW1aJ2nX+V5GjgQeCzffb1x8DhNKel5gMvH75CVR0IfA5Y1O7rT4HLq2p/4DPAW5K8FjgbuLqqnrEPaSwYCNIz3Qv8y/aL+jSav9znV9U+wOuA/00TBp9J8tN2m08A/7pdp9epwA1JnmzXvb7P8U4EHkzyDYAkNyU5HZgLHAx8saq+2R53F3D0GH5W6SmeMpKGSbKzqlYBZwDHAW8B3gu8AVib5CdVNY3my3nIVJrfpyl9dtm7bHuf9u29+6qqKcBRwDTgH5Ic19P2S8DDz+VzSaOxhyD1dxPwbmB9kieBvwEuA25s21cDb6+ql7bzFwJfTbJ12H5uA95aVQNVNQC8sc+x7gOOqKpXt/Nn05xCWge8qqpOBqiq1wD/FzhkLD6gNJw9BKm/LwG/BFzdzt9O82V+Szv/F8ChwP1VNRUYBN7UZz/XAIfRXGN4hOYL/WmS/LCq3gRcW1XTgc3AOUkerqrfBf60DZOpNNcTNo7NR5SeborDX0uSwFNGkqSWgSBJAgwESVLLQJAkAXv5XUbHHXfcrkMO8Q48SdoTDz300I+SHDB8+V4dCIcccgg33XTTRJchSXuVqvpOv+WeMpIkAQaCJKllIEiSAANBktQyECRJgIEgSWoZCJIkwECQJLUMBEkSYCBIz9qWbTv2qv1Ke2qvHrpCGk8DM6Yxe9mtY77fjZefOeb7lJ4LewiSJMBAkCS1DARJEtDRNYSqmgosB+YAW4HFSQaHrXMAsBY4KsmWqpoGfAw4BtgX+GCSVV3UJ0l6pq56CAuBgSRzgWXAFb2NVbUAuAM4sGfxW4AZSU4EzgYO66g2SVIfXQXCPGA1QJJ1NH/199oJnAo82rNsAfBPVXUrsAK4paPaJEl9dBUIs4BNPfM7quqp01NJ7kzyyLBtXga8Cng98BHgMx3VJknqo6tA2AzM7D1Oku2jbPMIsCrJriR3A4d3VJskqY+uAmENcAZAVR0PrH8W29zbs80c4Lsd1SZJ6qOrJ5VvBuZX1VpgCrCoqpYCg0lWjrDNCuDqqlrXbrOko9okSX10EghJdvLML/QNfdab3TO9FXh7F/VIkkbng2mSJMBAkCS1DARJEmAgSJJaBoIkCTAQJEktA0GSBBgIkqSWgSBJAgwESVLLQJAkAQaCJKllIEiSAANBktQyECRJgIEgSWoZCJIkoKM3plXVVGA5MAfYCixOMjhsnQOAtcBRSbb0LP8V4D7gwN7lkqRuddVDWAgMJJkLLAOu6G2sqgXAHcCBw5bPatfd2lFdkqQRdBUI84DVAEnWAccMa98JnAo8OrSgqqYAfw68D/hZR3VJkkbQVS
DMAjb1zO+oqqdOTyW5M8kjw7b5AHBrkgc6qkmStBtdBcJmYGbvcZJsH2WbNwN/WFV3AQfRnFKSRrRl2469ar97qss6XiifUS8snVxUBtYAZwFfqKrjgfWjbZDksKHpqtoI/GZHtWmSGJgxjdnLbh3z/W68/Mwx3+dz0dXngxfOZ9QLS1eBcDMwv6rWAlOARVW1FBhMsrKjY0qSnodOAiHJTmDJsMUb+qw3e4Tt+y6XJHXHB9MkSYCBIElqGQiSJMBAkCS1DARJEmAgSJJaBoIkCTAQJEktA0GSBBgIkqSWgSBJAgwESVLLQJAkAQaCJKllIEiSAANBktQyECRJQEdvTKuqqcByYA6wFVicZHDYOgcAa4Gjkmypqv2AzwGzgH2ApUn+tov6JEnP1FUPYSEwkGQusAy4orexqhYAdwAH9ixeCnw5yeuAtwH/vaPaJEl9dBUI84DVAEnWAccMa98JnAo82rPs48A17fR0YEtHtUmS+ujklBHNaZ9NPfM7qmp6ku0ASe4EqKqnVkjyWLvsIJpTR/+uo9okSX101UPYDMzsPc5QGOxOVR0FfBl4X5K7O6pNktRHV4GwBjgDoKqOB9aPtkFVHQn8FXBukts6qkuSNIKuThndDMyvqrXAFGBRVS0FBpOsHGGby4AB4BPtqaRNSc7uqD5J0jCdBEKSncCSYYs39Flvds+0X/6SNIF8ME2SBBgIkqSWgSBJAgwESVLLQJAkAQaCJKllIGhMbNm2Y6/ct37On6G6ejBNLzIDM6Yxe9mtnex74+VndrJfPZ0/Q9lDkCQBBoIkqWUgSJIAA0GS1DIQJEmAgSBJahkIkiTAQJAktQwESRLQ0ZPKVTUVWA7MAbYCi5MMDlvnAGAtcFSSLVX1EuBzwC8CjwN/kOThLuqTJD1TVz2EhcBAkrnAMuCK3saqWgDcARzYs/jfAuuTnARcB1zcUW2SpD66CoR5wGqAJOuAY4a17wROBR7ttw1wW9suSRonXQXCLGBTz/yOqnrq9FSSO5M8spttHgf266g2SVIfXQXCZmBm73GSbN+DbWYCj3VRmCSpv64CYQ1wBkBVHQ+s35NtgNOBe7opTZLUT1fvQ7gZmF9Va4EpwKKqWgoMJlk5wjZXA9dW1b3Ak8C5HdUmSeqjk0BIshNYMmzxhj7rze6Z/hnwhi7qkSSNzgfTJEmAgSBJau02EKrq4p7pg7svR5I0UUbrIfxGz/T1XRYiSZpYowXClBGmJUmTzGiBsGuEaUnSJDPabae/1vMswZE907uSnNB5dZKkcTNaIBw9LlVIkibcaIHwit20fWcsC5EkTazRAuEu4B+B/9POD11Y3gV8taOaJEkTYLRAOIZmTKHXAn8DXJ/k251XJUkad7sNhCRfB75eVVNonkm4uKoOAlYmuWY8CpQkjY9nNXRFkl007z/+UrvN4i6LkiSNv932EKpqBs27Cc4FDgdWAu9M8q1xqE2SNI5G6yH8M3AZ8CDwXppewuyq+s2uC5Mkja/RLip/kebdxkcDr+Tpdxnd0WFdep62bNvBwIxpe92+9eLhv9EXntEC4e+Ai4AdwAVJVndfksbCwIxpzF52ayf73nj5mZ3sVy8u/ht94RktEIauHewH/CXwrAKhqqYCy4E5wFZgcZLBnvbzgPOB7cClSVZV1cvbY0wBHgXObd+iJkkaB6NdQ9iSZFuSHwH77MF+FwIDSeYCy4Arhhra21YvBE4EFgCXVdW+wLuA/5nkZOAh4A/34HiSpOdpT96YtifDX8+j7U0kWUfzgNuQY4E1SbYm2QQM0lyj+Cawf7vOLGDbHhxPkvQ8jXbK6NVVdQNNGAxNA5Dk3N1sNwvY1DO/o6qmJ9nep+1xmlNS/wRcXlXnAvsCH3zWn0KS9LyNFgi/1zP9yT3Y72ZgZs/81DYM+rXNBB4D/hx4W5Lbq+pM4DrAK0OSNE5GG7ri7ue43zXAWcAXqup4YH1P2/3Ah6pqgKYncATNcw4/5uc9h+/z89
NHkqRxMFoP4bm6GZjf80KdRVW1FBhMsrKqrgTuobmG8f4kW6rqHcBVVTWt3eZPOqpNktRHJ4GQZCewZNjiDT3tK4AVw7b5e5oB9CRJE2BP7jKSJE1iBoIkCTAQJEktA0GSBBgIkqSWgSBJAgwESVLLQJAkAQaCJKllIEiSAANBktQyECRJgIEgSWoZCJIkwECQJLUMBEkSYCBIklqdvDGtqqYCy4E5wFZgcZLBnvbzgPOB7cClSVZV1UuBq4FfBvYB3pHk/i7qkyQ9U1c9hIXAQJK5wDLgiqGGqjoIuBA4EVgAXFZV+wL/AXgwyUnAeUB1VJskqY+uAmEesBogyTrgmJ62Y4E1SbYm2QQMAkfThMOTVXU7cAlwe0e1SZL66CoQZgGbeuZ3VNX0EdoeB/YDXgbsn2QBcAvw0Y5qkyT10VUgbAZm9h4nyfYR2mYCjwGPACvbZbfw9F6FJKljXQXCGuAMgKo6Hljf03Y/cFJVDVTVfsARwIPAvUPbACcDD3VUmySpj07uMgJuBuZX1VpgCrCoqpYCg0lWVtWVwD00gfT+JFuq6sPAp6rqb4FtwFs7qk2S1EcngZBkJ7Bk2OINPe0rgBXDtnkU+J0u6pEkjc4H08bJlm079sp9S5NJV78rk+V3sKtTRhpmYMY0Zi+7tZN9b7z8zE72K002Xf0eTpbfQXsIkiTAQJAktQwESRJgIEiSWgaCJAkwECRJLQNBkgQYCJKkloEgSQIMBElSy0CQJAEGgiSpZSBIkgADQZLU6mT466qaCiwH5gBbgcVJBnvazwPOB7YDlyZZ1dN2MnB9kkO7qE2S1F9XPYSFwECSucAy4Iqhhqo6CLgQOBFYAFxWVfu2bYcCFwEzOqpLkjSCrgJhHrAaIMk64JietmOBNUm2JtkEDAJHV9UA8EngjzuqSZK0G10FwixgU8/8jqqaPkLb48B+wFXAR5N8r6OaJEm70VUgbAZm9h4nyfYR2mYCTwInAR+oqruAX6iqz3dUmySpj67eqbwGOAv4QlUdD6zvabsf+FB7imhf4Ajg/iQ1tEJV/SDJOR3VJknqo6tAuBmYX1VrgSnAoqpaCgwmWVlVVwL30PRQ3p9kS0d1SJKepU4CIclOYMmwxRt62lcAK3az/UFd1CVJGpkPpkmSAANBktQyECRJgIEgSWoZCJIkwECQJLUMBEkSYCBIkloGgiQJMBAkSa0XbSBs2bZjr9qvpL3P3vY909Xgdi94AzOmMXvZrWO+342Xnznm+5S0d9rbvmdetD0ESdLTGQiSJMBAkCS1DARJEmAgSJJandxlVFVTgeXAHGArsDjJYE/7ecD5wHbg0iSrqurlwKfbmqYAf5QkXdQnSXqmrnoIC4GBJHOBZcAVQw1VdRBwIXAisAC4rKr2Bf4LcFWSU4APA5d1VJskqY+uAmEesBogyTrgmJ62Y4E1SbYm2QQMAkcDFwFDN+xOB7Z0VJskqY+uHkybBWzqmd9RVdOTbO/T9jiwX5IfAVRVAR+l6WVIksZJVz2EzcDM3uO0YdCvbSbwGEBV/TrwReAtXj+QpPHVVSCsAc4AqKrjgfU9bfcDJ1XVQFXtBxwBPNiGwSeA05J8raO6JEkj6OqU0c3A/KpaS3PH0KKqWgoMJllZVVcC99AE0vuTbKmqPwP2Aa5tzhqRJOd3VJ8kaZhOAiHJTmDJsMUbetpXACuGbTOni1okSc+OD6ZJkgADQZLUMhAkSYCBIElqGQiSJMBAkCS1DARJEmAgSJJaBoIkCTAQJEktA0GSBBgIkqSWgSBJAgwESVLLQJAkAQaCJKllIEiSgI7emFZVU4HlwBxgK7A4yWBP+3nA+cB24NIkq6rqZcANwEuA7wOLkvysi/okSc/UVQ9hITCQZC6wDLhiqKGqDgIuBE4EFgCXVdW+wH8EbkhyEvANmsCQJI2TrgJhHrAaIMk64JietmOBNUm2JtkEDAJH924D3Aac2lFtkqQ+puzatWvMd1pVnwJuTHJbO/
9d4JVJtlfVm4GjkrynbbsOuA74ZLv8iap6JXBdknmjHOdh4Dtj/gEkaXJ7RZIDhi/s5BoCsBmY2TM/Ncn2EdpmAo/1LH+iZ9lu9ftAkqTnpqtTRmuAMwCq6nhgfU/b/cBJVTVQVfsBRwAP9m4DnA7c01FtkqQ+ujplNHSX0dHAFGARzZf9YJKV7V1Gf0QTSB9OcmNVHQhcS9M7+BFwbpKfjnlxkqS+OgkESdLexwfTJEmAgSBJahkIkiSgu9tOJ4WqmgF8GpgN7EszzMbKCS2qA1X1i8DfAfOTbJjoesZSVb0X+C1gH2B5kr+Y4JLGVPtv9Fqaf6M7gPMmy8+wqo4DPpLklKo6DPgssIvmrsQ/SbJzIusbC8M+42uA/0bzc9wKvDXJD8ezHnsIu/dm4JF2OI3TgasmuJ4x136hXEPz/MekUlWnACfQDJPyOuDQCS2oG2cA05OcAPxn4EMTXM+YqKp3A58CBtpFHwMubn8XpwBnT1RtY6XPZ/wE8I4kpwA3Ae8Z75oMhN37K+CSnvntI624F/sozVPi35/oQjqwgOYZmJuBW4BVE1tOJ74FTG9v9Z4FbJvgesbKPwK/0zP/a8Dd7fRkGdpm+Gc8J8k32+npwJbxLshA2I0kP0nyeFXNBP4auHiiaxpLVfU24OEkt090LR15Gc04Wm8AlgDXV9WUiS1pzP2E5nTRBmAFcOWEVjNGktzI08NtSpKhe+QfB/Yb/6rG1vDPmOT/AVTVCcAFwMfHuyYDYRRVdSjwFeAvk9ww0fWMsbcD86vqLuA1wHXtaLSTxSPA7UmeTBKav7gm23An76L5jIfTDDd/bVUNjLLN3qj3esGzGtpmb1RVb6TpsZ+Z5OHxPr4XlXejfXr6DuCCJF+e6HrGWpKTh6bbUFiS5AcTV9GYuxd4Z1V9DDgYeClNSEwmP+bnf2U+CswApk1cOZ35RlWdkuQumut5X5ngesZcO/Dn+cApSR6diBoMhN17H7A/cElVDV1LOD3JpLsAOxm1L146mWb8rKk0d6bsmOCyxtrHgU9X1T00d1K9b5IO+XIRsKKq9gH+geYU7qRRVdNoTvd9F7ipqgDuTvKB8azDoSskSYDXECRJLQNBkgQYCJKkloEgSQIMBElSy9tOpVZV/TLNUB7/guZ+/geA9yR5fIT1fxu4L8lkHPZDL0L2ECSgql4CrAT+a5JTkpwI3Af8j91s9k6a8YOkScHnECSgqv4NzROiFwxbvo5mALkbkqyuqtOAc2gGPry+bZsHvBtYSNPrvjrJNVV1UbvuduCrSd5TVR8EDqMZZ+kXaN49/rvA4cAfJFlXVe8AzqUZ6vnzSSbF+ER64bOHIDVeSTP65HDfBk4evjDJrcA3gbcCr6YZTuE4muG2j6yqo4Dfa+dPAF5VVa9vN38iyWk0QxyfkeQs4HLgnKo6EngjTcjMAxZW+9iq1DWvIUiN7wHH9ln+KuCrPfP9Rkst4P52WIyf0Yyf9AZgXZJtAO3QEq9u1/96+9/HgL9vp39MMy7+rwKvAIbGztqfpkeR5/CZpD1iD0Fq/C+akV+fCoWqWgw8TPMlf3C7+LU92+yk+R3aALy2qqZW1YyqupPmVNJxVTW9HXL75HYZNKeCRhLgIeDX2xelfJbmnQ5S5wwEiebdF8BZwMVVtaaq7qM5BfT7NG+1eldVfQk4pGeztcB1NAOSrQbW0Iywen2SB4AvtMvuBzYCX3wWdTxA0zu4t6q+RtND+d5YfEZpNF5UliQB9hAkSS0DQZIEGAiSpJaBIEkCDARJUstAkCQBBoIkqfX/AbzkVjZ3AaiWAAAAAElFTkSuQmCC\n",
"text/plain": [
""
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"twice.bar()\n",
"decorate_dice('Two dice')\n",
"twice.mean()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To add a constant to a distribution, you could construct a deterministic `Pmf`"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" probs | \n",
"
\n",
" \n",
" \n",
" \n",
" 2 | \n",
" 0.25 | \n",
"
\n",
" \n",
" 3 | \n",
" 0.25 | \n",
"
\n",
" \n",
" 4 | \n",
" 0.25 | \n",
"
\n",
" \n",
" 5 | \n",
" 0.25 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
"2 0.25\n",
"3 0.25\n",
"4 0.25\n",
"5 0.25\n",
"dtype: float64"
]
},
"execution_count": 53,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"const = Pmf.from_seq([1])\n",
"d4.add_dist(const)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"But `add_dist` also handles constants as a special case:"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" probs | \n",
"
\n",
" \n",
" \n",
" \n",
" 2 | \n",
" 0.25 | \n",
"
\n",
" \n",
" 3 | \n",
" 0.25 | \n",
"
\n",
" \n",
" 4 | \n",
" 0.25 | \n",
"
\n",
" \n",
" 5 | \n",
" 0.25 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
"2 0.25\n",
"3 0.25\n",
"4 0.25\n",
"5 0.25\n",
"dtype: float64"
]
},
"execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d4.add_dist(1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Other arithmetic operations are also implemented"
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" probs | \n",
"
\n",
" \n",
" \n",
" \n",
" -3 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" -2 | \n",
" 0.083333 | \n",
"
\n",
" \n",
" -1 | \n",
" 0.125000 | \n",
"
\n",
" \n",
" 0 | \n",
" 0.166667 | \n",
"
\n",
" \n",
" 1 | \n",
" 0.166667 | \n",
"
\n",
" \n",
" 2 | \n",
" 0.166667 | \n",
"
\n",
" \n",
" 3 | \n",
" 0.125000 | \n",
"
\n",
" \n",
" 4 | \n",
" 0.083333 | \n",
"
\n",
" \n",
" 5 | \n",
" 0.041667 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
"-3 0.041667\n",
"-2 0.083333\n",
"-1 0.125000\n",
" 0 0.166667\n",
" 1 0.166667\n",
" 2 0.166667\n",
" 3 0.125000\n",
" 4 0.083333\n",
" 5 0.041667\n",
"dtype: float64"
]
},
"execution_count": 55,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d6.sub_dist(d4)"
]
},
{
"cell_type": "code",
"execution_count": 56,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" probs | \n",
"
\n",
" \n",
" \n",
" \n",
" 1 | \n",
" 0.0625 | \n",
"
\n",
" \n",
" 2 | \n",
" 0.1250 | \n",
"
\n",
" \n",
" 3 | \n",
" 0.1250 | \n",
"
\n",
" \n",
" 4 | \n",
" 0.1875 | \n",
"
\n",
" \n",
" 6 | \n",
" 0.1250 | \n",
"
\n",
" \n",
" 8 | \n",
" 0.1250 | \n",
"
\n",
" \n",
" 9 | \n",
" 0.0625 | \n",
"
\n",
" \n",
" 12 | \n",
" 0.1250 | \n",
"
\n",
" \n",
" 16 | \n",
" 0.0625 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
"1 0.0625\n",
"2 0.1250\n",
"3 0.1250\n",
"4 0.1875\n",
"6 0.1250\n",
"8 0.1250\n",
"9 0.0625\n",
"12 0.1250\n",
"16 0.0625\n",
"dtype: float64"
]
},
"execution_count": 56,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d4.mul_dist(d4)"
]
},
{
"cell_type": "code",
"execution_count": 57,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" probs | \n",
"
\n",
" \n",
" \n",
" \n",
" 0.250000 | \n",
" 0.0625 | \n",
"
\n",
" \n",
" 0.333333 | \n",
" 0.0625 | \n",
"
\n",
" \n",
" 0.500000 | \n",
" 0.1250 | \n",
"
\n",
" \n",
" 0.666667 | \n",
" 0.0625 | \n",
"
\n",
" \n",
" 0.750000 | \n",
" 0.0625 | \n",
"
\n",
" \n",
" 1.000000 | \n",
" 0.2500 | \n",
"
\n",
" \n",
" 1.333333 | \n",
" 0.0625 | \n",
"
\n",
" \n",
" 1.500000 | \n",
" 0.0625 | \n",
"
\n",
" \n",
" 2.000000 | \n",
" 0.1250 | \n",
"
\n",
" \n",
" 3.000000 | \n",
" 0.0625 | \n",
"
\n",
" \n",
" 4.000000 | \n",
" 0.0625 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
"0.250000 0.0625\n",
"0.333333 0.0625\n",
"0.500000 0.1250\n",
"0.666667 0.0625\n",
"0.750000 0.0625\n",
"1.000000 0.2500\n",
"1.333333 0.0625\n",
"1.500000 0.0625\n",
"2.000000 0.1250\n",
"3.000000 0.0625\n",
"4.000000 0.0625\n",
"dtype: float64"
]
},
"execution_count": 57,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d4.div_dist(d4)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Comparison operators\n",
"\n",
"`Pmf` implements comparison operators that return probabilities.\n",
"\n",
"You can compare a `Pmf` to a scalar:"
]
},
{
"cell_type": "code",
"execution_count": 58,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.3333333333333333"
]
},
"execution_count": 58,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d6.lt_dist(3)"
]
},
{
"cell_type": "code",
"execution_count": 59,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.75"
]
},
"execution_count": 59,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d4.ge_dist(2)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Or compare `Pmf` objects:"
]
},
{
"cell_type": "code",
"execution_count": 60,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.25"
]
},
"execution_count": 60,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d4.gt_dist(d6)"
]
},
{
"cell_type": "code",
"execution_count": 61,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.41666666666666663"
]
},
"execution_count": 61,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d6.le_dist(d4)"
]
},
{
"cell_type": "code",
"execution_count": 62,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.16666666666666666"
]
},
"execution_count": 62,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d4.eq_dist(d6)"
]
},
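{
"cell_type": "markdown",
"metadata": {},
"source": [
"One way to compute such probabilities, assuming the two distributions are independent, is to enumerate all pairs of quantities and add up the probability of the pairs that satisfy the comparison. This is a sketch of the idea, not necessarily how `gt_dist` is implemented; `prob_gt` is a hypothetical helper.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"\n",
"def prob_gt(qs1, ps1, qs2, ps2):\n",
"    # Probability that a draw from (qs1, ps1) exceeds a draw from (qs2, ps2).\n",
"    greater = np.greater.outer(qs1, qs2)   # boolean grid: qs1[i] > qs2[j]\n",
"    joint = np.multiply.outer(ps1, ps2)    # probability of each pair\n",
"    return joint[greater].sum()\n",
"\n",
"\n",
"qs4, ps4 = np.arange(1, 5), np.full(4, 1 / 4)\n",
"qs6, ps6 = np.arange(1, 7), np.full(6, 1 / 6)\n",
"\n",
"print(prob_gt(qs4, ps4, qs6, ps6))\n",
"```\n",
"\n",
"Of the 24 equally likely pairs, 6 have the four-sided die showing the larger number, which agrees with the result of `d4.gt_dist(d6)` above."
]
},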
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Interestingly, this way of comparing distributions is [nontransitive]()."
]
},
{
"cell_type": "code",
"execution_count": 63,
"metadata": {},
"outputs": [],
"source": [
"A = Pmf.from_seq([2, 2, 4, 4, 9, 9])\n",
"B = Pmf.from_seq([1, 1, 6, 6, 8, 8])\n",
"C = Pmf.from_seq([3, 3, 5, 5, 7, 7])"
]
},
{
"cell_type": "code",
"execution_count": 64,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.5555555555555556"
]
},
"execution_count": 64,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"A.gt_dist(B)"
]
},
{
"cell_type": "code",
"execution_count": 65,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.5555555555555556"
]
},
"execution_count": 65,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"B.gt_dist(C)"
]
},
{
"cell_type": "code",
"execution_count": 66,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.5555555555555556"
]
},
"execution_count": 66,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"C.gt_dist(A)"
]
},
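{
"cell_type": "markdown",
"metadata": {},
"source": [
"The three results can be checked by brute-force enumeration of all 36 face pairs, using exact fractions. `prob_beats` is a hypothetical helper written for this check.\n",
"\n",
"```python\n",
"from fractions import Fraction\n",
"from itertools import product\n",
"\n",
"A = [2, 2, 4, 4, 9, 9]\n",
"B = [1, 1, 6, 6, 8, 8]\n",
"C = [3, 3, 5, 5, 7, 7]\n",
"\n",
"\n",
"def prob_beats(x, y):\n",
"    # Fraction of equally likely face pairs in which x shows the larger number.\n",
"    wins = sum(1 for a, b in product(x, y) if a > b)\n",
"    return Fraction(wins, len(x) * len(y))\n",
"\n",
"\n",
"print(prob_beats(A, B), prob_beats(B, C), prob_beats(C, A))\n",
"```\n",
"\n",
"Each die beats the next with probability 5/9, so there is no best die: `A` tends to beat `B`, `B` tends to beat `C`, and `C` tends to beat `A`."
]
},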
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Joint distributions\n",
"\n",
"For comments or questions about this section, see [this issue](https://github.com/AllenDowney/EmpyricalDistributions/issues/10).\n",
"\n",
"`Pmf.make_joint` takes two `Pmf` objects and makes their joint distribution, assuming independence."
]
},
{
"cell_type": "code",
"execution_count": 67,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" def make_joint(self, other, **options):\n",
" \"\"\"Make joint distribution (assuming independence).\n",
"\n",
" :param self:\n",
" :param other:\n",
" :param options: passed to Pmf constructor\n",
"\n",
" :return: new Pmf\n",
" \"\"\"\n",
" qs = pd.MultiIndex.from_product([self.qs, other.qs])\n",
" ps = np.multiply.outer(self.ps, other.ps).flatten()\n",
" return Pmf(ps, index=qs, **options)\n",
"\n"
]
}
],
"source": [
"psource(Pmf.make_joint)"
]
},
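{
"cell_type": "markdown",
"metadata": {},
"source": [
"The same construction can be sketched with plain pandas objects. The arrays below stand in for the `qs` and `ps` of a four-sided and a six-sided die; the result is an ordinary `pd.Series` rather than a `Pmf`, and all names are chosen for this illustration.\n",
"\n",
"```python\n",
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"qs4, ps4 = np.arange(1, 5), np.full(4, 1 / 4)\n",
"qs6, ps6 = np.arange(1, 7), np.full(6, 1 / 6)\n",
"\n",
"# All pairs of quantities, with the first level varying slowest.\n",
"index = pd.MultiIndex.from_product([qs4, qs6])\n",
"\n",
"# Under independence, the probability of a pair is the product of the probabilities.\n",
"# Flattening the outer product row by row matches the order of from_product.\n",
"joint = pd.Series(np.multiply.outer(ps4, ps6).flatten(), index=index)\n",
"\n",
"print(joint.loc[(1, 1)])\n",
"```\n",
"\n",
"With 4 x 6 = 24 equally likely pairs, every entry is 1/24."
]
},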
{
"cell_type": "code",
"execution_count": 68,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" probs | \n",
"
\n",
" \n",
" \n",
" \n",
" 1 | \n",
" 0.25 | \n",
"
\n",
" \n",
" 2 | \n",
" 0.25 | \n",
"
\n",
" \n",
" 3 | \n",
" 0.25 | \n",
"
\n",
" \n",
" 4 | \n",
" 0.25 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
"1 0.25\n",
"2 0.25\n",
"3 0.25\n",
"4 0.25\n",
"dtype: float64"
]
},
"execution_count": 68,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d4 = Pmf.from_seq(range(1,5))\n",
"d4"
]
},
{
"cell_type": "code",
"execution_count": 69,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" probs | \n",
"
\n",
" \n",
" \n",
" \n",
" 1 | \n",
" 0.166667 | \n",
"
\n",
" \n",
" 2 | \n",
" 0.166667 | \n",
"
\n",
" \n",
" 3 | \n",
" 0.166667 | \n",
"
\n",
" \n",
" 4 | \n",
" 0.166667 | \n",
"
\n",
" \n",
" 5 | \n",
" 0.166667 | \n",
"
\n",
" \n",
" 6 | \n",
" 0.166667 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
"1 0.166667\n",
"2 0.166667\n",
"3 0.166667\n",
"4 0.166667\n",
"5 0.166667\n",
"6 0.166667\n",
"dtype: float64"
]
},
"execution_count": 69,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"d6 = Pmf.from_seq(range(1,7))\n",
"d6"
]
},
{
"cell_type": "code",
"execution_count": 70,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" | \n",
" probs | \n",
"
\n",
" \n",
" \n",
" \n",
" 1 | \n",
" 1 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 2 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 3 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 4 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 5 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 6 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 2 | \n",
" 1 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 2 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 3 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 4 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 5 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 6 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 3 | \n",
" 1 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 2 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 3 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 4 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 5 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 6 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 4 | \n",
" 1 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 2 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 3 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 4 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 5 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 6 | \n",
" 0.041667 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
"1 1 0.041667\n",
" 2 0.041667\n",
" 3 0.041667\n",
" 4 0.041667\n",
" 5 0.041667\n",
" 6 0.041667\n",
"2 1 0.041667\n",
" 2 0.041667\n",
" 3 0.041667\n",
" 4 0.041667\n",
" 5 0.041667\n",
" 6 0.041667\n",
"3 1 0.041667\n",
" 2 0.041667\n",
" 3 0.041667\n",
" 4 0.041667\n",
" 5 0.041667\n",
" 6 0.041667\n",
"4 1 0.041667\n",
" 2 0.041667\n",
" 3 0.041667\n",
" 4 0.041667\n",
" 5 0.041667\n",
" 6 0.041667\n",
"dtype: float64"
]
},
"execution_count": 70,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"joint = Pmf.make_joint(d4, d6)\n",
"joint"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The result is a `Pmf` object that uses a MultiIndex to represent the values."
]
},
{
"cell_type": "code",
"execution_count": 71,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"MultiIndex(levels=[[1, 2, 3, 4], [1, 2, 3, 4, 5, 6]],\n",
" codes=[[0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3], [0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5]])"
]
},
"execution_count": 71,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"joint.index"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you ask for the `qs`, you get an array of pairs:"
]
},
{
"cell_type": "code",
"execution_count": 72,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 1), (2, 2),\n",
" (2, 3), (2, 4), (2, 5), (2, 6), (3, 1), (3, 2), (3, 3), (3, 4),\n",
" (3, 5), (3, 6), (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6)],\n",
" dtype=object)"
]
},
"execution_count": 72,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"joint.qs"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can select elements using tuples:"
]
},
{
"cell_type": "code",
"execution_count": 73,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.041666666666666664"
]
},
"execution_count": 73,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"joint[1,1]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can get unnnormalized conditional distributions by selecting on different axes:"
]
},
{
"cell_type": "code",
"execution_count": 74,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" probs | \n",
"
\n",
" \n",
" \n",
" \n",
" 1 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 2 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 3 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 4 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 5 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 6 | \n",
" 0.041667 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
"1 0.041667\n",
"2 0.041667\n",
"3 0.041667\n",
"4 0.041667\n",
"5 0.041667\n",
"6 0.041667\n",
"dtype: float64"
]
},
"execution_count": 74,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Pmf(joint[1])"
]
},
{
"cell_type": "code",
"execution_count": 75,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" probs | \n",
"
\n",
" \n",
" \n",
" \n",
" 1 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 2 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 3 | \n",
" 0.041667 | \n",
"
\n",
" \n",
" 4 | \n",
" 0.041667 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
"1 0.041667\n",
"2 0.041667\n",
"3 0.041667\n",
"4 0.041667\n",
"dtype: float64"
]
},
"execution_count": 75,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Pmf(joint.loc[:, 1])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"But `Pmf` also provides `conditional(i,j,val)` which returns the distribution along axis `i` conditioned on the value of axis `j`: "
]
},
{
"cell_type": "code",
"execution_count": 76,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1 0.25\n",
"2 0.25\n",
"3 0.25\n",
"4 0.25\n",
"dtype: float64"
]
},
"execution_count": 76,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"joint.conditional(0, 1, 1)"
]
},
{
"cell_type": "code",
"execution_count": 77,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1 0.166667\n",
"2 0.166667\n",
"3 0.166667\n",
"4 0.166667\n",
"5 0.166667\n",
"6 0.166667\n",
"dtype: float64"
]
},
"execution_count": 77,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"joint.conditional(1, 0, 1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It also provides `marginal(i)`, which returns the marginal distribution along axis `i`:"
]
},
{
"cell_type": "code",
"execution_count": 78,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1 0.25\n",
"2 0.25\n",
"3 0.25\n",
"4 0.25\n",
"dtype: float64"
]
},
"execution_count": 78,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"joint.marginal(0)"
]
},
{
"cell_type": "code",
"execution_count": 79,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1 0.166667\n",
"2 0.166667\n",
"3 0.166667\n",
"4 0.166667\n",
"5 0.166667\n",
"6 0.166667\n",
"dtype: float64"
]
},
"execution_count": 79,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"joint.marginal(1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The implementations of `conditional` and `marginal` are simple: each one loops over every item in the joint distribution. They could be made more efficient using pandas `MultiIndex` operations."
]
},
{
"cell_type": "code",
"execution_count": 80,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" def conditional(self, i, j, val, name=None):\n",
" \"\"\"Gets the conditional distribution of the indicated variable.\n",
"\n",
" Distribution of vs[i], conditioned on vs[j] = val.\n",
"\n",
" i: index of the variable we want\n",
" j: which variable is conditioned on\n",
" val: the value the jth variable has to have\n",
" name: string\n",
"\n",
" :return: Pmf\n",
" \"\"\"\n",
" # TODO: rewrite this using MultiIndex operations\n",
" pmf = Pmf(name=name)\n",
" for vs, p in self.items():\n",
" if vs[j] == val:\n",
" pmf[vs[i]] += p\n",
"\n",
" pmf.normalize()\n",
" return pmf\n",
"\n"
]
}
],
"source": [
"psource(Pmf.conditional)"
]
},
{
"cell_type": "code",
"execution_count": 81,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" def marginal(self, i, name=None):\n",
" \"\"\"Gets the marginal distribution of the indicated variable.\n",
"\n",
" i: index of the variable we want\n",
" name: string\n",
"\n",
" :return: Pmf\n",
" \"\"\"\n",
" # TODO: rewrite this using MultiIndex operations\n",
" pmf = Pmf(name=name)\n",
" for vs, p in self.items():\n",
" pmf[vs[i]] += p\n",
" return pmf\n",
"\n"
]
}
],
"source": [
"psource(Pmf.marginal)"
]
},
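{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough sketch of the `MultiIndex` rewrite that the `TODO` comments above point to, the loops could be replaced with `groupby(level=...)` and `xs`. This is not the `empiricaldist` implementation: the functions below are hypothetical helpers that operate on a plain `pd.Series`, and `joint_series` is an assumed stand-in for `joint`.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# Stand-in for `joint`: a uniform joint distribution of a 4-sided die\n",
"# (level 0) and a 6-sided die (level 1), stored in a plain Series.\n",
"index = pd.MultiIndex.from_product([range(1, 5), range(1, 7)])\n",
"joint_series = pd.Series(1 / 24, index=index)\n",
"\n",
"def marginal(series, i):\n",
"    # Sum the probabilities over every level except level i.\n",
"    return series.groupby(level=i).sum()\n",
"\n",
"def conditional(series, i, j, val):\n",
"    # Keep only the rows where level j equals val; xs drops level j.\n",
"    selected = series.xs(val, level=j)\n",
"    # After level j is dropped, level i moves down one position if i > j.\n",
"    pos = i if i < j else i - 1\n",
"    result = selected.groupby(level=pos).sum()\n",
"    # Normalize so the probabilities add up to 1.\n",
"    return result / result.sum()\n",
"\n",
"print(marginal(joint_series, 0))\n",
"print(conditional(joint_series, 1, 0, 1))\n",
"```\n",
"\n",
"Because these helpers return a plain `Series`, you would wrap the result in `Pmf(...)` to get a `Pmf` back."
]
},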
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here are some ways of iterating through a joint distribution."
]
},
{
"cell_type": "code",
"execution_count": 82,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(1, 1)\n",
"(1, 2)\n",
"(1, 3)\n",
"(1, 4)\n",
"(1, 5)\n",
"(1, 6)\n",
"(2, 1)\n",
"(2, 2)\n",
"(2, 3)\n",
"(2, 4)\n",
"(2, 5)\n",
"(2, 6)\n",
"(3, 1)\n",
"(3, 2)\n",
"(3, 3)\n",
"(3, 4)\n",
"(3, 5)\n",
"(3, 6)\n",
"(4, 1)\n",
"(4, 2)\n",
"(4, 3)\n",
"(4, 4)\n",
"(4, 5)\n",
"(4, 6)\n"
]
}
],
"source": [
"for q in joint.qs:\n",
" print(q)"
]
},
{
"cell_type": "code",
"execution_count": 83,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0.041666666666666664\n",
"0.041666666666666664\n",
"0.041666666666666664\n",
"0.041666666666666664\n",
"0.041666666666666664\n",
"0.041666666666666664\n",
"0.041666666666666664\n",
"0.041666666666666664\n",
"0.041666666666666664\n",
"0.041666666666666664\n",
"0.041666666666666664\n",
"0.041666666666666664\n",
"0.041666666666666664\n",
"0.041666666666666664\n",
"0.041666666666666664\n",
"0.041666666666666664\n",
"0.041666666666666664\n",
"0.041666666666666664\n",
"0.041666666666666664\n",
"0.041666666666666664\n",
"0.041666666666666664\n",
"0.041666666666666664\n",
"0.041666666666666664\n",
"0.041666666666666664\n"
]
}
],
"source": [
"for p in joint.ps:\n",
" print(p)"
]
},
{
"cell_type": "code",
"execution_count": 84,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(1, 1) 0.041666666666666664\n",
"(1, 2) 0.041666666666666664\n",
"(1, 3) 0.041666666666666664\n",
"(1, 4) 0.041666666666666664\n",
"(1, 5) 0.041666666666666664\n",
"(1, 6) 0.041666666666666664\n",
"(2, 1) 0.041666666666666664\n",
"(2, 2) 0.041666666666666664\n",
"(2, 3) 0.041666666666666664\n",
"(2, 4) 0.041666666666666664\n",
"(2, 5) 0.041666666666666664\n",
"(2, 6) 0.041666666666666664\n",
"(3, 1) 0.041666666666666664\n",
"(3, 2) 0.041666666666666664\n",
"(3, 3) 0.041666666666666664\n",
"(3, 4) 0.041666666666666664\n",
"(3, 5) 0.041666666666666664\n",
"(3, 6) 0.041666666666666664\n",
"(4, 1) 0.041666666666666664\n",
"(4, 2) 0.041666666666666664\n",
"(4, 3) 0.041666666666666664\n",
"(4, 4) 0.041666666666666664\n",
"(4, 5) 0.041666666666666664\n",
"(4, 6) 0.041666666666666664\n"
]
}
],
"source": [
"for q, p in joint.items():\n",
" print(q, p)"
]
},
{
"cell_type": "code",
"execution_count": 85,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1 1 0.041666666666666664\n",
"1 2 0.041666666666666664\n",
"1 3 0.041666666666666664\n",
"1 4 0.041666666666666664\n",
"1 5 0.041666666666666664\n",
"1 6 0.041666666666666664\n",
"2 1 0.041666666666666664\n",
"2 2 0.041666666666666664\n",
"2 3 0.041666666666666664\n",
"2 4 0.041666666666666664\n",
"2 5 0.041666666666666664\n",
"2 6 0.041666666666666664\n",
"3 1 0.041666666666666664\n",
"3 2 0.041666666666666664\n",
"3 3 0.041666666666666664\n",
"3 4 0.041666666666666664\n",
"3 5 0.041666666666666664\n",
"3 6 0.041666666666666664\n",
"4 1 0.041666666666666664\n",
"4 2 0.041666666666666664\n",
"4 3 0.041666666666666664\n",
"4 4 0.041666666666666664\n",
"4 5 0.041666666666666664\n",
"4 6 0.041666666666666664\n"
]
}
],
"source": [
"for (q1, q2), p in joint.items():\n",
" print(q1, q2, p)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}