{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "### Implementing PMFs\n", "\n", "Copyright 2019 Allen Downey\n", "\n", "BSD 3-clause license: https://opensource.org/licenses/BSD-3-Clause" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "import numpy as np\n", "import pandas as pd\n", "\n", "import seaborn as sns\n", "sns.set_style('white')\n", "\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import inspect\n", "\n", "def psource(obj):\n", "    \"\"\"Prints the source code for a given object.\n", "\n", "    obj: function or method object\n", "    \"\"\"\n", "    print(inspect.getsource(obj))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Constructor\n", "\n", "For comments or questions about this section, see [this issue](https://github.com/AllenDowney/EmpyricalDistributions/issues/1).\n", "\n", "The `Pmf` class inherits from `pd.Series`. The `__init__` method is essentially unchanged, but it includes a workaround for what I think is bad behavior: `Series()` and `Series([])` yield different results."
] }, { "cell_type": "code", "execution_count": 3, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "    def __init__(self, *args, **kwargs):\n", "        \"\"\"Initialize a Pmf.\n", "\n", "        Note: this cleans up a weird Series behavior, which is\n", "        that Series() and Series([]) yield different results.\n", "        See: https://github.com/pandas-dev/pandas/issues/16737\n", "        \"\"\"\n", "        if args or ('index' in kwargs):\n", "            super().__init__(*args, **kwargs)\n", "        else:\n", "            underride(kwargs, dtype=np.float64)\n", "            super().__init__([], **kwargs)\n", "\n" ] } ], "source": [ "from empiricaldist import Pmf\n", "\n", "psource(Pmf.__init__)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can create an empty `Pmf` and then add elements.\n", "\n", "Here's a `Pmf` that represents a six-sided die." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "d6 = Pmf()" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "for x in [1,2,3,4,5,6]:\n", "    d6[x] = 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Initially the probabilities don't add up to 1." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>probs</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1</th><td>1</td></tr>\n",
"    <tr><th>2</th><td>1</td></tr>\n",
"    <tr><th>3</th><td>1</td></tr>\n",
"    <tr><th>4</th><td>1</td></tr>\n",
"    <tr><th>5</th><td>1</td></tr>\n",
"    <tr><th>6</th><td>1</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ "1 1\n", "2 1\n", "3 1\n", "4 1\n", "5 1\n", "6 1\n", "dtype: int64" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`normalize` adds up the probabilities and divides each one by the total. The return value is the total probability before normalizing." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "    def normalize(self):\n", "        \"\"\"Make the probabilities add up to 1 (modifies self).\n", "\n", "        :return: normalizing constant\n", "        \"\"\"\n", "        total = self.sum()\n", "        self /= total\n", "        return total\n", "\n" ] } ], "source": [ "psource(Pmf.normalize)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "6" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.normalize()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now the `Pmf` is normalized." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>probs</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1</th><td>0.166667</td></tr>\n",
"    <tr><th>2</th><td>0.166667</td></tr>\n",
"    <tr><th>3</th><td>0.166667</td></tr>\n",
"    <tr><th>4</th><td>0.166667</td></tr>\n",
"    <tr><th>5</th><td>0.166667</td></tr>\n",
"    <tr><th>6</th><td>0.166667</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ "1 0.166667\n", "2 0.166667\n", "3 0.166667\n", "4 0.166667\n", "5 0.166667\n", "6 0.166667\n", "dtype: float64" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Properties\n", "\n", "For comments or questions about this section, see [this issue](https://github.com/AllenDowney/EmpyricalDistributions/issues/2).\n", "\n", "In a `Pmf`, the index contains the quantities (`qs`) and the values contain the probabilities (`ps`).\n", "\n", "These attributes are available as properties that return NumPy arrays (the same semantics as the pandas `values` property)." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([1, 2, 3, 4, 5, 6])" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.qs" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0.16666667, 0.16666667, 0.16666667, 0.16666667, 0.16666667,\n", "       0.16666667])" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.ps" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Sharing\n", "\n", "For comments or questions about this section, see [this issue](https://github.com/AllenDowney/EmpyricalDistributions/issues/3).\n", "\n", "Because `Pmf` is a `Series`, you can initialize it with any type `Series.__init__` can handle.\n", "\n", "Here's an example with a dictionary." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
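<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>probs</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>a</th><td>1</td></tr>\n",
"    <tr><th>b</th><td>2</td></tr>\n",
"    <tr><th>c</th><td>3</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>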
" ], "text/plain": [ "a 1\n", "b 2\n", "c 3\n", "dtype: int64" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d = dict(a=1, b=2, c=3)\n", "pmf = Pmf(d)\n", "pmf" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here's an example with two lists." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
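<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>probs</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1</th><td>0.25</td></tr>\n",
"    <tr><th>2</th><td>0.25</td></tr>\n",
"    <tr><th>3</th><td>0.25</td></tr>\n",
"    <tr><th>4</th><td>0.25</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>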
" ], "text/plain": [ "1 0.25\n", "2 0.25\n", "3 0.25\n", "4 0.25\n", "dtype: float64" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "qs = [1,2,3,4]\n", "ps = [0.25, 0.25, 0.25, 0.25]\n", "d4 = Pmf(ps, index=qs)\n", "d4" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can copy a `Pmf` like this." ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>probs</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1</th><td>0.166667</td></tr>\n",
"    <tr><th>2</th><td>0.166667</td></tr>\n",
"    <tr><th>3</th><td>0.166667</td></tr>\n",
"    <tr><th>4</th><td>0.166667</td></tr>\n",
"    <tr><th>5</th><td>0.166667</td></tr>\n",
"    <tr><th>6</th><td>0.166667</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ "1 0.166667\n", "2 0.166667\n", "3 0.166667\n", "4 0.166667\n", "5 0.166667\n", "6 0.166667\n", "dtype: float64" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6_copy = Pmf(d6)\n", "d6_copy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, you have to be careful about sharing. In this example, the copy shares its index and probability array with the original:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.index is d6_copy.index" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.ps is d6_copy.ps" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can avoid sharing with `copy=True`." ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>probs</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1</th><td>0.166667</td></tr>\n",
"    <tr><th>2</th><td>0.166667</td></tr>\n",
"    <tr><th>3</th><td>0.166667</td></tr>\n",
"    <tr><th>4</th><td>0.166667</td></tr>\n",
"    <tr><th>5</th><td>0.166667</td></tr>\n",
"    <tr><th>6</th><td>0.166667</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ "1 0.166667\n", "2 0.166667\n", "3 0.166667\n", "4 0.166667\n", "5 0.166667\n", "6 0.166667\n", "dtype: float64" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6_copy = Pmf(d6, copy=True)\n", "d6_copy" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.index is d6_copy.index" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.ps is d6_copy.ps" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or by calling `copy` explicitly." ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>probs</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1</th><td>0.25</td></tr>\n",
"    <tr><th>2</th><td>0.25</td></tr>\n",
"    <tr><th>3</th><td>0.25</td></tr>\n",
"    <tr><th>4</th><td>0.25</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ "1 0.25\n", "2 0.25\n", "3 0.25\n", "4 0.25\n", "dtype: float64" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d4_copy = d4.copy()\n", "d4_copy" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d4.index is d4_copy.index" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d4.ps is d4_copy.ps" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Displaying PMFs\n", "\n", "For comments or questions about this section, see [this issue](https://github.com/AllenDowney/EmpyricalDistributions/issues/4).\n", "\n", "`Pmf` provides `_repr_html_`, so it looks good when displayed in a notebook." ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "    def _repr_html_(self):\n", "        \"\"\"Returns an HTML representation of the series.\n", "\n", "        Mostly used for Jupyter notebooks.\n", "        \"\"\"\n", "        df = pd.DataFrame(dict(probs=self))\n", "        return df._repr_html_()\n", "\n" ] } ], "source": [ "psource(Pmf._repr_html_)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`Pmf` provides `bar`, which plots the Pmf as a bar chart."
] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "    def bar(self, **options):\n", "        \"\"\"Makes a bar plot.\n", "\n", "        options: passed to plt.bar\n", "        \"\"\"\n", "        underride(options, label=self.name)\n", "        plt.bar(self.qs, self.ps, **options)\n", "\n" ] } ], "source": [ "psource(Pmf.bar)" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [], "source": [ "def decorate_dice(title):\n", "    \"\"\"Labels the axes.\n", "    \n", "    title: string\n", "    \"\"\"\n", "    plt.xlabel('Outcome')\n", "    plt.ylabel('PMF')\n", "    plt.title(title)" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "[Figure: bar chart of d6, titled 'One die'; x-axis 'Outcome', y-axis 'PMF']" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "d6.bar()\n", "decorate_dice('One die')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`Pmf` inherits `plot` from `Series`." ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "[Figure: line plot of d6, titled 'One die'; x-axis 'Outcome', y-axis 'PMF']" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "d6.plot()\n", "decorate_dice('One die')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Make Pmf from sequence\n", "\n", "For comments or questions about this section, see [this issue](https://github.com/AllenDowney/EmpyricalDistributions/issues/5).\n", "\n", "\n", "The following static method makes a `Pmf` object from a sequence of values." ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "    @staticmethod\n", "    def from_seq(seq, normalize=True, sort=True, **options):\n", "        \"\"\"Make a PMF from a sequence of values.\n", "\n", "        seq: any kind of sequence\n", "        normalize: whether to normalize the Pmf, default True\n", "        sort: whether to sort the Pmf by values, default True\n", "        options: passed to the pd.Series constructor\n", "\n", "        :return: Pmf object\n", "        \"\"\"\n", "        series = pd.Series(seq).value_counts(sort=False)\n", "\n", "        options[\"copy\"] = False\n", "        pmf = Pmf(series, **options)\n", "\n", "        if sort:\n", "            pmf.sort_index(inplace=True)\n", "\n", "        if normalize:\n", "            pmf.normalize()\n", "\n", "        return pmf\n", "\n" ] } ], "source": [ "psource(Pmf.from_seq)" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
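<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>probs</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>a</th><td>0.2</td></tr>\n",
"    <tr><th>e</th><td>0.2</td></tr>\n",
"    <tr><th>l</th><td>0.4</td></tr>\n",
"    <tr><th>n</th><td>0.2</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>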
" ], "text/plain": [ "a 0.2\n", "e 0.2\n", "l 0.4\n", "n 0.2\n", "dtype: float64" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pmf = Pmf.from_seq(list('allen'))\n", "pmf" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>probs</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1</th><td>0.2</td></tr>\n",
"    <tr><th>2</th><td>0.4</td></tr>\n",
"    <tr><th>3</th><td>0.2</td></tr>\n",
"    <tr><th>5</th><td>0.2</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ "1 0.2\n", "2 0.4\n", "3 0.2\n", "5 0.2\n", "dtype: float64" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pmf = Pmf.from_seq(np.array([1, 2, 2, 3, 5]))\n", "pmf" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Selection\n", "\n", "For comments or questions about this section, see [this issue](https://github.com/AllenDowney/EmpyricalDistributions/issues/6).\n", "\n", "`Pmf` overrides `__getitem__` to return 0 for values that are not in the distribution." ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "    def __getitem__(self, qs):\n", "        \"\"\"Look up qs and return ps.\"\"\"\n", "        try:\n", "            return super().__getitem__(qs)\n", "        except (KeyError, ValueError, IndexError):\n", "            return 0\n", "\n" ] } ], "source": [ "psource(Pmf.__getitem__)" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.16666666666666666" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6[1]" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.16666666666666666" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6[6]" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6[7]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`Pmf` objects are mutable, but after you modify one, the result is generally no longer normalized." ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "1 0.166667\n", "2 0.166667\n", "3 0.166667\n", "4 0.166667\n", "5 0.166667\n", "6 0.166667\n", "7 0.166667\n", "dtype: float64" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6[7] = 1/6\n", "d6" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1.1666666666666665" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.sum()" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1.1666666666666665" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.normalize()" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1.0000000000000002" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.sum()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Statistics\n", "\n", "For comments or questions about this section, see [this issue](https://github.com/AllenDowney/EmpyricalDistributions/issues/7).\n", "\n", "`Pmf` overrides the statistics methods to compute `mean`, `median`, etc.\n", "\n", "These functions only work correctly if the `Pmf` is normalized." 
] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " def mean(self):\n", " \"\"\"Computes expected value.\n", "\n", " :return: float\n", " \"\"\"\n", " # TODO: error if not normalized\n", " # TODO: error if the quantities are not numeric\n", " return np.sum(self.ps * self.qs)\n", "\n" ] } ], "source": [ "psource(Pmf.mean)" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "4.000000000000001" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.mean()" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " def var(self):\n", " \"\"\"Variance of a PMF.\n", "\n", " :return: float\n", " \"\"\"\n", " m = self.mean()\n", " d = self.qs - m\n", " return np.sum(d ** 2 * self.ps)\n", "\n" ] } ], "source": [ "psource(Pmf.var)" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "4.0" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.var()" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " def std(self):\n", " \"\"\"Standard deviation of a PMF.\n", "\n", " :return: float\n", " \"\"\"\n", " return np.sqrt(self.var())\n", "\n" ] } ], "source": [ "psource(Pmf.std)" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2.0" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.std()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Sampling\n", "\n", "For comments or questions about this section, see [this issue](https://github.com/AllenDowney/EmpyricalDistributions/issues/8).\n", "\n", "`choice` chooses random values from the `Pmf`, following the API of `np.random.choice`." ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " def choice(self, *args, **kwargs):\n", " \"\"\"Makes a random sample.\n", "\n", " Uses the probabilities as weights unless `p` is provided.\n", "\n", " args: same as np.random.choice\n", " kwargs: same as np.random.choice\n", "\n", " :return: NumPy array\n", " \"\"\"\n", " underride(kwargs, p=self.ps)\n", " return np.random.choice(self.qs, *args, **kwargs)\n", "\n" ] } ], "source": [ "psource(Pmf.choice)" ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([3, 5, 2, 6, 2, 5, 3, 4, 4, 3])" ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.choice(size=10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`sample` chooses random values from the `Pmf`, following the API of `pd.Series.sample`, but returns a NumPy array rather than a `Series`." ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " def sample(self, *args, **kwargs):\n", " \"\"\"Makes a random sample.\n", "\n", " Uses the probabilities as weights unless `weights` is provided.\n", "\n", " This function returns an array containing a sample of the quantities in this Pmf,\n", " which is different from Series.sample, which returns a Series with a sample of\n", " the rows in the original Series.\n", "\n", " args: same as Series.sample\n", " options: same as Series.sample\n", "\n", " :return: NumPy array\n", " \"\"\"\n", " series = pd.Series(self.qs)\n", " underride(kwargs, weights=self.ps)\n", " sample = series.sample(*args, **kwargs)\n", " return sample.values\n", "\n" ] } ], "source": [ "psource(Pmf.sample)" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([2, 5, 5, 2, 1, 5, 7, 7, 7, 6])" ] }, 
"execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.sample(n=10, replace=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Arithmetic\n", "\n", "For comments or questions about this section, see [this issue](https://github.com/AllenDowney/EmpyricalDistributions/issues/9).\n", "\n", "`Pmf` provides `add_dist`, which computes the distribution of the sum.\n", "\n", "The implementation uses outer products to compute the convolution of the two distributions." ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " def add_dist(self, x):\n", " \"\"\"Computes the Pmf of the sum of values drawn from self and x.\n", "\n", " x: Distribution, scalar, or sequence\n", "\n", " :return: new Pmf\n", " \"\"\"\n", " if isinstance(x, Distribution):\n", " return self.convolve_dist(x, np.add.outer)\n", " else:\n", " return Pmf(self.ps, index=self.qs + x)\n", "\n" ] } ], "source": [ "psource(Pmf.add_dist)" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " def convolve_dist(self, dist, ufunc):\n", " \"\"\"Convolve two distributions.\n", "\n", " dist: Distribution\n", " ufunc: elementwise function for arrays\n", "\n", " :return: new Pmf\n", " \"\"\"\n", " if not isinstance(dist, Pmf):\n", " dist = dist.make_pmf()\n", "\n", " qs = ufunc(self.qs, dist.qs).flatten()\n", " ps = np.multiply.outer(self.ps, dist.ps).flatten()\n", " series = pd.Series(ps).groupby(qs).sum()\n", "\n", " return Pmf(series)\n", "\n" ] } ], "source": [ "psource(Pmf.convolve_dist)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here's the distribution of the sum of two dice." ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "2 0.027778\n", "3 0.055556\n", "4 0.083333\n", "5 0.111111\n", "6 0.138889\n", "7 0.166667\n", "8 0.138889\n", "9 0.111111\n", "10 0.083333\n", "11 0.055556\n", "12 0.027778\n", "dtype: float64" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6 = Pmf.from_seq([1,2,3,4,5,6])\n", "\n", "twice = d6.add_dist(d6)\n", "twice" ] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "6.999999999999998" ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYQAAAESCAYAAAD9gqKNAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjAsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+17YcXAAAWQUlEQVR4nO3de7RedX3n8XducFyuhEVHCpRBUxfyLViIyzJAICDtkIaLlLQdK8VLjQ0l0yKOYUajwujMoOBUtDJMkMaqUGEcW8AJYQigFYSkgbEqK9Dm45wuo6s6WgRJUEnIbf7Y++DD4Tk5BM4+hxzer3/Ye//25ftwcp7P+e3Lb0/ZtWsXkiRNnegCJEkvDAaCJAkwECRJLQNBkgQYCJKkloEgSQJg+kQXIE2EqroSOLmdPRL4NvBEOz83yRN9N3z+x/33wK8meVtVfQr4fJIvdXEsaU8ZCHpRSnLh0HRVbQTelORr41zD4vE8njQaA0Eapqr+DHg8ySVVdTDwfeA3knylqt4MnJXkjVV1CfD7wHbgW8AFSX4wbF8zgCuB+cA/Az8ENrVtdwFXJfnrqno9cCnNadyfAkuSPFBVJwAfAV4K7AD+U5JVHf8v0IuU1xCkZ7oJOL2dPg34Ac0XOsBvATdW1aJ2nX+V5GjgQeCzffb1x8DhNKel5gMvH75CVR0IfA5Y1O7rT4HLq2p/4DPAW5K8FjgbuLqqnrEPaSwYCNIz3Qv8y/aL+jSav9znV9U+wOuA/00TBp9J8tN2m08A/7pdp9epwA1JnmzXvb7P8U4EHkzyDYAkNyU5HZgLHAx8saq+2R53F3D0GH5W6SmeMpKGSbKzqlYBZwDHAW8B3gu8AVib5CdVNY3my3nIVJrfpyl9dtm7bHuf9u29+6qqKcBRwDTgH5Ic19P2S8DDz+VzSaOxhyD1dxPwbmB9kieBvwEuA25s21cDb6+ql7bzFwJfTbJ12H5uA95aVQNVNQC8sc+x7gOOqKpXt/Nn05xCWge8qqpOBqiq1wD/FzhkLD6gNJw9BKm/LwG/BFzdzt9O82V+Szv/F8ChwP1VNRUYBN7UZz/XAIfRXGN4hOYL/WmS/LCq3gRcW1XTgc3AOUkerqrfBf60DZOpNNcTNo7NR5SeborDX0uSwFNGkqSWgSBJAgwESVLLQJAkAXv5XUbHHXfcrkMO8Q48SdoTDz300I+SHDB8+V4dCIcccgg33XTTRJchSXuVqvpOv+WeMpIkAQaCJKllIEiSAANBktQyECRJgIEgSWoZCJIkwECQJLUMBEkSYCBIz9qWbTv2qv1Ke2qvHrpCGk8DM6Yxe9mtY77fjZefOeb7lJ4LewiSJMBAkCS1DARJ
EtDRNYSqmgosB+YAW4HFSQaHrXMAsBY4KsmWqpoGfAw4BtgX+GCSVV3UJ0l6pq56CAuBgSRzgWXAFb2NVbUAuAM4sGfxW4AZSU4EzgYO66g2SVIfXQXCPGA1QJJ1NH/199oJnAo82rNsAfBPVXUrsAK4paPaJEl9dBUIs4BNPfM7quqp01NJ7kzyyLBtXga8Cng98BHgMx3VJknqo6tA2AzM7D1Oku2jbPMIsCrJriR3A4d3VJskqY+uAmENcAZAVR0PrH8W29zbs80c4Lsd1SZJ6qOrJ5VvBuZX1VpgCrCoqpYCg0lWjrDNCuDqqlrXbrOko9okSX10EghJdvLML/QNfdab3TO9FXh7F/VIkkbng2mSJMBAkCS1DARJEmAgSJJaBoIkCTAQJEktA0GSBBgIkqSWgSBJAgwESVLLQJAkAQaCJKllIEiSAANBktQyECRJgIEgSWoZCJIkoKM3plXVVGA5MAfYCixOMjhsnQOAtcBRSbb0LP8V4D7gwN7lkqRuddVDWAgMJJkLLAOu6G2sqgXAHcCBw5bPatfd2lFdkqQRdBUI84DVAEnWAccMa98JnAo8OrSgqqYAfw68D/hZR3VJkkbQVSDMAjb1zO+oqqdOTyW5M8kjw7b5AHBrkgc6qkmStBtdBcJmYGbvcZJsH2WbNwN/WFV3AQfRnFKSRrRl2469ar97qss6XiifUS8snVxUBtYAZwFfqKrjgfWjbZDksKHpqtoI/GZHtWmSGJgxjdnLbh3z/W68/Mwx3+dz0dXngxfOZ9QLS1eBcDMwv6rWAlOARVW1FBhMsrKjY0qSnodOAiHJTmDJsMUb+qw3e4Tt+y6XJHXHB9MkSYCBIElqGQiSJMBAkCS1DARJEmAgSJJaBoIkCTAQJEktA0GSBBgIkqSWgSBJAgwESVLLQJAkAQaCJKllIEiSAANBktQyECRJQEdvTKuqqcByYA6wFVicZHDYOgcAa4Gjkmypqv2AzwGzgH2ApUn+tov6JEnP1FUPYSEwkGQusAy4orexqhYAdwAH9ixeCnw5yeuAtwH/vaPaJEl9dBUI84DVAEnWAccMa98JnAo82rPs48A17fR0YEtHtUmS+ujklBHNaZ9NPfM7qmp6ku0ASe4EqKqnVkjyWLvsIJpTR/+uo9okSX101UPYDMzsPc5QGOxOVR0FfBl4X5K7O6pNktRHV4GwBjgDoKqOB9aPtkFVHQn8FXBukts6qkuSNIKuThndDMyvqrXAFGBRVS0FBpOsHGGby4AB4BPtqaRNSc7uqD5J0jCdBEKSncCSYYs39Flvds+0X/6SNIF8ME2SBBgIkqSWgSBJAgwESVLLQJAkAQaCJKllIGhMbNm2Y6/ct37On6G6ejBNLzIDM6Yxe9mtnex74+VndrJfPZ0/Q9lDkCQBBoIkqWUgSJIAA0GS1DIQJEmAgSBJahkIkiTAQJAktQwESRLQ0ZPKVTUVWA7MAbYCi5MMDlvnAGAtcFSSLVX1EuBzwC8CjwN/kOThLuqTJD1TVz2EhcBAkrnAMuCK3saqWgDcARzYs/jfAuuTnARcB1zcUW2SpD66CoR5wGqAJOuAY4a17wROBR7ttw1wW9suSRonXQXCLGBTz/yOqnrq9FSSO5M8spttHgf266g2SVIfXQXCZmBm73GSbN+DbWYCj3VRmCSpv64CYQ1wBkBVHQ+s35NtgNOBe7opTZLUT1fvQ7gZmF9Va4EpwKKqWgoMJlk5wjZXA9dW1b3Ak8C5HdUmSeqjk0BIshNYMmzxhj7rze6Z/hnwhi7qkSSNzgfTJEmAgSBJau02EKrq4p7pg7svR5I0UUbrIfxGz/T1XRYiSZpYowXClBGmJUmTzGiBsGuEaUnSJDPabae/1vMswZE907uSnNB5dZKkcTNaIBw9LlVIkibcaIHwit20fWcsC5EkTazRAuEu4B+B/9POD11Y3gV8taOaJEkTYLRAOIZmTKHXAn8DXJ/k251XJUkad7sNhCRfB75e
VVNonkm4uKoOAlYmuWY8CpQkjY9nNXRFkl007z/+UrvN4i6LkiSNv932EKpqBs27Cc4FDgdWAu9M8q1xqE2SNI5G6yH8M3AZ8CDwXppewuyq+s2uC5Mkja/RLip/kebdxkcDr+Tpdxnd0WFdep62bNvBwIxpe92+9eLhv9EXntEC4e+Ai4AdwAVJVndfksbCwIxpzF52ayf73nj5mZ3sVy8u/ht94RktEIauHewH/CXwrAKhqqYCy4E5wFZgcZLBnvbzgPOB7cClSVZV1cvbY0wBHgXObd+iJkkaB6NdQ9iSZFuSHwH77MF+FwIDSeYCy4Arhhra21YvBE4EFgCXVdW+wLuA/5nkZOAh4A/34HiSpOdpT96YtifDX8+j7U0kWUfzgNuQY4E1SbYm2QQM0lyj+Cawf7vOLGDbHhxPkvQ8jXbK6NVVdQNNGAxNA5Dk3N1sNwvY1DO/o6qmJ9nep+1xmlNS/wRcXlXnAvsCH3zWn0KS9LyNFgi/1zP9yT3Y72ZgZs/81DYM+rXNBB4D/hx4W5Lbq+pM4DrAK0OSNE5GG7ri7ue43zXAWcAXqup4YH1P2/3Ah6pqgKYncATNcw4/5uc9h+/z89NHkqRxMFoP4bm6GZjf80KdRVW1FBhMsrKqrgTuobmG8f4kW6rqHcBVVTWt3eZPOqpNktRHJ4GQZCewZNjiDT3tK4AVw7b5e5oB9CRJE2BP7jKSJE1iBoIkCTAQJEktA0GSBBgIkqSWgSBJAgwESVLLQJAkAQaCJKllIEiSAANBktQyECRJgIEgSWoZCJIkwECQJLUMBEkSYCBIklqdvDGtqqYCy4E5wFZgcZLBnvbzgPOB7cClSVZV1UuBq4FfBvYB3pHk/i7qkyQ9U1c9hIXAQJK5wDLgiqGGqjoIuBA4EVgAXFZV+wL/AXgwyUnAeUB1VJskqY+uAmEesBogyTrgmJ62Y4E1SbYm2QQMAkfThMOTVXU7cAlwe0e1SZL66CoQZgGbeuZ3VNX0EdoeB/YDXgbsn2QBcAvw0Y5qkyT10VUgbAZm9h4nyfYR2mYCjwGPACvbZbfw9F6FJKljXQXCGuAMgKo6Hljf03Y/cFJVDVTVfsARwIPAvUPbACcDD3VUmySpj07uMgJuBuZX1VpgCrCoqpYCg0lWVtWVwD00gfT+JFuq6sPAp6rqb4FtwFs7qk2S1EcngZBkJ7Bk2OINPe0rgBXDtnkU+J0u6pEkjc4H08bJlm079sp9S5NJV78rk+V3sKtTRhpmYMY0Zi+7tZN9b7z8zE72K002Xf0eTpbfQXsIkiTAQJAktQwESRJgIEiSWgaCJAkwECRJLQNBkgQYCJKkloEgSQIMBElSy0CQJAEGgiSpZSBIkgADQZLU6mT466qaCiwH5gBbgcVJBnvazwPOB7YDlyZZ1dN2MnB9kkO7qE2S1F9XPYSFwECSucAy4Iqhhqo6CLgQOBFYAFxWVfu2bYcCFwEzOqpLkjSCrgJhHrAaIMk64JietmOBNUm2JtkEDAJHV9UA8EngjzuqSZK0G10FwixgU8/8jqqaPkLb48B+wFXAR5N8r6OaJEm70VUgbAZm9h4nyfYR2mYCTwInAR+oqruAX6iqz3dUmySpj67eqbwGOAv4QlUdD6zvabsf+FB7imhf4Ajg/iQ1tEJV/SDJOR3VJknqo6tAuBmYX1VrgSnAoqpaCgwmWVlVVwL30PRQ3p9kS0d1SJKepU4CIclOYMmwxRt62lcAK3az/UFd1CVJGpkPpkmSAANBktQyECRJgIEgSWoZCJIkwECQJLUMBEkSYCBIkloGgiQJMBAkSa0XbSBs2bZjr9qvpL3P3vY909Xgdi94AzOmMXvZrWO+342Xnznm+5S0d9rbvmdetD0ESdLTGQiSJMBAkCS1DARJEmAgSJJandxlVFVTgeXAHGArsDjJYE/7ecD5wHbg0iSrqurlwKfbmqYAf5QkXdQnSXqmrnoIC4GBJHOBZcAVQw1VdRBwIXAi
sAC4rKr2Bf4LcFWSU4APA5d1VJskqY+uAmEesBogyTrgmJ62Y4E1SbYm2QQMAkcDFwFDN+xOB7Z0VJskqY+uHkybBWzqmd9RVdOTbO/T9jiwX5IfAVRVAR+l6WVIksZJVz2EzcDM3uO0YdCvbSbwGEBV/TrwReAtXj+QpPHVVSCsAc4AqKrjgfU9bfcDJ1XVQFXtBxwBPNiGwSeA05J8raO6JEkj6OqU0c3A/KpaS3PH0KKqWgoMJllZVVcC99AE0vuTbKmqPwP2Aa5tzhqRJOd3VJ8kaZhOAiHJTmDJsMUbetpXACuGbTOni1okSc+OD6ZJkgADQZLUMhAkSYCBIElqGQiSJMBAkCS1DARJEmAgSJJaBoIkCTAQJEktA0GSBBgIkqSWgSBJAgwESVLLQJAkAQaCJKllIEiSgI7emFZVU4HlwBxgK7A4yWBP+3nA+cB24NIkq6rqZcANwEuA7wOLkvysi/okSc/UVQ9hITCQZC6wDLhiqKGqDgIuBE4EFgCXVdW+wH8EbkhyEvANmsCQJI2TrgJhHrAaIMk64JietmOBNUm2JtkEDAJH924D3Aac2lFtkqQ+puzatWvMd1pVnwJuTHJbO/9d4JVJtlfVm4GjkrynbbsOuA74ZLv8iap6JXBdknmjHOdh4Dtj/gEkaXJ7RZIDhi/s5BoCsBmY2TM/Ncn2EdpmAo/1LH+iZ9lu9ftAkqTnpqtTRmuAMwCq6nhgfU/b/cBJVTVQVfsBRwAP9m4DnA7c01FtkqQ+ujplNHSX0dHAFGARzZf9YJKV7V1Gf0QTSB9OcmNVHQhcS9M7+BFwbpKfjnlxkqS+OgkESdLexwfTJEmAgSBJahkIkiSgu9tOJ4WqmgF8GpgN7EszzMbKCS2qA1X1i8DfAfOTbJjoesZSVb0X+C1gH2B5kr+Y4JLGVPtv9Fqaf6M7gPMmy8+wqo4DPpLklKo6DPgssIvmrsQ/SbJzIusbC8M+42uA/0bzc9wKvDXJD8ezHnsIu/dm4JF2OI3TgasmuJ4x136hXEPz/MekUlWnACfQDJPyOuDQCS2oG2cA05OcAPxn4EMTXM+YqKp3A58CBtpFHwMubn8XpwBnT1RtY6XPZ/wE8I4kpwA3Ae8Z75oMhN37K+CSnvntI624F/sozVPi35/oQjqwgOYZmJuBW4BVE1tOJ74FTG9v9Z4FbJvgesbKPwK/0zP/a8Dd7fRkGdpm+Gc8J8k32+npwJbxLshA2I0kP0nyeFXNBP4auHiiaxpLVfU24OEkt090LR15Gc04Wm8AlgDXV9WUiS1pzP2E5nTRBmAFcOWEVjNGktzI08NtSpKhe+QfB/Yb/6rG1vDPmOT/AVTVCcAFwMfHuyYDYRRVdSjwFeAvk9ww0fWMsbcD86vqLuA1wHXtaLSTxSPA7UmeTBKav7gm23An76L5jIfTDDd/bVUNjLLN3qj3esGzGtpmb1RVb6TpsZ+Z5OHxPr4XlXejfXr6DuCCJF+e6HrGWpKTh6bbUFiS5AcTV9GYuxd4Z1V9DDgYeClNSEwmP+bnf2U+CswApk1cOZ35RlWdkuQumut5X5ngesZcO/Dn+cApSR6diBoMhN17H7A/cElVDV1LOD3JpLsAOxm1L146mWb8rKk0d6bsmOCyxtrHgU9X1T00d1K9b5IO+XIRsKKq9gH+geYU7qRRVdNoTvd9F7ipqgDuTvKB8azDoSskSYDXECRJLQNBkgQYCJKkloEgSQIMBElSy9tOpVZV/TLNUB7/guZ+/geA9yR5fIT1fxu4L8lkHPZDL0L2ECSgql4CrAT+a5JTkpwI3Af8j91s9k6a8YOkScHnECSgqv4NzROiFwxbvo5mALkbkqyuqtOAc2gGPry+bZsHvBtYSNPrvjrJNVV1UbvuduCrSd5TVR8EDqMZZ+kXaN49/rvA4cAfJFlXVe8AzqUZ6vnzSSbF+ER64bOHIDVeSTP65HDfBk4evjDJrcA3gbcCr6YZTuE4muG2j6yq
o4Dfa+dPAF5VVa9vN38iyWk0QxyfkeQs4HLgnKo6EngjTcjMAxZW+9iq1DWvIUiN7wHH9ln+KuCrPfP9Rkst4P52WIyf0Yyf9AZgXZJtAO3QEq9u1/96+9/HgL9vp39MMy7+rwKvAIbGztqfpkeR5/CZpD1iD0Fq/C+akV+fCoWqWgw8TPMlf3C7+LU92+yk+R3aALy2qqZW1YyqupPmVNJxVTW9HXL75HYZNKeCRhLgIeDX2xelfJbmnQ5S5wwEiebdF8BZwMVVtaaq7qM5BfT7NG+1eldVfQk4pGeztcB1NAOSrQbW0Iywen2SB4AvtMvuBzYCX3wWdTxA0zu4t6q+RtND+d5YfEZpNF5UliQB9hAkSS0DQZIEGAiSpJaBIEkCDARJUstAkCQBBoIkqfX/AbzkVjZ3AaiWAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "twice.bar()\n", "decorate_dice('Two dice')\n", "twice.mean()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To add a constant to a distribution, you could construct a deterministic `Pmf`" ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "2 0.25\n", "3 0.25\n", "4 0.25\n", "5 0.25\n", "dtype: float64" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "const = Pmf.from_seq([1])\n", "d4.add_dist(const)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But `add_dist` also handles constants as a special case:" ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "2 0.25\n", "3 0.25\n", "4 0.25\n", "5 0.25\n", "dtype: float64" ] }, "execution_count": 54, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d4.add_dist(1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Other arithmetic operations are also implemented" ] }, { "cell_type": "code", "execution_count": 55, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "-3 0.041667\n", "-2 0.083333\n", "-1 0.125000\n", " 0 0.166667\n", " 1 0.166667\n", " 2 0.166667\n", " 3 0.125000\n", " 4 0.083333\n", " 5 0.041667\n", "dtype: float64" ] }, "execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.sub_dist(d4)" ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "1 0.0625\n", "2 0.1250\n", "3 0.1250\n", "4 0.1875\n", "6 0.1250\n", "8 0.1250\n", "9 0.0625\n", "12 0.1250\n", "16 0.0625\n", "dtype: float64" ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d4.mul_dist(d4)" ] }, { "cell_type": "code", "execution_count": 57, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "0.250000 0.0625\n", "0.333333 0.0625\n", "0.500000 0.1250\n", "0.666667 0.0625\n", "0.750000 0.0625\n", "1.000000 0.2500\n", "1.333333 0.0625\n", "1.500000 0.0625\n", "2.000000 0.1250\n", "3.000000 0.0625\n", "4.000000 0.0625\n", "dtype: float64" ] }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d4.div_dist(d4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Comparison operators\n", "\n", "`Pmf` implements comparison operators that return probabilities.\n", "\n", "You can compare a `Pmf` to a scalar:" ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.3333333333333333" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.lt_dist(3)" ] }, { "cell_type": "code", "execution_count": 59, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.75" ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d4.ge_dist(2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or compare `Pmf` objects:" ] }, { "cell_type": "code", "execution_count": 60, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.25" ] }, "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d4.gt_dist(d6)" ] }, { "cell_type": "code", "execution_count": 61, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.41666666666666663" ] }, "execution_count": 61, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.le_dist(d4)" ] }, { "cell_type": "code", "execution_count": 62, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.16666666666666666" ] }, "execution_count": 62, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d4.eq_dist(d6)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Interestingly, this way of comparing distributions is [nontransitive]()." 
] }, { "cell_type": "code", "execution_count": 63, "metadata": {}, "outputs": [], "source": [ "A = Pmf.from_seq([2, 2, 4, 4, 9, 9])\n", "B = Pmf.from_seq([1, 1, 6, 6, 8, 8])\n", "C = Pmf.from_seq([3, 3, 5, 5, 7, 7])" ] }, { "cell_type": "code", "execution_count": 64, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.5555555555555556" ] }, "execution_count": 64, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A.gt_dist(B)" ] }, { "cell_type": "code", "execution_count": 65, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.5555555555555556" ] }, "execution_count": 65, "metadata": {}, "output_type": "execute_result" } ], "source": [ "B.gt_dist(C)" ] }, { "cell_type": "code", "execution_count": 66, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.5555555555555556" ] }, "execution_count": 66, "metadata": {}, "output_type": "execute_result" } ], "source": [ "C.gt_dist(A)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Joint distributions\n", "\n", "For comments or questions about this section, see [this issue](https://github.com/AllenDowney/EmpyricalDistributions/issues/10).\n", "\n", "`Pmf.make_joint` takes two `Pmf` objects and makes their joint distribution, assuming independence." ] }, { "cell_type": "code", "execution_count": 67, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " def make_joint(self, other, **options):\n", " \"\"\"Make joint distribution (assuming independence).\n", "\n", " :param self:\n", " :param other:\n", " :param options: passed to Pmf constructor\n", "\n", " :return: new Pmf\n", " \"\"\"\n", " qs = pd.MultiIndex.from_product([self.qs, other.qs])\n", " ps = np.multiply.outer(self.ps, other.ps).flatten()\n", " return Pmf(ps, index=qs, **options)\n", "\n" ] } ], "source": [ "psource(Pmf.make_joint)" ] }, { "cell_type": "code", "execution_count": 68, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "1 0.25\n", "2 0.25\n", "3 0.25\n", "4 0.25\n", "dtype: float64" ] }, "execution_count": 68, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d4 = Pmf.from_seq(range(1,5))\n", "d4" ] }, { "cell_type": "code", "execution_count": 69, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "1 0.166667\n", "2 0.166667\n", "3 0.166667\n", "4 0.166667\n", "5 0.166667\n", "6 0.166667\n", "dtype: float64" ] }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6 = Pmf.from_seq(range(1,7))\n", "d6" ] }, { "cell_type": "code", "execution_count": 70, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "1 1 0.041667\n", " 2 0.041667\n", " 3 0.041667\n", " 4 0.041667\n", " 5 0.041667\n", " 6 0.041667\n", "2 1 0.041667\n", " 2 0.041667\n", " 3 0.041667\n", " 4 0.041667\n", " 5 0.041667\n", " 6 0.041667\n", "3 1 0.041667\n", " 2 0.041667\n", " 3 0.041667\n", " 4 0.041667\n", " 5 0.041667\n", " 6 0.041667\n", "4 1 0.041667\n", " 2 0.041667\n", " 3 0.041667\n", " 4 0.041667\n", " 5 0.041667\n", " 6 0.041667\n", "dtype: float64" ] }, "execution_count": 70, "metadata": {}, "output_type": "execute_result" } ], "source": [ "joint = Pmf.make_joint(d4, d6)\n", "joint" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The result is a `Pmf` object that uses a MultiIndex to represent the values." ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "MultiIndex(levels=[[1, 2, 3, 4], [1, 2, 3, 4, 5, 6]],\n", " codes=[[0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3], [0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5]])" ] }, "execution_count": 71, "metadata": {}, "output_type": "execute_result" } ], "source": [ "joint.index" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you ask for the `qs`, you get an array of pairs:" ] }, { "cell_type": "code", "execution_count": 72, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 1), (2, 2),\n", " (2, 3), (2, 4), (2, 5), (2, 6), (3, 1), (3, 2), (3, 3), (3, 4),\n", " (3, 5), (3, 6), (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6)],\n", " dtype=object)" ] }, "execution_count": 72, "metadata": {}, "output_type": "execute_result" } ], "source": [ "joint.qs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can select elements using tuples:" ] }, { "cell_type": "code", "execution_count": 73, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.041666666666666664" ] }, "execution_count": 73, "metadata": {}, 
"output_type": "execute_result" } ], "source": [ "joint[1,1]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can get unnnormalized conditional distributions by selecting on different axes:" ] }, { "cell_type": "code", "execution_count": 74, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "1 0.041667\n", "2 0.041667\n", "3 0.041667\n", "4 0.041667\n", "5 0.041667\n", "6 0.041667\n", "dtype: float64" ] }, "execution_count": 74, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Pmf(joint[1])" ] }, { "cell_type": "code", "execution_count": 75, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "1 0.041667\n", "2 0.041667\n", "3 0.041667\n", "4 0.041667\n", "dtype: float64" ] }, "execution_count": 75, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Pmf(joint.loc[:, 1])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But `Pmf` also provides `conditional(i,j,val)` which returns the distribution along axis `i` conditioned on the value of axis `j`: " ] }, { "cell_type": "code", "execution_count": 76, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "1 0.25\n", "2 0.25\n", "3 0.25\n", "4 0.25\n", "dtype: float64" ] }, "execution_count": 76, "metadata": {}, "output_type": "execute_result" } ], "source": [ "joint.conditional(0, 1, 1)" ] }, { "cell_type": "code", "execution_count": 77, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "1 0.166667\n", "2 0.166667\n", "3 0.166667\n", "4 0.166667\n", "5 0.166667\n", "6 0.166667\n", "dtype: float64" ] }, "execution_count": 77, "metadata": {}, "output_type": "execute_result" } ], "source": [ "joint.conditional(1, 0, 1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It also provides `marginal(i)`, which returns the marginal distribution along axis `i`" ] }, { "cell_type": "code", "execution_count": 78, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "1 0.25\n", "2 0.25\n", "3 0.25\n", "4 0.25\n", "dtype: float64" ] }, "execution_count": 78, "metadata": {}, "output_type": "execute_result" } ], "source": [ "joint.marginal(0)" ] }, { "cell_type": "code", "execution_count": 79, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "1 0.166667\n", "2 0.166667\n", "3 0.166667\n", "4 0.166667\n", "5 0.166667\n", "6 0.166667\n", "dtype: float64" ] }, "execution_count": 79, "metadata": {}, "output_type": "execute_result" } ], "source": [ "joint.marginal(1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The implementations of `conditional` and `marginal` are simple, but could be made more efficient using Pandas methods." ] }, { "cell_type": "code", "execution_count": 80, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " def conditional(self, i, j, val, name=None):\n", " \"\"\"Gets the conditional distribution of the indicated variable.\n", "\n", " Distribution of vs[i], conditioned on vs[j] = val.\n", "\n", " i: index of the variable we want\n", " j: which variable is conditioned on\n", " val: the value the jth variable has to have\n", " name: string\n", "\n", " :return: Pmf\n", " \"\"\"\n", " # TODO: rewrite this using MultiIndex operations\n", " pmf = Pmf(name=name)\n", " for vs, p in self.items():\n", " if vs[j] == val:\n", " pmf[vs[i]] += p\n", "\n", " pmf.normalize()\n", " return pmf\n", "\n" ] } ], "source": [ "psource(Pmf.conditional)" ] }, { "cell_type": "code", "execution_count": 81, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " def marginal(self, i, name=None):\n", " \"\"\"Gets the marginal distribution of the indicated variable.\n", "\n", " i: index of the variable we want\n", " name: string\n", "\n", " :return: Pmf\n", " \"\"\"\n", " # TODO: rewrite this using MultiIndex operations\n", " pmf = Pmf(name=name)\n", " for vs, p in self.items():\n", " pmf[vs[i]] += p\n", " return pmf\n", "\n" ] } ], "source": [ "psource(Pmf.marginal)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here are some ways of iterating through a joint distribution." 
] }, { "cell_type": "code", "execution_count": 82, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(1, 1)\n", "(1, 2)\n", "(1, 3)\n", "(1, 4)\n", "(1, 5)\n", "(1, 6)\n", "(2, 1)\n", "(2, 2)\n", "(2, 3)\n", "(2, 4)\n", "(2, 5)\n", "(2, 6)\n", "(3, 1)\n", "(3, 2)\n", "(3, 3)\n", "(3, 4)\n", "(3, 5)\n", "(3, 6)\n", "(4, 1)\n", "(4, 2)\n", "(4, 3)\n", "(4, 4)\n", "(4, 5)\n", "(4, 6)\n" ] } ], "source": [ "for q in joint.qs:\n", " print(q)" ] }, { "cell_type": "code", "execution_count": 83, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n" ] } ], "source": [ "for p in joint.ps:\n", " print(p)" ] }, { "cell_type": "code", "execution_count": 84, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(1, 1) 0.041666666666666664\n", "(1, 2) 0.041666666666666664\n", "(1, 3) 0.041666666666666664\n", "(1, 4) 0.041666666666666664\n", "(1, 5) 0.041666666666666664\n", "(1, 6) 0.041666666666666664\n", "(2, 1) 0.041666666666666664\n", "(2, 2) 0.041666666666666664\n", "(2, 3) 0.041666666666666664\n", "(2, 4) 0.041666666666666664\n", "(2, 5) 0.041666666666666664\n", "(2, 6) 0.041666666666666664\n", "(3, 1) 0.041666666666666664\n", "(3, 2) 0.041666666666666664\n", "(3, 3) 0.041666666666666664\n", "(3, 4) 0.041666666666666664\n", "(3, 5) 0.041666666666666664\n", "(3, 6) 
0.041666666666666664\n", "(4, 1) 0.041666666666666664\n", "(4, 2) 0.041666666666666664\n", "(4, 3) 0.041666666666666664\n", "(4, 4) 0.041666666666666664\n", "(4, 5) 0.041666666666666664\n", "(4, 6) 0.041666666666666664\n" ] } ], "source": [ "for q, p in joint.items():\n", " print(q, p)" ] }, { "cell_type": "code", "execution_count": 85, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1 1 0.041666666666666664\n", "1 2 0.041666666666666664\n", "1 3 0.041666666666666664\n", "1 4 0.041666666666666664\n", "1 5 0.041666666666666664\n", "1 6 0.041666666666666664\n", "2 1 0.041666666666666664\n", "2 2 0.041666666666666664\n", "2 3 0.041666666666666664\n", "2 4 0.041666666666666664\n", "2 5 0.041666666666666664\n", "2 6 0.041666666666666664\n", "3 1 0.041666666666666664\n", "3 2 0.041666666666666664\n", "3 3 0.041666666666666664\n", "3 4 0.041666666666666664\n", "3 5 0.041666666666666664\n", "3 6 0.041666666666666664\n", "4 1 0.041666666666666664\n", "4 2 0.041666666666666664\n", "4 3 0.041666666666666664\n", "4 4 0.041666666666666664\n", "4 5 0.041666666666666664\n", "4 6 0.041666666666666664\n" ] } ], "source": [ "for (q1, q2), p in joint.items():\n", " print(q1, q2, p)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 2 }