{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# The empiricaldist API\n", "\n", "Copyright 2021 Allen Downey\n", "\n", "BSD 3-clause license: https://opensource.org/licenses/BSD-3-Clause" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## A Pmf is a Series\n", "\n", "`empiricaldist` provides `Pmf`, which is a Pandas Series that represents a probability mass function." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "scrolled": true }, "outputs": [], "source": [ "from empiricaldist import Pmf" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can create a `Pmf` in any of the ways you can create a `Series`, but the most common way is to use `from_seq` to make a `Pmf` from a sequence.\n", "\n", "The following is a `Pmf` that represents a six-sided die." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "d6 = Pmf.from_seq([1,2,3,4,5,6])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By default, the probabilities are normalized to add up to 1." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "<div>\n", "<table>\n", "  <thead>\n", "    <tr><th></th><th>probs</th></tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr><th>1</th><td>0.166667</td></tr>\n", "    <tr><th>2</th><td>0.166667</td></tr>\n", "    <tr><th>3</th><td>0.166667</td></tr>\n", "    <tr><th>4</th><td>0.166667</td></tr>\n", "    <tr><th>5</th><td>0.166667</td></tr>\n", "    <tr><th>6</th><td>0.166667</td></tr>\n", "  </tbody>\n", "</table>\n", "</div>\n", "
" ], "text/plain": [ "1 0.166667\n", "2 0.166667\n", "3 0.166667\n", "4 0.166667\n", "5 0.166667\n", "6 0.166667\n", "Name: , dtype: float64" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But you can also make an unnormalized `Pmf` if you want to keep track of the counts." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "<div>\n", "<table>\n", "  <thead>\n", "    <tr><th></th><th>probs</th></tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr><th>1</th><td>1</td></tr>\n", "    <tr><th>2</th><td>1</td></tr>\n", "    <tr><th>3</th><td>1</td></tr>\n", "    <tr><th>4</th><td>1</td></tr>\n", "    <tr><th>5</th><td>1</td></tr>\n", "    <tr><th>6</th><td>1</td></tr>\n", "  </tbody>\n", "</table>\n", "</div>\n", "
" ], "text/plain": [ "1 1\n", "2 1\n", "3 1\n", "4 1\n", "5 1\n", "6 1\n", "Name: , dtype: int64" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6 = Pmf.from_seq([1,2,3,4,5,6], normalize=False)\n", "d6" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or normalize later (the return value is the prior sum)." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "6" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.normalize()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now the Pmf is normalized." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "<div>\n", "<table>\n", "  <thead>\n", "    <tr><th></th><th>probs</th></tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr><th>1</th><td>0.166667</td></tr>\n", "    <tr><th>2</th><td>0.166667</td></tr>\n", "    <tr><th>3</th><td>0.166667</td></tr>\n", "    <tr><th>4</th><td>0.166667</td></tr>\n", "    <tr><th>5</th><td>0.166667</td></tr>\n", "    <tr><th>6</th><td>0.166667</td></tr>\n", "  </tbody>\n", "</table>\n", "</div>\n", "
" ], "text/plain": [ "1 0.166667\n", "2 0.166667\n", "3 0.166667\n", "4 0.166667\n", "5 0.166667\n", "6 0.166667\n", "Name: , dtype: float64" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Properties\n", "\n", "In a `Pmf` the index contains the quantities (`qs`) and the values contain the probabilities (`ps`).\n", "\n", "These attributes are available as properties that return arrays (same semantics as the Pandas `values` property)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([1, 2, 3, 4, 5, 6])" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.qs" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0.16666667, 0.16666667, 0.16666667, 0.16666667, 0.16666667,\n", " 0.16666667])" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.ps" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plotting PMFs\n", "\n", "`Pmf` provides two plotting functions. `bar` plots the `Pmf` as a histogram." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "def decorate_dice(title):\n", " \"\"\"Labels the axes.\n", " \n", " title: string\n", " \"\"\"\n", " plt.xlabel('Outcome')\n", " plt.ylabel('PMF')\n", " plt.title(title)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "scrolled": true }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "d6.bar()\n", "decorate_dice('One die')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`plot` displays the `Pmf` as a line." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "scrolled": true }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "d6.plot()\n", "decorate_dice('One die')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Selection\n", "\n", "The bracket operator looks up an outcome and returns its probability." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.16666666666666666" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6[1]" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.16666666666666666" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6[6]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Outcomes that are not in the distribution cause a `KeyError`\n", "\n", "```\n", "d6[7]\n", "```\n", "\n", "You can also use parentheses to look up a quantity and get the corresponding probability." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.16666666666666666" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6(1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With parentheses, a quantity that is not in the distribution returns `0`, not an error." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6(7)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Mutation\n", "\n", "`Pmf` objects are mutable, but in general the result is not normalized." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "<div>\n", "<table>\n", "  <thead>\n", "    <tr><th></th><th>probs</th></tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr><th>1</th><td>0.166667</td></tr>\n", "    <tr><th>2</th><td>0.166667</td></tr>\n", "    <tr><th>3</th><td>0.166667</td></tr>\n", "    <tr><th>4</th><td>0.166667</td></tr>\n", "    <tr><th>5</th><td>0.166667</td></tr>\n", "    <tr><th>6</th><td>0.166667</td></tr>\n", "    <tr><th>7</th><td>0.166667</td></tr>\n", "  </tbody>\n", "</table>\n", "</div>\n", "
" ], "text/plain": [ "1 0.166667\n", "2 0.166667\n", "3 0.166667\n", "4 0.166667\n", "5 0.166667\n", "6 0.166667\n", "7 0.166667\n", "Name: , dtype: float64" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6[7] = 1/6\n", "d6" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1.1666666666666665" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.sum()" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1.1666666666666665" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.normalize()" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1.0000000000000002" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.sum()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Statistics\n", "\n", "`Pmf` overrides the statistics methods to compute `mean`, `median`, etc.\n", "\n", "These functions only work correctly if the `Pmf` is normalized." 
] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "d6 = Pmf.from_seq([1,2,3,4,5,6])" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3.5" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.mean()" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2.9166666666666665" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.var()" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1.707825127659933" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.std()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Sampling\n", "\n", "`choice` chooses random values from the `Pmf`, following the API of `np.random.choice`." ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([3, 2, 3, 6, 4, 3, 4, 5, 6, 5])" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.choice(size=10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`sample` chooses random values from the `Pmf`, with replacement." ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([1., 4., 5., 3., 2., 2., 1., 6., 5., 6.])" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.sample(n=10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## CDFs\n", "\n", "`empiricaldist` also provides `Cdf`, which represents a cumulative distribution function." 
] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "from empiricaldist import Cdf" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can use `from_seq` to make a `Cdf` from a sequence.\n", "\n", "Here's a `Cdf` that represents a four-sided die." ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [], "source": [ "d4 = Cdf.from_seq([1,2,3,4])" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "<div>\n", "<table>\n", "  <thead>\n", "    <tr><th></th><th>probs</th></tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr><th>1</th><td>0.25</td></tr>\n", "    <tr><th>2</th><td>0.50</td></tr>\n", "    <tr><th>3</th><td>0.75</td></tr>\n", "    <tr><th>4</th><td>1.00</td></tr>\n", "  </tbody>\n", "</table>\n", "</div>\n", "
" ], "text/plain": [ "1 0.25\n", "2 0.50\n", "3 0.75\n", "4 1.00\n", "Name: , dtype: float64" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d4" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Properties\n", "\n", "In a `Cdf` the index contains the quantities (`qs`) and the values contain the probabilities (`ps`).\n", "\n", "These attributes are available as properties that return arrays (same semantics as the Pandas `values` property)" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([1, 2, 3, 4])" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d4.qs" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0.25, 0.5 , 0.75, 1. ])" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d4.ps" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Displaying CDFs\n", "\n", "`Cdf` provides two plotting functions.\n", "\n", "`plot` displays the `Cdf` as a line." ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [], "source": [ "def decorate_dice(title):\n", " \"\"\"Labels the axes.\n", " \n", " title: string\n", " \"\"\"\n", " plt.xlabel('Outcome')\n", " plt.ylabel('CDF')\n", " plt.title(title)" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "scrolled": true }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "d4.plot()\n", "decorate_dice('One die')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`step` plots the Cdf as a step function (which is more technically correct)." ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "scrolled": true }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAEWCAYAAABrDZDcAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAAAWHUlEQVR4nO3de7QdZ33e8e+DbBeIESbRCRGSsJwgAoKC6goVQldxIQSZhKgGmsohNZg0jpOYW7ooDm0hNKtdEEi5xC5aKnUMFHBIuVgmAjd1SAiUi2RHvkjgRrXBPpaDj2FhYTAYmV//2KN0e599LrLP6Ojo/X7W2ssz77x7zu/1u3SeMzN7z6SqkCS16yGLXYAkaXEZBJLUOINAkhpnEEhS4wwCSWqcQSBJjTMIpKMsycuSfHZo/e4kP7mYNaltBoGa1P0yvj7Jd5P8bZJ3JzllMWqpqpOr6qbF+NkSGARqUJJ/DbwFeC3wSODpwKnAnyU5aTFrkxaDQaCmJFkOvAl4RVV9qqp+UFVfBX6JQRj8Stfvd5N8OMn7knw7yd4kG4f285gkH0kyleTmJK+c5Wf+WJIdSQ4m+RLwUyPbK8njuuW/l+RtSW5J8vUk25I8bMH/R0hDDAK15meAhwIfHW6sqruBTwLPHWr+ReAy4BRgB3ARQJKHAFcA1wKrgOcAr07yvBl+5sXA94CVwMu710zeAjwe2AA8rtv/G+Y5NukBMQjUmhXAnVV1aMy227vth322qnZW1X3A+4Gndu1PAyaq6j9U1b3d+f3/Cmwd3WGSZcCLgDdU1Xeq6gbgveMKSxLg14DXVNU3q+rbwH8at19pIZ2w2AVIR9mdwIokJ4wJg5Xd9sP+dmj5u8BDk5zA4BTSY5J8a2j7MuCvxvy8CQb/zm4davvaDLVNAA8Hrh5kAgDp9i31xiMCtebzwPeBFw43JvkR4Ezgqnns41bg5qo6Zej1iKp6/pi+U8AhYM1Q22Nn2O+dwD3Ak4b2+8iqOnkeNUkPmEGgplTVXQwuFv9hks1JTkyyFvgTYJLBKaC5fAk4mOR1SR6WZFmSJyd52pifdx+D6xG/m+ThSdYDL52hth8yOMX09iQ/DpBk1SzXHqQFYRCoOVX1+8DrgbcBB4EvMvgr/zlV9f15vP8+4AUMLujezOAv+fcw+CjqOBcAJzM41XQp8Eez7P51wH7gC0kOAv8L+Om5apIejPhgGklqm0cEktQ4g0CSGmcQSFLjDAJJatyS+0LZihUrau3atYtdhiQtKVdfffWdVTUxbtuSC4K1a9eye/fuxS5DkpaUJDN9o91TQ5LUOoNAkhpnEEhS4wwCSWqcQSBJjestCJJckuSOJDfMsD1J3pVkf5LrkpzeVy2SpJn1eURwKbB5lu1nAuu613nAu3usRZI0g96CoKo+A3xzli5bgPfVwBeAU5Ks7KseSVrK3nTFXt50xd5e9r2YXyhbxf0f3zfZtd0+2jHJeQyOGnjsY2d6uJMkHb/2HTjY274X82JxxrSNfThCVW2vqo1VtXFiYuw3pCVJD9BiBsEk93+O62rgwCLVIknNWswg2AGc03166OnAXVU17bSQJKlfvV0jSPIh
4AxgRZJJ4I3AiQBVtQ3YCTyfwfNZvwuc21ctkqSZ9RYEVXX2HNsL+K2+fr4kaX78ZrEkNc4gkKTGGQSS1DiDQJIaZxBIUuMMAklqnEEgSY0zCCSpcQaBJDXOIJCkxhkEktQ4g0CSGmcQSFLjDAJJapxBIEmNMwgkqXEGgSQ1rtcgSLI5yY1J9ie5cMz2RyX5WJLrknwpyZP7rEeSNF1vQZBkGXAxcCawHjg7yfqRbq8H9lTVU4BzgHf2VY8kabw+jwg2Afur6qaquhe4DNgy0mc9cBVAVX0FWJvk0T3WJEka0WcQrAJuHVqf7NqGXQu8ECDJJuBUYHWPNUmSRvQZBBnTViPrbwYelWQP8Argr4FD03aUnJdkd5LdU1NTC16oJLXshB73PQmsGVpfDRwY7lBVB4FzAZIEuLl7MdJvO7AdYOPGjaNhIkl6EPo8ItgFrEtyWpKTgK3AjuEOSU7ptgH8K+AzXThIko6S3o4IqupQkguAK4FlwCVVtTfJ+d32bcATgfcluQ/YB/xqX/VIksbr89QQVbUT2DnStm1o+fPAuj5rkCTNzm8WS1LjDAJJapxBIEmNMwgkqXEGgSQ1ziCQpMYZBJLUOINAkhpnEEhS4wwCSWqcQSBJjTMIJKlxBoEkNc4gkKTGGQSS1DiDQJIaZxBIUuMMAklqXK9BkGRzkhuT7E9y4Zjtj0xyRZJrk+xNcm6f9UiSpustCJIsAy4GzgTWA2cnWT/S7beAfVX1VOAM4A+SnNRXTZKk6fo8ItgE7K+qm6rqXuAyYMtInwIekSTAycA3gUM91iRJGtFnEKwCbh1an+zahl0EPBE4AFwPvKqqfji6oyTnJdmdZPfU1FRf9UpSk/oMgoxpq5H15wF7gMcAG4CLkiyf9qaq7VW1sao2TkxMLHSdktS0PoNgElgztL6awV/+w84FPloD+4GbgSf0WJMkaUSfQbALWJfktO4C8FZgx0ifW4DnACR5NPDTwE091iRJGnFCXzuuqkNJLgCuBJYBl1TV3iTnd9u3Ab8HXJrkegankl5XVXf2VZMkabreggCgqnYCO0fatg0tHwB+rs8aJEmz85vFktQ4g0CSGmcQSFLjDAJJapxBIEmNMwgkqXG9fnxU0tLzwS/ewuV7blvsMjRi3+0HWb9y2h14FoRHBJLu5/I9t7Hv9oOLXYZGrF+5nC0bRu/buTA8IpA0zfqVy/njX3/GYpeho8QjAklqnEEgSY0zCCSpcQaBJDXOIJCkxhkEktQ4g0CSGmcQSFLjeg2CJJuT3Jhkf5ILx2x/bZI93euGJPcl+dE+a5Ik3V9vQZBkGXAxcCawHjg7yfrhPlX11qraUFUbgN8B/rKqvtlXTZKk6fo8ItgE7K+qm6rqXuAyYMss/c8GPtRjPZKkMfoMglXArUPrk13bNEkeDmwGPjLD9vOS7E6ye2pqasELlaSW9RkEGdNWM/R9AfC5mU4LVdX2qtpYVRsnJiYWrEBJUr9BMAmsGVpfDRyYoe9WPC0kSYuizyDYBaxLclqSkxj8st8x2inJI4FnAZf3WIskaQa9PY+gqg4luQC4ElgGXFJVe5Oc323f1nU9C/ifVfWdvmqRJM2s1wfTVNVOYOdI27aR9UuBS/usQ5I0M79ZLEmNMwgkqXEGgSQ1ziCQpMYZBJLUOINAkhpnEEhS42YNgiSXDi2/tPdqJElH3VxHBE8dWn5Vn4VIkhbHXEEw091CJUnHibluMbE6ybsY3FL68PLfqapX9laZJOmomCsIXju0vLvPQiRJi2PWIKiq9x6tQiRJi2POj48meWmSa5J8p3vtTnLO0ShOktS/WY8Iul/4rwZ+G7iGwbWC04G3JqGq3td7hZKkXs11RPCbwFlV9emququqvlVVfw68qNsmSVri5gqC5VX11dHGrm15HwVJko6uuYLgnge4TZK0RMz18dEnJrluTHuAn5xr50k2A+9k8Mzi91TVm8f0OQN4B3AicGdVPWuu/UqSFs5cQfBU
4NHArSPtpwIHZntjkmXAxcBzgUlgV5IdVbVvqM8pwH8BNlfVLUl+/MjKlyQ9WHOdGno7cLCqvjb8Ar7bbZvNJmB/Vd1UVfcClwFbRvr8MvDRqroFoKruOPIhSJIejLmCYG1VTTs1VFW7gbVzvHcV9z+SmOzahj0eeFSSv0hy9UzfT0hyXvf9hd1TU1Nz/FhJ0pGYKwgeOsu2h83x3oxpG72J3QnAPwR+Hnge8O+TPH7am6q2V9XGqto4MTExx4+VJB2JuYJgV5JfG21M8qvA1XO8dxJYM7S+munXFSaBT1XVd6rqTuAz3P/W15Kkns11sfjVwMeSvIT//4t/I3AScNYc790FrEtyGnAbsJXBNYFhlwMXJTmh2+c/Yu5rD5KkBTTXTee+DvxMkn8KPLlr/tPu28WzqqpDSS4ArmTw8dFLqmpvkvO77duq6stJPgVcB/yQwUdMb3gQ45EkHaG5jggAqKpPA58+0p1X1U5g50jbtpH1twJvPdJ9S5IWhg+vl6TGGQSS1DiDQJIaZxBIUuMMAklqnEEgSY0zCCSpcQaBJDXOIJCkxhkEktQ4g0CSGmcQSFLjDAJJapxBIEmNMwgkqXEGgSQ1ziCQpMb1GgRJNie5Mcn+JBeO2X5GkruS7Oleb+izHknSdPN6VOUDkWQZcDHwXGAS2JVkR1XtG+n6V1X1C33VIUmaXW9BAGwC9lfVTQBJLgO2AKNBoIZ98Iu3cPme2xa7DA3Zd/tB1q9cvthl6Cjq89TQKuDWofXJrm3UM5Jcm+STSZ40bkdJzkuyO8nuqampPmrVIrl8z23su/3gYpehIetXLmfLhnH/VHW86vOIIGPaamT9GuDUqro7yfOBjwPrpr2pajuwHWDjxo2j+9ASt37lcv7415+x2GVIzerziGASWDO0vho4MNyhqg5W1d3d8k7gxCQreqxJkjSizyDYBaxLclqSk4CtwI7hDkl+Ikm65U1dPd/osSZJ0ojeTg1V1aEkFwBXAsuAS6pqb5Lzu+3bgBcDv5HkEHAPsLWqPPUjSUdRn9cIDp/u2TnStm1o+SLgoj5rkCTNzm8WS1LjDAJJapxBIEmNMwgkqXEGgSQ1ziCQpMYZBJLUOINAkhpnEEhS4wwCSWqcQSBJjTMIJKlxBoEkNc4gkKTGGQSS1DiDQJIaZxBIUuN6DYIkm5PcmGR/kgtn6fe0JPcleXGf9UiSpustCJIsAy4GzgTWA2cnWT9Dv7cweLaxJOko6/OIYBOwv6puqqp7gcuALWP6vQL4CHBHj7VIkmbQZxCsAm4dWp/s2v5OklXAWcA2ZpHkvCS7k+yemppa8EIlqWV9BkHGtNXI+juA11XVfbPtqKq2V9XGqto4MTGxUPVJkoATetz3JLBmaH01cGCkz0bgsiQAK4DnJzlUVR/vsS5J0pA+g2AXsC7JacBtwFbgl4c7VNVph5eTXAp8whCQpKOrtyCoqkNJLmDwaaBlwCVVtTfJ+d32Wa8LSJKOjj6PCKiqncDOkbaxAVBVL+uzFknSeH6zWJIaZxBIUuMMAklqnEEgSY0zCCSpcQaBJDXOIJCkxhkEktQ4g0CSGmcQSFLjDAJJapxBIEmNMwgkqXEGgSQ1ziCQpMYZBJLUOINAkhpnEEhS43oNgiSbk9yYZH+SC8ds35LkuiR7kuxO8o/7rEeSNF1vzyxOsgy4GHguMAnsSrKjqvYNdbsK2FFVleQpwIeBJ/RVkyRpuj6PCDYB+6vqpqq6F7gM2DLcoarurqrqVn8EKCRJR1WfQbAKuHVofbJru58kZyX5CvCnwMvH7SjJed2po91TU1O9FCtJreozCDKmbdpf/FX1sap6AvDPgN8bt6Oq2l5VG6tq48TExMJWKUmN6zMIJoE1Q+urgQMzda6qzwA/lWRFjzVJkkb0GQS7gHVJTktyErAV2DHcIcnjkqRbPh04CfhGjzVJkkb09qmhqjqU5ALgSmAZcElV7U1yfrd9G/Ai4JwkPwDuAf7F0MVj
SdJR0FsQAFTVTmDnSNu2oeW3AG/pswZJ0uz8ZrEkNc4gkKTGGQSS1DiDQJIaZxBIUuMMAklqXK8fHz2WvOmKvew7cHCxy9CIfbcfZP3K5YtdhtQ0jwi0qNavXM6WDdPuRSjpKGrmiOCNL3jSYpcgScckjwgkqXEGgSQ1ziCQpMYZBJLUOINAkhpnEEhS4wwCSWqcQSBJjctSezJkkingaw/w7SuAOxewnMXkWI5Nx8tYjpdxgGM57NSqmhi3YckFwYORZHdVbVzsOhaCYzk2HS9jOV7GAY5lPjw1JEmNMwgkqXGtBcH2xS5gATmWY9PxMpbjZRzgWObU1DUCSdJ0rR0RSJJGGASS1LjjMgiSXJLkjiQ3zLA9Sd6VZH+S65KcfrRrnI95jOOMJHcl2dO93nC0a5yvJGuSfDrJl5PsTfKqMX2O+XmZ5ziWxLwkeWiSLyW5thvLm8b0OebnBOY9liUxLwBJliX56ySfGLNt4eekqo67F/BPgNOBG2bY/nzgk0CApwNfXOyaH+A4zgA+sdh1znMsK4HTu+VHAP8HWL/U5mWe41gS89L9fz65Wz4R+CLw9KU2J0cwliUxL12tvw18cFy9fczJcXlEUFWfAb45S5ctwPtq4AvAKUlWHp3q5m8e41gyqur2qrqmW/428GVg9GHFx/y8zHMcS0L3//nubvXE7jX66ZFjfk5g3mNZEpKsBn4eeM8MXRZ8To7LIJiHVcCtQ+uTLNF/zMAzusPhTyZZEg9mTrIW+AcM/mobtqTmZZZxwBKZl+4UxB7gDuDPqmrJzsk8xgJLY17eAfwb4IczbF/wOWk1CDKmbSn+9XANg/uHPBX4Q+Dji1vO3JKcDHwEeHVVHRzdPOYtx+S8zDGOJTMvVXVfVW0AVgObkjx5pMuSmZN5jOWYn5ckvwDcUVVXz9ZtTNuDmpNWg2ASWDO0vho4sEi1PGBVdfDw4XBV7QROTLJikcuaUZITGfzy/EBVfXRMlyUxL3ONY6nNC0BVfQv4C2DzyKYlMSfDZhrLEpmXZwK/mOSrwGXAs5P895E+Cz4nrQbBDuCc7ur704G7qur2xS7qSCX5iSTpljcxmM9vLG5V43V1/jfgy1X1n2fodszPy3zGsVTmJclEklO65YcBPwt8ZaTbMT8nML+xLIV5qarfqarVVbUW2Ar8eVX9yki3BZ+TEx7Mm49VST7E4BMCK5JMAm9kcPGIqtoG7GRw5X0/8F3g3MWpdHbzGMeLgd9Icgi4B9ha3ccKjkHPBP4lcH13Hhfg9cBjYUnNy3zGsVTmZSXw3iTLGPxS/HBVfSLJ+bCk5gTmN5alMi/T9D0n3mJCkhrX6qkhSVLHIJCkxhkEktQ4g0CSGmcQSFLjDAI1KcnqJJcn+Zsk/zfJO5OcNMd7Xn+06pOOJoNAzem+VPRR4ONVtQ54PHAy8B/neKtBoOOSQaAWPRv4XlX9EQzuUQO8Bnh5kt9MctHhjkk+0d3H/s3Aw7r72H+g23ZOdz/4a5O8v2s7NclVXftVSR7btV+a5N0ZPMvgpiTPyuB5E19OcunQz/u5JJ9Pck2SP+nuaST1yiBQi54E3O+mXt2N425hhm/bV9WFwD1VtaGqXtLdufLfAs/ubmJ2+AE1FzG4RfBTgA8A7xrazaMYhNBrgCuAt3e1/P0kG7r73vw74Ger6nRgN4P70ku9Oi5vMSHNIYy/W+NM7eM8G/gfVXUnQFUdfm7EM4AXdsvvB35/6D1XVFUluR74elVdD5BkL7CWwc3D1gOf626JcxLw+XnWIz1gBoFatBd40XBDkuUM7uh4F/c/Un7oDPuYb2gM9/l+998fDi0fXj8BuI/BffTPnsd+pQXjqSG16Crg4UnOgcEDTYA/AC4FbgI2JHlIkjXApqH3/aC7BfXhffxSkh/r9vGjXfv/ZnDXSICXAJ89grq+ADwzyeO6fT48yeOPdHDSkTII1JzujpNnAf88yd8weO7w9xh8KuhzwM3A9cDbGDzM5LDt
wHVJPlBVexl8yugvk1wLHL4l9SuBc5Ncx+AupdMebj9LXVPAy4APde//AvCEBzpOab68+6gkNc4jAklqnEEgSY0zCCSpcQaBJDXOIJCkxhkEktQ4g0CSGvf/AOffuV12VrW0AAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "d4.step()\n", "decorate_dice('One die')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Selection\n", "\n", "The bracket operator works as usual." ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.25" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d4[1]" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1.0" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d4[4]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Evaluating CDFs\n", "\n", "`Cdf` provides `forward` and `inverse`, which evaluate the CDF and its inverse as functions.\n", "\n", "Evaluating a `Cdf` forward maps from a quantity to its cumulative probability." ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [], "source": [ "d6 = Cdf.from_seq([1,2,3,4,5,6])" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array(0.5)" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.forward(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`forward` interpolates, so it works for quantities that are not in the distribution." 
] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array(0.5)" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.forward(3.5)" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array(0.)" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.forward(0)" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array(1.)" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.forward(7)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also call the `Cdf` like a function (which it is)." ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array(0.16666667)" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6(1.5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`forward` can take an array of quantities, too." ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [], "source": [ "def decorate_cdf(title):\n", " \"\"\"Labels the axes.\n", " \n", " title: string\n", " \"\"\"\n", " plt.xlabel('Quantity')\n", " plt.ylabel('CDF')\n", " plt.title(title)" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "qs = np.linspace(0, 7)\n", "ps = d6(qs)\n", "plt.plot(qs, ps)\n", "decorate_cdf('Forward evaluation')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`Cdf` also provides `inverse`, which computes the inverse `Cdf`:" ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array(3.)" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.inverse(0.5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`quantile` is a synonym for `inverse`" ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array(3.)" ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.quantile(0.5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`inverse` and `quantile` work with arrays " ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAYIAAAEWCAYAAABrDZDcAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAAAaVUlEQVR4nO3dfZRddX3v8fdnZs5kJg8zATIESIBACUi0gjig4KrSKsqjlLb3FrDlymrLwopFqy1cvZZebbvaatWyQNKIPLU8VBA1YhT0CgW1tJkgD0IEYghmCMiEh0xIZjJP3/vH2ZOcOXPmTBJmz5mZ3+e11l5z9t6/2ed7Jivfz977nLO3IgIzM0tXXa0LMDOz2nIQmJklzkFgZpY4B4GZWeIcBGZmiXMQmJklzkFgNkVIukHS3+S07Q9IuiePbdv05yCwSSdpg6T31LqOmUrSEkkhqWF4WUTcHBHvrWVdNnU5CCwJpU3RzEZyEFhNSfqgpB9J+rykVyQ9I+m0bN25kjrKxn9M0srs8azs934p6VeSlktqztadLKlT0mWSXgCul7RA0l2SXpX0sqQHJNVl4w+S9HVJXVkNf1al5mrPu1bSmSVjGyRtlnRcNn+7pBckbZF0v6Q3Vvu7lC0LSUdkj8+Q9FNJ3ZI2SvrrkqH3Zz9flfSapBPLtyfpJEmrszpWSzqpZN19kj4r6ceStkq6R9KCsf4eNv05CGwqeBvwJLAA+Efgq5IErASOkrS0ZOz5wC3Z438AjgSOBY4AFgF/VTL2AGBf4FDgIuDjQCfQBiwEPglEFgbfBh7JtvFu4KOS3jdGvdWe91bgvJKx7wM2R8RD2fx3gaXA/sBDwM3V/jBVbAMuAOYDZwAfkvTb2bp3Zj/nR8TciPjP0l+UtC/wHeBKYD/gC8B3JO1XMux84MKszkbgE3tZp00DDgKbCp6NiK9ExCBwI3AgsDAitgPfImusWSC8AViZBcWfAB+LiJcjYivwd8C5JdsdAq6IiB0R0QP0Z9s+NCL6I+KBKF5s63igLSI+ExF9EbEe+ErZtshqGO95bwHeL2l2Nl8aXETEdRGxNSJ2AH8NHCOpdU//YBFxX0Q8FhFDEfEoxQB6127++hnA0xHxrxExEBG3Aj8HzioZc31EPJX93b5GMfRshnIQ2FTwwvCDrPkDzM1+3sKuPezzgW9mY9qA2cCa7FTPq8D3suXDuiKit2T+c8A64B5J6yVdni0/FDhoeDvZtj5J8aihXNXnjYh1wFrgrCwM3p+9BiTVS/p7Sb+Q1A1syLa5x6ddJL1N0r3ZqawtwMV7sJ2DgGfLlj1L8chm2Aslj7ez69/DZiC/gWZT3T3AAknHUgyEj2XLNwM9wBsj4rkxfnfEpXWzvfePAx/Pzs3fK2k1sBF4JiKWVthGud153uHTQ3XAE1k4QDHIzgbeQzEEWoFXAFXYxjaKgQOApAPK1t8CXAWcFhG9kr7EriAY75LCmyiGX6lDKAaaJchHBDalRcQAcAfFvfl9ge9ny4conr75oqT9ASQtqnJeH0lnSjoiO73TDQxm038D3dkby83ZnvubJB1foZ7ded7bgPcCH6LktBAwD9gBvESxyf9dlZf+CPBGScdKaqJ4GqnUPODlLAROoBgyw7oonhY7fIxtrwKOlHR+9mb27wPLgLuq1GMzmIPApoNbKO5F354Fw7DLKJ7qeTA71fID4Kgq21majXkN+E/gy9m59kGK58ePBZ6huNd/LcU99kqqPm9EPJ9t/yTg30t+7yaKp2CeA54AHhyr0Ih4CvhMtu2ngR+VDflT4DOStlJ8o/prJb+7Hfhb4MfZ6au3l237JeBMikdHLwF/CZwZEZvHqsdmNvnGNGZmafMRgZlZ4hwEZmaJcxCYmSXOQWBmlrhp9z2CBQsWxJIlS2pdhpnZtLJmzZrNEdFWad20C4IlS5bQ0dEx/kAzM9tJUvm3yXfyqSEzs8Q5CMzMEucgMDNLnIPAzCxxDgIzs8TlFgSSrpP
0oqSfjbFekq6UtE7So8O38jMzs8mV5xHBDcCpVdafRvFqkEsp3kbwmhxrMTOzMeQWBBFxP/BylSFnAzdF0YPAfEkH5lWPmdl09s8/eJr7n+rKZdu1fI9gEcU7Qw3rZOSt8naSdJGkDkkdXV35/CHMzKayq+9bx09+8VIu265lEFS6PV/FmyNExIqIaI+I9ra2it+QNjOzvVTLIOgEDi6ZX0zxXqpmZjaJahkEK4ELsk8PvR3Ykt3iz8zMJlFuF52TdCtwMrBAUidwBVAAiIjlFG+gfTrFe79uBy7MqxYzMxtbbkEQEeeNsz6AD+f1/GZmtnv8zWIzs8Q5CMzMEucgMDNLnIPAzCxxDgIzs8Q5CMzMEucgMDNLnIPAzCxxDgIzs8Q5CMzMEucgMDNLnIPAzCxxDgIzs8Q5CMzMEucgMDNLnIPAzCxxDgIzs8Q5CMzMEucgMDNLnIPAzCxxDgIzs8Q5CMzMEucgMDNLnIPAzCxxDgIzs8Q11LoAMzOrrH9wiC09/Wzp6ScicnseB4GZWY52DAyypaef7qyh75y299PdOzByWdm47X2DI7Y1p7E+lxodBGZm4+jtHxzdyCtMlcb09g9V3facxnpamwu0NBdobS5wyL6zac0eDy9rbS7QOrvAiYfvl8vrcxCY2YwXEfT2D+1xEx+e+gaqN/O5sxpKGncDhy2Ys6uBV2rqJcsK9bV/q9ZBYGbTzivb+njyV1urNvFdywbo7umnb7B6M5/X1DCiSS/df+6Ihl2pkbc2F2hpaqBhCjTz18NBYGbTzoU3rObhja+OWCbBvFkNtM7e1aQPaG0ac2+8dJrXVKC+TrV5MVNArkEg6VTgn4F64NqI+Puy9a3AvwGHZLV8PiKuz7MmM5v+tvb28/bD9+VTpy8raeYN1CXczF+P3IJAUj1wNXAK0AmslrQyIp4oGfZh4ImIOEtSG/CkpJsjoi+vusxsZthv7ix+fXFrrcuYEfI8sXUCsC4i1meN/Tbg7LIxAcyTJGAu8DIwkGNNZmZWJs8gWARsLJnvzJaVugo4GtgEPAZcGhGj3tGRdJGkDkkdXV1dedVrZpakPIOg0sm68q/GvQ94GDgIOBa4SlLLqF+KWBER7RHR3tbWNtF1mpklLc8g6AQOLplfTHHPv9SFwJ1RtA54BnhDjjWZmVmZPINgNbBU0mGSGoFzgZVlY34JvBtA0kLgKGB9jjWZmVmZ3D41FBEDki4B7qb48dHrIuJxSRdn65cDnwVukPQYxVNJl0XE5rxqMjOz0XL9HkFErAJWlS1bXvJ4E/DePGswM7Pqpvf3os3M7HVzEJiZJc5BYGaWOAeBmVniHARmZolzEJiZJc5BYGaWOAeBmVniHARmZolzEJiZJc5BYGaWOAeBmVniHARmZolzEJiZJc5BYGaWOAeBmVniHARmZolzEJiZJS7XW1WamVUzOBR09/SzZYxprHWbXu1h2UGttS5/xnAQmNnr0j84VLFhj93gB+jO1m/dMVB1240NdbQ2F3ZOC1uaOHLhPFqbC5zzlkWT9ApnPgeBmdE3MFS1iVdr9Nv6Bqtuu6kwspkvmt/E0QfOG7Gs0tTSXKCpUD9Jf4G0OQjMZoje/sERDbu7N3u8vbgXXq2x9/RXb+bNhfoRTXrxPrNpPai0cTfQOnt0I29tLjCrwc18qnMQmE1zX/j+U/zLf/yCHQNDVcfNaawf0aAP3W/2yL3w2SMbeGtzgZam4s/GBn+uZCZzEJhNc49sfJXW5gIXnHjoqEZeundeqHczt8ocBGYzwIHzm7nkt5bWugybpryLYGaWOAeBmVniHARmZolzEJiZJc5BYGaWuFyDQNKpkp6UtE7S5WOMOVnSw5Iel/QfedZjZmaj5fbxUUn1wNXAKUAnsFrSyoh4omTMfODLwKkR8UtJ++dVj5mZVZbnEcEJwLqIWB8RfcBtwNllY84H7oyIXwJExIs51mNmZhXkGQSLgI0l853ZslJHAvtIuk/SGkkXVNq
QpIskdUjq6OrqyqlcM7M05RkEqrAsyuYbgLcCZwDvAz4t6chRvxSxIiLaI6K9ra1t4is1M0tYnpeY6AQOLplfDGyqMGZzRGwDtkm6HzgGeCrHuszMrESeRwSrgaWSDpPUCJwLrCwb8y3gNyQ1SJoNvA1Ym2NNZmZWJrcjgogYkHQJcDdQD1wXEY9Lujhbvzwi1kr6HvAoMARcGxE/y6smMzMbLderj0bEKmBV2bLlZfOfAz6XZx1mZjY2f7PYzCxxDgIzs8Q5CMzMEucgMDNLnIPAzCxxDgIzs8Q5CMzMElc1CCTdUPL4f+VejZmZTbrxjgiOKXl8aZ6FmJlZbYwXBOVXCzUzsxlmvEtMLJZ0JcVLSg8/3iki/iy3yszMbFKMFwR/UfK4I89CzGykoaFg644Bunv62VJlWvt8NwfOb651uTaNVQ2CiLhxsgoxm4mGhoKtvQNVG/mWnv6KzX5rbz9DVU7O1teJ1uYCrc0FTjnat/u2vTfu1UezTwtdChyVLVoLXBkRN+VZmNlUMTgUVffKx1rX3dPP1h0DRJVmXqgvNvOWrKHvN7eRwxbM2dngh6eW0vnZxZ9zGuuRKt0I0GzPVA2C7B7CHwX+HHiI4nsFxwGfk4TDwKaLgcEhusfYM9/ZyLeP3cyraayvyxp1A63NBfafN4ul+8+t3MRLGnlrc4Hmgpu51d54RwR/CpwTERtKlv1Q0u8CtwEOAptSOja8zFceWM+r24dPrxSb/2vjNPNZDXUjmvUBrU0cdcC8yo28bGoq1LmZ27Q2XhC0lIUAABGxQVJLPiWZ7b1vPvwcP1j7IscdMp/F+zSP28SHG31Tob7WpZvVzHhB0LOX68xqZn5zgdsvPqnWZZhNG+MFwdGSHq2wXMDhOdRjZmaTbLwgOAZYCGwsW34osCmXiszMbFKNd4mJLwLdEfFs6QRsz9aZmdk0N14QLImIUaeGIqIDWJJLRWZmNqnGC4KmKuv8nXYzsxlgvCBYLelPyhdK+iNgTT4lmZnZZBrvzeKPAt+Q9AF2Nf52oBE4J8e6zMxskox30blfASdJ+k3gTdni70TED3OvzMzMJsW4F50DiIh7gXtzrsXMzGrAN683M0ucg8DMLHEOAjOzxDkIzMwSl2sQSDpV0pOS1km6vMq44yUNSvq9POsxM7PRcgsCSfXA1cBpwDLgPEnLxhj3D8DdedViZmZjy/OI4ARgXUSsj4g+inc0O7vCuI8AXwdezLEWMzMbQ55BsIiRl6/uzJbtJGkRxW8oL6+2IUkXSeqQ1NHV1TXhhZqZpSzPIKh0E9com/8ScFlEDFbbUESsiIj2iGhva2ubqPrMzIzd/GbxXuoEDi6ZX8zom9m0A7dlN/5eAJwuaSAivpljXWZmViLPIFgNLJV0GPAccC5wfumAiDhs+LGkG4C7HAJmZpMrtyCIiAFJl1D8NFA9cF1EPC7p4mx91fcFzMxscuR5REBErAJWlS2rGAAR8cE8azEzs8r8zWIzs8Q5CMzMEucgMDNLXK7vEZjlJSLY3jfIlp7+nVN3Tz/PbN5W69LMph0HgdVMRLBtuJlvH9nQt4wxDa/r7u2nf7D8+4lFyw5smeRXYja9OQjsdYkItu4Y2NnIu3vHauQDo5t5Tz8DQ5WbOUCdoKW5QGtzgZam4s9F85t3LhtrWtg6axL/AmbTn4PAGBoqNvNqe+Jj7a139/RTpZdTXydamhp2NumW5gIH79M8ZhPf2eRnF5jb2EBdXaUrlZjZRHIQzGDdvf1897HneXlb9dMuW3urN/OGOo1o1PvMbmTJfnPGbOItzbsa/9xZDWSXEDGzKcpBMIPd3tHJZ+96AoDG+rqsUReb9IK5jfxa25yyBl55L312Y72budkM5iCYwfoHhwB46NOnsM/sgpu5mVXk7xEkoLngPXozG5uDwMwscQ4CM7PEOQjMzBLnIDAzS5yDwMwscQ4CM7PEOQjMzBLnIDAzS5yDwMwscQ4CM7PEOQjMzBLnIDA
zS5yDwMwscQ4CM7PEOQjMzBLnIDAzS5yDwMwscQ4CM7PEOQjMzBKXaxBIOlXSk5LWSbq8wvoPSHo0m34i6Zg86zEzs9FyCwJJ9cDVwGnAMuA8ScvKhj0DvCsi3gx8FliRVz1mZlZZnkcEJwDrImJ9RPQBtwFnlw6IiJ9ExCvZ7IPA4hzrMTOzCvIMgkXAxpL5zmzZWP4I+G6lFZIuktQhqaOrq2sCSzQzszyDQBWWRcWB0m9SDILLKq2PiBUR0R4R7W1tbRNYopmZNeS47U7g4JL5xcCm8kGS3gxcC5wWES/lWE9SdgwMsr1vsNZlmNk0kGcQrAaWSjoMeA44Fzi/dICkQ4A7gT+MiKdyrGVa6u0fZEtP/65pe//I+Z5+usvne4s/e/uHAKivE3X+kLCZVZFbEETEgKRLgLuBeuC6iHhc0sXZ+uXAXwH7AV+WBDAQEe151TTZIoLe/qFRzbtaIy+d+gaGqm5/7qwGWpsLtDQXaG1u4PC2ObQ2F0ZMR+w/j1kN9ZP0is1sOlJExdP2U1Z7e3t0dHRM2vNFBNv7Bne7ie9aNkB3Tz99g9Wb+bymhlHNe3hqKfs5Yl1TAw313tU3s90jac1YO9p5nhqacl7c2svmrX27tTc+vK67t5/+wbHDUoKWptLm3cABrU1jNvDSaV5Tgfq6Su+pm5lNnmSC4L/Wv8Tvr3iw4ro6MappL9qnuWoTH27082Y1UOdmbmbTWDJBsPm1PgCuOGsZRx0wb0RDn9PoZm5m6UomCIa944gFHLlwXq3LMDObMvxuo5lZ4hwEZmaJcxCYmSXOQWBmljgHgZlZ4hwEZmaJcxCYmSXOQWBmljgHgZlZ4hwEZmaJcxCYmSXOQWBmljgHgZlZ4hwEZmaJcxCYmSXOQWBmljgHgZlZ4hwEZmaJcxCYmSXOQWBmljgHgZlZ4hwEZmaJcxCYmSXOQWBmljgHgZlZ4hwEZmaJcxCYmSUu1yCQdKqkJyWtk3R5hfWSdGW2/lFJx+VZj5mZjZZbEEiqB64GTgOWAedJWlY27DRgaTZdBFyTVz1mZlZZnkcEJwDrImJ9RPQBtwFnl405G7gpih4E5ks6MMeazMysTJ5BsAjYWDLfmS3b0zFIukhSh6SOrq6uvSrmgNYmTv/1A5g7q2Gvft/MbKbKsyuqwrLYizFExApgBUB7e/uo9bvjrYfuw1sPfeve/KqZ2YyW5xFBJ3BwyfxiYNNejDEzsxzlGQSrgaWSDpPUCJwLrCwbsxK4IPv00NuBLRHxfI41mZlZmdxODUXEgKRLgLuBeuC6iHhc0sXZ+uXAKuB0YB2wHbgwr3rMzKyyXN85jYhVFJt96bLlJY8D+HCeNZiZWXX+ZrGZWeIcBGZmiXMQmJklzkFgZpY4Fd+vnT4kdQHP7uWvLwA2T2A504Ffcxr8mtPwel7zoRHRVmnFtAuC10NSR0S017qOyeTXnAa/5jTk9Zp9asjMLHEOAjOzxKUWBCtqXUAN+DWnwa85Dbm85qTeIzAzs9FSOyIwM7MyDgIzs8QlEQSSrpP0oqSf1bqWySLpYEn3Slor6XFJl9a6prxJapL035IeyV7z/611TZNBUr2kn0q6q9a1TBZJGyQ9JulhSR21ridvkuZLukPSz7P/0ydO6PZTeI9A0juB1yjeH/lNta5nMmT3fj4wIh6SNA9YA/x2RDxR49JyI0nAnIh4TVIB+BFwaXY/7BlL0p8D7UBLRJxZ63omg6QNQHtEJPGFMkk3Ag9ExLXZ/V1mR8SrE7X9JI4IIuJ+4OVa1zGZIuL5iHgoe7wVWEuF+0HPJFH0WjZbyKYZvacjaTFwBnBtrWuxfEhqAd4JfBUgIvomMgQgkSBInaQlwFuA/6pxKbnLTpM8DLwIfD8iZvpr/hLwl8BQjeuYbAHcI2mNpItqXUzODge6gOuzU4DXSpozkU/gIJjhJM0Fvg58NCK6a11P3iJiMCKOpXj/6xMkzdhTgZLOBF6MiDW1rqUG3hERxwG
nAR/OTv/OVA3AccA1EfEWYBtw+UQ+gYNgBsvOk38duDki7qx1PZMpO3S+Dzi1tpXk6h3A+7Pz5bcBvyXp32pb0uSIiE3ZzxeBbwAn1LaiXHUCnSVHt3dQDIYJ4yCYobI3Tr8KrI2IL9S6nskgqU3S/OxxM/Ae4Oc1LSpHEfG/I2JxRCwBzgV+GBF/UOOycidpTvYBCLJTJO8FZuwnAiPiBWCjpKOyRe8GJvRDH7nes3iqkHQrcDKwQFIncEVEfLW2VeXuHcAfAo9l58wBPpndR3qmOhC4UVI9xZ2cr0VEMh+pTMhC4BvFfR0agFsi4nu1LSl3HwFuzj4xtB64cCI3nsTHR83MbGw+NWRmljgHgZlZ4hwEZmaJcxCYmSXOQWBmljgHgSVN0mJJ35L0tKT1kq6SNGuCn+NkSSeVzF8s6YLs8QclHTSRz2e2pxwElqzsS3d3At+MiKXAUqAZ+McJfqqTgZ1BEBHLI+KmbPaDgIPAasrfI7BkSXo3xS8XvrNkWQvwLPBp4A0RcUm2/C7g8xFxn6RrgOMphsYdEXFFNmYDcCNwFsUrn/4PoBd4EBikeOGwj1D8ZuhrwAbgBuA5oAf4FPDHEXFOtr1TgA9FxO/k9kcww0cElrY3UrxPw07Zhfk2UP1b95+KiHbgzcC7JL25ZN3m7GJo1wCfiIgNwHLgixFxbEQ8UPJcdwAdwAeyC+WtAo6W1JYNuRC4fu9fntnucRBYykTl+xVonN/7n5IeAn5KMUyWlawbvrjfGmDJnhQTxcPzfwX+ILtm0onAd/dkG2Z7I4lrDZmN4XHgd0sXZKeGFgIvAUeWrGrK1h8GfAI4PiJekXTD8LrMjuznIHv3/+t64NsUTyndHhEDe7ENsz3iIwJL2f8DZpd8gqce+CfgKuAZ4FhJdZIOZtdljlsoXg9+i6SFFK+HP56twLzdWZddXnkT8H8ovn9gljsHgSUrOxVzDvB7kp6meBQwFBF/C/yYYhg8BnweGL7t5yMUTwk9DlyXjRvPt4Fzshut/0bZuhuA5dm65mzZzcDGmXx/aZta/Kkhs0z2Wf9bgd+p5V2/JF0F/DSBS6XbFOEgMJtCJK2heOrplIjYMd54s4ngIDAzS5zfIzAzS5yDwMwscQ4CM7PEOQjMzBLnIDAzS9z/B53tJ/fh8NL4AAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "ps = np.linspace(0, 1)\n", "qs = d6.quantile(ps)\n", "plt.plot(qs, ps)\n", "decorate_cdf('Inverse evaluation')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "These functions provide a simple way to make a Q-Q plot.\n", "\n", "Here are two samples from the same distribution." ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "cdf1 = Cdf.from_seq(np.random.normal(size=100))\n", "cdf2 = Cdf.from_seq(np.random.normal(size=100))\n", "\n", "cdf1.plot()\n", "cdf2.plot()\n", "decorate_cdf('Two random samples')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here's how we compute the Q-Q plot." ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [], "source": [ "def qq_plot(cdf1, cdf2):\n", " \"\"\"Compute results for a Q-Q plot.\n", " \n", " Evaluates the inverse Cdfs for a \n", " range of cumulative probabilities.\n", " \n", " :param cdf1: Cdf\n", " :param cdf2: Cdf\n", " \n", " :return: tuple of arrays\n", " \"\"\"\n", " ps = np.linspace(0, 1)\n", " q1 = cdf1.quantile(ps)\n", " q2 = cdf2.quantile(ps)\n", " return q1, q2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The result is near the identity line, which suggests that the samples are from the same distribution." ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "q1, q2 = qq_plot(cdf1, cdf2)\n", "plt.plot(q1, q2)\n", "plt.xlabel('Quantity 1')\n", "plt.ylabel('Quantity 2')\n", "plt.title('Q-Q plot');" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here's how we compute a P-P plot" ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [], "source": [ "def pp_plot(cdf1, cdf2):\n", " \"\"\"Compute results for a P-P plot.\n", " \n", " Evaluates the Cdfs for all quantities in either Cdf.\n", " \n", " :param cdf1: Cdf\n", " :param cdf2: Cdf\n", " \n", " :return: tuple of arrays\n", " \"\"\"\n", " qs = cdf1.index.union(cdf2.index)\n", " p1 = cdf1(qs)\n", " p2 = cdf2(qs)\n", " return p1, p2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And here's what it looks like." ] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAEWCAYAAABrDZDcAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8vihELAAAACXBIWXMAAAsTAAALEwEAmpwYAAAVjUlEQVR4nO3df6zdd33f8eerBndMAcJqoKkd45SGHx4lCC6BMgpJWYcdjXmZqjUhIiMCMquEdZ0m0bGqrEJD6UpV0hLwrCyNaAseShMI1BChpSEwEhpnc0PsqMwYktw4EAc6aBK01uG9P84xHI7POT73+nzPued8nw/pSvf7/X7uue+PbN339/35fD+fb6oKSVJ7/disA5AkzZaJQJJazkQgSS1nIpCkljMRSFLLmQgkqeVMBNIUJakkPzPrOKReJgK1UpKvJ/lekkeTfDPJHyY5bUjb65L8bbftt5N8NskLGo7vzUm+0OTvkI4zEajN3lBVpwEvBV4O/MaItv+l23YT8DBwXfPhSdNhIlDrVdWDwKeBF43R9nHgI8PadquHXd2q4W+SfC7Jc4a0fXqSDyc5muS+JL+R5MeSvBDYBfxctwr5v6vunDQGE4FaL8mZwAXA/x6j7WnAJSdpewnwHmADsB/4kyHt/gB4OvDTwGuBS4HLqupeYCdwe1WdVlWnj9URaZVMBGqzj3fvtr8AfA5474i2/77b9hBwGvDmEW3/rKpuq6r/B/xHOnf2Z/Y2SLIO+GXgP1TV31TV14HfBd60uq5Iq2ciUJv986o6vaqeU1W/UlXfS/Ku7nDMo0l29bR9X7ftT1bVP6uqr4743AeOf1NVjwLfBn6qr80GYD1wX8+5+4CNp9gnacVMBFKPqnpvdzjmtKraucqP+cHdf3co6R8AR/raPAL8HdA7f7AZePB4KKv83dKKmQikybsgyauTrKczV/Clqnqgt0FVPQF8DPjPSZ7anVD+d8Afd5t8E9jU/QypUSYCafI+ArybzpDQy+hMHg/yDuAx4DCdeYqPANd2r90
CHAC+keSRRqNV68UX00iTk+Q6YLmqRq1JkNYUKwJJajkTgSS1nENDktRyVgSS1HJPmnUAK7Vhw4basmXLrMOQpLly1113PVJVzxx0be4SwZYtW9i3b9+sw5CkuZLkvmHXHBqSpJYzEUhSy5kIJKnlTASS1HImAklqucYSQZJrkzyc5J4h15Pk95McSnJ3kpc2FYskabgmK4LrgG0jrm8Hzu5+XQ58qMFYJElDNJYIquo2OtvwDrMD+HB13AGcnuSMpuKRpHn2W588wG998kAjnz3LBWUb6XmlH7DcPfdQf8Mkl9OpGti8efNUgpOkteTgke829tmzTAQZcG7gDnhVtRvYDbC0tOQueZIWyke+dD+f2P/gyDYHH/ouW894WiO/f5ZPDS3T825XYBMnvtdVkhbeJ/Y/yMGHRt/xbz3jaex4ycZGfv8sK4KbgCuS7AFeAXynqk4YFpKkNth6xtP47//652byuxtLBEk+CpwHbEiyTOcdrk8GqKpdwF7gAuAQ8DhwWVOxSJKGaywRVNXFJ7lewNub+v2SpPG4sliSWs5EIEktZyKQpJYzEUhSy5kIJKnl5u6dxZI0j0atHm5y1fA4rAgkaQpGrR5uctXwOKwIJKkhvVXA8bv+Wa0eHsWKQJIa0lsFzPqufxQrAknqM85uoONYy1VALysCSeozzm6g41jLVUAvKwJJrbKSvf/X+p38pFgRSGqVWe/9vxZZEUhqnTbd7Y/DikCSWs5EIEktZyKQpJZzjkDSwhu0wlc/ZEUgaeHNywrfWbEikLSQ5mWfn7XAikDSQrIKGJ8VgaSFZRUwHisCSWo5KwJJa8akdv0Enw5aCSsCSWvGpHb9BOcFVsKKQNKa4rj+9FkRSFLLmQgkqeVMBJLUcs4RSJqKlbwZTNNlRSBpKnwz2NrVaEWQZBtwFbAOuKaqruy7/nTgj4HN3VjeV1V/2GRMkmbHJ4LWpsYqgiTrgKuB7cBW4OIkW/uavR04WFXnAOcBv5tkfVMxSZJO1GRFcC5wqKoOAyTZA+wADva0KeCpSQKcBnwbONZgTJKmyPcAzIcm5wg2Ag/0HC93z/X6APBC4AjwZeBXq+r7/R+U5PIk+5LsO3r0aFPxSpowdwCdD01WBBlwrvqOXw/sB34BeC7w2SSfr6ofmVGqqt3AboClpaX+z5C0hvgegPnTZEWwDJzZc7yJzp1/r8uAG6rjEPA14AUNxiSpYVYB86fJiuBO4OwkZwEPAhcBb+xrcz/wOuDzSZ4NPB843GBMkiZk2LoAq4D501hFUFXHgCuAm4F7gY9V1YEkO5Ps7DZ7D/CqJF8G/gfwzqp6pKmYJE3OsHUBVgHzp9F1BFW1F9jbd25Xz/dHgH/SZAySmuOd/2JwZbEktZyJQJJazkQgSS3n7qOShhq1Y6grhReHFYGkoUbtGOrTQYvDikCSawJazopAkmsCWs6KQBLgmoA2syKQpJazIpBayncF6DgrAqml3CVUx1kRSC3mvIDAikCSWs+KQFpwJ1sjIFkRSAvONQI6GSsCaQH53mCthBWBtIB8IkgrYUUgLSirAI3LikCSWs5EIEktZyKQpJYzEUhSyzlZLC0IN5HTalkRSAvCR0a1WlYE0gLxkVGthhWBJLWciUCSWs5EIEkt5xyBNMd8UkiTYEUgzTGfFNIkNFoRJNkGXAWsA66pqisHtDkPeD/wZOCRqnptkzFJi8YnhXSqGksESdYBVwO/CCwDdya5qaoO9rQ5HfggsK2q7k/yrKbikSQN1mRFcC5wqKoOAyTZA+wADva0eSNwQ1XdD1BVDzcYj7QmDHt15Go4L6BJaHKOYCPwQM/xcvdcr+cBz0hya5K7klw66IOSXJ5kX5J9R48ebShcaTqGvTpyNZwX0CQ0WRFkwLka8PtfBrwOeApwe5I7quorP/JDVbuB3QBLS0v9nyHNHcf1tZY0mQiWgTN7jjcBRwa0eaSqHgMeS3IbcA7wFSRJU9FkIrgTODvJWcCDwEV
05gR6fQL4QJInAeuBVwC/12BMarlJjs+vluP6WmsamyOoqmPAFcDNwL3Ax6rqQJKdSXZ229wLfAa4G/gLOo+Y3tNUTNIkx+dXy3F9rTWNriOoqr3A3r5zu/qOfwf4nSbjkHo5Pi/9KFcWS1LLudeQ5tZqxvsdn5dOZEWgubWa8X7H56UTjawIkryAziKwL1XVoz3nt1XVZ5oOTuo3aLdNx/ulUzO0Ikjyb+g83vkO4J4kO3ouv7fpwKRB3G1TmrxRFcHbgJdV1aNJtgDXJ9lSVVcxeNWwNBVWAdJkjUoE644PB1XV17vbRV+f5DmYCCRpYYxKBN9I8pKq2g/QrQz+KXAt8LPTCE4C38IlNW3UU0OXAt/oPVFVx6rqUuA1jUYl9XBeQGrW0IqgqpZHXPufzYQjDea8gNQc1xFIUsuNenz0x6cZiCRpNkZVBLcDJPmjKcUiSZqBUU8NrU/yr4BXJfkX/Rer6obmwpIkTcuoRLATuAQ4HXhD37UCTASStABGPTX0BeALSfZV1X+bYkySpCkamgh6hoP+2qEhSVpco4aGjg8HPQt4FXBL9/h84FYcGpKkhTBqaOgygCSfArZW1UPd4zOAq6cTniSpaeMsKNtyPAl0fRN4XkPxSJKmbJxXVd6a5Gbgo3SeFroY+PNGo5IkTc1JE0FVXZHkQn640dx/raobmw1LbeeOo9L0jNpi4meS/COAqrqxqn6tqn4N+FaS504tQrWSO45K0zOqIng/8K4B5x/vXutfZCZNlDuOStMxarJ4S1Xd3X+yqvYBWxqLSJI0VaMqgr834tpTJh2I5LyANBujKoI7k7yt/2SStwB3NReS2sp5AWk2RlUE/xa4Mckl/PAP/xKwHriw4bjUUs4LSNM3amXxN+lsQX0+8KLu6T+rqluG/Ywkaf6Ms47gz3EBmSaody6gl/MC0mz4zmJNXe9cQC/nBaTZGGeLiVVLsg24ClgHXFNVVw5p93LgDuCXq+r6JmPS2uBcgLR2NFYRJFlHZ5fS7cBW4OIkW4e0+23g5qZikSQN12RFcC5wqKoOAyTZA+wADva1ewfwp8DLG4xFM+YaAWntanKOYCPwQM/xcvfcDyTZSOdR1F2jPijJ5Un2Jdl39OjRiQeq5rlGQFq7mqwIMuBc9R2/H3hnVT2RDGre/aGq3cBugKWlpf7P0Bo1qApwXkBae5pMBMvAmT3Hm4AjfW2WgD3dJLABuCDJsar6eINxaUqOVwFbz3iaVYC0hjWZCO4Ezk5yFvAgcBHwxt4GVXXW8e+TXAd8yiSwWKwCpLWvsURQVceSXEHnaaB1wLVVdSDJzu71kfMCkqTpaHQdQVXtBfb2nRuYAKrqzU3Gounw6SBp/riyWBPl00HS/Gm0IlA7+HSQNN+sCHTKrAKk+WZFoKG7gY7LKkCab1YEGrob6LisAqT5ZkUgwOf9pTazIpCkljMRSFLLmQgkqeVMBJLUciYCSWo5E4EktZyJQJJazkQgSS1nIpCkljMRSFLLmQgkqeVMBJLUciYCSWo5dx9tkWHvHfDdwlK7WRG0yLD3Dvg+AandrAhaxvcOSOpnRSBJLWcikKSWc2hoATkpLGklrAgWkJPCklbCimAODLvDH+b4nb+TwpLGYUUwB4bd4Q/jnb+klbAimBPe4UtqihWBJLVco4kgybYkf5XkUJJfH3D9kiR3d7++mOScJuORJJ2osUSQZB1wNbAd2ApcnGRrX7OvAa+tqhcD7wF2NxWPJGmwJucIzgUOVdVhgCR7gB3AweMNquqLPe3vADY1GM+asNIngMDn/yU1q8mhoY3AAz3Hy91zw7wF+PSgC0kuT7Ivyb6jR49OMMTpW+kTQOBTQJKa1WRFkAHnamDD5Hw6ieDVg65X1W66w0ZLS0sDP2Oe+ASQpLWkyUSwDJzZc7wJONLfKMmLgWuA7VX1rQbjkSQN0OTQ0J3A2UnOSrIeuAi4qbdBks3ADcCbquorDcYiSRqisYqgqo4luQK4GVg
HXFtVB5Ls7F7fBfwm8BPAB5MAHKuqpaZikiSdqNGVxVW1F9jbd25Xz/dvBd7aZAySpNFcWSxJLWcikKSWMxFIUsuZCCSp5UwEktRyJgJJajkTgSS1nG8om4LeHUfdSVTSWmNFMAW9O466k6iktcaKYErccVTSWmVFIEktZyKQpJYzEUhSy5kIJKnlTASS1HImAklqOROBJLWc6whOUe+q4WFcTSxpLbMiOEW9q4aHcTWxpLXMimBMw+78j9/tu2pY0ryyIhjTsDt/7/YlzTsrghEG7Rrqnb+kRWNFMIK7hkpqAyuCPlYBktrGiqCPVYCktrEiGMAqQFKbWBFIUstZEeA7hSW1mxUBzgtIajcrgi7nBSS1lRWBJLVcoxVBkm3AVcA64JqqurLverrXLwAeB95cVf+riVhG7RLqvICkNmusIkiyDrga2A5sBS5OsrWv2Xbg7O7X5cCHmopn1C6hzgtIarMmK4JzgUNVdRggyR5gB3Cwp80O4MNVVcAdSU5PckZVPdREQM4DSNKJmkwEG4EHeo6XgVeM0WYj8COJIMnldCoGNm/evKpgtv6UQz+SNEiTiSADztUq2lBVu4HdAEtLSydcH8e73/APV/NjkrTwmnxqaBk4s+d4E3BkFW0kSQ1qMhHcCZyd5Kwk64GLgJv62twEXJqOVwLfaWp+QJI0WGNDQ1V1LMkVwM10Hh+9tqoOJNnZvb4L2Evn0dFDdB4fvaypeCRJgzW6jqCq9tL5Y997blfP9wW8vckYJEmjubJYklrORCBJLWcikKSWMxFIUsulM187P5IcBe5bwY9sAB5pKJy1rI39bmOfoZ39bmOf4dT6/ZyqeuagC3OXCFYqyb6qWpp1HNPWxn63sc/Qzn63sc/QXL8dGpKkljMRSFLLtSER7J51ADPSxn63sc/Qzn63sc/QUL8Xfo5AkjRaGyoCSdIIJgJJarmFSQRJtiX5qySHkvz6gOtJ8vvd63cneeks4pykMfp8Sbevdyf5YpJzZhHnpJ2s3z3tXp7kiSS/NM34mjBOn5Ocl2R/kgNJPjftGJswxv/xpyf5ZJK/7PZ77ncwTnJtkoeT3DPk+uT/llXV3H/R2eb6q8BPA+uBvwS29rW5APg0nbeivRL40qzjnkKfXwU8o/v99nnv87j97ml3C53db39p1nFP4d/6dDrvA9/cPX7WrOOeUr/fBfx29/tnAt8G1s869lPs92uAlwL3DLk+8b9li1IRnAscqqrDVfW3wB5gR1+bHcCHq+MO4PQkZ0w70Ak6aZ+r6otV9dfdwzvovAFu3o3zbw3wDuBPgYenGVxDxunzG4Ebqup+gKpqS78LeGqSAKfRSQTHphvmZFXVbXT6MczE/5YtSiLYCDzQc7zcPbfSNvNkpf15C527iHl30n4n2QhcCOxiMYzzb/084BlJbk1yV5JLpxZdc8bp9weAF9J5xe2XgV+tqu9PJ7yZmfjfskZfTDNFGXCu/7nYcdrMk7H7k+R8Oong1Y1GNB3j9Pv9wDur6onOjeLcG6fPTwJeBrwOeApwe5I7quorTQfXoHH6/XpgP/ALwHOBzyb5fFV9t+HYZmnif8sWJREsA2f2HG+ic4ew0jbzZKz+JHkxcA2wvaq+NaXYmjROv5eAPd0ksAG4IMmxqvr4VCKcvHH/fz9SVY8BjyW5DTgHmOdEME6/LwOurM7g+aEkXwNeAPzFdEKciYn/LVuUoaE7gbOTnJVkPXARcFNfm5uAS7sz7q8EvlNVD0070Ak6aZ+TbAZuAN4053eGvU7a76o6q6q2VNUW4HrgV+Y4CcB4/78/Afx8kicl+fvAK4B7pxznpI3T7/vpVEEkeTbwfODwVKOcvon/LVuIiqCqjiW5AriZzpMG11bVgSQ7u9d30Xl65ALgEPA4nTuJuTVmn38T+Angg92742M15zs2jtnvhTJOn6vq3iSfAe4Gvg9cU1UDHz+cF2P+W78HuC7Jl+kMmbyzquZ6e+okHwXOAzYkWQb
eDTwZmvtb5hYTktRyizI0JElaJROBJLWciUCSWs5EIEktZyKQpJYzEUhDJPnJJHuSfDXJwSR7kzxvQLvrju9wmuTnu7tg7k/ylL52I3eVlGbFRCAN0N3E7Ebg1qp6blVtpbPT5bNP8qOXAO+rqpdU1ff6rl0HbJt4sNIpWogFZVIDzgf+rneBWlXthx8kiT+gs7/N1+ju/ZLkrcC/BF6f5B9X1SW9H1hVtyXZMpXopRUwEUiDvQi4a8i1C+lsZfCzdCqEg3RWvV6T5NXAp6rq+umEKZ06h4aklXsN8NGqeqKqjtB5AY40t0wE0mAH6GzrPIx7s2hhmAikwW4BfjzJ246f6L4D+bXAbcBFSdZ13wx1/qyClCbBRCAN0N3f/kLgF7uPjx4A/hOdfd9vBP4PnTdifQgY60Xx3V0lbween2Q5yVuaiF1aKXcflaSWsyKQpJYzEUhSy5kIJKnlTASS1HImAklqOROBJLWciUCSWu7/A1uN0shkIj8HAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "p1, p2 = pp_plot(cdf1, cdf2)\n", "plt.plot(p1, p2)\n", "plt.xlabel('Cdf 1')\n", "plt.ylabel('Cdf 2')\n", "plt.title('P-P plot');" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Mutation\n", "\n", "`Cdf` objects are mutable, but in general the result is not a valid Cdf." ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
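<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\">\n", "      <th></th>\n", "      <th>probs</th>\n", "    </tr>\n", "  </thead>\n", "  <tbody>\n",
"    <tr>\n", "      <th>1</th>\n", "      <td>0.25</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>2</th>\n", "      <td>0.50</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>3</th>\n", "      <td>0.75</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>4</th>\n", "      <td>1.00</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>5</th>\n", "      <td>1.25</td>\n", "    </tr>\n",
"  </tbody>\n", "</table>\n", "</div>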
" ], "text/plain": [ "1 0.25\n", "2 0.50\n", "3 0.75\n", "4 1.00\n", "5 1.25\n", "Name: , dtype: float64" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d4[5] = 1.25\n", "d4" ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\">\n", "      <th></th>\n", "      <th>probs</th>\n", "    </tr>\n", "  </thead>\n", "  <tbody>\n",
"    <tr>\n", "      <th>1</th>\n", "      <td>0.2</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>2</th>\n", "      <td>0.4</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>3</th>\n", "      <td>0.6</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>4</th>\n", "      <td>0.8</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>5</th>\n", "      <td>1.0</td>\n", "    </tr>\n",
"  </tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ "1 0.2\n", "2 0.4\n", "3 0.6\n", "4 0.8\n", "5 1.0\n", "Name: , dtype: float64" ] }, "execution_count": 54, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d4.normalize()\n", "d4" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Statistics\n", "\n", "`Cdf` overrides the statistics methods to compute `mean`, `median`, etc." ] }, { "cell_type": "code", "execution_count": 55, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3.5" ] }, "execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.mean()" ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2.916666666666667" ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.var()" ] }, { "cell_type": "code", "execution_count": 57, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1.7078251276599332" ] }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.std()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Sampling\n", "\n", "`choice` chooses random values from the `Cdf`, following the API of `np.random.choice`." ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([2, 2, 3, 4, 2, 5, 2, 2, 2, 2])" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.choice(size=10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`sample` chooses random values from the `Cdf`, with replacement."
] }, { "cell_type": "code", "execution_count": 59, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([6., 1., 5., 1., 6., 5., 1., 5., 6., 5.])" ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.sample(n=10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Arithmetic\n", "\n", "`Pmf` and `Cdf` provide `add_dist`, which computes the distribution of the sum." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here's the distribution of the sum of two dice." ] }, { "cell_type": "code", "execution_count": 60, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
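<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\">\n", "      <th></th>\n", "      <th>probs</th>\n", "    </tr>\n", "  </thead>\n", "  <tbody>\n",
"    <tr>\n", "      <th>2</th>\n", "      <td>0.027778</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>3</th>\n", "      <td>0.055556</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>4</th>\n", "      <td>0.083333</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>5</th>\n", "      <td>0.111111</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>6</th>\n", "      <td>0.138889</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>7</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>8</th>\n", "      <td>0.138889</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>9</th>\n", "      <td>0.111111</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>10</th>\n", "      <td>0.083333</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>11</th>\n", "      <td>0.055556</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>12</th>\n", "      <td>0.027778</td>\n", "    </tr>\n",
"  </tbody>\n", "</table>\n", "</div>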
" ], "text/plain": [ "2 0.027778\n", "3 0.055556\n", "4 0.083333\n", "5 0.111111\n", "6 0.138889\n", "7 0.166667\n", "8 0.138889\n", "9 0.111111\n", "10 0.083333\n", "11 0.055556\n", "12 0.027778\n", "Name: , dtype: float64" ] }, "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6 = Pmf.from_seq([1,2,3,4,5,6])\n", "\n", "twice = d6.add_dist(d6)\n", "twice" ] }, { "cell_type": "code", "execution_count": 61, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "6.999999999999998" ] }, "execution_count": 61, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "twice.bar()\n", "decorate_dice('Two dice')\n", "twice.mean()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To add a constant to a distribution, you could construct a deterministic `Pmf`" ] }, { "cell_type": "code", "execution_count": 62, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
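<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\">\n", "      <th></th>\n", "      <th>probs</th>\n", "    </tr>\n", "  </thead>\n", "  <tbody>\n",
"    <tr>\n", "      <th>2</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>3</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>4</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>5</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>6</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>7</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"  </tbody>\n", "</table>\n", "</div>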
" ], "text/plain": [ "2 0.166667\n", "3 0.166667\n", "4 0.166667\n", "5 0.166667\n", "6 0.166667\n", "7 0.166667\n", "Name: , dtype: float64" ] }, "execution_count": 62, "metadata": {}, "output_type": "execute_result" } ], "source": [ "const = Pmf.from_seq([1])\n", "d6.add_dist(const)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But `add_dist` also handles constants as a special case:" ] }, { "cell_type": "code", "execution_count": 63, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
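<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\">\n", "      <th></th>\n", "      <th>probs</th>\n", "    </tr>\n", "  </thead>\n", "  <tbody>\n",
"    <tr>\n", "      <th>2</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>3</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>4</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>5</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>6</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>7</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"  </tbody>\n", "</table>\n", "</div>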
" ], "text/plain": [ "2 0.166667\n", "3 0.166667\n", "4 0.166667\n", "5 0.166667\n", "6 0.166667\n", "7 0.166667\n", "Name: , dtype: float64" ] }, "execution_count": 63, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.add_dist(1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Other arithmetic operations are also implemented" ] }, { "cell_type": "code", "execution_count": 64, "metadata": {}, "outputs": [], "source": [ "d4 = Pmf.from_seq([1,2,3,4])" ] }, { "cell_type": "code", "execution_count": 65, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
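<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\">\n", "      <th></th>\n", "      <th>probs</th>\n", "    </tr>\n", "  </thead>\n", "  <tbody>\n",
"    <tr>\n", "      <th>-3</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>-2</th>\n", "      <td>0.083333</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>-1</th>\n", "      <td>0.125000</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>0</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>1</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>2</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>3</th>\n", "      <td>0.125000</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>4</th>\n", "      <td>0.083333</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>5</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"  </tbody>\n", "</table>\n", "</div>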
" ], "text/plain": [ "-3 0.041667\n", "-2 0.083333\n", "-1 0.125000\n", " 0 0.166667\n", " 1 0.166667\n", " 2 0.166667\n", " 3 0.125000\n", " 4 0.083333\n", " 5 0.041667\n", "Name: , dtype: float64" ] }, "execution_count": 65, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.sub_dist(d4)" ] }, { "cell_type": "code", "execution_count": 66, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
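<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\">\n", "      <th></th>\n", "      <th>probs</th>\n", "    </tr>\n", "  </thead>\n", "  <tbody>\n",
"    <tr>\n", "      <th>1</th>\n", "      <td>0.0625</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>2</th>\n", "      <td>0.1250</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>3</th>\n", "      <td>0.1250</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>4</th>\n", "      <td>0.1875</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>6</th>\n", "      <td>0.1250</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>8</th>\n", "      <td>0.1250</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>9</th>\n", "      <td>0.0625</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>12</th>\n", "      <td>0.1250</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>16</th>\n", "      <td>0.0625</td>\n", "    </tr>\n",
"  </tbody>\n", "</table>\n", "</div>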
" ], "text/plain": [ "1 0.0625\n", "2 0.1250\n", "3 0.1250\n", "4 0.1875\n", "6 0.1250\n", "8 0.1250\n", "9 0.0625\n", "12 0.1250\n", "16 0.0625\n", "Name: , dtype: float64" ] }, "execution_count": 66, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d4.mul_dist(d4)" ] }, { "cell_type": "code", "execution_count": 67, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
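<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\">\n", "      <th></th>\n", "      <th>probs</th>\n", "    </tr>\n", "  </thead>\n", "  <tbody>\n",
"    <tr>\n", "      <th>0.250000</th>\n", "      <td>0.0625</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>0.333333</th>\n", "      <td>0.0625</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>0.500000</th>\n", "      <td>0.1250</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>0.666667</th>\n", "      <td>0.0625</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>0.750000</th>\n", "      <td>0.0625</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>1.000000</th>\n", "      <td>0.2500</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>1.333333</th>\n", "      <td>0.0625</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>1.500000</th>\n", "      <td>0.0625</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>2.000000</th>\n", "      <td>0.1250</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>3.000000</th>\n", "      <td>0.0625</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>4.000000</th>\n", "      <td>0.0625</td>\n", "    </tr>\n",
"  </tbody>\n", "</table>\n", "</div>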
" ], "text/plain": [ "0.250000 0.0625\n", "0.333333 0.0625\n", "0.500000 0.1250\n", "0.666667 0.0625\n", "0.750000 0.0625\n", "1.000000 0.2500\n", "1.333333 0.0625\n", "1.500000 0.0625\n", "2.000000 0.1250\n", "3.000000 0.0625\n", "4.000000 0.0625\n", "Name: , dtype: float64" ] }, "execution_count": 67, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d4.div_dist(d4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Comparison operators\n", "\n", "`Pmf` implements comparison operators that return probabilities.\n", "\n", "You can compare a `Pmf` to a scalar:" ] }, { "cell_type": "code", "execution_count": 68, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.3333333333333333" ] }, "execution_count": 68, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.lt_dist(3)" ] }, { "cell_type": "code", "execution_count": 69, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.75" ] }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d4.ge_dist(2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or compare `Pmf` objects:" ] }, { "cell_type": "code", "execution_count": 70, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.25" ] }, "execution_count": 70, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d4.gt_dist(d6)" ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.41666666666666663" ] }, "execution_count": 71, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6.le_dist(d4)" ] }, { "cell_type": "code", "execution_count": 72, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.16666666666666666" ] }, "execution_count": 72, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d4.eq_dist(d6)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Interestingly, this way of comparing distributions is [nontransitive]()." 
] }, { "cell_type": "code", "execution_count": 73, "metadata": {}, "outputs": [], "source": [ "A = Pmf.from_seq([2, 2, 4, 4, 9, 9])\n", "B = Pmf.from_seq([1, 1, 6, 6, 8, 8])\n", "C = Pmf.from_seq([3, 3, 5, 5, 7, 7])" ] }, { "cell_type": "code", "execution_count": 74, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.5555555555555556" ] }, "execution_count": 74, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A.gt_dist(B)" ] }, { "cell_type": "code", "execution_count": 75, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.5555555555555556" ] }, "execution_count": 75, "metadata": {}, "output_type": "execute_result" } ], "source": [ "B.gt_dist(C)" ] }, { "cell_type": "code", "execution_count": 76, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.5555555555555556" ] }, "execution_count": 76, "metadata": {}, "output_type": "execute_result" } ], "source": [ "C.gt_dist(A)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Joint distributions\n", "\n", "`Pmf.make_joint` takes two `Pmf` objects and makes their joint distribution, assuming independence." ] }, { "cell_type": "code", "execution_count": 77, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
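<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\">\n", "      <th></th>\n", "      <th>probs</th>\n", "    </tr>\n", "  </thead>\n", "  <tbody>\n",
"    <tr>\n", "      <th>1</th>\n", "      <td>0.25</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>2</th>\n", "      <td>0.25</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>3</th>\n", "      <td>0.25</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>4</th>\n", "      <td>0.25</td>\n", "    </tr>\n",
"  </tbody>\n", "</table>\n", "</div>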
" ], "text/plain": [ "1 0.25\n", "2 0.25\n", "3 0.25\n", "4 0.25\n", "Name: , dtype: float64" ] }, "execution_count": 77, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d4 = Pmf.from_seq(range(1,5))\n", "d4" ] }, { "cell_type": "code", "execution_count": 78, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
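<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\">\n", "      <th></th>\n", "      <th>probs</th>\n", "    </tr>\n", "  </thead>\n", "  <tbody>\n",
"    <tr>\n", "      <th>1</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>2</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>3</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>4</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>5</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>6</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"  </tbody>\n", "</table>\n", "</div>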
" ], "text/plain": [ "1 0.166667\n", "2 0.166667\n", "3 0.166667\n", "4 0.166667\n", "5 0.166667\n", "6 0.166667\n", "Name: , dtype: float64" ] }, "execution_count": 78, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d6 = Pmf.from_seq(range(1,7))\n", "d6" ] }, { "cell_type": "code", "execution_count": 79, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\">\n", "      <th></th>\n", "      <th></th>\n", "      <th>probs</th>\n", "    </tr>\n", "  </thead>\n", "  <tbody>\n",
"    <tr>\n", "      <th rowspan=\"6\" valign=\"top\">1</th>\n", "      <th>1</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>2</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>3</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>4</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>5</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>6</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th rowspan=\"6\" valign=\"top\">2</th>\n", "      <th>1</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>2</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>3</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>4</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>5</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>6</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th rowspan=\"6\" valign=\"top\">3</th>\n", "      <th>1</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>2</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>3</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>4</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>5</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>6</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th rowspan=\"6\" valign=\"top\">4</th>\n", "      <th>1</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>2</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>3</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>4</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>5</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>6</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"  </tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ "1 1 0.041667\n", " 2 0.041667\n", " 3 0.041667\n", " 4 0.041667\n", " 5 0.041667\n", " 6 0.041667\n", "2 1 0.041667\n", " 2 0.041667\n", " 3 0.041667\n", " 4 0.041667\n", " 5 0.041667\n", " 6 0.041667\n", "3 1 0.041667\n", " 2 0.041667\n", " 3 0.041667\n", " 4 0.041667\n", " 5 0.041667\n", " 6 0.041667\n", "4 1 0.041667\n", " 2 0.041667\n", " 3 0.041667\n", " 4 0.041667\n", " 5 0.041667\n", " 6 0.041667\n", "Name: , dtype: float64" ] }, "execution_count": 79, "metadata": {}, "output_type": "execute_result" } ], "source": [ "joint = Pmf.make_joint(d4, d6)\n", "joint" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The result is a `Pmf` object that uses a MultiIndex to represent the values." ] }, { "cell_type": "code", "execution_count": 80, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "MultiIndex([(1, 1),\n", " (1, 2),\n", " (1, 3),\n", " (1, 4),\n", " (1, 5),\n", " (1, 6),\n", " (2, 1),\n", " (2, 2),\n", " (2, 3),\n", " (2, 4),\n", " (2, 5),\n", " (2, 6),\n", " (3, 1),\n", " (3, 2),\n", " (3, 3),\n", " (3, 4),\n", " (3, 5),\n", " (3, 6),\n", " (4, 1),\n", " (4, 2),\n", " (4, 3),\n", " (4, 4),\n", " (4, 5),\n", " (4, 6)],\n", " )" ] }, "execution_count": 80, "metadata": {}, "output_type": "execute_result" } ], "source": [ "joint.index" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you ask for the `qs`, you get an array of pairs:" ] }, { "cell_type": "code", "execution_count": 81, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 1), (2, 2),\n", " (2, 3), (2, 4), (2, 5), (2, 6), (3, 1), (3, 2), (3, 3), (3, 4),\n", " (3, 5), (3, 6), (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6)],\n", " dtype=object)" ] }, "execution_count": 81, "metadata": {}, "output_type": "execute_result" } ], "source": [ "joint.qs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can select elements using tuples:" ] }, { "cell_type": "code", 
"execution_count": 82, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.041666666666666664" ] }, "execution_count": 82, "metadata": {}, "output_type": "execute_result" } ], "source": [ "joint[1,1]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can get unnormalized conditional distributions by selecting on different axes:" ] }, { "cell_type": "code", "execution_count": 83, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
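<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\">\n", "      <th></th>\n", "      <th>probs</th>\n", "    </tr>\n", "  </thead>\n", "  <tbody>\n",
"    <tr>\n", "      <th>1</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>2</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>3</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>4</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>5</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>6</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"  </tbody>\n", "</table>\n", "</div>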
" ], "text/plain": [ "1 0.041667\n", "2 0.041667\n", "3 0.041667\n", "4 0.041667\n", "5 0.041667\n", "6 0.041667\n", "Name: , dtype: float64" ] }, "execution_count": 83, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Pmf(joint[1])" ] }, { "cell_type": "code", "execution_count": 84, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
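<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\">\n", "      <th></th>\n", "      <th>probs</th>\n", "    </tr>\n", "  </thead>\n", "  <tbody>\n",
"    <tr>\n", "      <th>1</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>2</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>3</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>4</th>\n", "      <td>0.041667</td>\n", "    </tr>\n",
"  </tbody>\n", "</table>\n", "</div>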
" ], "text/plain": [ "1 0.041667\n", "2 0.041667\n", "3 0.041667\n", "4 0.041667\n", "Name: , dtype: float64" ] }, "execution_count": 84, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Pmf(joint.loc[:, 1])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But `Pmf` also provides `conditional(i,val)` which returns the conditional distribution where variable `i` has the value `val`: " ] }, { "cell_type": "code", "execution_count": 85, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
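<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\">\n", "      <th></th>\n", "      <th>probs</th>\n", "    </tr>\n", "  </thead>\n", "  <tbody>\n",
"    <tr>\n", "      <th>1</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>2</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>3</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>4</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>5</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>6</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"  </tbody>\n", "</table>\n", "</div>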
" ], "text/plain": [ "1 0.166667\n", "2 0.166667\n", "3 0.166667\n", "4 0.166667\n", "5 0.166667\n", "6 0.166667\n", "Name: , dtype: float64" ] }, "execution_count": 85, "metadata": {}, "output_type": "execute_result" } ], "source": [ "joint.conditional(0, 1)" ] }, { "cell_type": "code", "execution_count": 86, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
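<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\">\n", "      <th></th>\n", "      <th>probs</th>\n", "    </tr>\n", "  </thead>\n", "  <tbody>\n",
"    <tr>\n", "      <th>1</th>\n", "      <td>0.25</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>2</th>\n", "      <td>0.25</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>3</th>\n", "      <td>0.25</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>4</th>\n", "      <td>0.25</td>\n", "    </tr>\n",
"  </tbody>\n", "</table>\n", "</div>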
" ], "text/plain": [ "1 0.25\n", "2 0.25\n", "3 0.25\n", "4 0.25\n", "Name: , dtype: float64" ] }, "execution_count": 86, "metadata": {}, "output_type": "execute_result" } ], "source": [ "joint.conditional(1, 1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It also provides `marginal(i)`, which returns the marginal distribution along axis `i`" ] }, { "cell_type": "code", "execution_count": 87, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
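<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\">\n", "      <th></th>\n", "      <th>probs</th>\n", "    </tr>\n", "  </thead>\n", "  <tbody>\n",
"    <tr>\n", "      <th>1</th>\n", "      <td>0.25</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>2</th>\n", "      <td>0.25</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>3</th>\n", "      <td>0.25</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>4</th>\n", "      <td>0.25</td>\n", "    </tr>\n",
"  </tbody>\n", "</table>\n", "</div>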
" ], "text/plain": [ "1 0.25\n", "2 0.25\n", "3 0.25\n", "4 0.25\n", "Name: , dtype: float64" ] }, "execution_count": 87, "metadata": {}, "output_type": "execute_result" } ], "source": [ "joint.marginal(0)" ] }, { "cell_type": "code", "execution_count": 88, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
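<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n", "    <tr style=\"text-align: right;\">\n", "      <th></th>\n", "      <th>probs</th>\n", "    </tr>\n", "  </thead>\n", "  <tbody>\n",
"    <tr>\n", "      <th>1</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>2</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>3</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>4</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>5</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"    <tr>\n", "      <th>6</th>\n", "      <td>0.166667</td>\n", "    </tr>\n",
"  </tbody>\n", "</table>\n", "</div>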
" ], "text/plain": [ "1 0.166667\n", "2 0.166667\n", "3 0.166667\n", "4 0.166667\n", "5 0.166667\n", "6 0.166667\n", "Name: , dtype: float64" ] }, "execution_count": 88, "metadata": {}, "output_type": "execute_result" } ], "source": [ "joint.marginal(1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here are some ways of iterating through a joint distribution." ] }, { "cell_type": "code", "execution_count": 89, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(1, 1)\n", "(1, 2)\n", "(1, 3)\n", "(1, 4)\n", "(1, 5)\n", "(1, 6)\n", "(2, 1)\n", "(2, 2)\n", "(2, 3)\n", "(2, 4)\n", "(2, 5)\n", "(2, 6)\n", "(3, 1)\n", "(3, 2)\n", "(3, 3)\n", "(3, 4)\n", "(3, 5)\n", "(3, 6)\n", "(4, 1)\n", "(4, 2)\n", "(4, 3)\n", "(4, 4)\n", "(4, 5)\n", "(4, 6)\n" ] } ], "source": [ "for q in joint.qs:\n", " print(q)" ] }, { "cell_type": "code", "execution_count": 90, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n", "0.041666666666666664\n" ] } ], "source": [ "for p in joint.ps:\n", " print(p)" ] }, { "cell_type": "code", "execution_count": 91, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(1, 1) 0.041666666666666664\n", "(1, 2) 0.041666666666666664\n", "(1, 3) 0.041666666666666664\n", "(1, 4) 0.041666666666666664\n", "(1, 5) 0.041666666666666664\n", "(1, 6) 
0.041666666666666664\n", "(2, 1) 0.041666666666666664\n", "(2, 2) 0.041666666666666664\n", "(2, 3) 0.041666666666666664\n", "(2, 4) 0.041666666666666664\n", "(2, 5) 0.041666666666666664\n", "(2, 6) 0.041666666666666664\n", "(3, 1) 0.041666666666666664\n", "(3, 2) 0.041666666666666664\n", "(3, 3) 0.041666666666666664\n", "(3, 4) 0.041666666666666664\n", "(3, 5) 0.041666666666666664\n", "(3, 6) 0.041666666666666664\n", "(4, 1) 0.041666666666666664\n", "(4, 2) 0.041666666666666664\n", "(4, 3) 0.041666666666666664\n", "(4, 4) 0.041666666666666664\n", "(4, 5) 0.041666666666666664\n", "(4, 6) 0.041666666666666664\n" ] } ], "source": [ "for q, p in joint.items():\n", " print(q, p)" ] }, { "cell_type": "code", "execution_count": 92, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1 1 0.041666666666666664\n", "1 2 0.041666666666666664\n", "1 3 0.041666666666666664\n", "1 4 0.041666666666666664\n", "1 5 0.041666666666666664\n", "1 6 0.041666666666666664\n", "2 1 0.041666666666666664\n", "2 2 0.041666666666666664\n", "2 3 0.041666666666666664\n", "2 4 0.041666666666666664\n", "2 5 0.041666666666666664\n", "2 6 0.041666666666666664\n", "3 1 0.041666666666666664\n", "3 2 0.041666666666666664\n", "3 3 0.041666666666666664\n", "3 4 0.041666666666666664\n", "3 5 0.041666666666666664\n", "3 6 0.041666666666666664\n", "4 1 0.041666666666666664\n", "4 2 0.041666666666666664\n", "4 3 0.041666666666666664\n", "4 4 0.041666666666666664\n", "4 5 0.041666666666666664\n", "4 6 0.041666666666666664\n" ] } ], "source": [ "for (q1, q2), p in joint.items():\n", " print(q1, q2, p)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", 
"pygments_lexer": "ipython3", "version": "3.7.9" } }, "nbformat": 4, "nbformat_minor": 2 }