{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Bayesian updates using the table method\n",
    "\n",
    "This notebook demonstrates a way of doing simple Bayesian updates using the table method, with a Pandas DataFrame as the table.\n",
    "\n",
    "Copyright 2018 Allen Downey\n",
    "\n",
    "MIT License: https://opensource.org/licenses/MIT\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Configure Jupyter so figures appear in the notebook\n",
    "%matplotlib inline\n",
    "\n",
    "# Configure Jupyter to display the assigned value after an assignment\n",
    "%config InteractiveShell.ast_node_interactivity='last_expr_or_assign'\n",
    "\n",
    "import numpy as np\n",
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As an example, I'll use the \"cookie problem\", which is a version of a classic probability \"urn problem\".\n",
    "\n",
    "Suppose there are two bowls of cookies.\n",
    "\n",
    "* Bowl #1 has 10 chocolate and 30 vanilla.\n",
    "\n",
    "* Bowl #2 has 20 of each.\n",
    "\n",
    "You choose a bowl at random, and then pick a cookie at random.  The cookie turns out to be vanilla.  What is the probability that the bowl you picked from is Bowl #1?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### The BayesTable class\n",
    "\n",
    "Here's the class that represents a Bayesian table."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "class BayesTable(pd.DataFrame):\n",
    "    def __init__(self, hypo, prior=1):\n",
    "        columns = ['hypo', 'prior', 'likelihood', 'unnorm', 'posterior']\n",
    "        super().__init__(columns=columns)\n",
    "        self.hypo = hypo\n",
    "        self.prior = prior\n",
    "    \n",
    "    def mult(self):\n",
    "        self.unnorm = self.prior * self.likelihood\n",
    "        \n",
    "    def norm(self):\n",
    "        nc = np.sum(self.unnorm)\n",
    "        self.posterior = self.unnorm / nc\n",
    "        return nc\n",
    "    \n",
    "    def update(self):\n",
    "        self.mult()\n",
    "        return self.norm()\n",
    "    \n",
    "    def reset(self):\n",
    "        return BayesTable(self.hypo, self.posterior)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here's an instance that represents the two hypotheses: you either chose from Bowl 1 or Bowl 2:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>hypo</th>\n",
       "      <th>prior</th>\n",
       "      <th>likelihood</th>\n",
       "      <th>unnorm</th>\n",
       "      <th>posterior</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Bowl 1</td>\n",
       "      <td>1</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Bowl 2</td>\n",
       "      <td>1</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     hypo  prior likelihood unnorm posterior\n",
       "0  Bowl 1      1        NaN    NaN       NaN\n",
       "1  Bowl 2      1        NaN    NaN       NaN"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "table = BayesTable(['Bowl 1', 'Bowl 2'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Since we didn't specify prior probabilities, the default value is equal priors for all hypotheses.\n",
    "\n",
    "Now we can specify the likelihoods:\n",
    "\n",
    "* The likelihood of getting a vanilla cookie from Bowl 1 is 3/4.\n",
    "\n",
    "* The likelihood of getting a vanilla cookie from Bowl 2 is 1/2.\n",
    "\n",
    "Here's how we plug the likelihoods in:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>hypo</th>\n",
       "      <th>prior</th>\n",
       "      <th>likelihood</th>\n",
       "      <th>unnorm</th>\n",
       "      <th>posterior</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Bowl 1</td>\n",
       "      <td>1</td>\n",
       "      <td>0.75</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Bowl 2</td>\n",
       "      <td>1</td>\n",
       "      <td>0.50</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     hypo  prior  likelihood unnorm posterior\n",
       "0  Bowl 1      1        0.75    NaN       NaN\n",
       "1  Bowl 2      1        0.50    NaN       NaN"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "table.likelihood = [3/4, 1/2]\n",
    "table"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The next step is to multiply the priors by the likelihoods, which yields the unnormalized posteriors."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>hypo</th>\n",
       "      <th>prior</th>\n",
       "      <th>likelihood</th>\n",
       "      <th>unnorm</th>\n",
       "      <th>posterior</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Bowl 1</td>\n",
       "      <td>1</td>\n",
       "      <td>0.75</td>\n",
       "      <td>0.75</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Bowl 2</td>\n",
       "      <td>1</td>\n",
       "      <td>0.50</td>\n",
       "      <td>0.50</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     hypo  prior  likelihood  unnorm posterior\n",
       "0  Bowl 1      1        0.75    0.75       NaN\n",
       "1  Bowl 2      1        0.50    0.50       NaN"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "table.mult()\n",
    "table"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we can compute the normalized posteriors; `norm` returns the normalization constant."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1.25"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "table.norm()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>hypo</th>\n",
       "      <th>prior</th>\n",
       "      <th>likelihood</th>\n",
       "      <th>unnorm</th>\n",
       "      <th>posterior</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Bowl 1</td>\n",
       "      <td>1</td>\n",
       "      <td>0.75</td>\n",
       "      <td>0.75</td>\n",
       "      <td>0.6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Bowl 2</td>\n",
       "      <td>1</td>\n",
       "      <td>0.50</td>\n",
       "      <td>0.50</td>\n",
       "      <td>0.4</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     hypo  prior  likelihood  unnorm  posterior\n",
       "0  Bowl 1      1        0.75    0.75        0.6\n",
       "1  Bowl 2      1        0.50    0.50        0.4"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "table"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can read the posterior probabilities from the last column: the probability that you chose from Bowl 1 is 60%."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Resetting\n",
    "\n",
    "Suppose you put the first cookie back, stir the bowl, and select another cookie from the same bowl.\n",
    "\n",
    "If this second cookie is chocolate, what is the probability, now, that you are drawing from Bowl 1?\n",
    "\n",
    "To solve this problem, we want a new table where the priors in the new table are the posteriors from the old table.  That's what the `reset` method computes:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>hypo</th>\n",
       "      <th>prior</th>\n",
       "      <th>likelihood</th>\n",
       "      <th>unnorm</th>\n",
       "      <th>posterior</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Bowl 1</td>\n",
       "      <td>0.6</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Bowl 2</td>\n",
       "      <td>0.4</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     hypo  prior likelihood unnorm posterior\n",
       "0  Bowl 1    0.6        NaN    NaN       NaN\n",
       "1  Bowl 2    0.4        NaN    NaN       NaN"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "table2 = table.reset()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here are the likelihoods for the second update."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "table2.likelihood = [1/4, 1/2]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We could run `mult` and `norm` again, or run `update`, which does both steps."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.35"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "table2.update()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here are the results."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>hypo</th>\n",
       "      <th>prior</th>\n",
       "      <th>likelihood</th>\n",
       "      <th>unnorm</th>\n",
       "      <th>posterior</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Bowl 1</td>\n",
       "      <td>0.6</td>\n",
       "      <td>0.25</td>\n",
       "      <td>0.15</td>\n",
       "      <td>0.428571</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Bowl 2</td>\n",
       "      <td>0.4</td>\n",
       "      <td>0.50</td>\n",
       "      <td>0.20</td>\n",
       "      <td>0.571429</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     hypo  prior  likelihood  unnorm  posterior\n",
       "0  Bowl 1    0.6        0.25    0.15   0.428571\n",
       "1  Bowl 2    0.4        0.50    0.20   0.571429"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "table2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "But the result is the same."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}