{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> This is one of the 100 recipes of the [IPython Cookbook](http://ipython-books.github.io/), the definitive guide to high-performance scientific computing and data science in Python.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "word_id": "4818_07_bayesian"
   },
   "source": [
    "# 7.3. Getting started with Bayesian methods"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let $q$ be the probability of obtaining a head. Whereas $q$ was just a fixed number in the previous recipe, we consider here that it is a *random variable*. Initially, this variable follows a distribution called the **prior distribution**. It represents our knowledge about $q$ *before* we start flipping the coin. We will update this distribution after each trial (**posterior distribution**)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "1. First, we assume that $q$ is a *uniform* random variable on the interval $[0, 1]$. That's our prior distribution: for all $q$, $P(q)=1$.\n",
    "2. Then, we flip our coin $n$ times. We note $x_i$ the outcome of the $i$-th flip ($0$ for tail, $1$ for head).\n",
    "3. What is the probability distribution of $q$ knowing the observations $x_i$? **Bayes' formula** allows us to compute the *posterior distribution* analytically (see the next section for the mathematical details):\n",
    "\n",
    "$$P(q | \\{x_i\\}) = \\frac{P(\\{x_i\\} | q) P(q)}{\\displaystyle\\int_0^1 P(\\{x_i\\} | q) P(q) dq} = (n+1)\\binom n h  q^h (1-q)^{n-h}$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We define the posterior distribution according to the mathematical formula above. We remark this this expression is $(n+1)$ times the *probability mass function* (PMF) of the binomial distribution, which is directly available in `scipy.stats`. (http://en.wikipedia.org/wiki/Binomial_distribution)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import scipy.stats as st\n",
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "posterior = lambda n, h, q: (n+1) * st.binom(n, q).pmf(h)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's plot this distribution for an observation of $h=61$ heads and $n=100$ total flips."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "n = 100\n",
    "h = 61\n",
    "q = np.linspace(0., 1., 1000)\n",
    "d = posterior(n, h, q)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "plt.figure(figsize=(5,3));\n",
    "plt.plot(q, d, '-k');\n",
    "plt.xlabel('q parameter');\n",
    "plt.ylabel('Posterior distribution');\n",
    "plt.ylim(0, d.max()+1);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "4. This distribution indicates the plausible values for $q$ given the observations. We could use it to derive a **credible interval**, likely to contain the actual value. (http://en.wikipedia.org/wiki/Credible_interval)\n",
    "\n",
    "We can also derive a point estimate. For example, the **maximum a posteriori (MAP) estimation** consists in considering the *maximum* of this distribution as an estimate for $q$. We can find this maximum analytically or numerically. Here, we find analytically $\\hat q = h/n$, which looks quite sensible. (http://en.wikipedia.org/wiki/Maximum_a_posteriori_estimation)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> You'll find all the explanations, figures, references, and much more in the book (to be released later this summer).\n",
    "\n",
    "> [IPython Cookbook](http://ipython-books.github.io/), by [Cyrille Rossant](http://cyrille.rossant.net), Packt Publishing, 2014 (500 pages)."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.4.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}