{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Solutions-02" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Setup Code" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import seaborn; seaborn.set() # for plot formatting" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Making the data\n", "\n", "def make_data(intercept, slope, N=20, dy=5, rseed=42):\n", " rand = np.random.RandomState(rseed)\n", " x = 100 * rand.rand(N)\n", " y = intercept + slope * x\n", " y += dy * rand.randn(N)\n", " return x, y, dy * np.ones_like(x)\n", "\n", "theta_true = [25, 0.5]\n", "x, y, dy = make_data(*theta_true)\n", "plt.errorbar(x, y, dy, fmt='o');" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Quick Exercise\n", "\n", "1. Write a Python function which computes the log-likelihood given a parameter vector $\\theta$, an array of errors $\\varepsilon$, and an array of $x$ and $y$ values\n", "\n", "2. Use tools in [``scipy.optimize``](http://docs.scipy.org/doc/scipy/reference/optimize.html) to maximize this likelihood (i.e. minimize the negative log-likelihood). How close is this result to the input ``theta_true`` above?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Solution" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def log_likelihood(theta, x, y, dy):\n", " y_model = theta[0] + theta[1] * x\n", " return -0.5 * np.sum(np.log(2 * np.pi * dy ** 2) +\n", " (y - y_model) ** 2 / dy ** 2)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from scipy import optimize\n", "\n", "def minfunc(theta, x, y, dy):\n", " return -log_likelihood(theta, x, y, dy)\n", "\n", "result = optimize.minimize(minfunc, x0=[0, 0], args=(x, y, dy))\n", "print(\"result:\", result.x)\n", "print(\"input\", theta_true)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Breakout\n", "\n", "1. Using matplotlib, plot the posterior probability distribution for the slope and intercept, once for each prior. I would suggest using ``plt.contourf()`` or ``plt.pcolor()``. How different are the distributions?\n", "\n", "2. Modify the dataset – how do the results change if you have very few data points or very large errors?\n", "\n", "3. If you finish this quickly, try adding 1-sigma and 2-sigma contours to your plot, keeping in mind that the probabilities are not normalized! You can add them to your plot with ``plt.contour()``." 
,
 { "cell_type": "markdown", "metadata": {}, "source": [ "#### Setup\n", "\n", "These are the two prior functions we defined in the notebook:" ] },
 { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def log_flat_prior(theta):\n", "    if np.all(np.abs(theta) < 1000):\n", "        return 0\n", "    else:\n", "        return -np.inf  # log(0)\n", "\n", "def log_symmetric_prior(theta):\n", "    if np.abs(theta[0]) < 1000:\n", "        return -1.5 * np.log(1 + theta[1] ** 2)\n", "    else:\n", "        return -np.inf  # log(0)" ] },
 { "cell_type": "markdown", "metadata": {}, "source": [ "#### Solution" ] },
 { "cell_type": "markdown", "metadata": {}, "source": [ "We'll start by defining a function which takes a two-dimensional grid of (unnormalized) posterior values and returns the 1, 2, and 3-sigma contour levels.\n", "It works by sorting and normalizing the values and then finding the locations of the $0.68^2$, $0.95^2$, and $0.997^2$ cutoffs; the levels are returned in increasing order, as ``plt.contour()`` requires:" ] },
 { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def contour_levels(grid):\n", "    \"\"\"Compute 1, 2, 3-sigma contour levels for a gridded 2D posterior\"\"\"\n", "    sorted_ = np.sort(grid.ravel())[::-1]\n", "    pct = np.cumsum(sorted_) / np.sum(sorted_)\n", "    cutoffs = np.searchsorted(pct, np.array([0.68, 0.95, 0.997]) ** 2)\n", "    # plt.contour() expects levels in increasing order\n", "    return np.sort(sorted_[cutoffs])" ] },
 { "cell_type": "markdown", "metadata": {}, "source": [ "Now we define a function to compute and plot the results of the Bayesian analysis:" ] },
 { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def plot_results(x, y, dy,\n", "                 slope_limits=(0.3, 0.7),\n", "                 intercept_limits=(15, 35)):\n", "    # 1. Evaluate the log probability on the grid (once for each prior)\n", "    slope_range = np.linspace(*slope_limits)\n", "    intercept_range = np.linspace(*intercept_limits)\n", "\n", "    log_P1 = np.array([[log_likelihood([b, m], x, y, dy) + log_flat_prior([b, m])\n", "                        for m in slope_range] for b in intercept_range])\n", "    log_P2 = np.array([[log_likelihood([b, m], x, y, dy) + log_symmetric_prior([b, m])\n", "                        for m in slope_range] for b in intercept_range])\n", "\n", "    # For convenience, we normalize each probability density so its maximum is 1\n", "    P1 = np.exp(log_P1 - np.max(log_P1))\n", "    P2 = np.exp(log_P2 - np.max(log_P2))\n", "\n", "    # 2. Create two subplots and plot contours showing the results\n", "    fig, ax = plt.subplots(1, 2, figsize=(16, 6),\n", "                           sharex=True, sharey=True)\n", "\n", "    ax[0].contourf(slope_range, intercept_range, P1, 100, cmap='Blues')\n", "    ax[0].contour(slope_range, intercept_range, P1, contour_levels(P1), colors='black')\n", "    ax[0].set_title('Flat Prior')\n", "\n", "    ax[1].contourf(slope_range, intercept_range, P2, 100, cmap='Blues')\n", "    ax[1].contour(slope_range, intercept_range, P2, contour_levels(P2), colors='black')\n", "    ax[1].set_title('Symmetric Prior')\n", "\n", "    # 3. Add grids and set axis labels\n", "    for axi in ax:\n", "        axi.grid(True, linestyle=':', color='gray', alpha=0.5)\n", "        axi.set_axisbelow(False)\n", "        axi.set_xlabel('slope')\n", "        axi.set_ylabel('intercept')" ] },
 { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "plot_results(x, y, dy)" ] },
 { "cell_type": "markdown", "metadata": {}, "source": [ "We see that the form of the prior in this case makes very little difference to the final posterior.\n", "This is often the case in practice: for all the worry about the choice of prior, when you have enough data to constrain your model well, the prior has very little effect." ] }
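,
 { "cell_type": "markdown", "metadata": {}, "source": [ "To put a rough number on \"very little difference\", here is a minimal sketch (an illustrative addition, reusing the default grid limits from ``plot_results``) that normalizes both gridded posteriors to sum to 1 and compares them point by point:" ] },
 { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Rough numerical comparison of the two posteriors; an illustrative\n", "# sketch, with grid limits matching the plot_results defaults.\n", "slope_range = np.linspace(0.3, 0.7)\n", "intercept_range = np.linspace(15, 35)\n", "\n", "log_P1 = np.array([[log_likelihood([b, m], x, y, dy) + log_flat_prior([b, m])\n", "                    for m in slope_range] for b in intercept_range])\n", "log_P2 = np.array([[log_likelihood([b, m], x, y, dy) + log_symmetric_prior([b, m])\n", "                    for m in slope_range] for b in intercept_range])\n", "\n", "# Normalize each grid to sum to 1 so the two are directly comparable\n", "P1 = np.exp(log_P1 - log_P1.max()); P1 /= P1.sum()\n", "P2 = np.exp(log_P2 - log_P2.max()); P2 /= P2.sum()\n", "print(\"max |P1 - P2|:\", np.max(np.abs(P1 - P2)))" ] }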
,
 { "cell_type": "markdown", "metadata": {}, "source": [ "Let's use some different data and see what happens:" ] },
 { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "x2, y2, dy2 = make_data(*theta_true, N=3, dy=40)\n", "plt.errorbar(x2, y2, dy2, fmt='o');" ] },
 { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [ "plot_results(x2, y2, dy2,\n", "             slope_limits=(-2, 5),\n", "             intercept_limits=(-300, 200))" ] },
 { "cell_type": "markdown", "metadata": {}, "source": [ "We see here that the form of the prior **does** have a clear effect when the data don't constrain the model well (in this case, three points with very large error bars).\n", "This encodes exactly what you would expect scientifically: if you don't have very good data, the observations are unlikely to change your views of the world (which are, of course, encoded in the prior)." ] }
 ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.1" } }, "nbformat": 4, "nbformat_minor": 0 }