{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Statistical Inference for Everyone: Technical Supplement\n",
"\n",
"\n",
"\n",
"This document is the technical supplement, for instructors, to [Statistical Inference for Everyone], an introductory statistical inference textbook written from the perspective of \"probability theory as logic\".\n",
"\n",
"\n",
"\n",
"[Statistical Inference for Everyone]: http://web.bryant.edu/~bblais/statistical-inference-for-everyone-sie.html\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Estimating a Proportion\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"$$\\newcommand{\\twocvec}[2]{\\left(\\begin{array}{c}\n",
" #1 \\\\\\\\ #2\n",
" \\end{array}\\right)}\n",
"\\newcommand{\\nchoosek}[2]{\\twocvec{#1}{#2}}\n",
"$$\n",
"\n",
"If $\\theta$ is the model in which the coin lands heads with probability\n",
"$\\theta$ (and tails with probability $1-\\theta$), we need to estimate the\n",
"probability of model $\\theta$ being true given the data, which consist of $N$\n",
"flips of which $h$ are heads. \n",
"\n",
"Bayes rule is:\n",
"\\begin{eqnarray}\n",
"p(\\theta|D,I) &=& \\frac{p(D|\\theta,I)p(\\theta|I)}{p(D|I)} = \n",
"\\frac{p(D|\\theta,I)p(\\theta|I)}{\\sum_\\theta p(D|\\theta,I)p(\\theta|I)}\n",
"\\end{eqnarray}\n",
"\n",
"Thus, the probability of a particular model $\\theta$ being true is the product\n",
"of the probability of the observed data ($h$ heads in $N$ flips) given the\n",
"model $\\theta$ and the prior probability of the model $\\theta$ being true\n",
"before we even look at the data, divided by the probability of the data\n",
"itself, summed over all models.\n",
"\n",
"The prior probability of model $\\theta$ will be assumed to be uniform (from\n",
"maximum entropy considerations). Since $\\theta$ ranges from 0 to\n",
"1, the prior is\n",
"\\begin{eqnarray}\n",
"p(\\theta|I) = 1\n",
"\\end{eqnarray}\n",
"\n",
"The probability of the data given the model $\\theta$ is just the binomial\n",
"distribution:\n",
"\n",
"\\begin{eqnarray}\n",
"p(D|\\theta,I)=\\nchoosek{N}{h} \\theta^h (1-\\theta)^{N-h}\n",
"\\end{eqnarray}\n",
"\n",
"The probability of the data, $p(D|I)$, is found by summing (or in this case\n",
"integrating) $p(D|\\theta,I)p(\\theta|I)$ for all $\\theta$:\n",
"\n",
"\\begin{eqnarray}\n",
"p(D|I) &=& \\int_0^1 \\nchoosek{N}{h} \\theta^h (1-\\theta)^{N-h} \\cdot 1 d\\theta\n",
"\\\\\\\\\n",
"&=&\\frac{N!}{h!(N-h)!} \\frac{h!(N-h)!}{(N+1)!} = \\frac{1}{N+1}\n",
"\\end{eqnarray}\n",
"\n",
"Now the probability of model $\\theta$ being true, given the data, is just\n",
"\n",
"\\begin{eqnarray}\n",
"p(\\theta|D,I)&=& (N+1) \\cdot \\nchoosek{N}{h} \\theta^h (1-\\theta)^{N-h} \\\\\n",
"&=& \\frac{(N+1)!}{h!(N-h)!} \\theta^h (1-\\theta)^{N-h} \n",
"\\end{eqnarray}\n",
"\n"
]
},
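{
"cell_type": "markdown",
"metadata": {},
"source": [
"*(Numerical check, not part of the original text.)* The evidence $p(D|I)=1/(N+1)$ can be verified by integrating the binomial likelihood against the uniform prior on a grid. This is a minimal sketch; the values of $N$, $h$, and the grid size are arbitrary choices.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"from math import factorial\n",
"\n",
"def nchoosek(n, k):\n",
"    return factorial(n) // (factorial(k) * factorial(n - k))\n",
"\n",
"def likelihood(theta, N, h):\n",
"    # binomial likelihood p(D|theta,I)\n",
"    return nchoosek(N, h) * theta**h * (1 - theta)**(N - h)\n",
"\n",
"N, h = 12, 9      # example data (arbitrary): 9 heads in 12 flips\n",
"M = 100000        # midpoint-rule grid points\n",
"dtheta = 1.0 / M\n",
"\n",
"evidence = sum(likelihood((i + 0.5) * dtheta, N, h)\n",
"               for i in range(M)) * dtheta\n",
"\n",
"print(evidence, 1.0 / (N + 1))  # both close to 1/13\n"
]
},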
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Max, Mean, Variance\n",
"\n",
"The model with the maximum probability is found by maximizing $p(\\theta|D,I)$\n",
"w.r.t. $\\theta$:\n",
"\n",
"\\begin{eqnarray}\n",
"\\frac{dp(\\theta|D,I)}{d\\theta} &=& 0 = \\frac{(N+1)!}{h!(N-h)!} \\left( \n",
"    -(N-h) \\theta^h (1-\\theta)^{N-h-1} + h \\theta^{h-1} (1-\\theta)^{N-h} \\right) \\\\\\\\\n",
"(N-h) \\theta^h (1-\\theta)^{N-h-1} &=& h \\theta^{h-1} (1-\\theta)^{N-h} \\\\\\\\\n",
"\\theta(N-h) &=& (1-\\theta) h \\\\\\\\\n",
"N\\theta-\\theta h &=& h-\\theta h \\\\\\\\\n",
"\\theta&=&\\frac{h}{N} \\;\\;\\;\\;\\;\\surd\n",
"\\end{eqnarray}\n",
"\n",
"The average and the standard deviation are also straightforward.\n",
"\n",
"\n",
"\\begin{eqnarray}\n",
"\\bar{\\theta} &=& \\int_0^1 \\theta \\cdot \\frac{(N+1)!}{h!(N-h)!} \\theta^h (1-\\theta)^{N-h} \\, d\\theta \\\\\\\\\n",
"    &=& \\frac{(N+1)!}{h!(N-h)!} \\int_0^1 \\theta^{h+1} (1-\\theta)^{N-h} \\, d\\theta \\\\\\\\\n",
" &=&\\frac{(N+1)!}{h!(N-h)!} \\frac{(h+1)!(N-h)!}{(N+2)!} \\\\\\\\\n",
" &=&\\frac{h+1}{N+2} \\\\\\\\\n",
"\\bar{\\theta^2} &=& \\int_0^1 \\theta^2 \\cdot \\frac{(N+1)!}{h!(N-h)!} \\theta^h (1-\\theta)^{N-h} \\, d\\theta \\\\\\\\\n",
" &=&\\frac{(N+1)!}{h!(N-h)!} \\frac{(h+2)!(N-h)!}{(N+3)!} \\\\\\\\\n",
" &=&\\frac{(h+1)(h+2)}{(N+2)(N+3)} \\\\\\\\\n",
"\\sigma^2 &=& \\bar{\\theta^2} - \\bar{\\theta}^2 = \\frac{(h+1)(h+2)}{(N+2)(N+3)} -\n",
" \\frac{(h+1)(h+1)}{(N+2)(N+2)} \\\\\\\\\n",
"&=&\\frac{(h+1)(N-h+1)}{(N+2)^2(N+3)} \\\\\\\\\n",
"&=& \\frac{(h+1)}{(N+2)}\\left( \\frac{N+2}{N+2} - \\frac{h+1}{N+2}\\right)\n",
"\\frac{1}{N+3} \\\\\\\\\n",
"&=& \\bar{\\theta}(1-\\bar{\\theta})\\frac{1}{N+3}\n",
"\\end{eqnarray}\n",
"\n",
"### An Approximation for the Variance\n",
"\n",
"If $f=h/N$ is the actual fraction of heads observed, then the variance above\n",
"can be written as\n",
"\\begin{eqnarray}\n",
"\\sigma^2 &=&\\frac{(fN+1)(N-fN+1)}{(N+2)^2(N+3)} \\\\\\\\\n",
"\\mbox{(for large $N$)}&\\approx& \\frac{(fN+1)(N-fN)}{N^3}\n",
" =\\frac{(fN+1)(1-f)}{N^2} \\\\\\\\\n",
"\\mbox{(for large $fN$)}&\\approx& \\frac{(fN)(1-f)}{N^2} = \\frac{f(1-f)}{N} \\\\\\\\\n",
"\\sigma^2&\\approx& \\frac{f(1-f)}{N}\n",
"\\end{eqnarray}\n",
"\n",
"In this limit, the distribution (beta distribution) can be approximated with a\n",
"Gaussian.\n"
]
},
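{
"cell_type": "markdown",
"metadata": {},
"source": [
"*(Numerical check, not part of the original text.)* The closed-form posterior mean $(h+1)/(N+2)$, the variance $\\bar{\\theta}(1-\\bar{\\theta})/(N+3)$, and the large-$N$ approximation $f(1-f)/N$ can all be checked against moments computed on a grid. The example values of $N$ and $h$ are arbitrary.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"from math import factorial\n",
"\n",
"N, h = 50, 30     # example data (arbitrary): 30 heads in 50 flips\n",
"\n",
"# normalization (N+1)!/(h!(N-h)!) from the closed-form posterior\n",
"c = factorial(N + 1) // (factorial(h) * factorial(N - h))\n",
"\n",
"def posterior(theta):\n",
"    return c * theta**h * (1 - theta)**(N - h)\n",
"\n",
"M = 200000        # midpoint-rule grid points\n",
"dtheta = 1.0 / M\n",
"grid = [(i + 0.5) * dtheta for i in range(M)]\n",
"\n",
"mean = sum(t * posterior(t) for t in grid) * dtheta\n",
"second = sum(t * t * posterior(t) for t in grid) * dtheta\n",
"var = second - mean**2\n",
"\n",
"print(mean, (h + 1.0) / (N + 2))         # posterior mean (h+1)/(N+2)\n",
"print(var, mean * (1 - mean) / (N + 3))  # exact variance\n",
"f = float(h) / N\n",
"print(var, f * (1 - f) / N)              # large-N approximation\n"
]
},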
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---------------------"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"from IPython.core.display import HTML\n",
"\n",
"\n",
"def css_styling():\n",
"    # read the custom notebook stylesheet and apply it\n",
"    with open(\"../styles/custom.css\", \"r\") as f:\n",
"        styles = f.read()\n",
"    return HTML(styles)\n",
"css_styling()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.9"
}
},
"nbformat": 4,
"nbformat_minor": 0
}