{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Wald's Sequential Probability Ratio Test"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Sequential test: draw data sequentially\n",
    "\n",
    "Sequence of data $X_1, X_2, \\ldots$. \n",
    "\n",
    "Two hypotheses, $H_0$ and $H_1$, each of which completely specifies the joint distribution of the data.\n",
    "\n",
    "Assume that the joint distributions under $H_0$ and $H_1$ are absolutely continuous with respect to each other, relative to some underlying measure.\n",
    "\n",
    "Let $f_{0m}$ be the likelihood of $H_0$ for data $(X_j)_{j=1}^m$ and let $f_{1m}$ be the likelihood of $H_1$ for data $(X_j)_{j=1}^m$.\n",
    "\n",
    "The likelihood ratio of $H_1$ to $H_0$ is $f_{1m}/f_{0m}$; this is (loosely speaking) the probability of observing $X_1, \\ldots, X_m$ if $H_1$ is true, divided by the probability of observing $X_1, \\ldots, X_m$ if $H_0$ is true.\n",
    "\n",
    "The probability of observing the data actually observed will tend to be higher for whichever\n",
    "hypothesis is in fact true, so this likelihood ratio will tend to be greater than $1$ if $H_1$ is true,\n",
    "and will tend to be less than $1$ if $H_0$ is true.\n",
    "The more observations we make, the more probable it is that the\n",
    "resulting likelihood ratio will be small if $H_0$ is true.\n",
    "Wald (1945) showed that if $H_0$ is true, then the probability is at most $\\alpha$\n",
    "that the likelihood ratio\n",
    "is ever greater than $1/\\alpha$, no matter how many observations are made.\n",
    "More generally, we have:\n",
    "<p class=\"gap01\">&nbsp;</p>\n",
    "\n",
    "<div class=\"theorem\"> Wald's SPRT\n",
    "\n",
    "<p>For any $\\alpha \\in (0, 1)$ and $\\beta \\in [0, 1)$, the following sequential \n",
    "    algorithm tests the hypothesis $H_0$ at level no larger than \n",
    "   $\\alpha$ and with power at least $1-\\beta$\n",
    "   against the alternative $H_1$.\n",
    "</p>\n",
    "\n",
    "<p>\n",
    "   Set $m=0$.\n",
    "</p> \n",
    "\n",
    "   <ol>\n",
    "     <li> Increment $m$\n",
    "     <li> If $\\frac{f_{1m}}{f_{0m}} \\ge \\frac{1-\\beta}{\\alpha}$, stop and reject $H_0$.</li>\n",
    "     <li> If $\\frac{f_{1m}}{f_{0m}} \\le \\frac{\\beta}{1-\\alpha}$, stop and do not reject $H_0$.</li>\n",
    "     <li> If $\\frac{\\beta}{1-\\alpha} < \\frac{f_{1m}}{f_{0m}} < \\frac{1-\\beta}{\\alpha}$, go to step 1.  </li>\n",
    "   </ol>\n",
    "  \n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## SPRT miracle\n",
    "\n",
    "Don't need to know the distribution of the likelihood ratio $\\mbox{LR}=\\frac{f_{1m}}{f_{0m}}$ under the null hypothesis\n",
    "to find the critical values for the test."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Connection to Gambler's ruin\n",
    "\n",
    "For iid data, the likelihood ratio is a product of terms. On a log scale, it's a sum.  Each \"trial\" produces another term in the sum, positive or negative. But&mdash;unlike the classical Gambler's Ruin problem in which the game is fair&mdash;the terms are not equal in magnitude: the steps are not all the same size."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Likelihood ratio for two values of $p$ in iid Bernoulli trials\n",
    "\n",
    "Suppose $X_1, X_2, \\ldots,$ are iid $\\mbox{Bernoulli}(p)$ random variables\n",
    "and let $p_1 > p_0$.\n",
    "\n",
    "Set $\\mbox{LR} \\leftarrow 1$ and $j \\leftarrow 0$.\n",
    "\n",
    " + Increment $j$\n",
    " + If $X_j = 1$, $\\mbox{LR} \\leftarrow \\mbox{LR} \\times p_1/p_0$. \n",
    " + If $X_j = 0$, $\\mbox{LR} \\leftarrow \\mbox{LR} \\times (1-p_1)/(1- p_0)$.\n",
    "\n",
    "What's $\\mbox{LR}$ at stage $m$?\n",
    "Let $T_m \\equiv \\sum_{j=1}^m X_j$.\n",
    "$$\n",
    "     \\frac{p_{1m}}{p_{0m}} \\equiv \n",
    "         \\frac{p_1^{T_m}(1-p_1)^{m-T_m}}{p_0^{T_m}(1-p_0)^{m-T_m}}.\n",
    "$$\n",
    "This is the ratio of binomial probability when $p = p_1$ to binomial probability when\n",
    "$p = p_0$ (the binomial coefficients in the numerator and denominator cancel).\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "<div class=\"callout\">\n",
    "\n",
    "<p class=\"subtitle\">Wald's SPRT for $p$ in iid Bernoulli trials</p>\n",
    "\n",
    "Conclude $p > p_0$ if \n",
    "$$\n",
    "     \\frac{p_{1m}}{p_{0m}} \\ge \\frac{1-\\beta}{\\alpha}.\n",
    "$$\n",
    "Conclude $p \\le p_0$ if\n",
    "$$\n",
    "     \\frac{p_{1m}}{p_{0m}} \\le \\frac{\\beta}{1-\\alpha}.\n",
    "$$\n",
    "Otherwise, draw again.\n",
    "\n",
    "</div>\n",
    "\n",
    "The SPRT approximately minimizes the \n",
    "expected sample size\n",
    "when $p \\le p_0$ or $p > p_1$.\n",
    "For values in $(p_1, p_0)$, it can have larger sample sizes than fixed-sample-size tests."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Simulation of SPRT for Binomial\n",
    "\n",
    "Let's watch the SPRT in action for iid Bernoulli trials."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# This is the first cell with code: set up the Python environment\n",
    "%matplotlib inline\n",
    "from __future__ import print_function, division\n",
    "import matplotlib.pyplot as plt\n",
    "import math\n",
    "import numpy as np\n",
    "import scipy as sp\n",
    "import scipy.stats\n",
    "from scipy.stats import binom\n",
    "import pandas as pd\n",
    "# For interactive widgets\n",
    "from ipywidgets import interact, interactive, fixed\n",
    "import ipywidgets as widgets\n",
    "from IPython.display import clear_output, display, HTML"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "ee6735145b8a489b8e7ed45fa4cbf413"
      }
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/plain": [
       "<function __main__.plotBinomialSPRT>"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def plotBinomialSPRT(n, p, p0, p1, alpha, beta):\n",
    "    '''\n",
    "       Plots the progress of the SPRT for n iid Bernoulli trials with probabiity p\n",
    "       of success, for testing the hypothesis that p=p0 against the hypothesis p=p1\n",
    "       with significance level alpha and power 1-beta\n",
    "    '''\n",
    "    fig, ax = plt.subplots(nrows=2, ncols=1, sharex=True)\n",
    "    trials = sp.stats.binom.rvs(1, p, size=n+1)  # leave room to start at 1\n",
    "    terms = np.ones(n+1)\n",
    "    sfac = p1/p0\n",
    "    ffac = (1.0-p1)/(1.0-p0)\n",
    "    terms[trials == 1.0] = sfac\n",
    "    terms[trials == 0.0] = ffac\n",
    "    terms[0] = 1.0\n",
    "    logterms = np.log(terms)\n",
    "    #\n",
    "    ax[0].plot(range(n+1),np.cumprod(terms), color='b')\n",
    "    ax[0].axhline(y=(1-beta)/alpha, xmin=0, xmax=n, color='g', label=r'$(1-\\beta)/\\alpha$')\n",
    "    ax[0].axhline(y=beta/(1-alpha), xmin=0, xmax=n, color='r', label=r'$\\beta/(1-\\alpha)$')\n",
    "    ax[0].set_title('Simulation of Wald\\'s SPRT')\n",
    "    ax[0].set_ylabel('LR')\n",
    "    ax[0].legend(loc='best')\n",
    "    #\n",
    "    ax[1].plot(range(n+1),np.cumsum(logterms), color='b', linestyle='--')\n",
    "    ax[1].axhline(y=math.log((1-beta)/alpha), xmin=0, xmax=n, color='g', label=r'$log((1-\\beta)/\\alpha)$')\n",
    "    ax[1].axhline(y=math.log(beta/(1-alpha)), xmin=0, xmax=n, color='r', label=r'$log(\\beta/(1-\\alpha))$')\n",
    "    ax[1].set_ylabel('log(LR)')\n",
    "    ax[1].set_xlabel('trials')\n",
    "    ax[1].legend(loc='best')\n",
    "    plt.show()\n",
    "\n",
    "\n",
    "interact(plotBinomialSPRT, n=widgets.IntSlider(min=5, max=300, step=5, value=100),\\\n",
    "         p=widgets.FloatSlider(min=0.001, max=1, step=0.01, value=.45),\\\n",
    "         p0=widgets.FloatSlider(min=0.001, max=1, step=0.01, value=.5),\\\n",
    "         p1=widgets.FloatSlider(min=0.001, max=1, step=0.01, value=.6),\\\n",
    "         alpha=widgets.FloatSlider(min=0.001, max=1, step=0.01, value=.05),\\\n",
    "         beta=widgets.FloatSlider(min=0.001, max=1, step=0.01, value=.05)\n",
    "         )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Behavior\n",
    "For $p_0 < p_1$,\n",
    "+ when $p \\ge p_1$, the likelihood ratio is likely to cross the upper (green) line eventually and unlikely to cross the lower (red) line.\n",
    "+ when $p \\le p_0$, the likelihood ratio is likely to cross the lower (red) line eventually and unlikely to cross the upper (green) line.\n",
    "+ the SPRT approximately minimizes the expected number of trials before rejecting one or the other hypothesis, provided $p \\notin (p_0, p_1)$.\n",
    "\n",
    "For $p_1 < p_0$, the directions are reversed."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## It works for $\\beta = 0$ too: $P$-values\n",
    "The inequalities hold when $\\beta = 0$ also. Then the likelihood ratio has chance at most $\\alpha$ of ever being greater than $1/\\alpha$, if in fact $p > p_0$.\n",
    "\n",
    "Hence, $1/\\mbox{LR}$ is the $P$-value of the hypothesis $p > p_0$. This can be used to construct one-sided confidence bounds for $p$ and other parameters. The next chapter does exactly that, to find a lower confidence bound for the mean of a nonnegative population."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Dependent observations\n",
    "As an illustration of SPRT in for obervations that are not iid, consider the following problem.\n",
    "\n",
    "There is a population of $B$ items. Item $b$ has \"weight\" $N_b$ and \"value\" $a_b \\in \\{0, 1\\}$. Let $N = \\sum_{b=1}^B N_b$. (You might think of items as manufacturing lots, where $a_b = 1$ means the lot is acceptable and $a_b=0$ means the lot is defective.)\n",
    "\n",
    "We want to test the hypothesis $H_0$ that $\\frac{1}{N}\\sum_b N_b a_b = 1/2$ against the\n",
    "alternative hypothesis $H_1$ that $\\frac{1}{N}\\sum_b N_b a_b = \\gamma $, for some\n",
    "fixed $\\gamma > 1/2$.\n",
    "\n",
    "We will draw items sequentially, without replacement, such that the chance that item $b$ is selected in draw $i$, assuming it has not been selected already, is $N_b/\\sum_{j \\notin {\\mathcal B_{i-1}}} N_j$,\n",
    "where ${\\mathcal B_{i-1}}$ is the sample up to and including the $i-1$st draw,\n",
    "and ${\\mathcal B_0} \\equiv \\emptyset$. \n",
    "That is, the chance of selecting an item is in proportion to its weight among the items\n",
    "that have not yet been selected.\n",
    "\n",
    "Let $\\mathbb B_i$ denote the item selected at random in such a manner in the $i$th draw.\n",
    "\n",
    "The chance that the first draw ${\\mathbb B_1}$ gives an item with value 1, i.e., \n",
    "$\\Pr \\{a_{\\mathbb B_1} = 1\\}$, is $\\frac{1}{N}\\sum_b N_b a_b$.\n",
    "Under $H_0$, this chance is $p_{01} = 1/2$; under $H_1$, this chance is \n",
    "$p_{11} = \\gamma$.\n",
    "\n",
    "Given the values of $\\{a_{\\mathbb B_k}\\}_{k=1}^i$, the conditional\n",
    "probability that the $i$th draw gives an item with value 1 is\n",
    "\n",
    "$$\n",
    "   \\Pr \\{a_{\\mathbb B_i} = 1 | {\\mathcal B_{i-1}} \\} = \\frac{ \\sum_{b \\notin {\\mathcal B_{i-1}}} N_b a_b}{\\sum_{b \\notin {\\mathcal B_{i-1}}} N_b}.\n",
    "$$\n",
    "\n",
    "Under $H_0$, this chance is\n",
    "\n",
    "$$\n",
    "   p_{0i} =  \\frac{N/2 - \\sum_{b \\in {\\mathcal B_{i-1}}} N_b a_b}{N - \\sum_{b \\in {\\mathcal B_{i-1}}} N_b}.\n",
    "$$\n",
    "\n",
    "Under $H_1$, this chance is\n",
    "\n",
    "$$\n",
    "   p_{1i} = \\frac{N \\gamma - \\sum_{b \\in {\\mathcal B_{i-1}}} N_b a_b}{N - \\sum_{b \\in {\\mathcal B_{i-1}}} N_b}.\n",
    "$$\n",
    "\n",
    "Let $X_k$ be the indicator of the event that the $k$th draw gives an item with\n",
    "value $1$, i.e., the indicator of the event $a_{\\mathbb B_k} = 1$.\n",
    "The likelihood ratio for a given sequence $\\{X_k\\}_{k=1}^i$ is\n",
    "\n",
    "$$\n",
    "    \\mbox{LR} = \\frac{\\prod_{k=1}^i p_{1k}^{X_k}(1-p_{1k})^{1-X_k}}\n",
    "         {\\prod_{k=1}^i p_{0k}^{X_k}(1-p_{0k})^{1-X_k}}.\n",
    "$$\n",
    "\n",
    "This can be simplified: $p_{0k}$ and $p_{1k}$ have the same denominator,\n",
    "$N - \\sum_{b \\in {\\mathcal B_{i-1}}} N_b$,\n",
    "and their numerators share a term.\n",
    "Define $N(k) \\equiv \\sum_{j = 1}^{k-1}N_b$ and\n",
    "$A(k) \\equiv \\sum_{j = 1}^{k-1}N_b a_b$.\n",
    "Then\n",
    "\n",
    "$$\n",
    "    \\mbox{LR} = \\prod_{k=1}^i \n",
    "    \\left ( \\frac{N\\gamma - A(k)}{N/2 - A(k)} \\right )^{X_k}\n",
    "    \\left ( \\frac{N-N\\gamma - (N(k) - A(k))}{N-N/2 - (N(k)-A(k))} \\right )^{1-X_k}\n",
    "$$\n",
    "\n",
    "$$\n",
    "   = \\prod_{k=1}^i\n",
    "\\left ( \\frac{N\\gamma - A(k)}{N/2 - A(k)} \\right )^{X_k}\n",
    "\\left ( \\frac{N(1-\\gamma) - N(k) + A(k)}{N/2 - N(k) + A(k)} \\right )^{1-X_k},\n",
    "$$\n",
    "where the products are defined to be infinite if the denominator vanishes anywhere.\n",
    "\n",
    "If $H_0$ is true, the chance that $\\mbox{LR}$ is ever greater than $1/\\alpha$\n",
    "is at most $\\alpha$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## What's next?\n",
    "+ [Optional next: The SPRT for the population percentage, sampling without replacement](pSPRTnoReplacement.ipynb)\n",
    "+ [Next: The Kaplan-Wald Confidence Bound for a Nonnegative Mean](kaplanWald.ipynb)\n",
    "+ [Lower confidence bounds for the mean of a nonnegative population: Markov's Inequality & methods based on the empirical distribution](markov.ipynb)\n",
    "+ [Index](index.ipynb)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}