{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# The Kaplan-Wald Confidence Bound for a Nonnegative Mean" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This section presents an approach to finding lower confidence bounds for the mean of a nonnegative random variable described in H.M. Kaplan's website, http://printmacroj.com/martMean.htm. That work fleshes out an idea due to Wald (1945, 2004), and is closely related to a technique presented in Kaplan (1987).\n", "For another derivation, see Stark & Teague (2014) https://www.usenix.org/system/files/jets/issues/0301/overview/jets_0301-stark.pdf\n", "\n", "\n", "We have an iid sequence of random variables $X_1, X_2, \\ldots$ such that $\\mathbb P \\{X_j \\ge 0 \\} = 1$. Let $F$ be their common distribution function. We seek a lower confidence bound for their common expectation $\\mu \\equiv \\mathbb E X_1 = \\int_0^\\infty x dF$. \n", "\n", "Under the hypothesis that $\\mu = t$, \n", "\n", "$$\n", " t^{-1} \\mu = t^{-1} \\int_0^\\infty xdF(x) = \\int_0^\\infty xt^{-1} dF(x) = 1.\n", "$$\n", "Fix $\\gamma \\in [0, 1]$.\n", "Because $\\int_0^\\infty dF = 1$, it follows that if $\\mu = t$,\n", "\n", "$$ \n", " \\mathbb E \\left ( (\\gamma/t) X_j + (1-\\gamma) \\right ) = (\\gamma/t) \\mathbb E X_j + (1-\\gamma) = \n", " (\\gamma/t)t + (1-\\gamma) = 1.\n", "$$\n", "Now,\n", "$$\n", " \\mathbb E \\left ((\\gamma/t) X_j + (1-\\gamma) \\right ) \\equiv \n", " \\int_0^\\infty \\left (x \\gamma/t + (1-\\gamma) \\right ) dF(x).\n", "$$\n", "Since for $x \\ge 0$, $(x \\gamma/t + (1-\\gamma)) \\ge 0$, it follows that if we define\n", "\n", "$$\n", " dG \\equiv (x \\gamma/t + (1-\\gamma))dF\n", "$$\n", "\n", "then $G$ is the cdf of a nonnegative random variable.\n", "\n", "Let $Y$ be a random variable with cdf $G$.\n", "By Jensen's inequality, $\\mathbb E X_j^2 \\ge (\\mathbb E X_j)^2 = t \\cdot \\mathbb E X_j$ (by hypothesis).\n", "Since $\\mathbb E X_j = t \\ge 0$,\n", "\n", "\\begin{align}\n", " \\mathbb E Y &= \\int_0^\\infty x dG(x) \\\\\n", " &= \\int_0^\\infty x (x \\gamma/t + (1-\\gamma)) dF(x) \\\\\n", " &= \\gamma/t \\int_0^\\infty x^2 dF(x) + (1-\\gamma) \\int_0^\\infty x dF(x) \\\\\n", " &= \\gamma/t \\cdot \\mathbb E X_j^2 + (1-\\gamma) \\cdot \\mathbb E X_j \\\\\n", " &\\ge \\gamma \\cdot \\mathbb E X_j + (1-\\gamma) \\cdot \\mathbb E X_j = \\mathbb E X_j.\n", "\\end{align}\n", "(The penultimate step uses Jensen's inequality.)\n", "\n", "If the data allow us to reject the hypothesis $H_0$ that $\\{ X_j\\}$ are IID with cdf $F$\n", "in favor of the alternative hypothesis $H_1$ that they are iid with cdf $G$,\n", "we have strong statistical evidence that $\\mu = \\mathbb E X_j > t$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using the SPRT to test $H_1$ versus $H_0$\n", "\n", "Now a bit of magic occurs. For a given observation $X_j = x_j$, what is the likelihood ratio of $H_1$ to $H_0$?\n", "\n", "$$\n", " \\mbox{LR} = \\frac{dG(x_j)}{dF(x_j)} = \\frac{(x_j\\gamma/t+(1−\\gamma))dF(x_j)}{dF(x_j)} = (x_j\\gamma/t+(1−\\gamma)).\n", "$$\n", "This doesn't depend on $F$ or $G$!\n", "\n", "The $\\mbox{LR}$ for observations $X_1, \\ldots, X_m$ is\n", "$$\n", " \\mbox{LR} = \\Pi_{i=1}^m \\left ( (\\gamma/t) X_i + 1 - \\gamma \\right ).\n", "$$\n", "This expression shows why $\\gamma$ was introduced in the first place:\n", "for $\\gamma = 1$, if even a single observation turned out to be zero, $\\mbox{LR}$ would forever be\n", "zero no matter how large subsequent observations turned out to be.\n", "Taking $\\gamma < 1$ hedges against that possibility.\n", "Any value of $\\gamma \\in [0, 1]$ gives a conservative test, but smaller values provide more \"protection\"\n", "against small values of $X_j$.\n", "\n", "Recall that the $\\mbox{LR}$ is the $P$-value of $H_0: \\mu = t$ based on the observations $\\{X_j\\}_{j=1}^m$.\n", "We will use this to find a lower confidence bound for $\\mu$.\n", "\n", "### \"Lookahead\" and the SPRT\n", "There's more: recall that the SPRT says the chance that $\\mbox{LR}$ _ever_ gets larger than $1/\\alpha$ is at most $\\alpha$ if $H_0$ is true.\n", "That means that we can use the observations sequentially, maximizing over the partial products.\n", "If any of the partial products exceeds $1/\\alpha$, we can reject $H_0$.\n", "\n", "That is, our level-$\\alpha$ test based on a sample of size $n$ is\n", "$$\n", " \\mbox{ Reject } H_0 \\mbox{ if } \\max_{m=1}^n \\Pi_{i=1}^m \\left [ \\gamma X_i/t + 1 - \\gamma \\right ] \\ge 1/\\alpha.\n", "$$\n", "\n", "It is _only_ legitimate to do this maximization if the data are in random order. For instance, it's impermissible to sort them from largest to smallest. And if you maximize over partial products, the result will, in general, depend on the order of the data.\n", "\n", "It might be hard to explain to a court or consulting client that your confidence bound depends on the order the data happened to arrive in. If that's a concern, you can always just use all $n$ terms in the product and not maximize over $1 \\le m \\le n$.\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Confidence bounds from the Kaplan-Wald test\n", "\n", "To find a lower confidence bound, we can invert hypothesis tests: the lower endpoint of a one-sided confidence bound for $\\mu$ is the largest value \n", "of $t$ for which we would not reject the hypothesis $\\mu = t$ at significance level $\\alpha$.\n", "\n", "For confidence levels above 50%, this lower confidence bound will certainly be between zero and the sample mean. However, for $t=0$, we have a divide-by-zero issue. Hence, for numerical implementation, we search the interval $[\\epsilon, \\bar{X}]$ for the smallest $t$ for which we can reject the hypothesis $\\mu = t$ at significance level $\\alpha$. If that smallest $t$ is $\\epsilon$, we set the confidence bound to zero. \n", "\n", "The following code implements this idea, working with the log of the test statistic to improve numerical stability." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# This is the first cell with code: set up the Python environment\n", "%matplotlib inline\n", "from __future__ import print_function, division\n", "import matplotlib.pyplot as plt\n", "import math\n", "import numpy as np\n", "import scipy as sp\n", "import scipy.stats\n", "from scipy.stats import binom\n", "import scipy.optimize\n", "import pandas as pd\n", "from ipywidgets import interact, interactive, fixed\n", "import ipywidgets as widgets\n", "from IPython.display import clear_output, display, HTML" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def kaplanWaldLowerCI(x, cl = 0.95, gamma = 0.99, xtol=1.e-12, logf=True):\n", " '''\n", " Calculates the Kaplan-Wald lower 1-alpha confidence bound for the mean of a nonnegative random\n", " variable.\n", " '''\n", " alpha = 1.0-cl\n", " if any(x < 0):\n", " raise ValueError('Data x must be nonnegative.')\n", " elif all(x <= 0):\n", " lo = 0.0\n", " else:\n", " if logf:\n", " f = lambda t: (np.max(np.cumsum(np.log(gamma*x/t + 1 - gamma))) + np.log(alpha))\n", " else:\n", " f = lambda t: (np.max(np.cumprod(gamma*x/t + 1 - gamma)) - 1/alpha)\n", " xm = np.mean(x)\n", " if f(xtol)*f(xm) > 0.0:\n", " lo = 0.0\n", " else:\n", " lo = sp.optimize.brentq(f, xtol, np.mean(x), xtol=xtol) \n", " return lo" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## How well does it work?\n", "Let's test the method on our standard test problems: binomial and a mixture of pointmass and uniform. We will fix $\\gamma$; you might experiment using different values." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", " | mass at 1 | \n", "sample size | \n", "Student-t cov | \n", "Kaplan-Wald cov | \n", "Student-t low | \n", "Kaplan-Wald low | \n", "
---|---|---|---|---|---|---|
0 | \n", "0.900 | \n", "25.0 | \n", "89.64% | \n", "100.0% | \n", "0.6836 | \n", "0.82 | \n", "
1 | \n", "0.900 | \n", "50.0 | \n", "98.63% | \n", "100.0% | \n", "0.6701 | \n", "0.8713 | \n", "
2 | \n", "0.900 | \n", "100.0 | \n", "99.96% | \n", "97.76% | \n", "0.6643 | \n", "0.8954 | \n", "
3 | \n", "0.900 | \n", "400.0 | \n", "100.0% | \n", "96.57% | \n", "0.6617 | \n", "0.8951 | \n", "
4 | \n", "0.990 | \n", "25.0 | \n", "21.83% | \n", "100.0% | \n", "0.9553 | \n", "0.8794 | \n", "
5 | \n", "0.990 | \n", "50.0 | \n", "38.78% | \n", "100.0% | \n", "0.9422 | \n", "0.934 | \n", "
6 | \n", "0.990 | \n", "100.0 | \n", "61.17% | \n", "100.0% | \n", "0.9285 | \n", "0.9628 | \n", "
7 | \n", "0.990 | \n", "400.0 | \n", "97.61% | \n", "100.0% | \n", "0.9069 | \n", "0.9847 | \n", "
8 | \n", "0.999 | \n", "25.0 | \n", "2.62% | \n", "100.0% | \n", "0.995 | \n", "0.8853 | \n", "
9 | \n", "0.999 | \n", "50.0 | \n", "4.62% | \n", "100.0% | \n", "0.9939 | \n", "0.9406 | \n", "
10 | \n", "0.999 | \n", "100.0 | \n", "9.65% | \n", "100.0% | \n", "0.9913 | \n", "0.9695 | \n", "
11 | \n", "0.999 | \n", "400.0 | \n", "32.36% | \n", "100.0% | \n", "0.9847 | \n", "0.9917 | \n", "