{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Tests and Confidence Sets when there are Nuisance Parameters\n", "\n", "**Work in Progress!**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Many statistical problems involve _nuisance parameters_, parameters that are not of scientific interest but that affect the probability distribution of the data.\n", "\n", "Testing hypotheses and constructing confidence sets for some parameters when there are additional nuisance parameters can be complicated.\n", "\n", "This chapter addresses two statistical inference problems involving nuisance parameters:\n", "tests and confidence bounds for a population mean from a stratified random sample, and tests and confidence bounds for the difference between the values of $p$ for two independent binomial distributions.\n", "\n", "One strategy that is widely applicable to testing hypotheses about some parameter when there are nuisance parameters is to find the maximum $P$-value over all possible values of the nuisance parameter, keeping the parameters of interest fixed." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Difference between two Binomial $p$s.\n", "\n", "Suppose we observe $X_1 \\sim \\mbox{Binom}(n_1, p_1)$ and $X_2 \\sim \\mbox{Binom}(n_2, p_2)$ where $X_1$ and $X_2$ are independent. How can we make inferences about $p_1-p_2$?\n", "\n", "We will develop several approaches.\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Fisher's exact test: a conditional test\n", "\n", "Suppose we are only interested in the hypothesis $p_1 = p_2$.\n", "One way to test that hypothesis is to condition on the value of $X = X_1 + X_2$.\n", "If $p_1 = p_2 = p$, then $X_1 + X_2 \\sim \\mbox{Binom}(n_1+n_2, p)$.\n", "We can think of the data as arising from $n_1+n_2$ iid Bernoulli$(p)$ trials.\n", "If that were the case, then, given the total, $X$, any $X$ of the $n_1+n_2$\n", "trials are equally likely to be the trials that resulted in \"1\" rather than \"0.\"\n", "(The trials are _exchangeable_: the joint distribution of the $n_1+n_2$ trials\n", "is invariant under permutations of the labels.)\n", "\n", "That is, conditional on $X=x$, the distribution of $X_1$ is hypergeometric with parameters\n", "$N=n_1+n_2$, $G=x$, and $n=n_1$.\n", "We can use that to construct a conditional test of the hypothesis $p_1=p_2$.\n", "As noted in [the chapter on tests](./tests/ipynb), if we always test conditionally\n", "at significance level not greater than $\\alpha$, whatever $X$ happens to be,\n", "that yields a test that has unconditional level not greater than $\\alpha$.\n", "\n", "Fisher's exact test avoids the problem of estimating nuisance parameters by conditioning\n", "on $X_1+X_2$. \n", "If $p_1=p_2$ is big, $X$ will tend to be large; if $p_1=p_2$ is small, $X$ will tend to be small.\n", "Regardless, the distribution of $X_1$ given $X_1+X_2$ is hypergeometric with known parameters.\n", "\n", "While Fisher's exact test provides a straightforward way to test the hypothesis $p_1 = p_2$, it does not allow us to test the hypothesis $p_1-p_1 = c \\ne 0$ without additional assumptions, and thus does not allow us to make a confidence set for $p_1-p_2$." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Bonferroni and Šidák simultaneous tests and confidence intervals\n", "\n", "One easy but inefficient way to test hypotheses or make confidence intervals for $p_1-p_2$ is to make _simultaneous_ inferences about $p_1$ and $p_2$.\n", "For instance, suppose that $\\mathcal{I}_1(\\cdot)$ is a $1-\\alpha_1$ confidence interval procedure for $p_1$ and $\\mathcal{I}_2(\\cdot)$ is a $1-\\alpha_2$ confidence interval procedure for $p_2$.\n", "\n", "By Bonferroni's inequality,\n", "\n", "\\begin{equation*}\n", " \\mathbb{P}_{p_1, p_2} \\{ (\\mathcal{I}_1(X_1) \\ni p_1) \\cap (\\mathcal{I}_2(X_2) \\ni p_2) \\} \\ge 1 - \\alpha_1 - \\alpha_2.\n", "\\end{equation*}\n", "\n", "Because $X_1$ and $X_2$ are independent, we can get a sharper result using Šidák's adjustment:\n", "\n", "\\begin{equation*}\n", " \\mathbb{P}_{p_1, p_2} \\{ (\\mathcal{I}_1(X_1) \\ni p_1) \\cap (\\mathcal{I}_2(X_2) \\ni p_2) \\} \\ge (1 - \\alpha_1)(1 - \\alpha_2).\n", "\\end{equation*}\n", "\n", "To see that Šidák's inequality is sharper (for independent intervals), calculate \n", "\\begin{equation*}\n", "(1 - \\alpha_1)(1 - \\alpha_2) = 1 - \\alpha_1 - \\alpha_2 + \\alpha_1\\alpha_2 \\ge 1 - \\alpha_1 - \\alpha_2.\n", "\\end{equation*}\n", "\n", "Whenever $\\mathcal{I}_1(X_1) \\ni p_1$ and $\\mathcal{I}_2(X_2) \\ni p_2$, \n", "\\begin{equation*}\n", "p_1-p_2 \\in \\mathcal{I}_1(X_1) - \\mathcal{I}_2(X_2) := \\{x-y: x \\in \\mathcal{I}_1(X_1),\n", "y \\in \\mathcal{I}_2(X_2) \\},\n", "\\end{equation*}\n", "so \n", "\\begin{equation*}\n", "\\mathbb{P}_{p_1, p_2} \\{ \\mathcal{I}_1(X_1) - \\mathcal{I}_2(X_2) \\ni p_1 - p_2 \\}\n", "\\ge (1 - \\alpha_1)(1 - \\alpha_2).\n", "\\end{equation*}\n", "That is, $\\mathcal{I}_1(X_1) - \\mathcal{I}_2(X_2)$ is a $(1 - \\alpha_1)(1 - \\alpha_2)$\n", "confidence interval for $p_1-p_2$.\n", "Because of the duality between testing hypotheses and confidence intervals, we can reject the hypothesis that $p_1-p_2 = c$ at significance level $1- (1 - \\alpha_1)(1 - \\alpha_2)$\n", "if $c \\notin \\mathcal{I}_1(X_1) - \\mathcal{I}_2(X_2)$.\n", "In particular, we can reject the hypothesis $p_1=p_2$ at level $1- (1 - \\alpha_1)(1 - \\alpha_2)$ if $\\mathcal{I}_1(X_1) - \\mathcal{I}_2(X_2)$ does not contain 0.\n", "\n", "As you can see, this method involves estimating $p_1$ and $p_2$ separately, even though we only care about $p_1-p_2$.\n", "We can do a bit better than this." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Unconditional tests\n", "\n", "We can also base tests on the full joint distribution of $X_1$ and $X_2$.\n", "[TBD]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**The Santner-Snell Test**\n", "[Santner and Snell (1980)](https://www.tandfonline.com/doi/abs/10.1080/01621459.1980.10477482)\n", "give a procedure for finding an exact confidence interval for \n", "$p_c-p_t$ and $p_c/p_t$.\n", "\n", "Alternative approach using potential outcomes: \n", "We have $\\{U_j\\}_{j=1}^N$ iid $U[0,1]$, independent of the\n", "$\\{T_j\\}_{j=1}^N$. 
{ "cell_type": "markdown", "metadata": {}, "source": [ "### Unconditional tests\n", "\n", "We can also base tests on the full joint distribution of $X_1$ and $X_2$.\n", "[TBD]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**The Santner-Snell Test**\n", "[Santner and Snell (1980)](https://www.tandfonline.com/doi/abs/10.1080/01621459.1980.10477482)\n", "give a procedure for finding an exact confidence interval for \n", "$p_c-p_t$ and $p_c/p_t$, where $p_c$ and $p_t$ are the binomial parameters for two groups (e.g., a control group and a treatment group).\n", "\n", "Alternative approach using potential outcomes: \n", "We have $\\{U_j\\}_{j=1}^N$ iid $U[0,1]$, independent of the treatment assignments\n", "$\\{T_j\\}_{j=1}^N$. Let $z_{cj} = 1_{U_j \\le p_c}$ and $z_{tj} = 1_{U_j \\le p_t}$.\n", "Condition on $\\{U_j\\}$.\n", "\n", "[MORE TO COME]\n", "\n", "#### Individual probabilities\n", "\n", "A more reasonable model is that individual $j$ has a \"personal\" probability $p_{tj}$ of becoming infected if vaccinated and $p_{cj}$ if unvaccinated.\n", "\n", "[MORE TO COME]\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Confidence bound for the mean of a binary population from a stratified sample\n", "This problem occurs in opinion surveys, financial auditing, and election auditing.\n", "It will let us see some of the issues that arise when there are nuisance parameters.\n", "\n", "We will use four powerful techniques:\n", "\n", "+ the duality between hypothesis tests and confidence sets\n", "+ maximizing $P$-values over nuisance parameters\n", "+ union-intersection tests\n", "+ combining independent $P$-values\n", "\n", "It is also an example of a combinatorial optimization problem that can be solved quickly using a greedy algorithm, which is rare.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The problem\n", "\n", "A population of $N$ items of which $G$ are labeled \"1\" and $N-G$ are labeled \"0\"\n", "is partitioned into $S$ strata.\n", "Stratum $s$ contains $N_s$ items, of which $G_s$ are labeled \"1.\"\n", "Thus $N = \\sum_{s=1}^S N_s$ and $G \\equiv \\sum_{s=1}^S G_s$.\n", "We draw a simple random sample of size $n_s$ from stratum $s$, $s = 1, \\ldots, S$, independently across strata.\n", "(I.e., from stratum $s$ we draw a sample of size $n_s$ in such a way \n", "that every subset of $n_s$ distinct items of the $N_s$ items is equally likely;\n", "and the $S$ samples are drawn independently.)\n", "\n", "Let $x_{sj}$ denote the value of the $j$th item in stratum $s$, $s=1, \\ldots, S$, $j=1, \\ldots, N_s$.\n", "Then $G_s = \\sum_{j=1}^{N_s} x_{sj}$ and \n", "\\begin{equation*}\n", "G = \\sum_{s=1}^S \\sum_{j=1}^{N_s} x_{sj}.\n", "\\end{equation*}\n", "\n", "Let $Y_s$ denote the sum of the values in the sample from stratum $s$.\n", "The variables $\\{Y_s \\}_{s=1}^S$ are independent.\n", "The observed value of $Y_s$ is $y_s$.\n", "\n", "We seek hypothesis tests and confidence bounds for $G$.\n", "We first consider one-sided tests of the hypothesis $G = g$ against the \n", "alternative $G > g$, and\n", "corresponding lower confidence bounds for $G$.\n", "Reversing the roles of \"0\" and \"1\" gives upper confidence \n", "bounds, _mutatis mutandis_.\n", "\n", "The general strategy for testing the hypothesis $G=g$ is to \n", "find the largest $P$-value among all ways of allocating $g$ \n", "items labeled \"1\" among the $S$ strata \n", "(honoring the stratum sizes $\\{N_s\\}$).\n", "That is a $P$-value for the _composite_ hypothesis $G=g$.\n", "The maximum can be found by examining all such allocations and\n", "calculating the $P$-value for each.\n", "\n", "Maximizing the $P$-value over all allocations of $G$ ones\n", "across $S$ strata\n", "is combinatorially complex: Feller's \"bars and stars\" argument shows that there are $\\binom{G+S-1}{S-1}$ ways to allocate $G$ objects among $S$ strata.\n", "(Some of those allocations can be ruled out, for instance any allocation that assigns a stratum more ones than the stratum contains items.)\n", "For $S=10$ strata of size $N_s = 400$ and $G = 300$, \n", "there are roughly $6.3 \\times 10^{16}$ allocations, as the short computation below confirms: impractical by any standard." ] }, 
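{ "cell_type": "markdown", "metadata": {}, "source": [ "The count for the example above is easy to check; `scipy.special.comb` computes the binomial coefficient exactly." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# check the count of allocations for the example above\n", "from scipy.special import comb\n", "\n", "S, G = 10, 300   # 10 strata, 300 items labeled \"1\"\n", "print(comb(G + S - 1, S - 1, exact=True))   # approximately 6.3e16" ] }, 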
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### _Aside: Feller's Bars and Stars Argument_\n", "\n", "How many distinguishable ways are there to divide $G$ indistinguishable objects into $S$ groups? \n", "[Feller (1950)](https://www.wiley.com/en-us/An+Introduction+to+Probability+Theory+and+Its+Applications%2C+Volume+2%2C+2nd+Edition-p-9780471257097) argues as follows.\n", "\n", "Consider $G$ \"stars\" lined up in a row:\n", "\n", "\\begin{equation*}\n", "********* \\cdots *********\n", "\\end{equation*}\n", "\n", "To divide them into $S$ groups, we place $S-1$ \"bars\" somewhere among them.\n", "For instance,\n", "\\begin{equation*}\n", "**|****|*** \\cdots ******|***\n", "\\end{equation*}\n", "corresponds to having 2 items in the first group, 4 in the second, ..., and 3 in the $S$th group.\n", "Similarly,\n", "\\begin{equation*}\n", "|||||********* \\cdots *********\n", "\\end{equation*}\n", "corresponds to having all the items in the $S$th group.\n", "The number of ways of partitioning the $G$ items into $S$ groups is thus the number of ways of choosing $S-1$ places to put bars among a total of $G+S-1$ places (the bars and stars pooled together), i.e., $\\binom{G+S-1}{S-1}$.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We now examine some approaches to making confidence intervals for the population total $G$ (equivalently, for the population mean $\\mu = G/N$) from a stratified sample." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Wright's Method: Sum of Šidák Intervals\n", "\n", "One easy way to get a lower confidence bound for the sum is to take the sum of\n", "simultaneous lower confidence bounds for each stratum.\n", "Because the samples from different strata are independent, Šidák's adjustment works.\n", "Wright (1991) suggests this approach.\n", "\n", "A confidence bound for $G_s$ can be constructed from $Y$ by inverting hypergeometric tests.\n", "\n", "To have joint confidence level $1-\\alpha$, make each confidence interval at $(1-\\alpha)^{1/S}$.\n", "\n", "The chance that _all_ of the $S$ confidence bounds cover their corresponding parameters is\n", "\\begin{equation*}\n", "\\prod_{s=1}^S (1-\\alpha)^{1/S} = 1-\\alpha\n", "\\end{equation*}\n", "because the intervals all based on independent data.\n", "This approach to making simultaneous confidence bounds for a set of parameters using\n", "independent data for each parameter is called _Šidák's_ method.\n", "\n", "Wright's method is an example of a much more general approach: make a joint $1-\\alpha$ confidence set for all the parameters $\\{G_j\\}_{j=1}^S$, then find a lower bound on a functional of interest (here, their sum) over the joint set. 
\n", "Whenever the joint confidence set covers the parameter, the lower bound does not exceed the true value of the functional of the parameter.\n", "But because we do not care about the individual stratum sums, only the overall population sum, this approach is unnecessarily conservative: we are \"wasting\" effort constraining them separately.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The Wendell-Schmee Test\n", "\n", "Wendell and Schmee (1996, https://www.tandfonline.com/doi/abs/10.1080/01621459.1996.10476950) proposed making inferences about the population total by maximizing a $P$-value over a set of nuisance parameters---the individual stratum totals.\n", "They find the $P$-value by ordering possible outcomes based on the estimated population proportion: \n", "the test statistic is the \"tail probability\" of $\\hat{p}$, the unbiased\n", "estimator of the population percentage from the stratified sample.\n", "They construct confidence bounds by inverting hypothesis tests.\n", "\n", "The test statistic for the Wendell-Schmee test is the unbiased estimate\n", "of the population proportion, $G/N$:\n", "\n", "$$\n", " \\hat{p} = \\frac{1}{N} \\sum_{s=1}^S N_s y_s/n_s.\n", "$$\n", "The $P$-value of the hypothesis $G_s=g_s$, $s=1, \\ldots, S$, \n", "is the \"lower tail probability\" of $\\hat{p}$.\n", "\n", "Wendell and Schmee consider maximizing this lower tail probability over all allocations\n", "of $g$ ones across strata, either by exhaustive search, or by numerical optimization\n", "using a descent method from some number of random starting points.\n", "(They show graphically that the tail probability is not convex in the original\n", "parametrization.)\n", "\n", "In practice, their method is computationally intractable when there are more than $S=3$ strata." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Constructive maximization\n", "\n", "For some test statistics, there is a much more efficient approach.\n", "\n", "Define \n", "$$\n", " p_s(g_s) \\equiv \\Pr \\{ Y_s \\ge y_s || G_s = g_s \\} =\n", " \\sum_{y = y_s}^{g_s} \\frac{\\binom{g_s}{y} \\binom{N_s-g_s}{n_s - y}}{\\binom{N_s}{n_s}},\n", "$$\n", "where $\\binom{a}{b} \\equiv 0$ if $a \\le 0$ or $b > a$.\n", "(The double vertical bars denote \"computed on the assumption that.\")\n", "This is the $P$-value of the hypothesis $G_s = g_s$ tested against the\n", "alternative $G_s > g_s$.\n", "\n", "A test of the conjunction hypothesis $G_s = g_s$, $s=1, \\ldots, S$ can be constructed\n", "using _Fisher's combining function_:\n", "if all $S$ hypotheses are true, the distribution of\n", "$$\n", " X^2(\\vec{g}) \\equiv -2 \\sum_{s=1}^S \\log p_s(g_s)\n", "$$\n", "is _dominated_ by the chi-square distribution with $2S$ degrees of freedom.\n", "(That is, the chance $X^2 \\ge t$ is less then or equal to the chance that a random variable with\n", "a chi-square distribution is $\\ge t$, for all $t$.)\n", "\n", "Let $\\chi_d(z)$ denote the survival function for the chi-sqare distribution with $d$ degrees\n", "of freedom, i.e., the chance that a random variable with the chi-square \n", "distribution with $d$ degrees of freedom is greater than or equal to $z$.\n", "Then a conservative $P$-value for the allocation $\\vec{g}$\n", "$$\n", " P(\\vec{g}) = \\chi_{2S}(X^2(\\vec{g})).\n", "$$\n", "The allocation $\\vec{g}$ of $g$ ones across strata that maximizes the $P$-value\n", "is the allocation that minimizes $X^2(\\vec{g})$ and satisfies $\\sum_s g_s = g$.\n", "Equivalently, it is the allocation that maximizes $\\sum_{s=1}^S \\log p_s(g_s)$.\n", "\n", "Let \n", "$$\n", " a_s(j) \\equiv \\left \\{ \n", " \\begin{array}{ll} \n", " \\log p_s(y_s), & j = y_s \\\\\n", " \\log \\left (p_s(j)/p_s(j-1) \\right ), & j = y_s+1, \\ldots N_s-(n_s-y_s).\n", " \\end{array}\n", " \\right .\n", "$$\n", "\n", "Then $\\log p_s(g_s) = \\sum_{j=y_s}^{g_s} a_s(j)$ if \n", "$y_s \\le g_s \\le N-(n_s-y_s)$, and \n", "$\\log p_s(g_s) = -\\infty$ otherwise.\n", "Moreover, \n", "$$\n", " X^2(\\vec{g}) = -2\\sum_{s=1}^S a_s(y_s) -2\\sum_{s=1}^S \\sum_{j=y_s+1}^{g_s} a_s(j)\n", "$$\n", "provided $y_s \\le g_s \\le N-(n_s-y_s)$, $s=1, \\ldots, S$; otherwise, it is infinite.\n", "\n", "An allocation of $g$ ones across strata is inconsistent with the data unless\n", "$g_s \\ge y_s$, $s=1, \\ldots, S$.\n", "Thus, in considering how to allocate $g$ ones to maximize the $P$-value, \n", "the first sum above, accounting for $\\sum_s y_s$ ones, is \"mandatory,\" or the $P$-value will be zero.\n", "The question is how to allocate the remaining $g - \\sum_s y_s$ ones to maximize\n", "the $P$-value (equivalently, to minimize $X^2(\\vec{g})$).\n", "\n", "Let $b_k$ denote the $k$th largest element of the set \n", "\n", "$$\n", " \\{a_s(j): j=y_s+1, \\ldots, N_s-(n_s-y_s), \\;\\; s=1, \\ldots, S \\},\n", "$$\n", "with ties broken arbitrarily.\n", "Define $\\tilde{g}_y \\equiv g - \\sum_{s=1}^S y_s$.\n", "\n", "**Proposition.** For every $\\vec{g}$ with $\\sum_s g_s = g$, \n", "$$\n", "X^2(\\vec{g}) \\ge X_*^2(g) \\equiv \\left \\{ \\begin{array}{ll}\n", " -2 \\left ( \\sum_{s=1}^S a_s(y_s) + \\sum_{k=1}^{\\tilde{g}_y} b_k \n", " \\right ), & \\sum_s y_s \\le g \\le N - \\sum_s (n_s-y_s) \\\\\n", " \\infty, & \\mbox{ otherwise }\n", " \\end{array}\n", " \\right . 
.\n", "$$\n", "\n", "**Proof.** Any $\\vec{g}$ for which $X^2(\\vec{g})$ is finite includes the first sum\n", "and a sum of $\\tilde{g}_y$ elements of $\\{b_k\\}$; the latter is at most the sum of the\n", "$\\tilde{g}_y$ largest elements of $\\{b_k\\}$. $\\Box$\n", "\n", "Moreover, if the $\\tilde{g}_y$ largest elements of $\\{b_k \\}$ correspond to\n", "an allocation of $\\tilde{g}_y$ ones across the strata, the bound is sharp.\n", "That turns out to be true (the proof involves log convexity of the distributions).\n", "The second sum thus corresponds \n", "to a particular allocation \n", "$\\vec{g}$ of $g$ ones across the $S$ strata, with $y_s \\le g_s \\le N_s-(n_s-y_s)$.\n", "Among all allocations of $g$ items labeled \"1,\" this one minimizes has the smallest tail \n", "probability, because it is the exponentiation of the smallest sum of logs \n", "(the largest negative sum of logs). $\\Box$\n", "\n", "\n", "**Proposition:** For $j \\in y_s+1, \\ldots, N_s-(n_s-y_s)$, $a_s(j)$ is monotone \n", "decreasing in $j$.\n", "\n", "Let $P(g) \\equiv \\max_{\\vec{g}: \\sum_s g_s = g} P(\\vec{g})$.\n", "\n", "**Theorem:** If $\\sum_s y_s \\le g \\le N - \\sum_s (n_s-y_s)$, \n", "\n", "$$\n", "P(g) \\le \\chi_d(X_*^2(g)).\n", "$$\n", "\n", "**Proof:**\n", "Immediate from the definitions.\n", "\n", "The theorem shows that a \"greedy\" approach finds a conservative $P$-value:\n", "\n", "+ Construct the values $a_s(j)$ and the set $\\{b_k\\}$. \n", "+ Add the $S$ values $\\{a_s(x_k)$ to the $g-g_y$ largest elements of $\\{b_k\\}$ and \n", "multiply the sum by $-2$.\n", "+ The upper tail probability of the chi-square distribution with $2S$ degrees of \n", "freedom is a conservative $P$-value for the hypothesis $G=g$.\n", "\n", "A conservative upper $1-\\alpha$ confidence bound for $G$ can be found by inverting the test: it is the largest $g$ for which \n", "$P(g) \\ge \\alpha$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Changing the direction of the test\n", "\n", "The test of the hypothesis $G=g$ given above is a one-sided test against the alternative \n", "$G > g$: it rejects if the chance of observing \"so few\" good objects is small.\n", "\n", "To test against the alternative $G < g$ (i.e., to reject if the chance of observing \"so many\"\n", "good objects is small), exchange the role of \"good\" and \"bad.\"\n", "The hypothesis $G < g$ is equivalent to the hypothesis $(N-G) > (N-g)$.\n", "\n", "The resulting null hypothesis is $G = N-g$, and the data are $n_s - Y_s$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We now code the combinatorial optimization and the greedy optimization." 
] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: permute in /opt/anaconda3/lib/python3.7/site-packages (0.1a5)\n", "Requirement already satisfied: numpy in /opt/anaconda3/lib/python3.7/site-packages (from permute) (1.19.2)\n", "Requirement already satisfied: scipy in /opt/anaconda3/lib/python3.7/site-packages (from permute) (1.5.2)\n", "Requirement already satisfied: cryptorandom in /opt/anaconda3/lib/python3.7/site-packages (0.2)\n", "Requirement already satisfied: scipy in /opt/anaconda3/lib/python3.7/site-packages (from cryptorandom) (1.5.2)\n", "Requirement already satisfied: numpy in /opt/anaconda3/lib/python3.7/site-packages (from cryptorandom) (1.19.2)\n" ] } ], "source": [ "# Install permute and cryptorandom in the current kernel, if not already installed\n", "import sys\n", "!{sys.executable} -m pip install permute --user\n", "!{sys.executable} -m pip install cryptorandom --user" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "from scipy.stats import binom, hypergeom, chi2\n", "import itertools\n", "import warnings\n", "from permute.utils import binom_conf_interval, hypergeom_conf_interval" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "@lru_cache(maxsize=None) # decorate the function to cache the results \n", " # of calls to the function\n", "def strat_test_brute(strata, sams, found, good, alternative='upper'):\n", " \"\"\"\n", " Find p-value of the hypothesis that the number G of \"good\" objects in a \n", " stratified population is less than or equal to good, using a stratified\n", " random sample.\n", " \n", " Assumes that a simple random sample of size sams[s] was drawn from stratum s, \n", " which contains strata[s] objects in all.\n", " \n", " The P-value is the maximum Fisher combined P-value across strata\n", " over all allocations of good \"good\" objects among the strata. The allocations are\n", " enumerated using Feller's \"bars and stars\" construction, constrained to honor the\n", " stratum sizes (each stratum can contain no more \"good\" items than it has items in all,\n", " nor fewer \"good\" items than the sample contains).\n", " \n", " The number of allocations grows combinatorially: there can be as many as\n", " [(#strata + #good items) choose (#strata-1)] allocations, making the brute-force\n", " approach computationally infeasible when the number of strata and/or the number of\n", " good items is large.\n", " \n", " The test is a union-intersection test: the null hypothesis is the union over allocations\n", " of the intersection across strata of the hypotheses that the number of good items\n", " in the stratum is less than or equal to a constant.\n", " \n", " Parameters:\n", " -----------\n", " strata : list of ints\n", " sizes of the strata. 
"    Parameters:\n", "    -----------\n", "    strata : tuple of ints\n", "        sizes of the strata. One int per stratum.\n", "    sams : tuple of ints\n", "        the sample sizes from each stratum\n", "    found : tuple of ints\n", "        the numbers of \"good\" items found in the samples from the strata\n", "    good : int\n", "        the hypothesized total number of \"good\" objects in the population\n", "    alternative : string {'lower', 'upper'}\n", "        test against the alternative that the true value is greater than good ('lower')\n", "        or less than good ('upper'). The label gives the direction of the confidence\n", "        bound obtained by inverting the test.\n", "    \n", "    Returns:\n", "    --------\n", "    p : float\n", "        maximum combined p-value over all ways of allocating good \"good\" objects\n", "        among the strata, honoring the stratum sizes. \n", "    best_part : list\n", "        the partition that attained the maximum p-value\n", "    \"\"\"\n", "    if alternative == 'upper':   # lower-tail P-values: Pr{Y_s <= y_s}\n", "        p_func = lambda f, s, p, x: sp.stats.hypergeom.logcdf(f, s, p, x)\n", "    elif alternative == 'lower': # upper-tail P-values: Pr{Y_s >= y_s}\n", "        p_func = lambda f, s, p, x: sp.stats.hypergeom.logsf(f-1, s, p, x)\n", "    else:\n", "        raise NotImplementedError(\"alternative {} not implemented\".format(alternative))\n", "    best_part = found # start with what you see\n", "    strata = np.array(strata, dtype=int)\n", "    sams = np.array(sams, dtype=int)\n", "    found = np.array(found, dtype=int)\n", "    good = int(good)\n", "    if good < np.sum(found): \n", "        p = 0 if alternative == 'lower' else 1 \n", "    elif good > np.sum(strata) - np.sum(sams) + np.sum(found):\n", "        p = 1 if alternative == 'lower' else 0\n", "    else: # use Feller's \"bars and stars\" enumeration of combinations, constrained\n", "        log_p = -np.inf # initial value for the max\n", "        n_strata = len(strata)\n", "        parts = sp.special.binom(good+n_strata-1, n_strata-1)\n", "        if parts >= 10**7:\n", "            print(\"warning--large number of partitions: {}\".format(parts))\n", "        barsNstars = good + n_strata\n", "        bars = [0]*n_strata + [barsNstars]\n", "        partition = ([bars[j+1] - bars[j] - 1 for j in range(n_strata)] \\\n", "                     for bars[1:-1] in itertools.combinations(range(1, barsNstars), n_strata-1) \\\n", "                     if all(((bars[j+1] - bars[j] - 1 <= strata[j]) and \\\n", "                             (bars[j+1] - bars[j] >= found[j])) for j in range(n_strata)))\n", "        for part in partition:\n", "            log_p_new = 0 \n", "            for s in range(n_strata): # should be possible to vectorize this\n", "                log_p_new += p_func(found[s], strata[s], part[s], sams[s])\n", "            if log_p_new > log_p:\n", "                best_part = part\n", "                log_p = log_p_new\n", "        p = sp.stats.chi2.sf(-2*log_p, df = 2*len(strata))\n", "    return p, best_part\n", "\n", "@lru_cache(maxsize=None) # decorate the function to cache the results of calls\n", "                         # to the function (arguments must be hashable: pass tuples)\n", "def strat_ci_bisect(strata, sams, found, alternative='lower', cl=0.95,\n", "                    p_value=strat_test_brute):\n", "    \"\"\"\n", "    Confidence bound on the number of ones in a stratified population,\n", "    based on a stratified random sample (without replacement) from\n", "    the population.\n", "    \n", "    If alternative=='lower', finds a lower confidence bound.\n", "    If alternative=='upper', finds an upper confidence bound.\n", "\n", "    Uses an integer bisection search to find an exact confidence bound.\n", "    The starting upper endpoint for the search is the unbiased estimate\n", "    of the number of ones in the population. That could be refined in various\n", "    ways to improve efficiency.\n", "    \n", 
"    The starting lower endpoint for the search is the sum of the Šidák joint lower\n", "    confidence bounds, which should be more conservative (smaller) than the exact bound.\n", "    \n", "    Parameters:\n", "    ----------- \n", "    strata : tuple of ints\n", "        stratum sizes\n", "    sams : tuple of ints\n", "        sample sizes in the strata\n", "    found : tuple of ints\n", "        number of ones found in each stratum in each sample\n", "    alternative : string {'lower', 'upper'}\n", "        if alternative=='lower', finds a lower confidence bound;\n", "        if alternative=='upper', finds an upper confidence bound.\n", "        The label matches the sidedness of the tests\n", "        that are inverted to get the confidence bound.\n", "    cl : float\n", "        confidence level. Assumed to be at least 50%.\n", "    p_value : callable\n", "        method for computing the p-value\n", "    \n", "    Returns:\n", "    --------\n", "    b : int\n", "        confidence bound\n", "    best_part : list of ints\n", "        partition that attains the confidence bound\n", "    \"\"\"\n", "    assert alternative in ['lower', 'upper']\n", "    if alternative == 'upper': # interchange good and bad\n", "        compl = tuple(np.array(sams)-np.array(found)) # bad items found; tuple so it is hashable\n", "        cb, best_part = strat_ci_bisect(strata, sams, compl, alternative='lower', \n", "                                        cl=cl, p_value=p_value)\n", "        b = np.sum(strata) - cb # good from bad\n", "        best_part_b = np.array(strata, dtype=int)-np.array(best_part, dtype=int)\n", "    else:\n", "        cl_sidak = math.pow(cl, 1/len(strata)) # Šidák adjustment\n", "        tail = 1-cl\n", "        a = sum((hypergeom_conf_interval( \\\n", "                 sams[s], found[s], strata[s], cl=cl_sidak, alternative=\"lower\")[0] \\\n", "                 for s in range(len(strata)))) # Šidák gives a conservative lower bound\n", "        b = int(np.sum(np.array(strata)*np.array(found)/np.array(sams)))-1 # expected good\n", "        p_a, best_part_a = p_value(strata, sams, found, a, alternative=alternative)\n", "        p_b, best_part_b = p_value(strata, sams, found, b, alternative=alternative)\n", "        tot_found = np.sum(found)\n", "        while p_a > tail and a > tot_found:\n", "            a = math.floor(a/2)\n", "            p_a, best_part_a = p_value(strata, sams, found, a, alternative=alternative)\n", "        if p_a > tail:\n", "            b = a\n", "            best_part_b = best_part_a\n", "        else:\n", "            while b-a > 1:\n", "                c = int((a+b)/2)\n", "                p_c, best_part_c = p_value(strata, sams, found, \n", "                                           c, alternative=alternative)\n", "                if p_c > tail:\n", "                    b, p_b, best_part_b = c, p_c, best_part_c\n", "                elif p_c < tail:\n", "                    a, p_a, best_part_a = c, p_c, best_part_c\n", "                elif p_c == tail:\n", "                    b, p_b, best_part_b = c, p_c, best_part_c\n", "                    break\n", "    return b, list(best_part_b)\n", "    \n", "@lru_cache(maxsize=None) # decorate the function to cache the results of calls\n", "                         # to the function (arguments must be hashable: pass tuples)\n", "def strat_test(strata, sams, found, good, alternative='lower'):\n", "    \"\"\"\n", "    P-value for the hypothesis that the number of ones in a stratified population is not \n", "    greater than (alternative='lower') or not less than (alternative='upper') a\n", "    hypothesized value, based on a stratified \n", "    random sample without replacement from the population.\n", "    \n", "    Uses a fast algorithm to find (an upper bound on) the P-value constructively.\n", "    \n", "    Uses Fisher's combining function to combine stratum-level P-values.\n", "    \n", "    Parameters:\n", "    ----------- \n", "    strata : tuple of ints\n", "        stratum sizes\n", "    sams : tuple of ints\n", "        sample sizes in the strata\n", "    found : tuple of ints\n", "        number of ones found in each stratum in each sample\n", "    good : int\n", "        hypothesized number of ones in the population\n", 
population\n", " alternative : string {'lower', 'upper'}\n", " test against the alternative that the true number of \"good\" items \n", " is less than (lower) or greater than (upper) the hypothesized number, good\n", " \n", " Returns:\n", " --------\n", " p : float\n", " P-value\n", " best_part : list\n", " the partition that attained the maximum p-value\n", " \"\"\"\n", " assert alternative in ['lower', 'upper']\n", " if alternative == 'upper': # exchange roles of \"good\" and \"bad\"\n", " compl = np.array(sams) - np.array(found) # bad items found \n", " bad = np.sum(strata) - good # total bad items hypothesized\n", " res = strat_test(strata, sams, compl, bad, alternative='lower')\n", " return res[0], list(np.array(strata, dtype=int)-np.array(res[1], dtype=int))\n", " else: \n", " good = int(good)\n", " best_part = [] # best partition\n", " if good < np.sum(found): # impossible\n", " p = 0 \n", " best_part = found\n", " elif good >= np.sum(strata) - np.sum(sams) + np.sum(found): # guaranteed\n", " p = 1\n", " best_part = list(np.array(strata, dtype=int) - \\\n", " np.array(sams, dtype=int) + \\\n", " np.array(found, dtype=int)) \n", " elif good >= np.sum(found): # outcome is possible under the null \n", " log_p = 0 # log of joint probability\n", " contrib = [] # contributions to the log joint probability\n", " base = np.sum(found) # must have at least this many good items\n", " for s in range(len(strata)):\n", " log_p_j = sp.stats.hypergeom.logsf(found[s]-1, strata[s], found[s], sams[s])\n", " # baseline p for minimum number of good items in stratum\n", " log_p += log_p_j # log of the product of stratum-wise P-values\n", " for j in range(found[s]+1, strata[s]-(sams[s]-found[s])+1):\n", " log_p_j1 = sp.stats.hypergeom.logsf(found[s]-1, strata[s], j, sams[s])\n", " # tail probability for j good in stratum\n", " contrib.append([log_p_j1 - log_p_j, s]) \n", " # relative increase in P from new item\n", " log_p_j = log_p_j1\n", " sorted_contrib = sorted(contrib, key = lambda x: x[0], reverse=True)\n", " best_part = np.array(found)\n", " for i in range(good-base):\n", " log_p += sorted_contrib[i][0]\n", " best_part[int(sorted_contrib[i][1])] += 1\n", " p = sp.stats.chi2.sf(-2*log_p, df = 2*len(strata))\n", " return p, list(best_part)\n", "\n", "def strat_ci_search(strata, sams, found, alternative='lower', cl=0.95):\n", " \"\"\"\n", " Confidence bound on the number of ones in a stratified population,\n", " based on a stratified random sample (without replacement) from\n", " the population.\n", " \n", " If alternative=='lower', finds an upper confidence bound.\n", " If alternative=='upper', finds a lower confidence bound.\n", " \n", " Searches for the allocation of items that attains the confidence bound\n", " by increasing the number of ones from the minimum consistent\n", " with the data (total found in the sample) until the P-value is greater\n", " than 1-cl.\n", " \n", " Parameters:\n", " ----------- \n", " strata : list of ints\n", " stratum sizes\n", " sams : list of ints\n", " sample sizes in the strata\n", " found : list of ints\n", " number of ones found in each stratum in each sample\n", " alternative : string {'lower', 'upper'}\n", " if alternative=='lower', finds an upper confidence bound.\n", " if alternative=='upper', finds a lower confidence bound.\n", " While this is not mnemonic, it corresponds to the sidedness of the tests\n", " that are inverted to get the confidence bound.\n", " cl : float\n", " confidence level. 
"    \n", "    Returns:\n", "    --------\n", "    cb : int\n", "        confidence bound\n", "    best_part : list of ints\n", "        partition that attains the confidence bound (give or take one item)\n", "    \"\"\"\n", "    assert alternative in ['lower', 'upper']\n", "    if alternative == 'upper': # interchange good and bad\n", "        compl = tuple(np.array(sams)-np.array(found)) # bad items found; tuple so strat_test can cache\n", "        cb, best_part = strat_ci_search(strata, sams, compl, alternative='lower', cl=cl)\n", "        cb = np.sum(strata) - cb # good from bad\n", "        best_part = np.array(strata, dtype=int)-np.array(best_part, dtype=int)\n", "    else:\n", "        cb = int(np.sum(np.array(strata)*np.array(found)/np.array(sams)))-1 # expected good\n", "        p_attained, best_part = strat_test(strata, sams, found, cb, \n", "                                           alternative=alternative)\n", "        while p_attained >= 1-cl:\n", "            cb -= 1\n", "            p_attained, best_part = strat_test(strata, sams, found, cb, \n", "                                               alternative=alternative)\n", "        cb += 1\n", "        p_attained, best_part = strat_test(strata, sams, found, cb, \n", "                                           alternative=alternative)\n", "    return cb, list(best_part)\n", "\n", "def strat_ci(strata, sams, found, alternative='lower', cl=0.95):\n", "    \"\"\"\n", "    Conservative confidence bound on the number of ones in a population,\n", "    based on a stratified random sample (without replacement) from\n", "    the population.\n", "    \n", "    If alternative=='lower', finds a lower confidence bound.\n", "    If alternative=='upper', finds an upper confidence bound.\n", "    \n", "    Constructs the confidence bound directly by constructing the allocation\n", "    of the most extreme number of ones (smallest for a lower bound, largest\n", "    for an upper bound) that would not be rejected at (conservative) level 1-cl.\n", "    \n", "    Parameters:\n", "    ----------- \n", "    strata : list of ints\n", "        stratum sizes\n", "    sams : list of ints\n", "        sample sizes in the strata\n", "    found : list of ints\n", "        number of ones found in each stratum in each sample\n", "    alternative : string {'lower', 'upper'}\n", "        if alternative=='lower', finds a lower confidence bound;\n", "        if alternative=='upper', finds an upper confidence bound.\n", "        The label matches the sidedness of the tests\n", "        that are inverted to get the confidence bound.\n", "    cl : float\n", "        confidence level\n", "    \n", "    Returns:\n", "    --------\n", "    cb : int\n", "        confidence bound\n", "    best_part : list of ints\n", "        partition that attains the confidence bound (give or take one item)\n", "    \"\"\"\n", "    assert alternative in ['lower', 'upper']\n", "    if alternative == 'upper': # interchange role of good and bad\n", "        compl = np.array(sams)-np.array(found) # bad found\n", "        cb, best_part = strat_ci(strata, sams, compl, alternative='lower', cl=cl)\n", "        best_part = np.array(strata, dtype=int)-np.array(best_part, dtype=int)\n", "        cb = np.sum(strata) - cb # good from bad\n", "    else: \n", "        threshold = -sp.stats.chi2.ppf(cl, df=2*len(strata))/2\n", "              # g is in the set if \n", "              # chi2.sf(-2*log(p), df=2*len(strata)) >= 1-cl\n", "              # i.e., -2*log(p) <= chi2.ppf(cl, df)\n", "              # i.e., log(p) >= -chi2.ppf(cl, df)/2\n", "        log_p = 0 # log of joint probability\n", "        contrib = [] # contributions to the log joint probability\n", "        base = np.sum(found) # must have at least this many good items\n", "        for s in range(len(strata)):\n", "            log_p_j = sp.stats.hypergeom.logsf(found[s]-1, strata[s], found[s], sams[s])\n", "                  # baseline p for minimum number of good items in stratum\n", "            log_p += log_p_j # log of the product of stratum-wise P-values\n", "            small = np.inf # for monotonicity check\n", "            for j in range(found[s]+1, 
strata[s]-(sams[s]-found[s])+1):\n", " log_p_j1 = sp.stats.hypergeom.logsf(found[s]-1, strata[s], j, sams[s])\n", " log_p_j1 = log_p_j if log_p_j1 < log_p_j else log_p_j1 # true difference is nonnegative\n", " contrib.append([log_p_j1 - log_p_j, s]) \n", " log_p_j = log_p_j1\n", " if contrib[-1][0] > small:\n", " print(\"reversal in stratum {} for {} good; old: {} new:{}\".format(s, \n", " j, small, contrib[-1][0]))\n", " small = contrib[-1][0]\n", " sorted_contrib = sorted(contrib, key = lambda x: x[0], reverse=True)\n", " best_part = np.array(found)\n", " added = 0\n", " while log_p < threshold: \n", " log_p += sorted_contrib[added][0]\n", " best_part[int(sorted_contrib[added][1])] += 1\n", " added += 1\n", " cb = base + added \n", " return cb, list(best_part)\n", "\n", "def strat_p(strata, sams, found, hypo, alternative='lower'):\n", " \"\"\"\n", " Finds tail probability for the hypothesized population counts <hypo> for \n", " simple random samples of sizes <sams> from strata of sizes <strata> if \n", " <found> ones are found in the strata.\n", " \n", " Uses Fisher's combining function across strata.\n", " \n", " Parameters:\n", " ----------- \n", " strata : list of ints\n", " stratum sizes\n", " sams : list of ints\n", " sample sizes from the strata\n", " found : list of ints\n", " number of ones found in each stratum in each sample\n", " hypo : list of ints\n", " hypothesized number of ones in the strata\n", " \n", " Returns:\n", " --------\n", " p : float\n", " appropriate tail probability\n", " \"\"\"\n", " assert alternative in ['lower', 'upper']\n", " if alternative == 'lower':\n", " p_func = lambda x, N, G, n: sp.stats.hypergeom.sf(x-1, N, G, n)\n", " else:\n", " p_func = lambda x, N, G, n: sp.stats.hypergeom.cdf(x, N, G, n)\n", " p = 1 \n", " for s in range(len(strata)):\n", " p *= p_func(found[s], strata[s], hypo[s], sams[s])\n", " return sp.stats.chi2.sf(-2*math.log(p), df=2*len(strata))\n", "\n", "def strat_p_ws(strata, sams, found, hypo, alternative='upper'):\n", " \"\"\"\n", " Finds Wendell-Schmee P-value for the hypothesized population counts <hypo> for \n", " simple random samples of sizes <sams> from strata of sizes <strata> if \n", " <found> ones are found in the strata.\n", " \n", " Parameters:\n", " ----------- \n", " strata : list of ints\n", " stratum sizes\n", " sams : list of ints\n", " sample sizes from the strata\n", " found : list of ints\n", " number of ones found in each stratum in each sample\n", " hypo : list of ints\n", " hypothesized number of ones in the strata\n", " \n", " Returns:\n", " --------\n", " p : float\n", " tail probability\n", " \"\"\"\n", " assert alternative in ['lower', 'upper']\n", " if alternative == 'lower': # exchange roles of \"good\" and \"bad\"\n", " compl = np.array(sams) - np.array(found) # bad items found \n", " hypo_c = np.array(strata) - np.array(hypo) # total bad items hypothesized\n", " p = strat_p_ws(strata, sams, compl, hypo_c, alternative='upper')\n", " else: \n", " p_hat = lambda f, st=strata, sa=sams: np.sum(np.array(st)*np.array(f)/np.array(sa))/np.sum(st) # pooled estimate\n", " p_hat_0 = p_hat(found)\n", " per_strat = np.array(strata)/np.array(sams)/np.sum(strata)\n", " strat_max = np.floor(p_hat_0/per_strat)\n", " lo_t = (t for t in itertools.product(*[range(int(s+1)) for s in strat_max]) \\\n", " if p_hat(t) <= p_hat_0)\n", " p = sum(np.prod(sp.stats.hypergeom.pmf(t, strata, hypo, sams)) \\\n", " for t in lo_t)\n", " return p\n", "\n", "@lru_cache(maxsize=None) # decorate the function to cache the results \n", " # 
of calls to the function (arguments must be hashable: pass tuples)\n", "def strat_test_ws(strata, sams, found, good, alternative='lower'):\n", "    \"\"\"\n", "    Find the P-value of the hypothesis that the number G of \"good\" objects in a \n", "    stratified population is less than or equal to good (alternative='lower') or\n", "    greater than or equal to good (alternative='upper'), using a stratified\n", "    random sample.\n", "    \n", "    Assumes that a simple random sample of size sams[s] was drawn from stratum s, \n", "    which contains strata[s] objects in all.\n", "    \n", "    The P-value is the maximum Wendell-Schmee P-value over all allocations of \n", "    good objects among the strata. The allocations are enumerated using Feller's \n", "    \"bars and stars\" construction, constrained to honor the stratum sizes (each \n", "    stratum can contain no more \"good\" items than it has items in all, nor fewer \n", "    \"good\" items than the sample contains).\n", "    \n", "    The number of allocations grows combinatorially: there can be as many as\n", "    [(#strata + #good items) choose (#strata-1)] allocations, making the brute-force\n", "    approach computationally infeasible when the number of strata and/or the number of\n", "    good items is large.\n", "    \n", "    Parameters:\n", "    -----------\n", "    strata : tuple of ints\n", "        sizes of the strata. One int per stratum.\n", "    sams : tuple of ints\n", "        the sample sizes from each stratum\n", "    found : tuple of ints\n", "        the numbers of \"good\" items found in the samples from the strata\n", "    good : int\n", "        the hypothesized total number of \"good\" objects in the population\n", "    alternative : string {'lower', 'upper'}\n", "        test against the alternative that the true value is greater than good (lower)\n", "        or less than good (upper)\n", "    \n", "    Returns:\n", "    --------\n", "    p : float\n", "        maximum p-value over all ways of allocating good \"good\" objects\n", "        among the strata, honoring the stratum sizes. \n", 
\n", " best_part : list\n", " the partition that attained the maximum p-value\n", " \"\"\"\n", " assert alternative in ['lower', 'upper']\n", " if alternative == 'lower': # exchange roles of \"good\" and \"bad\"\n", " compl = np.array(sams) - np.array(found) # bad items found \n", " bad = np.sum(strata) - good # total bad items hypothesized\n", " res = strat_test_ws(strata, sams, compl, bad, alternative='upper')\n", " return res[0], list(np.array(strata, dtype=int)-np.array(res[1], dtype=int)) \n", " best_part = found # start with what you see\n", " good = int(good)\n", " if good < np.sum(found): \n", " p = 0 if alternative == 'lower' else 1 \n", " elif good > np.sum(strata) - np.sum(sams) + np.sum(found):\n", " p = 1 if alternative == 'lower' else 0\n", " else: # use Feller's \"bars and stars\" enumeration of combinations, constrained\n", " p_hat = lambda f, st=strata, sa=sams: np.sum(np.array(st)*np.array(f)/np.array(sa))/np.sum(st) # pooled estimate\n", " p_hat_0 = p_hat(found)\n", " per_strat = np.array(strata)/np.array(sams)/np.sum(strata)\n", " strat_max = np.floor(p_hat_0/per_strat)\n", " p = 0 # initial value for the max\n", " n_strata = len(strata)\n", " parts = sp.special.binom(good+n_strata-1, n_strata-1)\n", " if parts >= 10**7:\n", " print(\"warning--large number of partitions: {}\".format(parts))\n", " barsNstars = good + n_strata\n", " bars = [0]*n_strata + [barsNstars]\n", " partition = ([bars[j+1] - bars[j] - 1 for j in range(n_strata)] \\\n", " for bars[1:-1] in itertools.combinations(range(1, barsNstars), n_strata-1) \\\n", " if all(((bars[j+1] - bars[j] - 1 <= strata[j]) and \\\n", " (bars[j+1] - bars[j] >= found[j])) for j in range(n_strata)))\n", " for part in partition:\n", " lo_t = (t for t in itertools.product(*[range(int(s+1)) for s in strat_max]) \\\n", " if p_hat(t) <= p_hat_0)\n", " p_new = 0\n", " for t in lo_t:\n", " p_temp = 1\n", " for s in range(len(strata)):\n", " p_temp *= sp.stats.hypergeom.pmf(t[s], strata[s], part[s], sams[s])\n", " p_new += p_temp\n", " if p_new > p:\n", " best_part = part\n", " p = p_new\n", " return p, list(best_part)\n", "\n", "def strat_ci_wright(strata, sams, found, alternative='lower', cl=0.95):\n", " \"\"\"\n", " Confidence bound on the number of ones in a stratified population,\n", " based on a stratified random sample (without replacement) from\n", " the population.\n", " \n", " If alternative=='lower', finds an upper confidence bound.\n", " If alternative=='upper', finds a lower confidence bound.\n", " \n", " Constructs the confidence bound by finding Šidák multiplicity-adjusted\n", " joint lower confidence bounds for the number of ones in each stratum.\n", " \n", " This approach is mentioned in Wright, 1991.\n", " \n", " Parameters:\n", " ----------- \n", " strata : list of ints\n", " stratum sizes\n", " sams : list of ints\n", " sample sizes in the strata\n", " found : list of ints\n", " number of ones found in each stratum in each sample\n", " alternative : string {'lower', 'upper'}\n", " if alternative=='lower', finds an upper confidence bound.\n", " if alternative=='upper', finds a lower confidence bound.\n", " While this is not mnemonic, it corresponds to the sidedness of the tests\n", " that are inverted to get the confidence bound.\n", " cl : float\n", " confidence level\n", " \n", " Returns:\n", " --------\n", " cb : int\n", " confidence bound\n", " \"\"\"\n", " assert alternative in ['lower', 'upper']\n", " inx = 0 if alternative == 'lower' else 1\n", " cl_sidak = math.pow(cl, 1/len(strata)) # Šidák-adjusted 
"    cb = sum((hypergeom_conf_interval(\n", "              sams[s], found[s], strata[s], cl=cl_sidak, alternative=alternative)[inx] \n", "              for s in range(len(strata))))\n", "    return cb" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's check the empirical coverage of the greedy method." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1 1 1.0\n", "101 101 1.0\n", "201 200 0.9950248756218906\n", "301 300 0.9966777408637874\n", "401 400 0.9975062344139651\n", "501 500 0.998003992015968\n", "601 600 0.9983361064891847\n", "701 700 0.9985734664764622\n", "801 800 0.9987515605493134\n", "901 899 0.9977802441731409\n" ] }, { "data": { "text/plain": [ "0.998" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# test empirical coverage\n", "reps = int(10**3)\n", "cl = 0.95\n", "alternative = 'upper'\n", "strata = [5000, 3000, 2000]\n", "sams = [75,50,25]\n", "good = [100, 100, 500]\n", "g = np.sum(good)\n", "cover = 0\n", "verb = False\n", "for i in range(reps):\n", "    found = sp.stats.hypergeom.rvs(strata, good, sams)\n", "    ub = strat_ci(strata, sams, found, alternative=alternative, cl=cl)\n", "    if verb: \n", "        print(\"f: {} ub: {} best: {}\".format(found, ub[0], ub[1]))\n", "    cover = cover+1 if ub[0] >= g else cover\n", "    if i % 100 == 0:\n", "        print(i+1, cover, cover/(i+1))\n", "cover/reps" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's write some numerical tests to confirm that the three ways of computing bounds agree." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "def test_strat_test(verbose = False):\n", "    # fixtures are tuples so the lru_cache-decorated functions can hash them\n", "    strata = [(10, 20, 30, 40), (20, 20, 20)]\n", "    sams = [(2, 3, 4, 5), (5, 5, 10)]\n", "    found = [(1, 2, 3, 4), (0, 3, 2)]\n", "    for i in range(len(strata)):\n", "        compl = tuple(np.array(sams[i]) - np.array(found[i]))\n", "        for j in list(range(4, 20)) + [60]:\n", "            for alternative in ['lower', 'upper']:\n", "                if verbose:\n", "                    print(\"i: {} j: {} alternative: {}\".format(i, j, alternative))\n", "                p_exact = strat_test_brute(strata[i], sams[i], found[i], j, alternative=alternative)\n", "                p_exact_c = strat_test_brute(strata[i], sams[i], compl, j, alternative=alternative)\n", "                p_lkp = strat_test(strata[i], sams[i], found[i], j, alternative=alternative)\n", "                p_lkp_c = strat_test(strata[i], sams[i], compl, j, alternative=alternative)\n", "                np.testing.assert_allclose(p_exact[0], p_lkp[0])\n", "                np.testing.assert_allclose(p_exact_c[0], p_lkp_c[0])\n", "\n", "def test_strat_ci(verbose = False):\n", "    # fixtures are tuples so the lru_cache-decorated functions can hash them\n", "    strata = [(10, 20), (10, 20, 30, 40), (10, 20, 20, 30)]\n", "    sams = [(5, 5), (2, 3, 4, 5), (2, 4, 5, 6)]\n", "    found = [(2, 2), (1, 2, 3, 4), (0, 1, 2, 3)]\n", "    for i in range(len(strata)):\n", "        for alternative in ['lower','upper']:\n", "            brute = strat_ci_bisect(strata[i], sams[i], found[i], alternative=alternative)\n", "            lkp = strat_ci(strata[i], sams[i], found[i], alternative=alternative)\n", "            lkp_s = strat_ci_search(strata[i], sams[i], found[i], alternative=alternative)\n", "            if verbose:\n", "                print(\"{}-{}\".format(i, alternative))\n", "                print(\"i: {} brute: {} lkp: {} lkp_s: {} \\nbest_brute: {} best_lkp: {} best_lkp_s: {}\".format(\n", "                      i, brute[0], lkp[0], lkp_s[0], brute[1], lkp[1], lkp_s[1]))\n", "            np.testing.assert_allclose(brute[0], lkp[0])\n", "            np.testing.assert_allclose(brute[0], lkp_s[0])\n", "            np.testing.assert_allclose(brute[1], lkp[1])\n", "            np.testing.assert_allclose(brute[1], lkp_s[1])" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "test_strat_test()" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "test_strat_ci()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.4" } }, "nbformat": 4, "nbformat_minor": 4 }