{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Hypothesis Testing\n", "=============================\n", "\n", "Credits: Forked from [CompStats](https://github.com/AllenDowney/CompStats) by Allen Downey. License: [Creative Commons Attribution 4.0 International](http://creativecommons.org/licenses/by/4.0/)." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from __future__ import print_function, division\n", "\n", "import numpy\n", "import scipy.stats\n", "\n", "import matplotlib.pyplot as pyplot\n", "\n", "from IPython.html.widgets import interact, fixed\n", "from IPython.html import widgets\n", "\n", "import first\n", "\n", "# seed the random number generator so we all get the same results\n", "numpy.random.seed(19)\n", "\n", "# some nicer colors from http://colorbrewer2.org/\n", "COLOR1 = '#7fc97f'\n", "COLOR2 = '#beaed4'\n", "COLOR3 = '#fdc086'\n", "COLOR4 = '#ffff99'\n", "COLOR5 = '#386cb0'\n", "\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Part One\n", "========\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As an example, let's look at differences between groups. The example I use in _Think Stats_ is first babies compared with others. The `first` module provides code to read the data into three pandas Dataframes." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [], "source": [ "live, firsts, others = first.MakeFrames()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The apparent effect we're interested in is the difference in the means. Other examples might include a correlation between variables or a coefficient in a linear regression. The number that quantifies the size of the effect, whatever it is, is the \"test statistic\"." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def TestStatistic(data):\n", " group1, group2 = data\n", " test_stat = abs(group1.mean() - group2.mean())\n", " return test_stat" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For the first example, I extract the pregnancy length for first babies and others. The results are pandas Series objects." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [], "source": [ "group1 = firsts.prglngth\n", "group2 = others.prglngth" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The actual difference in the means is 0.078 weeks, which is only 13 hours." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "0.078037266777549519" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "actual = TestStatistic((group1, group2))\n", "actual" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The null hypothesis is that there is no difference between the groups. We can model that by forming a pooled sample that includes first babies and others." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [], "source": [ "n, m = len(group1), len(group2)\n", "pool = numpy.hstack((group1, group2))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then we can simulate the null hypothesis by shuffling the pool and dividing it into two groups, using the same sizes as the actual sample." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def RunModel():\n", " numpy.random.shuffle(pool)\n", " data = pool[:n], pool[n:]\n", " return data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The result of running the model is two NumPy arrays with the shuffled pregnancy lengths:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "(array([36, 40, 39, ..., 43, 42, 40]), array([43, 39, 32, ..., 37, 35, 41]))" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "RunModel()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then we compute the same test statistic using the simulated data:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "0.081758440969863955" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "TestStatistic(RunModel())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we run the model 1000 times and compute the test statistic, we can see how much the test statistic varies under the null hypothesis." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "(1000,)" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "test_stats = numpy.array([TestStatistic(RunModel()) for i in range(1000)])\n", "test_stats.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here's the sampling distribution of the test statistic under the null hypothesis, with the actual difference in means indicated by a gray line." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYwAAAEPCAYAAABRHfM8AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAE0ZJREFUeJzt3X2QnVVhx/HvJQEhxmxIwbzsRoO8TEWCQCyk4svVpim2\nClgqkFaNhaozFHGsM5XoTMnaim/FKqXasQgGlNSMlDSMtSRQbwsdkkAgBFioJCVjNkKCvIaxYrK5\n/eOczT7Zvbs5u3ef+zy7+/3MPHPPPc/b2cPl/nKetwuSJEmSJEmSJEmSJEmSJEmScnQksAHYDHQB\nX4z1M4B1wE+BtcD0zDrLgCeAx4HFLWupJKlwU+LrZGA98DbgK8BfxvrPAF+K5ZMJ4XI4MA/YChzW\nqoZKksphCnAf8CbC6GFmrJ8V30MYXXwms86/Awtb1UBJ0tDy/hf8YYRRwy7gJ8CjhLDYFefvoi88\n5gDdmXW7gfac2ydJSjQ55+3vB04D2oA7gHf1m1+P02CGmidJaqG8A6PXi8CPgAWEUcUs4GlgNrA7\nLrMTmJtZpyPWHeT444+vb9u2LdfGStI4tA04oZkN5HlI6hj6roA6Cvhd4EFgDbA01i8FVsfyGuBi\n4AjgOOBEYGP/jW7bto16ve40StNVV11VeBuGO91///0HpqLbMtb7ssyT/Tm6E3B8s1/qeY4wZgMr\nCKF0GHAzcBchNFYBlwLbgQvj8l2xvgvYB1yGh6QkqTTyDIyHgTMa1D8HLBpknavjJEkqGe9zmOCq\n1WrRTRg37MvRZX+WT6XoBoxAPR6P0wS1adOmA+UFCxYU2BJp7KhUKtDkd74jDElSEgNDkpTEwJAk\nJTEwJElJDAxJUhIDQ5KUxMCQJCUxMCRJSQwMSVISA0OSlMTAkCQlMTAkSUkMDElSEgNDkpTEwJAk\nJTEwJElJDAxJUhIDQ5KUxMCQJCUxMCRJSQwMSVISA0OSlMTAkCQlMTAkSUkMDElSkslFN2Akurq6\nWr7PKVOmMG/evJbvV5LKIs/AmAvcBLwWqAPfBq4FlgN/BjwTl/ss8ONYXgZcAvQAVwBrG234HYve\nm1ebB/X87p/x8p49HHXUUS3ftySVQZ6BsRf4FLAZmApsAtYRwuNrcco6GbgovrYDdwInAfv7b/jE\nC/4ut0YP5oHrl9DT09Py/UpSWeR5DuNpQlgAvAw8RggCgEqD5c8DVhKCZjuwFTgzx/ZJkoahVSe9\n5wGnA+vj+08ADwHfAabHujlAd2adbvoCRpJUsFYExlTgh8AnCSONbwHHAacBTwHXDLFuPffWSZKS\n5H2V1OHArcD3gNWxbndm/vXA7bG8k3CivFdHrBtgx4aVB8rT2k+hrWP+KDVXksaHWq1GrVYb1W02\nOpcwmtteATxLOPndazZhZEGs/y3gjwknu28hnLfoPel9AgNHGfWFl6+m1R64fgnPPrObqVOntnzf\nOtimTZsOlBcsWFBgS6Sxo1KpQJPf+XmOMM4GPghsAR6MdZ8FlhAOR9WBJ4GPx3ldwKr4ug+4DA9J\nSVJp5BkY99D4HMmPG9T1ujpOkqSS8dEgkqQkBoYkKYmBIUlKYmBIkpIYGJKkJAaGJCmJgSFJSmJg\nSJKSGBiSpCQGhiQpiYEhSUpiYEiSkhgYkqQkBoYkKYmBIUlKYmBIkpIYGJKkJAaGJCmJgSFJSmJg\nSJKSGBiSpCQGhiQpiYEhSUpiYAzDnPYOKpVKy6e26UcX/adLEpOLbsBYsuelF1l4+eqW73f9dee3\nfJ+S1J8jDElSEgNDkpTEwJAkJTEwJElJ8gyMucBPgEeBR4ArYv0MYB3wU2AtMD2zzjLgCeBxYHGO\nbZMkDVOegbEX+BTwJmAh8OfAG4ErCYFxEnBXfA9wMnBRfD0H+GbO7Rs7KpMKuZzXS3olZeV5We3T\ncQJ4GXgMaAfOBd4Z61cANUJonAesJATNdmArcCawPsc2jg31nkIu5wUv6ZXUp1X/gp8HnA5sAGYC\nu2L9rvgeYA7QnVmnmxAwkqQSaMWNe1OBW4FPAnv6zavHaTAN5+3YsPJAeVr7KbR1zG+yiZI0vtRq\nNWq12qhuM+/AOJwQFjcDvcdUdgGzCIerZgO7Y/1OwonyXh2xboC5Zy3Jo62SNG5Uq1Wq1eqB952d\nnU1vM89DUhXgO0AX8PVM/RpgaSwvpS9I1gAXA0cAxwEnAhtzbJ8kaRjyHGGcDXwQ2AI8GOuWAV8C\nVgGXEk5uXxjndcX6LmAfcBlDH66SJLVQnoFxD4OPYBYNUn91nCRJJeN9DpKkJAaGJCmJgSFJSmJg\nSJKSGBiSpCQGhiQpiYEhSUpiYEiSkhgYkqQkBoYkKYmBIUlKYmBIkpIYGJKkJAaGJCmJgSFJSmJg\nSJKSGBiSpCQGhiQpiYEhSUpiYEiSkhgYkqQkKYFxV2KdJGkcmzzEvKOAKcCxwIxM/TSgPc9GSZLK\nZ6jA+DjwSWAOsClTvwe4Ls9GSZLKZ6jA+HqcrgCubU1zJEllNVRg9LoWeCswr9/yN+XRIElSOaUE\nxveANwCbgZ5MvYEhSRNISmAsAE4G6jm3RZJUYimX1T4CzB7h9m8AdgEPZ+qWA93Ag3F6T2beMuAJ\n4HFg8Qj3KUnKQcoI41igC9gIvBLr6sC5CeveCPw9Bx++qgNfi1PWycBF8bUduBM4CdifsB9JUs5S\nAmN5E9u/m3CyvL9Kg7rzgJXAXmA7sBU4E1jfxP4lSaMkJTBqOez3E8CHgfuBTwMvEO73yIZDN94g\nKEmlkRIYL9N3wvsI4PBYN22E+/wW8PlY/mvgGuDSQZZteKJ9x4aVB8rT2k+hrWP+CJsiSeNTrVaj\nVquN6jZTAmNqpnwY4dzFwib2uTtTvh64PZZ3AnMz8zpi3QBzz1rSxO4lafyrVqtUq9UD7zs7O5ve\n5nCfVrsfWA2c08Q+s1dcvZ++K6jWABcTRjHHAScSTrSrSJVJVCqVlk9t048u+i+X1E/KCOOCTPkw\nwn0Z/5e4/ZXAO4FjgB3AVUAVOI1wuOlJwjOrIFyJtSq+7gMuw3s/ilfvYeHlq1u+2/XXnd/yfUoa\nWkpgvI++L+59hCuYzkvcfqNjRzcMsfzVcZIklUxKYHwk70ZIksov5RzGXOA24Jk43Uo4IS1JmkBS\nAuNGwgnpOXG6PdZJkiaQlMA4lhAQe+P0XeC1ObZJklRCKYHxLPAhYBLhnMcHgV/k2ShJUvmkBMaf\nAhcCTwNPAR+IdZKkCSTlKqnPE5779Hx8PwP4W+CSvBolSSqflBHGm+kLC4DngDPyaY4kqaxSAqNC\nGFX0mkE4nyFJmkBSDkldA9xLeGxHhXAO4wt5NkqSVD4pgXETsAl4N+ERIe8nPO9JkjSBpAQGwKNx\nkiRNUMN9vLkkaYIyMCRJSQwMSVISA0OSlMTAkCQlMTAkSUkMDElSEgNDkpTEwJAkJTEwJElJDAxJ\nUhIDQ5KUxMCQJCUxMCRJSQwMSVISA0OSlCTvwLgB2AU8nKmbAawDfgqsBaZn5i0DngAeBxbn3DZJ\n0jDkHRg3Auf0q7uSEBgnAXfF9wAnAxfF13OAb7agfZKkRHl/Id8NPN+v7lxgRSyvAM6P5fOAlcBe\nYDuwFTgz5/ZJkhIV8S/4mYTDVMTXmbE8B+jOLNcNtLewXZKkIUwueP/1OA01f4AdG1YeKE9rP4W2\njvmj3CxJGttqtRq1Wm1Ut1lEYOwCZgFPA7OB3bF+JzA3s1xHrBtg7llL8myfJI151WqVarV64H1n\nZ2fT2yzikNQaYGksLwVWZ+ovBo4AjgNOBDa2vHWSpIbyHmGsBN4JHAPsAP4K+BKwCriUcHL7wrhs\nV6zvAvYBlzH04SpJUgvlHRiDHTtaNEj91XGSJJWM9zlIkpIYGJKkJAaGJCmJgSFJSmJgSJKSGBiS\npCQGhiQpiYEhSUpiYEiSkhgYkqQkBoYkKYmBIUlKYmBIkpIYGJKkJAaGJCmJgSFJSmJgSJKSGBiS\npCQGhiQpiYGhcqpMolKpNJwOWmyQZZqZ2qYfXdAfLZXb5KIbIDVU72Hh5asPuVjKMsO1/rrzR32b\n0njgCEOSlMTAkCQlMTAkSUkMDElSEgNDkpTEwJAkJTEwJElJirwPYzvwEtAD7AXOBGYAPwBeH+df\nCLxQTPMkSVlFjjDqQBU4nRAWAFcC64CTgLvie0lSCRR9SKrS7/25wIpYXgF4y60klUTRI4w7gfuB\nj8a6mcCuWN4V30uSSqDIcxhnA08BxxIOQz3eb349TpKkEigyMJ6Kr88AtxHOY+wCZgFPA7OB3Y1W\n3LFh5YHytPZTaOuYn2tDNcHEJ+W22rS26bz4wvMt36/Gp1qtRq1WG9VtFhUYU4BJwB7g1cBioBNY\nAywFvhxfGz6KdO5ZS1rTSk1MiU/KHW0+JVejqVqtUq1WD7zv7OxseptFBcZMwqiitw3fB9YSzmes\nAi6l77JaSVIJFBUYTwKnNah/DljU4rZIkhIUfVmtJGmMMDAkSUkMDElSEgNDkpTEwJAkJTEwJElJ\nDAxJUhIDQ5KUxMCQJCUp8uGDkrIKeugh+OBDpTEwpLIo6KGH4IMPlcZDUpKkJAaGJCmJgSFJSmJg\nSJKSGBiSpCQGhiQpiYEhSUpiYEiSknjjnqTC7jL3DvOxxcCQVNhd5t5hPrZ4SEqSlMTAkCQlMTAk\nSUkMDElSEgNDkpTEwJAkJTEwJElJyhgY5wCPA08Anym4LZLyFG8YLGJqm3500X/9mFO2G/cmAdcB\ni4CdwH3AGuCxIhs1nr3Y/TBtHfOLbsa4YF+OwBA3DObdn+v/4QLvbh+msgXGmcBWYHt8/8/AeRgY\nuXlp5yN+yY0S+3J05d6f3t0+bGU7JNUO7Mi87451kqSClW2EUU9ZqPuuL+fdjgH2/vqVlu9Tksqk\n9QfwhrYQWE448Q2wDNgPZBNiK3B8a5slSWPeNuCEohsxmiYT/qh5wBHAZuCNRTZIklRe7wH+hzCS\nWFZwWyRJkiSNVSk37F0b5z8EnD7MdSeaZvpzO7AFeBDYmF8Tx5RD9edvAvcCvwI+Pcx1J6Jm+nM7\nfj6zDtWXf0L4f3wL8N/AqcNYt5QmEQ5BzQMOp/G5i98H/i2WzwLWD2PdiaaZ/gR4EpiRbxPHlJT+\nPBZ4C/A3HPwF5+dzoGb6E/x8ZqX05W8DbbF8Dk18d5blPozsDXt76bthL+tcYEUsbwCmA7MS151o\nRtqfMzPzy3YFXZFS+vMZ4P44f7jrTjTN9GcvP59BSl/eC7wYyxuAjmGse5CyBEbKDXuDLTMnYd2J\nppn+hHA/zJ2E/2E/mlMbx5Jmbij1ZtSBmu0TP599htuXl9J3ZGHY/x3KcuNe0g17+K+KVM3259uA\nnxMOC6wjHOO8exTaNVal9udorzteNdsnZwNP4ecThteX7wIuIfTfcNcFyjPC2AnMzbyfS0i7oZbp\niMukrDvRjLQ/d8byz+PrM8BthKHrRNbMZ8zP50DN9slT8dXPZ3pfngr8E+FQdO+TD8fsZzPlhr3s\nSdqF9J248Wa/gZrpzynAa2L51YSrKhbn2NaxYDifseUcfJLWz+dAzfSnn8+DpfTl6wjnKhaOYN3S\nanTD3sfj1Ou6OP8h4IxDrDvRjbQ/30D44GwGHsH+7HWo/pxFOB78IuFfcD8Dpg6x7kQ30v708znQ\nofryeuBZwmXI/S9F9rMpSZIkSZIkSZIkSZIkSZIkSePDcvpu2OoEfieW3w48CjwAHAl8lXA9fut/\n6D3NAuAbRTdCksazqxj4WGuAfyQ837/XCwzvGWNleX6aJKkJnyPcdXo3cAvwF7H+u8AFhCdtPgv8\nL/A94F+BfYS7Vy8kPIzuh4Q7WTcCb43rLwduBu4Bvg8cM8RyNwA/ITwy4ROZtn2YcEf8ZuCmWDfY\n/rKqwO0J2896GfgKYeS0jvBIh/+M67wvLjOJMLraGNv1sVg/lfAk102EH8w5N9bPAx4Dvh23ewdh\nhAZwBWHU9hCwcpA2SVJpLCB8wR1JeFbQE/QFxo3AHzYoA+zJlG+h70mcrwO6Ynk5cB/wqoTl7iH8\nmMxvAL8gfDG/iRBkvT/aM/0Q28mqcnBgNNp+f/uB34vlfwHWxuVOJYQjhID4XCy/Kv598+Jyvc9a\nOobQj8R5e+n7tbUf0DdS2xnbBDCtQXskwOG5yuPthC/HX8VpzRDLDnYIahEHPzztNYQH1NXj9l5J\nWO5HhC/WZ4HdhGcavRtYBTwXl39hiO1MAX45SPsabX8mfU8H7vVrwggA4GFCf/QQRgbzYv1iYD7w\nR/H9NOAEwtNGv0joz/2E34t5bVzmSUIoQxiB9G5rCyH8VsdJasjAUFnUOTgIRvLbJxXCz83+usG8\nXyYul63rIfw/0r9tKdsZTKPt95f9lbn9mXX291v+csIhq6yPEEYWZ8TtP0nfoadXMsv1AEfF8h8A\n7yAc7vocIYh6DvmXaMIpy+9hSP8FnE/fIan3jmAbawnH43u9ucnlIITFfwAfoO+Q1NGDbOe0Q7Rv\nNH8A7A7gMvoC5CTC6GYaYeTSQ/jBnNcntOl1QA24kvDbz68exXZqHDEwVBYPEo6rP0T4nY6NQyxb\nH6R8BfCWuI1HOfhR7iNZrlcX8AXCiefNwDWDbOdjDdatZ7aZLQ+l/zKN/t7rY7seIBy2+hbh/MX3\nY5u2AB8inOgearuTCBcEbInb+gbwUkIbJUmSJEmSJEmSJEmSJEmSJEmSJEmS1Iz/BxMO3JjboFOC\nAAAAAElFTkSuQmCC\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "def VertLine(x):\n", " \"\"\"Draws a vertical line at x.\"\"\"\n", " pyplot.plot([x, x], [0, 300], linewidth=3, color='0.8')\n", "\n", "VertLine(actual)\n", "pyplot.hist(test_stats, color=COLOR5)\n", "pyplot.xlabel('difference in means')\n", "pyplot.ylabel('count')\n", "None" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The p-value is the probability that the test statistic under the null hypothesis exceeds the actual value." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "0.14999999999999999" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pvalue = sum(test_stats >= actual) / len(test_stats)\n", "pvalue" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this case the result is about 15%, which means that even if there is no difference between the groups, it is plausible that we could see a sample difference as big as 0.078 weeks.\n", "\n", "We conclude that the apparent effect might be due to chance, so we are not confident that it would appear in the general population, or in another sample from the same population." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Part Two\n", "========\n", "\n", "We can take the pieces from the previous section and organize them in a class that represents the structure of a hypothesis test." ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [], "source": [ "class HypothesisTest(object):\n", " \"\"\"Represents a hypothesis test.\"\"\"\n", "\n", " def __init__(self, data):\n", " \"\"\"Initializes.\n", "\n", " data: data in whatever form is relevant\n", " \"\"\"\n", " self.data = data\n", " self.MakeModel()\n", " self.actual = self.TestStatistic(data)\n", " self.test_stats = None\n", "\n", " def PValue(self, iters=1000):\n", " \"\"\"Computes the distribution of the test statistic and p-value.\n", "\n", " iters: number of iterations\n", "\n", " returns: float p-value\n", " \"\"\"\n", " self.test_stats = numpy.array([self.TestStatistic(self.RunModel()) \n", " for _ in range(iters)])\n", "\n", " count = sum(self.test_stats >= self.actual)\n", " return count / iters\n", "\n", " def MaxTestStat(self):\n", " \"\"\"Returns the largest test statistic seen during simulations.\n", " \"\"\"\n", " return max(self.test_stats)\n", "\n", " def PlotHist(self, label=None):\n", " \"\"\"Draws a Cdf with vertical lines at the observed test stat.\n", " \"\"\"\n", " def VertLine(x):\n", " \"\"\"Draws a vertical line at x.\"\"\"\n", " pyplot.plot([x, x], [0, max(ys)], linewidth=3, color='0.8')\n", "\n", " ys, xs, patches = pyplot.hist(ht.test_stats, color=COLOR4)\n", " VertLine(self.actual)\n", " pyplot.xlabel('test statistic')\n", " pyplot.ylabel('count')\n", "\n", " def TestStatistic(self, data):\n", " \"\"\"Computes the test statistic.\n", "\n", " data: data in whatever form is relevant \n", " \"\"\"\n", " raise UnimplementedMethodException()\n", "\n", " def MakeModel(self):\n", " \"\"\"Build a model of the null hypothesis.\n", " \"\"\"\n", " pass\n", "\n", " def RunModel(self):\n", " \"\"\"Run the model of the null hypothesis.\n", "\n", " returns: simulated data\n", " \"\"\"\n", " raise UnimplementedMethodException()\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`HypothesisTest` is an abstract parent class that encodes the template. Child classes fill in the missing methods. For example, here's the test from the previous section." ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false }, "outputs": [], "source": [ "class DiffMeansPermute(HypothesisTest):\n", " \"\"\"Tests a difference in means by permutation.\"\"\"\n", "\n", " def TestStatistic(self, data):\n", " \"\"\"Computes the test statistic.\n", "\n", " data: data in whatever form is relevant \n", " \"\"\"\n", " group1, group2 = data\n", " test_stat = abs(group1.mean() - group2.mean())\n", " return test_stat\n", "\n", " def MakeModel(self):\n", " \"\"\"Build a model of the null hypothesis.\n", " \"\"\"\n", " group1, group2 = self.data\n", " self.n, self.m = len(group1), len(group2)\n", " self.pool = numpy.hstack((group1, group2))\n", "\n", " def RunModel(self):\n", " \"\"\"Run the model of the null hypothesis.\n", "\n", " returns: simulated data\n", " \"\"\"\n", " numpy.random.shuffle(self.pool)\n", " data = self.pool[:self.n], self.pool[self.n:]\n", " return data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can run the test by instantiating a DiffMeansPermute object:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "means permute pregnancy length\n", "p-value = 0.16\n", "actual = 0.0780372667775\n", "ts max = 0.173695697482\n" ] } ], "source": [ "data = (firsts.prglngth, others.prglngth)\n", "ht = DiffMeansPermute(data)\n", "p_value = ht.PValue(iters=1000)\n", "print('\\nmeans permute pregnancy length')\n", "print('p-value =', p_value)\n", "print('actual =', ht.actual)\n", "print('ts max =', ht.MaxTestStat())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And we can plot the sampling distribution of the test statistic under the null hypothesis." ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYwAAAEPCAYAAABRHfM8AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAEn5JREFUeJzt3X+MHOV9x/H3GtsVYIh94Yd/oZhfdnHVgoNLKSRilVLL\nVDWGRoGgENwGRW5JgDRVWlyp5a5NKVFDQhEiIuWXKeAEQWPZamkhiFWhARxc2xiMATu2iinYJDYE\nSJr4zPWP51nf2r49P3u3s8+c7/2SRjv77Mzs1+e5++w8z84MSJIkSZIkSZIkSZIkSZIkSSrQCcAT\nwIvAC8A1sb0b2AasidMFDessAV4FNgLzOlWoJCmvycAZcX4C8DJwGnA98OUBlp8NrAXGATOATcCY\nwquUJCUp8g/ym4QAAHgPeAmYFp9XBlh+IbAM2A1sJQTGWQXWJ0lqQac+wc8A5gDPxOdXA+uAO4GJ\nsW0qoauqbhv9ASNJyqwTgTEBeAi4lnCk8S3gREJ31RvATYOs21d4dZKkJGML3v444GHgPmB5bNvR\n8PodwMo4/zphoLxuemzbx8knn9y3efPm9lcqSYe2zcApw9lAkUcYFUKX0wbg5ob2KQ3zFwPr4/wK\n4NPAeMIRyKnAqv03unnzZvr6+ko1XX/99dlrsKZDqy5rsqZ2T8DJw/2jXuQRxrnA5cDzhK/PAvwl\ncBmhO6oP2AIsjq9tAB6Mj73AVdglJUmlUWRgPMXARzCPDLLODXGSJJWM5zm0QbVazV3CAawpXRnr\nsqY01tRZA50PUXZ9sT9OkpSoUqnAMP/me4QhSUpiYEiSkhgYkqQkBoYkKYmBIUlKYmBIkpIYGJKk\nJAaGJCmJgSFJSmJgSJKSFH0/jEK8//772d57/PjxjBs3Ltv7a3hWr169d/7MM8/MWIk08ozIwDju\nuGOyvO+ePR8wd+4cnnrqmYMvLEmHmBEZGO+//49Z3nfdute44oqHsry3JOXmGIYkKYmBIUlKYmBI\nkpIYGJKkJAaGJCmJgSFJSmJgSJKSGBiSpCQGhiQpiYEhSUpiYEiSkhgYkqQkBoYkKYmBIUlKYmCM\nQF1dk6hUKtmmrq5JuX8EkjIYkffDGO127Xqbvr7bs71/pbI423tLyscjDElSEgNDkpTEwJAkJXEM\no0UbN26iUqnkLkOSOs7AaNEvf9mbdcAZHHSWlEeRXVInAE8ALwIvANfE9i7gMeAV4FFgYsM6S4BX\ngY3AvAJrkyS1qMjA2A38KfBrwNnAF4DTgOsIgTETeDw+B5gNXBof5wO3FVyfJKkFRf5BfhNYG+ff\nA14CpgEXAktj+1Lgoji/EFhGCJqtwCbgrALrkyS1oFOf4GcAc4BngeOB7bF9e3wOMBXY1rDONkLA\nSJJKoBOD3hOAh4FrgXf3e60vTs0M+Fp398q989XqTKrVWcMsUZIOLbVajVqt1tZtFh0Y4whh8c/A\n8ti2HZhM6LKaAuyI7a8TBsrrpse2A3R3LyiiVkk6ZFSrVarV6t7nPT09w95mkV1SFeBOYANwc0P7\nCmBRnF9Ef5CsAD4NjAdOBE4FVhVYnySpBUUeYZwLXA48D6yJbUuAG4EHgSsJg9uXxNc2xPYNQC9w\nFYN3V0mSOqjIwHiK5kcw5zdpvyFOkqSS8TwHSVISA0OSlMTAkCQlMTAkSUkMDElSEgNDkpTEwJAk\nJTEwJElJDAxJUhIDQ5KUxMCQJCUxMCRJSQwMSVISA0OSlMTAkCQlMTAkSUkMDElSEgNDkpTEwJAk\nJTEwJElJDAxJUhIDQ5KUxMCQJCUxMCRJSQwMSVISA0OSlMTAkCQlMTAkSUkMDElSEgNDkpTEwJAk\nJTEwJElJDAxJUhIDQ5KUxMCQJCUpOjDuArYD6xvauoFtwJo4XdDw2hLgVWAjMK/g2iRJLSg6MO4G\n5u/X1gd8A5gTp0di+2zg0vg4H7itA/VJkhIV/Qf5SWDXAO2VAdoWAsuA3cBWYBNwVmGVSZJakusT\n/NXAOuBOYGJsm0roqqrbBkzrcF2SpCZyBMa3gBOBM4A3gJsGWbavIxVJkg5qbIb33NEwfwewMs6/\nDpzQ8Nr02HaA7u6Ve+er1ZlUq7PaXKIkjWy1Wo1ardbWbeYIjCmEIwuAi+n/BtUK4AHCgPg04FRg\n1UAb6O5eUHCJkjSyVatVqtXq3uc9PT3D3mbRgbEMOA84BngNuB6oErqj+oAtwOK47AbgwfjYC1yF\nXVKSVBpFB8ZlA7TdNcjyN8RJklQynucgSUpiYEiSkhgYkqQkBoYkKYmBIUlKYmBIkpLkOHFPI9zY\nsWOoVAa6fmRnTJo0kZ07B7qmpaQipQTG48DvJLRplOjt/YC+vtuzvX+lsvjgC0lqu8EC43DgCOBY\noKuh/Wi8iqwkjTqDBcZi4FrCZcdXN7S/C9xaZFGSpPIZLDBujtM1wC2dKUeSVFYpYxi3AOcAM/Zb\n/t4iCpIklVNKYNwHnASsBfY0tBsYkjSKpATGmcBsvNS4JI1qKSfuvUC46ZEkaRRLOcI4lnBTo1XA\nL2JbH3BhUUVJksonJTC6iy5CklR+KYFRK7oISVL5pQTGe/QPeI8HxsW2o4sqSpJUPimBMaFhfgxh\n7OLsYsqRJJVVq5c3/wBYDswvoBZJUomlHGF8smF+DOG8jJ8XU44kqaxSAmMB/WMYvcBWYGFRBUmS\nyiklMP6w6CIkSeWXMoZxAvA94K04PQxML7IoSVL5pATG3cAKwn0xpgIrY5skaRRJCYxjCQGxO073\nAMcVWJMkqYRSAuMnwGeBwwhjHpcDPy6yKElS+aQExh8BlwBvAm8An4ptkqRRJOVbUn8DXAHsis+7\ngK8DnyuqKElS+aQcYZxOf1gA7AQ+Wkw5kqSySgmMCuGooq6LMJ4hSRpFUrqkbgKeBh4khMengL8r\nsihJUvmkBMa9wGrgE4RLhFxMuAOfJGkUSQkMgBfjJEkapVq9vLkkaZQqOjDuArYD6xvauoDHgFeA\nR4GJDa8tAV4FNgLzCq5NktSCogPjbg682dJ1hMCYCTwenwPMBi6Nj/OB2zpQnyQpUdF/kJ9k33M4\nINzidWmcXwpcFOcXAssI16vaCmwCziq4PklSohyf4I8ndFMRH4+P81OBbQ3LbQOmdbAuSdIgcnf5\n9NF/N79mr0uSSiD1a7XttB2YTLiY4RRgR2x/nXCzprrpse0A3d0r985XqzOpVmcVUqgkjVS1Wo1a\nrdbWbeYIjBXAIuBr8XF5Q/sDwDcIXVGnAqsG2kB394Liq5SkEaxarVKtVvc+7+npGfY2iw6MZcB5\nwDHAa8BfAzcSLjNyJWFw+5K47IbYvgHoBa7CLilJKo2iA+OyJu3nN2m/IU6SpJLJPegtSRohDAxJ\nUhIDQ5KUxMCQJCUxMDTijB07hkqlMqSp0VC30dU1KdO/XMorx3kY0rD09n5AX9/tQ1p39er++aFu\no1JZPKT1pJHOIwxJUhIDQ5KUxMCQJCUxMCRJSQwMSVISA0OSlMTAkCQlMTAkSUkMDElSEgNDkpTE\nwJAkJTEwJElJDAxJUhIDQ5KUxMCQJCUxMCRJSQwMSVISA0OSlMTAkCQlMTAkSUkMDElSEgNDkpTE\nwJAkJTEwJElJDAxJUhIDQ5KUxMCQJCUxMCRJSQwMSVISA0OSlGRsxvfeCvwU2APsBs4CuoDvAh+J\nr18CvJ2nPElSo5xHGH1AFZhDCAuA64DHgJnA4/G5JKkEcndJVfZ7fiGwNM4vBS7qbDmSpGZyH2F8\nH3gO+HxsOx7YHue3x+dSqYwdO4ZKpZJ16uqalPvHoFEo5xjGucAbwLGEbqiN+73eF6cDdHev3Dtf\nrc6kWp1VUInSgXp7P6Cv7/asNVQqi7O+v8qvVqtRq9Xaus2cgfFGfHwL+B5hHGM7MBl4E5gC7Bho\nxe7uBZ2oT5JGrGq1SrVa3fu8p6dn2NvM1SV1BHBUnD8SmAesB1YAi2L7ImB550uTJA0k1xHG8YSj\ninoN9wOPEsYzHgSupP9rtZKkEsgVGFuAMwZo3wmc3+FaJEkJcn+tVpI0QhgYkqQkBoYkKYmBIUlK\nYmBIkpIYGJKkJAaGJCmJgSFJSmJgSJKS5Lz4oKQhql9iPZdJkyayc+eubO+vPAwMaQTKfYl1L68+\nOtklJUlKYmBIkpIYGJKkJAaGJCmJgSFJSmJgSJKSGBiSpCQGhiQpiYEhSUrimd6SWualSUYnA0NS\ny7w0yehkl5QkKYmBIUlKYmBIkpIYGJKkJAaGJCmJgSFJSmJgSJKSGBiSpCQGhiQpiYEhSUpiYEiS\nkhgYkqQkXnxQ0oiT+2q5MDqvmFvGwJgP3AwcBtwBfC1vOZLKJvfVcgHGjfuTUXeJ97IFxmHArcD5\nwOvAD4EVwEs5izqYWu1lqtVZucvYhzWlK2Nd1pQmZ03NQqtTNeW4xHvZxjDOAjYBW4HdwHeAhTkL\nSlGrvZK7hANYU7oy1mVNaayps8oWGNOA1xqeb4ttkqTMytYl1Zey0IIF3y66jgG9887PsryvJJVB\n3q8ZHOhsoJsw8A2wBPiAfQe+NwEnd7YsSRrxNgOn5C6incYS/lEzgPHAWuC0nAVJksrrAuBlwpHE\nksy1SJIkSRqp5gMbgVeBv2iyzC3x9XXAnBbX7XRdJwBPAC8CLwDXlKCmusOANcDKktQ0EXiIcL7N\nBsJYVu6alhD+79YDDwC/0qGafhV4Gvg/4M9aXDdHXTn388F+VpBnPx+splz7+WA1FbWfF+owQhfU\nDGAcA49d/B7wb3H+t4BnWlg3R12TgTPi/ARCN1s76hpOTXVfBu4nnBTZDsOtaSnwuTg/FvhQ5ppm\nAD+i/5fnu8CiDtV0LDAX+Cr7/nLn3s+b1ZVzP29WU12O/XywmnLt581qmkGL+3lZzsNIOWHvQsIP\nHOBZQlpPTly303UdD7xJ+M8DeI/wqWJq5poAphP+UN5B+74lN5yaPgR8HLgrvtYLvJO5pp/GdY4g\n/GIfQbjyQCdqegt4Lr7e6ro56sq5nzerCfLt581qyrmfN6up5f28LIGRcsJes2WmJqzb6bqm77fM\nDEJ3x7MZa6ov803gK4SvK7fLcH5OJxJ26LuB/wb+ibDj5qppGrATuAn4H+B/gbeB73eopiLW7dS2\nZ9DZ/XwwufbzZnLu5820vJ+XJTCSTtij8+eNDLWuxvUmEPotryV8AstVUwX4fWAHoV+3nT/L4fyc\nxgIfBW6Lj+8D12WsCcJ5Pl8i/AGcSvg//EwHa2r3up3Ydq79fCC59/OB5N7PB9Lyfl6WwHidMHhW\ndwIhKQdbZnpcJmXdTtdVP6wbBzwM3AcsL0FN5xC6YbYAy4BPAPdmrmlbnH4Y2x8i/ELlrGku8APg\nJ4Sug38h/Ow6UVMR6xa97Vz7eTM59/Nmcu7nzRS1nxcu5YS9xgHKs+kfoCzyZL/h1FUh7KTfbFMt\n7aip0Xm079sjw63pP4GZcb6b9lzSfjg1nUH4xs/hhP/HpcAXOlRTXTf7DlDm3s+b1ZVzP29WU6NO\n7+eD1ZRrP29W0+kUs593xEAn7C2OU92t8fV17JvORZ7sN9S6PkboP11LODReQ/8lT3LV1Og82vft\nkeHWdDrhk9c6wqecdnx7ZLg1/Tn9XzdcSvgU3YmaJhP6pN8BdhH6lycMsm67DLWunPv5YD+ruk7v\n54PVlGs/H6ymovZzSZIkSZIkSZIkSZIkSZIk6VB1WO4CpAJ8iHBV0OeGuP6XCN+V701cfiHhqgk/\nbnG5HsLv4JY2LS9JatEMwolIQ7UF+HALy98DfLKNyw11eUlSi74D/Ixw1nH98gtfAVYRjhy6Y9uR\nwL8SzlJeD1wCXA38AngeeHyAbd9IODN2HfAPwG8TrsXzI8JVSE8CPh/fay3hmkGHE67Rs/9y99Af\nCCnbbVz+N4H/iu/xLAee4SxJSvAR9j3CmAfcHufHEK4t9HHgD4BvNyx3VHzcAnQNsN0PE+5sVnd0\nfLw7bquucd2/Bb7YZLn689Tt1p+PJ1w/6MzYPgG7l9UBZblardRO+1/Sel6c1gCrgVnAKYRQ+V3C\np/uPAe8eZLtvE25zeSdwMfDzJu/568CThKOUzwCzB6mtle3Wn88C3oj/FgiXE99zkNqlYTMwNFr8\nPeHmPnMIVwy9m3AP5DmE4Pgq8FcH2cYewh3OHiLcc+HfG15rvC/BPcBVwG8QBqoPb7IchABI3e5g\nbVLhDAwdit6lv3sJ4D8I35o6Mj6fRrjP8RTCJ/v7ga8TwqO+/tEc6EjCbVwfIdwv+vQmy08g3Lp0\nHHA5/X/gh7td4rZejrXPjW1HYZeUJA3Z/YQjh/qg9zWELqLnCYPFJxG6qdYRuqpW0X958y8SxhT2\nH/SeTBhgXhe389nYfg5hwHp13O4fEwarnwVuof8+zvsvVx+TSN1u45jGXOBpwqD3D+gPQ0mSJEmS\nJEmSJEmSJEmSJEmSJEmSJI0E/w/hy3nGuRWPsgAAAABJRU5ErkJggg==\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "ht.PlotHist()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As an exercise, write a class named `DiffStdPermute` that extends `DiffMeansPermute` and overrides `TestStatistic` to compute the difference in standard deviations. Is the difference in standard deviations statistically significant?" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "std permute pregnancy length\n", "p-value = 0.155\n", "actual = 0.176049064229\n", "ts max = 0.44299505029\n" ] } ], "source": [ "class DiffStdPermute(DiffMeansPermute):\n", " \"\"\"Tests a difference in means by permutation.\"\"\"\n", "\n", " def TestStatistic(self, data):\n", " \"\"\"Computes the test statistic.\n", "\n", " data: data in whatever form is relevant \n", " \"\"\"\n", " group1, group2 = data\n", " test_stat = abs(group1.std() - group2.std())\n", " return test_stat\n", "\n", "data = (firsts.prglngth, others.prglngth)\n", "ht = DiffStdPermute(data)\n", "p_value = ht.PValue(iters=1000)\n", "print('\\nstd permute pregnancy length')\n", "print('p-value =', p_value)\n", "print('actual =', ht.actual)\n", "print('ts max =', ht.MaxTestStat())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's run DiffMeansPermute again to see if there is a difference in birth weight between first babies and others." ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "means permute birthweight\n", "p-value = 0.0\n", "actual = 0.124761184535\n", "ts max = 0.0917504268392\n" ] } ], "source": [ "data = (firsts.totalwgt_lb.dropna(), others.totalwgt_lb.dropna())\n", "ht = DiffMeansPermute(data)\n", "p_value = ht.PValue(iters=1000)\n", "print('\\nmeans permute birthweight')\n", "print('p-value =', p_value)\n", "print('actual =', ht.actual)\n", "print('ts max =', ht.MaxTestStat())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this case, after 1000 attempts, we never see a sample difference as big as the observed difference, so we conclude that the apparent effect is unlikely under the null hypothesis. Under normal circumstances, we can also make the inference that the apparent effect is unlikely to be caused by random sampling.\n", "\n", "One final note: in this case I would report that the p-value is less than 1/1000 or 0.001. I would not report that p=0, because the apparent effect is not impossible under the null hypothesis; just unlikely." ] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.10" } }, "nbformat": 4, "nbformat_minor": 0 }