{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# P-Hacking\n", "\n", "## Author [Jean-François Puget](https://www.ibm.com/developerworks/community/blogs/jfp)\n", "\n", "This notebook contains the code used in my blog post [Green dice are loaded (welcome to p-hacking)](https://www.ibm.com/developerworks/community/blogs/jfp/entry/green_dice_are_loaded_welcome_to_p_hacking)\n", "\n", "It demonstrates how to mislead people about a scientific experiment by hacking p-values, following the methodology from this [xkcd comic](http://xkcd.com/882/).\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, some useful imports." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "from numpy.random import random_integers, seed" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The experiment\n", "\n", "We are given dice in 20 colors, and we roll a die of each color 1,000 times. We report the number of sixes rolled for each color.\n", "We set the random seed to make sure the results are reproducible." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", " | Number of Six | \n", "
---|---|
Purple | \n", "151 | \n", "
Brown | \n", "167 | \n", "
Pink | \n", "158 | \n", "
Blue | \n", "167 | \n", "
Teal | \n", "181 | \n", "
Salmon | \n", "162 | \n", "
Red | \n", "170 | \n", "
Turquoise | \n", "161 | \n", "
Magenta | \n", "165 | \n", "
Yellow | \n", "180 | \n", "
Grey | \n", "172 | \n", "
Tan | \n", "164 | \n", "
Cyan | \n", "181 | \n", "
Green | \n", "188 | \n", "
Mauve | \n", "165 | \n", "
Beige | \n", "172 | \n", "
Lilac | \n", "178 | \n", "
Black | \n", "176 | \n", "
Peach | \n", "173 | \n", "
Orange | \n", "157 | \n", "