{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Here's Allen Downey's article: http://allendowney.blogspot.fr/2011/08/jimmy-nut-company-problem.html" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> “Jimmy Nut Company advertises that their nut mix contains 40% cashews, 15% brazil nuts, 20% almonds, and only 25% peanuts. The truth in advertising investigators took a random sample (of size 20 lb) of the nut mix and found the distribution to be as follows:\n", "\n", ">Cashews Brazil Nuts Almonds Peanuts\n", "\n", ">6 lb 3 lb 5 lb 6 lb\n", "\n", ">At the 0.01 level of significance, is the claim made by Jimmy Nuts true?”" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# Approximate number of nuts per ounce for each type\n", "# (midpoint of the range where a range is given).\n", "weights = {'Cashew': (16+18)/2, \n", "           'Brazil nut': (6+8)/2, \n", "           'Almonds': (20+24)/2, \n", "           'Peanuts': 28.}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Convert each weight in the sample (in pounds) to an approximate number of nuts, using the number of nuts per ounce and 16 ounces per pound." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "def ConvertToCount(sample, count_per):\n", "    \"\"\"Convert from weight to count, modifying sample in place.\n", "\n", "    sample: Hist that maps from category to weight in pounds\n", "    count_per: dict that maps from category to count per ounce\n", "    \"\"\"\n", "    for value, count in sample.Items():\n", "        # 16 ounces per pound\n", "        sample.Mult(value, 16 * count_per[value])" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "sample = dict(cashew=6, brazil=3, almond=5, peanut=6)\n", "count_per = dict(cashew=17, brazil=7, \n", "                 almond=22, peanut=28)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'almond': 22, 'brazil': 7, 'cashew': 17, 'peanut': 28}" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "count_per" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "from 
thinkstats2 import MakeHistFromDict, MakePmfFromDict\n", "\n", "observed = MakeHistFromDict(sample)\n", "ConvertToCount(observed, count_per)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Hist({'cashew': 1632, 'brazil': 336, 'almond': 1760, 'peanut': 2688})" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "observed" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "advertised = dict(cashew=40, brazil=15, \n", " almond=20, peanut=25)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "expected = MakePmfFromDict(advertised)\n", "ConvertToCount(expected, count_per)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Pmf({'cashew': 108.80000000000001, 'brazil': 16.8, 'almond': 70.4, 'peanut': 112.0})" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "expected" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "308.0" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "expected.Normalize(observed.Total())" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Pmf({'cashew': 2266.431168831169, 'brazil': 349.9636363636364, 'almond': 1466.514285714286, 'peanut': 2333.090909090909})" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "expected" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "import pandas as pd" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| | expected | observed | error |
|---|---|---|---|
| almond | 1466.514286 | 1760.0 | 20.012469 |
| brazil | 349.963636 | 336.0 | -3.990025 |
| cashew | 2266.431169 | 1632.0 | -27.992519 |
| peanut | 2333.090909 | 2688.0 | 15.211970 |
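
The numbers in the table above can be reproduced without `thinkstats2`. The following is a minimal sketch in plain Python. It assumes that the `error` column is the percent difference between observed and expected counts, which is consistent with the values shown but is not stated in the notebook. The names `to_counts`, `sample_lb`, `observed_counts`, `expected_counts`, and `error_pct` are introduced here for illustration only.

```python
OUNCES_PER_POUND = 16

# Values copied from the notebook cells above.
sample_lb = dict(cashew=6, brazil=3, almond=5, peanut=6)       # weight in pounds
count_per = dict(cashew=17, brazil=7, almond=22, peanut=28)    # nuts per ounce
advertised = dict(cashew=40, brazil=15, almond=20, peanut=25)  # percent


def to_counts(weights, count_per):
    """Return a new dict of estimated nut counts (hypothetical helper)."""
    return {name: weight * OUNCES_PER_POUND * count_per[name]
            for name, weight in weights.items()}


observed_counts = to_counts(sample_lb, count_per)

# Mirror the notebook's steps: normalize the advertised percentages,
# convert them to relative counts, then rescale to the observed total.
total_pct = sum(advertised.values())
relative = to_counts({k: v / total_pct for k, v in advertised.items()}, count_per)
scale = sum(observed_counts.values()) / sum(relative.values())
expected_counts = {name: value * scale for name, value in relative.items()}

# Assumption: "error" is the percent difference relative to the expected count.
error_pct = {
    name: 100 * (observed_counts[name] - expected_counts[name]) / expected_counts[name]
    for name in observed_counts
}

for name in sorted(observed_counts):
    print(f"{name:8s} {expected_counts[name]:12.6f} "
          f"{observed_counts[name]:8d} {error_pct[name]:11.6f}")
```

Unlike `ConvertToCount`, which updates the `Hist` in place, this helper returns a new dict, so the original weights stay available for comparison.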