{
  "cells": [
    {
      "metadata": {
        "run_control": {
          "frozen": false,
          "read_only": false
        },
        "slideshow": {
          "slide_type": "slide"
        }
      },
      "cell_type": "markdown",
      "source": "![Imgur](https://i.imgur.com/9f5keQe.png)"
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "* **Hypothesis testing is used when we need to make decisions concerning populations on the basis of only sample information **\n* **A variety of statistical tests are uese to help arrive at these decisions, but the steps are common for all tests **"
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "![Imgur](https://i.imgur.com/sz2xKlU.jpg)"
    },
    {
      "metadata": {
        "slideshow": {
          "slide_type": "slide"
        }
      },
      "cell_type": "markdown",
      "source": "![Imgur](https://i.imgur.com/jYN6L0X.png?1)"
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "## Approaches"
    },
    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "**Hypothesis Testing (Critical Value Approach)**  \n**Hypothesis Testing (P-Value Approach)** "
    },
    {
      "metadata": {
        "slideshow": {
          "slide_type": "slide"
        }
      },
      "cell_type": "markdown",
      "source": "### Critical Value approach"
    },
    {
      "metadata": {
        "slideshow": {
          "slide_type": "slide"
        }
      },
      "cell_type": "markdown",
      "source": "![Imgur](https://i.imgur.com/QGnkJsp.png)"
    },
    {
      "metadata": {
        "slideshow": {
          "slide_type": "subslide"
        }
      },
      "cell_type": "markdown",
      "source": "If test statistic < critical value: Fail to reject the null hypothesis.  \nIf test statistic >= critical value: Reject the null hypothesis.  "
    },
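    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "A minimal R sketch of the rule above for an upper-tailed one-sample t-test. The sample `wait_times`, the claimed mean `mu0 = 5` and the level `alpha = 0.05` are made-up values for illustration, not data from this notebook. The t statistic is computed by hand and compared with the critical value returned by `qt()`."
    },
    {
      "metadata": {},
      "cell_type": "code",
      "source": "# Hypothetical waiting times in minutes (illustrative values, not real data)\nwait_times <- c(5.8, 6.1, 4.9, 5.5, 6.4, 5.2, 5.9, 6.0, 5.3, 5.7)\nmu0 <- 5       # claimed population mean (assumption)\nalpha <- 0.05  # significance level (assumption)\n\nn <- length(wait_times)\n# One-sample t statistic: (sample mean - claimed mean) / standard error\nt_stat <- (mean(wait_times) - mu0) / (sd(wait_times) / sqrt(n))\n# Upper-tailed critical value of the t distribution with n - 1 degrees of freedom\nt_crit <- qt(1 - alpha, df = n - 1)\n\nif (t_stat >= t_crit) {\n  print(\"Reject the null hypothesis\")\n} else {\n  print(\"Fail to reject the null hypothesis\")\n}",
      "execution_count": null,
      "outputs": []
    },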
    {
      "metadata": {
        "slideshow": {
          "slide_type": "slide"
        }
      },
      "cell_type": "markdown",
      "source": "### p-value approach"
    },
    {
      "metadata": {
        "run_control": {
          "marked": false
        },
        "slideshow": {
          "slide_type": "slide"
        }
      },
      "cell_type": "markdown",
      "source": "**p- values **"
    },
    {
      "metadata": {
        "slideshow": {
          "slide_type": "subslide"
        }
      },
      "cell_type": "markdown",
      "source": "If p-value <= alpha: Reject the null hypothesis (i.e. significant result).  \nIf p-value > alpha: Fail to reject the null hypothesis (i.e. not signifiant result). "
    },
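    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "The p-value rule can be sketched the same way. The data, `mu0` and `alpha` below are illustrative assumptions, not values from this notebook. The upper-tailed p-value is computed from the t statistic with `pt()` and checked against the one reported by `t.test()`; the two should agree."
    },
    {
      "metadata": {},
      "cell_type": "code",
      "source": "# Hypothetical waiting times in minutes (illustrative values, not real data)\nwait_times <- c(5.8, 6.1, 4.9, 5.5, 6.4, 5.2, 5.9, 6.0, 5.3, 5.7)\nmu0 <- 5       # claimed population mean (assumption)\nalpha <- 0.05  # significance level (assumption)\n\nn <- length(wait_times)\nt_stat <- (mean(wait_times) - mu0) / (sd(wait_times) / sqrt(n))\n# Upper-tailed p-value: probability of a t statistic at least this large if H0 is true\np_value <- pt(t_stat, df = n - 1, lower.tail = FALSE)\n# Cross-check against the p-value computed by t.test()\np_check <- t.test(wait_times, mu = mu0, alternative = \"greater\")$p.value\nprint(all.equal(p_value, p_check))\n\nif (p_value <= alpha) {\n  print(\"Reject the null hypothesis\")\n} else {\n  print(\"Fail to reject the null hypothesis\")\n}",
      "execution_count": null,
      "outputs": []
    },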
    {
      "metadata": {
        "slideshow": {
          "slide_type": "subslide"
        }
      },
      "cell_type": "markdown",
      "source": "Statistical test rejection statement is in terms of the dichotomy of rejecting and fail to rejecting the null hypothesis.  \n\n<span style=\"color:red\">Rejecting the null hypothesis</span> means that there is <span style=\"color:red\">sufficient statistical evidence</span> that the null hypothesis does not look likely.     \n\n<span style=\"color:red\">Fail to reject the null hypothesis,</span> as in, there is <span style=\"color:red\">insufficient statistical evidence</span> to reject it."
    },
    {
      "metadata": {
        "slideshow": {
          "slide_type": "slide"
        }
      },
      "cell_type": "markdown",
      "source": "![Imgur](https://i.imgur.com/PUAA4mR.jpg?1)"
    },
    {
      "metadata": {
        "slideshow": {
          "slide_type": "slide"
        }
      },
      "cell_type": "markdown",
      "source": "**HOW TO DEFINE A NULL HYPOTHESIS:**\n* Every hypothesis test contains a set of two opposing statements, or hypotheses, about a population parameter\n* The first hypothesis is called the null hypothesis, denoted **H0**\n* The null hypothesis always states that **the population parameter is equal to the claimed value**\n* For example, if the claim is that the average waiting time to get an ordered item in a hotel is five minutes  \n![Imgur](https://i.imgur.com/S4SqUQ1.png)(That is, the population mean is 5 minutes.)\n\n"
    },
    {
      "metadata": {
        "slideshow": {
          "slide_type": "slide"
        }
      },
      "cell_type": "markdown",
      "source": "**HOW TO DEFINE AN ALTERNATIVE HYPOTHESIS**  \n* three possibilities exist for the second (or alternative) hypothesis, denoted **Ha**\n![Imgur](https://i.imgur.com/qOhBYaI.png)"
    },
    {
      "metadata": {
        "slideshow": {
          "slide_type": "subslide"
        }
      },
      "cell_type": "markdown",
      "source": "if you want to test whether the hotel is correct in claiming its average waiting time to get an ordered item five minutes and it **it doesn’t matter whether the actual average time is more or less than that, you use the not-equal-to alternative**. Your hypotheses for that test would be\n![Imgur](https://i.imgur.com/iqHyi7S.png)"
    },
    {
      "metadata": {
        "slideshow": {
          "slide_type": "subslide"
        }
      },
      "cell_type": "markdown",
      "source": "If you only want to see whether the time turns out to be **greater than what the hotel claim** (that is, whether the company is falsely advertising its quick prep time), **you use the greater-than alternative**, and your two hypotheses are\n![Imgur](https://i.imgur.com/7KxVK5D.png)"
    },
    {
      "metadata": {
        "slideshow": {
          "slide_type": "subslide"
        }
      },
      "cell_type": "markdown",
      "source": "If you think **the average waiting time for an ordered item can be in less than five minutes** (and could be marketed by the hotel as such). **The less-than alternative is the one you want**, and your two hypotheses would be\n![Imgur](https://i.imgur.com/ky7hnOw.png)"
    },
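    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "In R, the three alternatives map onto the `alternative` argument of `t.test()`. The sketch below uses made-up waiting times (an assumption for illustration, not data from this notebook) and the claimed mean of 5 minutes from the hotel example. For the t-test, the two one-sided p-values add up to 1 and the two-sided p-value is twice the smaller of them."
    },
    {
      "metadata": {},
      "cell_type": "code",
      "source": "# Hypothetical waiting times in minutes (illustrative values, not real data)\nwait_times <- c(5.8, 6.1, 4.9, 5.5, 6.4, 5.2, 5.9, 6.0, 5.3, 5.7)\n\n# H0: mean = 5 in all three cases; only the alternative changes\nnot_equal <- t.test(wait_times, mu = 5, alternative = \"two.sided\")  # Ha: mean != 5\ngreater   <- t.test(wait_times, mu = 5, alternative = \"greater\")    # Ha: mean > 5\nless      <- t.test(wait_times, mu = 5, alternative = \"less\")       # Ha: mean < 5\n\n# Collect the three p-values side by side\nc(not_equal = not_equal$p.value, greater = greater$p.value, less = less$p.value)",
      "execution_count": null,
      "outputs": []
    },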
    {
      "metadata": {
        "slideshow": {
          "slide_type": "slide"
        }
      },
      "cell_type": "markdown",
      "source": "### Type I and Type II Errors"
    },
    {
      "metadata": {
        "slideshow": {
          "slide_type": "subslide"
        }
      },
      "cell_type": "markdown",
      "source": "![Imgur](https://i.imgur.com/yYmql3R.png?1)"
    },
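    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "A Type I error is rejecting a null hypothesis that is actually true. A small simulation can illustrate that its long-run rate is controlled by alpha. This is a sketch under assumed settings (normal data, sample size 30, 2000 repetitions, an arbitrary seed), none of which come from this notebook. Every sample is drawn with true mean 5, so each rejection of H0: mean = 5 is a Type I error, and the rejection share should land close to `alpha`."
    },
    {
      "metadata": {},
      "cell_type": "code",
      "source": "set.seed(1)     # arbitrary seed so the simulation is reproducible\nalpha <- 0.05   # significance level (assumption)\nn_sims <- 2000  # number of simulated samples (assumption)\n\n# Each sample has true mean 5, so H0: mean = 5 holds in every repetition\np_values <- replicate(n_sims, t.test(rnorm(30, mean = 5, sd = 1), mu = 5)$p.value)\n\n# Share of samples in which the true H0 is rejected: estimated Type I error rate\nmean(p_values <= alpha)",
      "execution_count": null,
      "outputs": []
    },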
    {
      "metadata": {
        "slideshow": {
          "slide_type": "slide"
        }
      },
      "cell_type": "markdown",
      "source": "![Imgur](https://i.imgur.com/czKU38H.png)"
    },
    {
      "metadata": {
        "slideshow": {
          "slide_type": "slide"
        }
      },
      "cell_type": "markdown",
      "source": "![Imgur](https://i.imgur.com/gfyfuWJ.png)"
    },
    {
      "metadata": {
        "slideshow": {
          "slide_type": "slide"
        }
      },
      "cell_type": "markdown",
      "source": "![Imgur](https://i.imgur.com/eugQbwB.png)"
    },
    {
      "metadata": {
        "slideshow": {
          "slide_type": "slide"
        }
      },
      "cell_type": "markdown",
      "source": "install.packages(\"ISLR\",repos=\"https://cran.r-project.org\")"
    },
    {
      "metadata": {
        "run_control": {
          "frozen": false,
          "read_only": false
        },
        "scrolled": true,
        "slideshow": {
          "slide_type": "subslide"
        },
        "trusted": false
      },
      "cell_type": "code",
      "source": "library(ISLR)",
      "execution_count": 1,
      "outputs": [
        {
          "name": "stderr",
          "output_type": "stream",
          "text": "Warning message:\n\"package 'ISLR' was built under R version 3.4.3\""
        }
      ]
    },
    {
      "metadata": {
        "run_control": {
          "frozen": false,
          "read_only": false
        },
        "scrolled": true,
        "slideshow": {
          "slide_type": "fragment"
        },
        "trusted": false
      },
      "cell_type": "code",
      "source": "dim(Wage)",
      "execution_count": 2,
      "outputs": [
        {
          "data": {
            "text/html": "<ol class=list-inline>\n\t<li>3000</li>\n\t<li>11</li>\n</ol>\n",
            "text/latex": "\\begin{enumerate*}\n\\item 3000\n\\item 11\n\\end{enumerate*}\n",
            "text/markdown": "1. 3000\n2. 11\n\n\n",
            "text/plain": "[1] 3000   11"
          },
          "metadata": {},
          "output_type": "display_data"
        }
      ]
    },
    {
      "metadata": {
        "collapsed": true,
        "run_control": {
          "frozen": false,
          "read_only": false
        },
        "slideshow": {
          "slide_type": "subslide"
        },
        "trusted": false
      },
      "cell_type": "code",
      "source": "?Wage",
      "execution_count": 3,
      "outputs": []
    },
    {
      "metadata": {
        "collapsed": true,
        "run_control": {
          "frozen": false,
          "read_only": false
        },
        "slideshow": {
          "slide_type": "subslide"
        },
        "trusted": false
      },
      "cell_type": "code",
      "source": "try(data(package=\"ISLR\"))",
      "execution_count": 4,
      "outputs": []
    },
    {
      "metadata": {
        "run_control": {
          "frozen": false,
          "read_only": false
        },
        "scrolled": true,
        "slideshow": {
          "slide_type": "subslide"
        },
        "trusted": false
      },
      "cell_type": "code",
      "source": "head(Wage)",
      "execution_count": 5,
      "outputs": [
        {
          "data": {
            "text/html": "<table>\n<thead><tr><th></th><th scope=col>year</th><th scope=col>age</th><th scope=col>maritl</th><th scope=col>race</th><th scope=col>education</th><th scope=col>region</th><th scope=col>jobclass</th><th scope=col>health</th><th scope=col>health_ins</th><th scope=col>logwage</th><th scope=col>wage</th></tr></thead>\n<tbody>\n\t<tr><th scope=row>231655</th><td>2006                                                      </td><td>18                                                        </td><td>1. Never Married                                          </td><td>1. White                                                  </td><td><span style=white-space:pre-wrap>1. &lt; HS Grad   </span></td><td>2. Middle Atlantic                                        </td><td>1. Industrial                                             </td><td><span style=white-space:pre-wrap>1. &lt;=Good     </span> </td><td>2. No                                                     </td><td>4.318063                                                  </td><td> 75.04315                                                 </td></tr>\n\t<tr><th scope=row>86582</th><td>2004              </td><td>24                </td><td>1. Never Married  </td><td>1. White          </td><td>4. College Grad   </td><td>2. Middle Atlantic</td><td>2. Information    </td><td>2. &gt;=Very Good </td><td>2. No             </td><td>4.255273          </td><td> 70.47602         </td></tr>\n\t<tr><th scope=row>161300</th><td>2003                                                     </td><td>45                                                       </td><td><span style=white-space:pre-wrap>2. Married      </span> </td><td>1. White                                                 </td><td>3. Some College                                          </td><td>2. Middle Atlantic                                       </td><td>1. Industrial                                            </td><td><span style=white-space:pre-wrap>1. 
&lt;=Good     </span></td><td>1. Yes                                                   </td><td>4.875061                                                 </td><td>130.98218                                                </td></tr>\n\t<tr><th scope=row>155159</th><td>2003                                                    </td><td>43                                                      </td><td><span style=white-space:pre-wrap>2. Married      </span></td><td>3. Asian                                                </td><td>4. College Grad                                         </td><td>2. Middle Atlantic                                      </td><td>2. Information                                          </td><td>2. &gt;=Very Good                                       </td><td>1. Yes                                                  </td><td>5.041393                                                </td><td>154.68529                                               </td></tr>\n\t<tr><th scope=row>11443</th><td>2005                                                     </td><td>50                                                       </td><td><span style=white-space:pre-wrap>4. Divorced     </span> </td><td>1. White                                                 </td><td><span style=white-space:pre-wrap>2. HS Grad     </span>  </td><td>2. Middle Atlantic                                       </td><td>2. Information                                           </td><td><span style=white-space:pre-wrap>1. &lt;=Good     </span></td><td>1. Yes                                                   </td><td>4.318063                                                 </td><td> 75.04315                                                </td></tr>\n\t<tr><th scope=row>376662</th><td>2008                                                    </td><td>54                                                      </td><td><span style=white-space:pre-wrap>2. Married      </span></td><td>1. 
White                                                </td><td>4. College Grad                                         </td><td>2. Middle Atlantic                                      </td><td>2. Information                                          </td><td>2. &gt;=Very Good                                       </td><td>1. Yes                                                  </td><td>4.845098                                                </td><td>127.11574                                               </td></tr>\n</tbody>\n</table>\n",
            "text/latex": "\\begin{tabular}{r|lllllllllll}\n  & year & age & maritl & race & education & region & jobclass & health & health\\_ins & logwage & wage\\\\\n\\hline\n\t231655 & 2006               & 18                 & 1. Never Married   & 1. White           & 1. < HS Grad       & 2. Middle Atlantic & 1. Industrial      & 1. <=Good          & 2. No              & 4.318063           &  75.04315         \\\\\n\t86582 & 2004               & 24                 & 1. Never Married   & 1. White           & 4. College Grad    & 2. Middle Atlantic & 2. Information     & 2. >=Very Good     & 2. No              & 4.255273           &  70.47602         \\\\\n\t161300 & 2003               & 45                 & 2. Married         & 1. White           & 3. Some College    & 2. Middle Atlantic & 1. Industrial      & 1. <=Good          & 1. Yes             & 4.875061           & 130.98218         \\\\\n\t155159 & 2003               & 43                 & 2. Married         & 3. Asian           & 4. College Grad    & 2. Middle Atlantic & 2. Information     & 2. >=Very Good     & 1. Yes             & 5.041393           & 154.68529         \\\\\n\t11443 & 2005               & 50                 & 4. Divorced        & 1. White           & 2. HS Grad         & 2. Middle Atlantic & 2. Information     & 1. <=Good          & 1. Yes             & 4.318063           &  75.04315         \\\\\n\t376662 & 2008               & 54                 & 2. Married         & 1. White           & 4. College Grad    & 2. Middle Atlantic & 2. Information     & 2. >=Very Good     & 1. Yes             & 4.845098           & 127.11574         \\\\\n\\end{tabular}\n",
            "text/markdown": "\n| <!--/--> | year | age | maritl | race | education | region | jobclass | health | health_ins | logwage | wage | \n|---|---|---|---|---|---|\n| 231655 | 2006               | 18                 | 1. Never Married   | 1. White           | 1. < HS Grad       | 2. Middle Atlantic | 1. Industrial      | 1. <=Good          | 2. No              | 4.318063           |  75.04315          | \n| 86582 | 2004               | 24                 | 1. Never Married   | 1. White           | 4. College Grad    | 2. Middle Atlantic | 2. Information     | 2. >=Very Good     | 2. No              | 4.255273           |  70.47602          | \n| 161300 | 2003               | 45                 | 2. Married         | 1. White           | 3. Some College    | 2. Middle Atlantic | 1. Industrial      | 1. <=Good          | 1. Yes             | 4.875061           | 130.98218          | \n| 155159 | 2003               | 43                 | 2. Married         | 3. Asian           | 4. College Grad    | 2. Middle Atlantic | 2. Information     | 2. >=Very Good     | 1. Yes             | 5.041393           | 154.68529          | \n| 11443 | 2005               | 50                 | 4. Divorced        | 1. White           | 2. HS Grad         | 2. Middle Atlantic | 2. Information     | 1. <=Good          | 1. Yes             | 4.318063           |  75.04315          | \n| 376662 | 2008               | 54                 | 2. Married         | 1. White           | 4. College Grad    | 2. Middle Atlantic | 2. Information     | 2. >=Very Good     | 1. Yes             | 4.845098           | 127.11574          | \n\n\n",
            "text/plain": "       year age maritl           race     education       region            \n231655 2006 18  1. Never Married 1. White 1. < HS Grad    2. Middle Atlantic\n86582  2004 24  1. Never Married 1. White 4. College Grad 2. Middle Atlantic\n161300 2003 45  2. Married       1. White 3. Some College 2. Middle Atlantic\n155159 2003 43  2. Married       3. Asian 4. College Grad 2. Middle Atlantic\n11443  2005 50  4. Divorced      1. White 2. HS Grad      2. Middle Atlantic\n376662 2008 54  2. Married       1. White 4. College Grad 2. Middle Atlantic\n       jobclass       health         health_ins logwage  wage     \n231655 1. Industrial  1. <=Good      2. No      4.318063  75.04315\n86582  2. Information 2. >=Very Good 2. No      4.255273  70.47602\n161300 1. Industrial  1. <=Good      1. Yes     4.875061 130.98218\n155159 2. Information 2. >=Very Good 1. Yes     5.041393 154.68529\n11443  2. Information 1. <=Good      1. Yes     4.318063  75.04315\n376662 2. Information 2. >=Very Good 1. Yes     4.845098 127.11574"
          },
          "metadata": {},
          "output_type": "display_data"
        }
      ]
    },
    {
      "metadata": {
        "run_control": {
          "frozen": false,
          "read_only": false
        },
        "scrolled": true,
        "slideshow": {
          "slide_type": "subslide"
        },
        "trusted": false
      },
      "cell_type": "code",
      "source": "# test the hypothesis whether the average wage of male workers in the Mid-Atlantic region is greater than or equal to 50, 5500 \nt.test(Wage$wage,alternative = \"less\", mu = 250.7036)\noptions(scipen = 999)",
      "execution_count": 6,
      "outputs": [
        {
          "data": {
            "text/plain": "\n\tOne Sample t-test\n\ndata:  Wage$wage\nt = -182.45, df = 2999, p-value < 2.2e-16\nalternative hypothesis: true mean is less than 250.7036\n95 percent confidence interval:\n     -Inf 112.9571\nsample estimates:\nmean of x \n 111.7036 \n"
          },
          "metadata": {},
          "output_type": "display_data"
        }
      ]
    },
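    {
      "metadata": {},
      "cell_type": "markdown",
      "source": "The result of the test above can be read with either approach. The sketch below assumes a significance level of `alpha = 0.05`, which the notebook does not state; everything else comes from the `t.test()` call above. Given the reported t = -182.45 and p-value < 2.2e-16, both checks should lead to rejecting the null hypothesis."
    },
    {
      "metadata": {},
      "cell_type": "code",
      "source": "library(ISLR)\nalpha <- 0.05  # significance level (assumption, not stated in the notebook)\nres <- t.test(Wage$wage, alternative = \"less\", mu = 250.7036)\n\n# p-value approach: reject H0 when the p-value is at most alpha\nres$p.value <= alpha\n\n# Critical value approach for a lower-tailed test:\n# reject H0 when the t statistic is at or below the lower critical value\nt_crit <- qt(alpha, df = unname(res$parameter))\nunname(res$statistic) <= t_crit",
      "execution_count": null,
      "outputs": []
    },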
    {
      "metadata": {
        "slideshow": {
          "slide_type": "slide"
        }
      },
      "cell_type": "markdown",
      "source": "## Practice"
    },
    {
      "metadata": {
        "slideshow": {
          "slide_type": "fragment"
        }
      },
      "cell_type": "markdown",
      "source": "http://www.statstutor.ac.uk/types/tests-and-quizzes/confidence-intervals-and-hypothesis-testing/"
    }
  ],
  "metadata": {
    "celltoolbar": "Slideshow",
    "hide_input": false,
    "kernelspec": {
      "name": "r",
      "display_name": "R",
      "language": "R"
    },
    "language_info": {
      "mimetype": "text/x-r-source",
      "name": "R",
      "pygments_lexer": "r",
      "version": "3.4.1",
      "file_extension": ".r",
      "codemirror_mode": "r"
    },
    "nav_menu": {},
    "toc": {
      "nav_menu": {},
      "number_sections": true,
      "sideBar": true,
      "skip_h1_title": false,
      "base_numbering": 1,
      "title_cell": "Table of Contents",
      "title_sidebar": "Contents",
      "toc_cell": false,
      "toc_position": {},
      "toc_section_display": "block",
      "toc_window_display": false
    }
  },
  "nbformat": 4,
  "nbformat_minor": 2
}