{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Statistics example problems\n",
    "\n",
    "[1. ANOVA - geological example](#1.-ANOVA-geological-example)\n",
    "\n",
    "[2. Non-parametric tests](#2.-Non-parametric-tests)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "from scipy import stats\n",
    "import pandas as pd\n",
    "import pingouin as pg"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1. ANOVA geological example\n",
    "\n",
    "#### a. Implementation\n",
    "\n",
    "Data come from Table 10.1 of McKillup and Dyar, Geostatistics Explained, Cambridge University Press, 2010 (excerpt available on class Google Drive). Values represent the weight percent of MgO present in tourmalines from three locations in Maine. \n",
    "\n",
    "Use two different methods to test whether there is a significant difference in the mean MgO content between the three different sites.\n",
    "\n",
    "##### Method 1: Scipy"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = pd.read_csv('data/MgO_example/MgO_Maine.csv') # dataframe"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Mount Mica</th>\n",
       "      <th>Sebago Batholith</th>\n",
       "      <th>Black Mountain</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>7</td>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>8</td>\n",
       "      <td>5</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>10</td>\n",
       "      <td>7</td>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>11</td>\n",
       "      <td>8</td>\n",
       "      <td>5</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   Mount Mica  Sebago Batholith  Black Mountain\n",
       "0           7                 4               1\n",
       "1           8                 5               2\n",
       "2          10                 7               4\n",
       "3          11                 8               5"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "F_onewayResult(statistic=10.799999999999999, pvalue=0.004058306777237465)"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "stats.f_oneway(df['Mount Mica'],df['Sebago Batholith'],df['Black Mountain'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Method 2: Pingouin"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [],
   "source": [
    "df2 = pd.read_csv('data/MgO_example/MgO_Maine_list.csv')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>MgO</th>\n",
       "      <th>Location</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>7</td>\n",
       "      <td>Mount Mica</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>8</td>\n",
       "      <td>Mount Mica</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>10</td>\n",
       "      <td>Mount Mica</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>11</td>\n",
       "      <td>Mount Mica</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>4</td>\n",
       "      <td>Sebago Batholith</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>5</td>\n",
       "      <td>Sebago Batholith</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>7</td>\n",
       "      <td>Sebago Batholith</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>8</td>\n",
       "      <td>Sebago Batholith</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>1</td>\n",
       "      <td>Black Mountain</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>2</td>\n",
       "      <td>Black Mountain</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>4</td>\n",
       "      <td>Black Mountain</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>5</td>\n",
       "      <td>Black Mountain</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    MgO          Location\n",
       "0     7        Mount Mica\n",
       "1     8        Mount Mica\n",
       "2    10        Mount Mica\n",
       "3    11        Mount Mica\n",
       "4     4  Sebago Batholith\n",
       "5     5  Sebago Batholith\n",
       "6     7  Sebago Batholith\n",
       "7     8  Sebago Batholith\n",
       "8     1    Black Mountain\n",
       "9     2    Black Mountain\n",
       "10    4    Black Mountain\n",
       "11    5    Black Mountain"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Source</th>\n",
       "      <th>ddof1</th>\n",
       "      <th>ddof2</th>\n",
       "      <th>F</th>\n",
       "      <th>p-unc</th>\n",
       "      <th>np2</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Location</td>\n",
       "      <td>2</td>\n",
       "      <td>9</td>\n",
       "      <td>10.8</td>\n",
       "      <td>0.004058</td>\n",
       "      <td>0.705882</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     Source  ddof1  ddof2     F     p-unc       np2\n",
       "0  Location      2      9  10.8  0.004058  0.705882"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pg.anova(data=df2,dv='MgO',between='Location')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Post-hoc test"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>A</th>\n",
       "      <th>B</th>\n",
       "      <th>mean(A)</th>\n",
       "      <th>mean(B)</th>\n",
       "      <th>diff</th>\n",
       "      <th>se</th>\n",
       "      <th>T</th>\n",
       "      <th>p-tukey</th>\n",
       "      <th>hedges</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Black Mountain</td>\n",
       "      <td>Mount Mica</td>\n",
       "      <td>3</td>\n",
       "      <td>9</td>\n",
       "      <td>-6</td>\n",
       "      <td>1.290994</td>\n",
       "      <td>-4.64758</td>\n",
       "      <td>0.001000</td>\n",
       "      <td>-2.857683</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Black Mountain</td>\n",
       "      <td>Sebago Batholith</td>\n",
       "      <td>3</td>\n",
       "      <td>6</td>\n",
       "      <td>-3</td>\n",
       "      <td>1.290994</td>\n",
       "      <td>-2.32379</td>\n",
       "      <td>0.061925</td>\n",
       "      <td>-1.428841</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Mount Mica</td>\n",
       "      <td>Sebago Batholith</td>\n",
       "      <td>9</td>\n",
       "      <td>6</td>\n",
       "      <td>3</td>\n",
       "      <td>1.290994</td>\n",
       "      <td>2.32379</td>\n",
       "      <td>0.061925</td>\n",
       "      <td>1.428841</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                A                 B  mean(A)  mean(B)  diff        se  \\\n",
       "0  Black Mountain        Mount Mica        3        9    -6  1.290994   \n",
       "1  Black Mountain  Sebago Batholith        3        6    -3  1.290994   \n",
       "2      Mount Mica  Sebago Batholith        9        6     3  1.290994   \n",
       "\n",
       "         T   p-tukey    hedges  \n",
       "0 -4.64758  0.001000 -2.857683  \n",
       "1 -2.32379  0.061925 -1.428841  \n",
       "2  2.32379  0.061925  1.428841  "
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pg.pairwise_tukey(data=df2,dv='MgO',between='Location')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### b. ANOVA interpretation\n",
    "\n",
    "Write a summary of your interpretation of the statistical results conducted above. Address the following questions.\n",
    "\n",
    "* What is the null hypothesis being tested?\n",
    "* Should the null hypothesis be accepted or rejected?\n",
    "* What does the post-hoc test tell you?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2. Non-parametric tests"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### a. Wilcoxon signed-rank test: implementation\n",
    "\n",
    "This example uses data from:\n",
    "http://www.biostathandbook.com/wilcoxonsignedrank.html\n",
    "\n",
    "The data are observations of aluminum content in 13 different poplar clones in a polluted area. The scientific question is whether there is a significant change in the aluminum content between August and November."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_al = pd.read_csv('data/wilcoxon_example/Al_content.csv',\n",
    "                   delimiter='\\t')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Clone</th>\n",
       "      <th>August</th>\n",
       "      <th>November</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Columbia River</td>\n",
       "      <td>18.3</td>\n",
       "      <td>12.7</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Fritzi Pauley</td>\n",
       "      <td>13.3</td>\n",
       "      <td>11.1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Hazendans</td>\n",
       "      <td>16.5</td>\n",
       "      <td>15.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Primo</td>\n",
       "      <td>12.6</td>\n",
       "      <td>12.7</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Raspalje</td>\n",
       "      <td>9.5</td>\n",
       "      <td>10.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Hoogvorst</td>\n",
       "      <td>13.6</td>\n",
       "      <td>15.6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Balsam Spire</td>\n",
       "      <td>8.1</td>\n",
       "      <td>11.2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>Gibecq</td>\n",
       "      <td>8.9</td>\n",
       "      <td>14.2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Beaupre</td>\n",
       "      <td>10.0</td>\n",
       "      <td>16.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Unal</td>\n",
       "      <td>8.3</td>\n",
       "      <td>15.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10</th>\n",
       "      <td>Trichobel</td>\n",
       "      <td>7.9</td>\n",
       "      <td>19.9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11</th>\n",
       "      <td>Gaver</td>\n",
       "      <td>8.1</td>\n",
       "      <td>20.4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>Wolterson</td>\n",
       "      <td>13.4</td>\n",
       "      <td>36.8</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             Clone  August  November\n",
       "0   Columbia River    18.3      12.7\n",
       "1    Fritzi Pauley    13.3      11.1\n",
       "2        Hazendans    16.5      15.3\n",
       "3            Primo    12.6      12.7\n",
       "4         Raspalje     9.5      10.5\n",
       "5        Hoogvorst    13.6      15.6\n",
       "6     Balsam Spire     8.1      11.2\n",
       "7           Gibecq     8.9      14.2\n",
       "8          Beaupre    10.0      16.3\n",
       "9             Unal     8.3      15.5\n",
       "10       Trichobel     7.9      19.9\n",
       "11           Gaver     8.1      20.4\n",
       "12       Wolterson    13.4      36.8"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_al"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAWoAAAD4CAYAAADFAawfAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMywgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/Il7ecAAAACXBIWXMAAAsTAAALEwEAmpwYAAAKn0lEQVR4nO3cf4jk913H8dfbu0hLG2g1awlN1hURoQRN5Ih/REoNRdJGrApKA0oF4fzDQgqCRv+x9a8oWvxHxNMGo9aWQFotDf4I2FADmnoX05p4LZZyamzIJYTS3D9K0rd/7Fxyve7eTpKb3ffOPR6w3OzM92beHz63T+a+M7PV3QFgrm876AEAuDShBhhOqAGGE2qA4YQaYLijq7jTa665pre2tlZx1wBr6dSpU89298ZOt60k1FtbWzl58uQq7hpgLVXVf+52m1MfAMMJNcBwQg0wnFADDCfUAMMJNcBwS709r6rOJHk+yYtJXujuY6scCoCXvZL3Uf9odz+7skkA2JFTHwDDLfuMupP8fVV1kj/q7hMXH1BVx5McT5LNzc1XPdDWXQ+86r/7Wpy5+/YDeVyAvSz7jPqW7v6hJO9K8stV9faLD+juE919rLuPbWzs+HF1AF6FpULd3V9d/Hk2ySeT3LzKoQB42Z6hrqo3VNXV5y8n+bEkj696MAC2LXOO+i1JPllV54//y+7+25VOBcBL9gx1d38lyQ/uwywA7MDb8wCGE2qA4YQaYDihBhhOqAGGE2qA4YQaYDihBhhOqAGGE2qA4YQaYDihBhhOqAGGE2qA4YQaYDihBhhOqAGGE2qA4YQaYDihBhhOqAGGE2qA4YQaYDihBhhOqAGGE2qA4YQaYDihBhhOqAGGE2qA4YQaYDihBhhu6VBX1ZGq+teq+vQqBwLgm72SZ9R3Jjm9qkEA2NlSoa6q65LcnuRPVjsOABc7uuRxv5/kV5NcvdsBVXU8yfEk2dzcfM2D7betux44sMc+c/ftB/bYwHx7PqOuqh9Pcra7T13quO4+0d3HuvvYxsbGZRsQ4Eq3zKmPW5L8RFWdSfLxJLdW1V+sdCoAXrJnqLv717v7uu7eSvLeJP/Q3T+38skASOJ91ADjLftiYpKkux9K8tBKJgFgR55RAwwn1ADDCTXAcEINMJxQAwwn1ADDCTXAcEINMJxQAwwn1ADDCTXAcEINMJxQAwwn1ADDCTXAcEINMJxQAwwn1ADDCTXAcEINMJxQAwwn1ADDCTXAcEINMJxQAwwn1ADDCTXAcEINMJxQAwwn1ADDCTXAcEINMNyeoa6q11XV56rq81X1RFV9aD8GA2Db0SWO+d8kt3b3uaq6KsnDVfU33f3PK54NgCwR6u7uJOcW3161+OpVDgXAy5Y6R11VR6rqsSRnkzzY3Y+sdCoAXrJUqLv7xe6+Mcl1SW6uqhsuPqaqjlfVyao6+cwzz1zmMQGuXK/oXR/d/bUkDyW5bYfbTnT3se4+trGxcXmmA2Cpd31sVNWbFpdfn+SdSb644rkAWFjmXR/XJrm3qo5kO+z3dfenVzsWAOct866PLyS5aR9mAWAHPpkIMJxQAwwn1ADDCTXAcEINMJxQAwwn1ADDCTXAcEINMJxQAwwn1ADDCTXAcEINMJxQAwwn1ADDCTXAcEINMJxQAwwn1ADDCTXAcEINMJxQAwwn1ADDCTXAcEINMJxQAwwn1ADDCTXAcEINMJxQAwwn1ADDCTXAcEINMNyeoa6q66vqM1V1uqqeqKo792MwALYdXeKYF5L8Snc/WlVXJzlVVQ9297+veDYAssQz6u5+qrsfXVx+PsnpJG9d9WAAbFvmGfVLqmoryU1JHtnhtuNJjifJ5ubm5ZiNFdu664EDe+wzd99+YI8Nh83SLyZW1RuT3J/kA9399Ytv7+4T3X2su49tbGxczhkBrmhLhbqqrsp2pD/a3Z9Y7UgAXGiZd31Uko8kOd3dH179SABcaJln1Lck+fkkt1bVY4uvd694LgAW9nwxsbsfTlL7MAsAO/DJRIDhhBpgOKEGGE6oAYYTaoDhhBpgOKEGGE6oAYYTaoDhhBpgOKEGGE6oAYYTaoDhhBpgOKEGGE6oAYYTaoDhhBpgOKEGGE6oAYYTaoDhhBpgOKEGGE6oAYYTaoDhhBpgOKEGGE6oAYYTaoDhhBpgOKEGGE6oAYbbM9RVdU9Vna2qx/djIAC+2TLPqP80yW0rngOAXewZ6u7+bJLn9mEWAHZw9HLdUVUdT3I8STY3Ny/X3V4Rtu564KBHgLVyUD9TZ+6+fSX3e9leTOzuE919rLuPbWxsXK67BbjiedcHwHBCDTDcMm/P+1iSf0ry/VX1ZFX94urHAuC8PV9M7O479mMQAHbm1AfAcEINMJxQAwwn1ADDCTXAcEINMJxQAwwn1ADDCTXAcEINMJxQAwwn1ADDCTXAcEINMJxQAwwn1ADDCTXAcEINMJxQAwwn1ADDCTXAcEINMJxQAwwn1ADDCTXAcEINMJxQAwwn1ADDCTXAcEINMJxQAwwn1ADDCTXAcEuFuqpuq6ovVdWXq+quVQ8FwMv2DHVVHUnyB0neleRtSe6oqretejAAti3zjPrmJF/u7q909/8l+XiS96x2LADOO7rEMW9N8t8XfP9kkh+++KCqOp7k+OLbc1X1pT3u95okzy4z5CFnnTuo317hJKt1JeznlbDGZAXrfI3/rr97txuWCXXtcF1/yxXdJ5KcWHaiqjrZ3ceWPf6wss71ciWs80pYY3K41rnMqY8nk1x/wffXJfnqasYB4GLLhPpfknxfVX1PVX17kvcm+dRqxwLgvD1PfXT3C1X1/iR/l+RIknu6+4nL8NhLnyY55KxzvVwJ67wS1pgconVW97ecbgZgEJ9MBBhOqAGG25dQV9U9VXW2qh6/4LrvqKoHq+o/Fn++eT9mWaVd1vnBqvqfqnps8fXug5zxtaqq66vqM1V1uqqeqKo7F9ev1X5eYp3rtp+vq6rPVdXnF+v80OL6ddvP3dZ5KPZzX85RV9Xbk5xL8mfdfcPiut9J8lx33734/SFv7u5fW/kwK7TLOj+Y5Fx3/+5Bzna5VNW1Sa7t7ker6uokp5L8ZJJfyBrt5yXW+bNZr/2sJG/o7nNVdVWSh5PcmeSns177uds6b8sh2M99eUbd3Z9N8txFV78nyb2Ly/dm+4fgUNtlnWulu5/q7kcXl59Pcjrbn15dq/28xDrXSm87t/j2qsVXZ/32c7d1HgoHeY76Ld39VLL9Q5Hkuw5wllV7f1V9YXFq5FD/F/JCVbWV5KYkj2SN9/OidSZrtp9VdaSqHktyNsmD3b2W+7nLOpNDsJ9eTFy9P0zyvUluTPJUkt870Gkuk6p6Y5L7k3ygu79+0POsyg7rXLv97O4Xu/vGbH/q+OaquuGAR1qJXdZ5KPbzIEP99OI84PnzgWcPcJaV6e6nF/9AvpHkj7P92wgPtcU5vvuTfLS7P7G4eu32c6d1ruN+ntfdX0vyULbP267dfp534ToPy34eZKg/leR9i8vvS/LXBzjLypz/x77wU0ke3+3Yw2DxosxHkpzu7g9fcNNa7edu61zD/dyoqjctLr8+yTuTfDHrt587rvOw7Od+vevjY0neke1fK/h0kt9M8ldJ7kuymeS/kvxMdx/qF+J2Wec7sv3fqk5yJskvnT/3dxhV1Y8k+cck/5bkG4urfyPb52/XZj8vsc47sl77+QPZfrHwSLafuN3X3b9VVd+Z9drP3db55zkE++kj5ADDeTERYDihBhhOqAGGE2qA4YQaYDihBhhOqAGG+38/s+3T2oZmYwAAAABJRU5ErkJggg==",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.figure()\n",
    "plt.hist(df_al['November']);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "SkewtestResult(statistic=3.449022139607473, pvalue=0.0005626205706886182)"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "stats.skewtest(df_al['November'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "NormaltestResult(statistic=21.55304457655946, pvalue=2.0884103462437462e-05)"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "stats.normaltest(df_al['November'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "WilcoxonResult(statistic=16.0, pvalue=0.039794921875)"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "stats.wilcoxon(df_al['August'],df_al['November'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### b. Interpretation\n",
    "\n",
    "Under what situations are non-parametric statistics useful? What are the potential drawbacks in using non-parametric statistics when a parametric approach is justified?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": []
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python 3.10.6 ('data-book')",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:38:29) [Clang 13.0.1 ]"
  },
  "vscode": {
   "interpreter": {
    "hash": "0ef88d3abb6b62f34a20525ce337090c4512fe8aecf32c74604482b944e1c3bd"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}