# **Chapter 5: Statistical Hypothesis Testing**

**Table of Contents:**

- [Import Libraries](#Import_Libraries)
- [5.1. Normality Tests](#Normality_Tests)
 - [5.1.1. Shapiro-Wilk Test](#Shapiro-Wilk_Test)
 - [5.1.2. D’Agostino’s $K^2$ Test](#D’Agostino’s_Test)
 - [5.1.3. Anderson-Darling Test](#Anderson-Darling_Test)
- [5.2. Correlation Tests](#Correlation_Tests)
 - [5.2.1. Pearson’s Correlation Coefficient](#Pearson’s_Correlation_Coefficient)
 - [5.2.2. Spearman’s Rank Correlation](#Spearman’s_Rank_Correlation)
 - [5.2.3. Kendall’s Rank Correlation](#Kendall’s_Rank_Correlation)
 - [5.2.4. Chi-Squared Test](#Chi-Squared_Test)
- [5.3. Stationarity Tests](#Stationary_Tests)
 - [5.3.1. Augmented Dickey-Fuller Unit Root Test](#Augmented_Dickey-Fuller_Unit_Root_Test)
 - [5.3.2. Kwiatkowski-Phillips-Schmidt-Shin Test](#Kwiatkowski-Phillips-Schmidt-Shin_Test) 
- [5.4. Other Tests](#Other_Tests)
 - [5.4.1. Mann-Whitney U-Test](#Mann-Whitney_U-Test)
 - [5.4.2. Wilcoxon Signed-Rank Test](#Wilcoxon_Signed-Rank-Test)
 - [5.4.3. Kruskal-Wallis H Test](#Kruskal-Wallis_H_Test)
 - [5.4.4. Friedman Test](#Friedman_Test) 



## **Import Libraries**

In [1]:
!pip install --upgrade scipy



In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import seaborn as sns
import math
from scipy import stats
from scipy.stats import norm
from scipy.stats import chi2
from scipy.stats import t
from scipy.stats import f
from scipy.stats import bernoulli
from scipy.stats import binom
from scipy.stats import nbinom
from scipy.stats import geom
from scipy.stats import poisson
from scipy.stats import uniform
from scipy.stats import randint
from scipy.stats import expon
from scipy.stats import gamma
from scipy.stats import beta
from scipy.stats import weibull_min
from scipy.stats import hypergeom
from scipy.stats import shapiro
from scipy.stats import pearsonr
from scipy.stats import normaltest
from scipy.stats import anderson
from scipy.stats import spearmanr
from scipy.stats import kendalltau
from scipy.stats import chi2_contingency
from scipy.stats import ttest_ind
from scipy.stats import ttest_rel
from scipy.stats import mannwhitneyu
from scipy.stats import wilcoxon
from scipy.stats import kruskal
from scipy.stats import friedmanchisquare
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.stattools import kpss
from statsmodels.stats.weightstats import ztest
from scipy.integrate import quad
from IPython.display import display, Latex

import warnings
warnings.filterwarnings('ignore')
warnings.simplefilter(action='ignore', category=FutureWarning)





## **5.1. Normality Tests:**



### **5.1.1. Shapiro-Wilk Test:**

$H_0$ : The sample has a Normal (Gaussian) distribution

$H_1$ : The sample does not have a Normal (Gaussian) distribution.

Assumptions: 
* Observations in each sample are independent and identically distributed (iid).

$\\ $

[Shapiro-Wilk Test Doc](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.shapiro.html)

In [3]:
N = 100
alpha = 0.05
np.random.seed(1)
data = np.random.normal(0, 1, N)

Test_statistic, p_value = shapiro(data)
print(f'Test_statistic_shapiro = {Test_statistic}, p_value = {p_value}', '\n')

if p_value < alpha:
	print(f'Since p_value < {alpha}, reject null hypothesis. Therefore, the data is probably not normal.')
else:
	print(f'Since p_value > {alpha}, the null hypothesis cannot be rejected. Therefore, the data is probably normal.')

Test_statistic_shapiro = 0.9920045137405396, p_value = 0.8215526342391968 

Since p_value > 0.05, the null hypothesis cannot be rejected. Therefore, the data is probably normal.
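For contrast, the same test can be run on a sample that is clearly not Gaussian. The following is a minimal sketch, assuming an exponential sample of 200 points and a fixed seed (both chosen here only for illustration); with such skewed data the p-value is expected to fall well below $\alpha$.

```python
import numpy as np
from scipy.stats import shapiro

# Illustrative non-normal sample (exponential data is strongly right-skewed)
rng = np.random.default_rng(0)
skewed = rng.exponential(scale=1.0, size=200)

stat_skewed, p_skewed = shapiro(skewed)
print(f'statistic = {stat_skewed:.4f}, p_value = {p_skewed:.3g}')

alpha = 0.05
if p_skewed < alpha:
    print('Reject H0: the sample is probably not normal.')
else:
    print('Cannot reject H0: the sample is consistent with a normal distribution.')
```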




### **5.1.2. D’Agostino’s $K^2$ Test:**

$H_0$ : The sample has a Normal (Gaussian) distribution

$H_1$ : The sample does not have a Normal (Gaussian) distribution.

Assumptions: 
* Observations in each sample are independent and identically distributed (iid).




$\\ $

[D’Agostino’s $K^2$ Test Doc](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.normaltest.html)

In [4]:
N = 100
alpha = 0.05
np.random.seed(1)
data = np.random.normal(0, 1, N)

Test_statistic, p_value = normaltest(data)
print(f"Test_statistic_D'Agostino's K-squared = {Test_statistic}, p_value = {p_value}", "\n")

if p_value < alpha:
	print(f'Since p_value < {alpha}, reject null hypothesis. Therefore, the data is probably not normal.')
else:
	print(f'Since p_value > {alpha}, the null hypothesis cannot be rejected. Therefore, the data is probably normal.')

Test_statistic_D'Agostino's K-squared = 0.10202388832581702, p_value = 0.9502673203169621 

Since p_value > 0.05, the null hypothesis cannot be rejected. Therefore, the data is probably normal.
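The $K^2$ statistic combines a skewness test and a kurtosis test: $K^2 = Z_{skew}^2 + Z_{kurt}^2$, which is compared against a $\chi^2$ distribution with 2 degrees of freedom. The sketch below rebuilds the statistic from `skewtest` and `kurtosistest` on the same seeded sample and compares it with `normaltest`; the variable names are illustrative.

```python
import numpy as np
from scipy.stats import chi2, kurtosistest, normaltest, skewtest

np.random.seed(1)
data = np.random.normal(0, 1, 100)

# z-scores of the skewness and kurtosis tests
z_skew, _ = skewtest(data)
z_kurt, _ = kurtosistest(data)

# K^2 is the sum of the squared z-scores, chi-squared with 2 dof under H0
k2_manual = z_skew**2 + z_kurt**2
p_manual = chi2.sf(k2_manual, df=2)

k2_scipy, p_scipy = normaltest(data)
print(f'manual: K2 = {k2_manual:.6f}, p = {p_manual:.6f}')
print(f'scipy : K2 = {k2_scipy:.6f}, p = {p_scipy:.6f}')
```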




### **5.1.3. Anderson-Darling Test:**

$H_0$ : The sample has a Normal (Gaussian) distribution

$H_1$ : The sample does not have a Normal (Gaussian) distribution.

Assumptions: 
* Observations in each sample are independent and identically distributed (iid).

$\\ $

[Anderson-Darling Test Doc](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.anderson.html)

Critical values provided are for the following significance levels:

normal/exponential:

$15\%, 10\%, 5\%, 2.5\%, 1\%$

logistic:

$25\%, 10\%, 5\%, 2.5\%, 1\%, 0.5\%$

Gumbel:

$25\%, 10\%, 5\%, 2.5\%, 1\%$

If the test statistic is larger than the critical value for a given significance level, the null hypothesis that the data come from the chosen distribution can be rejected at that level.

In [5]:
N = 100
np.random.seed(1)
data = np.random.normal(0, 1, N)

Test_statistic, critical_values, significance_level = anderson(data, dist='norm')
print(f'Test_statistic_anderson = {Test_statistic}', '\n')

for i in range(len(critical_values)):
	sl, cv = significance_level[i], critical_values[i]
	if Test_statistic > cv:
		print(f'At the {sl}% significance level: (Test statistic = {Test_statistic}) > critical value, therefore the null hypothesis is rejected.')
	else:
		print(f'At the {sl}% significance level: (Test statistic = {Test_statistic}) <= critical value, therefore the null hypothesis cannot be rejected.')

Test_statistic_anderson = 0.2196508855594459 

At the 15.0% significance level: (Test statistic = 0.2196508855594459) <= critical value, therefore the null hypothesis cannot be rejected.
At the 10.0% significance level: (Test statistic = 0.2196508855594459) <= critical value, therefore the null hypothesis cannot be rejected.
At the 5.0% significance level: (Test statistic = 0.2196508855594459) <= critical value, therefore the null hypothesis cannot be rejected.
At the 2.5% significance level: (Test statistic = 0.2196508855594459) <= critical value, therefore the null hypothesis cannot be rejected.
At the 1.0% significance level: (Test statistic = 0.2196508855594459) <= critical value, therefore the null hypothesis cannot be rejected.


Note that the Anderson-Darling test can also be used for other distributions through the `dist` argument.

The valid values are: {‘norm’, ‘expon’, ‘logistic’, ‘gumbel’, ‘gumbel_l’, ‘gumbel_r’, ‘extreme1’}
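As a short illustration of a non-normal `dist`, the sketch below tests an exponential sample against `dist='expon'`. The sample size, scale and seed are assumptions made only for this example.

```python
import numpy as np
from scipy.stats import anderson

# Illustrative exponential sample
rng = np.random.default_rng(0)
sample = rng.exponential(scale=2.0, size=200)

result = anderson(sample, dist='expon')
print(f'statistic = {result.statistic:.4f}')

# Reject H0 at a given level only if the statistic exceeds that level's critical value
for sl, cv in zip(result.significance_level, result.critical_values):
    decision = 'reject H0' if result.statistic > cv else 'cannot reject H0'
    print(f'{sl}% level: critical value = {cv:.3f} -> {decision}')
```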



## **5.2. Correlation Tests:**



### **5.2.1. Pearson’s Correlation Coefficient:**

Tests whether two data samples have a linear relationship.

$H_0$: The two samples are independent.

$H_1$: There is a dependency between the two samples.

Assumptions:
* Observations in each data sample are independent and identically distributed (iid).
* Observations in each data sample are normally distributed.
* Observations in each data sample have the same variance.

$\\ $

[Pearson’s Correlation Coefficient Doc](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.pearsonr.html)

In [6]:
N = 10
alpha = 0.05
np.random.seed(1)
data1 = np.random.normal(0, 1, N)
data2 = np.random.normal(0, 1, N) + 2

Test_statistic, p_value = pearsonr(data1, data2)
print(f"Test_statistic_Pearson's Correlation = {Test_statistic}, p_value = {p_value}", "\n")

if p_value < alpha:
	print(f'Since p_value < {alpha}, reject null hypothesis. Therefore, Two data are probably dependent.')
else:
	print(f'Since p_value > {alpha}, the null hypothesis cannot be rejected. Therefore, Two data are probably independent.')

Test_statistic_Pearson's Correlation = 0.6556177144470315, p_value = 0.03957633895447448 

Since p_value < 0.05, reject null hypothesis. Therefore, Two data are probably dependent.


This test is parametric.
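The coefficient and its p-value can also be reproduced by hand. The sketch below uses the same seeded data, computes $r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$, and derives the two-sided p-value from $t = r\sqrt{(n-2)/(1-r^2)}$ with $n-2$ degrees of freedom. The variable names are illustrative.

```python
import numpy as np
from scipy.stats import pearsonr
from scipy.stats import t as t_dist

np.random.seed(1)
x = np.random.normal(0, 1, 10)
y = np.random.normal(0, 1, 10) + 2

n = len(x)
dx, dy = x - x.mean(), y - y.mean()

# Sample correlation coefficient
r_manual = np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2))

# Two-sided p-value from the t distribution with n - 2 degrees of freedom
t_stat = r_manual * np.sqrt((n - 2) / (1 - r_manual**2))
p_manual = 2 * t_dist.sf(abs(t_stat), df=n - 2)

r_scipy, p_scipy = pearsonr(x, y)
print(f'manual: r = {r_manual:.6f}, p = {p_manual:.6f}')
print(f'scipy : r = {r_scipy:.6f}, p = {p_scipy:.6f}')
```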



### **5.2.2. Spearman’s Rank Correlation:**

Tests whether two data samples have a monotonic relationship.

$H_0$: The two samples are independent.

$H_1$: There is a dependency between the two samples.

Assumptions:
* Observations in each data sample are independent and identically distributed (iid).
* Observations in each data sample can be ranked.

$\\ $

[Spearman’s Rank Correlation Doc](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.spearmanr.html)

In [7]:
N = 10
alpha = 0.05
np.random.seed(1)
data1 = np.random.normal(0, 1, N)
data2 = np.random.normal(0, 1, N) + 2

Test_statistic, p_value = spearmanr(data1, data2, alternative = 'two-sided')
print(f"Test_statistic_Spearman's Rank Correlation = {Test_statistic}, p_value = {p_value}", "\n")

if p_value < alpha:
	print(f'Since p_value < {alpha}, reject null hypothesis. Therefore, Two data are probably dependent.')
else:
	print(f'Since p_value > {alpha}, the null hypothesis cannot be rejected. Therefore, Two data are probably independent.')

Test_statistic_Spearman's Rank Correlation = 0.7818181818181817, p_value = 0.007547007781067878 

Since p_value < 0.05, reject null hypothesis. Therefore, Two data are probably dependent.


The `alternative` argument can be one of {‘two-sided’, ‘less’, ‘greater’}:

'two-sided': the correlation is non-zero

'less': the correlation is negative (less than zero)

'greater': the correlation is positive (greater than zero)
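Spearman's coefficient is Pearson's coefficient applied to the ranks of the data. The sketch below checks this on the same seeded samples and also runs the one-sided `'greater'` alternative; since the observed correlation is positive, its p-value should be half of the two-sided one. The variable names are illustrative.

```python
import numpy as np
from scipy.stats import pearsonr, rankdata, spearmanr

np.random.seed(1)
x = np.random.normal(0, 1, 10)
y = np.random.normal(0, 1, 10) + 2

# Pearson correlation of the ranks equals Spearman's rho
rho_manual, _ = pearsonr(rankdata(x), rankdata(y))
rho_scipy, p_two_sided = spearmanr(x, y, alternative='two-sided')

# One-sided test: H1 is that the correlation is positive
_, p_greater = spearmanr(x, y, alternative='greater')

print(f'rho (ranks + Pearson) = {rho_manual:.6f}, rho (spearmanr) = {rho_scipy:.6f}')
print(f'p two-sided = {p_two_sided:.6f}, p greater = {p_greater:.6f}')
```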



### **5.2.3. Kendall’s Rank Correlation:**

Tests whether two data samples have a monotonic relationship.

$H_0$: The two samples are independent.

$H_1$: There is a dependency between the two samples.

Assumptions:
* Observations in each data sample are independent and identically distributed (iid).
* Observations in each data sample can be ranked.

$\\ $

[Kendall’s Rank Correlation Doc](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.kendalltau.html)

In [8]:
N = 10
alpha = 0.05
np.random.seed(1)
data1 = np.random.normal(0, 1, N)
data2 = np.random.normal(0, 1, N) + 2

Test_statistic, p_value = kendalltau(data1, data2)
print(f"Test_statistic_Kendall's Rank Correlation = {Test_statistic}, p_value = {p_value}", "\n")

if p_value < alpha:
	print(f'Since p_value < {alpha}, reject null hypothesis. Therefore, Two data are probably dependent.')
else:
	print(f'Since p_value > {alpha}, the null hypothesis cannot be rejected. Therefore, Two data are probably independent.')

Test_statistic_Kendall's Rank Correlation = 0.6, p_value = 0.016666115520282188 

Since p_value < 0.05, reject null hypothesis. Therefore, Two data are probably dependent.
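Kendall's $\tau$ counts concordant and discordant pairs. When there are no ties, $\tau = (C - D) / \binom{n}{2}$. The sketch below counts the pairs directly on the same seeded data and compares the result with `kendalltau`; it assumes continuous data without ties.

```python
from itertools import combinations

import numpy as np
from scipy.stats import kendalltau

np.random.seed(1)
x = np.random.normal(0, 1, 10)
y = np.random.normal(0, 1, 10) + 2

concordant = discordant = 0
for i, j in combinations(range(len(x)), 2):
    # A pair is concordant when x and y move in the same direction
    s = np.sign(x[i] - x[j]) * np.sign(y[i] - y[j])
    if s > 0:
        concordant += 1
    elif s < 0:
        discordant += 1

n_pairs = len(x) * (len(x) - 1) // 2
tau_manual = (concordant - discordant) / n_pairs

tau_scipy, _ = kendalltau(x, y)
print(f'C = {concordant}, D = {discordant}, tau = {tau_manual}, scipy tau = {tau_scipy}')
```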




### **5.2.4. Chi-Squared Test:**

Tests whether two categorical variables are related or independent.

$H_0$: The two variables are independent.

$H_1$: There is a dependency between the two variables.

Assumptions:
* Observations used in the calculation of the contingency table are independent.
* The observed and expected frequencies in each cell of the contingency table are at least 5 (a commonly quoted guideline).

$\\ $

[Chi-Squared Doc](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.chi2_contingency.html)

$\\ $

degrees of freedom: $(rows - 1) * (cols - 1)$

In [9]:
alpha = 0.05
table = [[10, 20, 30],
         [6, 9, 17]]

Test_statistic, p_value, dof, expected = chi2_contingency(table)
print(f"Test_statistic_Chi-Squared = {Test_statistic}, p_value = {p_value}, df = {dof}, \n", f"Expected = {expected}","\n")

if p_value < alpha:
	print(f'Since p_value < {alpha}, reject null hypothesis. Therefore, Two data are probably dependent.')
else:
	print(f'Since p_value > {alpha}, the null hypothesis cannot be rejected. Therefore, Two data are probably independent.')

Test_statistic_Chi-Squared = 0.27157465150403504, p_value = 0.873028283380073, df = 2, 
 Expected = [[10.43478261 18.91304348 30.65217391]
 [ 5.56521739 10.08695652 16.34782609]] 

Since p_value > 0.05, the null hypothesis cannot be rejected. Therefore, Two data are probably independent.
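The expected frequencies, the statistic and the degrees of freedom can be reproduced directly from the table: $E_{ij} = \frac{(\text{row total}_i)(\text{column total}_j)}{\text{grand total}}$ and $\chi^2 = \sum \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$. The sketch below does this for the same table; the variable names are illustrative.

```python
import numpy as np
from scipy.stats import chi2, chi2_contingency

observed = np.array([[10, 20, 30],
                     [6, 9, 17]])

# Expected counts under independence: outer product of the margins over the total
row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
expected_manual = row_totals * col_totals / observed.sum()

stat_manual = ((observed - expected_manual) ** 2 / expected_manual).sum()
dof_manual = (observed.shape[0] - 1) * (observed.shape[1] - 1)
p_manual = chi2.sf(stat_manual, dof_manual)

stat_scipy, p_scipy, dof_scipy, expected_scipy = chi2_contingency(observed)
print(f'manual: chi2 = {stat_manual:.6f}, p = {p_manual:.6f}, dof = {dof_manual}')
print(f'scipy : chi2 = {stat_scipy:.6f}, p = {p_scipy:.6f}, dof = {dof_scipy}')
```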




## **5.3. Stationarity Tests:**



### **5.3.1. Augmented Dickey-Fuller Unit Root Test:**

Tests whether a time series has a unit root, i.e. whether it is non-stationary in the way a random walk is.

$H_0$: A unit root is present (series is non-stationary).

$H_1$: A unit root is not present (series is stationary).

Assumptions:
* Observations are temporally ordered.

$\\ $

[Augmented Dickey-Fuller Unit Root Test Doc](https://www.statsmodels.org/dev/generated/statsmodels.tsa.stattools.adfuller.html)

In [10]:
alpha = 0.05
data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Test_statistic, p_value, lags, obs, crit, icbest = adfuller(data)
print(f"Test_statistic_ADF = {Test_statistic}, p_value = {p_value}", "\n")

if p_value < alpha:
	print(f'Since p_value < {alpha}, reject null hypothesis. Therefore, the series is probably stationary.')
else:
	print(f'Since p_value > {alpha}, the null hypothesis cannot be rejected. Therefore, the series is probably non-stationary.')

Test_statistic_ADF = 0.5171974540944098, p_value = 0.9853865316323872 

Since p_value > 0.05, the null hypothesis cannot be rejected. Therefore, the series is probably non-stationary.
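To make the notion of a unit root concrete, the NumPy-only sketch below builds a random walk $y_t = y_{t-1} + e_t$ from white noise and shows that first differencing recovers the stationary noise. It does not run the ADF test itself; the seed and series length are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
noise = rng.normal(0, 1, 500)   # stationary white noise e_t
walk = np.cumsum(noise)         # y_t = y_{t-1} + e_t, a unit-root process

# First differences y_t - y_{t-1} give back the noise terms e_t
differenced = np.diff(walk)
print(np.allclose(differenced, noise[1:]))

# The spread of the walk grows over time, unlike the spread of the noise
print(f'std of first half of walk : {walk[:250].std():.3f}')
print(f'std of whole walk         : {walk.std():.3f}')
print(f'std of differenced series : {differenced.std():.3f}')
```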




### **5.3.2. Kwiatkowski-Phillips-Schmidt-Shin Test:**

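Tests whether a time series is stationary around a constant level or around a deterministic trend. Note that `kpss` defaults to `regression='c'` (stationarity around a constant); pass `regression='ct'` to test trend-stationarity.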

$H_0$: The time series is trend-stationary.

$H_1$: The time series is not trend-stationary.

Assumptions:
* Observations are temporally ordered.

$\\ $

[Kwiatkowski-Phillips-Schmidt-Shin Test Doc](https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.kpss.html#statsmodels.tsa.stattools.kpss)

In [11]:
alpha = 0.05
data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Test_statistic, p_value, lags, crit = kpss(data)
print(f"Test_statistic_Kwiatkowski = {Test_statistic}, p_value = {p_value}", "\n")

if p_value < alpha:
	print(f'Since p_value < {alpha}, reject null hypothesis. Therefore, the series is probably not trend-stationary.')
else:
	print(f'Since p_value > {alpha}, the null hypothesis cannot be rejected. Therefore, the series is probably trend-stationary.')

Test_statistic_Kwiatkowski = 0.4099630996309963, p_value = 0.072860732917674 

Since p_value > 0.05, the null hypothesis cannot be rejected. Therefore, the series is probably trend-stationary.




## **5.4. Other Tests:**



### **5.4.1. Mann-Whitney U-Test:**

Tests whether the distributions of two independent samples are equal or not.

$H_0$: The distributions of both samples are equal.

$H_1$: The distributions of both samples are not equal.

Assumptions:
* Observations in each sample are independent and identically distributed (iid).
* Observations in each sample can be ranked.

$\\ $

[Mann-Whitney U Test Doc](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.mannwhitneyu.html)

In [12]:
N = 10
alpha = 0.05
data1 = np.random.normal(0, 1, N)
data2 = np.random.normal(0, 1, N)

Test_statistic, p_value = mannwhitneyu(data1, data2, alternative='two-sided')
print(f"Test_statistic_Mann-Whitney = {Test_statistic}, p_value = {p_value}", "\n")

if p_value < alpha:
	print(f'Since p_value < {alpha}, reject null hypothesis. Therefore, Two data distributions are probably not equal.')
else:
	print(f'Since p_value > {alpha}, the null hypothesis cannot be rejected. Therefore, Two data distributions are probably equal.')

Test_statistic_Mann-Whitney = 61.0, p_value = 0.4273553138978077 

Since p_value > 0.05, the null hypothesis cannot be rejected. Therefore, Two data distributions are probably equal.
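The $U$ statistic has a direct interpretation: it counts the pairs $(x_i, y_j)$ in which the observation from the first sample is larger, with ties counted as one half. The sketch below computes it by brute force and compares it with `mannwhitneyu`, assuming SciPy 1.7 or later, where the reported statistic is the $U$ of the first sample. The seed and names are illustrative.

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)
x = rng.normal(0, 1, 10)
y = rng.normal(0, 1, 10)

# All pairwise differences x_i - y_j
diff = x[:, None] - y[None, :]

# U for the first sample: pairs with x_i > y_j, ties count one half
u_manual = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)

u_scipy, p_value = mannwhitneyu(x, y, alternative='two-sided')
print(f'manual U = {u_manual}, scipy U = {u_scipy}, p_value = {p_value:.4f}')
```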




### **5.4.2. Wilcoxon Signed-Rank Test:**

Tests whether the distributions of two paired samples are equal or not.

$H_0$: The distributions of both samples are equal.

$H_1$: The distributions of both samples are not equal.

Assumptions:
* Observations in each sample are independent and identically distributed (iid).
* Observations in each sample can be ranked.
* Observations across each sample are paired.

$\\ $

[Wilcoxon Signed-Rank Test Doc](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.wilcoxon.html)

In [13]:
N = 10
alpha = 0.05
data1 = np.random.normal(0, 1, N)
data2 = np.random.normal(0, 1, N)

Test_statistic, p_value = wilcoxon(data1, data2, alternative='two-sided')
print(f"Test_statistic_Wilcoxon = {Test_statistic}, p_value = {p_value}", "\n")

if p_value < alpha:
	print(f'Since p_value < {alpha}, reject null hypothesis. Therefore, Two data distributions are probably not equal.')
else:
	print(f'Since p_value > {alpha}, the null hypothesis cannot be rejected. Therefore, Two data distributions are probably equal.')

Test_statistic_Wilcoxon = 24.0, p_value = 0.76953125 

Since p_value > 0.05, the null hypothesis cannot be rejected. Therefore, Two data distributions are probably equal.
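For the two-sided test, the reported statistic is the smaller of the two signed-rank sums. The sketch below ranks the absolute paired differences, sums the ranks of positive and negative differences separately, and compares the smaller sum with `wilcoxon`. It assumes continuous data with no zero differences or ties; the seed and names are illustrative.

```python
import numpy as np
from scipy.stats import rankdata, wilcoxon

rng = np.random.default_rng(0)
x = rng.normal(0, 1, 10)
y = rng.normal(0, 1, 10)

d = x - y
ranks = rankdata(np.abs(d))      # rank the absolute differences

w_plus = ranks[d > 0].sum()      # rank sum of positive differences
w_minus = ranks[d < 0].sum()     # rank sum of negative differences
w_manual = min(w_plus, w_minus)

w_scipy, p_value = wilcoxon(x, y, alternative='two-sided')
print(f'W+ = {w_plus}, W- = {w_minus}, manual W = {w_manual}, scipy W = {w_scipy}')
```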




### **5.4.3. Kruskal-Wallis H Test:**

Tests whether the distributions of two or more independent samples are equal or not.

$H_0$: The distributions of all samples are equal.

$H_1$: The distributions of one or more samples are not equal.

Assumptions:
* Observations in each sample are independent and identically distributed (iid).
* Observations in each sample can be ranked.

$\\ $

[Kruskal-Wallis H Test Doc](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.kruskal.html)

In [14]:
N = 10
alpha = 0.05
data1 = np.random.normal(0, 1, N)
data2 = np.random.normal(0, 1, N)

Test_statistic, p_value = kruskal(data1, data2)
print(f"Test_statistic_Kruskal-Wallis = {Test_statistic}, p_value = {p_value}", "\n")

if p_value < alpha:
	print(f'Since p_value < {alpha}, reject null hypothesis. Therefore, Two data distributions are probably not equal.')
else:
	print(f'Since p_value > {alpha}, the null hypothesis cannot be rejected. Therefore, Two data distributions are probably equal.')

Test_statistic_Kruskal-Wallis = 1.462857142857132, p_value = 0.22647606604348455 

Since p_value > 0.05, the null hypothesis cannot be rejected. Therefore, Two data distributions are probably equal.
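The test extends to more than two groups. With no ties, the statistic is $H = \frac{12}{N(N+1)} \sum_i \frac{R_i^2}{n_i} - 3(N+1)$, where $R_i$ is the rank sum of group $i$ in the pooled sample, and it is compared against a $\chi^2$ distribution with $k-1$ degrees of freedom. The sketch below uses three illustrative groups (an assumption, not data from the cell above) and checks the formula against `kruskal`.

```python
import numpy as np
from scipy.stats import chi2, kruskal, rankdata

rng = np.random.default_rng(0)
groups = [rng.normal(0, 1, 10) for _ in range(3)]

# Rank all observations together, then sum the ranks per group
pooled = np.concatenate(groups)
ranks = rankdata(pooled)
n_total = len(pooled)

weighted_sq_rank_sums = 0.0
start = 0
for g in groups:
    rank_sum = ranks[start:start + len(g)].sum()
    weighted_sq_rank_sums += rank_sum**2 / len(g)
    start += len(g)

h_manual = 12 / (n_total * (n_total + 1)) * weighted_sq_rank_sums - 3 * (n_total + 1)
p_manual = chi2.sf(h_manual, df=len(groups) - 1)

h_scipy, p_scipy = kruskal(*groups)
print(f'manual: H = {h_manual:.6f}, p = {p_manual:.6f}')
print(f'scipy : H = {h_scipy:.6f}, p = {p_scipy:.6f}')
```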




### **5.4.4. Friedman Test:**

Tests whether the distributions of two or more paired samples are equal or not.

$H_0$: The distributions of all samples are equal.

$H_1$: The distributions of one or more samples are not equal.

Assumptions:
* Observations in each sample are independent and identically distributed (iid).
* Observations in each sample can be ranked.
* Observations across each sample are paired.

$\\ $

[Friedman Test Doc](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.friedmanchisquare.html)

In [15]:
alpha = 0.05
data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]
data3 = [-0.208, 0.696, 0.928, -1.148, -0.213, 0.229, 0.137, 0.269, -0.870, -1.204]

Test_statistic, p_value = friedmanchisquare(data1, data2, data3)
print(f"Test_statistic_Friedman = {Test_statistic}, p_value = {p_value}", "\n")

if p_value < alpha:
	print(f'Since p_value < {alpha}, reject null hypothesis. Therefore, data distributions are probably not equal.')
else:
	print(f'Since p_value > {alpha}, the null hypothesis cannot be rejected. Therefore, data distributions are probably equal.')

Test_statistic_Friedman = 0.8000000000000114, p_value = 0.6703200460356356 

Since p_value > 0.05, the null hypothesis cannot be rejected. Therefore, data distributions are probably equal.
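The Friedman statistic can be reproduced by ranking the values within each paired observation (block) and summing the ranks per sample: with $n$ blocks, $k$ samples and no ties, $Q = \frac{12}{n k (k+1)} \sum_j R_j^2 - 3 n (k+1)$, compared against a $\chi^2$ distribution with $k-1$ degrees of freedom. The sketch below applies this to the same three samples; the variable names are illustrative.

```python
import numpy as np
from scipy.stats import chi2, friedmanchisquare, rankdata

data1 = [0.873, 2.817, 0.121, -0.945, -0.055, -1.436, 0.360, -1.478, -1.637, -1.869]
data2 = [1.142, -0.432, -0.938, -0.729, -0.846, -0.157, 0.500, 1.183, -1.075, -0.169]
data3 = [-0.208, 0.696, 0.928, -1.148, -0.213, 0.229, 0.137, 0.269, -0.870, -1.204]

# Rows are paired observations (blocks), columns are the three samples
blocks = np.column_stack([data1, data2, data3])
n, k = blocks.shape

# Rank within each row, then sum the ranks of each column
ranks = np.apply_along_axis(rankdata, 1, blocks)
rank_sums = ranks.sum(axis=0)

q_manual = 12 / (n * k * (k + 1)) * np.sum(rank_sums**2) - 3 * n * (k + 1)
p_manual = chi2.sf(q_manual, df=k - 1)

q_scipy, p_scipy = friedmanchisquare(data1, data2, data3)
print(f'rank sums = {rank_sums}')
print(f'manual: Q = {q_manual:.6f}, p = {p_manual:.6f}')
print(f'scipy : Q = {q_scipy:.6f}, p = {p_scipy:.6f}')
```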
