# **`Chapter 3: Confidence Intervals`**

**Table of Contents:**

- [Import Libraries](#Import_Libraries)
- [3.1. Confidence Interval for the Mean of a Normal Population](#Confidence_Interval_for_the_Mean_of_a_Normal_Population)
 - [3.1.1. Known Standard Deviation](#Known_Standard_Deviation)
 - [3.1.2. Unknown Standard Deviation](#Unknown_Standard_Deviation)

- [3.2. Confidence Interval for the Variance of a Normal Population](#Confidence_Interval_for_the_Variance_of_a_Normal_Population)
 - [3.2.1. Unknown Mean of the Population](#Unknown_Mean_of_the_Population)
 - [3.2.2. Known Mean of the Population](#Known_Mean_of_the_Population)

- [3.3. Confidence Interval for the Difference in Means of Two Normal Populations](#Confidence_Interval_for_the_Difference_in_Means_of_Two_Normal_Populations)
 - [3.3.1. Known Variances](#Known_Variances)
 - [3.3.2. Unknown but Equal Variances](#Unknown_but_Equal_Variances)

- [3.4. Confidence Interval for the Ratio of Variances of Two Normal Populations](#Confidence_Interval_for_the_Ratio_of_Variances_of_Two_Normal_Populations)
- [3.5. Confidence Interval for the Mean of a Bernoulli Random Variable](#Confidence_Interval_for_the_Mean_of_a_Bernoulli_Random_Variable)




## **Import Libraries**

In [None]:
!pip install --upgrade scipy



In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import seaborn as sns
import math
from scipy import stats
from scipy.stats import norm
from scipy.stats import chi2
from scipy.stats import t
from scipy.stats import f
from scipy.stats import bernoulli
from scipy.stats import binom
from scipy.stats import nbinom
from scipy.stats import geom
from scipy.stats import poisson
from scipy.stats import uniform
from scipy.stats import randint
from scipy.stats import expon
from scipy.stats import gamma
from scipy.stats import beta
from scipy.stats import weibull_min
from scipy.stats import hypergeom
from scipy.stats import shapiro
from scipy.stats import pearsonr
from scipy.stats import normaltest
from scipy.stats import anderson
from scipy.stats import spearmanr
from scipy.stats import kendalltau
from scipy.stats import chi2_contingency
from scipy.stats import ttest_ind
from scipy.stats import ttest_rel
from scipy.stats import mannwhitneyu
from scipy.stats import wilcoxon
from scipy.stats import kruskal
from scipy.stats import friedmanchisquare
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.stattools import kpss
from statsmodels.stats.weightstats import ztest
from scipy.integrate import quad
from IPython.display import display, Latex

import warnings
warnings.filterwarnings('ignore')
warnings.simplefilter(action='ignore', category=FutureWarning)





## **3.1. Confidence Interval for the Mean of a Normal Population:**



### **3.1.1. Known Standard Deviation:**

**A. Two-sided Confidence Interval:**

Suppose that $X_1, X_2, ..., X_n$ is a sample of size $n$ from a normal distribution having an unknown mean $\mu$ and a known variance $\sigma^2$.

$X_1, X_2, ..., X_n \sim N( \mu, \sigma^2)$

$\\ $

Significance level = $\alpha$

$\\ $

$P(-\ Z_{\frac{\alpha}{2}}\ \leq\ \frac{\overline{X}-\mu}{\frac{\sigma}{\sqrt{n}}}\ \leq\ Z_{\frac{\alpha}{2}}) = 1-\alpha$

$P(\overline{X}\ -\ Z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}}\ \leq\ \mu\ \leq\ \overline{X}\ +\ Z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}}) = 1-\alpha$

$\\ $

Therefore, the $1-\alpha$ confidence interval for the mean of a normal population is:

$[\overline{X}\ -\ Z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}},\ \overline{X}\ +\ Z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}}]$
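The two-sided formula can be evaluated directly from summary statistics. The following is a minimal sketch, not the notebook's own implementation: the function name `z_interval` and its arguments are illustrative, and the numbers ($\overline{X} = 9$, $\sigma = 2$, $n = 9$) are simply the summary of the nine-point sample used later in this section.

```python
import numpy as np
from scipy.stats import norm

def z_interval(sample_mean, population_sd, n, c_level=0.95):
    """Two-sided confidence interval for the mean when sigma is known."""
    alpha = 1 - c_level
    z = norm.isf(alpha / 2)  # upper-tail critical value Z_{alpha/2}
    half_width = z * population_sd / np.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width

# Hypothetical summary statistics: mean 9, known sigma 2, nine observations
lower, upper = z_interval(sample_mean=9.0, population_sd=2.0, n=9, c_level=0.95)
print(lower, upper)
```

`norm.isf(p)` returns the point with upper-tail probability `p`, which is exactly $Z_{p}$ in the notation used here.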

**B. One-sided Lower Confidence Interval:**

Suppose that $X_1, X_2, ..., X_n$ is a sample of size $n$ from a normal distribution having an unknown mean $\mu$ and a known variance $\sigma^2$.

$X_1, X_2, ..., X_n \sim N( \mu, \sigma^2)$

$\\ $

Significance level = $\alpha$

$\\ $

$P(-\ Z_{\alpha}\ \leq\ \frac{\overline{X}-\mu}{\frac{\sigma}{\sqrt{n}}}\ \leq\ \infty) = 1-\alpha$

$P(-\infty\ \leq\ \mu\ \leq\ \overline{X}\ +\ Z_{\alpha} \frac{\sigma}{\sqrt{n}}) = 1-\alpha$

$\\ $

Therefore, the $1-\alpha$ confidence interval for the mean of a normal population is:

$[-\infty,\ \overline{X}\ +\ Z_{\alpha} \frac{\sigma}{\sqrt{n}}]$

**C. One-sided Upper Confidence Interval:**

Suppose that $X_1, X_2, ..., X_n$ is a sample of size $n$ from a normal distribution having an unknown mean $\mu$ and a known variance $\sigma^2$.

$X_1, X_2, ..., X_n \sim N( \mu, \sigma^2)$

$\\ $

Significance level = $\alpha$

$\\ $

$P(-\infty\ \leq\ \frac{\overline{X}-\mu}{\frac{\sigma}{\sqrt{n}}}\ \leq\ Z_{\alpha}) = 1-\alpha$

$P(\overline{X}\ -\ Z_{\alpha} \frac{\sigma}{\sqrt{n}}\ \leq\ \mu\ \leq\ \infty) = 1-\alpha$

$\\ $

Therefore, the $1-\alpha$ confidence interval for the mean of a normal population is:

$[\overline{X}\ -\ Z_{\alpha} \frac{\sigma}{\sqrt{n}},\ \infty]$
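The one-sided intervals use $Z_{\alpha}$ instead of $Z_{\frac{\alpha}{2}}$, because all of the error probability sits in one tail. A small sketch under that reading follows; the helper name `z_one_sided` is an assumption for illustration, and its `type_c` labels follow this chapter's convention that the "lower" interval is the one bounded above.

```python
import numpy as np
from scipy.stats import norm

def z_one_sided(sample_mean, population_sd, n, c_level, type_c):
    """One-sided confidence interval for the mean when sigma is known."""
    alpha = 1 - c_level
    margin = norm.isf(alpha) * population_sd / np.sqrt(n)  # Z_alpha * sigma / sqrt(n)
    if type_c == 'lower_confidence':   # (-inf, X_bar + margin]
        return -np.inf, sample_mean + margin
    if type_c == 'upper_confidence':   # [X_bar - margin, inf)
        return sample_mean - margin, np.inf
    raise ValueError("type_c must be 'lower_confidence' or 'upper_confidence'")

# Hypothetical summary statistics: mean 9, known sigma 2, nine observations
lower_ci = z_one_sided(9.0, 2.0, 9, 0.95, 'lower_confidence')
upper_ci = z_one_sided(9.0, 2.0, 9, 0.95, 'upper_confidence')
print(lower_ci, upper_ci)
```

For the same confidence level, the finite end of a one-sided interval lies closer to $\overline{X}$ than the corresponding end of the two-sided interval, since $Z_{\alpha} < Z_{\frac{\alpha}{2}}$.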

In [None]:
class confidence_interval_for_mean_with_known_variance:
    """
    Parameters
    ----------
    population_sd : known standard deviation of the population
    c_level : confidence level as a fraction (e.g. 0.95)
    type_c : 'two_sided_confidence', 'lower_confidence', 'upper_confidence'
    Sample_mean : optional, mean of the sample
    n : optional, number of sample members
    data : optional, if you do not know the Sample_mean and n, just pass the data
    """
    def __init__(self, population_sd, c_level, type_c, Sample_mean = 0., n = 0., data=None):
        self.Sample_mean = Sample_mean
        self.population_sd = population_sd
        self.type_c = type_c
        self.n = n
        self.c_level = c_level
        self.data = data
        if data is not None:
            # Derive the sample statistics from the raw data
            self.Sample_mean = np.mean(list(data))
            self.n = len(list(data))

        self.__test()

    def __test(self):
        alpha = 1 - self.c_level
        se = self.population_sd / np.sqrt(self.n)  # standard error of the mean
        if self.type_c == 'two_sided_confidence':
            z = norm.isf(alpha / 2)  # upper-tail critical value Z_{alpha/2}
            c_u = self.Sample_mean + z * se
            c_l = self.Sample_mean - z * se
            display(Latex(rf'${c_l} \leq \mu \leq {c_u}$'))
        elif self.type_c == 'lower_confidence':
            c_u = self.Sample_mean + norm.isf(alpha) * se
            display(Latex(rf'$\mu \leq {c_u}$'))
        elif self.type_c == 'upper_confidence':
            c_l = self.Sample_mean - norm.isf(alpha) * se
            display(Latex(rf'${c_l} \leq \mu$'))

In [None]:
np.random.seed(1)
data = np.random.normal(loc = 2, scale = 3, size = 20)
confidence_interval_for_mean_with_known_variance(population_sd = 3, c_level = 0.95, type_c = 'two_sided_confidence', data=data);



In [None]:
data = [5, 8.5, 12, 15, 7, 9, 7.5, 6.5, 10.5]
confidence_interval_for_mean_with_known_variance(population_sd = 2, c_level = 0.95, type_c = 'lower_confidence', data=data);





### **3.1.2. Unknown Standard Deviation:**

**A. Two-sided Confidence Interval:**

Suppose that $X_1, X_2, ..., X_n$ is a sample of size $n$ from a normal distribution having an unknown mean $\mu$ and an unknown variance $\sigma^2$.

$X_1, X_2, ..., X_n \sim N( \mu, \sigma^2)$

$\\ $

Significance level = $\alpha$

$P(-\ t_{\frac{\alpha}{2},n-1}\ <\ \frac{\overline{X}-\mu}{\frac{S}{\sqrt{n}}} <\ t_{\frac{\alpha}{2},n-1}) = 1-\alpha$

$P(\overline{X}\ -\ t_{\frac{\alpha}{2},n-1} \frac{S}{\sqrt{n}}\ <\ \mu\ <\ \overline{X}\ +\ t_{\frac{\alpha}{2},n-1} \frac{S}{\sqrt{n}}) = 1- \alpha$

$\\ $

Therefore, the $1-\alpha$ confidence interval for the mean of a normal population is:

$[\overline{X}\ -\ t_{\frac{\alpha}{2},n-1} \frac{S}{\sqrt{n}},\ \overline{X}\ +\ t_{\frac{\alpha}{2},n-1} \frac{S}{\sqrt{n}}]$
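As a quick check of the two-sided $t$ formula, the sketch below computes the interval by hand and compares it with SciPy's built-in `stats.t.interval`. The function name `t_interval` is illustrative, and the data are the nine observations used in the example cell of this section.

```python
import numpy as np
from scipy import stats

def t_interval(data, c_level=0.95):
    """Two-sided confidence interval for the mean when sigma is unknown."""
    x = np.asarray(data, dtype=float)
    n = x.size
    se = x.std(ddof=1) / np.sqrt(n)                 # S / sqrt(n)
    t_crit = stats.t.isf((1 - c_level) / 2, n - 1)  # t_{alpha/2, n-1}
    return x.mean() - t_crit * se, x.mean() + t_crit * se

data = [5, 8.5, 12, 15, 7, 9, 7.5, 6.5, 10.5]
lower, upper = t_interval(data, 0.95)

# Reference computed by SciPy: confidence level, degrees of freedom, centre, scale
ref_lower, ref_upper = stats.t.interval(0.95, len(data) - 1,
                                        loc=np.mean(data), scale=stats.sem(data))
print(lower, upper)
print(ref_lower, ref_upper)
```

The confidence level is passed positionally to `stats.t.interval` because the keyword name of that first argument differs between SciPy versions.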

**B. One-sided Lower Confidence Interval:**

Suppose that $X_1, X_2, ..., X_n$ is a sample of size $n$ from a normal distribution having an unknown mean $\mu$ and an unknown variance $\sigma^2$.

$X_1, X_2, ..., X_n \sim N( \mu, \sigma^2)$

$\\ $

Significance level = $\alpha$

$\\ $

$P(-t_{\alpha,n-1}\ \leq\ \frac{\overline{X}-\mu}{\frac{S}{\sqrt{n}}}\ \leq\ \infty) = 1-\alpha$

$P(-\infty\ \leq\ \mu\ \leq\ \overline{X}\ +\ t_{\alpha,n-1} \frac{S}{\sqrt{n}}) = 1-\alpha$

$\\ $

Therefore, the $1-\alpha$ confidence interval for the mean of a normal population is:

$[-\infty,\ \overline{X}\ +\ t_{\alpha,n-1} \frac{S}{\sqrt{n}}]$

**C. One-sided Upper Confidence Interval:**

Suppose that $X_1, X_2, ..., X_n$ is a sample of size $n$ from a normal distribution having an unknown mean $\mu$ and an unknown variance $\sigma^2$.

$X_1, X_2, ..., X_n \sim N( \mu, \sigma^2)$

$\\ $

Significance level = $\alpha$

$\\ $

$P(-\infty\ \leq\ \frac{\overline{X}-\mu}{\frac{S}{\sqrt{n}}}\ \leq\ t_{\alpha,n-1}) = 1-\alpha$

$P(\overline{X}\ -\ t_{\alpha,n-1} \frac{S}{\sqrt{n}} \leq\ \mu\ \leq\ \infty) = 1-\alpha$

$\\ $

Therefore, the $1-\alpha$ confidence interval for the mean of a normal population is:

$[\overline{X}\ -\ t_{\alpha,n-1} \frac{S}{\sqrt{n}},\ \infty]$
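One way to see why the $t$ quantile replaces the normal quantile once $\sigma$ is estimated by $S$ is a small simulation. The sketch below draws many samples of size 5 from a normal population and counts how often each two-sided interval contains the true mean. All parameter values (mean 2, standard deviation 3, 20,000 replications, seed 0) are arbitrary choices made for illustration.

```python
import numpy as np
from scipy.stats import norm, t

rng = np.random.default_rng(0)
mu, sigma, n, reps, c_level = 2.0, 3.0, 5, 20000, 0.95
alpha = 1 - c_level

# Each row is one simulated sample of size n
samples = rng.normal(mu, sigma, size=(reps, n))
xbar = samples.mean(axis=1)
se = samples.std(axis=1, ddof=1) / np.sqrt(n)  # S / sqrt(n) per sample

t_crit = t.isf(alpha / 2, n - 1)  # t_{alpha/2, n-1}
z_crit = norm.isf(alpha / 2)      # Z_{alpha/2}

# Fraction of intervals X_bar +/- crit * se that contain the true mean
cover_t = np.mean(np.abs(xbar - mu) <= t_crit * se)
cover_z = np.mean(np.abs(xbar - mu) <= z_crit * se)
print(cover_t, cover_z)
```

The $t$-based coverage should land near the nominal 0.95, while plugging $S$ into the normal formula gives intervals that are too narrow for such a small sample and therefore cover less often.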

In [None]:
class confidence_interval_for_mean_with_unknown_variance:
    """
    Parameters
    ----------
    c_level : confidence level as a fraction (e.g. 0.95)
    type_c : 'two_sided_confidence', 'lower_confidence', 'upper_confidence'
    Sample_std : optional, std of the sample
    Sample_mean : optional, mean of the sample
    n : optional, number of sample members
    data : optional, if you do not know Sample_mean, Sample_std and n, just pass the data
    """
    def __init__(self, c_level, type_c, Sample_std = 0., Sample_mean = 0., n = 0., data=None):
        self.Sample_mean = Sample_mean
        self.Sample_std = Sample_std
        self.type_c = type_c
        self.n = n
        self.c_level = c_level
        self.data = data
        if data is not None:
            # Derive the sample statistics from the raw data
            self.Sample_mean = np.mean(list(data))
            self.Sample_std = np.std(list(data), ddof=1)
            self.n = len(list(data))

        self.__test()

    def __test(self):
        alpha = 1 - self.c_level
        se = self.Sample_std / np.sqrt(self.n)  # estimated standard error of the mean
        if self.type_c == 'two_sided_confidence':
            t_crit = t.isf(alpha / 2, self.n - 1)  # t_{alpha/2, n-1}
            c_u = self.Sample_mean + t_crit * se
            c_l = self.Sample_mean - t_crit * se
            display(Latex(rf'${c_l} \leq \mu \leq {c_u}$'))
        elif self.type_c == 'lower_confidence':
            c_u = self.Sample_mean + t.isf(alpha, self.n - 1) * se
            display(Latex(rf'$\mu \leq {c_u}$'))
        elif self.type_c == 'upper_confidence':
            c_l = self.Sample_mean - t.isf(alpha, self.n - 1) * se
            display(Latex(rf'${c_l} \leq \mu$'))

In [None]:
data = [5, 8.5, 12, 15, 7, 9, 7.5, 6.5, 10.5]
confidence_interval_for_mean_with_unknown_variance(c_level = 0.95, type_c = 'two_sided_confidence', data=data);





## **3.2. Confidence Interval for the Variance of a Normal Population:**



### **3.2.1. Unknown Mean of the Population:**

**A. Two-sided Confidence Interval:**

Suppose that $X_1, X_2, ..., X_n$ is a sample of size $n$ from a normal distribution having an unknown mean $\mu$ and an unknown variance $\sigma^2$.

$X_1, X_2, ..., X_n \sim N( \mu, \sigma^2)$

$\\ $

Significance level = $\alpha$

$\ \chi^2_{1-\frac{\alpha}{2}, n-1}\ \leq\ \frac{(n-1)\ S^2}{\sigma^2} \ \leq \chi^2_{\frac{\alpha}{2}, n-1} $

$\ \frac{(n-1)\ S^2}{\chi^2_{\frac{\alpha}{2}, n-1}} \leq\ \sigma^2 \leq\ \frac{(n-1)\ S^2}{\chi^2_{1-\frac{\alpha}{2}, n-1}}$

$\ \sqrt{\frac{(n-1)\ S^2}{\chi^2_{\frac{\alpha}{2}, n-1}}} \leq\ \sigma \leq\ \sqrt{\frac{(n-1)\ S^2}{\chi^2_{1-\frac{\alpha}{2}, n-1}}}$

$\\ $

Therefore, the $1-\alpha$ confidence interval for the variance of a normal population is:

$[\frac{(n-1)\ S^2}{\chi^2_{\frac{\alpha}{2}, n-1}},\ \frac{(n-1)\ S^2}{\chi^2_{1-\frac{\alpha}{2}, n-1}}]$

and the $1-\alpha$ confidence interval for the standard deviation of a normal population is:

$[\sqrt{\frac{(n-1)\ S^2}{\chi^2_{\frac{\alpha}{2}, n-1}}},\ \sqrt{\frac{(n-1)\ S^2}{\chi^2_{1-\frac{\alpha}{2}, n-1}}}]$
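The variance interval is not symmetric around $S^2$, because the chi-square distribution is skewed. The sketch below evaluates the two-sided formula; the function name `variance_interval` is illustrative, and the data are the ten measurements used in the example cell of this section.

```python
import numpy as np
from scipy.stats import chi2

def variance_interval(data, c_level=0.90):
    """Two-sided confidence interval for sigma^2 when the mean is unknown."""
    x = np.asarray(data, dtype=float)
    n = x.size
    s2 = x.var(ddof=1)  # sample variance S^2
    alpha = 1 - c_level
    # chi2.isf(p, df) is the point with upper-tail probability p, i.e. chi^2_{p, df}
    lower = (n - 1) * s2 / chi2.isf(alpha / 2, n - 1)
    upper = (n - 1) * s2 / chi2.isf(1 - alpha / 2, n - 1)
    return lower, upper

data = [.123, .133, .124, .125, .126, .128, .120, .124, .130, .126]
lower, upper = variance_interval(data, 0.90)
print(lower, upper)                    # interval for sigma^2
print(np.sqrt(lower), np.sqrt(upper))  # interval for sigma
```

Taking square roots of both endpoints gives the interval for $\sigma$, since the square root is monotone on non-negative values.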

**B. One-sided Lower Confidence Interval:**

Suppose that $X_1, X_2, ..., X_n$ is a sample of size $n$ from a normal distribution having an unknown mean $\mu$ and an unknown variance $\sigma^2$.

$X_1, X_2, ..., X_n \sim N( \mu, \sigma^2)$

$\\ $

Significance level = $\alpha$

$\ 0 \leq\ \sigma^2 \leq\ \frac{(n-1)\ S^2}{\chi^2_{1-\alpha, n-1}}$

$\ 0 \leq\ \sigma \leq\ \sqrt{\frac{(n-1)\ S^2}{\chi^2_{1-\alpha, n-1}}}$

$\\ $

Therefore, the $1-\alpha$ confidence interval for the variance of a normal population is:

$[0,\ \frac{(n-1)\ S^2}{\chi^2_{1-\alpha, n-1}}]$

and the $1-\alpha$ confidence interval for the standard deviation of a normal population is:

$[0,\ \sqrt{\frac{(n-1)\ S^2}{\chi^2_{1-\alpha, n-1}}}]$

**C. One-sided Upper Confidence Interval:**

Suppose that $X_1, X_2, ..., X_n$ is a sample of size $n$ from a normal distribution having an unknown mean $\mu$ and an unknown variance $\sigma^2$.

$X_1, X_2, ..., X_n \sim N( \mu, \sigma^2)$

$\\ $

Significance level = $\alpha$

$\ \frac{(n-1)\ S^2}{\chi^2_{\alpha, n-1}} \leq\ \sigma^2 \leq\ \infty$

$\ \sqrt{\frac{(n-1)\ S^2}{\chi^2_{\alpha, n-1}}} \leq\ \sigma \leq\ \infty$

$\\ $

Therefore, the $1-\alpha$ confidence interval for the variance of a normal population is:

$[\frac{(n-1)\ S^2}{\chi^2_{\alpha, n-1}},\ \infty]$

and the $1-\alpha$ confidence interval for the standard deviation of a normal population is:

$[\sqrt{\frac{(n-1)\ S^2}{\chi^2_{\alpha, n-1}}},\ \infty]$

In [None]:
class confidence_interval_for_var_with_unknown_mean:
    """
    Parameters
    ----------
    c_level : confidence level as a fraction (e.g. 0.9)
    type_c : 'two_sided_confidence', 'lower_confidence', 'upper_confidence'
    Sample_var : optional, variance of the sample
    n : optional, number of sample members
    data : optional, if you do not know the Sample_var and n, just pass the data
    """
    def __init__(self, c_level, type_c, Sample_var = 0., n = 0., data=None):
        self.type_c = type_c
        self.n = n
        self.Sample_var = Sample_var
        self.c_level = c_level
        self.data = data
        if data is not None:
            # Derive the sample statistics from the raw data
            self.n = len(list(data))
            self.Sample_var = np.var(list(data), ddof=1)

        self.__test()

    def __test(self):
        alpha = 1 - self.c_level
        df = self.n - 1
        scaled = df * self.Sample_var  # (n-1) * S^2
        if self.type_c == 'two_sided_confidence':
            c_u = scaled / chi2.isf(1 - alpha / 2, df)
            c_l = scaled / chi2.isf(alpha / 2, df)
            display(Latex(rf'${c_l} \leq \sigma^2 \leq {c_u}$'))
            display(Latex(rf'${np.sqrt(c_l)} \leq \sigma \leq {np.sqrt(c_u)}$'))
        elif self.type_c == 'lower_confidence':
            c_u = scaled / chi2.isf(1 - alpha, df)
            display(Latex(rf'$0 \leq \sigma^2 \leq {c_u}$'))
            display(Latex(rf'$0 \leq \sigma \leq {np.sqrt(c_u)}$'))
        elif self.type_c == 'upper_confidence':
            c_l = scaled / chi2.isf(alpha, df)
            display(Latex(rf'${c_l} \leq \sigma^2$'))
            display(Latex(rf'${np.sqrt(c_l)} \leq \sigma$'))

In [None]:
data = [.123, .133, .124, .125, .126, .128, .120, .124, .130, .126]
confidence_interval_for_var_with_unknown_mean(c_level = 0.9, type_c = 'two_sided_confidence', data=data);







### **3.2.2. Known Mean of the Population:**

**A. Two-sided Confidence Interval:**

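Suppose that $X_1, X_2, ..., X_n$ is a sample of size $n$ from a normal distribution having a known mean $\mu$ and an unknown variance $\sigma^2$.

$X_1, X_2, ..., X_n \sim N( \mu, \sigma^2)$

$\\ $

Because $\mu$ is known, deviations are taken from $\mu$ rather than from $\overline{x}$, so $\frac{n\ S'^2}{\sigma^2}$ follows a chi-square distribution with $n$ degrees of freedom:

$S' = \sqrt{\frac{\sum_{i=1}^n\ (x_i\ -\ \mu)^2}{n}}$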

$\\ $

Significance level = $\alpha$

$\ \chi^2_{1-\frac{\alpha}{2}, n}\ \leq\ \frac{(n)\ S'^2}{\sigma^2} \ \leq \chi^2_{\frac{\alpha}{2}, n} $

$\ \frac{(n)\ S'^2}{\chi^2_{\frac{\alpha}{2}, n}} \leq\ \sigma^2 \leq\ \frac{(n)\ S'^2}{\chi^2_{1-\frac{\alpha}{2}, n}}$

$\ \sqrt{\frac{(n)\ S'^2}{\chi^2_{\frac{\alpha}{2}, n}}} \leq\ \sigma \leq\ \sqrt{\frac{(n)\ S'^2}{\chi^2_{1-\frac{\alpha}{2}, n}}}$

$\\ $

Therefore, the $1-\alpha$ confidence interval for the variance of a normal population is:

$[\frac{(n)\ S'^2}{\chi^2_{\frac{\alpha}{2}, n}},\ \frac{(n)\ S'^2}{\chi^2_{1-\frac{\alpha}{2}, n}}]$

and the $1-\alpha$ confidence interval for the standard deviation of a normal population is:

$[\sqrt{\frac{(n)\ S'^2}{\chi^2_{\frac{\alpha}{2}, n}}},\ \sqrt{\frac{(n)\ S'^2}{\chi^2_{1-\frac{\alpha}{2}, n}}}]$
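When the population mean is known, the squared deviations are measured from $\mu$ itself and the chi-square quantiles use $n$ degrees of freedom. The sketch below follows that reading; the function name, the simulated sample and the "known" mean of 10 are all assumptions made for illustration.

```python
import numpy as np
from scipy.stats import chi2

def variance_interval_known_mean(data, population_mean, c_level=0.90):
    """Two-sided confidence interval for sigma^2 when the mean mu is known."""
    x = np.asarray(data, dtype=float)
    n = x.size
    sum_sq = np.sum((x - population_mean) ** 2)  # equals n * S'^2
    alpha = 1 - c_level
    lower = sum_sq / chi2.isf(alpha / 2, n)      # divide by chi^2_{alpha/2, n}
    upper = sum_sq / chi2.isf(1 - alpha / 2, n)  # divide by chi^2_{1-alpha/2, n}
    return lower, upper

# Simulated sample with a mean that is treated as known (hypothetical values)
rng = np.random.default_rng(1)
x = rng.normal(loc=10.0, scale=2.0, size=30)
lower, upper = variance_interval_known_mean(x, population_mean=10.0, c_level=0.90)
print(lower, upper)
```

Compared with the unknown-mean case, one extra degree of freedom is available because no information is spent on estimating $\mu$.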

**B. One-sided Lower Confidence Interval:**

Suppose that $X_1, X_2, ..., X_n$ is a sample of size $n$ from a normal distribution having a known mean $\mu$ and an unknown variance $\sigma^2$.

$X_1, X_2, ..., X_n \sim N( \mu, \sigma^2)$

$\\ $

Significance level = $\alpha$

$\ \chi^2_{1-\alpha, n}\ \leq\ \frac{(n)\ S'^2}{\sigma^2} \ \leq \infty $

$\ 0 \leq\ \sigma^2 \leq\ \frac{(n)\ S'^2}{\chi^2_{1-\alpha, n}}$

$\ 0 \leq\ \sigma \leq\ \sqrt{\frac{(n)\ S'^2}{\chi^2_{1-\alpha, n}}}$

$\\ $

Therefore, the $1-\alpha$ confidence interval for the variance of a normal population is:

$[0,\ \frac{(n)\ S'^2}{\chi^2_{1-\alpha, n}}]$

and the $1-\alpha$ confidence interval for the standard deviation of a normal population is:

$[0,\ \sqrt{\frac{(n)\ S'^2}{\chi^2_{1-\alpha, n}}}]$

**C. One-sided Upper Confidence Interval:**

Suppose that $X_1, X_2, ..., X_n$ is a sample of size $n$ from a normal distribution having a known mean $\mu$ and an unknown variance $\sigma^2$.

$X_1, X_2, ..., X_n \sim N( \mu, \sigma^2)$

$\\ $

Significance level = $\alpha$

$\ 0\ \leq\ \frac{(n)\ S'^2}{\sigma^2} \ \leq \chi^2_{\alpha, n} $

$\ \frac{(n)\ S'^2}{\chi^2_{\alpha, n}} \leq\ \sigma^2 \leq\ \infty$

$\ \sqrt{\frac{(n)\ S'^2}{\chi^2_{\alpha, n}}} \leq\ \sigma \leq\ \infty$

$\\ $

Therefore, the $1-\alpha$ confidence interval for the variance of a normal population is:

$[\frac{(n)\ S'^2}{\chi^2_{\alpha, n}},\ \infty]$

and the $1-\alpha$ confidence interval for the standard deviation of a normal population is:

$[\sqrt{\frac{(n)\ S'^2}{\chi^2_{\alpha, n}}},\ \infty]$

In [None]:
class confidence_interval_for_var_with_known_mean:
    """
    Parameters
    ----------
    c_level : confidence level as a fraction (e.g. 0.9)
    type_c : 'two_sided_confidence', 'lower_confidence', 'upper_confidence'
    Sample_var : optional, mean squared deviation from the known mean (S'^2)
    n : optional, number of sample members
    data : optional, if you do not know the Sample_var and n, just pass the data
    population_mean : optional, known mean of the population (used with data)
    """
    def __init__(self, c_level, type_c, Sample_var = 0., n = 0., data=None, population_mean=None):
        self.type_c = type_c
        self.n = n
        self.Sample_var = Sample_var
        self.c_level = c_level
        self.data = data
        if data is not None:
            x = np.asarray(list(data), dtype=float)
            self.n = len(x)
            # S'^2 averages squared deviations from the known mean. If
            # population_mean is omitted, the sample mean is used instead
            # (strictly, n-1 degrees of freedom would then apply).
            center = x.mean() if population_mean is None else population_mean
            self.Sample_var = np.mean((x - center) ** 2)

        self.__test()

    def __test(self):
        alpha = 1 - self.c_level
        scaled = self.n * self.Sample_var  # n * S'^2
        if self.type_c == 'two_sided_confidence':
            c_u = scaled / chi2.isf(1 - alpha / 2, self.n)
            c_l = scaled / chi2.isf(alpha / 2, self.n)
            display(Latex(rf'${c_l} \leq \sigma^2 \leq {c_u}$'))
            display(Latex(rf'${np.sqrt(c_l)} \leq \sigma \leq {np.sqrt(c_u)}$'))
        elif self.type_c == 'lower_confidence':
            c_u = scaled / chi2.isf(1 - alpha, self.n)
            display(Latex(rf'$0 \leq \sigma^2 \leq {c_u}$'))
            display(Latex(rf'$0 \leq \sigma \leq {np.sqrt(c_u)}$'))
        elif self.type_c == 'upper_confidence':
            c_l = scaled / chi2.isf(alpha, self.n)
            display(Latex(rf'${c_l} \leq \sigma^2$'))
            display(Latex(rf'${np.sqrt(c_l)} \leq \sigma$'))

In [None]:
data = [.123, .133, .124, .125, .126, .128, .120, .124, .130, .126]
confidence_interval_for_var_with_known_mean(c_level = 0.9, type_c = 'two_sided_confidence', data=data);







## **3.3. Confidence Interval for the Difference in Means of Two Normal Populations:**



### **3.3.1. Known Variances:**

**A. Two-sided Confidence Interval:**

Suppose that $X_1, X_2, ..., X_{n_x}$ is a sample of size $n_x$ from a normal distribution having an unknown mean $\mu_x$ and a known variance $\sigma^2_x$.

Suppose also that $Y_1, Y_2, ..., Y_{n_y}$ is a sample of size $n_y$ from a normal distribution having an unknown mean $\mu_y$ and a known variance $\sigma^2_y$.

$X_1, X_2, ..., X_{n_x} \sim N( \mu_x, \sigma^2_x)$

$Y_1, Y_2, ..., Y_{n_y} \sim N( \mu_y, \sigma^2_y)$

$\\ $

Significance level = $\alpha$

$P(-\ Z_{\frac{\alpha}{2}}\ \leq\ \frac{\overline{X}-\overline{Y} -\ (\mu_x - \mu_y)}{\sqrt{(\frac{\sigma^2_x}{n_x}) + (\frac{\sigma^2_y}{n_y})}}\ \leq\ Z_{\frac{\alpha}{2}}) = 1- \alpha$

$P(\overline{X} - \overline{Y}-\ Z_{\frac{\alpha}{2}} {\sqrt{(\frac{\sigma^2_x}{n_x}) + (\frac{\sigma^2_y}{n_y})}}\ \leq\ \mu_x\ - \mu_y\ \leq\ \overline{X} - \overline{Y}+\ Z_{\frac{\alpha}{2}} {\sqrt{(\frac{\sigma^2_x}{n_x}) + (\frac{\sigma^2_y}{n_y})}}) = 1- \alpha$

$\\ $

Therefore, the $1-\alpha$ confidence interval for the difference in means of two normal populations is:

$[\overline{X} - \overline{Y}-\ Z_{\frac{\alpha}{2}} {\sqrt{(\frac{\sigma^2_x}{n_x}) + (\frac{\sigma^2_y}{n_y})}},\ \overline{X} - \overline{Y}+\ Z_{\frac{\alpha}{2}} {\sqrt{(\frac{\sigma^2_x}{n_x}) + (\frac{\sigma^2_y}{n_y})}}]$
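The two-sample formula only needs the two sample means, the two known variances and the sample sizes. A minimal sketch follows, with an illustrative function name; the data and the variances (40 and 100) are the ones used in the example cell of this section.

```python
import numpy as np
from scipy.stats import norm

def two_sample_z_interval(mean_x, mean_y, var_x, var_y, n_x, n_y, c_level=0.95):
    """Two-sided confidence interval for mu_x - mu_y with known variances."""
    alpha = 1 - c_level
    se = np.sqrt(var_x / n_x + var_y / n_y)  # std. dev. of X_bar - Y_bar
    margin = norm.isf(alpha / 2) * se        # Z_{alpha/2} * se
    diff = mean_x - mean_y
    return diff - margin, diff + margin

data1 = [36, 44, 41, 53, 38, 36, 34, 54, 52, 37, 51, 44, 35, 44]
data2 = [52, 64, 38, 68, 66, 52, 60, 44, 48, 46, 70, 62]
lower, upper = two_sample_z_interval(np.mean(data1), np.mean(data2),
                                     var_x=40, var_y=100,
                                     n_x=len(data1), n_y=len(data2))
print(lower, upper)
```

Because both endpoints are negative here, the interval does not contain zero, which suggests $\mu_x < \mu_y$ at this confidence level.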

**B. One-sided Lower Confidence Interval:**

Suppose that $X_1, X_2, ..., X_{n_x}$ is a sample of size $n_x$ from a normal distribution having an unknown mean $\mu_x$ and a known variance $\sigma^2_x$.

Suppose also that $Y_1, Y_2, ..., Y_{n_y}$ is a sample of size $n_y$ from a normal distribution having an unknown mean $\mu_y$ and a known variance $\sigma^2_y$.

$X_1, X_2, ..., X_{n_x} \sim N( \mu_x, \sigma^2_x)$

$Y_1, Y_2, ..., Y_{n_y} \sim N( \mu_y, \sigma^2_y)$

$\\ $

Significance level = $\alpha$

$P(-\ Z_{\alpha}\ \leq\ \frac{\overline{X}-\overline{Y} -\ (\mu_x - \mu_y)}{\sqrt{(\frac{\sigma^2_x}{n_x}) + (\frac{\sigma^2_y}{n_y})}}\ \leq\ \infty) = 1- \alpha$

$P(-\ \infty\ \leq\ \mu_x\ - \mu_y\ \leq\ \overline{X} - \overline{Y}+\ Z_{\alpha} {\sqrt{(\frac{\sigma^2_x}{n_x}) + (\frac{\sigma^2_y}{n_y})}}\ ) = 1- \alpha$

$\\ $

Therefore, the $1-\alpha$ confidence interval for the difference in means of two normal populations is:

$[-\infty,\ \overline{X} - \overline{Y}+\ Z_{\alpha} {\sqrt{(\frac{\sigma^2_x}{n_x}) + (\frac{\sigma^2_y}{n_y})}}]$

**C. One-sided Upper Confidence Interval:**

Suppose that $X_1, X_2, ..., X_{n_x}$ is a sample of size $n_x$ from a normal distribution having an unknown mean $\mu_x$ and a known variance $\sigma^2_x$.

Suppose also that $Y_1, Y_2, ..., Y_{n_y}$ is a sample of size $n_y$ from a normal distribution having an unknown mean $\mu_y$ and a known variance $\sigma^2_y$.

$X_1, X_2, ..., X_{n_x} \sim N( \mu_x, \sigma^2_x)$

$Y_1, Y_2, ..., Y_{n_y} \sim N( \mu_y, \sigma^2_y)$

$\\ $

Significance level = $\alpha$

$P(-\ \infty\ \leq\ \frac{\overline{X}-\overline{Y} -\ (\mu_x - \mu_y)}{\sqrt{(\frac{\sigma^2_x}{n_x}) + (\frac{\sigma^2_y}{n_y})}}\ \leq\ Z_{\alpha}) = 1- \alpha$

$P(\overline{X} - \overline{Y}-\ Z_{\alpha} {\sqrt{(\frac{\sigma^2_x}{n_x}) + (\frac{\sigma^2_y}{n_y})}}\ \leq\ \mu_x\ - \mu_y\ \leq\ \infty) = 1- \alpha$

$\\ $

Therefore, the $1-\alpha$ confidence interval for the difference in means of two normal populations is:

$[\overline{X} - \overline{Y}-\ Z_{\alpha} {\sqrt{(\frac{\sigma^2_x}{n_x}) + (\frac{\sigma^2_y}{n_y})}},\ \infty]$

In [None]:
class confidence_interval_for_two_mean_with_known_variances:
    """
    Parameters
    ----------
    population_sd1 : known standard deviation of population 1
    population_sd2 : known standard deviation of population 2
    c_level : confidence level as a fraction (e.g. 0.95)
    type_c : 'two_sided_confidence', 'lower_confidence', 'upper_confidence'
    Sample_mean1 : optional, mean of sample 1
    Sample_mean2 : optional, mean of sample 2
    n1 : optional, number of sample 1 members
    n2 : optional, number of sample 2 members
    data1, data2 : optional, if you do not know the sample means and sizes, just pass the data
    """
    def __init__(self, population_sd1, population_sd2, c_level, type_c, Sample_mean1 = 0., Sample_mean2 = 0., n1 = 0., n2 = 0., data1=None, data2=None):
        self.Sample_mean1 = Sample_mean1
        self.Sample_mean2 = Sample_mean2
        self.population_sd1 = population_sd1
        self.population_sd2 = population_sd2
        self.type_c = type_c
        self.n1 = n1
        self.n2 = n2
        self.c_level = c_level
        self.data1 = data1
        self.data2 = data2
        if data1 is not None:
            self.Sample_mean1 = np.mean(list(data1))
            self.n1 = len(list(data1))
        if data2 is not None:
            self.Sample_mean2 = np.mean(list(data2))
            self.n2 = len(list(data2))

        self.__test()

    def __test(self):
        alpha = 1 - self.c_level
        diff = self.Sample_mean1 - self.Sample_mean2
        # Standard deviation of the difference of the two sample means
        se = np.sqrt(self.population_sd1**2/self.n1 + self.population_sd2**2/self.n2)
        if self.type_c == 'two_sided_confidence':
            z = norm.isf(alpha / 2)  # Z_{alpha/2}
            c_u = diff + z * se
            c_l = diff - z * se
            display(Latex(rf'${c_l} \leq \mu_x - \mu_y \leq {c_u}$'))
        elif self.type_c == 'lower_confidence':
            c_u = diff + norm.isf(alpha) * se
            display(Latex(rf'$\mu_x - \mu_y \leq {c_u}$'))
        elif self.type_c == 'upper_confidence':
            c_l = diff - norm.isf(alpha) * se
            display(Latex(rf'${c_l} \leq \mu_x - \mu_y$'))

In [None]:
data1 = [36,44,41,53,38,36,34,54,52,37,51,44,35,44]
data2 = [52,64,38,68,66,52,60,44,48,46,70,62]
confidence_interval_for_two_mean_with_known_variances(population_sd1 = np.sqrt(40), population_sd2 = 10, c_level = 0.95, 
 type_c = 'two_sided_confidence', data1=data1, data2=data2);





### **3.3.2. Unknown but Equal Variances:**

**A. Two-sided Confidence Interval:**

Suppose that $X_1, X_2, ..., X_{n_x}$ is a sample of size $n_x$ from a normal distribution having an unknown mean $\mu_x$ and an unknown variance $\sigma^2_x$.

Suppose also that $Y_1, Y_2, ..., Y_{n_y}$ is a sample of size $n_y$ from a normal distribution having an unknown mean $\mu_y$ and an unknown variance $\sigma^2_y$, with the two variances assumed equal ($\sigma^2_x = \sigma^2_y$).

$X_1, X_2, ..., X_{n_x} \sim N( \mu_x, \sigma^2_x)$

$Y_1, Y_2, ..., Y_{n_y} \sim N( \mu_y, \sigma^2_y)$

$\\ $

$S_p^2 = \frac{(n_x-1)S_x^2 + (n_y-1)S_y^2}{n_x+n_y-2}$

$\\ $

Significance level = $\alpha$

$P(-\ t_{\frac{\alpha}{2}, n_x+n_y-2}\ \leq\ \frac{\overline{X}-\overline{Y} -\ (\mu_x - \mu_y)}{{{S_p} \sqrt{\frac{1}{n_x} + \frac{1}{n_y}}}}\ \leq\ t_{\frac{\alpha}{2}, n_x+n_y-2}) = 1- \alpha$

$P(\overline{X} - \overline{Y}-\ t_{\frac{\alpha}{2}, n_x+n_y-2}\ {S_p} \sqrt{\frac{1}{n_x} + \frac{1}{n_y}}\ \leq\ \mu_x\ - \mu_y\ \leq\ \overline{X} - \overline{Y}+\ t_{\frac{\alpha}{2}, n_x+n_y-2}\ {S_p} \sqrt{\frac{1}{n_x} + \frac{1}{n_y}}) = 1- \alpha$

$\\ $

Therefore, the $1-\alpha$ confidence interval for the difference in means of two normal populations is:

$[\overline{X} - \overline{Y}-\ t_{\frac{\alpha}{2}, n_x+n_y-2}\ {S_p} \sqrt{\frac{1}{n_x} + \frac{1}{n_y}},\ \overline{X} - \overline{Y}+\ t_{\frac{\alpha}{2}, n_x+n_y-2}\ {S_p} \sqrt{\frac{1}{n_x} + \frac{1}{n_y}}]$
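The pooled interval combines both sample variances into $S_p^2$ and uses a $t$ quantile with $n_x + n_y - 2$ degrees of freedom. A minimal sketch follows, assuming equal population variances; the function name is illustrative and the data are the two samples used in the example cell of this section.

```python
import numpy as np
from scipy.stats import t

def pooled_t_interval(data_x, data_y, c_level=0.90):
    """Two-sided CI for mu_x - mu_y, assuming equal unknown variances."""
    x = np.asarray(data_x, dtype=float)
    y = np.asarray(data_y, dtype=float)
    n_x, n_y = x.size, y.size
    df = n_x + n_y - 2
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((n_x - 1) * x.var(ddof=1) + (n_y - 1) * y.var(ddof=1)) / df
    se = np.sqrt(sp2) * np.sqrt(1 / n_x + 1 / n_y)
    margin = t.isf((1 - c_level) / 2, df) * se  # t_{alpha/2, df} * se
    diff = x.mean() - y.mean()
    return diff - margin, diff + margin

data1 = [140, 136, 138, 150, 152, 144, 132, 142, 150, 154, 136, 142]
data2 = [144, 132, 136, 140, 128, 150, 130, 134, 130, 146, 128, 131, 137, 135]
lower, upper = pooled_t_interval(data1, data2, c_level=0.90)
print(lower, upper)
```

If the equal-variance assumption is doubtful, this interval can be misleading; the pooled form is shown here only because it is the case treated in this subsection.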

**B. One-sided Lower Confidence Interval:**

Suppose that $X_1, X_2, ..., X_{n_x}$ is a sample of size $n_x$ from a normal distribution having an unknown mean $\mu_x$ and an unknown variance $\sigma^2_x$.

Suppose also that $Y_1, Y_2, ..., Y_{n_y}$ is a sample of size $n_y$ from a normal distribution having an unknown mean $\mu_y$ and an unknown variance $\sigma^2_y$, with the two variances assumed equal ($\sigma^2_x = \sigma^2_y$).

$X_1, X_2, ..., X_{n_x} \sim N( \mu_x, \sigma^2_x)$

$Y_1, Y_2, ..., Y_{n_y} \sim N( \mu_y, \sigma^2_y)$

$\\ $

$S_p^2 = \frac{(n_x-1)S_x^2 + (n_y-1)S_y^2}{n_x+n_y-2}$

$\\ $

Significance level = $\alpha$

$P(-\ t_{\alpha, n_x+n_y-2}\ \leq\ \frac{\overline{X}-\overline{Y} -\ (\mu_x - \mu_y)}{{{S_p} \sqrt{\frac{1}{n_x} + \frac{1}{n_y}}}}\ \leq\ \infty) = 1- \alpha$

$P(-\ \infty\ \leq\ \mu_x\ - \mu_y\ \leq\ \overline{X} - \overline{Y}+\ t_{\alpha, n_x+n_y-2}\ {S_p} \sqrt{\frac{1}{n_x} + \frac{1}{n_y}}) = 1- \alpha$

$\\ $

Therefore, the $1-\alpha$ confidence interval for the difference in means of two normal populations is:

$[-\infty,\ \overline{X} - \overline{Y}+\ t_{\alpha, n_x+n_y-2}\ {S_p} \sqrt{\frac{1}{n_x} + \frac{1}{n_y}}]$

**C. One-sided Upper Confidence Interval:**

Suppose that $X_1, X_2, ..., X_{n_x}$ is a sample of size $n_x$ from a normal distribution having an unknown mean $\mu_x$ and an unknown variance $\sigma^2_x$.

Suppose also that $Y_1, Y_2, ..., Y_{n_y}$ is a sample of size $n_y$ from a normal distribution having an unknown mean $\mu_y$ and an unknown variance $\sigma^2_y$, with the two variances assumed equal ($\sigma^2_x = \sigma^2_y$).

$X_1, X_2, ..., X_{n_x} \sim N( \mu_x, \sigma^2_x)$

$Y_1, Y_2, ..., Y_{n_y} \sim N( \mu_y, \sigma^2_y)$

$\\ $

$S_p^2 = \frac{(n_x-1)S_x^2 + (n_y-1)S_y^2}{n_x+n_y-2}$

$\\ $

Significance level = $\alpha$

$P(-\ \infty\ \leq\ \frac{\overline{X}-\overline{Y} -\ (\mu_x - \mu_y)}{{{S_p} \sqrt{\frac{1}{n_x} + \frac{1}{n_y}}}}\ \leq\ t_{\alpha, n_x+n_y-2}) = 1- \alpha$

$P(\overline{X} - \overline{Y}-\ t_{\alpha, n_x+n_y-2}\ {S_p} \sqrt{\frac{1}{n_x} + \frac{1}{n_y}}\ \leq\ \mu_x\ - \mu_y\ \leq\ \infty) = 1- \alpha$

$\\ $

Therefore, the $1-\alpha$ confidence interval for the difference in means of two normal populations is:

$[\overline{X} - \overline{Y}-\ t_{\alpha, n_x+n_y-2}\ {S_p} \sqrt{\frac{1}{n_x} + \frac{1}{n_y}}\ ,\ \infty]$

In [None]:
class confidence_interval_for_two_mean_with_unknown_variances:
 """
 Parameters
 ----------
 n1 : optional, number of sample1 members
 n2 : optional, number of sample2 members
 c_level : % confidence level
 type_t : 'two_sided_confidence', 'lower_confidence', 'upper_confidence'
 Sample_mean1 : optional, mean of the sample1
 Sample_mean2 : optional, mean of the sample2
 S1 : optional, std of the sample1
 S2 : optional, std of the sample2
 data : optional, if you do not know the Sample_mean and n, just pass the data
 """
 def __init__(self, c_level, type_c, Sample_mean1 = 0., S1 = 0., S2 = 0., Sample_mean2 = 0., n1 = 0., n2 = 0., data1=None, data2=None):
 self.Sample_mean1 = Sample_mean1
 self.Sample_mean2 = Sample_mean2
 self.S1 = S1
 self.S2 = S2
 self.type_c = type_c
 self.n1 = n1
 self.n2 = n2
 self.c_level = c_level
 self.data1 = data1
 self.data2 = data2
 if data1 is not None:
 self.Sample_mean1 = np.mean(list(data1))
 self.n1 = len(list(data1))
 self.S1 = np.std(list(data1), ddof = 1)
 if data2 is not None:
 self.Sample_mean2 = np.mean(list(data2))
 self.n2 = len(list(data2)) 
 self.S2 = np.std(list(data2), ddof = 1)
 
 self.SP2 = ((self.n1-1)*(self.S1**2) + (self.n2-1)*(self.S2**2)) / (self.n1+self.n2-2)

 confidence_interval_for_two_mean_with_unknown_variances.__test(self)
 
 def __test(self):
 if self.type_c == 'two_sided_confidence':
 alpha = 1-self.c_level
 c_u = self.Sample_mean1 - self.Sample_mean2 + (t.isf(alpha/2, df = self.n1+self.n2-2)) * (np.sqrt(self.SP2)*np.sqrt(1/self.n1+1/self.n2))
 c_l = self.Sample_mean1 - self.Sample_mean2 - (t.isf(alpha/2, df = self.n1+self.n2-2)) * (np.sqrt(self.SP2)*np.sqrt(1/self.n1+1/self.n2))
 display(Latex(f'${c_l} \leq \mu_x - \mu_y \leq {c_u}$'))
 elif self.type_c == 'lower_confidence':
 alpha = 1-self.c_level
 c_u = self.Sample_mean1 - self.Sample_mean2 + (t.isf(alpha, df = self.n1+self.n2-2)) * (np.sqrt(self.SP2)*np.sqrt(1/self.n1+1/self.n2))
 display(Latex(f'$\mu_x - \mu_y \leq {c_u}$'))
 elif self.type_c == 'upper_confidence':
 alpha = 1-self.c_level
 c_l = self.Sample_mean1 - self.Sample_mean2 - (t.isf(alpha, df = self.n1+self.n2-2)) * (np.sqrt(self.SP2)*np.sqrt(1/self.n1+1/self.n2))
 display(Latex(rf'${c_l} \leq \mu_x - \mu_y$'))

In [None]:
data1 = [140,136,138,150,152,144,132,142,150,154,136,142]
data2 = [144,132,136,140,128,150,130,134,130,146,128,131,137,135]
confidence_interval_for_two_mean_with_unknown_variances(c_level = 0.9, type_c = 'two_sided_confidence', data1=data1, data2=data2);
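As a cross-check on the class above, the same pooled-variance interval can be computed with a plain function that returns the bounds as numbers instead of displaying them. This is only a minimal sketch: the name `pooled_t_interval` is hypothetical and not part of the notebook, and it assumes both samples are passed as raw data.

```python
import numpy as np
from scipy.stats import t

def pooled_t_interval(data1, data2, c_level=0.9):
    """Two-sided interval for mu_x - mu_y, assuming equal unknown variances."""
    x = np.asarray(data1, dtype=float)
    y = np.asarray(data2, dtype=float)
    n1, n2 = x.size, y.size
    # Pooled variance: each sample variance is weighted by its degrees of freedom.
    sp2 = ((n1 - 1) * x.var(ddof=1) + (n2 - 1) * y.var(ddof=1)) / (n1 + n2 - 2)
    alpha = 1 - c_level
    # t.isf(q, df) is the value exceeded with probability q (upper-tail quantile).
    margin = t.isf(alpha / 2, df=n1 + n2 - 2) * np.sqrt(sp2 * (1 / n1 + 1 / n2))
    diff = x.mean() - y.mean()
    return diff - margin, diff + margin

data1 = [140, 136, 138, 150, 152, 144, 132, 142, 150, 154, 136, 142]
data2 = [144, 132, 136, 140, 128, 150, 130, 134, 130, 146, 128, 131, 137, 135]
lower, upper = pooled_t_interval(data1, data2, c_level=0.9)
print(lower, upper)
```

Returning the bounds rather than rendering them makes the result easy to reuse in later cells.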




## **3.4. Confidence Interval for the Ratio of Variances of Two Normal Populations:**

**A. Two-sided Confidence Interval:**

Suppose that $X_1, X_2, ..., X_{n_x}$ is a sample of size $n_x$ from a normal distribution with an unknown mean $\mu_x$ and an unknown variance $\sigma^2_x$.

$Y_1, Y_2, ..., Y_{n_y}$ is an independent sample of size $n_y$ from a normal distribution with an unknown mean $\mu_y$ and an unknown variance $\sigma^2_y$.

$X_1, X_2, ..., X_{n_x} \sim N( \mu_x, \sigma^2_x)$

$Y_1, Y_2, ..., Y_{n_y} \sim N( \mu_y, \sigma^2_y)$

$\\ $

$S^2_x = \frac{\sum_{i=1}^{n_x}\ (x_i\ -\ \overline{x})^2}{{n_x}-1}$

$S^2_y = \frac{\sum_{i=1}^{n_y}\ (y_i\ -\ \overline{y})^2}{{n_y}-1}$

$\\ $

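Significance level = $\alpha$

Since $\frac{S^2_x/\sigma^2_x}{S^2_y/\sigma^2_y}$ follows an $F$ distribution with $n_x-1$ and $n_y-1$ degrees of freedom, and $F_{\alpha, n_x-1, n_y-1}$ denotes the value that this distribution exceeds with probability $\alpha$: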

$P(F_{{1-\frac{\alpha}{2}}, n_x-1, n_y-1}\ \leq\ \frac{S^2_x}{S^2_y}/\frac{\sigma^2_x}{\sigma^2_y} \ \leq F_{{\frac{\alpha}{2}}, n_x-1, n_y-1}) = 1-\alpha$

$P(\frac{S^2_x}{S^2_y}\ \times \frac{1}{F_{\frac{\alpha}{2}, n_x-1, n_y-1}} \leq\ \frac{\sigma^2_x}{\sigma^2_y} \leq\ \frac{S^2_x}{S^2_y}\ \times \frac{1}{F_{1-\frac{\alpha}{2}, n_x-1, n_y-1}}) = 1-\alpha$

$P(\frac{S_x}{S_y}\ \times \sqrt{\frac{1}{F_{\frac{\alpha}{2}, n_x-1, n_y-1}}} \leq\ \frac{\sigma_x}{\sigma_y} \leq\ \frac{S_x}{S_y}\ \times \sqrt{\frac{1}{F_{1-\frac{\alpha}{2}, n_x-1, n_y-1}}}) = 1-\alpha$

$\\ $

Therefore, the $1-\alpha$ confidence interval for the ratio of variances of two normal populations is:

$[\frac{S^2_x}{S^2_y}\ \times \frac{1}{F_{\frac{\alpha}{2}, n_x-1, n_y-1}}, \frac{S^2_x}{S^2_y}\ \times \frac{1}{F_{1-\frac{\alpha}{2}, n_x-1, n_y-1}}]$

Taking square roots, the $1-\alpha$ confidence interval for the ratio of standard deviations of two normal populations is:

$[\frac{S_x}{S_y}\ \times \sqrt{\frac{1}{F_{\frac{\alpha}{2}, n_x-1, n_y-1}}}\ ,\ \frac{S_x}{S_y}\ \times \sqrt{\frac{1}{F_{1-\frac{\alpha}{2}, n_x-1, n_y-1}}}]$
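The two-sided formulas above can be evaluated numerically with `scipy.stats.f`. The following is a minimal sketch under stated assumptions: the helper name `ratio_variance_interval` is hypothetical, and the inputs ($S_x = 2.51$, $S_y = 1.9$, $n_x = n_y = 10$) are illustrative values.

```python
import numpy as np
from scipy.stats import f

def ratio_variance_interval(s_x, s_y, n_x, n_y, c_level=0.95):
    """Two-sided interval for sigma_x^2 / sigma_y^2 from sample stds and sizes."""
    alpha = 1 - c_level
    ratio = s_x**2 / s_y**2
    # f.isf(q, dfn, dfd) is the value exceeded with probability q,
    # i.e. F_{q, dfn, dfd} in the upper-tail notation used in this section.
    lower = ratio / f.isf(alpha / 2, n_x - 1, n_y - 1)
    upper = ratio / f.isf(1 - alpha / 2, n_x - 1, n_y - 1)
    return lower, upper

var_lo, var_hi = ratio_variance_interval(2.51, 1.9, 10, 10)
# The interval for sigma_x / sigma_y follows by taking square roots.
sd_lo, sd_hi = np.sqrt(var_lo), np.sqrt(var_hi)
print(var_lo, var_hi)
print(sd_lo, sd_hi)
```

Note that the numerator degrees of freedom come from the $X$ sample and the denominator degrees of freedom from the $Y$ sample; swapping them changes the result whenever $n_x \neq n_y$.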

**B. One-sided Lower Confidence Interval:**

Suppose that $X_1, X_2, ..., X_{n_x}$ is a sample of size $n_x$ from a normal distribution with an unknown mean $\mu_x$ and an unknown variance $\sigma^2_x$.

$Y_1, Y_2, ..., Y_{n_y}$ is an independent sample of size $n_y$ from a normal distribution with an unknown mean $\mu_y$ and an unknown variance $\sigma^2_y$.

$X_1, X_2, ..., X_{n_x} \sim N( \mu_x, \sigma^2_x)$

$Y_1, Y_2, ..., Y_{n_y} \sim N( \mu_y, \sigma^2_y)$

$\\ $

$S^2_x = \frac{\sum_{i=1}^{n_x}\ (x_i\ -\ \overline{x})^2}{{n_x}-1}$

$S^2_y = \frac{\sum_{i=1}^{n_y}\ (y_i\ -\ \overline{y})^2}{{n_y}-1}$

$\\ $

Significance level = $\alpha$

$P(0 \leq\ \frac{\sigma^2_x}{\sigma^2_y} \leq\ \frac{S^2_x}{S^2_y}\ \times \frac{1}{F_{1-\alpha, n_x-1, n_y-1}}) = 1-\alpha$

$P(0 \leq\ \frac{\sigma_x}{\sigma_y} \leq\ \frac{S_x}{S_y}\ \times \sqrt{\frac{1}{F_{1-\alpha, n_x-1, n_y-1}}}) = 1-\alpha$

$\\ $

Therefore, the $1-\alpha$ confidence interval for the ratio of variances of two normal populations is:

$[0\ ,\ \frac{S^2_x}{S^2_y}\ \times \frac{1}{F_{1-\alpha, n_x-1, n_y-1}}]$

Taking square roots, the $1-\alpha$ confidence interval for the ratio of standard deviations of two normal populations is:

$[0\ ,\ \frac{S_x}{S_y}\ \times \sqrt{\frac{1}{F_{1-\alpha, n_x-1, n_y-1}}}]$

**C. One-sided Upper Confidence Interval:**

Suppose that $X_1, X_2, ..., X_{n_x}$ is a sample of size $n_x$ from a normal distribution with an unknown mean $\mu_x$ and an unknown variance $\sigma^2_x$.

$Y_1, Y_2, ..., Y_{n_y}$ is an independent sample of size $n_y$ from a normal distribution with an unknown mean $\mu_y$ and an unknown variance $\sigma^2_y$.

$X_1, X_2, ..., X_{n_x} \sim N( \mu_x, \sigma^2_x)$

$Y_1, Y_2, ..., Y_{n_y} \sim N( \mu_y, \sigma^2_y)$

$\\ $

$S^2_x = \frac{\sum_{i=1}^{n_x}\ (x_i\ -\ \overline{x})^2}{{n_x}-1}$

$S^2_y = \frac{\sum_{i=1}^{n_y}\ (y_i\ -\ \overline{y})^2}{{n_y}-1}$

$\\ $

Significance level = $\alpha$

$P(\frac{S^2_x}{S^2_y}\ \times \frac{1}{F_{\alpha, n_x-1, n_y-1}} \leq\ \frac{\sigma^2_x}{\sigma^2_y} \leq\ \infty) = 1-\alpha$

$P(\frac{S_x}{S_y}\ \times \sqrt{\frac{1}{F_{\alpha, n_x-1, n_y-1}}} \leq\ \frac{\sigma_x}{\sigma_y} \leq\ \infty) = 1-\alpha$

$\\ $

Therefore, the $1-\alpha$ confidence interval for the ratio of variances of two normal populations is:

$[\frac{S^2_x}{S^2_y}\ \times \frac{1}{F_{\alpha, n_x-1, n_y-1}}, \infty)$

Taking square roots, the $1-\alpha$ confidence interval for the ratio of standard deviations of two normal populations is:

$[\frac{S_x}{S_y}\ \times \sqrt{\frac{1}{F_{\alpha, n_x-1, n_y-1}}}\ ,\ \infty)$
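The one-sided bounds of parts B and C use the whole $\alpha$ in a single tail. A short sketch of both bounds follows; the function name `ratio_variance_one_sided` and the sample values are assumptions made for illustration only.

```python
from scipy.stats import f

def ratio_variance_one_sided(s_x, s_y, n_x, n_y, c_level=0.95):
    """Return (lower_bound, upper_bound) for sigma_x^2 / sigma_y^2.

    Each bound is a separate one-sided statement at level c_level;
    together they do NOT form a two-sided interval at that level.
    """
    alpha = 1 - c_level
    ratio = s_x**2 / s_y**2
    dfn, dfd = n_x - 1, n_y - 1
    # Part B: sigma_x^2 / sigma_y^2 <= ratio / F_{1-alpha}
    upper_bound = ratio / f.isf(1 - alpha, dfn, dfd)
    # Part C: sigma_x^2 / sigma_y^2 >= ratio / F_{alpha}
    lower_bound = ratio / f.isf(alpha, dfn, dfd)
    return lower_bound, upper_bound

lower_bound, upper_bound = ratio_variance_one_sided(2.51, 1.9, 10, 10)
print(lower_bound, upper_bound)
```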

In [None]:
class confidence_interval_for_ratio_variances:
    """
    Parameters
    ----------
    n1 : optional, number of sample1 members
    n2 : optional, number of sample2 members
    c_level : confidence level as a fraction, e.g. 0.95
    type_c : 'two_sided_confidence', 'lower_confidence', 'upper_confidence'
    S1 : optional, std of the sample1
    S2 : optional, std of the sample2
    data1, data2 : optional, if you do not know S1, S2, n1 and n2, just pass the data
    """
    def __init__(self, c_level, type_c, S1 = 0., S2 = 0., n1 = 0., n2 = 0., data1=None, data2=None):
        self.S1 = S1
        self.S2 = S2
        self.type_c = type_c
        self.n1 = n1
        self.n2 = n2
        self.c_level = c_level
        self.data1 = data1
        self.data2 = data2
        if data1 is not None:
            self.n1 = len(list(data1))
            self.S1 = np.std(list(data1), ddof = 1)
        if data2 is not None:
            self.n2 = len(list(data2))
            self.S2 = np.std(list(data2), ddof = 1)

        self.__test()

    def __test(self):
        alpha = 1-self.c_level
        ratio = (self.S1**2)/(self.S2**2)
        # numerator df comes from sample1, denominator df from sample2
        df1, df2 = self.n1-1, self.n2-1
        if self.type_c == 'two_sided_confidence':
            c_u = ratio * (1/f.isf(1-alpha/2, df1, df2))
            c_l = ratio * (1/f.isf(alpha/2, df1, df2))
            c_u_r = np.sqrt(c_u)
            c_l_r = np.sqrt(c_l)
            display(Latex(rf'${c_l} \leq \sigma^2_x/ \sigma^2_y \leq {c_u}$'))
            display(Latex(rf'${c_l_r} \leq \sigma_x/ \sigma_y \leq {c_u_r}$'))
        elif self.type_c == 'lower_confidence':
            c_u = ratio * (1/f.isf(1-alpha, df1, df2))
            c_u_r = np.sqrt(c_u)
            display(Latex(rf'$\sigma^2_x/ \sigma^2_y \leq {c_u}$'))
            display(Latex(rf'$\sigma_x/ \sigma_y \leq {c_u_r}$'))
        elif self.type_c == 'upper_confidence':
            c_l = ratio * (1/f.isf(alpha, df1, df2))
            c_l_r = np.sqrt(c_l)
            display(Latex(rf'${c_l} \leq \sigma^2_x/ \sigma^2_y$'))
            display(Latex(rf'${c_l_r} \leq \sigma_x/ \sigma_y$'))

In [None]:
confidence_interval_for_ratio_variances(c_level = 0.95, type_c = 'two_sided_confidence', S1 = 2.51, S2 = 1.9, n1 = 10, n2 = 10);
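To see what the $1-\alpha$ confidence level means for the two-sided interval of part A, one can simulate many pairs of normal samples and count how often the interval contains the true ratio. This is a rough sketch, not part of the original notebook: the sample sizes, standard deviations, seed, and number of trials are all arbitrary assumptions.

```python
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(0)           # fixed seed so the run is repeatable
n_x, n_y = 12, 14                         # deliberately unequal sample sizes
sigma_x, sigma_y = 3.0, 2.0               # assumed true standard deviations
alpha, trials = 0.05, 5000

# One row per simulated experiment.
x = rng.normal(0.0, sigma_x, size=(trials, n_x))
y = rng.normal(0.0, sigma_y, size=(trials, n_y))
ratio = x.var(axis=1, ddof=1) / y.var(axis=1, ddof=1)

# Two-sided bounds from part A, with upper-tail F quantiles via f.isf.
lower = ratio / f.isf(alpha / 2, n_x - 1, n_y - 1)
upper = ratio / f.isf(1 - alpha / 2, n_x - 1, n_y - 1)

true_ratio = sigma_x**2 / sigma_y**2
coverage = np.mean((lower <= true_ratio) & (true_ratio <= upper))
print(coverage)  # expected to be close to 1 - alpha
```

With normal data the empirical coverage should sit near 0.95, up to simulation noise.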







## **3.5. Confidence Interval for the Mean of a Bernoulli Random Variable:**

**A. Two-sided Confidence Interval:**

Suppose that $X_1, X_2, ..., X_n$ is a sample of size $n$ (large) from a Bernoulli distribution having an unknown parameter $P$. 

$X_1, X_2, ..., X_n \sim Ber(P)$

$\\ $

Significance level = $\alpha$

By the central limit theorem, for large $n$ the standardized sample mean is approximately standard normal, so:

$P(-\ Z_{\frac{\alpha}{2}}\ \leq\ \frac{\overline{X}\ -\ P}{\sqrt{\frac{P(1-P)}{n}}}\ \leq Z_{\frac{\alpha}{2}}) = 1-\alpha$

$P(\overline{X}\ -\ Z_{\frac{\alpha}{2}} \sqrt{\frac{P(1-P)}{n}} \ \leq\ P \ \leq \overline{X}\ +\ Z_{\frac{\alpha}{2}} \sqrt{\frac{P(1-P)}{n}}) = 1-\alpha$

Approximately, replacing the unknown $P$ under the square root with $\overline{X}$:

$P(\overline{X}\ -\ Z_{\frac{\alpha}{2}} \sqrt{\frac{\overline{X}(1-\overline{X})}{n}} \ \leq\ P \ \leq \overline{X}\ +\ Z_{\frac{\alpha}{2}} \sqrt{\frac{\overline{X}(1-\overline{X})}{n}}) = 1-\alpha$

$\\ $

Therefore, the approximate $1-\alpha$ confidence interval for the parameter $P$ of a Bernoulli distribution is:

$[\overline{X}\ -\ Z_{\frac{\alpha}{2}} \sqrt{\frac{\overline{X}(1-\overline{X})}{n}}\ ,\ \overline{X}\ +\ Z_{\frac{\alpha}{2}} \sqrt{\frac{\overline{X}(1-\overline{X})}{n}}]$
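As a worked example of the two-sided interval, take $\overline{X} = 0.2$ and $n = 100$ (illustrative values). The estimated standard error is $\sqrt{0.2 \times 0.8 / 100} = 0.04$, and with $Z_{0.025} \approx 1.96$ the interval is roughly $[0.122, 0.278]$. The sketch below reproduces this; the function name `bernoulli_two_sided` is hypothetical.

```python
import numpy as np
from scipy.stats import norm

def bernoulli_two_sided(sample_mean, n, c_level=0.95):
    """Approximate (large-n) two-sided interval for the Bernoulli parameter P."""
    alpha = 1 - c_level
    z = norm.isf(alpha / 2)  # upper-tail quantile Z_{alpha/2}
    # The unknown P under the square root is replaced by the sample mean.
    margin = z * np.sqrt(sample_mean * (1 - sample_mean) / n)
    return sample_mean - margin, sample_mean + margin

lower, upper = bernoulli_two_sided(0.2, 100)
print(lower, upper)
```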

**B. One-sided Lower Confidence Interval:**

Suppose that $X_1, X_2, ..., X_n$ is a sample of size $n$ (large) from a Bernoulli distribution having an unknown parameter $P$. 

$X_1, X_2, ..., X_n \sim Ber(P)$

$\\ $

Significance level = $\alpha$

By the central limit theorem, for large $n$ the standardized sample mean is approximately standard normal, so:

$P(-\ Z_{\alpha}\ \leq\ \frac{\overline{X}\ -\ P}{\sqrt{\frac{P(1-P)}{n}}}\ \leq \infty) = 1-\alpha$

$P(-\ \infty \ \leq\ P \ \leq \overline{X}\ +\ Z_{\alpha} \sqrt{\frac{P(1-P)}{n}}) = 1-\alpha$

Approximately, replacing the unknown $P$ under the square root with $\overline{X}$:

$P(-\ \infty \ \leq\ P \ \leq \overline{X}\ +\ Z_{\alpha} \sqrt{\frac{\overline{X}(1-\overline{X})}{n}}) = 1-\alpha$

$\\ $

Therefore, the approximate $1-\alpha$ one-sided lower confidence interval for the parameter $P$ of a Bernoulli distribution is:

$(-\infty\ ,\ \overline{X}\ +\ Z_{\alpha} \sqrt{\frac{\overline{X}(1-\overline{X})}{n}}]$

**C. One-sided Upper Confidence Interval:**

Suppose that $X_1, X_2, ..., X_n$ is a sample of size $n$ (large) from a Bernoulli distribution having an unknown parameter $P$. 

$X_1, X_2, ..., X_n \sim Ber(P)$

$\\ $

Significance level = $\alpha$

By the central limit theorem, for large $n$ the standardized sample mean is approximately standard normal, so:

$P(-\infty\ \leq\ \frac{\overline{X}\ -\ P}{\sqrt{\frac{P(1-P)}{n}}}\ \leq Z_{\alpha}) = 1-\alpha$

$P(\overline{X}\ -\ Z_{\alpha} \sqrt{\frac{P(1-P)}{n}} \ \leq\ P \ \leq \infty) = 1-\alpha$

Approximately, replacing the unknown $P$ under the square root with $\overline{X}$:

$P(\overline{X}\ -\ Z_{\alpha} \sqrt{\frac{\overline{X}(1-\overline{X})}{n}} \ \leq\ P \ \leq \infty) = 1-\alpha$

$\\ $

Therefore, the approximate $1-\alpha$ one-sided upper confidence interval for the parameter $P$ of a Bernoulli distribution is:

$[\overline{X}\ -\ Z_{\alpha} \sqrt{\frac{\overline{X}(1-\overline{X})}{n}}\ ,\ \infty)$
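The one-sided bounds of parts B and C can be computed directly from raw 0/1 observations. The sketch below is illustrative only: the function name `bernoulli_one_sided_bounds` is hypothetical, and the data (20 successes in 100 trials) are made up for the example.

```python
import numpy as np
from scipy.stats import norm

def bernoulli_one_sided_bounds(data, c_level=0.95):
    """Return (lower_bound, upper_bound) for P from 0/1 data.

    Each bound is its own one-sided statement at level c_level.
    """
    x = np.asarray(data, dtype=float)
    n = x.size
    p_hat = x.mean()                 # sample mean = observed success fraction
    z = norm.isf(1 - c_level)        # upper-tail quantile Z_alpha (all of alpha in one tail)
    margin = z * np.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

data = [1] * 20 + [0] * 80
lower_bound, upper_bound = bernoulli_one_sided_bounds(data)
print(lower_bound, upper_bound)
```

Because $Z_{\alpha} < Z_{\alpha/2}$, each one-sided bound lies closer to $\overline{X}$ than the corresponding end of the two-sided interval at the same confidence level.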

In [None]:
class confidence_interval_for_p_bernoulli:
    """
    Parameters
    ----------
    n : optional, number of sample members
    c_level : confidence level as a fraction, e.g. 0.95
    type_c : 'two_sided_confidence', 'lower_confidence', 'upper_confidence'
    Sample_mean : optional, mean of the sample
    data : optional, if you do not know the Sample_mean and n, just pass the data
    """
    def __init__(self, c_level, type_c, Sample_mean = 0., n = 0., data=None):
        self.Sample_mean = Sample_mean
        self.type_c = type_c
        self.n = n
        self.c_level = c_level
        self.data = data
        if data is not None:
            self.Sample_mean = np.mean(list(data))
            self.n = len(list(data))

        self.__test()

    def __test(self):
        # estimated standard error of the sample mean
        se = np.sqrt(self.Sample_mean*(1-self.Sample_mean)/self.n)
        if self.type_c == 'two_sided_confidence':
            c_u = self.Sample_mean + (-norm.ppf((1-self.c_level)/2)) * se
            c_l = self.Sample_mean - (-norm.ppf((1-self.c_level)/2)) * se
            display(Latex(rf'${c_l} \leq P \leq {c_u}$'))
        elif self.type_c == 'lower_confidence':
            c_u = self.Sample_mean + (-norm.ppf(1-self.c_level)) * se
            display(Latex(rf'$P \leq {c_u}$'))
        elif self.type_c == 'upper_confidence':
            c_l = self.Sample_mean - (-norm.ppf(1-self.c_level)) * se
            display(Latex(rf'${c_l} \leq P$'))

In [None]:
confidence_interval_for_p_bernoulli(c_level = 0.95, type_c = 'two_sided_confidence', Sample_mean = 0.2, n = 100);

