{ "metadata": { "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5-final" }, "orig_nbformat": 2, "kernelspec": { "name": "Python 3.8.5 64-bit ('streamlit': conda)", "display_name": "Python 3.8.5 64-bit ('streamlit': conda)", "metadata": { "interpreter": { "hash": "d5b47a323bfc67f94a37e7cb8fd52db1ad06cc55d227a5b8c375f2f5af6e26dc" } } } }, "nbformat": 4, "nbformat_minor": 2, "cells": [ { "source": [ "# Hypothesis test power from scratch with Python" ], "cell_type": "markdown", "metadata": {} }, { "source": [ "Calculating power of hypothesis tests.\n", "\n", "The code is from the [Data Science from Scratch](https://www.oreilly.com/library/view/data-science-from/9781492041122/) book.\n" ], "cell_type": "markdown", "metadata": {} }, { "source": [ "## Libraries and helper functions" ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from typing import Tuple\n", "import math as m" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "def calc_normal_cdf(x: float, mu: float = 0, sigma: float = 1) -> float:\n", " return (1 + m.erf((x - mu) / m.sqrt(2) / sigma)) / 2\n", "\n", "normal_probability_below = calc_normal_cdf" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "def normal_probability_between(lo: float, hi: float, mu: float = 0, sigma: float = 1) -> float:\n", " return normal_probability_below(hi, mu, sigma) - normal_probability_below(lo, mu, sigma)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "def calc_normal_cdf(x: float, mu: float = 0, sigma: float = 1) -> float:\n", " return (1 + m.erf((x - mu) / m.sqrt(2) / sigma)) / 2" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "def calc_inverse_normal_cdf(p: float, mu:float = 0, sigma: float = 1, tolerance: float = 1E-5, show_steps=False) -> float:\n", "\n", " if p == 0: return -np.inf\n", " if p == 1: return np.inf\n", "\n", " # In case it is not a standard normal distribution, calculate the standard normal first and then rescale\n", " if mu != 0 or sigma != 1:\n", " return mu + sigma * calc_inverse_normal_cdf(p, tolerance=tolerance)\n", "\n", " low_z = -10\n", " hi_z = 10\n", "\n", " if show_steps: print(f\"{'':<19}\".join(['low_z', 'mid_z', 'hi_z']), \"\\n\")\n", "\n", " while hi_z - low_z > tolerance:\n", " mid_z = (low_z + hi_z) / 2\n", " mid_p = calc_normal_cdf(mid_z)\n", "\n", " if mid_p < p:\n", " low_z = mid_z\n", " else:\n", " hi_z = mid_z\n", "\n", " if show_steps: print(\"\\t\".join(map(to_string, [low_z, mid_z, hi_z])))\n", "\n", " return mid_z" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "def normal_upper_bound(probabilty: float, mu: float = 0, sigma: float = 1) -> float:\n", " return calc_inverse_normal_cdf(probabilty, mu, sigma)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "def normal_lower_bound(probabilty: float, mu: float = 0, sigma: float = 1) -> float:\n", " return calc_inverse_normal_cdf(1 - probabilty, mu, sigma)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "def normal_two_sided_bounds(probability: float, mu: float = 0, sigma: float = 1) -> float:\n", " if probability == 0: return 0, 0\n", "\n", " tail_probability = (1 - probability) / 2\n", "\n", " lower_bound = normal_upper_bound(tail_probability, mu, sigma)\n", " upper_bound = normal_lower_bound(tail_probability, mu, sigma)\n", " \n", " return lower_bound, upper_bound" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "def normal_approximation_to_binomial(n: int, p: float) -> Tuple[float, float]:\n", " mu = p * n\n", " sigma = m.sqrt(p * (1 - p) * n)\n", "\n", " return mu, sigma" ] }, { "source": [ "## Type 1 Error and Tolerance" ], "cell_type": "markdown", "metadata": {} }, { "source": [ "Let's make our null hypothesis ($H_0$) that the probability of head is 0.5" ], "cell_type": "markdown", "metadata": {} }, { "source": [ "mu_0, sigma_0 = normal_approximation_to_binomial(1000, 0.5) \n", "mu_0, sigma_0" ], "cell_type": "code", "metadata": {}, "execution_count": 10, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "(500.0, 15.811388300841896)" ] }, "metadata": {}, "execution_count": 10 } ] }, { "source": [ "We define our tolerance at 5%. That is, we accept our model to produce 'type 1' errors (false positive) in 5% of the time. With the coin flipping example, we expect to receive 5% of the results to fall outsied of our defined interval." ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "(469.01026640487555, 530.9897335951244)" ] }, "metadata": {}, "execution_count": 11 } ], "source": [ "lo, hi = normal_two_sided_bounds(0.95, mu_0, sigma_0)\n", "lo, hi" ] }, { "source": [ "## Type 2 Error and Power" ], "cell_type": "markdown", "metadata": {} }, { "source": [ "At type 2 error we consider false negatives, that is, those cases where we fail to reject our null hypothesis even though we should." ], "cell_type": "markdown", "metadata": {} }, { "source": [ "Let's assume that contra $H_0$ the actual probability is 0.55." ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "(550.0, 15.732132722552274)" ] }, "metadata": {}, "execution_count": 12 } ], "source": [ "mu_1, sigma_1 = normal_approximation_to_binomial(1000, 0.55)\n", "mu_1, sigma_1" ] }, { "source": [ "In this case we get our Type 2 probability as the overlapping of the real distribution and the 95% probability region of $H_0$. In this particular case, in 11% of the cases we will wrongly fail to reject our null hypothesis." ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "0.1134519987046329" ] }, "metadata": {}, "execution_count": 13 } ], "source": [ "type_2_probability = normal_probability_between(lo, hi, mu_1, sigma_1)\n", "type_2_probability" ] }, { "source": [ "The power of the test is then the probability of rightly rejecting the $H_0$" ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "0.886548001295367" ] }, "metadata": {}, "execution_count": 14 } ], "source": [ "power = 1 - type_2_probability\n", "power" ] }, { "source": [ "Now, let's redefine our null hypothesis so that we expect the probability of head to be less than or equal to 0.5.\n", "\n", "In this case we have a one-sided test." ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "526.0073585242053" ] }, "metadata": {}, "execution_count": 15 } ], "source": [ "hi = normal_upper_bound(0.95, mu_0, sigma_0)\n", "hi" ] }, { "source": [ "Because this is a less strict hypothesis than our previus one, it has a smaller T2 probability and a greater power." ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "0.06362051966928273" ] }, "metadata": {}, "execution_count": 16 } ], "source": [ "type_2_probability = normal_probability_below(hi, mu_1, sigma_1)\n", "type_2_probability" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "0.9363794803307173" ] }, "metadata": {}, "execution_count": 17 } ], "source": [ "power = 1 - type_2_probability\n", "power" ] } ] }