{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# MLE of Normal parameters\n",
    "\n",
    "[Dataset download](https://s3.amazonaws.com/bebi103.caltech.edu/data/good_invitro_droplet_data.csv)\n",
    "\n",
    "<hr>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<hr>\n",
    "\n",
    "Recall our first model for spindle length.\n",
    "\n",
    "\\begin{align}\n",
    "l_i \\sim \\text{Norm}(\\phi, \\sigma) \\;\\;\\forall i.\n",
    "\\end{align}\n",
    "\n",
    "For this model, we have already worked out analytically that the maximum likelihood estimates for the parameters $\\phi$ and $\\sigma$ are given by their respective plug-in estimates.\n",
    "\n",
    "\\begin{align}\n",
    "&\\phi^* = \\hat{\\phi}\\\\[1em]\n",
    "&\\sigma^* = \\hat{\\sigma}.\n",
    "\\end{align}\n",
    "\n",
    "We can directly compute these and use parametric bootstrap to get their confidence intervals. We start by loading in the data and computing the MLEs for $\\phi$ and $\\sigma$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "φ MLE: 32.86402985074626 µm\n",
      "σ MLE: 4.784665043782949 µm\n"
     ]
    }
   ],
   "source": [
    "df = pd.read_csv(\"../data/good_invitro_droplet_data.csv\", comment=\"#\")\n",
    "\n",
    "spindle_length = df[\"Spindle Length (um)\"].values\n",
    "\n",
    "phi_mle = np.mean(spindle_length)\n",
    "sigma_mle = np.std(spindle_length)\n",
    "\n",
    "print(\"φ MLE:\", phi_mle, \"µm\")\n",
    "print(\"σ MLE:\", sigma_mle, \"µm\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we use parametric bootstrap to compute the confidence intervals. First, we'll write functions to perform the bootstrapping."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "def draw_parametric_bs_reps(data, phi, sigma, size=1):\n",
    "    \"\"\"Parametric bootstrap replicates of parameters of\n",
    "    Normal distribution.\"\"\"\n",
    "    bs_reps_phi = np.empty(size)\n",
    "    bs_reps_sigma = np.empty(size)\n",
    "    \n",
    "    for i in range(size):\n",
    "        bs_sample = np.random.normal(phi, sigma, size=len(data))\n",
    "        bs_reps_phi[i] = np.mean(bs_sample)\n",
    "        bs_reps_sigma[i] = np.std(bs_sample)\n",
    "                                     \n",
    "    return bs_reps_phi, bs_reps_sigma"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we can get the bootstrap replicates."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "bs_reps_phi, bs_reps_sigma = draw_parametric_bs_reps(\n",
    "    spindle_length, phi_mle, sigma_mle, size=100000\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With the bootstrap replicates in hand, we can compute the 95% confidence intervals using percentiles."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "φ conf int: [32.50167668 33.22578602]\n",
      "σ conf int: [4.52462138 5.03646672]\n"
     ]
    }
   ],
   "source": [
    "print('φ conf int:', np.percentile(bs_reps_phi, [2.5, 97.5]))\n",
    "print('σ conf int:', np.percentile(bs_reps_sigma, [2.5, 97.5]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Thus, we have estimated the parameters under this model."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Computing environment"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Python implementation: CPython\n",
      "Python version       : 3.11.4\n",
      "IPython version      : 8.12.2\n",
      "\n",
      "numpy     : 1.24.3\n",
      "pandas    : 2.0.3\n",
      "jupyterlab: 4.0.5\n",
      "\n"
     ]
    }
   ],
   "source": [
    "%load_ext watermark\n",
    "%watermark -v -p numpy,pandas,jupyterlab"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}