{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# MLE of Normal parameters\n", "\n", "[Dataset download](https://s3.amazonaws.com/bebi103.caltech.edu/data/good_invitro_droplet_data.csv)\n", "\n", "
" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# Colab setup ------------------\n", "import os, sys, subprocess\n", "if \"google.colab\" in sys.modules:\n", " cmd = \"pip install --upgrade watermark\"\n", " process = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE, stderr=subprocess.PIPE)\n", " stdout, stderr = process.communicate()\n", " data_path = \"https://s3.amazonaws.com/bebi103.caltech.edu/data/\"\n", "else:\n", " data_path = \"../data/\"\n", "# ------------------------------\n", "\n", "import numpy as np\n", "import pandas as pd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "Recall our first model for spindle length.\n", "\n", "\\begin{align}\n", "l_i \\sim \\text{Norm}(\\phi, \\sigma) \\;\\;\\forall i.\n", "\\end{align}\n", "\n", "For this model, we have already worked out analytically that the maximum likelihood estimates for the parameters $\\phi$ and $\\sigma$ are given by their respective plug-in estimates.\n", "\n", "\\begin{align}\n", "&\\phi^* = \\hat{\\phi}\\\\[1em]\n", "&\\sigma^* = \\hat{\\sigma}.\n", "\\end{align}\n", "\n", "We can directly compute these and use parametric bootstrap to get their confidence intervals. We start by loading in the data and computing the MLEs for $\\phi$ and $\\sigma$." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "φ MLE: 32.86402985074626 µm\n", "σ MLE: 4.784665043782949 µm\n" ] } ], "source": [ "df = pd.read_csv(os.path.join(data_path, 'good_invitro_droplet_data.csv'), comment='#')\n", "\n", "spindle_length = df['Spindle Length (um)'].values\n", "\n", "phi_mle = np.mean(spindle_length)\n", "sigma_mle = np.std(spindle_length)\n", "\n", "print('φ MLE:', phi_mle, 'µm')\n", "print('σ MLE:', sigma_mle, 'µm')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we use parametric bootstrap to compute the confidence intervals. First, we'll write functions to perform the bootstrapping." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "rg = np.random.default_rng()\n", "\n", "def draw_parametric_bs_reps(data, phi, sigma, size=1):\n", " \"\"\"Parametric bootstrap replicates of parameters of\n", " Normal distribution.\"\"\"\n", " bs_reps_phi = np.empty(size)\n", " bs_reps_sigma = np.empty(size)\n", " \n", " for i in range(size):\n", " bs_sample = np.random.normal(phi, sigma, size=len(data))\n", " bs_reps_phi[i] = np.mean(bs_sample)\n", " bs_reps_sigma[i] = np.std(bs_sample)\n", " \n", " return bs_reps_phi, bs_reps_sigma" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can get the bootstrap replicates." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "bs_reps_phi, bs_reps_sigma = draw_parametric_bs_reps(\n", " spindle_length, phi_mle, sigma_mle, size=100000\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With the bootstrap replicates in hand, we can compute the 95% confidence intervals using percentiles." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "φ conf int: [32.50257673 33.22919193]\n", "σ conf int: [4.5262453 5.03866745]\n" ] } ], "source": [ "print('φ conf int:', np.percentile(bs_reps_phi, [2.5, 97.5]))\n", "print('σ conf int:', np.percentile(bs_reps_sigma, [2.5, 97.5]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So, we have estimated the parameters under this model." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Computing environment" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPython 3.8.5\n", "IPython 7.19.0\n", "\n", "numpy 1.19.2\n", "pandas 1.1.3\n", "jupyterlab 2.2.6\n" ] } ], "source": [ "%load_ext watermark\n", "%watermark -v -p numpy,pandas,jupyterlab" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 4 }