{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### The Delta Method\n",
    "\n",
    "> CFO: What's our variation in churn this year?\n",
    "\n",
    "> Data scientist: Our $\\lambda$ value has been increasing, but $\\rho$ is staying the same, so we should see-\n",
    "\n",
    "> CFO: Our banana value is increasing?\n",
    "\n",
    "We want to connect our parameters to business logic _and_ carry over variance in estimates. \n",
    "\n",
    "Example: It's silly to present a point estimate without confidence intervals (CIs), since arguably the CIs contains more useful information than a point estimate. \n",
    "\n",
    "We'll start with asking: \n",
    "\n",
    "> what is the CI for the survival function, $S(t; \\hat{\\theta})$? \n",
    "\n",
    "\n",
    "We will use the **Delta method** to do this (bolded because it's awesome):\n",
    "\n",
    "$$\\text{Var}(f(\\hat{\\theta})) \\approx \\text{grad}(f)(\\hat{\\theta}) \\cdot \\text{Var}(\\hat{\\theta}) \\cdot \\text{grad}(f)(\\hat{\\theta}) ^ T $$\n",
    "\n",
    "1. $f$ in our case is the survival function, $S$\n",
    "2. We know $\\text{Var}(\\hat{\\theta})$ (inverse of the Hessian)\n",
    "3. Do we need to compute $\\text{grad}(f)$ by hand? Heck no, use `autograd`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# seen all this...\n",
    "%matplotlib inline\n",
    "from autograd import numpy as np\n",
    "from autograd import elementwise_grad, value_and_grad, hessian\n",
    "from scipy.optimize import minimize\n",
    "\n",
    "# N = 50 for this example\n",
    "T = (np.random.exponential(size=50)/1.5) ** 2.3\n",
    "E = np.random.binomial(1, 0.95, size=50)\n",
    "\n",
    "def cumulative_hazard(params, t):\n",
    "    lambda_, rho_ = params\n",
    "    return (t / lambda_) ** rho_\n",
    "\n",
    "hazard = elementwise_grad(cumulative_hazard, argnum=1)\n",
    "\n",
    "def log_hazard(params, t):\n",
    "    return np.log(hazard(params, t))\n",
    "\n",
    "def log_likelihood(params, t, e):\n",
    "    return np.sum(e * log_hazard(params, t)) - np.sum(cumulative_hazard(params, t))\n",
    "\n",
    "def negative_log_likelihood(params, t, e):\n",
    "    return -log_likelihood(params, t, e)\n",
    "\n",
    "from autograd import value_and_grad\n",
    "\n",
    "results = minimize(\n",
    "        value_and_grad(negative_log_likelihood), \n",
    "        x0 = np.array([1.0, 1.0]),\n",
    "        method=None, \n",
    "        args=(T, E),\n",
    "        jac=True,\n",
    "        bounds=((0.00001, None), (0.00001, None)))\n",
    "\n",
    "estimates_ = results.x\n",
    "H = hessian(negative_log_likelihood)(estimates_, T, E)\n",
    "variance_matrix_ = np.linalg.inv(H)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from autograd import grad\n",
    "\n",
    "def survival_function(params, t):\n",
    "    return np.exp(-cumulative_hazard(params, t))\n",
    "\n",
    "grad_sf = # what goes here?\n",
    "grad_sf(estimates_, 5.0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def variance_at_t(t):\n",
    "    return # what goes here?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "variance_at_t(5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "t = np.linspace(.001, 10, 100)\n",
    "\n",
    "std_sf = np.sqrt(np.array([variance_at_t(_) for _ in t]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.plot(t, survival_function(estimates_, t))\n",
    "plt.fill_between(t, \n",
    "                 y1=survival_function(estimates_, t) + 1.65 * std_sf, \n",
    "                 y2=survival_function(estimates_, t) - 1.65 * std_sf,\n",
    "                 alpha=0.3\n",
    "                )\n",
    "plt.ylim(0, 1)\n",
    "plt.title(\"Estimated survival function with CIs (Delta method)\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we will explore a subscription service LTV example. Move to Part 7! "
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}