{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "### The Delta Method\n", "\n", "> CFO: What's our variation in churn this year?\n", "\n", "> Data scientist: Our $\\lambda$ value has been increasing, but $\\rho$ is staying the same, so we should see-\n", "\n", "> CFO: Our banana value is increasing?\n", "\n", "We want to connect our parameters to business logic _and_ carry over the variance in our estimates. \n", "\n", "Example: It's silly to present a point estimate without confidence intervals (CIs), since arguably the CIs contain more useful information than the point estimate. \n", "\n", "We'll start by asking: \n", "\n", "> what is the CI for the survival function, $S(t; \\hat{\\theta})$? \n", "\n", "\n", "We will use the **Delta method** to do this (bolded because it's awesome):\n", "\n", "$$\\text{Var}(f(\\hat{\\theta})) \\approx \\text{grad}(f)(\\hat{\\theta}) \\cdot \\text{Var}(\\hat{\\theta}) \\cdot \\text{grad}(f)(\\hat{\\theta}) ^ T $$\n", "\n", "1. $f$ in our case is the survival function, $S$\n", "2. We know $\\text{Var}(\\hat{\\theta})$ (inverse of the Hessian of the negative log-likelihood)\n", "3. Do we need to compute $\\text{grad}(f)$ by hand? 
Heck no, use `autograd`" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# seen all this...\n", "%matplotlib inline\n", "from autograd import numpy as np\n", "from autograd import elementwise_grad, value_and_grad, hessian\n", "from scipy.optimize import minimize\n", "\n", "# N = 50 for this example\n", "T = (np.random.exponential(size=50)/1.5) ** 2.3\n", "E = np.random.binomial(1, 0.95, size=50)\n", "\n", "def cumulative_hazard(params, t):\n", "    lambda_, rho_ = params\n", "    return (t / lambda_) ** rho_\n", "\n", "# the hazard is the derivative of the cumulative hazard with respect to t\n", "hazard = elementwise_grad(cumulative_hazard, argnum=1)\n", "\n", "def log_hazard(params, t):\n", "    return np.log(hazard(params, t))\n", "\n", "def log_likelihood(params, t, e):\n", "    return np.sum(e * log_hazard(params, t)) - np.sum(cumulative_hazard(params, t))\n", "\n", "def negative_log_likelihood(params, t, e):\n", "    return -log_likelihood(params, t, e)\n", "\n", "# jac=True: the objective returns (value, gradient)\n", "# method=None with bounds lets scipy pick a bounded solver (L-BFGS-B)\n", "results = minimize(\n", "    value_and_grad(negative_log_likelihood), \n", "    x0 = np.array([1.0, 1.0]),\n", "    method=None, \n", "    args=(T, E),\n", "    jac=True,\n", "    bounds=((0.00001, None), (0.00001, None)))\n", "\n", "estimates_ = results.x\n", "# the inverse Hessian of the negative log-likelihood at the MLE approximates Var(theta-hat)\n", "H = hessian(negative_log_likelihood)(estimates_, T, E)\n", "variance_matrix_ = np.linalg.inv(H)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from autograd import grad\n", "\n", "def survival_function(params, t):\n", "    return np.exp(-cumulative_hazard(params, t))\n", "\n", "# one answer: gradient of S with respect to params (argnum=0 is autograd's default)\n", "grad_sf = grad(survival_function)\n", "grad_sf(estimates_, 5.0)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def variance_at_t(t):\n", "    # one answer, from the Delta method: grad(S) . Var(theta-hat) . grad(S)^T\n", "    g = grad_sf(estimates_, t)\n", "    return g @ variance_matrix_ @ g" 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "variance_at_t(5)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "t = np.linspace(.001, 10, 100)\n", "\n", "std_sf = np.sqrt(np.array([variance_at_t(_) for _ in t]))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "\n", "# 1.65 is approximately the z-value for a two-sided 90% normal interval\n", "plt.plot(t, survival_function(estimates_, t))\n", "plt.fill_between(t, \n", "    y1=survival_function(estimates_, t) + 1.65 * std_sf, \n", "    y2=survival_function(estimates_, t) - 1.65 * std_sf,\n", "    alpha=0.3\n", "    )\n", "plt.ylim(0, 1)\n", "plt.title(\"Estimated survival function with CIs (Delta method)\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we will explore a subscription service LTV example. Move to Part 7! " ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 2 }