{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Lifetime value example \n",
    "\n",
    "\n",
    "Suppose we have a subscription business that has monthly churn, and we'd like to know an estimate of LTV (lifetime value) and build confidence intervals for it. \n",
    "\n",
    "\n",
    "Subscription businesses have a very predictable churn profile (it looks piecewise) but the rates are unknown. \n",
    "\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "We'll use a piecewise-constant hazard model with known breakpoints, $\\tau$.\n",
    "$$ \n",
    "h(t) = \\begin{cases}\n",
    "                        \\lambda_0  & \\text{if $t \\le \\tau_0$} \\\\\n",
    "                        \\lambda_1 & \\text{if $\\tau_0 < t \\le \\tau_1$} \\\\\n",
    "                        \\lambda_2 & \\text{if $\\tau_1 < t \\le \\tau_2$} \\\\\n",
    "                        ...\n",
    "                      \\end{cases}\n",
    "$$\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "from autograd import numpy as np\n",
    "from autograd import elementwise_grad, value_and_grad, hessian\n",
    "from scipy.optimize import minimize\n",
    "\n",
    "df = pd.read_csv(\"../churn_data.csv\")\n",
    "T = df['T'].values\n",
    "E = df['E'].values"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "breakpoints = np.array([28,  33,  58,  63,  88,  93,  117, 122, 148, 153])\n",
    "\n",
    "def cumulative_hazard(params, times):\n",
    "    # this is NumPy black magic to get piecewise hazards, let's chat after. \n",
    "    times = np.atleast_1d(times)\n",
    "    n = times.shape[0]\n",
    "    times = times.reshape((n, 1))\n",
    "    M = np.minimum(np.tile(breakpoints, (n, 1)), times)\n",
    "    M = np.hstack([M[:, tuple([0])], np.diff(M, axis=1)])\n",
    "    return np.dot(M, params)\n",
    "\n",
    "hazard = elementwise_grad(cumulative_hazard, argnum=1)\n",
    "\n",
    "def survival_function(params, t):\n",
    "    return np.exp(-cumulative_hazard(params, t))\n",
    "\n",
    "def log_hazard(params, t):\n",
    "    return np.log(np.clip(hazard(params, t), 1e-25, np.inf))\n",
    "\n",
    "def log_likelihood(params, t, e):\n",
    "    return np.sum(e * log_hazard(params, t)) - np.sum(cumulative_hazard(params, t))\n",
    "\n",
    "def negative_log_likelihood(params, t, e):\n",
    "    return -log_likelihood(params, t, e)\n",
    "\n",
    "from autograd import value_and_grad\n",
    "\n",
    "results = minimize(\n",
    "        value_and_grad(negative_log_likelihood), \n",
    "        x0 = np.ones(len(breakpoints)),\n",
    "        method=None, \n",
    "        args=(T, E),\n",
    "        jac=True,\n",
    "        bounds=[(0.0001, None)] * len(breakpoints)\n",
    ")\n",
    "\n",
    "print(results)\n",
    "estimates_ = results.x\n",
    "H = hessian(negative_log_likelihood)(estimates_, T, E)\n",
    "variance_matrix_ = np.linalg.inv(H)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "t = np.linspace(.001, 150, 100)\n",
    "plt.plot(t, survival_function(estimates_, t))\n",
    "plt.ylim(0.5, 1)\n",
    "plt.title(\"\"\"Estimated survival function using \\npiecewise hazards\"\"\");"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "On day 30, we charge users \\\\$10, and on every 30 days after that, we charge \\\\$20. What's the LTV, and CIs, at the end of day 120?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def LTV_120(params):\n",
    "    # think about how complicated the gradient of this function is. Now imagine an even more\n",
    "    # complicated function.\n",
    "    # how do we implement this function?\n",
    "    return ..."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ltv_ = LTV_120(estimates_)\n",
    "print(\"LTV estimate: \", ltv_)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from autograd import grad\n",
    "var_ltv_ = # what goes here?\n",
    "print(\"Variance LTV estimate:\", var_ltv_)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "std_ltv = np.sqrt(var_ltv_)\n",
    "print(\"Estimated LTV at day 120: ($%.2f, $%.2f)\" % (ltv_ - 1.96 * std_ltv, ltv_ + 1.96 * std_ltv))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "From here, we can compute p-values, scenario analysis, sensitvity analysis, etc. \n",
    "\n",
    "Let's continue this analytical train to Part 8. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Bonus, if time permits and Cameron is talking too fast. \n",
    "\n",
    "In the above model, we are not suggesting to the model much apriori information about this \"predictable\" process. For example, suppose we want a model that gives \"inbetween\" period rates to be close to each other, and likewise for the \"jump\" rates. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "def negative_log_likelihood(params, t, e):\n",
    "    return -log_likelihood(params, t, e) + 1e12 * (np.var(params[::2]) + np.var(params[1::2]))\n",
    "\n",
    "from autograd import value_and_grad\n",
    "\n",
    "results = minimize(\n",
    "        value_and_grad(negative_log_likelihood), \n",
    "        x0 = np.ones(len(breakpoints)),\n",
    "        method=None, \n",
    "        args=(T, E),\n",
    "        jac=True,\n",
    "        bounds=[(0.0001, None)] * len(breakpoints)\n",
    ")\n",
    "\n",
    "print(results)\n",
    "estimates_ = results.x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "t = np.linspace(.001, 150, 100)\n",
    "plt.plot(t, hazard(estimates_, t))\n",
    "plt.title(\"\"\"Estimated hazard function using \\npiecewise hazards \\nand penalizing variance\"\"\");"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Why do this?\n",
    "\n",
    "1) From a Bayesian perspective, this is a way to add prior information into a model. Note that if we have _lots_ of observations, the prior becomes less relevant (just like a traditional Bayesian model). That's good. *We are again using the likelihood to \"add\" information to our system.*\n",
    "\n",
    "2) When we have low data sizes, we can \"borrow\" information between periods. That is, deaths in the earlier \"inbetween\" periods can inform the rate in the later \"inbetween\" periods - thus we can do better inference. \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}