{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "- Currently we are using a naive minimization algorithm. It will approximate the gradient and move in that direction. This requires lots of function calls. \n", "- We have no way to get the Hessian - which is needed for the variance estimates of our parameters. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Automatic differentiation - how to differentiate an algorithm\n", "\n", "Symbolic differentiation is what we learned in school, and software like SymPy, Wolfram and Mathematica do this well. But I want to differentiate the following:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import numpy as np\n", "from matplotlib import pyplot as plt\n", "\n", "def f(x):\n", " y = 0.\n", " for i in range(100):\n", " y = np.sin(y + x)\n", " return y\n", "\n", "x = np.linspace(-2, 2, 250)\n", "plt.plot(x, [f(_) for _ in x]);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Mathematically, it's something like:\n", "\n", "$$f(x) = \\sin(x + \\sin(x + \\sin(x + ...\\sin(x)))...)$$\n", "\n", "Good luck differentiating that and getting a nice closed form. If this is not complicated enough for you, feel free to add some `if` statements. \n", "\n", "We can use `autograd`, an automatical diff package, to compute _exact, pointwise_ derivatives." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from autograd import grad\n", "from autograd import numpy as np\n", "\n", "def f(x):\n", " y = 0.\n", " for i in range(100):\n", " # this np. is now from autograd - important! \n", " y = np.sin(y + x)\n", " return y\n", "\n", "grad_f = grad(f)\n", "grad_f(1.)\n", "# magic! " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### We just differentiated an algorithm. Pick your brain pieces off the walls please. \n", "\n", "At a high level, autograd has a lookup of simple functions and their derivatives, and uses repeated use of the chain rule + calculus rules \n", "\n", "Of course, you can string together these pointwise values into a function:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = np.linspace(-2, 2, 250)\n", "plt.plot(x, [grad_f(x_) for x_ in x])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To use this in our optimization routines:\n", "\n", " - we can automatically compute gradients that the optimizer can use. \n", " - Hessians can be exactly calculated (we will do this later)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "T = (np.random.exponential(size=1000)/1.5) ** 2.3\n", "E = np.random.binomial(1, 0.95, size=1000)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def cumulative_hazard(params, t):\n", " lambda_, rho_ = params\n", " return (t / lambda_) ** rho_\n", "\n", "def log_hazard(params, t):\n", " lambda_, rho_ = params\n", " return np.log(rho_) - np.log(lambda_) + (rho_ - 1) * (np.log(t) - np.log(lambda_))\n", "\n", "def log_likelihood(params, t, e):\n", " return np.sum(e * log_hazard(params, t)) - np.sum(cumulative_hazard(params, t))\n", "\n", "def negative_log_likelihood(params, t, e):\n", " return -log_likelihood(params, t, e)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So `grad(negative_log_likelihood)` will find the gradient of `negative_log_likelihood` with respect to the first parameter, `params`. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "grad_negative_log_likelihood = # what goes here?\n", "\n", "print(grad_negative_log_likelihood(np.array([1., 1.]), T, E))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from scipy.optimize import minimize\n", "\n", "results = minimize(negative_log_likelihood, \n", " x0 = np.array([1.0, 1.0]),\n", " method=None, \n", " args=(T, E),\n", " jac=grad_negative_log_likelihood,\n", " bounds=((0.00001, None), (0.00001, None)))\n", "\n", "print(results)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from autograd import value_and_grad\n", "\n", "results = minimize(\n", " # fill this in. \n", " x0 = np.array([1.0, 1.0]),\n", " method=None, \n", " args=(T, E),\n", " jac=True, # notice this set to True now.\n", " bounds=((0.00001, None), (0.00001, None)))\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "results" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's continue this analytical-train 🚂 to Part 4! " ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 2 }