{ "cells": [ { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "t09eeeR5prIJ" }, "source": [ "##### Copyright 2018 The TensorFlow Authors." ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "cellView": "form", "colab": { "autoexec": { "startup": false, "wait_interval": 0 } }, "colab_type": "code", "id": "GCCk8_dHpuNf" }, "outputs": [], "source": [ "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n", "# you may not use this file except in compliance with the License.\n", "# You may obtain a copy of the License at\n", "#\n", "# https://www.apache.org/licenses/LICENSE-2.0\n", "#\n", "# Unless required by applicable law or agreed to in writing, software\n", "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", "# See the License for the specific language governing permissions and\n", "# limitations under the License." ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "xh8WkEwWpnm7" }, "source": [ "# Automatic differentiation and gradient tape" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "idv0bPeCp325" }, "source": [ "\n", " \n", " \n", " \n", "
\n", " View on TensorFlow.org\n", " \n", " Run in Google Colab\n", " \n", " View source on GitHub\n", "
" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "vDJ4XzMqodTy" }, "source": [ "In the previous tutorial we introduced `Tensor`s and operations on them. In this tutorial we will cover [automatic differentiation](https://en.wikipedia.org/wiki/Automatic_differentiation), a key technique for optimizing machine learning models." ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "GQJysDM__Qb0" }, "source": [ "## Setup\n" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": { "autoexec": { "startup": false, "wait_interval": 0 } }, "colab_type": "code", "id": "OiMPZStlibBv" }, "outputs": [], "source": [ "import tensorflow as tf\n", "tf.enable_eager_execution()\n", "\n", "tfe = tf.contrib.eager # Shorthand for some symbols" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "1CLWJl0QliB0" }, "source": [ "## Derivatives of a function\n", "\n", "TensorFlow provides APIs for automatic differentiation - computing the derivative of a function. The way that more closely mimics the math is to encapsulate the computation in a Python function, say `f`, and use `tfe.gradients_function` to create a function that computes the derivatives of `f` with respect to its arguments. If you're familiar with [autograd](https://github.com/HIPS/autograd) for differentiating numpy functions, this will be familiar. For example: " ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": { "autoexec": { "startup": false, "wait_interval": 0 } }, "colab_type": "code", "id": "9FViq92UX7P8" }, "outputs": [], "source": [ "from math import pi\n", "\n", "def f(x):\n", " return tf.square(tf.sin(x))\n", "\n", "assert f(pi/2).numpy() == 1.0\n", "\n", "\n", "# grad_f will return a list of derivatives of f\n", "# with respect to its arguments. Since f() has a single argument,\n", "# grad_f will return a list with a single element.\n", "grad_f = tfe.gradients_function(f)\n", "assert tf.abs(grad_f(pi/2)[0]).numpy() \u003c 1e-7" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "v9fPs8RyopCf" }, "source": [ "### Higher-order gradients\n", "\n", "The same API can be used to differentiate as many times as you like:\n" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": { "autoexec": { "startup": false, "wait_interval": 0 } }, "colab_type": "code", "id": "3D0ZvnGYo0rW" }, "outputs": [], "source": [ "def f(x):\n", " return tf.square(tf.sin(x))\n", "\n", "def grad(f):\n", " return lambda x: tfe.gradients_function(f)(x)[0]\n", "\n", "x = tf.lin_space(-2*pi, 2*pi, 100) # 100 points between -2π and +2π\n", "\n", "import matplotlib.pyplot as plt\n", "\n", "plt.plot(x, f(x), label=\"f\")\n", "plt.plot(x, grad(f)(x), label=\"first derivative\")\n", "plt.plot(x, grad(grad(f))(x), label=\"second derivative\")\n", "plt.plot(x, grad(grad(grad(f)))(x), label=\"third derivative\")\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "-39gouo7mtgu" }, "source": [ "## Gradient tapes\n", "\n", "Every differentiable TensorFlow operation has an associated gradient function. For example, the gradient function of `tf.square(x)` would be a function that returns `2.0 * x`. To compute the gradient of a user-defined function (like `f(x)` in the example above), TensorFlow first \"records\" all the operations applied to compute the output of the function. We call this record a \"tape\". 
{ "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "PyControlFlowNote" }, "source": [ "Since operations are recorded as they are executed, Python control flow (using `if`s and `while`s for example) is naturally handled:\n", "\n" ] },
{ "cell_type": "code", "execution_count": 0, "metadata": { "colab": { "autoexec": { "startup": false, "wait_interval": 0 } }, "colab_type": "code", "id": "MH0UfjympWf7" }, "outputs": [], "source": [ "def f(x, y):\n", "  output = 1.0\n", "  # Must use range(int(y)) instead of range(y) in Python 3 when\n", "  # using TensorFlow 1.10 and earlier. Can use range(y) in 1.11+.\n", "  for i in range(int(y)):\n", "    output = tf.multiply(output, x)\n", "  return output\n", "\n", "def g(x, y):\n", "  # Return the gradient of `f` with respect to its first parameter.\n", "  return tfe.gradients_function(f)(x, y)[0]\n", "\n", "assert f(3.0, 2).numpy() == 9.0   # f(x, 2) is essentially x * x\n", "assert g(3.0, 2).numpy() == 6.0   # And its gradient will be 2 * x\n", "assert f(4.0, 3).numpy() == 64.0  # f(x, 3) is essentially x * x * x\n", "assert g(4.0, 3).numpy() == 48.0  # And its gradient will be 3 * x * x" ] },
{ "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "aNmR5-jhpX2t" }, "source": [ "At times it may be inconvenient to encapsulate the computation of interest into a function: for example, when you want the gradient of the output with respect to intermediate values computed inside the function. In such cases, the slightly more verbose but explicit [tf.GradientTape](https://www.tensorflow.org/api_docs/python/tf/GradientTape) context is useful. All computation inside the context of a `tf.GradientTape` is \"recorded\".\n", "\n", "For example:" ] },
{ "cell_type": "code", "execution_count": 0, "metadata": { "colab": { "autoexec": { "startup": false, "wait_interval": 0 } }, "colab_type": "code", "id": "bAFeIE8EuVIq" }, "outputs": [], "source": [ "x = tf.ones((2, 2))\n", "\n", "# A tape normally supports only a single t.gradient() call; make it\n", "# persistent so we can compute several gradients from the same recording.\n", "with tf.GradientTape(persistent=True) as t:\n", "  t.watch(x)\n", "  y = tf.reduce_sum(x)\n", "  z = tf.multiply(y, y)\n", "\n", "# Use the same tape to compute the derivative of z with respect to the\n", "# intermediate value y.\n", "dz_dy = t.gradient(z, y)\n", "assert dz_dy.numpy() == 8.0\n", "\n", "# Derivative of z with respect to the original input tensor x.\n", "dz_dx = t.gradient(z, x)\n", "for i in [0, 1]:\n", "  for j in [0, 1]:\n", "    assert dz_dx[i][j].numpy() == 8.0" ] },
{ "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "DK05KXrAAld3" }, "source": [ "### Higher-order gradients\n", "\n", "Operations inside the `GradientTape` context manager are recorded for automatic differentiation. If gradients are computed inside that context, then the gradient computation is recorded as well. As a result, the exact same API works for higher-order gradients." ] },
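{ "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "HigherOrderCrossCheck" }, "source": [ "As a minimal cross-check (the cubic function and the input `1.0` below simply mirror the next example), the `tfe.gradients_function` approach from earlier can be composed with itself to produce the same second derivative:" ] },
{ "cell_type": "code", "execution_count": 0, "metadata": { "colab": { "autoexec": { "startup": false, "wait_interval": 0 } }, "colab_type": "code", "id": "HigherOrderCrossCheckCode" }, "outputs": [], "source": [ "def f(x):\n", "  return x * x * x\n", "\n", "# First derivative of f, itself an ordinary (differentiable) Python function.\n", "df = lambda x: tfe.gradients_function(f)(x)[0]\n", "\n", "# Differentiate again: d2/dx2 of x**3 is 6 * x, so 6.0 at x = 1.0.\n", "d2f = lambda x: tfe.gradients_function(df)(x)[0]\n", "assert d2f(1.0).numpy() == 6.0" ] },
{ "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "NestedTapesNote" }, "source": [ "Nested `tf.GradientTape` contexts give the same result. For example:" ] },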
{ "cell_type": "code", "execution_count": 0, "metadata": { "colab": { "autoexec": { "startup": false, "wait_interval": 0 } }, "colab_type": "code", "id": "cPQgthZ7ugRJ" }, "outputs": [], "source": [ "x = tf.constant(1.0)  # Convert the Python 1.0 to a Tensor object\n", "\n", "with tf.GradientTape() as t:\n", "  with tf.GradientTape() as t2:\n", "    t2.watch(x)\n", "    y = x * x * x\n", "  # Compute the gradient inside the 't' context manager\n", "  # which means the gradient computation is differentiable as well.\n", "  dy_dx = t2.gradient(y, x)\n", "d2y_dx2 = t.gradient(dy_dx, x)\n", "\n", "assert dy_dx.numpy() == 3.0\n", "assert d2y_dx2.numpy() == 6.0" ] },
{ "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "4U1KKzUpNl58" }, "source": [ "## Next Steps\n", "\n", "In this tutorial we covered gradient computation in TensorFlow. With that we have enough of the primitives required to build and train neural networks." ] },
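{ "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "TrainingPreviewNote" }, "source": [ "As a small preview, here is a minimal sketch of gradient descent driven by a tape; the toy quadratic loss, the variable `w`, the learning rate, and the number of steps are all illustrative choices:" ] },
{ "cell_type": "code", "execution_count": 0, "metadata": { "colab": { "autoexec": { "startup": false, "wait_interval": 0 } }, "colab_type": "code", "id": "TrainingPreviewSketch" }, "outputs": [], "source": [ "# Minimal sketch: minimize a toy quadratic loss with gradient descent,\n", "# using the tape to get dloss/dw on every step. Trainable variables are\n", "# watched automatically, so no explicit watch() call is needed.\n", "w = tfe.Variable(5.0)\n", "\n", "def loss(w):\n", "  return w * w  # toy loss, minimized at w == 0\n", "\n", "learning_rate = 0.1\n", "for _ in range(50):\n", "  with tf.GradientTape() as tape:\n", "    current_loss = loss(w)\n", "  dw = tape.gradient(current_loss, w)\n", "  w.assign_sub(learning_rate * dw)  # gradient descent step\n", "\n", "print(w.numpy())  # should be close to 0.0" ] }
 ], "metadata": { "colab": { "collapsed_sections": [], "default_view": {}, "name": "automatic_differentiation.ipynb", "private_outputs": true, "provenance": [], "toc_visible": true, "version": "0.3.2", "views": {} }, "kernelspec": { "display_name": "Python 3", "name": "python3" } }, "nbformat": 4, "nbformat_minor": 0 }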