{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# CS 20 : TensorFlow for Deep Learning Research\n",
    "## Lecture 04 : Eager execution\n",
    "### Automatic differentiation and gradient tape\n",
    "* Reference\n",
    " + https://www.tensorflow.org/tutorials/eager/automatic_differentiation?hl=ko"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Setup"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1.12.0\n"
     ]
    }
   ],
   "source": [
    "from __future__ import absolute_import, division, print_function\n",
    "import numpy as np\n",
    "import tensorflow as tf\n",
    "\n",
    "tf.enable_eager_execution()\n",
    "\n",
    "print(tf.__version__)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Gradient tapes\n",
    "\n",
    "TensorFlow provides the `tf.GradientTape` API for automatic differentiation - ***computing the gradient of a computation with respect to its input variables. Tensorflow \"records\" all operations executed inside the context of a `tf.GradientTape` onto a \"tape\". Tensorflow then uses that tape and the gradients associated with each recorded operation to compute the gradients of a \"recorded\" computation using reverse mode differentiation.***"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tf.Tensor(8.0, shape=(), dtype=float32)\n"
     ]
    }
   ],
   "source": [
    "# Trainable variables (created by `tf.Variable` or `tf.get_variable`, where\n",
    "# `trainable=True` is default in both cases) are automatically watched. Tensors\n",
    "# can be manually watched by invoking the `watch` method on this context\n",
    "# manager.\n",
    "\n",
    "x = tf.constant(1, dtype = tf.float32)\n",
    "\n",
    "# z = y^2, y = 2x, z = (2x)^2\n",
    "with tf.GradientTape() as tape:\n",
    "    tape.watch(x)\n",
    "    y = tf.add(x, x)\n",
    "    z = tf.multiply(y, y)\n",
    "    \n",
    "# Derivative of z with respect to the original input tensor x\n",
    "dz_dx = tape.gradient(target = z, sources = x)\n",
    "print(dz_dx)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can also request gradients of the output with respect to intermediate values computed during a \"recorded\" `tf.GradientTape` context."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tf.Tensor(4.0, shape=(), dtype=float32)\n"
     ]
    }
   ],
   "source": [
    "x = tf.constant(1, dtype = tf.float32)\n",
    "\n",
    "# z = y^2, y = 2x, z = (2x)^2\n",
    "with tf.GradientTape() as tape:\n",
    "    tape.watch(x)\n",
    "    y = tf.add(x, x)\n",
    "    z = tf.multiply(y, y)\n",
    "    \n",
    "# Use the tape to compute the derivative of z with respect to the\n",
    "# intermediate value y.\n",
    "dz_dy = tape.gradient(target = z, sources = y)\n",
    "print(dz_dy)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "By default, the resources held by a GradientTape are released as soon as GradientTape.gradient() method is called. To compute multiple gradients over the same computation, create a persistent gradient tape. This allows multiple calls to the `gradient()` method. as resources are released when the tape object is garbage collected. For example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tf.Tensor(4.0, shape=(), dtype=float32) tf.Tensor(2.0, shape=(), dtype=float32) tf.Tensor(8.0, shape=(), dtype=float32)\n"
     ]
    }
   ],
   "source": [
    "x = tf.constant(1, dtype = tf.float32)\n",
    "\n",
    "# z = y^2, y = 2x, z = (2x)^2\n",
    "with tf.GradientTape(persistent = True) as tape:\n",
    "    tape.watch(x)\n",
    "    y = tf.add(x, x)\n",
    "    z = tf.multiply(y, y)\n",
    "    \n",
    "dz_dy = tape.gradient(target = z, sources = y)\n",
    "dy_dx = tape.gradient(target = y, sources = x)\n",
    "dz_dx = tape.gradient(target = z, sources = x)\n",
    "\n",
    "print(dz_dy, dy_dx, dz_dx)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Recording control flow\n",
    "Because tapes record operations as they are executed, Python control flow (using `if`s and `while`s for example) is naturally handled:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tf.Tensor(12.0, shape=(), dtype=float32)\n",
      "tf.Tensor(12.0, shape=(), dtype=float32)\n",
      "tf.Tensor(4.0, shape=(), dtype=float32)\n"
     ]
    }
   ],
   "source": [
    "def f(x, y):\n",
    "    output = 1.0\n",
    "    for i in range(y):\n",
    "        if i > 1 and i < 5:\n",
    "            output = tf.multiply(output, x)\n",
    "    return output\n",
    "\n",
    "def grad(x, y):\n",
    "    with tf.GradientTape() as tape:\n",
    "        tape.watch(x)\n",
    "        out = f(x, y)\n",
    "    return tape.gradient(out, x) \n",
    "\n",
    "x = tf.convert_to_tensor(2.0)\n",
    "\n",
    "print(grad(x, 6)) # out = x^3\n",
    "print(grad(x, 5)) # out = x^3\n",
    "print(grad(x, 4)) # out = x^2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "####  Higher-order gradients"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Operations inside of the `GradientTape` context manager are recorded for automatic differentiation. If gradients are computed in that context, then the gradient computation is recorded as well. As a result, the exact same API works for higher-order gradients as well. For example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tf.Tensor(3.0, shape=(), dtype=float32)\n",
      "tf.Tensor(6.0, shape=(), dtype=float32)\n"
     ]
    }
   ],
   "source": [
    "x = tf.Variable(1.0)  # Create a Tensorflow variable initialized to 1.0\n",
    "\n",
    "with tf.GradientTape() as t:\n",
    "    with tf.GradientTape() as t2:\n",
    "        y = x * x * x\n",
    "    # Compute the gradient inside the 't' context manager\n",
    "    # which means the gradient computation is differentiable as well.\n",
    "    dy_dx = t2.gradient(y, x)\n",
    "d2y_dx2 = t.gradient(dy_dx, x)\n",
    "\n",
    "print(dy_dx)\n",
    "print(d2y_dx2)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}