{ "cells": [ { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "CCQY7jpBfMur" }, "source": [ "##### Copyright 2018 The TensorFlow Authors." ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "cellView": "form", "colab": {}, "colab_type": "code", "id": "z6X9omPnfO_h" }, "outputs": [], "source": [ "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n", "# you may not use this file except in compliance with the License.\n", "# You may obtain a copy of the License at\n", "#\n", "# https://www.apache.org/licenses/LICENSE-2.0\n", "#\n", "# Unless required by applicable law or agreed to in writing, software\n", "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", "# See the License for the specific language governing permissions and\n", "# limitations under the License." ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "2QQJJyDzqGRb" }, "source": [ "# Eager execution\n" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "B1xdylywqUSX" }, "source": [ "\n", " \n", " \n", " \n", " \n", "
" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "EGjDcGxIqEfX" }, "source": [ "TensorFlow's eager execution is an imperative programming environment that\n", "evaluates operations immediately, without building graphs: operations return\n", "concrete values instead of constructing a computational graph to run later. This\n", "makes it easy to get started with TensorFlow and debug models, and it\n", "reduces boilerplate as well. To follow along with this guide, run the code\n", "samples below in an interactive `python` interpreter.\n", "\n", "Eager execution is a flexible machine learning platform for research and\n", "experimentation, providing:\n", "\n", "* *An intuitive interface*—Structure your code naturally and use Python data\n", " structures. Quickly iterate on small models and small data.\n", "* *Easier debugging*—Call ops directly to inspect running models and test\n", " changes. Use standard Python debugging tools for immediate error reporting.\n", "* *Natural control flow*—Use Python control flow instead of graph control\n", " flow, simplifying the specification of dynamic models.\n", "\n", "Eager execution supports most TensorFlow operations and GPU acceleration.\n", "\n", "Note: Some models may experience increased overhead with eager execution\n", "enabled. Performance improvements are ongoing, but please\n", "[file a bug](https://github.com/tensorflow/tensorflow/issues) if you find a\n", "problem and share your benchmarks." ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "RBAeIwOMrYk8" }, "source": [ "## Setup and basic usage" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "ByNsp4VqqEfa" }, "outputs": [], "source": [ "import os\n", "\n", "import tensorflow as tf\n", "\n", "import cProfile" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "48P3-8q4qEfe" }, "source": [ "In Tensorflow 2.0, eager execution is enabled by default." ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "attributes": { "classes": [ "py" ], "id": "" }, "colab": {}, "colab_type": "code", "id": "7aFsD8csqEff" }, "outputs": [], "source": [ "tf.executing_eagerly()" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "x_G1zZT5qEfh" }, "source": [ "Now you can run TensorFlow operations and the results will return immediately:" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "attributes": { "classes": [ "py" ], "id": "" }, "colab": {}, "colab_type": "code", "id": "9gsI54pbqEfj" }, "outputs": [], "source": [ "x = [[2.]]\n", "m = tf.matmul(x, x)\n", "print(\"hello, {}\".format(m))" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "ajFn6qsdqEfl" }, "source": [ "Enabling eager execution changes how TensorFlow operations behave—now they\n", "immediately evaluate and return their values to Python. `tf.Tensor` objects\n", "reference concrete values instead of symbolic handles to nodes in a computational\n", "graph. Since there isn't a computational graph to build and run later in a\n", "session, it's easy to inspect results using `print()` or a debugger. Evaluating,\n", "printing, and checking tensor values does not break the flow for computing\n", "gradients.\n", "\n", "Eager execution works nicely with [NumPy](http://www.numpy.org/). NumPy\n", "operations accept `tf.Tensor` arguments. The TensorFlow\n", "`tf.math` operations convert\n", "Python objects and NumPy arrays to `tf.Tensor` objects. 
The\n", "`tf.Tensor.numpy` method returns the object's value as a NumPy `ndarray`." ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "sTO0_5TYqz1n" }, "outputs": [], "source": [ "a = tf.constant([[1, 2],\n", " [3, 4]])\n", "print(a)" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "Dp14YT8Gq4r1" }, "outputs": [], "source": [ "# Broadcasting support\n", "b = tf.add(a, 1)\n", "print(b)" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "69p3waMfq8cQ" }, "outputs": [], "source": [ "# Operator overloading is supported\n", "print(a * b)" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "attributes": { "classes": [ "py" ], "id": "" }, "colab": {}, "colab_type": "code", "id": "Ui025t1qqEfm" }, "outputs": [], "source": [ "# Use NumPy values\n", "import numpy as np\n", "\n", "c = np.multiply(a, b)\n", "print(c)" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "Tq_aFRzWrCua" }, "outputs": [], "source": [ "# Obtain numpy value from a tensor:\n", "print(a.numpy())\n", "# => [[1 2]\n", "# [3 4]]" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "H08f9ss9qEft" }, "source": [ "## Dynamic control flow\n", "\n", "A major benefit of eager execution is that all the functionality of the host\n", "language is available while your model is executing. So, for example,\n", "it is easy to write [fizzbuzz](https://en.wikipedia.org/wiki/Fizz_buzz):" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "attributes": { "classes": [ "py" ], "id": "" }, "colab": {}, "colab_type": "code", "id": "0fudRMeUqEfu" }, "outputs": [], "source": [ "def fizzbuzz(max_num):\n", " counter = tf.constant(0)\n", " max_num = tf.convert_to_tensor(max_num)\n", " for num in range(1, max_num.numpy()+1):\n", " num = tf.constant(num)\n", " if int(num % 3) == 0 and int(num % 5) == 0:\n", " print('FizzBuzz')\n", " elif int(num % 3) == 0:\n", " print('Fizz')\n", " elif int(num % 5) == 0:\n", " print('Buzz')\n", " else:\n", " print(num.numpy())\n", " counter += 1" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "P2cKknQWrJLB" }, "outputs": [], "source": [ "fizzbuzz(15)" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "7kA-aC3BqEfy" }, "source": [ "This has conditionals that depend on tensor values and it prints these values\n", "at runtime." ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "8huKpuuAwICq" }, "source": [ "## Eager training" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "mp2lCCZYrxHd" }, "source": [ "### Computing gradients\n", "\n", "[Automatic differentiation](https://en.wikipedia.org/wiki/Automatic_differentiation)\n", "is useful for implementing machine learning algorithms such as\n", "[backpropagation](https://en.wikipedia.org/wiki/Backpropagation) for training\n", "neural networks. During eager execution, use `tf.GradientTape` to trace\n", "operations for computing gradients later.\n", "\n", "You can use `tf.GradientTape` to train and/or compute gradients in eager. It is especially useful for complicated training loops. \n", "\n", "Since different operations can occur during each call, all\n", "forward-pass operations get recorded to a \"tape\". To compute the gradient, play\n", "the tape backwards and then discard. 
A particular `tf.GradientTape` can only\n", "compute one gradient; subsequent calls throw a runtime error. To compute multiple\n", "gradients from the same tape, create it with `persistent=True`." ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "attributes": { "classes": [ "py" ], "id": "" }, "colab": {}, "colab_type": "code", "id": "7g1yWiSXqEf-" }, "outputs": [], "source": [ "w = tf.Variable([[1.0]])\n", "with tf.GradientTape() as tape:\n", " loss = w * w\n", "\n", "grad = tape.gradient(loss, w)\n", "print(grad) # => tf.Tensor([[ 2.]], shape=(1, 1), dtype=float32)" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "vkHs32GqweYS" }, "source": [ "### Train a model\n", "\n", "The following example creates a multi-layer model that classifies the standard\n", "MNIST handwritten digits. It demonstrates the optimizer and layer APIs to build\n", "trainable graphs in an eager execution environment." ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "38kymXZowhhz" }, "outputs": [], "source": [ "# Fetch and format the MNIST data\n", "(mnist_images, mnist_labels), _ = tf.keras.datasets.mnist.load_data()\n", "\n", "dataset = tf.data.Dataset.from_tensor_slices(\n", " (tf.cast(mnist_images[...,tf.newaxis]/255, tf.float32),\n", " tf.cast(mnist_labels,tf.int64)))\n", "dataset = dataset.shuffle(1000).batch(32)" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "rl1K8rOowmwT" }, "outputs": [], "source": [ "# Build the model\n", "mnist_model = tf.keras.Sequential([\n", " tf.keras.layers.Conv2D(16,[3,3], activation='relu',\n", " input_shape=(None, None, 1)),\n", " tf.keras.layers.Conv2D(16,[3,3], activation='relu'),\n", " tf.keras.layers.GlobalAveragePooling2D(),\n", " tf.keras.layers.Dense(10)\n", "])" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "fvyk-HgGwxwl" }, "source": [ "Even without training, call the model and inspect the output in eager execution:" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "BsxystjBwxLS" }, "outputs": [], "source": [ "for images,labels in dataset.take(1):\n", " print(\"Logits: \", mnist_model(images[0:1]).numpy())" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "Y3PGa8G7qEgB" }, "source": [ "While Keras models have a built-in training loop (using the `fit` method), sometimes you need more customization. Here's an example of a training loop implemented with eager execution:" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "bzRhM7JDnaEG" }, "outputs": [], "source": [ "optimizer = tf.keras.optimizers.Adam()\n", "loss_object = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)\n", "\n", "loss_history = []" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "tXaupYXRI2YM" }, "source": [ "Note: Use the assert functions in `tf.debugging` to check whether a condition holds. This works in both eager and graph execution." 
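] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "ksAssertSketchMd" }, "source": [ "The next cell is a small sketch added for illustration (the helper name `assert_positive` is hypothetical, not part of the original example): it wraps a `tf.debugging` assert in a `tf.function` to show that the same check runs under graph execution as well as eagerly." ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "ksAssertSketchCode" }, "outputs": [], "source": [ "# Minimal sketch: the same assert runs eagerly and inside a tf.function (graph execution).\n", "@tf.function\n", "def assert_positive(t):\n", "  tf.debugging.assert_greater(t, 0.0, message='expected strictly positive values')\n", "  return t\n", "\n", "assert_positive(tf.constant([1.0, 2.0]))  # Passes under graph execution.\n", "tf.debugging.assert_greater(tf.constant(3.0), 0.0)  # Passes eagerly."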
] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "DDHrigtiCIA4" }, "outputs": [], "source": [ "def train_step(images, labels):\n", " with tf.GradientTape() as tape:\n", " logits = mnist_model(images, training=True)\n", " \n", " # Add asserts to check the shape of the output.\n", " tf.debugging.assert_equal(logits.shape, (32, 10))\n", " \n", " loss_value = loss_object(labels, logits)\n", "\n", " loss_history.append(loss_value.numpy().mean())\n", " grads = tape.gradient(loss_value, mnist_model.trainable_variables)\n", " optimizer.apply_gradients(zip(grads, mnist_model.trainable_variables))" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "attributes": { "classes": [ "py" ], "id": "" }, "colab": {}, "colab_type": "code", "id": "0m1xAXrmqEgJ" }, "outputs": [], "source": [ "def train(epochs):\n", " for epoch in range(epochs):\n", " for (batch, (images, labels)) in enumerate(dataset):\n", " train_step(images, labels)\n", " print ('Epoch {} finished'.format(epoch))" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "C5dGz0p_nf4W" }, "outputs": [], "source": [ "train(epochs = 3)" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "5vG5ql_2vYB5" }, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "\n", "plt.plot(loss_history)\n", "plt.xlabel('Batch #')\n", "plt.ylabel('Loss [entropy]')" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "kKpOlHPLqEgl" }, "source": [ "### Variables and optimizers\n", "\n", "`tf.Variable` objects store mutable `tf.Tensor`-like values accessed during\n", "training to make automatic differentiation easier. \n", "\n", "The collections of variables can be encapsulated into layers or models, along with methods that operate on them. See [Custom Keras layers and models](./keras/custom_layers_and_models.ipynb) for details. The main difference between layers and models is that models add methods like `Model.fit`, `Model.evaluate`, and `Model.save`.\n", "\n", "For example, the automatic differentiation example above\n", "can be rewritten:" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "2qXcPngYk8dN" }, "outputs": [], "source": [ "class Linear(tf.keras.Model):\n", " def __init__(self):\n", " super(Linear, self).__init__()\n", " self.W = tf.Variable(5., name='weight')\n", " self.B = tf.Variable(10., name='bias')\n", " def call(self, inputs):\n", " return inputs * self.W + self.B" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "attributes": { "classes": [ "py" ], "id": "" }, "colab": {}, "colab_type": "code", "id": "nnQLBYmEqEgm" }, "outputs": [], "source": [ "# A toy dataset of points around 3 * x + 2\n", "NUM_EXAMPLES = 2000\n", "training_inputs = tf.random.normal([NUM_EXAMPLES])\n", "noise = tf.random.normal([NUM_EXAMPLES])\n", "training_outputs = training_inputs * 3 + 2 + noise\n", "\n", "# The loss function to be optimized\n", "def loss(model, inputs, targets):\n", " error = model(inputs) - targets\n", " return tf.reduce_mean(tf.square(error))\n", "\n", "def grad(model, inputs, targets):\n", " with tf.GradientTape() as tape:\n", " loss_value = loss(model, inputs, targets)\n", " return tape.gradient(loss_value, [model.W, model.B])" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "Q7x1CDurl3IG" }, "source": [ "Next:\n", "\n", "1. Create the model.\n", "2. 
The derivatives of a loss function with respect to model parameters.\n", "3. A strategy for updating the variables based on the derivatives." ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "SbXJk0f2lztg" }, "outputs": [], "source": [ "model = Linear()\n", "optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)\n", "\n", "print(\"Initial loss: {:.3f}\".format(loss(model, training_inputs, training_outputs)))\n", "\n", "steps = 300\n", "for i in range(steps):\n", " grads = grad(model, training_inputs, training_outputs)\n", " optimizer.apply_gradients(zip(grads, [model.W, model.B]))\n", " if i % 20 == 0:\n", " print(\"Loss at step {:03d}: {:.3f}\".format(i, loss(model, training_inputs, training_outputs)))" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "PV_dqer7pzSH" }, "outputs": [], "source": [ "print(\"Final loss: {:.3f}\".format(loss(model, training_inputs, training_outputs)))" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "rvt_Wj3Tp0hm" }, "outputs": [], "source": [ "print(\"W = {}, B = {}\".format(model.W.numpy(), model.B.numpy()))" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "rPjb8nRWqEgr" }, "source": [ "Note: Variables persist until the last reference to the Python object\n", "is removed, which is when the variable is deleted." ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "scMjg6L6qEgv" }, "source": [ "### Object-based saving\n" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "Y-0ZcCcjwkux" }, "source": [ "A `tf.keras.Model` includes a convenient `save_weights` method allowing you to easily create a checkpoint: " ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "oJrMX94PwD9s" }, "outputs": [], "source": [ "model.save_weights('weights')\n", "status = model.load_weights('weights')" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "2EfTjWV_wEng" }, "source": [ "Using `tf.train.Checkpoint` you can take full control over this process.\n", "\n", "This section is an abbreviated version of the [guide to training checkpoints](./checkpoint.ipynb).\n" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "7z5xRfdHzZOQ" }, "outputs": [], "source": [ "x = tf.Variable(10.)\n", "checkpoint = tf.train.Checkpoint(x=x)" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "IffrUVG7zyVb" }, "outputs": [], "source": [ "x.assign(2.) # Assign a new value to the variable and save.\n", "checkpoint_path = './ckpt/'\n", "checkpoint.save('./ckpt/')" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "attributes": { "classes": [ "py" ], "id": "" }, "colab": {}, "colab_type": "code", "id": "eMT9koCoqEgw" }, "outputs": [], "source": [ "x.assign(11.) # Change the variable after saving.\n", "\n", "# Restore values from the checkpoint\n", "checkpoint.restore(tf.train.latest_checkpoint(checkpoint_path))\n", "\n", "print(x) # => 2.0" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "vbFnP-yLqEgx" }, "source": [ "To save and load models, `tf.train.Checkpoint` stores the internal state of objects,\n", "without requiring hidden variables. 
To record the state of a `model` and\n", "an `optimizer`, pass them to a `tf.train.Checkpoint`:" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "attributes": { "classes": [ "py" ], "id": "" }, "colab": {}, "colab_type": "code", "id": "hWZHyAXMqEg0" }, "outputs": [], "source": [ "model = tf.keras.Sequential([\n", " tf.keras.layers.Conv2D(16,[3,3], activation='relu'),\n", " tf.keras.layers.GlobalAveragePooling2D(),\n", " tf.keras.layers.Dense(10)\n", "])\n", "optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)\n", "checkpoint_dir = 'path/to/model_dir'\n", "if not os.path.exists(checkpoint_dir):\n", " os.makedirs(checkpoint_dir)\n", "checkpoint_prefix = os.path.join(checkpoint_dir, \"ckpt\")\n", "root = tf.train.Checkpoint(optimizer=optimizer,\n", " model=model)\n", "\n", "root.save(checkpoint_prefix)\n", "root.restore(tf.train.latest_checkpoint(checkpoint_dir))" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "R-ITwkBCF6GJ" }, "source": [ "Note: In many training loops, variables are created after `tf.train.Checkpoint.restore` is called. These variables will be restored as soon as they are created, and assertions are available to ensure that a checkpoint has been fully loaded. See the [guide to training checkpoints](./checkpoint.ipynb) for details." ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "3yoD0VJ7qEg3" }, "source": [ "### Object-oriented metrics\n", "\n", "`tf.keras.metrics` are stored as objects. Update a metric by passing the new data to\n", "the callable, and retrieve the result using the metric's `result` method,\n", "for example:" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "attributes": { "classes": [ "py" ], "id": "" }, "colab": {}, "colab_type": "code", "id": "9ccu0iAaqEg5" }, "outputs": [], "source": [ "m = tf.keras.metrics.Mean(\"loss\")\n", "m(0)\n", "m(5)\n", "m.result() # => 2.5\n", "m([8, 9])\n", "m.result() # => 5.5" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "aB8qWtT955pI" }, "source": [ "### Summaries and TensorBoard\n", "\n", "[TensorBoard](https://tensorflow.org/tensorboard) is a visualization tool for\n", "understanding, debugging, and optimizing the model training process. It uses\n", "summary events that are written while executing the program.\n", "\n", "You can use `tf.summary` to record summaries of variables in eager execution.\n", "For example, to record summaries of `loss` once every 100 training steps:" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "z6VInqhA6RH4" }, "outputs": [], "source": [ "logdir = \"./tb/\"\n", "writer = tf.summary.create_file_writer(logdir)\n", "\n", "steps = 1000\n", "with writer.as_default(): # or call writer.set_as_default() before the loop.\n", " for i in range(steps):\n", " step = i + 1\n", " # Calculate loss with your real train function.\n", " loss = 1 - 0.001 * step\n", " if step % 100 == 0:\n", " tf.summary.scalar('loss', loss, step=step)" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "08QQD2j36TaI" }, "outputs": [], "source": [ "!ls tb/" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "xEL4yJe5qEhD" }, "source": [ "## Advanced automatic differentiation topics\n", "\n", "### Dynamic models\n", "\n", "`tf.GradientTape` can also be used in dynamic models. 
This example of a\n", "[backtracking line search](https://wikipedia.org/wiki/Backtracking_line_search)\n", "algorithm looks like normal NumPy code, except that there are gradients and it is\n", "differentiable, despite the complex control flow:" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "attributes": { "classes": [ "py" ], "id": "" }, "colab": {}, "colab_type": "code", "id": "L518n5dkqEhE" }, "outputs": [], "source": [ "def line_search_step(fn, init_x, rate=1.0):\n", " with tf.GradientTape() as tape:\n", " # Variables are automatically tracked.\n", " # But to calculate a gradient from a tensor, you must `watch` it.\n", " tape.watch(init_x)\n", " value = fn(init_x)\n", " grad = tape.gradient(value, init_x)\n", " grad_norm = tf.reduce_sum(grad * grad)\n", " init_value = value\n", " while value > init_value - rate * grad_norm:\n", " x = init_x - rate * grad\n", " value = fn(x)\n", " rate /= 2.0\n", " return x, value" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "gieGOf_DqEhK" }, "source": [ "### Custom gradients\n", "\n", "Custom gradients are an easy way to override gradients. Within the forward function, define the gradient with respect to the\n", "inputs, outputs, or intermediate results. For example, here's an easy way to clip\n", "the norm of the gradients in the backward pass:" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "attributes": { "classes": [ "py" ], "id": "" }, "colab": {}, "colab_type": "code", "id": "-OwwsWUAqEhK" }, "outputs": [], "source": [ "@tf.custom_gradient\n", "def clip_gradient_by_norm(x, norm):\n", " y = tf.identity(x)\n", " def grad_fn(dresult):\n", " return [tf.clip_by_norm(dresult, norm), None]\n", " return y, grad_fn" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "JPLDHkF_qEhN" }, "source": [ "Custom gradients are commonly used to provide a numerically stable gradient for a\n", "sequence of operations:" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "attributes": { "classes": [ "py" ], "id": "" }, "colab": {}, "colab_type": "code", "id": "24WiLROnqEhO" }, "outputs": [], "source": [ "def log1pexp(x):\n", " return tf.math.log(1 + tf.exp(x))\n", "\n", "def grad_log1pexp(x):\n", " with tf.GradientTape() as tape:\n", " tape.watch(x)\n", " value = log1pexp(x)\n", " return tape.gradient(value, x)\n" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "n8fq69r9-B-c" }, "outputs": [], "source": [ "# The gradient computation works fine at x = 0.\n", "grad_log1pexp(tf.constant(0.)).numpy()" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "_VFSU0mG-FSp" }, "outputs": [], "source": [ "# However, x = 100 fails because of numerical instability.\n", "grad_log1pexp(tf.constant(100.)).numpy()" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "-VcTR34rqEhQ" }, "source": [ "Here, the `log1pexp` function can be analytically simplified with a custom\n", "gradient. 
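The derivative of log(1 + exp(x)) is exp(x) / (1 + exp(x)) = 1 - 1 / (1 + exp(x)), the sigmoid of x; writing the gradient in this form avoids the inf and nan values that come from differentiating through the overflowed forward value. 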
The implementation below reuses the value for `tf.exp(x)` that is\n", "computed during the forward pass—making it more efficient by eliminating\n", "redundant calculations:" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "attributes": { "classes": [ "py" ], "id": "" }, "colab": {}, "colab_type": "code", "id": "Q7nvfx_-qEhS" }, "outputs": [], "source": [ "@tf.custom_gradient\n", "def log1pexp(x):\n", " e = tf.exp(x)\n", " def grad(dy):\n", " return dy * (1 - 1 / (1 + e))\n", " return tf.math.log(1 + e), grad\n", "\n", "def grad_log1pexp(x):\n", " with tf.GradientTape() as tape:\n", " tape.watch(x)\n", " value = log1pexp(x)\n", " return tape.gradient(value, x)\n" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "5gHPKMfl-Kge" }, "outputs": [], "source": [ "# As before, the gradient computation works fine at x = 0.\n", "grad_log1pexp(tf.constant(0.)).numpy()" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "u38MOfz3-MDE" }, "outputs": [], "source": [ "# And the gradient computation also works at x = 100.\n", "grad_log1pexp(tf.constant(100.)).numpy()" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "rnZXjfQzqEhV" }, "source": [ "## Performance\n", "\n", "Computation is automatically offloaded to GPUs during eager execution. If you\n", "want control over where a computation runs you can enclose it in a\n", "`tf.device('/gpu:0')` block (or the CPU equivalent):" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "attributes": { "classes": [ "py" ], "id": "" }, "colab": {}, "colab_type": "code", "id": "Ac9Y64H-qEhX" }, "outputs": [], "source": [ "import time\n", "\n", "def measure(x, steps):\n", " # TensorFlow initializes a GPU the first time it's used, exclude from timing.\n", " tf.matmul(x, x)\n", " start = time.time()\n", " for i in range(steps):\n", " x = tf.matmul(x, x)\n", " # tf.matmul can return before completing the matrix multiplication\n", " # (e.g., can return after enqueing the operation on a CUDA stream).\n", " # The x.numpy() call below will ensure that all enqueued operations\n", " # have completed (and will also copy the result to host memory,\n", " # so we're including a little more than just the matmul operation\n", " # time).\n", " _ = x.numpy()\n", " end = time.time()\n", " return end - start\n", "\n", "shape = (1000, 1000)\n", "steps = 200\n", "print(\"Time to multiply a {} matrix by itself {} times:\".format(shape, steps))\n", "\n", "# Run on CPU:\n", "with tf.device(\"/cpu:0\"):\n", " print(\"CPU: {} secs\".format(measure(tf.random.normal(shape), steps)))\n", "\n", "# Run on GPU, if available:\n", "if tf.config.experimental.list_physical_devices(\"GPU\"):\n", " with tf.device(\"/gpu:0\"):\n", " print(\"GPU: {} secs\".format(measure(tf.random.normal(shape), steps)))\n", "else:\n", " print(\"GPU: not found\")" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "RLw3IS7UqEhe" }, "source": [ "A `tf.Tensor` object can be copied to a different device to execute its\n", "operations:" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "attributes": { "classes": [ "py" ], "id": "" }, "colab": {}, "colab_type": "code", "id": "ny6LX2BVqEhf" }, "outputs": [], "source": [ "if tf.config.experimental.list_physical_devices(\"GPU\"):\n", " x = tf.random.normal([10, 10])\n", "\n", " x_gpu0 = x.gpu()\n", " x_cpu = x.cpu()\n", "\n", " _ = tf.matmul(x_cpu, x_cpu) # Runs on CPU\n", " _ = 
tf.matmul(x_gpu0, x_gpu0) # Runs on GPU:0" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "oA_qaII3-p6c" }, "source": [ "### Benchmarks\n", "\n", "For compute-heavy models, such as\n", "[ResNet50](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/eager/python/examples/resnet50)\n", "training on a GPU, eager execution performance is comparable to `tf.function` execution.\n", "The gap grows for models with less computation, and there is work to\n", "be done to optimize hot code paths for models with many small operations.\n", "\n", "## Work with functions\n", "\n", "While eager execution makes development and debugging more interactive,\n", "TensorFlow 1.x-style graph execution has advantages for distributed training, performance\n", "optimizations, and production deployment. To bridge this gap, TensorFlow 2.0 introduces `function`s via the `tf.function` API. For more information, see the [tf.function](./function.ipynb) guide." ] } ], "metadata": { "accelerator": "GPU", "colab": { "collapsed_sections": [], "name": "eager.ipynb", "private_outputs": true, "provenance": [], "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "name": "python3" } }, "nbformat": 4, "nbformat_minor": 0 }