{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "*Accompanying code examples of the book \"Introduction to Artificial Neural Networks and Deep Learning: A Practical Guide with Applications in Python\" by [Sebastian Raschka](https://sebastianraschka.com). All code examples are released under the [MIT license](https://github.com/rasbt/deep-learning-book/blob/master/LICENSE). If you find this content useful, please consider supporting the work by buying a [copy of the book](https://leanpub.com/ann-and-deeplearning).*\n", "\n", "Other code examples and content are available on [GitHub](https://github.com/rasbt/deep-learning-book). The PDF and ebook versions of the book are available through [Leanpub](https://leanpub.com/ann-and-deeplearning)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Appendix G - TensorFlow Basics" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Sebastian Raschka 2017-05-25 \n", "\n", "tensorflow 1.1.0\n", "numpy 1.12.1\n" ] } ], "source": [ "%load_ext watermark\n", "%watermark -a 'Sebastian Raschka' -d -p tensorflow,numpy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Table of Contents\n", "\n", "\n", "- TensorFlow in a Nutshell\n", "- Installation\n", "- Computation Graphs Variables\n", "- Placeholder Variables\n", "- Saving and Restoring Models\n", "- Naming TensorFlow Objects\n", "- CPU and GPU\n", "- TensorBoard" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "This appendix offers a brief overview of TensorFlow, an open-source library for numerical computation and deep learning. This section is intended for readers who want to gain a basic overview of this library before progressing through the hands-on sections that are concluding the main chapters.\n", "\n", "The majority of *hands-on* sections in this book focus on TensorFlow and its Python API, assuming that you have TensorFlow >=1.0 installed if you are planning to execute the code sections shown in this book.\n", "\n", "In addition to glancing over this appendix, I recommend the following resources from TensorFlow's official documentation for a more in-depth coverage on using TensorFlow:\n", "\n", "- **[Download and setup instructions](https://www.tensorflow.org/get_started/os_setup)**\n", "- **[Python API documentation](https://www.tensorflow.org/api_docs/python/)**\n", "- **[Tutorials](https://www.tensorflow.org/tutorials/)**\n", "- **[TensorBoard, an optional tool for visualizing learning](https://www.tensorflow.org/how_tos/summaries_and_tensorboard/)**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## TensorFlow in a Nutshell" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "At its core, TensorFlow is a library for efficient multidimensional array operations with a focus on deep learning. Developed by the Google Brain Team, TensorFlow was open-sourced on November 9th, 2015. And augmented by its convenient Python API layer, TensorFlow has gained much popularity and wide-spread adoption in industry as well as academia.\n", "\n", "TensorFlow shares some similarities with NumPy, such as providing data structures and computations based on multidimensional arrays. 
"\n", "While TensorFlow can be run entirely on a CPU or multiple CPUs, one of the core strengths of this library is its support of GPUs (Graphics Processing Units), which are very efficient at performing highly parallelized numerical computations. In addition, TensorFlow also supports distributed systems as well as mobile computing platforms, including Android and Apple's iOS.\n", "\n", "But what is a *tensor*? In simple terms, we can think of tensors as multidimensional arrays of numbers, as a generalization of scalars, vectors, and matrices.\n", "\n", "1. Scalar: $\\mathbb{R}$\n", "2. Vector: $\\mathbb{R}^n$\n", "3. Matrix: $\\mathbb{R}^{n \\times m}$\n", "4. 3-Tensor: $\\mathbb{R}^{n \\times m \\times p}$\n", "5. ...\n", "\n", "When we describe tensors, we refer to their \"dimensions\" as the *rank* (or *order*) of a tensor, which is not to be confused with the dimensions of a matrix. For instance, an $m \\times n$ matrix, where $m$ is the number of rows and $n$ is the number of columns, would be a special case of a rank-2 tensor. A visual explanation of tensors and their ranks is given in the figure below.\n", "\n", "![Tensors](images/tensors.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Installation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Code conventions in this book follow the Python 3.x syntax, and while the code examples should be backward compatible with Python 2.7, I highly recommend the use of Python >=3.5.\n", "\n", "Once you have your Python environment set up ([Appendix - Python Setup]), the most convenient ways to install TensorFlow are via `pip` or `conda` -- the latter only applies if you have the Anaconda/Miniconda Python distribution installed, which I prefer and recommend.\n", "\n", "Since TensorFlow is under active development, I recommend consulting the official \"[Download and Setup](https://www.tensorflow.org/get_started/os_setup)\" documentation for detailed instructions on installing TensorFlow on your operating system (macOS, Linux, or Windows).\n",
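"\n", "For reference, the basic commands typically look as follows (a sketch only -- package names and channels may change over time, so prefer the official instructions above):\n", "\n", "    pip install tensorflow      # CPU-only version\n", "    pip install tensorflow-gpu  # version with GPU support\n", "\n", "or, if you are using conda (for example, via the conda-forge channel):\n", "\n", "    conda install -c conda-forge tensorflow\n"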
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Computation Graphs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In contrast to other tools such as NumPy, the numerical computations in TensorFlow can be categorized into two steps: a construction step and an execution step. Consequently, the typical workflow in TensorFlow can be summarized as follows:\n", "\n", "- Build a computational graph\n", "- Start a new *session* to evaluate the graph\n", "    - Initialize variables\n", "    - Execute the operations in the compiled graph\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the computation graph has no numerical values before we initialize and evaluate it. To see how this looks in practice, let us set up a new graph for computing the column sums of a matrix, which we define as a constant tensor (`reduce_sum` is the TensorFlow equivalent of NumPy's `sum` function).\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "tf_x:\n", " Tensor(\"Const:0\", shape=(3, 2), dtype=float32)\n", "\n", "col_sum:\n", " Tensor(\"Sum:0\", shape=(2,), dtype=float32)\n" ] } ], "source": [ "import tensorflow as tf\n", "\n", "g = tf.Graph()\n", "\n", "with g.as_default() as g:\n", "    tf_x = tf.constant([[1., 2.],\n", "                        [3., 4.],\n", "                        [5., 6.]], dtype=tf.float32)\n", "    col_sum = tf.reduce_sum(tf_x, axis=0)\n", "\n", "print('tf_x:\\n', tf_x)\n", "print('\\ncol_sum:\\n', col_sum)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As we can see from the output above, the operations in the graph are represented as `Tensor` objects that require an explicit evaluation before the `tf_x` matrix is populated with numerical values and its column sum gets computed.\n", "\n", "Now, we pass the graph that we created earlier to a new, active *session*, where the graph gets compiled and evaluated:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "mat:\n", " [[ 1. 2.]\n", " [ 3. 4.]\n", " [ 5. 6.]]\n", "\n", "csum:\n", " [ 9. 12.]\n" ] } ], "source": [ "with tf.Session(graph=g) as sess:\n", "    mat, csum = sess.run([tf_x, col_sum])\n", "\n", "print('mat:\\n', mat)\n", "print('\\ncsum:\\n', csum)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that if we are only interested in the result of a particular operation, we don't need to `run` its dependencies -- TensorFlow will automatically take care of that. For instance, we can directly fetch the numerical values of `col_sum_times_2` in the active session without explicitly passing `col_sum` to `sess.run(...)`, as the following example illustrates:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "csum_2:\n", " [ 18. 24.]\n" ] } ], "source": [ "g = tf.Graph()\n", "\n", "with g.as_default() as g:\n", "    tf_x = tf.constant([[1., 2.],\n", "                        [3., 4.],\n", "                        [5., 6.]], dtype=tf.float32)\n", "    col_sum = tf.reduce_sum(tf_x, axis=0)\n", "    col_sum_times_2 = col_sum * 2\n", "\n", "\n", "with tf.Session(graph=g) as sess:\n", "    csum_2 = sess.run(col_sum_times_2)\n", "\n", "print('csum_2:\\n', csum_2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Variables" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Variables are constructs in TensorFlow that allow us to store and update parameters of our models in the current session during training. To define a \"variable\" tensor, we use TensorFlow's `Variable()` constructor, which looks similar to the use of `constant` that we used to create a matrix previously. However, to execute a computational graph that contains variables, we must initialize all variables in the active session first (using `tf.global_variables_initializer()`), as illustrated in the example below.\n" ] },
{ "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 2. 3.]\n", " [ 4. 5.]\n", " [ 6. 7.]]\n" ] } ], "source": [ "g = tf.Graph()\n", "\n", "with g.as_default() as g:\n", "    tf_x = tf.Variable([[1., 2.],\n", "                        [3., 4.],\n", "                        [5., 6.]], dtype=tf.float32)\n", "    x = tf.constant(1., dtype=tf.float32)\n", "\n", "    # add a constant to the matrix:\n", "    tf_x = tf_x + x\n", "\n", "with tf.Session(graph=g) as sess:\n", "    sess.run(tf.global_variables_initializer())\n", "    result = sess.run(tf_x)\n", "\n", "print(result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, let us do an experiment and evaluate the same graph twice:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 2. 3.]\n", " [ 4. 5.]\n", " [ 6. 7.]]\n" ] } ], "source": [ "with tf.Session(graph=g) as sess:\n", "    sess.run(tf.global_variables_initializer())\n", "    result = sess.run(tf_x)\n", "    result = sess.run(tf_x)\n", "\n", "print(result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As we can see, running the computation twice did not affect the numerical values fetched from the graph. To update or to assign new values to a variable, we use TensorFlow's `assign` operation. The function syntax of `assign` is `assign(ref, value, ...)`, where '`ref`' is updated by assigning '`value`' to it:\n" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 3. 4.]\n", " [ 5. 6.]\n", " [ 7. 8.]]\n" ] } ], "source": [ "g = tf.Graph()\n", "\n", "with g.as_default() as g:\n", "    tf_x = tf.Variable([[1., 2.],\n", "                        [3., 4.],\n", "                        [5., 6.]], dtype=tf.float32)\n", "    x = tf.constant(1., dtype=tf.float32)\n", "\n", "    update_tf_x = tf.assign(tf_x, tf_x + x)\n", "\n", "\n", "with tf.Session(graph=g) as sess:\n", "    sess.run(tf.global_variables_initializer())\n", "    result = sess.run(update_tf_x)\n", "    result = sess.run(update_tf_x)\n", "\n", "print(result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As we can see, the contents of the variable `tf_x` were successfully updated twice; in the active session, we\n", "\n", "- initialized the variable `tf_x`\n", "- added a constant scalar `1.` to the `tf_x` matrix via `assign`\n", "- added a constant scalar `1.` to the previously updated `tf_x` matrix via `assign`\n", "\n", "Although the example above is kept simple for illustrative purposes, variables are an important concept in TensorFlow, and as we will see throughout the chapters, they are not only useful for updating model parameters but also for saving and loading variables for reuse." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Placeholder Variables" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "Another important concept in TensorFlow is the use of placeholder variables, which allow us to feed the computational graph with numerical values in an active session at runtime.\n", "\n", "In the following example, we will define a computational graph that performs a simple matrix multiplication operation. First, we define a placeholder variable that can hold 3x2-dimensional matrices. Then, in the active session, we use a dictionary, `feed_dict`, to feed a NumPy array to the graph, which then evaluates the matrix multiplication operation.\n" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 25. 39. 53.]\n", " [ 39. 61. 83.]\n", " [ 53. 83. 113.]]\n" ] } ], "source": [ "import numpy as np\n", "\n", "g = tf.Graph()\n", "\n", "with g.as_default() as g:\n", "    tf_x = tf.placeholder(dtype=tf.float32,\n", "                          shape=(3, 2))\n", "\n", "    output = tf.matmul(tf_x, tf.transpose(tf_x))\n", "\n", "\n", "with tf.Session(graph=g) as sess:\n", "    sess.run(tf.global_variables_initializer())\n", "    np_ary = np.array([[3., 4.],\n", "                       [5., 6.],\n", "                       [7., 8.]])\n", "    feed_dict = {tf_x: np_ary}\n", "    print(sess.run(output,\n", "                   feed_dict=feed_dict))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Throughout the main chapters, we will make heavy use of placeholder variables, which allow us to pass our datasets to various learning algorithms in the computational graphs.\n" ] },
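{ "cell_type": "markdown", "metadata": {}, "source": [ "A particularly convenient feature of placeholders is that individual dimensions can be left unspecified. As a minimal sketch, setting the first dimension of the shape to `None` lets us feed matrices with a varying number of rows -- for example, differently sized mini-batches -- into the same graph:\n", "\n", "```python\n", "import numpy as np\n", "import tensorflow as tf\n", "\n", "g = tf.Graph()\n", "\n", "with g.as_default() as g:\n", "    # None: the number of rows may vary from run to run\n", "    tf_x = tf.placeholder(dtype=tf.float32, shape=(None, 2))\n", "    row_sums = tf.reduce_sum(tf_x, axis=1)\n", "\n", "with tf.Session(graph=g) as sess:\n", "    print(sess.run(row_sums, feed_dict={tf_x: np.array([[1., 2.],\n", "                                                        [3., 4.]])}))\n", "    # -> [ 3. 7.]\n", "    print(sess.run(row_sums, feed_dict={tf_x: np.array([[1., 2.],\n", "                                                        [3., 4.],\n", "                                                        [5., 6.]])}))\n", "    # -> [ 3. 7. 11.]\n", "```" ] },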
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Saving and Loading Variables" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Training deep neural networks requires a lot of computation and resources, and in practice, it would be infeasible to retrain our model each time we start a new TensorFlow session before we can use it to make predictions. In this section, we will go over the basics of saving and re-using the results of our TensorFlow models.\n", "\n", "The most convenient way to store the main components of our model is to use TensorFlow's `Saver` class (`tf.train.Saver()`). To see how it works, let us reuse the simple example from the [Variables](#variables) section, where we added a constant `1.` to all elements in a 3x2 matrix:\n" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": true }, "outputs": [], "source": [ "g = tf.Graph()\n", "\n", "with g.as_default() as g:\n", "\n", "    tf_x = tf.Variable([[1., 2.],\n", "                        [3., 4.],\n", "                        [5., 6.]], dtype=tf.float32)\n", "    x = tf.constant(1., dtype=tf.float32)\n", "\n", "    update_tf_x = tf.assign(tf_x, tf_x + x)\n", "\n", "    # initialize a Saver, which gets all variables\n", "    # within this computation graph context\n", "    saver = tf.train.Saver()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, after initializing the graph above, let us execute its operations in a new session:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": true }, "outputs": [], "source": [ "with tf.Session(graph=g) as sess:\n", "    sess.run(tf.global_variables_initializer())\n", "    result = sess.run(update_tf_x)\n", "\n", "    saver.save(sess, save_path='./my-model.ckpt')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice the `saver.save` call above, which saves all variables in the graph to \"checkpoint\" files bearing the prefix `my-model.ckpt` in our local directory (`'./'`). Since we didn't specify which variables we wanted to save when we instantiated `tf.train.Saver()`, it saved all variables in the graph by default -- here, we only have one variable, `tf_x`. Alternatively, if we are only interested in keeping particular variables, we can specify this by feeding `tf.train.Saver()` a dictionary or list of these variables upon instantiation. For example, if our graph contained more than one variable, but we were only interested in saving `tf_x`, we could instantiate a `saver` object as `tf.train.Saver([tf_x])`.\n",
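"\n", "Alternatively, `tf.train.Saver` also accepts a dictionary that maps checkpoint names to variables -- a minimal sketch (the name `tf_x_0` here is an arbitrary choice for illustration):\n", "\n", "```python\n", "g = tf.Graph()\n", "\n", "with g.as_default() as g:\n", "    tf_x = tf.Variable([[1., 2.]], dtype=tf.float32)\n", "    tf_y = tf.Variable([[3., 4.]], dtype=tf.float32)\n", "\n", "    # save only tf_x; the dictionary key determines the name\n", "    # under which the variable is stored in the checkpoint\n", "    saver = tf.train.Saver({'tf_x_0': tf_x})\n", "```\n",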
"\n", "After we executed the previous code example, we should find three `my-model.ckpt` files (in binary format) in our local directory:\n", "\n", "- `my-model.ckpt.data-00000-of-00001`\n", "- `my-model.ckpt.index`\n", "- `my-model.ckpt.meta`\n", "\n", "The file `my-model.ckpt.data-00000-of-00001` saves our main variable values, the `.index` file keeps track of the data structures, and the `.meta` file describes the structure of the computational graph that we executed.\n", "\n", "Note that in our simple example above, we saved our variables only a single time. However, in real-world applications, we typically train models over multiple iterations or epochs, and it is useful to create intermediate checkpoint files during training so that we can pick up where we left off in case we need to interrupt our session or encounter unforeseen technical difficulties. For instance, by using the `global_step` parameter, we could save our results after every 10th iteration by making the following modification to our code:\n" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": true }, "outputs": [], "source": [ "g = tf.Graph()\n", "\n", "with g.as_default() as g:\n", "\n", "    tf_x = tf.Variable([[1., 2.],\n", "                        [3., 4.],\n", "                        [5., 6.]], dtype=tf.float32)\n", "    x = tf.constant(1., dtype=tf.float32)\n", "\n", "    update_tf_x = tf.assign(tf_x, tf_x + x)\n", "\n", "    # initialize a Saver, which gets all variables\n", "    # within this computation graph context\n", "    saver = tf.train.Saver()\n", "\n", "with tf.Session(graph=g) as sess:\n", "    sess.run(tf.global_variables_initializer())\n", "\n", "    for epoch in range(100):\n", "        result = sess.run(update_tf_x)\n", "        if not epoch % 10:\n", "            saver.save(sess,\n", "                       save_path='./my-model-multiple_ckpts.ckpt',\n", "                       global_step=epoch)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After we executed this code, we find the files for the five most recent checkpoints in our local directory:\n", "\n", "- `my-model-multiple_ckpts.ckpt-50` `{.data-00000-of-00001, .index, .meta}`\n", "- `my-model-multiple_ckpts.ckpt-60` `{.data-00000-of-00001, .index, .meta}`\n", "- `my-model-multiple_ckpts.ckpt-70` `{.data-00000-of-00001, .index, .meta}`\n", "- `my-model-multiple_ckpts.ckpt-80` `{.data-00000-of-00001, .index, .meta}`\n", "- `my-model-multiple_ckpts.ckpt-90` `{.data-00000-of-00001, .index, .meta}`\n", "\n", "Although we saved our variables ten times, the `saver` only keeps the five most recent checkpoints by default to save storage space. However, if we want to keep more than five recent checkpoint files, we can provide the optional argument `max_to_keep=n` when we initialize the `saver`, where `n` is an integer specifying the number of most recent checkpoint files we want to keep.\n",
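"\n", "For example, the following one-line change to the graph definition above would keep the ten most recent checkpoint files:\n", "\n", "```python\n", "# keep the ten most recent checkpoints instead of the default five\n", "saver = tf.train.Saver(max_to_keep=10)\n", "```\n",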
"\n", "Now that we learned how to save TensorFlow `Variable`s, let us see how we can restore them. Assuming that we started a fresh computational session, we need to specify the graph first. Then, we can use the `saver`'s `restore` method to restore our variables as shown below:\n" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:Restoring parameters from ./my-model.ckpt\n", "[[ 3. 4.]\n", " [ 5. 6.]\n", " [ 7. 8.]]\n" ] } ], "source": [ "g = tf.Graph()\n", "\n", "with g.as_default() as g:\n", "\n", "    tf_x = tf.Variable([[1., 2.],\n", "                        [3., 4.],\n", "                        [5., 6.]], dtype=tf.float32)\n", "    x = tf.constant(1., dtype=tf.float32)\n", "\n", "    update_tf_x = tf.assign(tf_x, tf_x + x)\n", "\n", "    # initialize a Saver, which gets all variables\n", "    # within this computation graph context\n", "    saver = tf.train.Saver()\n", "\n", "with tf.Session(graph=g) as sess:\n", "    saver.restore(sess, save_path='./my-model.ckpt')\n", "    result = sess.run(update_tf_x)\n", "    print(result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that the values returned for the `tf_x` `Variable` are now increased by a constant of two, compared to the initial values in the graph definition. The reason is that we ran the graph one time before we saved the variable,\n", "\n", "```python\n", "with tf.Session(graph=g) as sess:\n", "    sess.run(tf.global_variables_initializer())\n", "    result = sess.run(update_tf_x)\n", "\n", "    # save the model\n", "    saver.save(sess, save_path='./my-model.ckpt')\n", "```\n", "\n", "and we ran it a second time after we restored the session.\n", "\n", "\n", "Similar to the example above, we can reload one of our intermediate checkpoint files by providing the desired checkpoint suffix (here: `-90`, which is the index of our last checkpoint):\n", "\n" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:Restoring parameters from ./my-model-multiple_ckpts.ckpt-90\n", "[[ 93. 94.]\n", " [ 95. 96.]\n", " [ 97. 98.]]\n" ] } ], "source": [ "with tf.Session(graph=g) as sess:\n", "    saver.restore(sess, save_path='./my-model-multiple_ckpts.ckpt-90')\n", "    result = sess.run(update_tf_x)\n", "    print(result)" ] },
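{ "cell_type": "markdown", "metadata": {}, "source": [ "Instead of hard-coding a checkpoint suffix, we can also query the most recent checkpoint in a directory via `tf.train.latest_checkpoint`. A minimal sketch, assuming the graph `g` and the `saver` defined above, and that the checkpoint files reside in the current directory:\n", "\n", "```python\n", "# look up the most recent checkpoint in the current directory\n", "latest = tf.train.latest_checkpoint('.')\n", "print(latest)  # e.g., './my-model-multiple_ckpts.ckpt-90'\n", "\n", "with tf.Session(graph=g) as sess:\n", "    saver.restore(sess, save_path=latest)\n", "    result = sess.run(update_tf_x)\n", "```" ] },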
{ "cell_type": "markdown", "metadata": {}, "source": [ "In this section, we merely covered the basics of saving and restoring TensorFlow models. If you want to learn more, please take a look at the official [API documentation](https://www.tensorflow.org/api_docs/python/tf/train/Saver) of TensorFlow's `Saver` class.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Naming TensorFlow Objects" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When we create new TensorFlow objects like `Variables`, we can provide an optional argument for their `name` parameter -- for example:\n", "\n", "```python\n", "tf_x = tf.Variable([[1., 2.],\n", "                    [3., 4.],\n", "                    [5., 6.]],\n", "                   name='tf_x_0',\n", "                   dtype=tf.float32)\n", "```\n", "\n", "Assigning names to `Variable`s explicitly is not a requirement, but I personally recommend making it a habit when building (more) complex models. Let us walk through a scenario to illustrate the importance of naming variables, taking the simple example from the previous section and adding a new variable `tf_y` to the graph:\n" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": true }, "outputs": [], "source": [ "g = tf.Graph()\n", "\n", "with g.as_default() as g:\n", "\n", "    tf_x = tf.Variable([[1., 2.],\n", "                        [3., 4.],\n", "                        [5., 6.]], dtype=tf.float32)\n", "\n", "    tf_y = tf.Variable([[7., 8.],\n", "                        [9., 10.],\n", "                        [11., 12.]], dtype=tf.float32)\n", "\n", "    x = tf.constant(1., dtype=tf.float32)\n", "    update_tf_x = tf.assign(tf_x, tf_x + x)\n", "    saver = tf.train.Saver()\n", "\n", "with tf.Session(graph=g) as sess:\n", "    sess.run(tf.global_variables_initializer())\n", "    result = sess.run(update_tf_x)\n", "\n", "    saver.save(sess, save_path='./my-model.ckpt')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The variable `tf_y` does not do anything in the code example above; we added it for illustrative purposes, as we will see in a moment. Now, let us assume we started a new computational session and loaded our saved model into the following computational graph:\n" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:Restoring parameters from ./my-model.ckpt\n", "[[ 8. 9.]\n", " [ 10. 11.]\n", " [ 12. 13.]]\n" ] } ], "source": [ "g = tf.Graph()\n", "\n", "with g.as_default() as g:\n", "\n", "    tf_y = tf.Variable([[7., 8.],\n", "                        [9., 10.],\n", "                        [11., 12.]], dtype=tf.float32)\n", "\n", "    tf_x = tf.Variable([[1., 2.],\n", "                        [3., 4.],\n", "                        [5., 6.]], dtype=tf.float32)\n", "\n", "    x = tf.constant(1., dtype=tf.float32)\n", "    update_tf_x = tf.assign(tf_x, tf_x + x)\n", "    saver = tf.train.Saver()\n", "\n", "with tf.Session(graph=g) as sess:\n", "    saver.restore(sess, save_path='./my-model.ckpt')\n", "    result = sess.run(update_tf_x)\n", "    print(result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Unless you paid close attention to how we initialized the graph above, this result surely was not the one you expected. What happened? Intuitively, we expected our session to `print`\n", "\n", "```python\n", " [[ 3. 4.]\n", "  [ 5. 6.]\n", "  [ 7. 8.]]\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The explanation behind this unexpected `result` is that we reversed the order of `tf_y` and `tf_x` in the graph above. TensorFlow applies a default naming scheme to all operations in the computational graph unless we name them explicitly via the `name` parameter -- in other words, we confused TensorFlow by reversing the creation order of the two similar objects `tf_y` and `tf_x`.\n",
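"\n", "We can make the default naming scheme visible with a minimal sketch (the printed names are the defaults TensorFlow 1.x assigns, which depend only on creation order):\n", "\n", "```python\n", "g = tf.Graph()\n", "\n", "with g.as_default() as g:\n", "    tf_a = tf.Variable([1., 2.], dtype=tf.float32)\n", "    tf_b = tf.Variable([3., 4.], dtype=tf.float32)\n", "\n", "# default names are assigned in creation order\n", "print(tf_a.name)  # -> 'Variable:0'\n", "print(tf_b.name)  # -> 'Variable_1:0'\n", "```\n", "\n", "Since the `Saver` matches variables to checkpoint entries by these names, swapping the creation order also swaps which saved values the variables receive.\n",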
"\n", "To circumvent this problem, we could give our variables specific names -- for example, `'tf_x_0'` and `'tf_y_0'`:\n" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import tensorflow as tf\n", "\n", "g = tf.Graph()\n", "\n", "with g.as_default() as g:\n", "\n", "    tf_x = tf.Variable([[1., 2.],\n", "                        [3., 4.],\n", "                        [5., 6.]],\n", "                       name='tf_x_0',\n", "                       dtype=tf.float32)\n", "\n", "    tf_y = tf.Variable([[7., 8.],\n", "                        [9., 10.],\n", "                        [11., 12.]],\n", "                       name='tf_y_0',\n", "                       dtype=tf.float32)\n", "\n", "    x = tf.constant(1., dtype=tf.float32)\n", "    update_tf_x = tf.assign(tf_x, tf_x + x)\n", "    saver = tf.train.Saver()\n", "\n", "with tf.Session(graph=g) as sess:\n", "    sess.run(tf.global_variables_initializer())\n", "    result = sess.run(update_tf_x)\n", "\n", "    saver.save(sess, save_path='./my-model.ckpt')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then, even if we flip the order of these variables in a new computational graph, TensorFlow knows which values to use for each variable when loading our model -- assuming we provide the corresponding variable names:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:Restoring parameters from ./my-model.ckpt\n", "[[ 3. 4.]\n", " [ 5. 6.]\n", " [ 7. 8.]]\n" ] } ], "source": [ "g = tf.Graph()\n", "\n", "with g.as_default() as g:\n", "\n", "    tf_y = tf.Variable([[7., 8.],\n", "                        [9., 10.],\n", "                        [11., 12.]],\n", "                       name='tf_y_0',\n", "                       dtype=tf.float32)\n", "\n", "    tf_x = tf.Variable([[1., 2.],\n", "                        [3., 4.],\n", "                        [5., 6.]],\n", "                       name='tf_x_0',\n", "                       dtype=tf.float32)\n", "\n", "    x = tf.constant(1., dtype=tf.float32)\n", "    update_tf_x = tf.assign(tf_x, tf_x + x)\n", "    saver = tf.train.Saver()\n", "\n", "with tf.Session(graph=g) as sess:\n", "    saver.restore(sess, save_path='./my-model.ckpt')\n", "    result = sess.run(update_tf_x)\n", "    print(result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## CPU and GPU" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Please note that all code examples in this book, and all TensorFlow operations in general, can be executed on a CPU. If you have the GPU version of TensorFlow installed, TensorFlow will automatically execute those operations that have GPU support on the GPU and use your machine's CPU otherwise.\n",
"However, if you wish to choose your computing device manually, for instance, if you have the GPU version installed but want to use the main CPU for prototyping, you can run operations in an active session on a specific device using the `with` context as follows\n", "\n", "    with tf.Session() as sess:\n", "        with tf.device(\"/gpu:1\"):\n", "\n", "where\n", "\n", "- \"/cpu:0\": The CPU of your machine.\n", "- \"/gpu:0\": The GPU of your machine, if you have one.\n", "- \"/gpu:1\": The second GPU of your machine, etc.\n", "- etc.\n", "\n", "You can get a list of all available devices on your machine via\n", "\n", "    from tensorflow.python.client import device_lib\n", "\n", "    device_lib.list_local_devices()\n", "\n", "For more information on using GPUs in TensorFlow, please refer to the GPU documentation at https://www.tensorflow.org/how_tos/using_gpu/.\n", "\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another good way to check whether your current TensorFlow session runs on a GPU is to execute\n", "\n", "```python\n", ">>> import tensorflow as tf\n", ">>> tf.test.gpu_device_name()\n", "```\n", "\n", "in your current Python session. If a GPU is available to TensorFlow, it will return a non-empty string; for example, `'/gpu:0'`. Otherwise, if no GPU can be found, the function will return an empty string." ] },
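{ "cell_type": "markdown", "metadata": {}, "source": [ "Putting this together, here is a minimal sketch of pinning operations to a device explicitly (we use `/cpu:0` so that the example runs on any machine; swap in `/gpu:0` if you have a GPU). Passing `log_device_placement=True` via the session's `tf.ConfigProto` prints the device chosen for each operation:\n", "\n", "```python\n", "g = tf.Graph()\n", "\n", "with g.as_default() as g:\n", "    # pin the following operations to the CPU explicitly\n", "    with tf.device('/cpu:0'):\n", "        tf_x = tf.constant([[1., 2.],\n", "                            [3., 4.]], dtype=tf.float32)\n", "        result = tf.matmul(tf_x, tf_x)\n", "\n", "with tf.Session(graph=g,\n", "                config=tf.ConfigProto(log_device_placement=True)) as sess:\n", "    print(sess.run(result))\n", "```" ] },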
{ "cell_type": "markdown", "metadata": {}, "source": [ "## TensorBoard" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "TensorBoard is one of the coolest features of TensorFlow, providing us with a suite of tools to visualize our computational graphs and operations before and during runtime. Especially when we are implementing large neural networks, our graphs can be quite complicated, and TensorBoard is not only useful for visually tracking the training cost and performance of our network, but it can also serve as an additional tool for debugging our implementation. In this section, we will go over the basic concepts of TensorBoard, but make sure you also check out the [official documentation](https://www.tensorflow.org/how_tos/summaries_and_tensorboard/) for more details.\n", "\n", "To visualize a computational graph via TensorBoard, let us create a simple graph with two `Variable`s, the tensors `tf_x` and `tf_y` with shape `[3, 2]`. The first operation is to add these two tensors together. Second, we multiply the transpose of `tf_x` with the result of this addition:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Simple graph visualization\n", "\n", "g = tf.Graph()\n", "\n", "with g.as_default() as g:\n", "\n", "    tf_x = tf.Variable([[1., 2.],\n", "                        [3., 4.],\n", "                        [5., 6.]],\n", "                       name='tf_x_0',\n", "                       dtype=tf.float32)\n", "\n", "    tf_y = tf.Variable([[7., 8.],\n", "                        [9., 10.],\n", "                        [11., 12.]],\n", "                       name='tf_y_0',\n", "                       dtype=tf.float32)\n", "\n", "    output = tf_x + tf_y\n", "    output = tf.matmul(tf.transpose(tf_x), output)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we want to visualize the graph via TensorBoard, we need to instantiate a new `FileWriter` object in our session, which we provide with a `logdir` and the graph itself. The `FileWriter` object will then write a [protobuf](https://developers.google.com/protocol-buffers/docs/overview) file to the `logdir` path that we can load into TensorBoard:\n" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 124. 142.]\n", " [ 160. 184.]]\n" ] } ], "source": [ "with tf.Session(graph=g) as sess:\n", "    sess.run(tf.global_variables_initializer())\n", "\n", "    # create FileWriter object that writes the logs\n", "    file_writer = tf.summary.FileWriter(logdir='logs/1', graph=g)\n", "    result = sess.run(output)\n", "    print(result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you installed TensorFlow via `pip`, the `tensorboard` command should be available from your command line terminal. So, after running the preceding code examples for defining the graph and running the session, you just need to execute the command `tensorboard --logdir logs/1`. You should then see an output similar to the following:\n", "\n", "    $ tensorboard --logdir logs/1\n", "    Starting TensorBoard b'41' on port 6006\n", "    (You can navigate to http://xxx.xxx.x.xx:6006)\n", "\n", "Copy and paste the `http` address from the terminal and open it in your favorite web browser to open the TensorBoard window. Then, click on the `Graph` tab at the top to visualize the computational graph as shown in the figure below:\n", "\n", "![TensorBoard](images/tensorboard-1.png)\n", "\n", "In our TensorBoard window, we can now see a visual summary of our computational graph (as shown in the screenshot above). The dark-shaded nodes labeled as `tf_x_0` and `tf_y_0` are the two variables we initialized, and following the connective lines, we can track the flow of operations. We can see the graph edges that connect `tf_x_0` and `tf_y_0` to an `add` node, which is the addition we defined in the graph, followed by the multiplication of the transpose of `tf_x_0` with the result of `add`.\n", "\n", "Next, we are introducing the concept of `name_scope`s, which let us organize different parts in our graph. In the following code example, we are going to take the initial code snippets and add `with tf.name_scope(...)` contexts as follows:\n", "\n" ] },
{ "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 124. 142.]\n", " [ 160. 184.]]\n" ] } ], "source": [ "# Graph visualization with name scopes\n", "\n", "g = tf.Graph()\n", "\n", "with g.as_default() as g:\n", "\n", "    tf_x = tf.Variable([[1., 2.],\n", "                        [3., 4.],\n", "                        [5., 6.]],\n", "                       name='tf_x_0',\n", "                       dtype=tf.float32)\n", "\n", "    tf_y = tf.Variable([[7., 8.],\n", "                        [9., 10.],\n", "                        [11., 12.]],\n", "                       name='tf_y_0',\n", "                       dtype=tf.float32)\n", "\n", "    # add custom name scope\n", "    with tf.name_scope('addition'):\n", "        output = tf_x + tf_y\n", "\n", "    # add custom name scope\n", "    with tf.name_scope('matrix_multiplication'):\n", "        output = tf.matmul(tf.transpose(tf_x), output)\n", "\n", "with tf.Session(graph=g) as sess:\n", "    sess.run(tf.global_variables_initializer())\n", "    file_writer = tf.summary.FileWriter(logdir='logs/2', graph=g)\n", "    result = sess.run(output)\n", "    print(result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After executing the code example above, quit your previous TensorBoard session by pressing `CTRL+C` in the command line terminal and launch a new TensorBoard session via `tensorboard --logdir logs/2`. After you refresh your browser window, you should see the following graph:\n", "\n", "\n", "![TensorBoard](images/tensorboard-2.png)\n", "\n", "Comparing this visualization to our initial one, we can see that our operations have been grouped into our custom name scopes. If we double-click on one of these name scope summary nodes, we can expand it and inspect the individual operations in more detail, as shown for the `matrix_multiplication` name scope in the screenshot below:\n", "\n", "![TensorBoard](images/tensorboard-3.png)\n", "\n", "So far, we have only been looking at the computational graph itself. However, TensorBoard implements many more useful features. In the following example, we will make use of the \"Scalar\" and \"Histogram\" tabs. The \"Scalar\" tab in TensorBoard allows us to track scalar values over time, and the \"Histogram\" tab is useful for displaying the distribution of values in our tensor `Variable`s (for instance, the model parameters during training). For simplicity, let us take our previous code snippet and modify it to demonstrate these capabilities of TensorBoard:\n", "\n" ] },
{ "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Graph visualization and variable inspection\n", "\n", "g = tf.Graph()\n", "\n", "with g.as_default() as g:\n", "\n", "    some_value = tf.placeholder(dtype=tf.int32,\n", "                                shape=None,\n", "                                name='some_value')\n", "\n", "    tf_x = tf.Variable([[1., 2.],\n", "                        [3., 4.],\n", "                        [5., 6.]],\n", "                       name='tf_x_0',\n", "                       dtype=tf.float32)\n", "\n", "    tf_y = tf.Variable([[7., 8.],\n", "                        [9., 10.],\n", "                        [11., 12.]],\n", "                       name='tf_y_0',\n", "                       dtype=tf.float32)\n", "\n", "    with tf.name_scope('addition'):\n", "        output = tf_x + tf_y\n", "\n", "    with tf.name_scope('matrix_multiplication'):\n", "        output = tf.matmul(tf.transpose(tf_x), output)\n", "\n", "    with tf.name_scope('update_tensor_x'):\n", "        tf_const = tf.constant(2., shape=None, name='some_const')\n", "        update_tf_x = tf.assign(tf_x, tf_x * tf_const)\n", "\n", "    # create summaries\n", "    tf.summary.scalar(name='some_value', tensor=some_value)\n", "    tf.summary.histogram(name='tf_x_values', values=tf_x)\n", "\n", "    # merge all summaries into a single operation\n", "    merged_summary = tf.summary.merge_all()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that we added an additional `placeholder` to the graph, which later receives a scalar value from the session. We also added a new operation that updates our `tf_x` tensor by multiplying it with a constant `2.`:\n", "\n", "```python\n", "with tf.name_scope('update_tensor_x'):\n", "    tf_const = tf.constant(2., shape=None, name='some_const')\n", "    update_tf_x = tf.assign(tf_x, tf_x * tf_const)\n", "```\n", "\n", "Finally, we added the lines\n", "\n", "```python\n", "# create summaries\n", "tf.summary.scalar(name='some_value', tensor=some_value)\n", "tf.summary.histogram(name='tf_x_values', values=tf_x)\n", "```\n", "\n", "at the end of our graph. These will create the \"summaries\" of the values we want to display in TensorBoard later. The last line of our graph is\n", "\n", "```python\n", "merged_summary = tf.summary.merge_all()\n", "```\n", "\n", "which merges all the `tf.summary` calls into one single operation, so that we only have to fetch one variable from the graph when we execute the session.\n",
"\n", "When we execute the session below, we can then fetch this merged summary from `merged_summary` as follows:\n", "\n", "```python\n", "result, summary = sess.run([update_tf_x, merged_summary],\n", "                           feed_dict={some_value: i})\n", "```\n", "\n", "Next, let us add a `for`-loop to our session that runs the graph five times and feeds the counter of the `range` iterator to the `some_value` `placeholder` variable:\n", "\n" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": true }, "outputs": [], "source": [ "with tf.Session(graph=g) as sess:\n", "\n", "    sess.run(tf.global_variables_initializer())\n", "\n", "    # create FileWriter object that writes the logs\n", "    file_writer = tf.summary.FileWriter(logdir='logs/3', graph=g)\n", "\n", "    for i in range(5):\n", "        # fetch the summary from the graph\n", "        result, summary = sess.run([update_tf_x, merged_summary],\n", "                                   feed_dict={some_value: i})\n", "        # write the summary to the log\n", "        file_writer.add_summary(summary=summary, global_step=i)\n", "        file_writer.flush()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The two lines at the end of the preceding code snippet,\n", "\n", "```python\n", "file_writer.add_summary(summary=summary, global_step=i)\n", "file_writer.flush()\n", "```\n", "\n", "write the summary data to our log file, and the `flush` method updates TensorBoard. Executing `flush` explicitly is usually not necessary in real-world applications, but since the computations in our graph are so simple and \"cheap\" to execute, TensorBoard may not fetch the updates in real time otherwise.\n", "To visualize the results, quit your previous TensorBoard session (via `CTRL+C`) and execute `tensorboard --logdir logs/3` from the command line. In the TensorBoard window under the tab \"Scalar,\" you should now see an entry called \"some_value_1,\" which refers to our `placeholder` in the graph that we called `some_value`. Since we just fed it the iteration index of our `for`-loop, we expect to see a linear graph with the iteration index on the *x*- and *y*-axes:\n", "\n", "![TensorBoard](images/tensorboard-4.png)\n", "\n", "Keep in mind that this is just a simple demonstration of how `tf.summary.scalar` works. More useful applications include, for instance, tracking the training loss and the predictive performance of a model on training and validation sets throughout the different training rounds or epochs.\n", "\n", "Next, let us go to the \"Distributions\" tab:\n", "\n", "\n", "![TensorBoard](images/tensorboard-5.png)\n", "\n", "The \"Distributions\" graph above shows us the distribution of values in `tf_x` for each step in the `for`-loop. Since we doubled the values in the tensor after each `for`-loop iteration, the distribution grows wider over time.\n", "\n", "Finally, let us head over to the \"Histograms\" tab, which provides us with an individual histogram for each `for`-loop step that we can scroll through. Below, I selected the 3rd `for`-loop step, which highlights the histogram of values in `tf_x` during this step:\n", "\n", "\n", "![TensorBoard](images/tensorboard-6.png)\n", "\n", "\n", "Since TensorBoard is such a highly visual tool with graph and data exploration in mind, I highly recommend taking it for a test drive and exploring it interactively. Also, there are several features that we haven't covered in this simple introduction to TensorBoard, so be sure to check out the [official documentation](https://www.tensorflow.org/how_tos/summaries_and_tensorboard/) for more information."
] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.0" } }, "nbformat": 4, "nbformat_minor": 1 }