{ "cells": [ { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "# TensorBoard\n", "\n", "TensorBoard provides a suite of visualization tools to make it easier\n", "to understand, debug, and optimize Edward programs. You can use it\n", "\"to visualize your TensorFlow graph, plot quantitative metrics about\n", "the execution of your graph, and show additional data like images that\n", "pass through it\"\n", "([tensorflow.org](https://www.tensorflow.org/get_started/summaries_and_tensorboard)).\n", "\n", "A webpage version of this tutorial is available at\n", "http://edwardlib.org/tutorials/tensorboard.\n", "\n", "![](https://raw.githubusercontent.com/blei-lab/edward/master/docs/images/tensorboard-scalars.png)\n", "\n", "To use TensorBoard, we first need to specify a directory for storing\n", "logs during inference. For example, if manually controlling inference,\n", "call\n", "```python\n", "inference.initialize(logdir='log')\n", "```\n", "If you're using the catch-all `inference.run()`, include \n", "`logdir` as an argument. As inference runs, files are \n", "outputted to `log/` within the working directory. In \n", "commandline, we run TensorBoard and point to that directory.\n", "```bash\n", "tensorboard --logdir=log/\n", "```\n", "The command will provide a web address to access TensorBoard. By \n", "default, it is http://localhost:6006. If working correctly, you \n", "should see something like the above picture.\n", "\n", "You're set up!\n", "\n", "Additional steps need to be taken in order to clean up TensorBoard's\n", "naming. Specifically, we might configure names for random variables\n", "and tensors in the computational graph. To provide a concrete example,\n", "we extend the\n", "[supervised learning tutorial](http://edwardlib.org/tutorials/supervised-regression), \n", "where the task is to infer hidden structure from labeled examples\n", "$\\{(x_n, y_n)\\}$." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "%matplotlib inline\n", "from __future__ import absolute_import\n", "from __future__ import division\n", "from __future__ import print_function\n", "\n", "import edward as ed\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import tensorflow as tf\n", "\n", "from edward.models import Normal\n", "\n", "plt.style.use('ggplot')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data\n", "\n", "Simulate training and test sets of $40$ data points. They comprise of\n", "pairs of inputs $\\mathbf{x}_n\\in\\mathbb{R}^{5}$ and outputs\n", "$y_n\\in\\mathbb{R}$. They have a linear dependence with normally\n", "distributed noise." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def build_toy_dataset(N, w):\n", " D = len(w)\n", " x = np.random.normal(0.0, 2.0, size=(N, D))\n", " y = np.dot(x, w) + np.random.normal(0.0, 0.01, size=N)\n", " return x, y\n", "\n", "ed.set_seed(42)\n", "\n", "N = 40 # number of data points\n", "D = 5 # number of features\n", "\n", "w_true = np.random.randn(D) * 0.5\n", "X_train, y_train = build_toy_dataset(N, w_true)\n", "X_test, y_test = build_toy_dataset(N, w_true)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Model\n", "\n", "Posit the model as Bayesian linear regression (Murphy, 2012).\n", "For a set of $N$ data points $(\\mathbf{X},\\mathbf{y})=\\{(\\mathbf{x}_n, y_n)\\}$,\n", "the model posits the following distributions:\n", "\n", "\\begin{align*}\n", " p(\\mathbf{w})\n", " &=\n", " \\text{Normal}(\\mathbf{w} \\mid \\mathbf{0}, \\sigma_w^2\\mathbf{I}),\n", " \\\\[1.5ex]\n", " p(b)\n", " &=\n", " \\text{Normal}(b \\mid 0, \\sigma_b^2),\n", " \\\\\n", " p(\\mathbf{y} \\mid \\mathbf{w}, b, \\mathbf{X})\n", " &=\n", " \\prod_{n=1}^N\n", " \\text{Normal}(y_n \\mid \\mathbf{x}_n^\\top\\mathbf{w} + b, \\sigma_y^2).\n", "\\end{align*}\n", "\n", "The latent variables are the linear model's weights $\\mathbf{w}$ and\n", "intercept $b$, also known as the bias.\n", "Assume $\\sigma_w^2,\\sigma_b^2$ are known prior variances and $\\sigma_y^2$ is a\n", "known likelihood variance. The mean of the likelihood is given by a\n", "linear transformation of the inputs $\\mathbf{x}_n$.\n", "\n", "Let's build the model in Edward, fixing $\\sigma_w,\\sigma_b,\\sigma_y=1$." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": true }, "outputs": [], "source": [ "with tf.name_scope('model'):\n", " X = tf.placeholder(tf.float32, [N, D], name=\"X\")\n", " w = Normal(loc=tf.zeros(D, name=\"weights/loc\"),\n", " scale=tf.ones(D, name=\"weights/scale\"),\n", " name=\"weights\")\n", " b = Normal(loc=tf.zeros(1, name=\"bias/loc\"),\n", " scale=tf.ones(1, name=\"bias/scale\"),\n", " name=\"bias\")\n", " y = Normal(loc=ed.dot(X, w) + b,\n", " scale=tf.ones(N, name=\"y/scale\"),\n", " name=\"y\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here, we define a placeholder `X`. During inference, we pass in\n", "the value for this placeholder according to batches of data.\n", "We also use a name scope. This adds the scope's name as a prefix\n", "(`\"model/\"`) to all tensors in the `with` context.\n", "Similarly, we name the parameters in each random variable under a\n", "grouped naming system." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Inference\n", "\n", "We now turn to inferring the posterior using variational inference.\n", "Define the variational model to be a fully factorized normal across\n", "the weights. We add another scope to group naming in the variational\n", "family." 
] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": true }, "outputs": [], "source": [ "with tf.name_scope(\"posterior\"):\n", " qw_loc = tf.get_variable(\"qw/loc\", [D])\n", " qw_scale = tf.nn.softplus(tf.get_variable(\"qw/unconstrained_scale\", [D]))\n", " qw = Normal(loc=qw_loc, scale=qw_scale, name=\"qw\")\n", " qb_loc = tf.get_variable(\"qb/loc\", [1])\n", " qb_scale = tf.nn.softplus(tf.get_variable(\"qb/unconstrained_scale\", [1]))\n", " qb = Normal(loc=qb_loc, scale=qb_scale, name=\"qb\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Run variational inference with the Kullback-Leibler divergence.\n", "We use $5$ latent variable samples for computing\n", "black box stochastic gradients in the algorithm.\n", "(For more details, see the\n", "[$\\text{KL}(q\\|p)$ tutorial](http://edwardlib.org/tutorials/klqp).)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "250/250 [100%] ██████████████████████████████ Elapsed: 5s | Loss: 50.865\n" ] } ], "source": [ "inference = ed.KLqp({w: qw, b: qb}, data={X: X_train, y: y_train})\n", "inference.run(n_samples=5, n_iter=250, logdir='log/n_samples_5')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Optionally, we might include an `\"inference\"` name scope. \n", "If it is absent, the charts are partitioned naturally\n", "and not automatically grouped under the monolithic `\"inference\"`. \n", "If it is added, the TensorBoard graph is slightly more organized." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Criticism\n", "\n", "We can use TensorBoard to explore learning and diagnose any problems.\n", "After running TensorBoard with the command above, we can navigate the\n", "tabs.\n", "\n", "Below we assume the above code is run twice with different\n", "configurations\n", "of the `n_samples` hyperparameter.\n", "We specified the log directory to be `log/n_samples_*`.\n", "By default, Edward also includes a timestamped subdirectory so that\n", "multiple runs of the same experiment have properly organized logs for\n", "TensorBoard. You can turn it off by specifying\n", "`log_timestamp=False` during inference.\n", "\n", "__TensorBoard Scalars.__\n", "![](https://raw.githubusercontent.com/blei-lab/edward/master/docs/images/tensorboard-scalars.png)\n", "\n", "Scalars provides scalar-valued information across iterations of the\n", "algorithm, wall time, and relative wall time. In Edward, the tab\n", "includes the value of scalar TensorFlow variables in the model or\n", "approximating family.\n", "\n", "With variational inference, we also include information such as the\n", "loss function and its decomposition into individual terms. 
    "\n",
    "This particular example shows that `n_samples=1` tends to have higher\n",
    "variance than `n_samples=5` but still converges to the same solution.\n",
    "\n",
    "__TensorBoard Distributions.__\n",
    "![](https://raw.githubusercontent.com/blei-lab/edward/master/docs/images/tensorboard-distributions.png)\n",
    "\n",
    "The Distributions tab displays the distribution of each non-scalar\n",
    "TensorFlow variable in the model and approximating family across\n",
    "iterations.\n",
    "\n",
    "__TensorBoard Histograms.__\n",
    "![](https://raw.githubusercontent.com/blei-lab/edward/master/docs/images/tensorboard-histograms.png)\n",
    "\n",
    "The Histograms tab displays the same information as Distributions but\n",
    "as a 3-D histogram changing across iterations.\n",
    "\n",
    "__TensorBoard Graphs.__\n",
    "![](https://raw.githubusercontent.com/blei-lab/edward/master/docs/images/tensorboard-graphs-0.png)\n",
    "![](https://raw.githubusercontent.com/blei-lab/edward/master/docs/images/tensorboard-graphs-1.png)\n",
    "\n",
    "The Graphs tab displays the computational graph underlying the model,\n",
    "approximating family, and inference. Boxes denote tensors grouped\n",
    "under the same name scope. Cleaning up names in the graph makes it\n",
    "easier to understand and optimize your code." ] }, { "cell_type": "markdown", "metadata": {}, "source": [
    "## Acknowledgments\n",
    "\n",
    "We thank Sean Kruzel for writing the initial version of this\n",
    "tutorial.\n",
    "\n",
    "A TensorFlow tutorial on TensorBoard can be found\n",
    "[here](https://www.tensorflow.org/get_started/summaries_and_tensorboard)." ] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.12" } }, "nbformat": 4, "nbformat_minor": 1 }