{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This notebook is an interactive version of the [companion webpage](http://edwardlib.org/iclr2017) for the article, Deep Probabilistic Programming [(Tran et al., 2017)](https://arxiv.org/abs/1701.03757). See Edward's [API](http://edwardlib.org/api/) for details on how to interact with data, models, inference, and criticism.\n",
    "\n",
    "The code snippets assume the following versions.\n",
    "```bash\n",
    "pip install edward==1.3.1\n",
    "pip install tensorflow==1.1.0  # alternatively, tensorflow-gpu==1.1.0\n",
    "pip install keras==2.0.0\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Section 3. Compositional Representations for Probabilistic Models\n",
    "\n",
    "__Figure 1__. Beta-Bernoulli program."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import tensorflow as tf\n",
    "from edward.models import Bernoulli, Beta\n",
    "\n",
    "theta = Beta(1.0, 1.0)\n",
    "x = Bernoulli(tf.ones(50) * theta)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For an example of it in use, see\n",
    "[`examples/beta_bernoulli.py`](https://github.com/blei-lab/edward/blob/master/examples/beta_bernoulli.py)\n",
    "in the Github repository.\n",
    "\n",
    "__Figure 2__. Variational auto-encoder for a data set of 28 x 28 pixel images\n",
    "(Kingma & Welling, 2014; Rezende, Mohamed, & Wierstra, 2014)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import tensorflow as tf\n",
    "from edward.models import Bernoulli, Normal\n",
    "from keras.layers import Dense\n",
    "\n",
    "N = 55000  # number of data points\n",
    "d = 50  # latent dimension\n",
    "\n",
    "# Probabilistic model\n",
    "z = Normal(loc=tf.zeros([N, d]), scale=tf.ones([N, d]))\n",
    "h = Dense(256, activation='relu')(z)\n",
    "x = Bernoulli(logits=Dense(28 * 28, activation=None)(h))\n",
    "\n",
    "# Variational model\n",
    "qx = tf.placeholder(tf.float32, [N, 28 * 28])\n",
    "qh = Dense(256, activation='relu')(qx)\n",
    "qz = Normal(loc=Dense(d, activation=None)(qh),\n",
    "            scale=Dense(d, activation='softplus')(qh))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For an example of it in use, see\n",
    "[`examples/vae.py`](https://github.com/blei-lab/edward/blob/master/examples/vae.py)\n",
    "in the Github repository.\n",
    "\n",
    "__Figure 3__. Bayesian recurrent neural network (Radford M Neal, 2012).\n",
    "The program has an unspecified number of time steps; it uses a\n",
    "symbolic for loop (`tf.scan`)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import edward as ed\n",
    "import tensorflow as tf\n",
    "from edward.models import Normal\n",
    "\n",
    "H = 50  # number of hidden units\n",
    "D = 10  # number of features\n",
    "\n",
    "def rnn_cell(hprev, xt):\n",
    "  return tf.tanh(ed.dot(hprev, Wh) + ed.dot(xt, Wx) + bh)\n",
    "\n",
    "Wh = Normal(loc=tf.zeros([H, H]), scale=tf.ones([H, H]))\n",
    "Wx = Normal(loc=tf.zeros([D, H]), scale=tf.ones([D, H]))\n",
    "Wy = Normal(loc=tf.zeros([H, 1]), scale=tf.ones([H, 1]))\n",
    "bh = Normal(loc=tf.zeros(H), scale=tf.ones(H))\n",
    "by = Normal(loc=tf.zeros(1), scale=tf.ones(1))\n",
    "\n",
    "x = tf.placeholder(tf.float32, [None, D])\n",
    "h = tf.scan(rnn_cell, x, initializer=tf.zeros(H))\n",
    "y = Normal(loc=tf.matmul(h, Wy) + by, scale=1.0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Section 4. Compositional Representations for Inference\n",
    "\n",
    "__Figure 5__. Hierarchical model (Gelman & Hill, 2006).\n",
    "  It is a mixture of Gaussians over\n",
    "  $D$-dimensional data $\\{x_n\\}\\in\\mathbb{R}^{N\\times D}$. There are\n",
    "  $K$ latent cluster means $\\beta\\in\\mathbb{R}^{K\\times D}$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import tensorflow as tf\n",
    "from edward.models import Categorical, Normal\n",
    "\n",
    "N = 10000  # number of data points\n",
    "D = 2  # data dimension\n",
    "K = 5  # number of clusters\n",
    "\n",
    "beta = Normal(loc=tf.zeros([K, D]), scale=tf.ones([K, D]))\n",
    "z = Categorical(logits=tf.zeros([N, K]))\n",
    "x = Normal(loc=tf.gather(beta, z), scale=tf.ones([N, D]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It is used below in Figure 6 (left/right) and Figure * (variational EM).\n",
    "\n",
    "__Figure 6__ __(left)__. Variational inference\n",
    "(Jordan, Ghahramani, Jaakkola, & Saul, 1999).\n",
    "It performs inference on the model defined in Figure 5."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import edward as ed\n",
    "import numpy as np\n",
    "import tensorflow as tf\n",
    "from edward.models import Categorical, Normal\n",
    "\n",
    "x_train = np.zeros([N, D])\n",
    "\n",
    "qbeta = Normal(loc=tf.Variable(tf.zeros([K, D])),\n",
    "               scale=tf.exp(tf.Variable(tf.zeros([K, D]))))\n",
    "qz = Categorical(logits=tf.Variable(tf.zeros([N, K])))\n",
    "\n",
    "inference = ed.VariationalInference({beta: qbeta, z: qz}, data={x: x_train})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "__Figure 6__ __(right)__. Monte Carlo (Robert & Casella, 1999).\n",
    "It performs inference on the model defined in Figure 5."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import edward as ed\n",
    "import numpy as np\n",
    "import tensorflow as tf\n",
    "from edward.models import Empirical\n",
    "\n",
    "x_train = np.zeros([N, D])\n",
    "\n",
    "T = 10000  # number of samples\n",
    "qbeta = Empirical(params=tf.Variable(tf.zeros([T, K, D])))\n",
    "qz = Empirical(params=tf.Variable(tf.zeros([T, N])))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "__Figure 7__. Generative adversarial network\n",
    "(Goodfellow et al., 2014)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import edward as ed\n",
    "import numpy as np\n",
    "import tensorflow as tf\n",
    "from edward.models import Normal\n",
    "from keras.layers import Dense\n",
    "\n",
    "N = 55000  # number of data points\n",
    "d = 50  # latent dimension\n",
    "\n",
    "def generative_network(eps):\n",
    "  h = Dense(256, activation='relu')(eps)\n",
    "  return Dense(28 * 28, activation=None)(h)\n",
    "\n",
    "def discriminative_network(x):\n",
    "  h = Dense(28 * 28, activation='relu')(x)\n",
    "  return Dense(1, activation=None)(h)\n",
    "\n",
    "# DATA\n",
    "x_train = np.zeros([N, 28 * 28])\n",
    "\n",
    "# MODEL\n",
    "eps = Normal(loc=tf.zeros([N, d]), scale=tf.ones([N, d]))\n",
    "x = generative_network(eps)\n",
    "\n",
    "# INFERENCE\n",
    "inference = ed.GANInference(data={x: x_train},\n",
    "                            discriminator=discriminative_network)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For an example of it in use, see the\n",
    "[generative adversarial networks](http://edwardlib.org/tutorials/gan) tutorial.\n",
    "\n",
    "__Figure *__. Variational EM (Radford M. Neal & Hinton, 1993).\n",
    "It performs inference on the model defined in Figure 5."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import edward as ed\n",
    "import numpy as np\n",
    "import tensorflow as tf\n",
    "from edward.models import Categorical, PointMass\n",
    "\n",
    "# DATA\n",
    "x_train = np.zeros([N, D])\n",
    "\n",
    "# INFERENCE\n",
    "qbeta = PointMass(params=tf.Variable(tf.zeros([K, D])))\n",
    "qz = Categorical(logits=tf.Variable(tf.zeros([N, K])))\n",
    "\n",
    "inference_e = ed.VariationalInference({z: qz}, data={x: x_train, beta: qbeta})\n",
    "inference_m = ed.MAP({beta: qbeta}, data={x: x_train, z: qz})\n",
    "\n",
    "inference_e.initialize()\n",
    "inference_m.initialize()\n",
    "\n",
    "tf.initialize_all_variables().run()\n",
    "\n",
    "for _ in range(10000):\n",
    "  inference_e.update()\n",
    "  inference_m.update()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For more details, see the\n",
    "[inference compositionality](http://edwardlib.org/api/inference-compositionality) webpage.\n",
    "See\n",
    "[`examples/factor_analysis.py`](https://github.com/blei-lab/edward/blob/master/examples/factor_analysis.py) for\n",
    "a version performing Monte Carlo EM for logistic factor analysis\n",
    "in the Github repository.\n",
    "It leverages Hamiltonian Monte Carlo for the E-step to perform maximum\n",
    "marginal a posteriori.\n",
    "\n",
    "__Figure *__. Data subsampling."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import edward as ed\n",
    "import tensorflow as tf\n",
    "from edward.models import Categorical, Normal\n",
    "\n",
    "N = 10000  # number of data points\n",
    "M = 128  # batch size during training\n",
    "D = 2  # data dimension\n",
    "K = 5  # number of clusters\n",
    "\n",
    "# DATA\n",
    "x_batch = tf.placeholder(tf.float32, [M, D])\n",
    "\n",
    "# MODEL\n",
    "beta = Normal(loc=tf.zeros([K, D]), scale=tf.ones([K, D]))\n",
    "z = Categorical(logits=tf.zeros([M, K]))\n",
    "x = Normal(loc=tf.gather(beta, z), scale=tf.ones([M, D]))\n",
    "\n",
    "# INFERENCE\n",
    "qbeta = Normal(loc=tf.Variable(tf.zeros([K, D])),\n",
    "               scale=tf.nn.softplus(tf.Variable(tf.zeros([K, D]))))\n",
    "qz = Categorical(logits=tf.Variable(tf.zeros([M, D])))\n",
    "\n",
    "inference = ed.VariationalInference({beta: qbeta, z: qz}, data={x: x_batch})\n",
    "inference.initialize(scale={x: float(N) / M, z: float(N) / M})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For more details, see the\n",
    "[data subsampling](http://edwardlib.org/api/inference-data-subsampling) webpage.\n",
    "\n",
    "## Section 5. Experiments\n",
    "\n",
    "__Figure 9__. Bayesian logistic regression with Hamiltonian Monte Carlo."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import edward as ed\n",
    "import numpy as np\n",
    "import tensorflow as tf\n",
    "from edward.models import Bernoulli, Empirical, Normal\n",
    "\n",
    "N = 581012  # number of data points\n",
    "D = 54  # number of features\n",
    "T = 100  # number of empirical samples\n",
    "\n",
    "# DATA\n",
    "x_data = np.zeros([N, D])\n",
    "y_data = np.zeros([N])\n",
    "\n",
    "# MODEL\n",
    "x = tf.Variable(x_data, trainable=False)\n",
    "beta = Normal(loc=tf.zeros(D), scale=tf.ones(D))\n",
    "y = Bernoulli(logits=ed.dot(x, beta))\n",
    "\n",
    "# INFERENCE\n",
    "qbeta = Empirical(params=tf.Variable(tf.zeros([T, D])))\n",
    "inference = ed.HMC({beta: qbeta}, data={y: y_data})\n",
    "inference.run(step_size=0.5 / N, n_steps=10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For an example of it in use, see\n",
    "[`examples/bayesian_logistic_regression.py`](https://github.com/blei-lab/edward/blob/master/examples/bayesian_logistic_regression.py)\n",
    "in the Github repository.\n",
    "\n",
    "## Appendix A. Model Examples\n",
    "\n",
    "__Figure 10__. Bayesian neural network for classification (Denker, Schwartz, Wittner, & Solla, 1987)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import tensorflow as tf\n",
    "from edward.models import Bernoulli, Normal\n",
    "\n",
    "N = 1000  # number of data points\n",
    "D = 528  # number of features\n",
    "H = 256  # hidden layer size\n",
    "\n",
    "W_0 = Normal(loc=tf.zeros([D, H]), scale=tf.ones([D, H]))\n",
    "W_1 = Normal(loc=tf.zeros([H, 1]), scale=tf.ones([H, 1]))\n",
    "b_0 = Normal(loc=tf.zeros(H), scale=tf.ones(H))\n",
    "b_1 = Normal(loc=tf.zeros(1), scale=tf.ones(1))\n",
    "\n",
    "x = tf.placeholder(tf.float32, [N, D])\n",
    "y = Bernoulli(logits=tf.matmul(tf.nn.tanh(tf.matmul(x, W_0) + b_0), W_1) + b_1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For an example of it in use, see\n",
    "[`examples/getting_started_example.py`](https://github.com/blei-lab/edward/blob/master/examples/getting_started_example.py)\n",
    "in the Github repository.\n",
    "\n",
    "__Figure 11__. Latent Dirichlet allocation (D. M. Blei, Ng, & Jordan, 2003)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import tensorflow as tf\n",
    "from edward.models import Categorical, Dirichlet\n",
    "\n",
    "D = 4  # number of documents\n",
    "N = [11502, 213, 1523, 1351]  # words per doc\n",
    "K = 10  # number of topics\n",
    "V = 100000  # vocabulary size\n",
    "\n",
    "theta = Dirichlet(tf.zeros([D, K]) + 0.1)\n",
    "phi = Dirichlet(tf.zeros([K, V]) + 0.05)\n",
    "z = [[0] * N] * D\n",
    "w = [[0] * N] * D\n",
    "for d in range(D):\n",
    "  for n in range(N[d]):\n",
    "    z[d][n] = Categorical(theta[d, :])\n",
    "    w[d][n] = Categorical(phi[z[d][n], :])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "__Figure 12__. Gaussian matrix factorization\n",
    "(Salakhutdinov & Mnih, 2011)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import tensorflow as tf\n",
    "from edward.models import Normal\n",
    "\n",
    "N = 10\n",
    "M = 10\n",
    "K = 5  # latent dimension\n",
    "\n",
    "U = Normal(loc=tf.zeros([M, K]), scale=tf.ones([M, K]))\n",
    "V = Normal(loc=tf.zeros([N, K]), scale=tf.ones([N, K]))\n",
    "Y = Normal(loc=tf.matmul(U, V, transpose_b=True), scale=tf.ones([N, M]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "__Figure 13__. Dirichlet process mixture model (Antoniak, 1974)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import tensorflow as tf\n",
    "from edward.models import DirichletProcess, Normal\n",
    "\n",
    "N = 1000  # number of data points\n",
    "D = 5  # data dimensionality\n",
    "\n",
    "dp = DirichletProcess(alpha=1.0, base=Normal(loc=tf.zeros(D), scale=tf.ones(D)))\n",
    "mu = dp.sample(N)\n",
    "x = Normal(loc=loc, scale=tf.ones([N, D]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see the essential component defining the `DirichletProcess`, see\n",
    "[`examples/pp_dirichlet_process.py`](https://github.com/blei-lab/edward/blob/master/examples/pp_dirichlet_process.py)\n",
    "in the Github repository. Its source implementation can be found at\n",
    "[`edward/models/dirichlet_process.py`](https://github.com/blei-lab/edward/blob/master/edward/models/dirichlet_process.py).\n",
    "\n",
    "## Appendix B. Inference Examples\n",
    "\n",
    "__Figure *__. Stochastic variational inference (M. D. Hoffman, Blei, Wang, & Paisley, 2013). \n",
    "For more details, see the\n",
    "[data subsampling](http://edwardlib.org/api/inference-data-subsampling) webpage.\n",
    "\n",
    "## Appendix C. Complete Examples\n",
    "\n",
    "__Figure 15__. Variational auto-encoder\n",
    "(Kingma & Welling, 2014; Rezende et al., 2014).\n",
    "See the script\n",
    "\\href{https://github.com/blei-lab/edward/blob/master/examples/vae.py}{\\texttt{examples/vae.py}}\n",
    "in the Github repository.\n",
    "\n",
    "__Figure 16__. Exponential family embedding (Rudolph, Ruiz, Mandt, & Blei, 2016).\n",
    "A Github repository with comprehensive features is available at\n",
    "[mariru/exponential_family_embeddings](https://github.com/mariru/exponential_family_embeddings)."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}