{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# How GPflow relates to TensorFlow: tips & tricks \n", "\n", "GPflow is built on top of TensorFlow, so it is useful to have some understanding of how TensorFlow works. In particular, TensorFlow's two-stage concept of first building a static compute graph and then executing it for specific input values can cause problems. This notebook aims to help with the most common issues." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import gpflow\n", "import tensorflow as tf\n", "import numpy as np\n", "\n", "from gpflow import settings" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Computation time increases when I create GPflow objects" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following example shows a typical situation when computation time increases proportionally to the number of GPflow objects that are created." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "for n in range(2, 4):\n", " kernel = gpflow.kernels.RBF(input_dim=1) # This is a gpflow object with tf.Variables inside\n", " x = np.random.randn(n, 1) # gpflow expects rank-2 input matrices, even for D=1\n", " kxx = kernel.K(x) # This is a tensor!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Remember, we operate on a TensorFlow graph!**\n", "\n", "Every time we create (build and compile) a new GPflow object, we continue to add more tensors to the graph and change only the reference to them, despite overwriting (in this case) the kernel variable.\n", "\n", "So, unnecessary expansion of the graph slows down your computation!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following examples show how to fix the issue (imagine running this code snippet in `ipython` repeatedly):" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "for n in range(2, 4):\n", " gpflow.reset_default_graph_and_session()\n", " kernel = gpflow.kernels.RBF(1)\n", " x = np.random.randn(n, 1)\n", " kxx = kernel.K(x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we were simply resetting the default graph and session using GPflow's `reset_default_graph_and_session()` function. In the next example we explicitly build new `tf.Graph()` and `tf.Session()` objects:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "for n in range(2, 4):\n", " with tf.Graph().as_default() as graph:\n", " with tf.Session(graph=graph).as_default():\n", " kernel = gpflow.kernels.RBF(1)\n", " x = np.random.randn(n, 1)\n", " kxx = kernel.K(x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the [Custom mean functions](tailor/external-mean-function.ipynb) notebook we show a real-world example of this idea." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. I want to reuse a model on different data" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-2.8766930392437593\n" ] } ], "source": [ "np.random.seed(1)\n", "x = np.random.randn(2, 1)\n", "y = np.random.randn(2, 1)\n", "kernel = gpflow.kernels.RBF(1)\n", "model = gpflow.models.GPR(x, y, kernel)\n", "print(model.compute_log_likelihood())\n", "\n", "x_new = np.random.randn(100, 1)\n", "y_new = np.random.randn(100, 1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can compute the log-likelihood of the model on different data. Note that we didn't change the original model!" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "-140.83017563045192" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x_tensor = model.X.parameter_tensor\n", "y_tensor = model.Y.parameter_tensor\n", "model.compute_log_likelihood(feed_dict={x_tensor: x_new, y_tensor: y_new}) # we can still probe the model with the old data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can do the same by permanently updating the values of the dataholders." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "-140.83017563045192" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model.X = x_new\n", "model.Y = y_new\n", "model.compute_log_likelihood()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. I want to use external TensorFlow tensors and pass them to a GPflow model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can pass TensorFlow tensors for any non-trainable parameters of the GPflow objects like DataHolders." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "-196.46001677717464" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.random.seed(1)\n", "kernel = gpflow.kernels.RBF(1)\n", "likelihood = gpflow.likelihoods.Gaussian()\n", "\n", "x_tensor = tf.random_normal((100, 1), dtype=settings.float_type)\n", "y_tensor = tf.random_normal((100, 1), dtype=settings.float_type)\n", "z = np.random.randn(10, 1)\n", "\n", "model = gpflow.models.SVGP(x_tensor, y_tensor, kern=kernel, likelihood=likelihood, Z=z)\n", "model.compute_log_likelihood()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also use TensorFlow variables for trainable objects:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "z = tf.Variable(np.random.randn(10, 1))\n", "model = gpflow.models.SVGP(x_tensor, y_tensor, kern=kernel, likelihood=likelihood, Z=z)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, in this case you have to initialise them manually, before interacting with a model:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "-193.12198293763578" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "session = gpflow.get_default_session()\n", "session.run(z.initializer)\n", "model.compute_log_likelihood()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4. I want to share parameters between GPflow objects" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Sometimes we want to impose a hard-coded structure on the model (for example, if we have a multi-output model where some output dimensions share the same kernel and others don't). Unfortunately we cannot do this after the kernel object is compiled. We have to do it at build time and then manually compile the object.\n" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "with gpflow.decors.defer_build():\n", " kernels = [gpflow.kernels.RBF(1) for _ in range(3)]\n", " mo_kernels = gpflow.multioutput.kernels.SeparateMixedMok(kernels, W=np.random.randn(3, 4))\n", " mo_kernels.kernels[0].lengthscales = mo_kernels.kernels[1].lengthscales\n", " mo_kernels.compile()\n", "\n", "assert mo_kernels.kernels[0].lengthscales is mo_kernels.kernels[1].lengthscales" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 5. Optimising my model repeatedly slows down the computation time" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following is an example of bad practice:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "x = np.random.randn(100, 1)\n", "y = np.random.randn(100, 1)\n", "model = gpflow.models.GPR(x, y, kernel)\n", "\n", "optimizer = gpflow.training.AdamOptimizer()\n", "\n", "optimizer.minimize(model, maxiter=2)\n", "\n", "# Do something with the model\n", "\n", "optimizer.minimize(model, maxiter=2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `minimize()` call creates a bunch of optimisation tensors. Calling `minimize()` again causes the same issue discussed under issue (1)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The correct way of optimising your model without polluting your graph is as follows:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "kernel = gpflow.kernels.RBF(1)\n", "x = np.random.randn(100, 1)\n", "y = np.random.randn(100, 1)\n", "model = gpflow.models.GPR(x, y, kernel)\n", "\n", "optimizer = gpflow.training.AdamOptimizer()\n", "optimizer_tensor = optimizer.make_optimize_tensor(model)\n", "session = gpflow.get_default_session()\n", "for _ in range(2):\n", " session.run(optimizer_tensor)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Don't forget to **anchor** your model to the session after optimisation. Then you can continue working with your model.
" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "model.anchor(session)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, if you need to optimise it again, you can reuse the same optimiser tensor." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "for _ in range(2):\n", " session.run(optimizer_tensor)\n", "\n", "model.anchor(session)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 6. When I try to read parameter values, I'm getting stale values" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "np.random.seed(1)\n", "x = np.random.randn(100, 1)\n", "y = np.random.randn(100, 1)\n", "\n", "kernel = gpflow.kernels.RBF(1)\n", "model = gpflow.models.GPR(x, y, kernel)\n", "optimizer = gpflow.training.AdamOptimizer()\n", "optimizer_tensor = optimizer.make_optimize_tensor(model)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The initial value before optimisation is:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array(1.)" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model.kern.lengthscales.value" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's call one step of the optimisation and check the new value of the parameter." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array(1.)" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gpflow.get_default_session().run(optimizer_tensor)\n", "model.kern.lengthscales.value" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After optimisation you would expect that the parameters were updated, but they weren't. The trick is that the `value` property returns a cached NumPy value of a parameter." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can get the value of the optimised parameter by using the `read_value()` method, specifying the correct `session`." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1.0006322362558255" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model.kern.lengthscales.read_value(session)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alternatively, you can `anchor(session)` your model to the session after the optimisation step. The `anchor()` updates the parameters' cache.\n", "\n", "**NOTE:** The `anchor(session)` method is significantly more time-consuming than `read_value(session)`. Do not call it too often unless you need to.\n" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array(1.00063224)" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model.anchor(session)\n", "model.kern.lengthscales.value" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 7. I want to save and load a GPflow model" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "kernel = gpflow.kernels.RBF(1)\n", "x = np.random.randn(100, 1)\n", "y = np.random.randn(100, 1)\n", "model = gpflow.models.GPR(x, y, kernel)\n", "\n", "from pathlib import Path\n", "filename = \"/tmp/gpr.gpflow\"\n", "path = Path(filename)\n", "if path.exists():\n", " path.unlink()\n", "saver = gpflow.saver.Saver()\n", "saver.save(filename, model)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can load the model into a different graph:" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [], "source": [ "with tf.Graph().as_default() as graph, tf.Session().as_default():\n", " model_copy = saver.load(filename)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alternatively, you can load the model into the same session:" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [], "source": [ "ctx_for_loading = gpflow.saver.SaverContext(autocompile=False)\n", "model_copy = saver.load(filename, context=ctx_for_loading)\n", "model_copy.clear()\n", "model_copy.compile()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The difference between the former approach and the latter lies in the TensorFlow name scopes which are used for naming variables. The former approach replicates the instance of the TensorFlow objects (which already exist in the original graph), so we need to load the model into a new graph.\n", "The latter approach uses different name scopes for the variables so that you can dump the model in the same graph." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.8" } }, "nbformat": 4, "nbformat_minor": 2 }