{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Manipulating GPflow models\n", "\n", "One of the key ingredients in GPflow is the model class, which enables you to carefully control parameters. This notebook shows how some of these parameter control features work, and how to build your own model with GPflow. First we'll look at:\n", "\n", " - how to view models and parameters\n", " - how to set parameter values\n", " - how to constrain parameters (for example, variance > 0)\n", " - how to fix model parameters\n", " - how to apply priors to parameters\n", " - how to optimise models\n", "\n", "Then we'll show how to build a simple logistic regression model, demonstrating the ease of the parameter framework. \n", "\n", "GPy users should feel right at home, but there are some small differences.\n", "\n", "First, let's deal with the usual notebook boilerplate and make a simple GP regression model. See [Basic (Gaussian likelihood) GP regression model](../basics/regression.ipynb) for specifics of the model; we just want some parameters to play with." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import gpflow\n", "import numpy as np" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a very simple GP regression model wrapped in `defer_build()`. You must use `defer_build()` so that the model is not compiled in the TensorFlow graph, because adding transforms and priors is possible only for non-compiled models. 
" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "np.random.seed(1)\n", "X = np.random.rand(20, 1)\n", "Y = np.sin(12 * X) + 0.66 * np.cos(25 * X) + np.random.randn(20,1) * 0.01\n", "\n", "with gpflow.defer_build():\n", " m = gpflow.models.GPR(X, Y, kern=gpflow.kernels.Matern32(1) + gpflow.kernels.Linear(1))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Viewing, getting, and setting parameters\n", "You can display the state of the model in a terminal by using `print(m)`, and by simply returning it in a notebook:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<tr><th></th><th>class</th><th>prior</th><th>transform</th><th>trainable</th><th>shape</th><th>fixed_shape</th><th>value</th></tr>\n",
"<tr><td>GPR/kern/kernels/0/lengthscales</td><td>Parameter</td><td>None</td><td>+ve</td><td>True</td><td>()</td><td>True</td><td>1.0</td></tr>\n",
"<tr><td>GPR/kern/kernels/0/variance</td><td>Parameter</td><td>None</td><td>+ve</td><td>True</td><td>()</td><td>True</td><td>1.0</td></tr>\n",
"<tr><td>GPR/kern/kernels/1/variance</td><td>Parameter</td><td>None</td><td>+ve</td><td>True</td><td>()</td><td>True</td><td>1.0</td></tr>\n",
"<tr><td>GPR/likelihood/variance</td><td>Parameter</td><td>None</td><td>+ve</td><td>True</td><td>()</td><td>True</td><td>1.0</td></tr>\n",
"</table>" ], "text/plain": [ "" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This model has four parameters. The kernel is the sum of two parts. The first (index 0) is a Matern32 kernel that has a variance parameter and a lengthscale parameter; the second (index 1) is a linear kernel that has only a variance parameter. There is also a parameter that controls the variance of the noise, as part of the likelihood.\n", "\n", "All the model parameters have been initialised at `1.0`. You can display any part of the model in the same way as the whole model; for example, to see all the parameters that are part of the likelihood, run:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<tr><th></th><th>class</th><th>prior</th><th>transform</th><th>trainable</th><th>shape</th><th>fixed_shape</th><th>value</th></tr>\n",
"<tr><td>GPR/likelihood/variance</td><td>Parameter</td><td>None</td><td>+ve</td><td>True</td><td>()</td><td>True</td><td>1.0</td></tr>\n",
"</table>
" ], "text/plain": [ "" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m.likelihood" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This gets more useful with more complex models!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To set the value of a parameter, just use an assignment statement:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<tr><th></th><th>class</th><th>prior</th><th>transform</th><th>trainable</th><th>shape</th><th>fixed_shape</th><th>value</th></tr>\n",
"<tr><td>GPR/kern/kernels/0/lengthscales</td><td>Parameter</td><td>None</td><td>+ve</td><td>True</td><td>()</td><td>True</td><td>0.5</td></tr>\n",
"<tr><td>GPR/kern/kernels/0/variance</td><td>Parameter</td><td>None</td><td>+ve</td><td>True</td><td>()</td><td>True</td><td>1.0</td></tr>\n",
"<tr><td>GPR/kern/kernels/1/variance</td><td>Parameter</td><td>None</td><td>+ve</td><td>True</td><td>()</td><td>True</td><td>1.0</td></tr>\n",
"<tr><td>GPR/likelihood/variance</td><td>Parameter</td><td>None</td><td>+ve</td><td>True</td><td>()</td><td>True</td><td>0.01</td></tr>\n",
"</table>" ], "text/plain": [ "" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m.kern.kernels[0].lengthscales = 0.5\n", "m.likelihood.variance = 0.01\n", "m" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Constraints and trainable variables\n", "\n", "GPflow creates an unconstrained representation of all the variables. In the previous example, all the variables are constrained to be positive (see the **transform** column in the table); for a constrained value $\\theta$, the unconstrained representation is given by $\\alpha = \\log(\\exp(\\theta)-1)$. `read_trainables()` returns the constrained values:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'GPR/kern/kernels/0/lengthscales': array(0.5),\n", " 'GPR/kern/kernels/0/variance': array(1.),\n", " 'GPR/kern/kernels/1/variance': array(1.),\n", " 'GPR/likelihood/variance': array(0.01)}" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m.read_trainables()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Each parameter has an `unconstrained_tensor` attribute that enables you to access the unconstrained value as a TensorFlow Tensor (though only after the model has been compiled). You can also check the unconstrained value as follows:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "-0.4327546710632299" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "p = m.kern.kernels[0].lengthscales\n", "p.transform.backward(p.value)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Constraints are handled by the `Transform` classes.
You might prefer to use the constraint $\\alpha = \\log(\\theta)$; this is easily done by changing the `transform` attribute on a parameter, provided that the model has not been compiled yet:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "m.kern.kernels[0].lengthscales.transform = gpflow.transforms.Exp()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Though the lengthscale itself remains the same, the unconstrained lengthscale has changed:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "-0.6931491805619453" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "p.transform.backward(p.value)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another helpful feature is the ability to fix parameters. To do this, set the `trainable` attribute to `False`; this is shown in the **trainable** column of the representation, and the corresponding variable is removed from the free state." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<tr><th></th><th>class</th><th>prior</th><th>transform</th><th>trainable</th><th>shape</th><th>fixed_shape</th><th>value</th></tr>\n",
"<tr><td>GPR/kern/kernels/0/lengthscales</td><td>Parameter</td><td>None</td><td>Exp</td><td>True</td><td>()</td><td>True</td><td>0.5</td></tr>\n",
"<tr><td>GPR/kern/kernels/0/variance</td><td>Parameter</td><td>None</td><td>+ve</td><td>True</td><td>()</td><td>True</td><td>1.0</td></tr>\n",
"<tr><td>GPR/kern/kernels/1/variance</td><td>Parameter</td><td>None</td><td>+ve</td><td>False</td><td>()</td><td>True</td><td>1.0</td></tr>\n",
"<tr><td>GPR/likelihood/variance</td><td>Parameter</td><td>None</td><td>+ve</td><td>True</td><td>()</td><td>True</td><td>0.01</td></tr>\n",
"</table>
" ], "text/plain": [ "" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m.kern.kernels[1].variance.trainable = False\n", "m" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'GPR/kern/kernels/0/lengthscales': array(0.5),\n", " 'GPR/kern/kernels/0/variance': array(1.),\n", " 'GPR/likelihood/variance': array(0.01)}" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m.read_trainables()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To unfix a parameter, just set the `trainable` attribute to `True` again." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<tr><th></th><th>class</th><th>prior</th><th>transform</th><th>trainable</th><th>shape</th><th>fixed_shape</th><th>value</th></tr>\n",
"<tr><td>GPR/kern/kernels/0/lengthscales</td><td>Parameter</td><td>None</td><td>Exp</td><td>True</td><td>()</td><td>True</td><td>0.5</td></tr>\n",
"<tr><td>GPR/kern/kernels/0/variance</td><td>Parameter</td><td>None</td><td>+ve</td><td>True</td><td>()</td><td>True</td><td>1.0</td></tr>\n",
"<tr><td>GPR/kern/kernels/1/variance</td><td>Parameter</td><td>None</td><td>+ve</td><td>True</td><td>()</td><td>True</td><td>1.0</td></tr>\n",
"<tr><td>GPR/likelihood/variance</td><td>Parameter</td><td>None</td><td>+ve</td><td>True</td><td>()</td><td>True</td><td>0.01</td></tr>\n",
"</table>" ], "text/plain": [ "" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m.kern.kernels[1].variance.trainable = True\n", "m" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Priors\n", "\n", "You can set priors in the same way as transforms and trainability, by using members of the `gpflow.priors` module. Let's set a Gamma prior on the variance of the Matern32 kernel." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<tr><th></th><th>class</th><th>prior</th><th>transform</th><th>trainable</th><th>shape</th><th>fixed_shape</th><th>value</th></tr>\n",
"<tr><td>GPR/kern/kernels/0/lengthscales</td><td>Parameter</td><td>None</td><td>Exp</td><td>True</td><td>()</td><td>True</td><td>0.5</td></tr>\n",
"<tr><td>GPR/kern/kernels/0/variance</td><td>Parameter</td><td>Ga(2.0,3.0)</td><td>+ve</td><td>True</td><td>()</td><td>True</td><td>1.0</td></tr>\n",
"<tr><td>GPR/kern/kernels/1/variance</td><td>Parameter</td><td>None</td><td>+ve</td><td>True</td><td>()</td><td>True</td><td>1.0</td></tr>\n",
"<tr><td>GPR/likelihood/variance</td><td>Parameter</td><td>None</td><td>+ve</td><td>True</td><td>()</td><td>True</td><td>0.01</td></tr>\n",
"</table>" ], "text/plain": [ "" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m.kern.kernels[0].variance.prior = gpflow.priors.Gamma(2, 3)\n", "m" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Optimisation\n", "\n", "To optimise your model, first create an instance of an optimiser (in this case, `gpflow.train.ScipyOptimizer`), which has optional arguments that are passed to `scipy.optimize.minimize` (we minimise the negative log likelihood). Then, call the `minimize` method of that optimiser, with your model as the optimisation target. Variables that have priors are estimated by maximum a posteriori (MAP), that is, we add the log prior to the log likelihood; variables without priors are estimated by maximum likelihood." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "WARNING:tensorflow:From /home/st/anaconda3/envs/relaxedgp/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.\n", "Instructions for updating:\n", "Colocations handled automatically by placer.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "WARNING:tensorflow:From /home/st/anaconda3/envs/relaxedgp/lib/python3.7/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.\n", "Instructions for updating:\n", "Use tf.cast instead.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:Optimization terminated with:\n", " Message: b'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH'\n", " Objective function value: 1.884455\n", " Number of iterations: 41\n", " Number of functions evaluations: 46\n" ] } ], "source": [ "m.compile()\n", "opt = gpflow.train.ScipyOptimizer()\n", "opt.minimize(m)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Building new models\n", "\n", "To build new models, you'll need to inherit from `gpflow.models.Model`. Parameters are instantiated with `gpflow.Param`. You might also be interested in `gpflow.params.Parameterized`, which acts as a 'container' for `Param`s (for example, kernels are parameterised).\n", "\n", "In this very simple demo, we'll implement linear multiclass classification.\n", "\n", "There are two parameters: a weight matrix and a bias (offset). The key thing to implement is the private `_build_likelihood` method, which returns a TensorFlow scalar that represents the (log) likelihood.
You can use `Param` objects inside `_build_likelihood`, but you need to use their `constrained_tensor` attribute to obtain the tensor to use when building the graph (for example, `self.kernel.variance.constrained_tensor`).\n", "\n", "Alternatively, decorate the function with `@gpflow.params_as_tensors` so that the objects appear as constrained tensors (for example, you can now refer to the TensorFlow object as `self.kernel.variance` rather than `self.kernel.variance.constrained_tensor`). This is useful when you are writing the model likelihood and dealing with the `constrained_tensor` attributes of several different `Param`s, but note that inside the decorated function you cannot access any of the other features of a `Param` object.\n" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "import tensorflow as tf\n", "\n", "class LinearMulticlass(gpflow.models.Model):\n", "    def __init__(self, X, Y, name=None):\n", "        super().__init__(name=name)  # always call the parent constructor\n", "\n", "        self.X = X.copy()  # X is a numpy array of inputs\n", "        self.Y = Y.copy()  # Y is a 1-of-k (one-hot) representation of the labels\n", "\n", "        self.num_data, self.input_dim = X.shape\n", "        _, self.num_classes = Y.shape\n", "\n", "        # make some parameters\n", "        self.W = gpflow.Param(np.random.randn(self.input_dim, self.num_classes))\n", "        self.b = gpflow.Param(np.random.randn(self.num_classes))\n", "\n", "        # ^^ You must make the parameters attributes of the class for\n", "        # them to be picked up by the model, i.e. this won't work:\n", "        #\n", "        # W = gpflow.Param(...    <-- must be self.W\n", "\n", "    @gpflow.params_as_tensors\n", "    def _build_likelihood(self):  # takes no arguments\n", "        p = tf.nn.softmax(tf.matmul(self.X, self.W) + self.b)  # Param variables are used as TensorFlow tensors.\n", "        return tf.reduce_sum(tf.log(p) * self.Y)  # be sure to return a scalar" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "...and that's it.
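To make the value returned by `_build_likelihood` concrete, the following minimal sketch recomputes the same quantity (the sum, over data points, of the log softmax probability of the observed class) in plain Python without TensorFlow. The helper names `softmax` and `log_likelihood` and the toy data are illustrative assumptions and are not part of GPflow.

```python
import math


def softmax(row):
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(row)
    exps = [math.exp(v - top) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def log_likelihood(X, Y, W, b):
    # Plain-Python counterpart of sum(log(softmax(X W + b)) * Y).
    total = 0.0
    for x, y in zip(X, Y):
        logits = [sum(x[d] * W[d][k] for d in range(len(x))) + b[k]
                  for k in range(len(b))]
        probs = softmax(logits)
        # Terms with a zero label contribute nothing, so they are skipped.
        total += sum(y_k * math.log(p_k) for y_k, p_k in zip(y, probs) if y_k)
    return total


# Hypothetical toy data: two 2-D inputs, three classes, one-hot labels.
X_toy = [[2.0, 2.0], [-2.0, 2.0]]
Y_toy = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
W_zero = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
b_zero = [0.0, 0.0, 0.0]

ll = log_likelihood(X_toy, Y_toy, W_zero, b_zero)
```

With all-zero weights each of the three classes has probability 1/3, so the value for two data points is 2 log(1/3); optimisation adjusts `W` and `b` to increase this value.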
Let's build a really simple demo to show that it works." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "np.random.seed(123)\n", "X = np.vstack([np.random.randn(10,2) + [2,2],\n", " np.random.randn(10,2) + [-2,2],\n", " np.random.randn(10,2) + [2,-2]])\n", "Y = np.repeat(np.eye(3), 10, 0)\n", "\n", "from matplotlib import pyplot as plt\n", "plt.style.use('ggplot')\n", "%matplotlib inline\n", "import matplotlib\n", "matplotlib.rcParams['figure.figsize'] = (12,6)\n", "plt.scatter(X[:,0], X[:,1], 100, np.argmax(Y, 1), lw=2, cmap=plt.cm.viridis);" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<tr><th></th><th>class</th><th>prior</th><th>transform</th><th>trainable</th><th>shape</th><th>fixed_shape</th><th>value</th></tr>\n",
"<tr><td>LinearMulticlass/W</td><td>Parameter</td><td>None</td><td>(none)</td><td>True</td><td>(2, 3)</td><td>True</td><td>[[-0.7727087142471915, 0.7948626677932181, 0.3...</td></tr>\n",
"<tr><td>LinearMulticlass/b</td><td>Parameter</td><td>None</td><td>(none)</td><td>True</td><td>(3,)</td><td>True</td><td>[0.045490080631097156, -0.2330920609844135, -1...</td></tr>\n",
"</table>" ], "text/plain": [ "<__main__.LinearMulticlass at 0x7fd35c3174e0>" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m = LinearMulticlass(X, Y)\n", "m" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:Optimization terminated with:\n", " Message: b'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'\n", " Objective function value: 0.000013\n", " Number of iterations: 26\n", " Number of functions evaluations: 27\n" ] }, { "data": { "text/html": [ "
<table>\n",
"<tr><th></th><th>class</th><th>prior</th><th>transform</th><th>trainable</th><th>shape</th><th>fixed_shape</th><th>value</th></tr>\n",
"<tr><td>LinearMulticlass/W</td><td>Parameter</td><td>None</td><td>(none)</td><td>True</td><td>(2, 3)</td><td>True</td><td>[[8.558497428410769, -30.636553276376564, 22.4...</td></tr>\n",
"<tr><td>LinearMulticlass/b</td><td>Parameter</td><td>None</td><td>(none)</td><td>True</td><td>(3,)</td><td>True</td><td>[11.857844278133033, -12.947434324958664, -0.2...</td></tr>\n",
"</table>" ], "text/plain": [ "<__main__.LinearMulticlass at 0x7fd35c3174e0>" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "opt = gpflow.train.ScipyOptimizer()\n", "opt.minimize(m)\n", "m" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "xx, yy = np.mgrid[-4:4:200j, -4:4:200j]\n", "X_test = np.vstack([xx.flatten(), yy.flatten()]).T\n", "f_test = np.dot(X_test, m.W.read_value()) + m.b.read_value()\n", "p_test = np.exp(f_test)\n", "p_test /= p_test.sum(1)[:,None]" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.figure(figsize=(12, 6))\n", "for i in range(3):\n", "    plt.contour(xx, yy, p_test[:,i].reshape(200,200), [0.5], colors='k', linewidths=1)\n", "plt.scatter(X[:,0], X[:,1], 100, np.argmax(Y, 1), lw=2, cmap=plt.cm.viridis);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That concludes the new model example and this notebook. You might want to see for yourself that the `LinearMulticlass` model and its parameters have all the functionality demonstrated here. You could also add some priors and run Hamiltonian Monte Carlo using the HMC sampler `gpflow.train.HMC` and its `sample` method. See [Markov Chain Monte Carlo (MCMC)](../advanced/mcmc.ipynb) for more information on running the sampler." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.8" }, "widgets": { "state": {}, "version": "1.1.2" } }, "nbformat": 4, "nbformat_minor": 2 }