{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Deep Gaussian Processes II" ], "id": "5d3957a5-225f-49ee-82a3-f8b528c62993" }, { "cell_type": "markdown", "metadata": {}, "source": [], "id": "6e8a73c2-0f93-4a08-a97e-1aa699e99350" }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\n", "$$" ], "id": "d7b7334b-93d3-4bdd-b6b1-8b63f7db1a23" }, { "cell_type": "markdown", "metadata": {}, "source": [ "::: {.cell .markdown}\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "" ], "id": "aa2cddb7-7830-4426-a6fa-cfb6fede589f" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Robot Wireless Data\n", "\n", "\\[edit\\]\n", "\n", "The robot wireless data is taken from an experiment run by Brian Ferris\n", "at University of Washington. It consists of the measurements of WiFi\n", "access point signal strengths as Brian walked in a loop. It was\n", "published at IJCAI in 2007 (Ferris et al., 2007)." ], "id": "e55ac001-fbef-4bef-8b2c-343c84d9abec" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pods\n", "import numpy as np" ], "id": "db1bc214-ae38-4e1f-94ce-e43df68c5a78" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data=pods.datasets.robot_wireless()" ], "id": "ee9dcdb4-8add-4cc1-b409-307dcb5517c4" }, { "cell_type": "markdown", "metadata": {}, "source": [ "The ground truth is recorded in the data, the actual loop is given in\n", "the plot below." ], "id": "065c913d-ca26-4baa-91fd-28f4c75bda47" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import mlai\n", "import mlai.plot as plot" ], "id": "6d0d0079-f2cd-4eb8-a7c9-84c8cd5ad9a5" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_figsize)\n", "plt.plot(data['X'][:, 1], data['X'][:, 2], 'r.', markersize=5)\n", "ax.set_xlabel('x position', fontsize=20)\n", "ax.set_ylabel('y position', fontsize=20)\n", "mlai.write_figure(figure=fig, \n", " filename='robot-wireless-ground-truth.svg',\n", " directory='./datasets')" ], "id": "2c20d7d6-5cfa-4a8c-887e-f8dae3c9086e" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Robot Wireless Ground Truth\n", "\n", "\n", "\n", "Figure: Ground truth movement for the position taken while recording\n", "the multivariate time-course of wireless access point signal\n", "strengths.\n", "\n", "We will ignore this ground truth in making our predictions, but see if\n", "the model can recover something similar in one of the latent layers." ], "id": "2c4afc16-ac18-42e2-9755-31836451d178" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "output_dim=1\n", "xlim = (-0.3, 1.3)\n", "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "_ = ax.plot(np.linspace(0,1,215),\n", " data['Y'][:, output_dim], \n", " 'r.', markersize=5)\n", "\n", "ax.set_xlabel('time', fontsize=20)\n", "ax.set_ylabel('signal strength', fontsize=20)\n", "xlim = (-0.2, 1.2)\n", "ylim = (-0.6, 2.0)\n", "ax.set_xlim(xlim)\n", "ax.set_ylim(ylim)\n", "\n", "ma.write_figure(figure=fig, \n", " filename='robot-wireless-dim-' + str(output_dim) + '.svg', \n", " directory='./datasets')" ], "id": "881f77e2-f157-437a-a05f-ac89277be46f" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Robot WiFi Data\n", "\n", "One challenge with the data is that the signal strength ‘drops out’.\n", "This is because the device only tracks a limited number of wifi access\n", "points, when one of the access points falls outside the track, the value\n", "disappears (in the plot below it reads -0.5). The data is missing, but\n", "it is not missing at random because the implication is that the wireless\n", "access point must be weak to have dropped from the list of those that\n", "are tracked.\n", "\n", "\n", "\n", "Figure: Output dimension 1 from the robot wireless data. This plot\n", "shows signal strength changing over time." ], "id": "51ffc75b-61ad-46b5-ad68-52f00f266d3e" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Gaussian Process Fit to Robot Wireless Data\n", "\n", "Perform a Gaussian process fit on the data using GPy." ], "id": "d6851cba-c6ad-41a8-8d30-cafb8e817ed3" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m_full = GPy.models.GPRegression(x,yhat)\n", "_ = m_full.optimize() # Optimize parameters of covariance function" ], "id": "0483fcf8-feb4-43c8-90b4-4cf9633f456e" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Robot WiFi Data GP" ], "id": "9116b85e-cd8c-4e78-a225-2694245179a9" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from matplotlib import pyplot as plt\n", "import mlai\n", "import mlai.plot as plot" ], "id": "141453d9-d3df-43fe-84de-995f2496efc3" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: Gaussian process fit to the Robot Wireless dimension 1.\n", "\n", "The deep Gaussian process code we are using is research code by Andreas\n", "Damianou.\n", "\n", "To extend the research code we introduce some approaches to\n", "initialization and optimization that we’ll use in examples. These\n", "approaches can be found in the `deepgp_tutorial.py` file.\n", "\n", "Deep Gaussian process models also can require some thought in the\n", "initialization. Here we choose to start by setting the noise variance to\n", "be one percent of the data variance." ], "id": "5d7c85d3-463c-472f-99e1-06a06e87ecc2" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import deepgp_tutorial" ], "id": "bc826cda-3023-4738-8382-cfdc8da81564" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%load -n deepgp_tutorial.initialize" ], "id": "47e25abb-a403-4dfc-932f-45106cd3375e" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Secondly, we introduce a staged optimization approach.\n", "\n", "Optimization requires moving variational parameters in the hidden layer\n", "representing the mean and variance of the expected values in that layer.\n", "Since all those values can be scaled up, and this only results in a\n", "downscaling in the output of the first GP, and a downscaling of the\n", "input length scale to the second GP. It makes sense to first of all fix\n", "the scales of the covariance function in each of the GPs.\n", "\n", "Sometimes, deep Gaussian processes can find a local minima which\n", "involves increasing the noise level of one or more of the GPs. This\n", "often occurs because it allows a minimum in the KL divergence term in\n", "the lower bound on the likelihood. To avoid this minimum we habitually\n", "train with the likelihood variance (the noise on the output of the GP)\n", "fixed to some lower value for some iterations.\n", "\n", "Next an optimization of the kernel function parameters at each layer is\n", "performed, but with the variance of the likelihood fixed. Again, this is\n", "to prevent the model minimizing the Kullback-Leibler divergence between\n", "the approximate posterior and the prior *before* achieving a good\n", "data-fit.\n", "\n", "Finally, all parameters of the model are optimized together." ], "id": "1cae4249-790a-439c-bdda-438b533f54e1" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import deepgp_tutorial" ], "id": "333a9238-1113-4af2-a9a9-64379f16336c" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%load -n deepgp_tutorial.staged_optimize" ], "id": "bd08baba-0ce6-477e-8956-bedc8127ea82" }, { "cell_type": "markdown", "metadata": {}, "source": [ "The next code is for visualizing the intermediate layers of the deep\n", "model. This visualization is only appropriate for models with\n", "intermediate layers containing a single latent variable." ], "id": "47beecec-ad3d-4c54-84e1-3714fbbf532e" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import deepgp_tutorial" ], "id": "06548c40-6771-4097-a946-bce51ab58698" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%load -n deepgp_tutorial.visualize" ], "id": "18d33e98-6a45-49b6-8500-ddfd7cb0a98e" }, { "cell_type": "markdown", "metadata": {}, "source": [ "The pinball visualization is to bring the pinball-analogy to life in the\n", "model. It shows how a ball would fall through the model to end up in the\n", "right pbosition. This visualization is only appropriate for models with\n", "intermediate layers containing a single latent variable." ], "id": "5f0e29bd-03d3-442c-9cc1-b1e6c4252b06" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import deepgp_tutorial" ], "id": "f736117c-b7d6-4791-a641-8d35961a32de" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%load -n deepgp_tutorial.visualize_pinball" ], "id": "95c94565-770a-4b95-baca-be164e201928" }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `posterior_sample` code allows us to see the output sample locations\n", "for a given input. This is useful for visualizing the non-Gaussian\n", "nature of the output density." ], "id": "594f2998-456f-43fc-a466-0c7a18db1a34" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import deepgp_tutorial" ], "id": "521631c9-3a2d-4604-9683-224607956e01" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%load -n deepgp_tutorial.posterior_sample" ], "id": "2435f550-2906-43eb-9ab2-ff4fa9ccaa6f" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we bind these methods to the DeepGP object for ease of calling." ], "id": "bd225020-6257-48f2-ae03-c9aa33b3f8b3" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "deepgp.DeepGP.initialize=initialize\n", "deepgp.DeepGP.staged_optimize=staged_optimize\n", "deepgp.DeepGP.posterior_sample = posterior_sample\n", "deepgp.DeepGP.visualize=visualize\n", "deepgp.DeepGP.visualize_pinball=visualize_pinball" ], "id": "5c1abb33-db74-44d6-9585-7a22bdee40b6" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "layers = [y.shape[1], 10, 5, 2, 2, x.shape[1]]\n", "inits = ['PCA']*(len(layers)-1)\n", "kernels = []\n", "for i in layers[1:]:\n", " kernels += [GPy.kern.RBF(i, ARD=True)]" ], "id": "a9b227a3-0c48-4f26-93fa-3e49f126a04e" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m = deepgp.DeepGP(layers,Y=y, X=x, inits=inits, \n", " kernels=kernels,\n", " num_inducing=50, back_constraint=False)\n", "m.initialize()" ], "id": "b94868a8-8a3a-415a-8206-109d4dae23f0" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m.staged_optimize(messages=(True,True,True))" ], "id": "f5d956fa-bb23-4468-811a-957f12b38024" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax=plt.subplots(figsize=plot.big_wide_figsize)\n", "plot.model_output(m, output_dim=output_dim, scale=scale, offset=offset, ax=ax, \n", " xlabel='time', ylabel='signal strength', fontsize=20, portion=0.5)\n", "ax.set_ylim(ylim)\n", "ax.set_xlim(xlim)\n", "mlai.write_figure(figure=fig, filename='./deepgp/robot-wireless-deep-gp-dim-' + str(output_dim)+ '.svg', \n", " transparent=True, frameon=True)" ], "id": "d14bb74a-aada-47c5-a742-b19ea4586f27" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Robot WiFi Data Deep GP\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "Figure: Fit of the deep Gaussian process to dimension 1 of the robot\n", "wireless data." ], "id": "547b02bc-46b8-4825-948f-8fa5bad1a7a9" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax=plt.subplots(figsize=plot.big_wide_figsize)\n", "plot.model_sample(m, output_dim=output_dim, scale=scale, offset=offset, samps=10, ax=ax,\n", " xlabel='time', ylabel='signal strength', fontsize=20, portion=0.5)\n", "ax.set_ylim(ylim)\n", "ax.set_xlim(xlim)\n", "mlai.write_figure(figure=fig, filename='./deepgp/robot-wireless-deep-gp-samples-dim-' + str(output_dim)+ '.svg', \n", " transparent=True, frameon=True)" ], "id": "c62cba8d-0b59-45e6-8a57-b69aaa871e6e" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Robot WiFi Data Deep GP\n", "\n", "\n", "\n", "Figure: Samples from the deep Gaussian process fit to dimension 1 of\n", "the robot wireless data." ], "id": "65f393af-b71e-48ea-851d-16ecc9189a68" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Robot WiFi Data Latent Space" ], "id": "9cd25aab-8e62-479e-a145-a900e6ed32cf" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_figsize)\n", "ax.plot(m.layers[-2].latent_space.mean[:, 0], \n", " m.layers[-2].latent_space.mean[:, 1], \n", " 'r.-', markersize=5)\n", "\n", "ax.set_xlabel('latent dimension 1', fontsize=20)\n", "ax.set_ylabel('latent dimension 2', fontsize=20)\n", "\n", "mlai.write_figure(figure=fig, filename='./deepgp/robot-wireless-latent-space.svg', \n", " transparent=True, frameon=True)" ], "id": "967e08ad-2c75-45c7-8cb4-eaf197bb2e0c" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: Inferred two dimensional latent space from the model for the\n", "robot wireless data." ], "id": "4913384c-03f4-45c0-944b-0bb8f41c37aa" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install gpy" ], "id": "0952df2d-ce99-4c65-9488-7c7875878e4c" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## GPy: A Gaussian Process Framework in Python\n", "\n", "\\[edit\\]\n", "\n", "Gaussian processes are a flexible tool for non-parametric analysis with\n", "uncertainty. The GPy software was started in Sheffield to provide a easy\n", "to use interface to GPs. One which allowed the user to focus on the\n", "modelling rather than the mathematics.\n", "\n", "\n", "\n", "Figure: GPy is a BSD licensed software code base for implementing\n", "Gaussian process models in Python. It is designed for teaching and\n", "modelling. We welcome contributions which can be made through the GitHub\n", "repository \n", "\n", "GPy is a BSD licensed software code base for implementing Gaussian\n", "process models in python. This allows GPs to be combined with a wide\n", "variety of software libraries.\n", "\n", "The software itself is available on\n", "[GitHub](https://github.com/SheffieldML/GPy) and the team welcomes\n", "contributions.\n", "\n", "The aim for GPy is to be a probabilistic-style programming language,\n", "i.e., you specify the model rather than the algorithm. As well as a\n", "large range of covariance functions the software allows for non-Gaussian\n", "likelihoods, multivariate outputs, dimensionality reduction and\n", "approximations for larger data sets.\n", "\n", "The documentation for GPy can be found\n", "[here](https://gpy.readthedocs.io/en/latest/).\n", "\n", "This notebook depends on PyDeepGP. This library can be installed via\n", "pip." ], "id": "63746f9a-b82f-47ea-82c6-3c3802464a88" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install --upgrade git+https://github.com/SheffieldML/PyDeepGP.git" ], "id": "f3c740e8-39dc-484f-a340-b56d389e4dfc" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install mlai" ], "id": "f3d70e57-f261-426b-b925-615d05aacde0" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Late bind setup methods to DeepGP object\n", "from mlai.deepgp_tutorial import initialize\n", "from mlai.deepgp_tutorial import staged_optimize\n", "from mlai.deepgp_tutorial import posterior_sample\n", "from mlai.deepgp_tutorial import visualize\n", "from mlai.deepgp_tutorial import visualize_pinball\n", "\n", "import deepgp\n", "deepgp.DeepGP.initialize=initialize\n", "deepgp.DeepGP.staged_optimize=staged_optimize\n", "deepgp.DeepGP.posterior_sample=posterior_sample\n", "deepgp.DeepGP.visualize=visualize\n", "deepgp.DeepGP.visualize_pinball=visualize_pinball" ], "id": "f28345dc-8ef9-49d9-9c59-908cf1aa8893" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Olympic Marathon Data\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "\n", "\n", "\n", "
\n", "\n", "- Gold medal times for Olympic Marathon since 1896.\n", "- Marathons before 1924 didn’t have a standardized distance.\n", "- Present results using pace per km.\n", "- In 1904 Marathon was badly organized leading to very slow times.\n", "\n", "\n", "\n", "\n", "Image from Wikimedia Commons \n", "\n", "
\n", "\n", "The first thing we will do is load a standard data set for regression\n", "modelling. The data consists of the pace of Olympic Gold Medal Marathon\n", "winners for the Olympics from 1896 to present. Let’s load in the data\n", "and plot." ], "id": "41e54ddb-3144-4a72-a49a-53543bbd6cc6" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install pods" ], "id": "48739638-488d-4fca-b106-3851c19e4080" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pods" ], "id": "72d2837e-7399-401d-acee-1afffd947955" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data = pods.datasets.olympic_marathon_men()\n", "x = data['X']\n", "y = data['Y']\n", "\n", "offset = y.mean()\n", "scale = np.sqrt(y.var())\n", "yhat = (y - offset)/scale" ], "id": "9bf1a6a6-8af3-4f19-9fda-2d990b75bc3f" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import mlai.plot as plot\n", "import mlai" ], "id": "a9802e63-53f1-495a-a7f2-c43ce952e96a" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\n", "xlim = (1875,2030)\n", "ylim = (2.5, 6.5)\n", "\n", "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "_ = ax.plot(x, y, 'r.',markersize=10)\n", "ax.set_xlabel('year', fontsize=20)\n", "ax.set_ylabel('pace min/km', fontsize=20)\n", "ax.set_xlim(xlim)\n", "ax.set_ylim(ylim)\n", "\n", "mlai.write_figure(filename='olympic-marathon.svg', \n", " directory='./datasets')" ], "id": "74b09e1a-b40f-448e-a3d3-23aff30a0910" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: Olympic marathon pace times since 1896.\n", "\n", "Things to notice about the data include the outlier in 1904, in that\n", "year the Olympics was in St Louis, USA. Organizational problems and\n", "challenges with dust kicked up by the cars following the race meant that\n", "participants got lost, and only very few participants completed. More\n", "recent years see more consistently quick marathons." ], "id": "893ed984-c4ff-4b98-816a-9f3adab7eea5" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Alan Turing\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "\n", "\n", "\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
\n", "\n", "Figure: Alan Turing, in 1946 he was only 11 minutes slower than the\n", "winner of the 1948 games. Would he have won a hypothetical games held in\n", "1946? Source:\n", "Alan\n", "Turing Internet Scrapbook.\n", "\n", "If we had to summarise the objectives of machine learning in one word, a\n", "very good candidate for that word would be *generalization*. What is\n", "generalization? From a human perspective it might be summarised as the\n", "ability to take lessons learned in one domain and apply them to another\n", "domain. If we accept the definition given in the first session for\n", "machine learning, $$\n", "\\text{data} + \\text{model} \\stackrel{\\text{compute}}{\\rightarrow} \\text{prediction}\n", "$$ then we see that without a model we can’t generalise: we only have\n", "data. Data is fine for answering very specific questions, like “Who won\n", "the Olympic Marathon in 2012?”, because we have that answer stored,\n", "however, we are not given the answer to many other questions. For\n", "example, Alan Turing was a formidable marathon runner, in 1946 he ran a\n", "time 2 hours 46 minutes (just under four minutes per kilometer, faster\n", "than I and most of the other [Endcliffe Park\n", "Run](http://www.parkrun.org.uk/sheffieldhallam/) runners can do 5 km).\n", "What is the probability he would have won an Olympics if one had been\n", "held in 1946?\n", "\n", "To answer this question we need to generalize, but before we formalize\n", "the concept of generalization let’s introduce some formal representation\n", "of what it means to generalize in machine learning." ], "id": "77a524d4-9c59-4265-b860-c0bda3b10045" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Gaussian Process Fit\n", "\n", "\\[edit\\]\n", "\n", "Our first objective will be to perform a Gaussian process fit to the\n", "data, we’ll do this using the [GPy\n", "software](https://github.com/SheffieldML/GPy)." ], "id": "de4ae281-c4f7-48cf-9551-baa2f159319e" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import GPy" ], "id": "595105e8-8d0b-466d-8cd0-6e1e093ae20e" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m_full = GPy.models.GPRegression(x,yhat)\n", "_ = m_full.optimize() # Optimize parameters of covariance function" ], "id": "dd38d21e-e17e-4eb5-a901-e4cd8b63e1d0" }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first command sets up the model, then `m_full.optimize()` optimizes\n", "the parameters of the covariance function and the noise level of the\n", "model. Once the fit is complete, we’ll try creating some test points,\n", "and computing the output of the GP model in terms of the mean and\n", "standard deviation of the posterior functions between 1870 and 2030. We\n", "plot the mean function and the standard deviation at 200 locations. We\n", "can obtain the predictions using `y_mean, y_var = m_full.predict(xt)`" ], "id": "72e6dcbb-3f83-4eed-a1f7-0f90ed640179" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "xt = np.linspace(1870,2030,200)[:,np.newaxis]\n", "yt_mean, yt_var = m_full.predict(xt)\n", "yt_sd=np.sqrt(yt_var)" ], "id": "93425b59-1598-4384-b2b0-029af3276436" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we plot the results using the helper function in `mlai.plot`." ], "id": "848c6dab-2dc0-4a17-a603-52fd58759a4a" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import mlai.plot as plot\n", "import mlai" ], "id": "6fc7cbf1-af6f-4fab-bb87-234257704da3" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "plot.model_output(m_full, scale=scale, offset=offset, ax=ax, xlabel=\"year\", ylabel=\"pace min/km\", fontsize=20, portion=0.2)\n", "ax.set_xlim(xlim)\n", "ax.set_ylim(ylim)\n", "mlai.write_figure(figure=fig,\n", " filename=\"olympic-marathon-gp.svg\", \n", " directory = \"./gp\",\n", " transparent=True, frameon=True)" ], "id": "db5aa66a-8015-446e-8d43-3f3ad5227686" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: Gaussian process fit to the Olympic Marathon data. The error\n", "bars are too large, perhaps due to the outlier from 1904." ], "id": "2438df74-3f91-4e46-b0d2-ce47cc6dea25" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fit Quality\n", "\n", "In the fit we see that the error bars (coming mainly from the noise\n", "variance) are quite large. This is likely due to the outlier point in\n", "1904, ignoring that point we can see that a tighter fit is obtained. To\n", "see this make a version of the model, `m_clean`, where that point is\n", "removed." ], "id": "e2c79ec7-631b-446c-b7a0-fbae3779f9ea" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x_clean=np.vstack((x[0:2, :], x[3:, :]))\n", "y_clean=np.vstack((yhat[0:2, :], yhat[3:, :]))\n", "\n", "m_clean = GPy.models.GPRegression(x_clean,y_clean)\n", "_ = m_clean.optimize()" ], "id": "2c3290da-fc75-4b3c-95ef-435bd0c86c2f" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import mlai.plot as plot\n", "import mlai" ], "id": "8e1ab0b6-89a2-4fb0-a31c-d9dce5f68130" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "plot.model_output(m_clean, scale=scale, offset=offset, ax=ax, xlabel='year', ylabel='pace min/km', fontsize=20, portion=0.2)\n", "ax.set_xlim(xlim)\n", "ax.set_ylim(ylim)\n", "mlai.write_figure(figure=fig,\n", " filename='./gp/olympic-marathon-gp.svg', \n", " transparent=True, frameon=True)" ], "id": "aaced6ee-93d2-40cf-85e3-8a94d67e1219" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Deep GP Fit\n", "\n", "\\[edit\\]\n", "\n", "Let’s see if a deep Gaussian process can help here. We will construct a\n", "deep Gaussian process with one hidden layer (i.e. one Gaussian process\n", "feeding into another).\n", "\n", "Build a Deep GP with an additional hidden layer (one dimensional) to fit\n", "the model." ], "id": "ef1af060-03e1-4468-aa79-6db47c689389" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import GPy\n", "import deepgp" ], "id": "684d5f04-a168-477d-96f7-4c56a2e3f5f1" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "hidden = 1\n", "m = deepgp.DeepGP([y.shape[1],hidden,x.shape[1]],Y=yhat, X=x, inits=['PCA','PCA'], \n", " kernels=[GPy.kern.RBF(hidden,ARD=True),\n", " GPy.kern.RBF(x.shape[1],ARD=True)], # the kernels for each layer\n", " num_inducing=50, back_constraint=False)" ], "id": "492d8998-9d25-4639-b352-ee9083d1bf81" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Call the initalization\n", "m.initialize()" ], "id": "4f3e7499-f90e-4e5b-aab1-ddb174ada8e3" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now optimize the model." ], "id": "3d93522d-7c13-4e30-a966-75920866507c" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for layer in m.layers:\n", " layer.likelihood.variance.constrain_positive(warning=False)\n", "m.optimize(messages=True,max_iters=10000)" ], "id": "27119b97-178a-4d58-a01e-3385248c865b" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m.staged_optimize(messages=(True,True,True))" ], "id": "0b455dfd-d5a0-463e-b2ea-7fdfc2916a67" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import mlai.plot as plot\n", "import mlai" ], "id": "a0e184b6-1f6c-4337-bee7-0be20d43f05e" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "plot.model_output(m, scale=scale, offset=offset, ax=ax, xlabel='year', ylabel='pace min/km', \n", " fontsize=20, portion=0.2)\n", "ax.set_xlim(xlim)\n", "\n", "ax.set_ylim(ylim)\n", "mlai.write_figure(figure=fig, filename='./deepgp/olympic-marathon-deep-gp.svg', \n", " transparent=True, frameon=True)" ], "id": "53782b86-89a2-4169-b65f-7b33d8a10833" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Olympic Marathon Data Deep GP\n", "\n", "\n", "\n", "Figure: Deep GP fit to the Olympic marathon data. Error bars now\n", "change as the prediction evolves." ], "id": "1affd05b-a6e6-46e2-84cc-e61dcfab4671" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "plot.model_sample(m, scale=scale, offset=offset, samps=10, ax=ax, \n", " xlabel='year', ylabel='pace min/km', portion = 0.225)\n", "ax.set_xlim(xlim)\n", "ax.set_ylim(ylim)\n", "mlai.write_figure(figure=fig, filename='./deepgp/olympic-marathon-deep-gp-samples.svg', \n", " transparent=True, frameon=True)" ], "id": "440b514c-d7e4-4325-a221-b9857a58d9c5" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Olympic Marathon Data Deep GP\n", "\n", "\n", "\n", "Figure: Point samples run through the deep Gaussian process show the\n", "distribution of output locations." ], "id": "46b9875c-1484-4ff4-b9e0-1ccd7a5075ed" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fitted GP for each layer\n", "\n", "Now we explore the GPs the model has used to fit each layer. First of\n", "all, we look at the hidden layer." ], "id": "321a532d-7fd2-42fa-9827-853fb1d5d913" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m.visualize(scale=scale, offset=offset, xlabel='year',\n", " ylabel='pace min/km',xlim=xlim, ylim=ylim,\n", " dataset='olympic-marathon',\n", " diagrams='./deepgp')" ], "id": "e75defd0-cb46-4332-a417-837b636ffee3" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import notutils as nu" ], "id": "2fbe547d-6a5c-4d96-a7b1-bc59494ce239" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "nu.display_plots('olympic-marathon-deep-gp-layer-{sample:0>1}.svg', \n", " './deepgp', sample=(0,1))" ], "id": "5d6afe4a-9855-4a7e-86ae-eca6dc9e6737" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: The mapping from input to the latent layer is broadly, with\n", "some flattening as time goes on. Variance is high across the input\n", "range.\n", "\n", "\n", "\n", "Figure: The mapping from the latent layer to the output layer." ], "id": "c1aef125-9388-474d-ace3-ddd7967ea36e" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "m.visualize_pinball(ax=ax, scale=scale, offset=offset, points=30, portion=0.1,\n", " xlabel='year', ylabel='pace km/min', vertical=True)\n", "mlai.write_figure(figure=fig, filename='./deepgp/olympic-marathon-deep-gp-pinball.svg', \n", " transparent=True, frameon=True)" ], "id": "dd87f4f9-3395-409c-ba0b-5483648616e8" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Olympic Marathon Pinball Plot\n", "\n", "\n", "\n", "Figure: A pinball plot shows the movement of the ‘ball’ as it passes\n", "through each layer of the Gaussian processes. Mean directions of\n", "movement are shown by lines. Shading gives one standard deviation of\n", "movement position. At each layer, the uncertainty is reset. The overal\n", "uncertainty is the cumulative uncertainty from all the layers. There is\n", "some grouping of later points towards the right in the first layer,\n", "which also injects a large amount of uncertainty. Due to flattening of\n", "the curve in the second layer towards the right the uncertainty is\n", "reduced in final output.\n", "\n", "The pinball plot shows the flow of any input ball through the deep\n", "Gaussian process. In a pinball plot a series of vertical parallel lines\n", "would indicate a purely linear function. For the olypmic marathon data\n", "we can see the first layer begins to shift from input towards the right.\n", "Note it also does so with some uncertainty (indicated by the shaded\n", "backgrounds). The second layer has less uncertainty, but bunches the\n", "inputs more strongly to the right. This input layer of uncertainty,\n", "followed by a layer that pushes inputs to the right is what gives the\n", "heteroschedastic noise." ], "id": "96c134bd-cef0-42ed-b2e1-29b171399831" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Gene Expression Example\n", "\n", "\\[edit\\]\n", "\n", "We now consider an example in gene expression. Gene expression is the\n", "measurement of mRNA levels expressed in cells. These mRNA levels show\n", "which genes are ‘switched on’ and producing data. In the example we will\n", "use a Gaussian process to determine whether a given gene is active, or\n", "we are merely observing a noise response." ], "id": "063c1a59-e77c-492d-bd57-56465b1b4527" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Della Gatta Gene Data\n", "\n", "\\[edit\\]\n", "\n", "- Given given expression levels in the form of a time series from\n", " Della Gatta et al. (2008)." ], "id": "ba5360c0-0a7c-48db-b630-2a58a9e9df39" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pods" ], "id": "7ea76b23-5a4d-44a4-bb07-49f5eba5fa39" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data = pods.datasets.della_gatta_TRP63_gene_expression(data_set='della_gatta',gene_number=937)\n", "\n", "x = data['X']\n", "y = data['Y']\n", "\n", "offset = y.mean()\n", "scale = np.sqrt(y.var())" ], "id": "34163873-013b-4120-9e32-d0ce23ac082a" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import mlai.plot as plot\n", "import mlai" ], "id": "ef1529c4-c716-4de9-a64f-3083c851bf45" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\n", "xlim = (-20,260)\n", "ylim = (5, 7.5)\n", "yhat = (y-offset)/scale\n", "\n", "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "_ = ax.plot(x, y, 'r.',markersize=10)\n", "ax.set_xlabel('time/min', fontsize=20)\n", "ax.set_ylabel('expression', fontsize=20)\n", "ax.set_xlim(xlim)\n", "ax.set_ylim(ylim)\n", "\n", "mlai.write_figure(figure=fig, \n", " filename='./datasets/della-gatta-gene.svg', \n", " transparent=True, \n", " frameon=True)" ], "id": "da569d5f-d74b-4857-9371-b07b71c7e83d" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: Gene expression levels over time for a gene from data\n", "provided by Della Gatta et al. (2008). We would like to understand\n", "whether there is signal in the data, or we are only observing noise.\n", "\n", "- Want to detect if a gene is expressed or not, fit a GP to each gene\n", " Kalaitzis and Lawrence (2011).\n", "\n", "\n", "\n", " \n", "\n", "\n", "\n", " \n", "\n", "\n", "\n", "Freddie Kalaitzis\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "Figure: The example is taken from the paper “A Simple Approach to\n", "Ranking Differentially Expressed Gene Expression Time Courses through\n", "Gaussian Process Regression.” Kalaitzis and Lawrence (2011).\n", "\n", "
\n", "\n", "\n", "\n", "
\n", "\n", "Our first objective will be to perform a Gaussian process fit to the\n", "data, we’ll do this using the [GPy\n", "software](https://github.com/SheffieldML/GPy)." ], "id": "620a3a3b-5b03-4332-a713-ac8edf0c1e4c" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import GPy" ], "id": "884f0d50-5567-45ed-b472-446037e477e4" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m_full = GPy.models.GPRegression(x,yhat)\n", "m_full.kern.lengthscale=50\n", "_ = m_full.optimize() # Optimize parameters of covariance function" ], "id": "7c38aebe-b03e-436c-aba3-c4befec675b9" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Initialize the length scale parameter (which here actually represents a\n", "*time scale* of the covariance function) to a reasonable value. Default\n", "would be 1, but here we set it to 50 minutes, given points are arriving\n", "across zero to 250 minutes." ], "id": "cf539c27-419c-4bc8-b166-873078e273a7" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "xt = np.linspace(-20,260,200)[:,np.newaxis]\n", "yt_mean, yt_var = m_full.predict(xt)\n", "yt_sd=np.sqrt(yt_var)" ], "id": "933000bd-5aaf-4d14-ac2c-a4a192293a1e" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we plot the results using the helper function in `mlai.plot`." ], "id": "2299e8a1-4eb5-4c91-85e7-4e9ffa3c4105" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import mlai.plot as plot" ], "id": "ec95b398-b58f-45e0-9ba2-177e2e26629a" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "plot.model_output(m_full, scale=scale, offset=offset, ax=ax, xlabel='time/min', ylabel='expression', fontsize=20, portion=0.2)\n", "ax.set_xlim(xlim)\n", "ax.set_ylim(ylim)\n", "ax.set_title('log likelihood: {ll:.3}'.format(ll=m_full.log_likelihood()), fontsize=20)\n", "mlai.write_figure(figure=fig,\n", " filename='./gp/della-gatta-gene-gp.svg', \n", " transparent=True, frameon=True)" ], "id": "975f25f1-c713-4b65-9f1f-bec3fe58e9ee" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: Result of the fit of the Gaussian process model with the time\n", "scale parameter initialized to 50 minutes.\n", "\n", "Now we try a model initialized with a longer length scale." ], "id": "0801f612-87d9-428b-9aa4-61a00b19e5fe" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m_full2 = GPy.models.GPRegression(x,yhat)\n", "m_full2.kern.lengthscale=2000\n", "_ = m_full2.optimize() # Optimize parameters of covariance function" ], "id": "2f290e0e-7b23-406f-a105-8f9b55a8b5d2" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import mlai.plot as plot" ], "id": "98826c8a-cb81-4b10-97af-c8069c57fc1e" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "plot.model_output(m_full2, scale=scale, offset=offset, ax=ax, xlabel='time/min', ylabel='expression', fontsize=20, portion=0.2)\n", "ax.set_xlim(xlim)\n", "ax.set_ylim(ylim)\n", "ax.set_title('log likelihood: {ll:.3}'.format(ll=m_full2.log_likelihood()), fontsize=20)\n", "mlai.write_figure(figure=fig,\n", " filename='./gp/della-gatta-gene-gp2.svg', \n", " transparent=True, frameon=True)" ], "id": "f6fc70d0-d972-4a88-b8af-d99986ce7200" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: Result of the fit of the Gaussian process model with the time\n", "scale parameter initialized to 2000 minutes.\n", "\n", "Now we try a model initialized with a lower noise." ], "id": "753df1ab-264c-4646-bb95-b970d2c2bf6a" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m_full3 = GPy.models.GPRegression(x,yhat)\n", "m_full3.kern.lengthscale=20\n", "m_full3.likelihood.variance=0.001\n", "_ = m_full3.optimize() # Optimize parameters of covariance function" ], "id": "f2d1e981-7e3a-4acd-a928-7a913d04d1cd" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import mlai.plot as plot" ], "id": "252161a5-a376-4b8f-8a48-3534ff4614e0" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "plot.model_output(m_full3, scale=scale, offset=offset, ax=ax, xlabel='time/min', ylabel='expression', fontsize=20, portion=0.2)\n", "ax.set_xlim(xlim)\n", "ax.set_ylim(ylim)\n", "ax.set_title('log likelihood: {ll:.3}'.format(ll=m_full3.log_likelihood()), fontsize=20)\n", "mlai.write_figure(figure=fig,\n", " filename='./gp/della-gatta-gene-gp3.svg', \n", " transparent=True, frameon=True)" ], "id": "1f36407b-8084-47e7-8c7c-edfdfca4b2e2" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: Result of the fit of the Gaussian process model with the\n", "noise initialized low (standard deviation 0.1) and the time scale\n", "parameter initialized to 20 minutes." ], "id": "a47b3a91-b25d-4bc5-8d79-9eb5c5c9acc5" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import mlai.plot as plot" ], "id": "db303f49-7746-4716-990c-4a7430619b67" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plot.multiple_optima(diagrams='./gp')" ], "id": "f742920c-9b02-478f-8bd8-19ad3555b934" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: \n", "\n", "" ], "id": "64edccae-a39e-4e42-8424-ad3dec06a38c" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "layers = [y.shape[1], 1,x.shape[1]]\n", "inits = ['PCA']*(len(layers)-1)\n", "kernels = []\n", "for i in layers[1:]:\n", " kernels += [GPy.kern.RBF(i)]\n", "m = deepgp.DeepGP(layers,Y=yhat, X=x, \n", " inits=inits, \n", " kernels=kernels, # the kernels for each layer\n", " num_inducing=20, back_constraint=False)" ], "id": "997dd541-0d93-4edc-84cc-779f0d1db32b" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m.initialize()\n", "m.staged_optimize()" ], "id": "8bcb3837-6e4f-413f-84c9-b75c44a1da32" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax=plt.subplots(figsize=plot.big_wide_figsize)\n", "plot.model_output(m, scale=scale, offset=offset, ax=ax, fontsize=20, portion=0.5)\n", "ax.set_ylim(ylim)\n", "ax.set_xlim(xlim)\n", "mlai.write_figure(filename=\"./deepgp/della-gatta-gene-deep-gp.svg\", \n", " transparent=True, frameon=True)" ], "id": "47b59618-9b16-4854-a1a1-baeedb5de4c0" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Della Gatta Gene Data Deep GP\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "Figure: Deep Gaussian process fit to the Della Gatta gene expression\n", "data." ], "id": "13eb461c-0d72-4051-98cd-b85ddb5f3505" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax=plt.subplots(figsize=plot.big_wide_figsize)\n", "plot.model_sample(m, scale=scale, offset=offset, samps=10, ax=ax, portion = 0.5)\n", "ax.set_ylim(ylim)\n", "ax.set_xlim(xlim)\n", "mlai.write_figure(figure=fig, filename='./deepgp/della-gatta-gene-deep-gp-samples.svg', \n", " transparent=True, frameon=True)" ], "id": "41aa976c-e2ff-4be0-a069-6fa680167cdb" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Della Gatta Gene Data Deep GP\n", "\n", "\n", "\n", "Figure: Deep Gaussian process samples fitted to the Della Gatta gene\n", "expression data." ], "id": "05224d7b-956d-44fa-82a0-d1acbbd4478a" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m.visualize(offset=offset, scale=scale, xlim=xlim, ylim=ylim,\n", " dataset=\"della-gatta-gene\",\n", " diagrams=\"./deepgp\")" ], "id": "fd0fa14a-cbfc-4b43-aba6-47c283402aae" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Della Gatta Gene Data Latent 1\n", "\n", "\n", "\n", "Figure: Gaussian process mapping from input to latent layer for the\n", "della Gatta gene expression data." ], "id": "7b30ae00-2d07-4bb5-b93f-758dd8b6e998" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Della Gatta Gene Data Latent 2\n", "\n", "\n", "\n", "Figure: Gaussian process mapping from latent to output layer for the\n", "della Gatta gene expression data." ], "id": "4067792a-c820-4bf9-b7e2-d822eda6155e" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax=plt.subplots(figsize=plot.big_wide_figsize)\n", "m.visualize_pinball(offset=offset, ax=ax, scale=scale, xlim=xlim, ylim=ylim, portion=0.1, points=50)\n", "mlai.write_figure(figure=fig, filename=\"./deepgp/della-gatta-gene-deep-gp-pinball.svg\", \n", " transparent=True, frameon=True, ax=ax)" ], "id": "292460ec-2ccd-42bd-bc60-90a4d59ef087" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## TP53 Gene Pinball Plot\n", "\n", "\n", "\n", "Figure: A pinball plot shows the movement of the ‘ball’ as it passes\n", "through each layer of the Gaussian processes. Mean directions of\n", "movement are shown by lines. Shading gives one standard deviation of\n", "movement position. At each layer, the uncertainty is reset. The overal\n", "uncertainty is the cumulative uncertainty from all the layers. Pinball\n", "plot of the della Gatta gene expression data." ], "id": "8ca695f2-fdfb-47ba-8760-d62de3b073fd" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step Function\n", "\n", "\\[edit\\]\n", "\n", "Next we consider a simple step function data set." ], "id": "3b602840-9c08-435a-8392-e55b0aaf55c0" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "num_low=25\n", "num_high=25\n", "gap = -.1\n", "noise=0.0001\n", "x = np.vstack((np.linspace(-1, -gap/2.0, num_low)[:, np.newaxis],\n", " np.linspace(gap/2.0, 1, num_high)[:, np.newaxis]))\n", "y = np.vstack((np.zeros((num_low, 1)), np.ones((num_high,1))))\n", "scale = np.sqrt(y.var())\n", "offset = y.mean()\n", "yhat = (y-offset)/scale" ], "id": "a13b3a9d-e17a-4396-b292-5fb1650bd243" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "_ = ax.plot(x, y, 'r.',markersize=10)\n", "_ = ax.set_xlabel('$x$', fontsize=20)\n", "_ = ax.set_ylabel('$y$', fontsize=20)\n", "xlim = (-2, 2)\n", "ylim = (-0.6, 1.6)\n", "ax.set_ylim(ylim)\n", "ax.set_xlim(xlim)\n", "mlai.write_figure(figure=fig, filename='./datasets/step-function.svg', \n", " transparent=True, frameon=True)" ], "id": "02e2d564-faa9-4eff-a914-0c9897f35abb" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step Function Data\n", "\n", "\n", "\n", "Figure: Simulation study of step function data artificially\n", "generated. Here there is a small overlap between the two lines." ], "id": "7e78354c-a128-447a-aeaf-5b6bfe211086" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step Function Data GP\n", "\n", "We can fit a Gaussian process to the step function data using `GPy` as\n", "follows." ], "id": "bf12d86b-7995-4967-8230-0848988c757e" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m_full = GPy.models.GPRegression(x,yhat)\n", "_ = m_full.optimize() # Optimize parameters of covariance function" ], "id": "b4e4e4d9-7e7e-459a-90d9-707d6e5d713f" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Where `GPy.models.GPRegression()` gives us a standard GP regression\n", "model with exponentiated quadratic covariance function.\n", "\n", "The model is optimized using `m_full.optimize()` which calls an L-BGFS\n", "gradient based solver in python." ], "id": "833ce5bc-4f41-4b87-a8ac-c419e3e90c4c" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax=plt.subplots(figsize=plot.big_wide_figsize)\n", "plot.model_output(m_full, scale=scale, offset=offset, ax=ax, fontsize=20, portion=0.5)\n", "ax.set_ylim(ylim)\n", "ax.set_xlim(xlim)\n", "\n", "mlai.write_figure(figure=fig,filename='./gp/step-function-gp.svg', \n", " transparent=True, frameon=True)" ], "id": "78750f43-3244-4837-abe0-118be146f450" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: Gaussian process fit to the step function data. Note the\n", "large error bars and the over-smoothing of the discontinuity. Error bars\n", "are shown at two standard deviations.\n", "\n", "The resulting fit to the step function data shows some challenges. In\n", "particular, the over smoothing at the discontinuity. If we know how many\n", "discontinuities there are, we can parameterize them in the step\n", "function. But by doing this, we form a semi-parametric model. The\n", "parameters indicate how many discontinuities are, and where they are.\n", "They can be optimized as part of the model fit. But if new, unforeseen,\n", "discontinuities arise when the model is being deployed in practice,\n", "these won’t be accounted for in the predictions." ], "id": "31d0c86b-cdcb-4689-86c3-190ef1089038" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step Function Data Deep GP\n", "\n", "\\[edit\\]\n", "\n", "First we initialize a deep Gaussian process with three latent layers\n", "(four layers total). Within each layer we create a GP with an\n", "exponentiated quadratic covariance (`GPy.kern.RBF`).\n", "\n", "At each layer we use 20 inducing points for the variational\n", "approximation." ], "id": "13e6326f-a808-4ad4-9e13-52546daf07ef" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "layers = [y.shape[1], 1, 1, 1,x.shape[1]]\n", "inits = ['PCA']*(len(layers)-1)\n", "kernels = []\n", "for i in layers[1:]:\n", " kernels += [GPy.kern.RBF(i)]\n", " \n", "m = deepgp.DeepGP(layers,Y=yhat, X=x, \n", " inits=inits, \n", " kernels=kernels, # the kernels for each layer\n", " num_inducing=20, back_constraint=False)" ], "id": "74b25f6a-d753-43c5-8096-0f0eabb5237e" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once the model is constructed we initialize the parameters, and perform\n", "the staged optimization which starts by optimizing variational\n", "parameters with a low noise and proceeds to optimize the whole model." ], "id": "03c533bc-1174-4460-afb1-61c46e9c8579" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m.initialize()\n", "m.staged_optimize()" ], "id": "2f686d5b-63ec-4bcd-9e89-35588aee70cb" }, { "cell_type": "markdown", "metadata": {}, "source": [ "We plot the output of the deep Gaussian process fitted to the step data\n", "as follows." ], "id": "cef0c555-7266-4415-85ee-04b9791c20f3" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax=plt.subplots(figsize=plot.big_wide_figsize)\n", "plot.model_output(m, scale=scale, offset=offset, ax=ax, fontsize=20, portion=0.5)\n", "ax.set_ylim(ylim)\n", "ax.set_xlim(xlim)\n", "mlai.write_figure(filename='./deepgp/step-function-deep-gp.svg', \n", " transparent=True, frameon=True)" ], "id": "691f119a-2f3e-41fe-a7a5-8c4cf9fdc712" }, { "cell_type": "markdown", "metadata": {}, "source": [ "The deep Gaussian process does a much better job of fitting the data. It\n", "handles the discontinuity easily, and error bars drop to smaller values\n", "in the regions of data.\n", "\n", "\n", "\n", "Figure: Deep Gaussian process fit to the step function data." ], "id": "67f87c3e-7521-4a59-a338-fd729aa36042" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step Function Data Deep GP\n", "\n", "The samples of the model can be plotted with the helper function from\n", "`mlai.plot`, `model_sample`" ], "id": "934ca0ad-43c1-4ef8-9db1-7db2a157b946" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import mlai.plot as plot" ], "id": "523e100c-3ed8-4db3-928d-06c5bfa30228" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax=plt.subplots(figsize=plot.big_wide_figsize)\n", "\n", "plot.model_sample(m, scale=scale, offset=offset, samps=10, ax=ax, portion = 0.5)\n", "ax.set_ylim(ylim)\n", "ax.set_xlim(xlim)\n", "mlai.write_figure(figure=fig, filename='./deepgp/step-function-deep-gp-samples.svg', \n", " transparent=True, frameon=True)" ], "id": "145e953e-6538-41d0-9796-a9465bd72cb8" }, { "cell_type": "markdown", "metadata": {}, "source": [ "The samples from the model show that the error bars, which are\n", "informative for Gaussian outputs, are less informative for this model.\n", "They make clear that the data points lie, in output mainly at 0 or 1, or\n", "occasionally in between.\n", "\n", "\n", "\n", "Figure: Samples from the deep Gaussian process model for the step\n", "function fit.\n", "\n", "The visualize code allows us to inspect the intermediate layers in the\n", "deep GP model to understand how it has reconstructed the step function." ], "id": "d5559a34-39da-40ca-849d-f636f0cb1687" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m.visualize(offset=offset, scale=scale, xlim=xlim, ylim=ylim,\n", " dataset='step-function',\n", " diagrams='./deepgp')" ], "id": "c6e900f2-4055-4d1d-b786-8a294bc27dd9" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "\n", "\n", "\n", "Figure: From top to bottom, the Gaussian process mapping function\n", "that makes up each layer of the resulting deep Gaussian process.\n", "\n", "A pinball plot can be created for the resulting model to understand how\n", "the input is being translated to the output across the different layers." ], "id": "2cd5f593-04c0-46bd-8646-bd79975e8477" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import mlai.plot as plot\n", "import mlai" ], "id": "26d5dbd2-a073-4293-a92f-f27e5ee32a39" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax=plt.subplots(figsize=plot.big_wide_figsize)\n", "m.visualize_pinball(offset=offset, ax=ax, scale=scale, xlim=xlim, ylim=ylim, portion=0.1, points=50)\n", "mlai.write_figure(figure=fig, filename='./deepgp/step-function-deep-gp-pinball.svg', \n", " transparent=True, frameon=True, ax=ax)" ], "id": "bb99e8c4-3d68-46aa-bbb6-477888c00eb6" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: Pinball plot of the deep GP fitted to the step function data.\n", "Each layer of the model pushes the ‘ball’ towards the left or right,\n", "saturating at 1 and 0. This causes the final density to be be peaked at\n", "0 and 1. Transitions occur driven by the uncertainty of the mapping in\n", "each layer." ], "id": "389d1282-c6df-46c9-b66c-e7cd4154210f" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pods" ], "id": "80d9aaa5-a8d1-4da2-ac73-62a18620b901" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data = pods.datasets.mcycle()\n", "x = data['X']\n", "y = data['Y']\n", "scale=np.sqrt(y.var())\n", "offset=y.mean()\n", "yhat = (y - offset)/scale" ], "id": "626d7731-fff3-4e55-862f-bb552540da15" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import mlai\n", "import mlai.plot as plot" ], "id": "3c9ee859-7861-4003-a130-6ec41175e065" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "_ = ax.plot(x, y, 'r.',markersize=10)\n", "_ = ax.set_xlabel('time', fontsize=20)\n", "_ = ax.set_ylabel('acceleration', fontsize=20)\n", "xlim = (-20, 80)\n", "ylim = (-175, 125)\n", "ax.set_xlim(xlim)\n", "ax.set_ylim(ylim)\n", "mlai.write_figure(filename='motorcycle-helmet.svg', directory='./datasets/',\n", " transparent=True, frameon=True)" ], "id": "5b6a6375-1aca-4a05-b9f3-2436f3d40880" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Motorcycle Helmet Data\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "Figure: Motorcycle helmet data. The data consists of acceleration\n", "readings on a motorcycle helmet undergoing a collision. The data\n", "exhibits heteroschedastic (time varying) noise levles and\n", "non-stationarity." ], "id": "386f665f-7582-48d0-b7bb-4744e159ab96" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m_full = GPy.models.GPRegression(x,yhat)\n", "_ = m_full.optimize() # Optimize parameters of covariance function" ], "id": "52cb3776-dcd8-4f8e-89a9-af7a3eca3e85" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import mlai.plot as plot\n", "import mlai" ], "id": "7169287b-e86f-4988-9aaa-50ab180f83a3" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax=plt.subplots(figsize=plot.big_wide_figsize)\n", "plot.model_output(m_full, scale=scale, offset=offset, ax=ax, xlabel='time', ylabel='acceleration/$g$', fontsize=20, portion=0.5)\n", "xlim=(-20,80)\n", "ylim=(-180,120)\n", "ax.set_ylim(ylim)\n", "ax.set_xlim(xlim)\n", "mlai.write_figure(figure=fig,filename='./gp/motorcycle-helmet-gp.svg', \n", " transparent=True, frameon=True)" ], "id": "1ce3d7ae-d3a5-4be9-a6a8-39cbe3c95e52" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Motorcycle Helmet Data GP\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "Figure: Gaussian process fit to the motorcycle helmet accelerometer\n", "data." ], "id": "63c25e71-f48e-4d99-b311-02db04024859" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Motorcycle Helmet Data Deep GP\n", "\n", "\\[edit\\]" ], "id": "8b3cfd9a-56d0-4e26-817c-42796c7dc86e" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import deepgp" ], "id": "dea4180c-bf6d-434c-93a9-7826e31cc03d" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "layers = [y.shape[1], 1, x.shape[1]]\n", "inits = ['PCA']*(len(layers)-1)\n", "kernels = []\n", "for i in layers[1:]:\n", " kernels += [GPy.kern.RBF(i)]\n", "m = deepgp.DeepGP(layers,Y=yhat, X=x, \n", " inits=inits, \n", " kernels=kernels, # the kernels for each layer\n", " num_inducing=20, back_constraint=False)\n", "\n", "\n", "\n", "m.initialize()" ], "id": "66db8a94-1f19-4f0d-93e8-4792ad43ddfa" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m.staged_optimize(iters=(1000,1000,10000), messages=(True, True, True))" ], "id": "c6b45795-7db7-48c9-b60c-814208829cc2" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import mlai.plot as plot\n", "import mlai" ], "id": "137e51f4-33ed-4cd9-95d1-47afca2e9133" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax=plt.subplots(figsize=plot.big_wide_figsize)\n", "plot.model_output(m, scale=scale, offset=offset, ax=ax, xlabel='time', ylabel='acceleration/$g$', fontsize=20, portion=0.5)\n", "ax.set_ylim(ylim)\n", "ax.set_xlim(xlim)\n", "mlai.write_figure(filename='./deepgp/motorcycle-helmet-deep-gp.svg', \n", " transparent=True, frameon=True)" ], "id": "d441be2a-869d-4290-80f6-018fc3eb38cf" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: Deep Gaussian process fit to the motorcycle helmet\n", "accelerometer data." ], "id": "51fda425-2738-491e-9115-2a2f87801992" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import mlai.plot as plot\n", "import mlai" ], "id": "a90424b3-ea8e-4e38-974c-94cb8f6d0867" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax=plt.subplots(figsize=plot.big_wide_figsize)\n", "plot.model_sample(m, scale=scale, offset=offset, samps=10, ax=ax, xlabel='time', ylabel='acceleration/$g$', portion = 0.5)\n", "ax.set_ylim(ylim)\n", "ax.set_xlim(xlim)\n", "\n", "mlai.write_figure(figure=fig, filename='./deepgp/motorcycle-helmet-deep-gp-samples.svg', \n", " transparent=True, frameon=True)" ], "id": "fd516418-c941-41d1-819d-a0a224ea9535" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Motorcycle Helmet Data Deep GP\n", "\n", "\n", "\n", "Figure: Samples from the deep Gaussian process as fitted to the\n", "motorcycle helmet accelerometer data." ], "id": "e91d7ca3-2da5-4c0a-97f9-02e70aeff41c" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m.visualize(xlim=xlim, ylim=ylim, scale=scale,offset=offset, \n", " xlabel=\"time\", ylabel=\"acceleration/$g$\", portion=0.5,\n", " dataset='motorcycle-helmet',\n", " diagrams='./deepgp')" ], "id": "93196a71-1dbc-4265-93eb-8d639ead7181" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Motorcycle Helmet Data Latent 1\n", "\n", "\n", "\n", "Figure: Mappings from the input to the latent layer for the\n", "motorcycle helmet accelerometer data." ], "id": "dbb36e6a-2164-4a37-b14b-1ffcbd5fe060" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Motorcycle Helmet Data Latent 2\n", "\n", "\n", "\n", "Figure: Mappings from the latent layer to the output layer for the\n", "motorcycle helmet accelerometer data." ], "id": "6ffc32d3-9e1c-493d-b721-a2427599856d" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax=plt.subplots(figsize=plot.big_wide_figsize)\n", "m.visualize_pinball(ax=ax, xlabel='time', ylabel='acceleration/g', \n", " points=50, scale=scale, offset=offset, portion=0.1)\n", "mlai.write_figure(figure=fig, filename='./deepgp/motorcycle-helmet-deep-gp-pinball.svg', \n", " transparent=True, frameon=True)" ], "id": "b36a8eb6-02d6-4aed-8cbb-415e34d8a054" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Motorcycle Helmet Pinball Plot\n", "\n", "\n", "\n", "Figure: Pinball plot for the mapping from input to output layer for\n", "the motorcycle helmet accelerometer data." ], "id": "0e626e4c-d4ed-40b3-bb5e-ea0df1ea3a1f" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## ‘High Five’ Motion Capture Data\n", "\n", "\\[edit\\]\n", "\n", "Motion capture data from the CMU motion capture data base (CMU Motion\n", "Capture Lab, 2003). It contains two subjects approaching each other and\n", "executing a ‘high five’. The subjects are number 10 and 11 and their\n", "motion numbers are 21." ], "id": "0e3ea6ad-673d-4021-892a-3c86882a8bd0" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pods" ], "id": "91923deb-5d1f-489f-ad1c-67047c35801c" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data = pods.datasets.cmu_mocap_high_five()" ], "id": "50b8195a-3dbe-418c-9beb-e2706877ba45" }, { "cell_type": "markdown", "metadata": {}, "source": [ "The data dictionary contains the keys ‘Y1’ and ‘Y2’, which represent the\n", "motions of the two different subjects. Their skeleton files are included\n", "in the keys ‘skel1’ and ‘skel2’." ], "id": "3d2d5b82-1a72-4880-831b-74cd0d99b0c0" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data['Y1'].shape\n", "data['Y2'].shape" ], "id": "cd256494-c066-4981-893f-3b3e7c22879f" }, { "cell_type": "markdown", "metadata": {}, "source": [ "The data was used in the hierarchical GP-LVM paper (Lawrence and Moore,\n", "2007) in an experiment that was also recreated in the Deep Gaussian\n", "process paper (Damianou and Lawrence, 2013)." ], "id": "8e9f39ec-6840-4611-b6ff-a5d67cd6c418" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(data['citation'])" ], "id": "11ad7f99-461f-46bd-a463-3560716e1b66" }, { "cell_type": "markdown", "metadata": {}, "source": [ "And extra information about the data is included, as standard, under the\n", "keys `info` and `details`." ], "id": "1d8fb73d-3b23-4408-b997-df4cc426d541" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(data['info'])\n", "print()\n", "print(data['details'])" ], "id": "53a8ff77-1c8f-4e2c-8f60-791bd3bb7694" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Shared LVM\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "Figure: Shared latent variable model structure. Here two related data\n", "sets are brought together with a set of latent variables that are\n", "partially shared and partially specific to one of the data sets.\n", "\n", "\n", "\n", "Figure: Latent spaces of the ‘high five’ data. The structure of the\n", "model is automatically learnt. One of the latent spaces is coordinating\n", "how the two figures walk together, the other latent spaces contain\n", "latent variables that are specific to each of the figures\n", "separately." ], "id": "f4ba4048-5e82-4cea-97e8-54103df98c31" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Subsample of the MNIST Data\n", "\n", "\\[edit\\]\n", "\n", "We will look at a sub-sample of the MNIST digit data set.\n", "\n", "First load in the MNIST data set from scikit learn. This can take a\n", "little while because it’s large to download." ], "id": "0abb5ec5-1ccd-4604-a9b7-9fe839595d94" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from sklearn.datasets import fetch_openml" ], "id": "8c06fcc3-5691-498a-92b3-b4d944d29d30" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mnist = fetch_openml('mnist_784')" ], "id": "4d75aea1-34b9-473f-8c4e-39a66f93fcc3" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Sub-sample the dataset to make the training faster." ], "id": "cd390743-5847-4d38-8eba-f1a95b8c4fa2" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np" ], "id": "2cbb570b-b4ea-44ae-a5e7-21bc173cadfc" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.random.seed(0)\n", "digits = [0,1,2,3,4]\n", "N_per_digit = 100\n", "Y = []\n", "labels = []\n", "for d in digits:\n", " imgs = mnist['data'][mnist['target']==str(d)]\n", " Y.append(imgs.loc[np.random.permutation(imgs.index)[:N_per_digit]])\n", " labels.append(np.ones(N_per_digit)*d)\n", "Y = np.vstack(Y).astype(np.float64)\n", "labels = np.hstack(labels)\n", "Y /= 255" ], "id": "4595ebbb-47b7-4803-ac9f-cf96e7cc4478" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fitting a Deep GP to a the MNIST Digits Subsample\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", " \n", "\n", "\n", "\n", " \n", "\n", "\n", "\n", "Zhenwen Dai\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", " \n", "\n", "\n", "\n", " \n", "\n", "\n", "\n", "Andreas Damianou\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "We now look at the deep Gaussian processes’ capacity to perform\n", "unsupervised learning." ], "id": "307f7ea9-d384-44b6-8069-8c947f80d3b3" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fit a Deep GP\n", "\n", "We’re going to fit a Deep Gaussian process model to the MNIST data with\n", "two hidden layers. Each of the two Gaussian processes (one from the\n", "first hidden layer to the second, one from the second hidden layer to\n", "the data) has an exponentiated quadratic covariance." ], "id": "36bf0eef-3996-4037-9d41-6dcc27972153" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import deepgp\n", "import GPy" ], "id": "e7bf3eec-62cb-4d93-9920-20b6b9fd6017" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "num_latent = 2\n", "num_hidden_2 = 5\n", "m = deepgp.DeepGP([Y.shape[1],num_hidden_2,num_latent],\n", " Y,\n", " kernels=[GPy.kern.RBF(num_hidden_2,ARD=True), \n", " GPy.kern.RBF(num_latent,ARD=False)], \n", " num_inducing=50, back_constraint=False, \n", " encoder_dims=[[200],[200]])" ], "id": "e4bb7154-d74f-4b53-8676-19e76a50b549" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Initialization\n", "\n", "Just like deep neural networks, there are some tricks to intitializing\n", "these models. The tricks we use here include some early training of the\n", "model with model parameters constrained. This gives the variational\n", "inducing parameters some scope to tighten the bound for the case where\n", "the noise variance is small and the variances of the Gaussian processes\n", "are around 1." ], "id": "ef112872-e1c3-45c6-8ada-08344e43294d" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m.obslayer.likelihood.variance[:] = Y.var()*0.01\n", "for layer in m.layers:\n", " layer.kern.variance.fix(warning=False)\n", " layer.likelihood.variance.fix(warning=False)" ], "id": "6ee67f3a-4883-4130-8e71-764743dffd22" }, { "cell_type": "markdown", "metadata": {}, "source": [ "We now we optimize for a hundred iterations with the constrained model." ], "id": "07eb8ffd-bd2a-4c9f-8a2e-c9d9010f0059" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m.optimize(messages=False,max_iters=100)" ], "id": "b4af7402-01bb-4a4f-a416-3d400192ffdd" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we remove the fixed constraint on the kernel variance parameters,\n", "but keep the noise output constrained, and run for a further 100\n", "iterations." ], "id": "8ab7f9b6-b9f0-4275-b331-c806da303a8b" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for layer in m.layers:\n", " layer.kern.variance.constrain_positive(warning=False)\n", "m.optimize(messages=False,max_iters=100)" ], "id": "79bda2b3-f8fa-4ab2-984f-dabc6b403297" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally we unconstrain the layer likelihoods and allow the full model to\n", "be trained for 1000 iterations." ], "id": "b87f5f20-8a5c-4d13-b91a-576fd8dc88c6" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for layer in m.layers:\n", " layer.likelihood.variance.constrain_positive(warning=False)\n", "m.optimize(messages=True,max_iters=10000)" ], "id": "4cd6d4db-9208-4870-b9bb-5226bf134b17" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visualize the latent space of the top layer\n", "\n", "Now the model is trained, let’s plot the mean of the posterior\n", "distributions in the top latent layer of the model." ], "id": "f24dccf9-4d5c-4783-a3aa-e8ff3949231c" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import mlai.plot as plot\n", "import mlai" ], "id": "7af43863-42f2-4c5a-81f2-baff91db14d9" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from matplotlib import rc" ], "id": "ce0824ae-a1a1-47f3-b70a-c80413d8e044" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "rc(\"font\", **{'family':'sans-serif','sans-serif':['Helvetica'],'size':20})\n", "fig, ax = plt.subplots(figsize=plot.big_figsize)\n", "for d in digits:\n", " ax.plot(m.layer_1.X.mean[labels==d,0],m.layer_1.X.mean[labels==d,1],'.',label=str(d))\n", "_ = plt.legend()\n", "mlai.write_figure(figure=fig, filename=\"./deepgp/mnist-digits-subsample-latent.svg\", transparent=True)" ], "id": "cd6cc9be-4d92-4d09-b714-05748fd5b7d3" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: Latent space for the deep Gaussian process learned through\n", "unsupervised learning and fitted to a subset of the MNIST digits\n", "subsample." ], "id": "a721f068-10aa-4ed5-aeb7-f4dc123810bf" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visualize the latent space of the intermediate layer\n", "\n", "We can also visualize dimensions of the intermediate layer. First the\n", "lengthscale of those dimensions is given by" ], "id": "cabb5bf3-fbae-4730-a99f-b1ba2d02b49a" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m.obslayer.kern.lengthscale" ], "id": "7f678cd8-c433-4cbb-9765-d0f16d58cfcb" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import mlai.plot as plot\n", "import mlai" ], "id": "0879a48c-eb10-4868-bbe5-b8df2b5434fb" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_figsize)\n", "for i in range(5):\n", " for j in range(i):\n", " dims=[i, j]\n", " ax.cla()\n", " for d in digits:\n", " ax.plot(m.obslayer.X.mean[labels==d,dims[0]],\n", " m.obslayer.X.mean[labels==d,dims[1]],\n", " '.', label=str(d))\n", " plt.legend()\n", " plt.xlabel('dimension ' + str(dims[0]))\n", " plt.ylabel('dimension ' + str(dims[1]))\n", " mlai.write_figure(figure=fig, filename=\"./deepgp/mnist-digits-subsample-hidden-\" + str(dims[0]) + '-' + str(dims[1]) + '.svg', transparent=True)" ], "id": "987018db-79ea-4431-b6ed-89df06f12642" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: Visualisation of the intermediate layer, plot of dimension 1\n", "vs dimension 0.\n", "\n", "\n", "\n", "Figure: Visualisation of the intermediate layer, plot of dimension 1\n", "vs dimension 0.\n", "\n", "\n", "\n", "Figure: Visualisation of the intermediate layer, plot of dimension 1\n", "vs dimension 0.\n", "\n", "\n", "\n", "Figure: Visualisation of the intermediate layer, plot of dimension 1\n", "vs dimension 0." ], "id": "040b911c-3c29-4661-a958-e4961614c008" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Generate From Model\n", "\n", "Now we can take a look at a sample from the model, by drawing a Gaussian\n", "random sample in the latent space and propagating it through the model." ], "id": "9c3284b1-0860-402f-bbf7-1faf8ed6519a" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\n", "rows = 10\n", "cols = 20\n", "t=np.linspace(-1, 1, rows*cols)[:, None]\n", "kern = GPy.kern.RBF(1,lengthscale=0.05)\n", "cov = kern.K(t, t)\n", "x = np.random.multivariate_normal(np.zeros(rows*cols), cov, num_latent).T" ], "id": "7e723ada-70a8-4a55-bcad-92d033aa0f97" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import mlai.plot as plot\n", "import mlai" ], "id": "46e0782c-e618-4489-888a-85ab6f5195f9" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "yt = m.predict(x)\n", "fig, axs = plt.subplots(rows,cols,figsize=(10,6))\n", "for i in range(rows):\n", " for j in range(cols):\n", " #v = np.random.normal(loc=yt[0][i*cols+j, :], scale=np.sqrt(yt[1][i*cols+j, :]))\n", " v = yt[0][i*cols+j, :]\n", " axs[i,j].imshow(v.reshape(28,28), \n", " cmap='gray', interpolation='none',\n", " aspect='equal')\n", " axs[i,j].set_axis_off()\n", "mlai.write_figure(figure=fig, filename=\"./deepgp/digit-samples-deep-gp.svg\", transparent=True)" ], "id": "dba8ff09-94ff-4eb3-bded-6cda009646dc" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: These digits are produced by taking a tour of the two\n", "dimensional latent space (as described by a Gaussian process sample) and\n", "mapping the tour into the data space. We visualize the mean of the\n", "mapping in the images." ], "id": "7c4789db-85cb-4f1d-8105-c87c59383328" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Deep Health\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "Figure: The deep health model uses different layers of abstraction in\n", "the deep Gaussian process to represent information about diagnostics and\n", "treatment to model interelationships between a patients different data\n", "modalities.\n", "\n", "From a machine learning perspective, we’d like to be able to interrelate\n", "all the different modalities that are informative about the state of the\n", "disease. For deep health, the notion is that the state of the disease is\n", "appearing at the more abstract levels, as we descend the model, we\n", "express relationships between the more abstract concept, that sits\n", "within the physician’s mind, and the data we can measure." ], "id": "f3262acc-2e56-42ed-81e9-94b8f0c20015" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Thanks!\n", "\n", "For more information on these subjects and more you might want to check\n", "the following resources.\n", "\n", "- twitter: [@lawrennd](https://twitter.com/lawrennd)\n", "- podcast: [The Talking Machines](http://thetalkingmachines.com)\n", "- newspaper: [Guardian Profile\n", " Page](http://www.theguardian.com/profile/neil-lawrence)\n", "- blog:\n", " [http://inverseprobability.com](http://inverseprobability.com/blog.html)" ], "id": "cefa71a4-8b33-47a6-acc2-72176dc2b22f" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## References" ], "id": "faecae7a-8ddb-4723-9c1d-d45e217a2cb9" }, { "cell_type": "markdown", "metadata": {}, "source": [ "CMU Motion Capture Lab, 2003. The CMU mocap database.\n", "\n", "Damianou, A., Lawrence, N.D., 2013. Deep Gaussian processes. pp.\n", "207–215.\n", "\n", "Della Gatta, G., Bansal, M., Ambesi-Impiombato, A., Antonini, D.,\n", "Missero, C., Bernardo, D. di, 2008. Direct targets of the TRP63\n", "transcription factor revealed by a combination of gene expression\n", "profiling and reverse engineering. Genome Research 18, 939–948.\n", "\n", "\n", "Ferris, B.D., Fox, D., Lawrence, N.D., 2007. WiFi-SLAM using Gaussian\n", "process latent variable models, in: Veloso, M.M. (Ed.), Proceedings of\n", "the 20th International Joint Conference on Artificial Intelligence\n", "(IJCAI 2007). pp. 2480–2485.\n", "\n", "Kalaitzis, A.A., Lawrence, N.D., 2011. A simple approach to ranking\n", "differentially expressed gene expression time courses through Gaussian\n", "process regression. BMC Bioinformatics 12.\n", "\n", "\n", "Lawrence, N.D., Moore, A.J., 2007. Hierarchical Gaussian process latent\n", "variable models. pp. 481–488." ], "id": "679ce0d1-7f6a-47d9-a379-3b579b6b63b1" } ], "nbformat": 4, "nbformat_minor": 5, "metadata": {} }