{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Emulation\n", "\n", "### [Neil D. Lawrence](http://inverseprobability.com), University of\n", "\n", "Cambridge\n", "\n", "### 2023-10-24" ], "id": "8ceced46-d778-402e-b1ef-c82cb0f48384" }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Abstract**: In this lecture we motivate the use of emulation and\n", "introduce the GPy software as a framework for building Gaussian process\n", "emulators." ], "id": "982e67bc-561d-4a36-98d7-fae5f51f263b" }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\n", "$$" ], "id": "be5b3911-e656-473e-b1bc-45cf3189a8f7" }, { "cell_type": "markdown", "metadata": {}, "source": [ "::: {.cell .markdown}\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "" ], "id": "e4d6c0c6-8ee2-4479-81de-ce63e18c0289" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "plt.rcParams.update({'font.size': 22})" ], "id": "c8dc9a95-f215-4772-b5d2-2f87d0918da5" }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ], "id": "0b824205-574e-4843-87ce-09e6fc7e3bad" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## notutils\n", "\n", "\\[edit\\]\n", "\n", "This small package is a helper package for various notebook utilities\n", "used below.\n", "\n", "The software can be installed using" ], "id": "4c6d7d4a-981f-472c-a816-8ce4c3067270" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install notutils" ], "id": "4d6558c3-c56c-4f96-a235-b7bd6325a551" }, { "cell_type": "markdown", "metadata": {}, "source": [ "from the command prompt where you can access your python installation.\n", "\n", "The code is also available on GitHub:\n", "\n", "\n", "Once `notutils` is installed, it can be imported in the usual manner." 
], "id": "bf4278a9-adea-444a-ab26-482f30d0c4f2" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import notutils" ], "id": "49702d12-dd12-4d3e-b8fd-2fc90e5b8692" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## mlai\n", "\n", "\\[edit\\]\n", "\n", "The `mlai` software is a suite of helper functions for teaching and\n", "demonstrating machine learning algorithms. It was first used in the\n", "Machine Learning and Adaptive Intelligence course in Sheffield in 2013.\n", "\n", "The software can be installed using" ], "id": "9d8906ca-0ceb-4acc-9279-99886abde830" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install mlai" ], "id": "2dcb52ef-9dd4-438e-9af2-750005953a5f" }, { "cell_type": "markdown", "metadata": {}, "source": [ "from the command prompt where you can access your python installation.\n", "\n", "The code is also available on GitHub: \n", "\n", "Once `mlai` is installed, it can be imported in the usual manner." ], "id": "8eeef27d-800e-4fee-9ed8-baa310808fb9" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import mlai" ], "id": "3939af06-488a-41f2-8435-feddedf5806f" }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Emulation\n", "\n", "There are a number of ways we can use machine learning to accelerate\n", "scientific discovery. But one way is to have the machine learning model\n", "learn the effect of the rules. Rather than worrying about the detail of\n", "the rules through computing each step, we can have the machine learning\n", "model look to abstract the rules and capture emergent phenomena, just as\n", "the Maxwell-Boltzmann distribution captures the essence of the behavior\n", "of the ideal gas.\n", "\n", "The challenges of Laplace’s gremlin present us with issues that we solve\n", "in a particular way, this is the focus of Chapter 6 in *The Atomic\n", "Human*." 
], "id": "d41c6ec0-f484-40b7-8b73-dc49c26201b3" }, { "cell_type": "markdown", "metadata": {}, "source": [ "# The Atomic Human\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "Figure: [The Atomic Human](https://www.amazon.co.uk/dp/B0CGZHBSLL)\n", "(Lawrence, 2024) due for release in June 2024.\n", "\n", "What follows is a quote form Chapter 6, which introduces Laplace’s\n", "gremlin and its consequences.\n", "\n", "In Douglas Adams’s *Hitchhiker’s Guide to the Galaxy* the computer Deep\n", "Thought is asked to provide the answer to the ‘great question’ of ‘life,\n", "the universe and everything’. After seven and a half million years of\n", "computation, Deep Thought has completed its program but is reluctant to\n", "give its creators the answer.\n", "\n", "> ‘You’re really not going to like it,’ observed Deep Thought. \n", "> ‘Tell us!’ \n", "> ‘All right,’ said Deep Thought. ‘The Answer to the Great Question . .\n", "> .’ \n", "> ‘Yes . . . !’ \n", "> ‘Of Life, the Universe and Everything …’ said Deep Thought. \n", "> ‘Yes … !’ \n", "> ‘Is …’ said Deep Thought, and paused. \n", "> ‘Yes … !’ \n", "> ‘Is …’ \n", "> ‘Yes … !!! … ?’ \n", "> ‘Forty-two,’ said Deep Thought, with infinite majesty and calm.\n", ">\n", "> Douglas Adams *The Hitchhiker’s Guide to the Galaxy*, 1979, Chapter 27\n", "\n", "After a period of shock from the questioners, the machine goes on to\n", "explain.\n", "\n", "> ‘I checked it very thoroughly,’ said the computer, ‘and that quite\n", "> definitely is the answer. I think the problem, to be quite honest with\n", "> you, is that you’ve never actually known what the question is.’\n", ">\n", "> Douglas Adams *The Hitchhiker’s Guide to the Galaxy*, 1979, Chapter 28\n", "\n", "To understand the question, Deep Thought goes on to agree to design a\n", "computer which will work out what the question is. In the book that\n", "machine is the planet Earth, and its operators are mice. 
Deep Thought’s\n", "idea is that the mice will observe the Earth and their observations will\n", "allow them to know what the Great Question is.\n", "\n", "To understand the consequences of Hawking’s Theory of Everything, we\n", "would have to carry out a scheme similar to Deep Thought’s. The Theory\n", "wouldn’t directly tell us that hurricanes exist or that when the sun\n", "sets on Earth the sky will have a red hue. It wouldn’t directly tell us\n", "that water will boil at 100 degrees centigrade. These consequences of\n", "the Theory would only play out once it was combined with the data to\n", "give us the emergent qualities of the Universe. The Deep Thought problem\n", "hints at the intractability of doing this. The computation required to\n", "make predictions from Laplace’s demon can be enormous; Deep Thought\n", "intends to create a planet to run it. As a result, this isn’t how our\n", "intelligence works in practice. The computations required are just too\n", "gargantuan. Relative to the scale of the Universe our brains are\n", "extremely limited. Fortunately though, to make these predictions, we\n", "don’t have to build our own Universe, because we’ve already got one.\n", "\n", "\n", "\n", "Figure: Rabbit and Pooh watch the result of Pooh’s hooshing idea to\n", "move Eeyore towards the shore.\n", "\n", "> When you are a Bear of Very Little Brain, and you Think of Things, you\n", "> find sometimes that a Thing which seemed very Thingish inside you is\n", "> quite different when it gets out into the open and has other people\n", "> looking at it.\n", ">\n", "> A.A. Milne as Winnie-the-Pooh in *The House at Pooh Corner*, 1928\n", "\n", "This comment from Pooh bear comes just as he’s tried to rescue his\n", "donkey friend, Eeyore, from a river by dropping a large stone on him\n", "from a bridge. 
Pooh’s idea had been to create a wave to push the donkey\n", "to the shore, a process that Pooh’s rabbit friend calls “hooshing”.\n", "\n", "Hooshing is a technique many children will have tried when retrieving a ball\n", "from a river. It can work, so Pooh’s idea wasn’t a bad one, but the\n", "challenge he faced was in its execution. Pooh aimed to the side of\n", "Eeyore; unfortunately, the stone fell directly on the stuffed donkey. But\n", "where is Laplace’s demon in hooshing? Just as we can talk about Gliders\n", "and Loafers in Conway’s Game of Life, we talk about stones and donkeys\n", "in our Universe. Pooh’s prediction that he can hoosh the donkey with the\n", "stone is not based on the Theory; it comes from observing the way\n", "objects interact in the actual Universe. Pooh is like the mice in\n", "Douglas Adams’s Earth. He is observing his environment. He looks for\n", "patterns in that environment. Pooh then borrows the computation that the\n", "Universe has already done for us. He has seen similar situations before;\n", "perhaps he once used a stone to hoosh a ball. He is then generalising\n", "from these previous circumstances to suggest that he can also hoosh the\n", "donkey. Despite being a bear of little brain, like the mice on Adams’s\n", "Earth, Pooh can answer questions about his universe by observing the\n", "results of the Theory of Everything playing out around him." ], "id": "976694cc-c1be-4f7c-a2d2-58a819d353d9" }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Surrogate Modelling in Practice\n", "\n", "The knowledge of one’s own limitations that Pooh shows is sometimes known\n", "as Socratic wisdom, and it’s a vital part of our intelligence. It\n", "expresses itself as humility and skepticism.\n", "\n", "In the papers we reviewed last lecture, neural networks are being used\n", "to speed up computations. In this course we’ve introduced Gaussian\n", "processes that will be used to speed up these computations. 
In both\n", "cases the ideas are similar. Rather than rerunning the simulation, we\n", "use data from the simulation to *fit* the neural network or the Gaussian\n", "process to the data.\n", "\n", "We’ll see an example of how this is done in a moment, taken from a\n", "simple ride hailing simulator, but before we look at that, we’ll first\n", "consider why this might be a useful approach.\n", "\n", "As we’ve seen from the very simple rules in the Game of Life, emergent\n", "phenomena we might be interested in take computation power to discover,\n", "just as Laplace’s and Dirac’s quotes suggest. The objective in surrogate\n", "modelling is to harness machine learning models to learn those physical\n", "characteristics." ], "id": "7da9d74b-a7e4-4140-966d-495d43410d30" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Types of Simulations\n", "\n", "We’ve introduced simulations from the perspective of laws of physics. In\n", "practice, many simulations may not directly encode for the laws of\n", "physics, but they might encode expert intuitions about a problem (like\n", "Pooh’s intuition about hooshing).\n", "\n", "For example, in Formula 1 races, the cars have tyres that wear at\n", "different rates. Softer tyres allow the cars to drive faster but wear\n", "quicker. Harder tyres man the car drives slower but they last longer.\n", "Changing between tyres is part of the race, and it has a time penalty.\n", "Before each race the teams decide what their strategy will be with tyre\n", "changes. It’s not only how many tyre changes that are important, but\n", "when they happen. If you change your tyre early, you might get a speed\n", "advantage and be able to pass your rival when they change their tyre\n", "later. 
This is a trick known as ‘undercutting’, but if your early change\n", "puts you back onto the track behind other slower cars, you will lose\n", "this advantage.\n", "\n", "Formula 1 teams determine their strategy through simulating the race.\n", "Each team knows how fast other teams are around the track, and what\n", "their top speeds are. So, the teams simulate many thousands or millions\n", "of races with different strategies for their rivals, and they choose the\n", "strategy that maximizes their expected number of points.\n", "\n", "Running many simulations takes time. During\n", "the actual race, the simulations are too slow to provide the real-time\n", "information teams would need. In this case, F1 teams can use emulators,\n", "models that have learnt the effect of the simulations, to give real-time\n", "updates.\n", "\n", "Formula 1 race simulations contain assumptions that derive from physics\n", "but don’t directly encode the physical laws. For example, if one car is\n", "stuck behind another, in any given lap, it might overtake. A typical\n", "race simulation will look at the lap speed of each car and the top speed\n", "of each car (as measured in ‘speed traps’ that are placed on the\n", "straight). It will assume a probability of overtake for each lap that is\n", "a function of these values. Of course, underlying that function is the\n", "physics of how cars overtake each other, but that can be abstracted away\n", "into a simpler function that the Race Strategy Engineer defines from\n", "their knowledge and previous experience.\n", "\n", "Many simulations have this characteristic: major parts of the simulation\n", "are the result of encoding expert knowledge in the code. But this can\n", "lead to challenges. I once asked a strategy engineer, who had completed\n", "a new simulation, how it was going. He replied that things had started\n", "out well, but over time its performance was degrading. 
We discussed this\n", "for a while, and over time a challenge of mis-specified granularity\n", "emerged." ], "id": "7b4d5e8f-1ca6-4b62-b4f0-a369e12bb61b" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fidelity of the Simulation\n", "\n", "The engineer explained how there’d been a race where the simulation had\n", "suggested that their main driver *shouldn’t* pit because he would have\n", "emerged behind a car with a slower lap speed, but a high top-speed. This\n", "would have made that car difficult to overtake. However, the driver of\n", "that slower car was also in the team’s ‘development program’, so\n", "everyone in the team knew that the slower car would have moved aside to\n", "let their driver through. Unfortunately, the simulation didn’t know\n", "this. So, the team felt the wrong strategy decision was made. After the\n", "race, the simulation was updated to include a special case for this\n", "situation. The new code checked whether the slower car was a development\n", "driver, making it ‘more realistic’.\n", "\n", "Over time there were a number of similar changes, each of which should\n", "have improved the simulation, but the reality was the code was now\n", "‘mixing granularities’. The formula for computing the probability of\n", "overtake as a function of speeds is one that is relatively easy to\n", "verify. It ignores the relationships between drivers, whether a given\n", "driver is a development driver, whether one bears a grudge or not,\n", "whether one is fighting for their place in the team. That’s all\n", "assimilated into the equation. 
The original equation is easy to\n", "calibrate, but as soon as you move to a finer granularity and consider\n", "more details about individual drivers, the model seems more realistic,\n", "but it becomes difficult to specify, and therefore performance degrades.\n", "\n", "Simulations work at different fidelities, but as the Formula 1 example\n", "shows, you must be very careful about mixing fidelities within the same\n", "simulation. The appropriate fidelity of a simulation is strongly\n", "dependent on the question being asked of it, on the context. For\n", "example, in Formula 1 races you can also simulate the performance of the\n", "car in the wind tunnel and using computational fluid dynamics\n", "representations of the Navier-Stokes equations. That level of fidelity\n", "*is* appropriate when designing the aerodynamic components of the car,\n", "but inappropriate when building a strategy simulation." ], "id": "576ead10-8654-4474-a056-ec025dd08565" }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Epidemiology\n", "\n", "The same concept of modelling at a particular fidelity comes up in\n", "epidemiology. Disease is transmitted by direct person-to-person\n", "interactions between individuals and objects. But in theoretical\n", "epidemiology, this is approximated by differential equations. The\n", "resulting models look very similar to reaction rate models used in\n", "Chemistry for well-mixed beakers. Let’s have a look at a simple example\n", "used for modelling the policy of ‘herd immunity’ for Covid19." ], "id": "7f56f181-6a3c-4f33-a1ad-bd3a36deb7a9" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Modelling Herd Immunity\n", "\n", "\\[edit\\]\n", "\n", "This example is taken from [Thomas House’s blog\n", "post](https://personalpages.manchester.ac.uk/staff/thomas.house/blog/modelling-herd-immunity.html)\n", "on Herd Immunity. 
This model was shared at the beginning of the Covid19\n", "pandemic when the first UK lockdown hadn’t yet occurred." ], "id": "c72a4004-8a51-452f-a0bb-2d39aa5a8520" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from scipy import integrate" ], "id": "1af52f6b-5feb-4faa-85ee-e039289bde0a" }, { "cell_type": "markdown", "metadata": {}, "source": [ "The next piece of code sets up the dynamics of the compartmental model.\n", "He doesn’t give the specific details in the blog post, but my\n", "understanding of the states is as follows. `x[0]` is the\n", "susceptible population, those that haven’t had the disease yet. The\n", "susceptible population decreases through encounters with infectious people.\n", "In Thomas’s model, both `x[3]` and `x[4]` are infectious compartments. So, the\n", "dynamics of the reduction of the susceptible population are given by $$\n", "\\frac{\\text{d}{S}}{\\text{d}t} = - \\beta S (I_1 + I_2).\n", "$$ Here, we’ve used $I_1$ and $I_2$ to represent what appears to be two\n", "separate infectious compartments in Thomas’s model. We’ll speculate\n", "about why there are two in a moment.\n", "\n", "The model appears to be an SEIR model, so rather than becoming\n", "infectious directly, you first move to an ‘exposed’ state, where you have the\n", "disease, but you are not yet infectious. There are again *two* exposed\n", "states; we’ll return to that in a moment. We denote the first, `x[1]`, by\n", "$E_1$. We have $$\n", "\\frac{\\text{d}{E_1}}{\\text{d}t} = \\beta S (I_1 + I_2) - \\sigma E_1.\n", "$$ Note that the first term matches the term from the Susceptible\n", "equation. 
This is because it is the incoming exposed population.\n", "\n", "The exposed population move to a second compartment of exposure, $E_2$.\n", "I believe the reason for this is that if you use only one exposure\n", "compartment, then the statistics of the duration of exposure are\n", "incorrect (implicitly they are exponentially distributed in the\n", "underlying stochastic version of the model). By using two exposure\n", "compartments, Thomas is making a slight correction to this which would\n", "impose a gamma distribution on those statistics. A similar\n", "trick is being deployed for the ‘infectious group’. So, we gain an\n", "additional equation to help with these statistics, $$\n", "\\frac{\\text{d}{E_2}}{\\text{d}t} = \\sigma E_1 - \\sigma E_2,\n", "$$ giving us the exposed group as the sum of the two compartments $E_1$\n", "and $E_2$. The exposed group from the second compartment then become\n", "‘infected’, which we represent with $I_1$; in the code this is `x[3]`,\n", "$$\n", "\\frac{\\text{d}{I_1}}{\\text{d}t} = \\sigma E_2 - \\gamma I_1,\n", "$$ and similarly, Thomas is using a two-compartment infectious group to\n", "fix up the duration model. So we have, $$\n", "\\frac{\\text{d}{I_2}}{\\text{d}t} = \\gamma I_1 - \\gamma I_2.\n", "$$ And finally, we have those that have recovered emerging from the\n", "second infectious compartment. In this model there is no separate model\n", "for ‘deaths’, so the recovered compartment, $R$, would also include\n", "those that die, $$\n", "\\frac{\\text{d}R}{\\text{d}t} = \\gamma I_2.\n", "$$ All these equations are then represented in code as follows."
], "id": "6e861b22-ef5d-4da7-8373-79e15699ef0b" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def odefun(t,x,beta0,betat,t0,t1,sigma,gamma):\n", " dx = np.zeros(6)\n", " if ((t>=t0) and (t<=t1)):\n", " beta = betat\n", " else:\n", " beta = beta0\n", " dx[0] = -beta*x[0]*(x[3] + x[4])\n", " dx[1] = beta*x[0]*(x[3] + x[4]) - sigma*x[1]\n", " dx[2] = sigma*x[1] - sigma*x[2]\n", " dx[3] = sigma*x[2] - gamma*x[3]\n", " dx[4] = gamma*x[3] - gamma*x[4]\n", " dx[5] = gamma*x[4]\n", " return dx" ], "id": "f886d082-c288-4b74-8181-eb4160e973f2" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Where the code takes in the states of the compartments (the values of\n", "`x`) and returns the gradients of those states for the provided\n", "parameters (`sigma`, `gamma` and `beta`). Those parameters are set\n", "according to the known characteristics of the disease.\n", "\n", "The next block of code sets up the parameters of the SEIR model. A\n", "particularly important parameter is the reproduction number ($R_0$),\n", "here Thomas has assumed a reproduction number of 2.5, implying that each\n", "infected member of the population transmits the infection up to 2.5\n", "other people. The effective $R$ decreases over time though, because some\n", "of those people they meet will no longer be in the susceptible group." 
], "id": "c3d6f3d1-b3cd-4bd8-909a-1b8d992c62c0" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Parameters of the model\n", "N = 6.7e7 # Total population\n", "i0 = 1e-4 # 0.5*Proportion of the population infected on day 0\n", "tlast = 365.0 # Consider a year\n", "latent_period = 5.0 # Days between being infected and becoming infectious\n", "infectious_period = 7.0 # Days infectious\n", "R0 = 2.5 # Basic reproduction number in the absence of interventions\n", "Rt = 0.75 # Reproduction number in the presence of interventions\n", "tend = 21.0 # Number of days of interventions" ], "id": "ff8d96a3-c3d0-4a15-a7d8-7315b7b313f5" }, { "cell_type": "markdown", "metadata": {}, "source": [ "The parameters are correct for the ‘discrete system’, where the\n", "infectious period is a discrete time, and the numbers are discrete\n", "values. To translate into our continuous differential equation system’s\n", "parameters, we need to do a couple of manipulations. Note the factor of\n", "2 associated with `gamma` and `sigma`. This is a doubling of the rate to\n", "account for the fact that there are two compartments for each of these\n", "states (to fix-up the statistics of the duration models)." ], "id": "420b5ab2-57eb-4f06-b563-0b1b6e7953a1" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "beta0 = R0 / infectious_period\n", "betat = Rt / infectious_period\n", "sigma = 2.0 / latent_period\n", "gamma = 2.0 / infectious_period" ], "id": "a26fadd1-eeb4-4184-8179-44a53dbacefd" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we solve the system using `scipy`’s initial value problem solver.\n", "The solution method is Runge-Kutta-Fehlberg method, as indicated by the\n", "`'RK45'` solver. This is a numerical method for solving differential\n", "equations. 
The ‘45’ indicates the orders of the method and its embedded error estimator.\n", "\n", "We can view the solver itself as a piece of simulation code in its own right, but\n", "here it’s being called as a subroutine in the system. It returns a\n", "solution for each time step, stored in a list `sol`.\n", "\n", "This is typical of this type of non-linear differential equation\n", "problem. Whether it’s partial differential equations or ordinary\n", "differential equations, there’s a step where a numerical solver needs to\n", "be called. These are often expensive to run. For climate and weather\n", "models, this would be where we solve the Navier-Stokes equations. For\n", "this simple model, the solution is relatively quick." ], "id": "076b665f-82cd-4942-adea-21ae8282aef1" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\n", "t0ran = np.array([-100, 40, 52.5, 65])\n", "sol=[]\n", "for tt in range(0,len(t0ran)):\n", "    sol.append(integrate.solve_ivp(lambda t,x: odefun(t,x,beta0,betat,t0ran[tt],t0ran[tt]+tend,sigma,gamma),\n", "                                   (0.0,tlast),\n", "                                   np.array([1.0-2.0*i0, 0.0, 0.0, i0, i0, 0.0]),\n", "                                   'RK45',\n", "                                   atol=1e-8,\n", "                                   rtol=1e-9))" ], "id": "9b3ed962-d444-4334-acfd-b4ac376f97a8" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import mlai.plot as plot\n", "import mlai" ], "id": "279057b3-6076-4bb2-ba07-3356f5110517" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def mylab(t):\n", "    if t>0:\n", "        return \"Start at \" + str(t) + \" days\"\n", "    else:\n", "        return \"Baseline\"" ], "id": "b6af0e46-0e92-4645-852f-3dbe6ffc2ffc" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(1, 2, figsize=plot.big_wide_figsize)\n", "for tt in range(0,len(t0ran)):\n", "    ax[0].plot(sol[tt].t,N*(sol[tt].y[3] + sol[tt].y[4]).T, label=mylab(t0ran[tt]))\n",
"ax[0].set_xlim([30,70])\n", "ax[0].set_ylim([0,7e6])\n", "ax[0].set_xlabel('Time (days)')\n", "ax[0].set_ylabel('Number of infectious cases')\n", "ax[0].legend()\n", "for tt in range(0,len(t0ran)):\n", " ax[1].plot(sol[tt].t,N*sol[tt].y[5].T, label=mylab(t0ran[tt]))\n", "ax[1].set_xlim([30,70])\n", "ax[1].set_ylim([0,1e7])\n", "ax[1].set_xlabel('Time (days)')\n", "ax[1].set_ylabel('Cumulative infections')\n", "ax[1].legend()\n", "\n", "mlai.write_figure('house-model-zoom.svg', directory='./simulation')" ], "id": "b15eac73-0847-4e4b-a99a-fdb20f87e1e7" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: A zoomed in version of Thomas House’s variation on the SEIR\n", "model for evaluating the effect of early interventions." ], "id": "cae31ace-f79c-40d6-9980-2c9d6dad9364" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(1, 2, figsize=plot.big_wide_figsize)\n", "for tt in range(0,len(t0ran)):\n", " ax[0].plot(sol[tt].t,N*(sol[tt].y[3] + sol[tt].y[4]).T, label=mylab(t0ran[tt]))\n", "ax[0].set_xlim([0,tlast])\n", "ax[0].set_ylim([0,1.2e7])\n", "ax[0].set_xlabel('Time (days)')\n", "ax[0].set_ylabel('Number of infectious cases')\n", "ax[0].legend()\n", "for tt in range(0,len(t0ran)):\n", " ax[1].plot(sol[tt].t,N*sol[tt].y[5].T, label=mylab(t0ran[tt]))\n", "ax[1].set_xlabel('Time (days)')\n", "ax[1].set_ylabel('Cumulative infections')\n", "ax[1].legend()\n", "ax[1].set_xlim([0,tlast])\n", "ax[1].set_ylim([0,6.2e7])\n", "\n", "mlai.write_figure('house-model-full.svg', directory='./simulation/')" ], "id": "b30fc5f5-345b-4c54-85a1-2acbb79551fd" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: The full progress of the disease in Thomas House’s variation\n", "on the SEIR model for evaluating the effect of early interventions.\n", "\n", "In practice, immunity for Covid19 may only last around 6 months. 
As an\n", "exercise, try to extend Thomas’s model for the case where immunity is\n", "temporary. You’ll need to account for deaths as well in your new model.\n", "\n", "Thinking about our Formula 1 example, and the differing levels of\n", "fidelity that might be included in a model, you can now imagine the\n", "challenges of doing large scale theoretical epidemiology. The\n", "compartment model is operating at a particular level of fidelity.\n", "Imagine trying to modify this model for a specific circumstance, like\n", "the way that the University of Cambridge chooses to do lectures. It’s\n", "not appropriate for this level of fidelity. You need to use different\n", "types of models for that decision making. [One of our case\n", "studies](https://mlatcl.github.io/mlphysical/casestudies/tti-explorer.html)\n", "looks at a simulation that was used to advise the government on the Test\n", "Trace Isolate program that took a different approach (The DELVE\n", "Initiative, 2020a)." ], "id": "109c5a56-bf2b-4ae4-86ba-490f6de6791a" }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Strategies for Simulation\n", "\n", "Within any simulation, we can roughly split the variables of interest\n", "into the state variables and the parameters. In the Herd immunity\n", "example, the state variables were the different susceptible, exposed,\n", "infectious and recovered groups. The parameters were the reproduction\n", "number and the expected lengths of infection and the timing of lockdown.\n", "Often parameters are viewed as the inputs to the simulation, the things\n", "we can control. We might want to know how to time lock down to minimize\n", "the number of deaths. This behavior of the simulator is what we may want\n", "to emulate with our Gaussian process model.\n", "\n", "So far, we’ve introduced simulation motivated by the physical laws of\n", "the universe. 
Those laws are sometimes encoded in differential\n", "equations, in which case we can try to solve those systems (like with\n", "Herd Immunity or Navier-Stokes). An alternative approach is taken in the\n", "Game of Life. There, a turn-based simulation is used: at each turn, we\n", "iterate through the simulation, updating its state.\n", "This is known as a *discrete event simulation*. In Formula 1 race\n", "simulation, a discrete event simulation is also used. There is another\n", "form of discrete event simulation, often used in chemical models, where\n", "the events don’t take place at regular intervals. Instead, the time to\n", "the next event is computed, and the simulator advances that amount of\n", "time. For an example of this, see [the Gillespie\n", "algorithm](https://en.wikipedia.org/wiki/Gillespie_algorithm).\n", "\n", "There is a third type of simulation that we’d also like to introduce.\n", "That is simulation within computer software. In particular, the need to\n", "backtest software with ‘what if’ ideas – known as counterfactual\n", "simulation – or to trace errors that may have occurred in production.\n", "This can involve loading up entire code bases and rerunning them with\n", "simulated inputs. This is a third form of simulation where emulation can\n", "also come in useful." ], "id": "8006a3ff-dcbf-4243-bb18-5920383a3377" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Backtesting Production Code\n", "\n", "At Amazon, the team I led looked at examples of simulation and emulation\n", "ranging from Prime Air drones to the Amazon Supply Chain. In a\n", "purchasing system, the idea is to store stock to balance supply and\n", "demand. The aim is to keep product in stock for quick dispatch while\n", "keeping prices (and therefore costs) low. This idea is at the heart of\n", "Amazon’s focus on customer experience."
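, "\n", "The trade-off can be sketched with a toy model (hypothetical numbers,\n", "nothing to do with Amazon’s actual systems): demand is random each day,\n", "and a higher stock target means fewer days with unmet demand but more\n", "capital tied up in inventory.\n", "\n", "```python\n", "import numpy as np\n", "\n", "results = {}\n", "for target in [30, 35, 40, 45]:\n", "    rng = np.random.default_rng(0)           # same demand sequence for each target\n", "    demands = rng.poisson(lam=30, size=365)  # a year of random daily demand\n", "    results[target] = int(np.sum(demands > target))\n", "\n", "for target, stockout_days in results.items():\n", "    print(f\"stock target {target}: {stockout_days} days with unmet demand\")\n", "```\n"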
], "id": "49ec3ca1-f49d-4cb1-a6b9-0d608eed6d3b" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Alexa\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", " \n", "\n", "\n", "\n", " \n", "\n", "\n", "\n", "Tom Taylor\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", " \n", "\n", "\n", "\n", " \n", "\n", "\n", "\n", "Joe Walowski\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", " \n", "\n", "\n", "\n", " \n", "\n", "\n", "\n", "Rohit Prasad\n", "\n", "\n", "\n", "\n", "\n", "" ], "id": "9439ad8b-2ed7-4716-bbf3-5713335c6576" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.lib.display import YouTubeVideo\n", "YouTubeVideo('VQVZ2hvNVfo')" ], "id": "c3a33085-1e33-44d5-8cf4-25c53a8a6df0" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Figure: Alexa Architecture overview taken from the Alexa Skills\n", "programme.\n", "\n", "\n", "\n", " \n", "\n", "\n", "\n", " \n", "\n", "\n", "\n", "Rohit Prasad\n", "\n", "\n", "\n", "\n", "\n", "" ], "id": "492602f6-b800-449c-9d9b-aac0d2757671" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.lib.display import YouTubeVideo\n", "YouTubeVideo('wa8DU-Sui8Q')" ], "id": "6fef8fbc-b7e2-4b4e-a33d-4bed918f703c" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Figure: Rohit Prasad talking about Alexa at the Amazon 2019 re:MARS\n", "event.\n", "\n", "\n", "\n", " \n", "\n", "\n", "\n", " \n", "\n", "\n", "\n", "Tom Taylor\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", " \n", "\n", "\n", "\n", " \n", "\n", "\n", "\n", "Joe Walowski\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", " \n", "\n", "\n", "\n", " \n", "\n", "\n", "\n", "Rohit Prasad\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "Figure: Simple schmeatic of an intelligent agent’s components." 
], "id": "d85f20a2-0c02-458a-82f3-68171e8242b5" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Speech to Text\n", "\n", "Catherine Breslin" ], "id": "d72ecdaa-9d62-426e-9a6b-d23528d272ba" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Cloud Service: Knowledge Base\n", "\n", "David Hardcastle\n", "\n", "Arpit Mittal\n", "\n", "Christos Christodoulopoulos" ], "id": "a8554fde-fecd-49d1-a3ec-eb3e81c6d2b2" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Text to Speech\n", "\n", "Andrew Breen\n", "\n", "Roberto Barra Chicote" ], "id": "ff6a74b9-b401-4956-b965-cc960bf28f38" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prime Air\n", "\n", "\\[edit\\]\n", "\n", "One project where the components of machine learning and the physical\n", "world come together is Amazon’s Prime Air drone delivery system.\n", "\n", "Automating the process of moving physical goods through autonomous\n", "vehicles completes the loop between the ‘bits’ and the ‘atoms’. In other\n", "words, the information and the ‘stuff’. The idea of the drone is to\n", "complete a component of package delivery, the notion of last mile\n", "movement of goods, but in a fully autonomous way.\n", "\n", "Gur Kimchi\n", "\n", "Paul Viola\n", "\n", "David Moro" ], "id": "5c5ab2ad-2c39-4e62-a6a5-ab6f78ed9034" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.lib.display import YouTubeVideo\n", "YouTubeVideo('3HJtmx5f1Fc')" ], "id": "95a359e2-52db-486e-a9d7-b814e9fcc731" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Figure: An actual ‘Santa’s sleigh’. Amazon’s prototype delivery\n", "drone. Machine learning algorithms are used across various systems\n", "including sensing (computer vision for detection of wires, people, dogs,\n", "etc.) and piloting. The technology is necessarily a combination of old\n", "and new ideas. The transition from vertical to horizontal flight is\n", "vital for efficiency and is achieved using sophisticated machine\n", "learning.\n", "\n", "As Jeff Wilke (who was CEO of Amazon Retail at the time) [announced in\n", "June\n", "2019](https://blog.aboutamazon.com/transportation/a-drone-program-taking-flight)\n", "the technology is ready, but still needs operationalization, including,\n", "for example, regulatory approval."
], "id": "1df11f41-e0e2-4c84-ac68-9e3501f5ccab" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.lib.display import YouTubeVideo\n", "YouTubeVideo('wa8DU-Sui8Q')" ], "id": "814e19e8-fc08-418b-b8c5-98c1954a144f" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Figure: Jeff Wilke (CEO Amazon Consumer) announcing the new drone at\n", "the Amazon 2019 re:MARS event alongside the scale of the Amazon supply\n", "chain.\n", "\n", "> When we announced earlier this year that we were evolving our Prime\n", "> two-day shipping offer in the U.S. to a one-day program, the response\n", "> was terrific. But we know customers are always looking for something\n", "> better, more convenient, and there may be times when one-day delivery\n", "> may not be the right choice. Can we deliver packages to customers even\n", "> faster? We think the answer is yes, and one way we’re pursuing that\n", "> goal is by pioneering autonomous drone technology.\n", "\n", "> Today at Amazon’s re:MARS Conference (Machine Learning, Automation,\n", "> Robotics and Space) in Las Vegas, we unveiled our latest Prime Air\n", "> drone design. We’ve been hard at work building fully electric drones\n", "> that can fly up to 15 miles and deliver packages under five pounds to\n", "> customers in less than 30 minutes. And, with the help of our\n", "> world-class fulfillment and delivery network, we expect to scale Prime\n", "> Air both quickly and efficiently, delivering packages via drone to\n", "> customers within months.\n", "\n", "The 15 miles in less than 30 minutes implies airspeeds of\n", "around 50 kilometers per hour.\n", "\n", "> Our newest drone design includes advances in efficiency, stability\n", "> and, most importantly, in safety. It is also unique, and it advances\n", "> the state of the art. How so? First, it’s a hybrid design. It can do\n", "> vertical takeoffs and landings – like a helicopter. 
And it’s efficient\n", "> and aerodynamic—like an airplane. It also easily transitions between\n", "> these two modes—from vertical-mode to airplane mode, and back to\n", "> vertical mode.\n", "\n", "> It’s fully shrouded for safety. The shrouds are also the wings, which\n", "> makes it efficient in flight.\n", "\n", "\n", "\n", "Figure: Picture of the drone from the Amazon re:MARS event in 2019.\n", "\n", "> Our drones need to be able to identify static and moving objects\n", "> coming from any direction. We employ diverse sensors and advanced\n", "> algorithms, such as multi-view stereo vision, to detect static objects\n", "> like a chimney. To detect moving objects, like a paraglider or\n", "> helicopter, we use proprietary computer-vision and machine learning\n", "> algorithms.\n", "\n", "> A customer’s yard may have clotheslines, telephone wires, or\n", "> electrical wires. Wire detection is one of the hardest challenges for\n", "> low-altitude flights. Through the use of computer-vision techniques\n", "> we’ve invented, our drones can recognize and avoid wires as they\n", "> descend into, and ascend out of, a customer’s yard." ], "id": "f301f0be-8256-4387-ba9f-765e8af72319" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Supply Chain Optimization\n", "\n", "\\[edit\\]\n", "\n", "Llew Mason\n", "\n", "Devesh Mishra\n", "\n", "Supply chain is the process of matching the supply of the\n", "product with demand for the product. This matching process is done\n", "through the deployment of resources: places to store the products, ways\n", "and means of transporting the product and even the ability to transform\n", "products from one form to another (e.g. transforming paper into books\n", "through printing)."
], "id": "bab52496-dbc8-4e5f-a3e8-18d89c39a1ef" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.lib.display import YouTubeVideo\n", "YouTubeVideo('ncwsr1Of6Cw')" ], "id": "02ca0eb8-8c26-454b-984d-f8890a40797f" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Figure: Promotional video for the Amazon supply chain optimization\n", "team.\n", "\n", "Arguably the Amazon supply chain is (or was) the largest automated\n", "decision-making system in the world in terms of the amount of money\n", "spent and the quantity of product moved through automated decision\n", "making." ], "id": "72ed2742-18ce-4e06-a8ed-f9a20432eb27" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Supply Chain Optimization\n", "\n", "Llew Mason\n", "\n", "Devesh Mishra\n", "\n", "At the heart of an automated supply chain is the buying system, which\n", "determines the optimal stock level for products and makes purchases\n", "based on the difference between that stock level and the current stock\n", "level.\n", "\n", "\n", "\n", "Figure: A schematic of a typical buying system for supply chain.\n", "\n", "To make these decisions, predictive models (often machine learning or\n", "statistical models) have to be deployed. For example, the demand for a\n", "particular product needs to be forecast."
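, "\n",
"\n",
"The idea of an optimal stock level can be illustrated with the classic\n",
"newsvendor calculation. The following is a minimal sketch only, not\n",
"Amazon’s actual system; the costs and the demand distribution here are\n",
"invented for illustration.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"# Newsvendor sketch: choose stock q to balance the cost of a lost sale\n",
"# (underage) against the cost of unsold stock (overage).\n",
"price, cost, salvage = 10.0, 4.0, 1.0\n",
"underage = price - cost    # margin lost per unit of unmet demand\n",
"overage = cost - salvage   # loss per unit of unsold stock\n",
"critical_fractile = underage / (underage + overage)\n",
"\n",
"# The optimal stock is the corresponding quantile of the demand\n",
"# distribution, here represented by (simulated) demand samples.\n",
"rng = np.random.default_rng(0)\n",
"demand_samples = rng.normal(100, 20, 10000)\n",
"q_star = np.quantile(demand_samples, critical_fractile)\n",
"```\n",
"\n",
"Because a lost sale here costs more than holding an unsold unit, the\n",
"critical fractile is above one half and the optimal stock sits above\n",
"mean demand."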
], "id": "9e73901c-d00d-4b09-bc09-107d926d79f6" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Forecasting\n", "\n", "Jenny Freshwater\n", "\n", "Ping Xu\n", "\n", "Dean Foster" ], "id": "ec09ff78-e22b-4748-8c7f-55017d034255" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.lib.display import YouTubeVideo\n", "YouTubeVideo('wa8DU-Sui8Q')" ], "id": "f93f32bb-3827-4969-941d-259fbbd297ca" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Figure: Jenny Freshwater speaking at the Amazon re:MARS event in June\n", "2019.\n", "\n", "The process of forecasting is described by Jenny Freshwater (at the time\n", "Director for Forecasting within Amazon’s Supply Chain Optimization team)\n", "in the video in Figure .\n", "\n", "For each product in the Amazon catalogue, the demand is forecast across\n", "a given future period."
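, "\n",
"\n",
"As a toy illustration of this forecasting step, the demand for a single\n",
"product can be forecast with simple exponential smoothing. The demand\n",
"series below is invented; production forecasters use far richer models.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"# Simple exponential smoothing of a (made-up) weekly demand series.\n",
"demand = np.array([12.0, 15.0, 14.0, 16.0, 19.0, 18.0, 21.0, 20.0])\n",
"alpha = 0.3  # smoothing factor: larger values track recent demand more closely\n",
"level = demand[0]\n",
"for d in demand[1:]:\n",
"    level = alpha * d + (1 - alpha) * level\n",
"forecast = level  # flat forecast across the future period\n",
"```\n",
"\n",
"The smoothed level lags the rising series, which is the usual trade-off\n",
"between responsiveness and noise suppression in this family of models."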
], "id": "f8d4a1cf-1204-406a-a92f-f49777e2ab45" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Inventory and Buying\n", "\n", "Deepak Bhatia\n", "\n", "Piyush Saraogi\n", "\n", "Raman Iyer\n", "\n", "Salal Humair\n", "\n", "Narayan Venkatasubramanyan\n", "\n", "Forecast information is combined with predictions around lead times from\n", "suppliers, understanding of the network’s capacity (in terms of how much\n", "space is available in which fulfillment centers), the cost of storing\n", "and transporting products and the “value” for the consumer of finding\n", "the product in stock. These models are typically operational research\n", "models (such as the “newsvendor problem”) combined with machine learning\n", "and/or statistical forecasts." ], "id": "2234d9da-0448-4757-9208-c4735fb20675" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Buying System\n", "\n", "\\[edit\\]\n", "\n", "An example of a complex decision-making system might be an automated\n", "buying system. In such a system, the idea is to match demand for\n", "products to supply of products.\n", "\n", "The matching of demand and supply is a repetitive theme for decision\n", "making systems. Not only does it occur in automated buying, but also in\n", "the allocation of drivers to riders in a ride sharing system. 
Or in the\n", "allocation of compute resource to users in a cloud system.\n", "\n", "The components of any of these systems include predictions of the demand\n", "for the product, the drivers, or the compute, and predictions of the\n", "supply. Decisions are then made about how much material to keep in\n", "stock, or how many drivers to have on the road, or how much computer\n", "capacity to have in your data centers. These decisions have cost\n", "implications. The optimal amount of product will depend on the cost of\n", "making it available. For a buying system these are the storage costs.\n", "\n", "Decisions are made based on the supply and demand to make new orders, to\n", "encourage more drivers to come into the system or to build new data\n", "centers or rent more computational power.\n", "\n", "\n", "\n", "Figure: The components of a putative automated buying system" ], "id": "8790fe75-111e-417d-90aa-4554dd28b8a6" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Monolithic System\n", "\n", "\\[edit\\]\n", "\n", "The classical approach to building these systems was a ‘monolithic\n", "system’. Built in a similar way to the successful applications software\n", "such as Excel or Word, or large operating systems, a single code base\n", "was constructed. The complexity of such code bases runs to many lines.\n", "\n", "In practice, shared dynamically linked libraries may be used for aspects\n", "such as user interface, or networking, but the software often has many\n", "millions of lines of code. 
For example, the Microsoft Office suite is\n", "said to contain over 30 million lines of code.\n", "\n", "Charlie Bell\n", "\n", "Peter Vosshall\n", "\n", "Figure: A potential path of models in a machine learning system.\n", "\n", "Such software is not only difficult to develop, but also to scale when\n", "computation demands increase. Amazon’s original website software (called\n", "Obidos) was a [monolithic\n", "design](https://en.wikipedia.org/wiki/Obidos_(software)) but by the\n", "early noughties it was becoming difficult to sustain and maintain. The\n", "software was phased out in 2006 to be replaced by modularized software\n", "known as a ‘service-oriented architecture’." ], "id": "3035a6b2-608a-49a1-bf53-3f0854dc3f30" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Service Oriented Architecture\n", "\n", "Charlie Bell\n", "\n", "Peter Vosshall\n", "\n", "In Service Oriented Architecture, or “Software as a Service”, the idea\n", "is that code bases are modularized and communicate with one another\n", "using network requests. A standard approach is to use a [REST\n", "API](https://en.wikipedia.org/wiki/Representational_state_transfer). So,\n", "rather than a single monolithic code base, the code is developed with\n", "individual services that handle the different requests.\n", "\n", "Figure: A potential path of models in a machine learning system.\n", "\n", "Modern software development uses an approach known as *service-oriented\n", "architecture* to build highly complex systems. 
Such systems have similar\n", "emergent properties to Conway’s “Game of Life”. Understanding these\n", "emergent properties is vitally important when diagnosing problems in the\n", "system." ], "id": "f3e2b210-8373-49c3-ad6d-bd451a3255e5" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Intellectual Debt\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "Figure: Jonathan Zittrain’s term to describe the challenges of\n", "explanation that come with AI is Intellectual Debt.\n", "\n", "In the context of machine learning and complex systems, Jonathan\n", "Zittrain has coined the term [“Intellectual\n", "Debt”](https://medium.com/berkman-klein-center/from-technical-debt-to-intellectual-debt-in-ai-e05ac56a502c)\n", "to describe the challenge of understanding what you’ve created. In [the\n", "ML@CL group we’ve been focussing on developing the notion of a\n", "*data-oriented\n", "architecture*](https://mlatcl.github.io/projects/data-oriented-architectures-for-ai-based-systems.html)\n", "to deal with intellectual debt (Cabrera et al., 2023).\n", "\n", "Zittrain points out the challenge around the lack of interpretability of\n", "individual ML models as the origin of intellectual debt. In machine\n", "learning I refer to work in this area as fairness, interpretability and\n", "transparency or FIT models. To an extent I agree with Zittrain, but if\n", "we understand the context and purpose of the decision making, I believe\n", "this is readily put right by the correct monitoring and retraining\n", "regime around the model, a concept I refer to as “progression testing”.\n", "Indeed, the best teams do this at the moment, and their failure to do it\n", "feels more a matter of technical debt than intellectual debt,\n", "because arguably it is a maintenance task rather than an explanation\n", "task. After all, we have good statistical tools for interpreting\n", "individual models and decisions when we have the context. 
We can\n", "linearise around the operating point, we can perform counterfactual\n", "tests on the model. We can build empirical validation sets that explore\n", "fairness or accuracy of the model.\n", "\n", "This is the landscape we now find ourselves in with regard to software\n", "development. In practice, each of these services is often ‘owned’ and\n", "maintained by an individual team. The team is judged by the quality of\n", "their service provision. They work to detailed specifications on what\n", "their service should output, what its availability should be and other\n", "objectives like speed of response. This allows for conditional\n", "independence between teams and for faster development.\n", "\n", "Clearly Conway’s Game of Life exhibits an enormous amount of\n", "intellectual debt; indeed, that was the idea. Build something simple\n", "that exhibits great complexity. That’s what makes it so popular. But in\n", "deployed system software, intellectual debt is a major headache and\n", "emulation presents one way of dealing with it.\n", "\n", "Unfortunately, this modularity also makes sophisticated software systems\n", "a breeding ground for intellectual debt, particularly when they contain\n", "components which are themselves ML components. Dealing with this\n", "challenge is a major objective of my Senior AI Fellowship at the Alan\n", "Turing Institute. You can see me talking about the problems [at this\n", "recent seminar given virtually in\n", "Manchester](http://inverseprobability.com/talks/notes/deploying-machine-learning-systems-intellectual-debt-and-auto-ai.html)." ], "id": "c71d8fd9-f58f-4f58-b041-a9f0b2550fcf" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Examples in Python\n", "\n", "There are a number of different simulations available in python, and\n", "tools specifically designed for simulation. 
For example `simpy` has a\n", "[list of example\n", "simulations](https://simpy.readthedocs.io/en/latest/examples/) around\n", "machine shops, gas stations and car washes.\n", "\n", "Operations research ideas like the newsvendor problem can also be\n", "solved; Andrea Fabry provides an example of the newsvendor problem in\n", "python for ordering [NFL replica jerseys\n", "here](https://github.com/fabryandrea/newsvendor).\n", "\n", "The newsvendor problem is also a component in the Amazon supply chain.\n", "The Amazon supply chain makes predictions about supply and demand,\n", "combines them with a cost basis and computes optimal stock. Inventory\n", "orders are based on the difference between this optimal stock and\n", "current stock. The [MiniSCOT\n", "simulation](https://github.com/amzn/supply-chain-simulation-environment)\n", "provides code that illustrates this.\n", "\n", "Control problems are another example: systems that aim to achieve\n", "objectives given a set of parameters. A classic control problem is the\n", "inverted pendulum. This [python\n", "code](https://www.moorepants.info/blog/npendulum.html) generalises the\n", "standard inverted pendulum to one with $n$ links using symbolic python\n", "code.\n", "\n", "In reinforcement learning similar control problems are studied; a\n", "classic reinforcement learning problem is known as the [Mountain\n", "Car](https://github.com/openai/gym/blob/master/gym/envs/classic_control/mountain_car.py).\n", "You can work with this problem using an environment known as [OpenAI\n", "gym](https://github.com/openai/gym) that includes many different\n", "reinforcement learning scenarios.\n", "\n", "In neuroscience Hodgkin and Huxley studied the giant axon in squid to\n", "work out a set of differential equations that are still used today to\n", "model spiking neurons. 
Mark Kramer explains how to [simulate them in\n", "python here](https://mark-kramer.github.io/Case-Studies-Python/HH.html).\n", "\n", "All Formula One race teams simulate the race to determine their strategy\n", "(tyres, pit stops etc). While those simulations are proprietary, there\n", "is sufficient interest in the wider community that race simulations have\n", "been developed in python. Here is [one that Nick Roth has made\n", "available](https://github.com/rothnic/formulapy).\n", "\n", "Formula one teams also make use of simulation to design car parts. There\n", "are at least two components to this: simulations that allow the\n", "modelling of fluid flow across the car, and simulations that analyze the\n", "capabilities of materials under stress. You can find computational\n", "[fluid dynamics simulations in python\n", "here](https://github.com/barbagroup/CFDPython) and [finite element\n", "analysis methods for computing stress in materials\n", "here](https://solidspy.readthedocs.io/en/latest/readme.html).\n", "\n", "Alongside continuous system simulation through partial differential\n", "equations, another form of simulation that arises is known as *discrete\n", "event simulation*. This comes up in [computer\n", "networks](https://github.com/mkalewski/sim2net) and also in chemical\n", "systems where the approach to simulating is known as the [Gillespie\n", "algorithm](https://github.com/karinsasaki/gillespie-algorithm-python/)."
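, "\n",
"\n",
"A minimal sketch of the Gillespie idea for a single decay reaction shows\n",
"why events do not occur at regular intervals: the time to the next event\n",
"is drawn from an exponential distribution whose rate depends on the\n",
"current state. The rate constant and population below are illustrative.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"# Gillespie sketch for the decay reaction A -> 0 with rate constant k.\n",
"rng = np.random.default_rng(0)\n",
"k, n, t = 0.1, 100, 0.0\n",
"times, counts = [t], [n]\n",
"while n > 0:\n",
"    propensity = k * n                      # total event rate\n",
"    t += rng.exponential(1.0 / propensity)  # time to the next event\n",
"    n -= 1                                  # one molecule decays\n",
"    times.append(t)\n",
"    counts.append(n)\n",
"```\n",
"\n",
"As the population shrinks, the propensity falls and the gaps between\n",
"events grow, which a fixed-step simulation would capture only crudely."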
], "id": "36d8de43-a856-45ca-b826-e6d25f4540f1" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Simulation System\n", "\n", "\\[edit\\]\n", "\n", "An example of a complex decision-making system might be a climate model;\n", "in such a system there are separate models for the atmosphere, the ocean\n", "and the land.\n", "\n", "The components of these systems include the flow of currents, chemical\n", "interactions in the upper atmosphere, evaporation of water, etc.\n", "\n", "\n", "\n", "Figure: Representation of the Carbon Cycle from the US National\n", "Oceanic and Atmospheric Administration. While everything is\n", "interconnected in the system, we can decompose into separate models for\n", "atmosphere, ocean, land.\n", "\n", "The influence of human activity also needs to be incorporated and\n", "modelled so we can make judgments about how to mitigate the effects of\n", "global warming.\n", "\n", "\n", "\n", "Figure: The components of a simulation system for climate\n", "modelling." ], "id": "2bc6f287-4e87-4f66-9724-040c00980837" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Monolithic System\n", "\n", "The classical approach to building these systems was a ‘monolithic\n", "system’. Built in a similar way to the successful applications software\n", "such as Excel or Word, or large operating systems, a single code base\n", "was constructed. The complexity of such code bases runs to many lines.\n", "\n", "In practice, shared dynamically linked libraries may be used for aspects\n", "such as user interface, or networking, but the software often has many\n", "millions of lines of code. For example, the Microsoft Office suite is\n", "said to contain over 30 million lines of code.\n", "\n", "\n", "\n", "Figure: A potential path of models in a machine learning system."
], "id": "84b322b8-a138-45fe-b386-cbe9fd237816" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Service Oriented Architecture\n", "\n", "Such software is not only difficult to develop, but also to scale when\n", "computation demands increase. For example, Amazon’s original website\n", "software (called Obidos) was a [monolithic\n", "design](https://en.wikipedia.org/wiki/Obidos_(software)) but by the\n", "early noughties it was becoming difficult to sustain and maintain. The\n", "software was phased out in 2006 to be replaced by modularized software\n", "known as a ‘service-oriented architecture’.\n", "\n", "In Service Oriented Architecture, or “Software as a Service”, the idea\n", "is that code bases are modularized and communicate with one another\n", "using network requests. A standard approach is to use a [REST\n", "API](https://en.wikipedia.org/wiki/Representational_state_transfer). So,\n", "rather than a single monolithic code base, the code is developed with\n", "individual services that handle the different requests.\n", "\n", "The simulation software is turned inside out to expose the individual\n", "components to the operator.\n", "\n", "\n", "\n", "Figure: A potential path of models in a machine learning system.\n", "\n", "This is the landscape we now find ourselves in for software development.\n", "In practice, each of these services is often ‘owned’ and maintained by\n", "an individual team. The team is judged by the quality of their service\n", "provision. They work to detailed specifications on what their service\n", "should output, what its availability should be and other objectives like\n", "speed of response. This allows for conditional independence between\n", "teams and for faster development.\n", "\n", "One question is to what extent the same approach is possible, or\n", "desirable, for scientific models. The components we listed above are\n", "already separated and often run independently. 
But those components themselves\n", "are made up of other sub-components that could also be exposed in a\n", "similar manner to software-as-a-service, giving us the notion of\n", "“simulation as a service”.\n", "\n", "One thing about working in an industrial environment is the way that\n", "short-term actions become important. For example, in Formula\n", "One, the teams are working on a two-week cycle to digest information\n", "from the previous week’s race and incorporate updates to the car or\n", "their strategy.\n", "\n", "However, businesses must also think about more medium-term horizons. For\n", "example, in Formula 1 you need to worry about next year’s car. So, while\n", "you’re working on updating this year’s car, you also need to think about\n", "what will happen for next year and prioritize these conflicting needs\n", "appropriately.\n", "\n", "In the Amazon supply chain, there are equivalent demands. If we accept\n", "that an artificial intelligence is just an automated decision-making\n", "system, and if we measure in terms of money automatically spent, or\n", "goods automatically moved, then Amazon’s buying system is perhaps the\n", "world’s largest AI.\n", "\n", "Those decisions are being made on short time schedules; purchases are\n", "made by the system on weekly cycles. But just as in Formula 1, there is\n", "also a need to think about what needs to be done next month, next\n", "quarter and next year. Planning meetings are held not only on a weekly\n", "basis (known as weekly business reviews), but monthly, quarterly, and\n", "then yearly meetings for planning spends and investments.\n", "\n", "Amazon is known for thinking longer term than many companies, and\n", "a lot of this is coming from the CEO. 
One quote from Jeff Bezos that\n", "stuck with me was the following.\n", "\n", "> “I very frequently get the question: ‘What’s going to change in the\n", "> next 10 years?’ And that is a very interesting question; it’s a very\n", "> common one. I almost never get the question: ‘What’s not going to\n", "> change in the next 10 years?’ And I submit to you that that second\n", "> question is actually the more important of the two – because you can\n", "> build a business strategy around the things that are stable in time. …\n", "> \\[I\\]n our retail business, we know that customers want low prices,\n", "> and I know that’s going to be true 10 years from now. They want fast\n", "> delivery; they want vast selection. It’s impossible to imagine a\n", "> future 10 years from now where a customer comes up and says, ‘Jeff I\n", "> love Amazon; I just wish the prices were a little higher,’ \\[or\\] ‘I\n", "> love Amazon; I just wish you’d deliver a little more slowly.’\n", "> Impossible. And so the effort we put into those things, spinning those\n", "> things up, we know the energy we put into it today will still be\n", "> paying off dividends for our customers 10 years from now. When you\n", "> have something that you know is true, even over the long term, you can\n", "> afford to put a lot of energy into it.”\n", "\n", "This quote is incredibly important for long term thinking. Indeed, it’s\n", "a failure of many of our simulations that they focus on what is going to\n", "happen, not what will not happen. In Amazon, this meant that there was\n", "constant focus on these three areas: keeping costs low, making delivery\n", "fast and improving selection. For example, shortly before I left, Amazon\n", "moved its entire US network from two-day delivery to one-day delivery.\n", "This involves changing the way the entire buying system operates. 
Or,\n", "more recently, the company has had to radically change the portfolio of\n", "products it buys in the face of Covid-19.\n", "\n", "\n", "\n", "\n", "\n", "Figure: Experiment, analyze and design is a flywheel of knowledge\n", "that is the dual of the model, data and compute. By running through this\n", "spiral, we refine our hypothesis/model and develop new experiments which\n", "can be analyzed to further refine our hypothesis.\n", "\n", "From the perspective of the team that we had in the supply chain, we\n", "looked at what we most needed to focus on. Amazon moves very quickly,\n", "but we could also take a leaf out of Jeff’s book, and instead of\n", "worrying about what was going to change, remember what wasn’t going to\n", "change.\n", "\n", "> We don’t know what science we’ll want to do in five years’ time, but\n", "> we won’t want slower experiments, we won’t want more expensive\n", "> experiments and we won’t want a narrower selection of experiments.\n", "\n", "As a result, our focus was on how to speed up the process of\n", "experiments, increase the diversity of experiments that we can do, and\n", "keep the price of experiments as low as possible.\n", "\n", "The faster the innovation flywheel can be iterated, the quicker we\n", "can ask about different parts of the supply chain, and the better we can\n", "tailor systems to answering those questions.\n", "\n", "We need faster, cheaper and more diverse experiments, which implies we\n", "need better ecosystems for experimentation. This has led us to focus on\n", "the software frameworks we’re using to develop machine learning systems,\n", "including data oriented architectures (Borchert, 2020; Lawrence, 2019;\n", "Vorhemus and Schikuta, 2017; Joshi, 2007), data maturity\n", "assessments (Lawrence et al. 
(2020)) and data readiness levels (see this\n", "blog post on [Data Readiness\n", "Levels](http://inverseprobability.com/2017/01/12/data-readiness-levels),\n", "Lawrence (2017) and The DELVE Initiative (2020b))." ], "id": "21cdb170-7c16-4612-8f9c-41a3651b1036" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Packing Problems\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "Figure: Packing 9 squares into a single square. This example is\n", "trivially solved. Credit \n", "\n", "\n", "\n", "Figure: Packing 17 squares into a single square. The optimal solution\n", "is sometimes hard to find. Here the side length of the smallest square\n", "that holds 17 similarly shaped squares is at least 4.675 times that of\n", "the smaller squares. This solution was found by John Bidwell in 1997.\n", "Credit\n", "\n", "\n", "Another example of a problem where the “physics” is understood because\n", "it’s really mathematics is packing problems. Here the mathematics is\n", "just geometry, but still we need some form of compute to solve these\n", "problems. [Erich Friedman’s\n", "website](https://erich-friedman.github.io/packing/) contains a host of\n", "these problems, only some of which are analytically tractable.\n", "\n", "\n", "\n", "Figure: Packing 10 squares into a single square. This example was\n", "proven by Walter Stromquist (Stromquist, 1984). Here\n", "$s=3+\\frac{1}{\\sqrt{2}}$. Credit\n", "" ], "id": "d548f30c-5cf7-4287-9460-a62a858e8dc4" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Modelling with a Function\n", "\n", "What if the question of interest were quite simple? For example, in the\n", "packing problem, suppose we just wanted to know the minimum side length.\n", "Sometimes, regardless of the complexity of the problem, there can be a\n", "pattern to the answer that is emergent due to regularities in the\n", "underlying problem."
], "id": "9e9fb4a7-0c87-4c50-bace-2f762749b048" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Erich Friedman Packing Data\n", "\n", "\\[edit\\]" ], "id": "bc91adb7-7e16-4fdb-a8a9-d5ebf6a1c2d7" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install pods" ], "id": "ca1ff9f2-9c39-4491-9afa-a0bd1a7b4b94" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pods" ], "id": "8b821e21-4624-4e4c-98cb-af73d4777a5b" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data = pods.datasets.erich_friedman_packing_data()\n", "x = data['X']\n", "y = data['Y']" ], "id": "262196e8-8071-4efb-b0f2-523ef68c19d1" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import mlai.plot as plot\n", "import mlai" ], "id": "b03e4568-e755-461f-bdb1-6b4b29907328" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pods\n", "from matplotlib import pyplot as plt" ], "id": "64500de0-e13e-4c2c-b059-9b6bc7e6dd58" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\n", "xlim = (0,100)\n", "ylim = (0, 11)\n", "\n", "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "_ = ax.plot(x, y, 'r.',markersize=10)\n", "ax.set_xlabel('n', fontsize=20)\n", "ax.set_ylabel('s', fontsize=20)\n", "ax.set_xlim(xlim)\n", "ax.set_ylim(ylim)\n", "\n", "mlai.write_figure(filename='squares-in-squares.svg', \n", " directory='./datasets')" ], "id": "9f112bc5-1e91-4d9b-989f-ff7f8e3161c3" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: Plot of the minimum known side length as a function of the\n", "number of squares packed inside."
], "id": "e6d26434-d377-4400-9a78-2dddc492aa05" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install gpy" ], "id": "715bc94e-c131-4236-b30b-40f7377faf29" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Gaussian Process Fit\n", "\n", "\\[edit\\]\n", "\n", "Our first objective will be to perform a Gaussian process fit to the\n", "data, which we’ll do using the [GPy\n", "software](https://github.com/SheffieldML/GPy)." ], "id": "dbce3530-fc51-44f0-b402-63e25f54e5f9" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import GPy" ], "id": "3efb496a-14fd-4b65-8993-1ae24551251e" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m_full = GPy.models.GPRegression(x,y)\n", "_ = m_full.optimize() # Optimize parameters of covariance function" ], "id": "c11a7eab-1f22-42f4-84b3-9e593c4083fc" }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first command sets up the model, then `m_full.optimize()` optimizes\n", "the parameters of the covariance function and the noise level of the\n", "model. Once the fit is complete, we’ll create some test points and\n", "compute the output of the GP model in terms of the mean and\n", "standard deviation of the posterior function across the input domain. We\n", "plot the mean function and the standard deviation at 400 locations\n", "between $n=0$ and $n=100$. We can obtain the predictions using\n", "`y_mean, y_var = m_full.predict(xt)`." ], "id": "edc997e8-55bf-481c-92fc-ffbf0020f858" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "xt = np.linspace(0,100,400)[:,np.newaxis]\n", "yt_mean, yt_var = m_full.predict(xt)\n", "yt_sd=np.sqrt(yt_var)" ], "id": "cccf8a2a-d35b-4bb4-919b-051a2eb26554" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we plot the results using the helper function in `mlai.plot`."
], "id": "1a52fc26-4281-44ca-9f98-cb7a21e5c854" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import mlai.plot as plot\n", "import mlai" ], "id": "156453dd-129b-40b7-8891-79615a45184a" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "plot.model_output(m_full, ax=ax, xlabel=\"n\", ylabel=\"s\", fontsize=20, portion=0.2)\n", "ax.set_xlim(xlim)\n", "ax.set_ylim(ylim)\n", "mlai.write_figure(figure=fig,\n", " filename=\"erich-friedman-packing-gp.svg\", \n", " directory = \"./gp\",\n", " transparent=True)" ], "id": "0f258daa-b101-49ba-805e-d6eb7378794f" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: Gaussian process fit to the Erich Friedman Packing data." ], "id": "90d4c411-dde9-48a5-82ff-dfc36096cc65" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Statistical Emulation\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "Figure: The UK Met Office runs a shared code base for its simulations\n", "of climate and the weather. This plot shows the different spatial and\n", "temporal scales used.\n", "\n", "In many real-world systems, decisions are made through simulating the\n", "environment. Simulations may operate at different granularities. For\n", "example, simulations are used in weather forecasts and climate\n", "forecasts. Interestingly, the UK Met Office uses the same code for both;\n", "it has a [“Unified Model”\n", "approach](https://www.metoffice.gov.uk/research/approach/modelling-systems/unified-model/index),\n", "but climate simulations are run over greater spatial and temporal\n", "scales.\n", "\n", "\n", "\n", "Figure: Real world systems consist of simulators that capture our\n", "domain knowledge about how our systems operate. 
Different simulators run\n", "at different speeds and granularities.\n", "\n", "\n", "\n", "Figure: A statistical emulator is a system that reconstructs the\n", "simulation with a statistical model.\n", "\n", "A statistical emulator is a data-driven model that learns about the\n", "underlying simulation. Importantly, it learns with uncertainty, so it\n", "‘knows what it doesn’t know’. In practice, we can call the emulator in\n", "place of the simulator. If the emulator ‘doesn’t know’, it can call the\n", "simulator for the answer.\n", "\n", "\n", "\n", "Figure: A statistical emulator is a system that reconstructs the\n", "simulation with a statistical model. As well as reconstructing the\n", "simulation, a statistical emulator can be used to correlate with the\n", "real world.\n", "\n", "As well as reconstructing an individual simulator, the emulator can\n", "calibrate the simulation to the real world, by monitoring differences\n", "between the simulator and real data. This allows the emulator to\n", "characterize where the simulation can be relied on, i.e., we can\n", "validate the simulator.\n", "\n", "Similarly, the emulator can adjudicate between simulations. This is\n", "known as *multi-fidelity emulation*. The emulator characterizes which\n", "simulations perform well where.\n", "\n", "If all this modelling is done with judicious handling of the\n", "uncertainty, the *computational doubt*, then the emulator can assist in\n", "deciding what experiment should be run next to aid a decision: should we\n", "run a simulation, and if so which one, or should we attempt to\n", "acquire data from a real-world intervention?" ], "id": "34f0a4ed-098e-4144-9f4b-34bdba0cfb75" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## GPy: A Gaussian Process Framework in Python\n", "\n", "\\[edit\\]\n", "\n", "Gaussian processes are a flexible tool for non-parametric analysis with\n", "uncertainty. 
The GPy software was started in Sheffield to provide an easy-to-use\n", "interface to GPs, one which allows the user to focus on the\n", "modelling rather than the mathematics.\n", "\n", "\n", "\n", "Figure: GPy is a BSD licensed software code base for implementing\n", "Gaussian process models in Python. It is designed for teaching and\n", "modelling. We welcome contributions which can be made through the GitHub\n", "repository \n", "\n", "GPy is a BSD licensed software code base for implementing Gaussian\n", "process models in Python. This allows GPs to be combined with a wide\n", "variety of software libraries.\n", "\n", "The software itself is available on\n", "[GitHub](https://github.com/SheffieldML/GPy) and the team welcomes\n", "contributions.\n", "\n", "The aim for GPy is to be a probabilistic-style programming language,\n", "i.e., you specify the model rather than the algorithm. As well as a\n", "large range of covariance functions, the software allows for non-Gaussian\n", "likelihoods, multivariate outputs, dimensionality reduction and\n", "approximations for larger data sets.\n", "\n", "The documentation for GPy can be found\n", "[here](https://gpy.readthedocs.io/en/latest/)." ], "id": "650e971a-96fb-43a7-91f8-802b778b5ff3" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## GPy Tutorial\n", "\n", "\\[edit\\]\n", "\n", "James Hensman\n", "\n", "Nicolas Durrande\n", "\n", "This GPy tutorial is based on material we share in the Gaussian process\n", "summer school for teaching these models. 
It contains\n", "material from various members and former members of the Sheffield\n", "machine learning group, but particular mention should be made of\n", "[Nicolas\n", "Durrande](https://sites.google.com/site/nicolasdurrandehomepage/) and\n", "[James Hensman](https://jameshensman.github.io/)." ], "id": "bdda656e-edd3-4c83-b55a-c358fbd096fd" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import GPy" ], "id": "a72aa0a8-7bf7-48c1-b378-373d5097023e" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from matplotlib import pyplot as plt" ], "id": "2227de09-f853-4072-b8a7-3aa0120bc512" }, { "cell_type": "markdown", "metadata": {}, "source": [ "To give a feel for the software, we’ll start by creating an exponentiated\n", "quadratic covariance function, $$\n", "k(\\mathbf{ x}, \\mathbf{ x}^\\prime) = \\alpha \\exp\\left(-\\frac{\\left\\Vert \\mathbf{ x}- \\mathbf{ x}^\\prime \\right\\Vert_2^2}{2\\ell^2}\\right),\n", "$$ where the length scale is $\\ell$ and the variance is $\\alpha$.\n", "\n", "To set this up in GPy we create a kernel in the following manner." ], "id": "46a503b0-3471-4cfc-82e5-688cdffb0a0b" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "input_dim=1\n", "alpha = 1.0\n", "lengthscale = 2.0\n", "kern = GPy.kern.RBF(input_dim=input_dim, variance=alpha, lengthscale=lengthscale)" ], "id": "c0793606-88a6-49e8-9355-cfc2e2cda048" }, { "cell_type": "markdown", "metadata": {}, "source": [ "That builds a kernel object for us. The kernel can be displayed."
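The formula can also be checked directly in NumPy. This is a sketch of the same exponentiated quadratic computation (the function name `exp_quad` is ours, not GPy's), using the $\alpha=1$ and $\ell=2$ values set above:

```python
import numpy as np

def exp_quad(X, X2, alpha=1.0, lengthscale=2.0):
    # k(x, x') = alpha * exp(-||x - x'||_2^2 / (2 l^2))
    sq_dist = np.sum((X[:, None, :] - X2[None, :, :])**2, axis=-1)
    return alpha * np.exp(-0.5 * sq_dist / lengthscale**2)

X = np.linspace(0, 5, 6)[:, None]
K = exp_quad(X, X)
# A covariance matrix is symmetric, with k(x, x) = alpha on the diagonal.
assert np.allclose(K, K.T)
assert np.allclose(np.diag(K), 1.0)
```

GPy's `GPy.kern.RBF` computes the same quantity through `kern.K(X)`.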
], "id": "e4a9ad10-c55a-462e-9aaa-93e807b0fc1c" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "display(kern)" ], "id": "27db27d4-e60b-4b33-8fca-f92fe880a47a" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or, because it’s one dimensional, you can also plot the kernel as a\n", "function of one of its inputs (while the other is fixed)." ], "id": "0fd99c1c-6007-4f22-a4f3-802cc746cd92" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import mlai\n", "import mlai.plot as plot" ], "id": "9ac5e8e2-847b-46d1-af47-54eeb04ae6a9" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "kern.plot(ax=ax)\n", "mlai.write_figure('gpy-eq-covariance.svg', directory='./kern')" ], "id": "4eee5bdf-a8b6-4ff2-902f-cd81559dc963" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: The exponentiated quadratic covariance function as plotted by\n", "the `GPy.kern.plot` command.\n", "\n", "You can set the length scale of the covariance to different values and\n", "plot the result."
], "id": "8c2ec96a-f965-4452-b626-c8b4ef2b2711" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kern = GPy.kern.RBF(input_dim=input_dim) # By default, the parameters are set to 1.\n", "lengthscales = np.asarray([0.2,0.5,1.,2.,4.])" ], "id": "57321c2e-0a20-41b4-8a08-9ab5acdd86dc" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "\n", "for lengthscale in lengthscales:\n", " kern.lengthscale=lengthscale\n", " kern.plot(ax=ax)\n", "\n", "ax.legend(lengthscales)\n", "mlai.write_figure('gpy-eq-covariance-lengthscales.svg', directory='./kern')" ], "id": "c7198374-bfff-41a8-abd3-ba8cf02f5428" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: The exponentiated quadratic covariance function plotted for\n", "different length scales by the `GPy.kern.plot` command." ], "id": "1fd2ef7f-b414-463f-a39b-7220be0f38a6" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Covariance Functions in GPy\n", "\n", "Many covariance functions are already implemented in GPy. Instead of\n", "`rbf`, try constructing and plotting the following covariance functions:\n", "`exponential`, `Matern32`, `Matern52`, `Brownian`, `linear`, `bias`,\n", "`rbfcos`, `periodic_Matern32`, etc. Some of these covariance functions,\n", "such as `rbfcos`, are not parametrized by a variance and a length scale.\n", "Further, not all kernels are stationary (i.e., they can’t all be written\n", "as\n", "$k(\\mathbf{ x}, \\mathbf{ x}^\\prime) = f(\\mathbf{ x}-\\mathbf{ x}^\\prime)$,\n", "see for example the Brownian covariance function). So for plotting it\n", "may be interesting to change the value of the fixed input."
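The stationarity distinction can be seen numerically. Below is a sketch using textbook forms of the two covariances (a one-dimensional exponentiated quadratic, and the Brownian covariance $k(x, x') = \min(x, x')$ for $x, x' \ge 0$) rather than GPy's implementations: translating both inputs leaves the stationary kernel unchanged but alters the Brownian one.

```python
import numpy as np

def eq_cov(x, xp, lengthscale=1.0):
    # Stationary: depends only on the difference x - x'.
    return np.exp(-0.5 * (x - xp)**2 / lengthscale**2)

def brownian_cov(x, xp):
    # Non-stationary: depends on the absolute location of the inputs.
    return np.minimum(x, xp)

x, xp, shift = 1.0, 2.0, 3.0
# Translating both inputs leaves a stationary covariance unchanged...
assert np.isclose(eq_cov(x, xp), eq_cov(x + shift, xp + shift))
# ...but changes the Brownian covariance.
assert not np.isclose(brownian_cov(x, xp), brownian_cov(x + shift, xp + shift))
```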
], "id": "9bea216c-f53b-41d8-8897-c64f1d0fb4db" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "\n", "brownian_kern = GPy.kern.Brownian(input_dim=1)\n", "inputs = np.array([2., 4.])\n", "for x in inputs:\n", " brownian_kern.plot(x,plot_limits=[0,5], ax=ax)\n", "ax.legend(inputs)\n", "ax.set_ylim(-0.1,5.1)\n", "\n", "mlai.write_figure('gpy-brownian-covariance-lengthscales.svg', directory='./kern')" ], "id": "5d608ab9-8385-42cf-bd48-3e232f3ec802" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Combining Covariance Functions in GPy\n", "\n", "In GPy you can easily combine covariance functions you have created\n", "using the sum and product operators, `+` and `*`. So, for example, if we\n", "wish to combine an exponentiated quadratic covariance with a Matern 5/2\n", "then we can write" ], "id": "5314a40b-2460-425d-bf4b-11f10ec10432" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kern1 = GPy.kern.RBF(1, variance=1., lengthscale=2.)\n", "kern2 = GPy.kern.Matern52(1, variance=2., lengthscale=4.)\n", "kern = kern1 + kern2\n", "display(kern)" ], "id": "7785417b-3d0a-49c4-ad36-b457065ced5f" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "\n", "kern.plot(ax=ax)\n", "\n", "mlai.write_figure('gpy-eq-plus-matern52-covariance.svg', directory='./kern')" ], "id": "f7d8202d-8d69-4405-aff1-1ef2645fcbf2" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: A combination of the exponentiated quadratic covariance plus\n", "the Matern $5/2$ covariance.\n", "\n", "Or if we wanted to multiply them, we can write" ], "id": "9f4f8c78-86df-4007-8132-e51df7b63684" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kern1 = GPy.kern.RBF(1, variance=1., 
lengthscale=2.)\n", "kern2 = GPy.kern.Matern52(1, variance=2., lengthscale=4.)\n", "kern = kern1 * kern2\n", "display(kern)" ], "id": "852f9553-c573-4eaa-9f9c-d1f81f756c21" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "\n", "kern.plot(ax=ax)\n", "\n", "mlai.write_figure('gpy-eq-times-matern52-covariance.svg', directory='./kern')" ], "id": "61964ad5-d84b-4a99-9b26-7b2dae105079" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: A combination of the exponentiated quadratic covariance\n", "multiplied by the Matern $5/2$ covariance.\n", "\n", "You can learn about how to implement [new kernel objects in GPy\n", "here](https://gpy.readthedocs.io/en/latest/tuto_creating_new_kernels.html)." ], "id": "db404948-4b32-4633-a1fb-cc342a800642" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.lib.display import YouTubeVideo\n", "YouTubeVideo('-sY8zW3Om1Y')" ], "id": "2458b914-d7c7-47fd-8b77-91813d3c748b" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Figure: Designing the covariance function for your Gaussian process\n", "is a key place in which you introduce your understanding of the data\n", "problem. To learn more about the design of covariance functions, see\n", "this talk from Nicolas Durrande at GPSS in 2016." ], "id": "0226698e-07c6-40c3-a67e-4c107ea40958" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## A Gaussian Process Regression Model\n", "\n", "We will now combine the Gaussian process prior with some data to form a\n", "GP regression model with GPy. We will generate data from the function $$\n", "f(x) = -\\cos(\\pi x) + \\sin(4\\pi x)\n", "$$ over the domain $[0, 1]$, adding some noise gives $$\n", "y(x) = f(x) + \\epsilon,\n", "$$ with the noise being Gaussian distributed,\n", "$\\epsilon\\sim \\mathcal{N}\\left(0,0.01\\right)$."
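Note the parameterization: $\mathcal{N}(0, 0.01)$ specifies a variance of 0.01, so the corresponding standard deviation (the `scale` argument of NumPy's samplers) is 0.1. A quick check of that correspondence (a standalone sketch; the data-generating cell below uses `np.random.normal` directly):

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # The underlying function: f(x) = -cos(pi x) + sin(4 pi x)
    return -np.cos(np.pi * x) + np.sin(4 * np.pi * x)

x = rng.uniform(0, 1, size=100_000)
# Noise with variance 0.01 has standard deviation 0.1.
y = f(x) + rng.normal(loc=0.0, scale=0.1, size=x.shape)
# The residual variance should be close to the noise variance 0.01.
assert abs(np.var(y - f(x)) - 0.01) < 1e-3
```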
], "id": "1f9d3f3e-8e47-4be6-b7ea-b76461dbc9ee" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "X = np.linspace(0.05,0.95,10)[:,np.newaxis]\n", "Y = -np.cos(np.pi*X) + np.sin(4*np.pi*X) + np.random.normal(loc=0.0, scale=0.1, size=(10,1))" ], "id": "f5e7a768-e7da-4425-9026-2d8261abb41c" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "ax.plot(X,Y,'kx',mew=1.5, linewidth=2)\n", "\n", "mlai.write_figure('noisy-sine.svg', directory='./gp')" ], "id": "700d3b64-0f12-4bca-a48e-45da537e6fa7" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: Data from the noisy sine wave for fitting with a GPy\n", "model.\n", "\n", "A GP regression model based on an exponentiated quadratic covariance\n", "function can be defined by first defining a covariance function." ], "id": "55ae222a-4f23-4405-bb56-10530f9f9c44" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kern = GPy.kern.RBF(input_dim=1, variance=1., lengthscale=1.)" ], "id": "31f27b72-3363-4b69-a254-b38a6bb1bdb9" }, { "cell_type": "markdown", "metadata": {}, "source": [ "And then combining it with the data to form a Gaussian process model." ], "id": "e297fbf9-f2e6-43d4-8b1a-6c59718dff54" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model = GPy.models.GPRegression(X,Y,kern)" ], "id": "0c13a2cb-ad0d-432b-af39-e710b20c5bc5" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Just as for the covariance function object, we can find out about the\n", "model using the command `display(model)`." 
], "id": "0b5348a2-9333-450e-a748-74edf68805c3" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "display(model)" ], "id": "c22a879c-6772-43ff-b245-0912825f8107" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that by default the model includes some observation noise with\n", "variance 1. We can see the posterior mean prediction and visualize the\n", "marginal posterior variances using `model.plot()`." ], "id": "5e5c7673-e802-4a4c-8fee-29940ea7db6e" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "model.plot(ax=ax)\n", "\n", "mlai.write_figure('noisy-sine-gp-fit.svg', directory='./gp')" ], "id": "fda157df-f007-4a26-aa0f-d5e8dcdea481" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: A Gaussian process fit to the noisy sine data. Here the\n", "parameters of the process and the covariance function haven’t yet been\n", "optimized.\n", "\n", "You can also look directly at the predictions of the model." ], "id": "9e94ec72-9627-4aa6-bcb6-73e6c6c93fab" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Xstar = np.linspace(0, 10, 100)[:, np.newaxis]\n", "Ystar, Vstar = model.predict(Xstar)" ], "id": "048101d4-594c-4ea6-a899-707e97eaaafa" }, { "cell_type": "markdown", "metadata": {}, "source": [ "This gives you the mean (`Ystar`) and the variance (`Vstar`) at the\n", "locations given by `Xstar`." ], "id": "6a6e308e-fb04-42c8-a2ee-b8da0db5829f" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Covariance Function Parameter Estimation\n", "\n", "As we have seen during the lectures, the parameter values can be\n", "estimated by maximizing the likelihood of the observations. 
Since we\n", "don’t want any of the variances to become negative during the\n", "optimization, we can constrain all parameters to be positive before\n", "running the optimization." ], "id": "cb44a9e4-3379-4da2-b3c6-c64fb2200ac1" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model.constrain_positive()" ], "id": "de5a41d4-d321-40c0-a09c-5afbc99ae76a" }, { "cell_type": "markdown", "metadata": {}, "source": [ "The warnings are because the parameters are already constrained by\n", "default; the software is warning us that they are being reconstrained.\n", "\n", "Now we can optimize the model using the `model.optimize()` method. Here\n", "we switch messages on, which allows us to see the progression of the\n", "optimization." ], "id": "bc148778-8139-46b7-b10d-69791581e28c" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model.optimize(messages=True)" ], "id": "0e23a13b-d744-4576-a01e-5261d8adb674" }, { "cell_type": "markdown", "metadata": {}, "source": [ "By default, the optimization uses a limited-memory BFGS optimizer\n", "(Byrd et al., 1995).\n", "\n", "Once again, we can display the model, now to see how the parameters have\n", "changed." ], "id": "a4f73f89-c6c1-4bb2-9e2f-df210015c3bd" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "display(model)" ], "id": "d96d553b-2686-4d2d-afe0-844d6a9f1041" }, { "cell_type": "markdown", "metadata": {}, "source": [ "The length scale is now much smaller, as is the noise level. The\n", "variance of the exponentiated quadratic has also reduced."
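What `model.optimize()` is maximizing is the log marginal likelihood of the data. For reference, here is a NumPy sketch of the textbook expression (not GPy's internal code), $-\frac{1}{2}\mathbf{y}^\top(\mathbf{K}+\sigma^2\mathbf{I})^{-1}\mathbf{y} - \frac{1}{2}\log|\mathbf{K}+\sigma^2\mathbf{I}| - \frac{n}{2}\log 2\pi$:

```python
import numpy as np

def log_marginal_likelihood(K, y, noise_var):
    # -1/2 y^T (K + s2 I)^{-1} y - 1/2 log|K + s2 I| - (n/2) log(2 pi),
    # computed stably via a Cholesky decomposition.
    n = y.shape[0]
    Ky = K + noise_var * np.eye(n)
    L = np.linalg.cholesky(Ky)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return float(-0.5 * y.T @ alpha - 0.5 * log_det - 0.5 * n * np.log(2 * np.pi))

# Sanity check: with K = 0 and unit noise variance this reduces to the
# log density of n independent standard normals.
y = np.array([[0.5], [-1.0]])
n = y.shape[0]
expected = -0.5 * float(y.T @ y) - 0.5 * n * np.log(2 * np.pi)
assert np.isclose(log_marginal_likelihood(np.zeros((n, n)), y, 1.0), expected)
```

The optimizer adjusts the covariance parameters (which determine $\mathbf{K}$) and the noise variance to maximize this quantity.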
], "id": "30c43994-1912-4a62-8ab6-a3034f34fe67" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "model.plot(ax=ax)\n", "\n", "mlai.write_figure('noisy-sine-gp-optimized-fit.svg', directory='./gp')" ], "id": "e515e32c-b13f-4493-99c8-12d1a532cbd3" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: A Gaussian process fit to the noisy sine data with parameters\n", "optimized." ], "id": "b13babf7-688c-42d4-94ba-22dfdfecf593" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## GPy and Emulation\n", "\n", "\\[edit\\]\n", "\n", "Let $\\mathbf{ x}$ be a random variable defined over the real numbers,\n", "$\\Re$, and $f(\\cdot)$ be a function mapping between the real numbers\n", "$\\Re \\rightarrow \\Re$.\n", "\n", "The problem of *uncertainty propagation* is the study of the\n", "distribution of the random variable $f(\\mathbf{ x})$.\n", "\n", "We’re going to address this problem using emulation and GPy. We will see\n", "in this section the advantage of using a model when only a few\n", "observations of $f$ are available.\n", "\n", "Firstly, we’ll make use of a test function known as the Branin\n", "function. $$\n", "f(\\mathbf{ x}) = a(x_2 - bx_1^2 + cx_1 - r)^2 + s(1-t)\\cos(x_1) + s\n", "$$ where we are setting $a=1$, $b=5.1/(4\\pi^2)$, $c=5/\\pi$, $r=6$,\n", "$s=10$ and $t=1/(8\\pi)$."
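As a sanity check on the formula and parameter values, the Branin function has a known global minimum value of about 0.398, attained at $(\pi, 2.275)$ among other points. Evaluating the expression directly (a standalone sketch, separate from the `branin` helper defined next):

```python
import numpy as np

a, b, c, r, s, t = 1.0, 5.1 / (4 * np.pi**2), 5 / np.pi, 6.0, 10.0, 1 / (8 * np.pi)

def branin_formula(x1, x2):
    # f(x) = a (x2 - b x1^2 + c x1 - r)^2 + s (1 - t) cos(x1) + s
    return a * (x2 - b * x1**2 + c * x1 - r)**2 + s * (1 - t) * np.cos(x1) + s

# One of the three global minimizers is (pi, 2.275), value ~0.397887.
assert abs(branin_formula(np.pi, 2.275) - 0.397887) < 1e-3
```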
], "id": "e81d04f7-0156-45f2-8994-4e89213ec414" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np" ], "id": "f7949f8b-b50e-4270-92ec-0034e46cfc00" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def branin(X):\n", " y = ((X[:,1]-5.1/(4*np.pi**2)*X[:,0]**2+5*X[:,0]/np.pi-6)**2 \n", " + 10*(1-1/(8*np.pi))*np.cos(X[:,0])+10)\n", " return(y)" ], "id": "feb80fa3-d772-4101-a9d6-354df5a8ae0f" }, { "cell_type": "markdown", "metadata": {}, "source": [ "We’ll define a training set of twenty-five observations on a\n", "five-by-five grid over \\[−5, 10\\] × \\[0, 15\\]." ], "id": "a4c91c21-b9f4-4d6d-9ff7-8454d1bd5e0a" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Training set defined as a 5*5 grid:\n", "xg1 = np.linspace(-5,10,5)\n", "xg2 = np.linspace(0,15,5)\n", "X = np.zeros((xg1.size * xg2.size,2))\n", "for i,x1 in enumerate(xg1):\n", " for j,x2 in enumerate(xg2):\n", " X[i+xg1.size*j,:] = [x1,x2]\n", "\n", "Y = branin(X)[:,np.newaxis]" ], "id": "aa19d8ae-7b96-4410-b93f-597a5c04288a" }, { "cell_type": "markdown", "metadata": {}, "source": [ "The task here will be to consider the distribution of $f(U)$, where $U$\n", "is a random variable with uniform distribution over the input space of\n", "$f$. We focus on the computation of two quantities, the expectation of\n", "$f(U)$, $\\left\\langle f(U)\\right\\rangle$, and the probability that the\n", "value is greater than 200." ], "id": "f742ce92-40ce-483a-a8b4-fd4b89917f3b" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Computation of $\\left\\langle f(U)\\right\\rangle$\n", "\n", "The expectation of $f(U)$ is given by\n", "$\\int f(\\mathbf{ x})p(\\mathbf{ x})\\text{d}\\mathbf{ x}$, where\n", "$p(\\mathbf{ x})$ is the uniform density of $U$ over the input space. A\n", "basic approach to approximate this integral is to compute the mean of\n", "the 25 observations: `np.mean(Y)`. 
Since the points are distributed on a grid,\n", "this can be seen as the approximation of the integral by a rough Riemann\n", "sum." ], "id": "8d875aed-7e34-4781-8bfd-064599fb9b0b" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print('Estimate of the expectation is given by: {mean}'.format(mean=Y.mean()))" ], "id": "2bdce72c-d800-4004-9095-4e0f840e77e8" }, { "cell_type": "markdown", "metadata": {}, "source": [ "The result can be compared with the actual mean of the Branin function,\n", "which is 54.31.\n", "\n", "Alternatively, we can fit a GP model and compute the integral of the\n", "best predictor by Monte Carlo sampling.\n", "\n", "Firstly, we create the covariance function. Here we’re going to use an\n", "exponentiated quadratic, but we’ll augment it with the ‘bias’ covariance\n", "function. This covariance function represents a single fixed bias that\n", "is added to the overall covariance. It allows us to deal with\n", "non-zero-mean functions." ], "id": "7a4c75c7-9698-49c0-bd42-3216806471a0" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import GPy" ], "id": "d29faa9e-0ef8-4593-9b74-699bda19e966" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create an exponentiated quadratic plus bias covariance function\n", "kern_eq = GPy.kern.RBF(input_dim=2, ARD = True)\n", "kern_bias = GPy.kern.Bias(input_dim=2)\n", "kern = kern_eq + kern_bias" ], "id": "fa318c5d-ffb1-4c8a-ab5e-a95a8ef3d99c" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we construct the Gaussian process regression model in GPy."
], "id": "e3bce89b-0fbf-4602-be7c-614b30ac13b3" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Build a GP model\n", "model = GPy.models.GPRegression(X,Y,kern)" ], "id": "ec276800-69f8-4950-945f-a7e3cde9baaf" }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the sinusoid example above, we learnt the variance of the process.\n", "But in this example, we are fitting an emulator to a function we know is\n", "noise-free. However, we don’t fix the noise value to precisely zero, as\n", "this can lead to some numerical errors. Instead, we fix the variance of\n", "the Gaussian noise to a very small value." ], "id": "8010e356-bd3a-4959-a11e-8cacb53cf5c6" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# fix the noise variance\n", "model.likelihood.variance.fix(1e-5)" ], "id": "cf33349c-d531-41e5-a0cd-fab5c62f0583" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we fit the model. Note that the initial values for the length scale\n", "are not appropriate, so first the length scale of the model needs to\n", "be reset." ], "id": "c09648f7-de6a-443c-9f02-6c9c16c19fa8" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kern.rbf.lengthscale = np.asarray([3, 3])" ], "id": "dcc95e5d-37f9-4349-b2a0-e1c561402714" }, { "cell_type": "markdown", "metadata": {}, "source": [ "It’s a common error in Gaussian process fitting to initialize the length\n", "scale too small or too big. The challenge is that the error surface is\n", "normally multimodal, and the final solution can be very sensitive to\n", "this initialization. If the length scale is initialized too small, the\n", "solution can converge on a place where the signal isn’t extracted by\n", "the covariance function. If the length scale is initialized too large,\n", "then the variations of the function are often missed. 
Here the length\n", "scale for each input dimension is set to 3. Now that’s done, we can\n", "optimize the model." ], "id": "a4ebf65a-3396-41cf-ac7f-cde18181048b" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Optimize the model\n", "model.optimize(messages=True)" ], "id": "ac0f87f2-e790-4ee7-8ac8-5a8606eb40fd" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt" ], "id": "bad3b330-513a-4da8-8b95-007692550d23" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "model.plot(ax=ax)\n", "\n", "mlai.write_figure('branin-gp-optimized-fit.svg', directory='./gp')" ], "id": "7a205a5d-388a-403d-b4ab-2419482142bb" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: A Gaussian process fit to the Branin test function, used to\n", "assess the mean of the function by emulation.\n", "\n", "Finally, we can compute the mean of the model predictions using very\n", "many Monte Carlo samples.\n", "\n", "Note that in this example, because we’re using a test function, we\n", "could simply have done the Monte Carlo estimation directly on the Branin\n", "function. However, imagine instead that we were trying to understand the\n", "results of a complex computational fluid dynamics simulation, where each\n", "run of the simulation (which is equivalent to our test function) took\n", "many hours. In that case the advantage of the emulator is clear."
], "id": "84384a75-208f-44de-8bfe-9a462557dd75" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Compute the mean of model prediction on 1e5 Monte Carlo samples\n", "Xp = np.random.uniform(size=(int(1e5),2))\n", "Xp[:,0] = Xp[:,0]*15-5\n", "Xp[:,1] = Xp[:,1]*15\n", "mu, var = model.predict(Xp)\n", "print('The estimate of the mean of the Branin function is {mean}'.format(mean=np.mean(mu)))" ], "id": "cd0f3bb5-1eec-4fa7-8a47-b54f7d5473da" }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise 1\n", "\n", "Now think about how to make use of the variance estimation from the\n", "Gaussian process to obtain error bars around your estimate." ], "id": "3893001f-0178-4729-8cdf-3d2ffff1077b" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Write your answer to Exercise 1 here\n", "\n", "\n", "\n", "\n" ], "id": "28158e58-7e74-4156-853b-d21898928aa1" }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise 2\n", "\n", "You’ve seen how the Monte Carlo estimates work with the Gaussian\n", "process. Now make your estimate of the probability that the Branin\n", "function is greater than 200 with the uniform random inputs." 
], "id": "7008f3f9-646f-45a7-afeb-5dc5401a259d" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Write your answer to Exercise 2 here\n", "\n", "\n", "\n", "\n" ], "id": "f914602e-11d8-4f14-864c-b82dfe875a9a" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Uncertainty Quantification and Design of Experiments\n", "\n", "\[edit\]\n", "\n", "We’re introducing you to the optimization and analysis of real-world\n", "models through emulation; this domain is part of a broader field known\n", "as surrogate modelling.\n", "\n", "Although we’re approaching this from the machine learning perspective,\n", "with a computer-scientist’s approach, you won’t be surprised to find out\n", "that this field is not new and that there is a diverse range of research\n", "communities interested in this domain.\n", "\n", "We’ve been focussing on *active* experimental design. In particular, the\n", "case where we are sequentially selecting points to run our simulation\n", "based on previous results.\n", "\n", "Here, we pause for a moment and cover approaches to *passive*\n", "experimental design. Almost all the emulation examples we’ve looked at\n", "so far need some initial points to ‘seed’ the emulator. Selecting these\n", "is also a task of experimental design, but one we perform without\n", "running our simulator.\n", "\n", "This challenge, of deciding where to run the simulation to get the\n", "answer you require, is an old one. One classic paper, McKay et al.\n", "(1979), reviews three different methods for designing these inputs. 
They are\n", "*random sampling*, *stratified sampling*, and *Latin hypercube\n", "sampling*.\n", "\n", "Random sampling is the default approach: across the input domain of\n", "interest, we simply select samples at random (perhaps uniformly, or, if\n", "we believe there’s an underlying distribution, from that distribution).\n", "\n", "> Let the input values $\mathbf{ x}_1, \dots, \mathbf{ x}_n$ be a random\n", "> sample from $f(\mathbf{ x})$. This method of sampling is perhaps the\n", "> most obvious, and an entire body of statistical literature may be used\n", "> in making inferences regarding the distribution of $Y(t)$.\n", "\n", "Stratified sampling is an approach from statistics where a population is\n", "divided into sub-populations before sampling. For example, imagine that\n", "we are interested in surveillance data for Covid-19 tracking. If we can\n", "only afford to track 100 people, we could sample them randomly from\n", "across the population. But we might worry that (by chance) we don’t get\n", "many people from a particular age group. So instead, we could divide the\n", "population into sub-groups and sample a fixed number from each group.\n", "This ensures we get coverage across all ages, although downstream we\n", "might have to do some weighting of the samples when considering the\n", "population effect.\n", "\n", "The same ideas can be deployed in emulation: your input domain can be\n", "divided into regions of particular interest. For example, if testing\n", "the performance of a component from an F1 car, you might be interested\n", "in the performance on the straight, the performance in “fast corners”\n", "and the performance in “slow corners”. 
Because slow corners have a very\n", "large effect on the final performance, you might take more samples from\n", "slow corners relative to the frequency that such corners appear in the\n", "actual races.\n", "\n", "> Using stratified sampling, all areas of the sample space of\n", "> $\mathbf{ x}$ are represented by input values. Let the sample space\n", "> $S$ of $\mathbf{ x}$ be partitioned into $I$ disjoint strata $S_i$.\n", "> Let $\pi_i = P(\mathbf{ x}\in S_i)$ represent the size of $S_i$. Obtain\n", "> a random sample $\mathbf{ x}_{ij}$, $j = 1, \dots, n_i$ from $S_i$. Then\n", "> of course the $n_i$ sum to $n$. If $I = 1$, we have random sampling\n", "> over the entire sample space.\n", "\n", "Latin hypercube sampling is a form of stratified sampling. For a Latin\n", "square, if $M$ samples are required, the strata are formed by dividing\n", "the input area into $M$ rows and $M$ columns. The samples are then taken\n", "so that each row and each column contains exactly one sample. The Latin\n", "hypercube is the generalisation of this idea to more than two\n", "dimensions.\n", "\n", "> The same reasoning that led to stratified sampling, ensuring that all\n", "> portions of $S$ were sampled, could lead further. If we wish to ensure\n", "> also that each of the input variables $\mathbf{ x}_k$ has all portions\n", "> of its distribution represented by input values, we can divide the\n", "> range of each $\mathbf{ x}_k$ into $n$ strata of equal marginal\n", "> probability $1/n$, and sample once from each stratum. Let this sample\n", "> be $\mathbf{ x}_{kj}$, $j = 1, \dots, n$. These form the\n", "> $\mathbf{ x}_k$ component, $k = 1, \dots , K$, in $\mathbf{ x}_i$,\n", "> $i = 1, \dots, n$. The components of the various $\mathbf{ x}_k$’s are\n", "> matched at random. 
This method of selecting input values is an\n", "> extension of quota sampling (Steinberg 1963), and can be viewed as a\n", "> $K$-dimensional extension of Latin square sampling (Raj 1968).\n", "\n", "The paper’s rather dated reference to “Output from a Computer Code” does\n", "carry forward through this literature, which has continued to be a focus\n", "of interest for statisticians. [Tony\n", "O’Hagan](http://www.tonyohagan.co.uk/academic/), who was a colleague in\n", "Sheffield and is also one of the pioneers of Gaussian process models,\n", "was developing these methods when I first arrived there (Kennedy and\n", "O’Hagan, 2001), and continued with a large EPSRC-funded project for\n", "managing uncertainty in computational models.\n", "You can see a list of [their technical reports\n", "here](http://www.mucm.ac.uk/Pages/Dissemination/TechnicalReports.html).\n", "\n", "Another important group based in France is the “MASCOT-NUM Research\n", "Group”. These researchers bring\n", "together statisticians, applied mathematicians and engineers in solving\n", "these problems." ], "id": "4c2fe50d-749f-4532-8f11-ffdfe4fd5d8e" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Emukit Playground\n", "\n", "\[edit\]\n", "\n", "Emukit playground is a software toolkit for exploring the use of\n", "statistical emulation as a tool. 
It was built by [Leah\n", "Hirst](https://www.linkedin.com/in/leahhirst/) during her software\n", "engineering internship at Amazon, supervised by [Cliff\n", "McCollum](https://www.linkedin.com/in/cliffmccollum/).\n", "\n", "\n", "\n", "Figure: Emukit playground is a tutorial for understanding the\n", "simulation/emulation relationship.\n", "\n", "\n", "\n", "\n", "Figure: Tutorial on Bayesian optimization of the number of taxis\n", "deployed from Emukit playground.\n", "\n", "\n", "You can explore Bayesian optimization of a taxi simulation." ], "id": "3435f3c5-9a23-4a97-90c1-0689ff8e90d5" }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise 3\n", "\n", "You now know enough to build a simple emulation. To test your knowledge,\n", "have a go at combining GPy with Thomas House’s herd immunity simulation.\n", "Can you build a Gaussian process emulator of the simulation? Don’t spend\n", "too long on this exercise. The idea is just to consolidate things like\n", "what the inputs and outputs should be." ], "id": "ff06a681-6fe0-4faf-a5bd-7a167dd01bf0" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Write your answer to Exercise 3 here\n", "\n", "\n", "\n", "\n" ], "id": "75214c6b-3099-46a9-9ab9-c97f26e10af4" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Conclusions\n", "\n", "We summarized the different types of simulation into roughly three\n", "groups. Firstly, those based on physical laws in the form of\n", "differential equations. Examples include certain compartmental\n", "epidemiological models, climate models and weather models. Secondly,\n", "discrete event simulations. These simulations often run to a ‘clock’,\n", "where updates to the state are taken in turns. The Game of Life is an\n", "example of this type of simulation, and Formula 1 models of race\n", "strategy also use this approach. 
There is another type of discrete event\n", "simulation that doesn’t use a turn-based approach but instead waits for\n", "the next event. The [Gillespie\n", "algorithm](https://en.wikipedia.org/wiki/Gillespie_algorithm) is an\n", "example of such an approach, but we didn’t cover it here. Finally, we\n", "realized that general computer code bases are also simulations. If a\n", "company has a large body of code, and particularly if it’s hosted within\n", "a streaming environment (such as Apache Kafka), it’s possible to\n", "backtest the code with different inputs. Such backtests can be viewed as\n", "simulations, and in the case of large bodies of code (such as the code\n", "that manages Amazon’s automated buying systems) the backtests can be\n", "slow and could also benefit from emulation.\n", "\n", "We’ve introduced emulation as a way of dealing with different fidelities\n", "of simulations and removing the computational demands that come with\n", "them. We’ve highlighted how emulation can be deployed and introduced the\n", "`GPy` software for Gaussian process modelling." ], "id": "20e68f20-957d-46e7-b0ea-ade24e2295e2" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Thanks!\n", "\n", "For more information on these subjects and more you might want to check\n", "the following resources.\n", "\n", "- twitter: [@lawrennd](https://twitter.com/lawrennd)\n", "- podcast: [The Talking Machines](http://thetalkingmachines.com)\n", "- newspaper: [Guardian Profile\n", " Page](http://www.theguardian.com/profile/neil-lawrence)\n", "- blog:\n", " [http://inverseprobability.com](http://inverseprobability.com/blog.html)" ], "id": "55d32d1c-8e14-4ff7-889b-88dec29b54d8" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## References" ], "id": "637fe587-7ef2-46da-99ee-11a14d8fbe19" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Borchert, T., 2020. 
[Milan: An evolution of data-oriented\n", "programming](https://tborchertblog.wordpress.com/2020/02/13/28/).\n", "\n", "Byrd, R.H., Lu, P., Nocedal, J., Zhu, C., 1995. A limited memory\n", "algorithm for bound constrained optimization. SIAM Journal on Scientific\n", "Computing 16, 1190–1208.\n", "\n", "Cabrera, C., Paleyes, A., Thodoroff, P., Lawrence, N.D., 2023.\n", "[Real-world machine learning systems: A survey from a data-oriented\n", "architecture perspective](https://arxiv.org/abs/2302.04810).\n", "\n", "Joshi, R., 2007. [A loosely-coupled real-time\n", "SOA](http://community.rti.com/sites/default/files/archive/Data-Oriented_Architecture.pdf).\n", "Real-Time Innovations Inc.\n", "\n", "Kennedy, M.C., O’Hagan, A., 2001. Bayesian calibration of computer\n", "models. Journal of the Royal Statistical Society: Series B (Statistical\n", "Methodology) 63, 425–464.\n", "\n", "Lawrence, N.D., 2024. The atomic human: Understanding ourselves in the\n", "age of AI. Allen Lane.\n", "\n", "Lawrence, N.D., 2019. [Modern data oriented\n", "programming](http://inverseprobability.com/talks/notes/modern-data-oriented-programming.html).\n", "\n", "Lawrence, N.D., 2017. Data readiness levels. arXiv.\n", "\n", "Lawrence, N.D., Montgomery, J., Paquet, U., 2020. [Organisational data\n", "maturity](https://rs-delve.github.io/addenda/2020/11/24/organizational-data-maturity.html).\n", "The Royal Society.\n", "\n", "McKay, M.D., Beckman, R.J., Conover, W.J., 1979. [A comparison of three\n", "methods for selecting values of input variables in the analysis of\n", "output from a computer code](http://www.jstor.org/stable/1268522).\n", "Technometrics 21, 239–245.\n", "\n", "Stromquist, W.R., 1984. Packing unit squares inside squares, III. Daniel\n", "H. Wagner Associates.\n", "\n", "The DELVE Initiative, 2020b. 
[Data readiness: Lessons from an\n", "emergency](http://rs-delve.github.io/reports/2020/11/24/data-readiness-lessons-from-an-emergency.html).\n", "The Royal Society.\n", "\n", "The DELVE Initiative, 2020a. [Test, trace,\n", "isolate](https://rs-delve.github.io/reports/2020/05/27/test-trace-isolate.html).\n", "The Royal Society.\n", "\n", "Vorhemus, C., Schikuta, E., 2017. A data-oriented architecture for\n", "loosely coupled real-time information systems, in: Proceedings of the\n", "19th International Conference on Information Integration and Web-Based\n", "Applications & Services, iiWAS ’17. Association for Computing Machinery,\n", "New York, NY, USA, pp. 472–481.\n", "" ], "id": "17319d72-d53a-41e8-a343-c5d4983b0506" } ], "nbformat": 4, "nbformat_minor": 5, "metadata": {} }