{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Emulation\n", "\n", "### [Neil D. Lawrence](http://inverseprobability.com), University of\n", "\n", "Cambridge\n", "\n", "### 2021-09-15" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Abstract**: In this session we introduce the notion of emulation and\n", "systems modeling with Gaussian processes." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "> We may regard the present state of the universe as the effect of its\n", "> past and the cause of its future. An intellect which at a certain\n", "> moment would know all forces that set nature in motion, and all\n", "> positions of all items of which nature is composed, …\n", "\n", "> … if this intellect were also vast enough to submit these data to\n", "> analysis, it would embrace in a single formula the movements of the\n", "> greatest bodies of the universe and those of the tiniest atom; for\n", "> such an intellect nothing would be uncertain and the future just like\n", "> the past would be present before its eyes.\n", ">\n", "> — Pierre Simon Laplace (Laplace, 1814)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Game of Life\n", "\n", "\\[edit\\]\n", "\n", "[John Horton Conway](https://en.wikipedia.org/wiki/John_Horton_Conway)\n", "was a mathematician who developed a game known as the Game of Life. He\n", "died in April 2020, but since he invented the game, he was in effect\n", "‘god’ for this game. But as we will see, just inventing the rules\n", "doesn’t give you omniscience in the game.\n", "\n", "The Game of Life is played on a grid of squares, or pixels. Each pixel\n", "is either on or off. The game has no players, but a set of simple rules\n", "that are followed at each turn the rules are." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Life Rules\n", "\n", "\\[edit\\]\n", "\n", "John Conway’s game of life is a cellular automata where the cells obey\n", "three very simple rules. The cells live on a rectangular grid, so that\n", "each cell has 8 possible neighbors.\n", "\n", "\n", "\n", "\n", "\n", "\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
\n", "
\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\n", "*loneliness*\n", "\n", "
\n", "
\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "
\n", "\n", "\n", "\n", "
\n", "
\n", "\n", "Figure: ‘Death’ through loneliness in Conway’s game of life. If a\n", "cell is surrounded by less than three cells, it ‘dies’ through\n", "loneliness.\n", "\n", "The game proceeds in turns, and at each location in the grid is either\n", "alive or dead. Each turn, a cell counts its neighbors. If there are two\n", "or fewer neighbors, the cell ‘dies’ of ‘loneliness.’\n", "\n", "\n", "\n", "\n", "\n", "\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
\n", "
\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\n", "*overcrowding*\n", "\n", "
\n", "
\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "
\n", "\n", "\n", "\n", "
\n", "
\n", "\n", "Figure: ‘Death’ through overpopulation in Conway’s game of life. If a\n", "cell is surrounded by more than three cells, it ‘dies’ through\n", "loneliness.\n", "\n", "If there are four or more neigbors, the cell ‘dies’ from ‘overcrowding.’\n", "If there are three neigbors, the cell persists, or if it is currently\n", "dead, a new cell is born.\n", "\n", "\n", "\n", "\n", "\n", "\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "
\n", "
\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\n", "*birth*\n", "\n", "
\n", "
\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "
\n", "\n", "\n", "\n", "
\n", "
\n", "\n", "Figure: Birth in Conway’s life. Any position surounded by precisely\n", "three live cells will give birth to a new cell at the next turn.\n", "\n", "And that’s it. Those are the simple ‘physical laws’ for Conway’s game.\n", "\n", "The game leads to patterns emerging, some of these patterns are static,\n", "but some oscillate in place, with varying periods. Others oscillate, but\n", "when they complete their cycle they’ve translated to a new location, in\n", "other words they move. In Life the former are known as\n", "[oscillators](https://conwaylife.com/wiki/Oscillator) and the latter as\n", "[spaceships](https://conwaylife.com/wiki/Spaceship)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loafers and Gliders\n", "\n", "\\[edit\\]\n", "\n", "John Horton Conway, as the creator of the game of life, could be seen\n", "somehow as the god of this small universe. He created the rules. The\n", "rules are so simple that in many senses he, and we, are all-knowing in\n", "this space. But despite our knowledge, this world can still ‘surprise’\n", "us. From the simple rules, emergent patterns of behaviour arise. These\n", "include static patterns that don’t change from one turn to the next.\n", "They also include, oscillators, that pulse between different forms\n", "across different periods of time. A particular form of oscillator is\n", "known as a ‘spaceship,’ this is one that moves across the board as the\n", "game evolves. One of the simplest and earliest spaceships to be\n", "discovered is known as the glider.\n", "\n", "\n", "\n", "\n", "\n", "\n", "
\n", "
\n", "\n", "*Glider (1969)*\n", "\n", "
\n", "
\n", "\n", "\n", "\n", "
\n", "
\n", "\n", "\n", "\n", "
\n", "\n", "Figure: *Left* A Glider pattern discovered 1969 by Richard K. Guy.\n", "*Right*. John Horton Conway, creator of *Life* (1937-2020). The glider\n", "is an oscillator that moves diagonally after creation. From the simple\n", "rules of Life it’s not obvious that such an object does exist, until you\n", "do the necessary computation.\n", "\n", "The glider was ‘discovered’ in 1969 by Richard K. Guy. What do we mean\n", "by discovered in this context? Well, as soon as the game of life is\n", "defined, objects such as the glider do somehow exist, but the many\n", "configurations of the game mean that it takes some time for us to see\n", "one and know it exists. This means, that despite being the creator,\n", "Conway, and despite the rules of the game being simple, and despite the\n", "rules being deterministic, we are not ‘omniscient’ in any simplistic\n", "sense. It requires computation to ‘discover’ what can exist in this\n", "universe once it’s been defined.\n", "\n", "\n", "\n", "Figure: The Gosper glider gun is a configuration that creates\n", "gliders. A new glider is released after every 30 turns.\n", "\n", "These patterns had to be discovered, in the same way that a scientist\n", "might discover a disease, or an explorer a new land. For example, the\n", "Gosper glider gun was [discovered by Bill Gosper in\n", "1970](https://conwaylife.com/wiki/Bill_Gosper). It is a pattern that\n", "creates a new glider every 30 turns of the game.\n", "\n", "Despite widespread interest in Life, some of its patterns were only very\n", "recently discovered like the Loafer, discovered in 2013 by Josh Ball. So\n", "despite the game having existed for over forty years, and the rules of\n", "the game being simple, there are emergent behaviours that are unknown.\n", "\n", "\n", "\n", "\n", "\n", "\n", "
\n", "
\n", "\n", "*Loafer (2013)*\n", "\n", "
\n", "
\n", "\n", "\n", "\n", "
\n", "
\n", "\n", "\n", "\n", "
\n", "\n", "Figure: *Left* A Loafer pattern discovered by Josh Ball in 2013.\n", "*Right*. John Horton Conway, creator of *Life* (1937-2020).\n", "\n", "Once these patterns are discovered, they are combined (or engineered) to\n", "create new Life patterns that do some remarkable things. For example,\n", "there’s a life pattern that runs a Turing machine, or more remarkably\n", "there’s a Life pattern that runs Life itself.\n", "\n", "\n", "\n", "Figure: The Game of Life running in Life. The video is drawing out\n", "recursively showing pixels that are being formed by filling cells with\n", "moving spaceships. Each individual pixel in this game of life is made up\n", "of $2048 \\times 2048$ pixels called an [OTCA\n", "metapixel](https://www.conwaylife.com/wiki/OTCA_metapixel).\n", "\n", "To find out more about the Game of Life you can watch this video by Alan\n", "Zucconi or read his [associated blog\n", "post](https://www.alanzucconi.com/2020/10/13/conways-game-of-life/)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.lib.display import YouTubeVideo\n", "YouTubeVideo('Kk2MH9O4pXY')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Figure: An introduction to the Game of Life by Alan Zucconi.\n", "\n", "Contrast this with our situation where in ‘real life’ we don’t know the\n", "simple rules of the game, the state space is larger, and emergent\n", "behaviors (hurricanes, earthquakes, volcanos, climate change) have\n", "direct consequences for our daily lives, and we understand why the\n", "process of ‘understanding’ the physical world is so difficult. We also\n", "see immediately how much easier we might expect the physical sciences to\n", "be than the social sciences, where the emergent behaviors are contingent\n", "on highly complex human interactions.\n", "\n", "We summarize this notion as $$\n", "\\text{data} + \\text{model} \\stackrel{\\text{compute}}{\\rightarrow} \\text{prediction}\n", "$$ As we pointed out, there is an irony in Laplace’s demon forming the\n", "cornerstone of a movement known as ‘determinism,’ because Laplace wrote\n", "about this idea in an essay on probabilities. The more important quote\n", "in the essay was" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Laplace’s Gremlin\n", "\n", "\\[edit\\]\n", "\n", "> The curve described by a simple molecule of air or vapor is regulated\n", "> in a manner just as certain as the planetary orbits; the only\n", "> difference between them is that which comes from our ignorance.\n", "> Probability is relative, in part to this ignorance, in part to our\n", "> knowledge. We know that of three or greater number of events a single\n", "> one ought to occur; but nothing induces us to believe that one of them\n", "> will occur rather than the others. In this state of indecision it is\n", "> impossible for us to announce their occurrence with certainty. 
> It is, however, probable that one of these events, chosen at will,
> will not occur because we see several cases equally possible which
> exclude its occurrence, while only a single one favors it.
>
> — Pierre-Simon Laplace (Laplace, 1814), pg 5

The representation of ignorance through probability is the true message
of Laplace. I refer to this message as “Laplace’s gremlin,” because it
is the gremlin of uncertainty that interferes with the demon of
determinism, meaning that our predictions are not deterministic.

Our separation of the uncertainty into the data, the model and the
computation gives us three domains in which our doubts can creep into
our ability to predict. Over the last three lectures we’ve introduced
some of the basic tools we can use to unpick this uncertainty. You’ve
been introduced to (or have reviewed) *Bayes’ rule*. The rule, which is
a simple consequence of the product rule of probability, is the
foundation of how we update our beliefs in the presence of new
information.

The real point of Laplace’s essay was that we don’t have access to all
the data, we don’t have access to a complete physical understanding,
and, as the example of the Game of Life shows, even if we did have
access to both (as we do for “Conway’s universe”) we still don’t have
access to all the compute that we need to make deterministic
predictions. There is uncertainty in the system which means we can’t
make precise predictions.

I like to call this uncertainty “Laplace’s Gremlin.” Gremlins are
imaginary creatures used as an explanation of failure in aircraft,
causing crashes. In that sense the gremlin represents the uncertainty
that a pilot felt about what might go wrong in a plane which might be
“theoretically sound” but in practice is poorly maintained or exposed to
conditions that take it beyond its design criteria. Laplace’s gremlin is
all the things that your model, data and ability to compute don’t
account for, bringing about failures in your ability to predict.
Laplace’s Gremlin is the uncertainty in the system.

Figure: Gremlins are seen as the cause of a number of challenges in
this World War II poster.

## Simulation System

\[edit\]

An example of a complex decision-making system might be a climate
model: in such a system there are separate models for the atmosphere,
the ocean and the land.

The components of these systems include the flow of ocean currents,
chemical interactions in the upper atmosphere, the evaporation of
water, and so on.

Figure: Representation of the carbon cycle from the US National
Oceanic and Atmospheric Administration. While everything is
interconnected in the system, we can decompose it into separate models
for atmosphere, ocean and land.

The influence of human activity also needs to be incorporated and
modelled so we can make judgments about how to mitigate the effects of
global warming.

Figure: The components of a simulation system for climate
modelling.
## Monolithic System

The classical approach to building these systems was a ‘monolithic
system.’ Built in a similar way to successful applications software
such as Excel or Word, or large operating systems, a single code base
was constructed. The complexity of such code bases is considerable,
with the code running to many millions of lines.

In practice, shared dynamically linked libraries may be used for
aspects such as the user interface or networking, but the software
often has many millions of lines of code. For example, the Microsoft
Office suite is said to contain over 30 million lines of code.

Figure: A potential path of models in a machine learning system.

## Service Oriented Architecture

Such software is not only difficult to develop, but also to scale when
computation demands increase. For example, Amazon’s original website
software (called Obidos) was a [monolithic
design](https://en.wikipedia.org/wiki/Obidos_(software)), but by the
early noughties it was becoming difficult to sustain and maintain. The
software was phased out in 2006 to be replaced by a modularized design
known as a ‘service-oriented architecture.’

In service-oriented architecture, or “software as a service,” the idea
is that code bases are modularized and communicate with one another
using network requests. A standard approach is to use a [REST
API](https://en.wikipedia.org/wiki/Representational_state_transfer). So,
rather than a single monolithic code base, the code is developed as
individual services that handle the different requests.

The simulation software is turned inside out to expose the individual
components to the operator.

Figure: A potential path of models in a machine learning system.

This is the landscape we now find ourselves in for software development.
In practice, each of these services is often ‘owned’ and maintained by
an individual team. The team is judged by the quality of their service
provision. They work to detailed specifications on what their service
should output, what its availability should be and other objectives like
speed of response. This allows for conditional independence between
teams and for faster development.

One question is to what extent the same approach is possible, or
desirable, for scientific models. The components we listed above are
already separated and often run independently. But those components
themselves are made up of other sub-components that could also be
exposed in a similar manner to software-as-a-service, giving us the
notion of “simulation as a service.”
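To make the “simulation as a service” idea concrete, a component such
as an ocean model might expose its state through a REST endpoint that
other components query over the network. The sketch below is purely
illustrative: the endpoint URL and parameters are hypothetical, not part
of any real climate modelling service.

```python
import requests

# Hypothetical REST endpoint for an ocean-model component; in a
# service-oriented simulation each component exposes calls like this.
response = requests.get(
    "https://ocean-model.example.com/v1/sea-surface-temperature",
    params={"lat": 52.205, "lon": 0.119, "date": "2021-09-15"},
)
response.raise_for_status()
print(response.json())
```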
One thing about working in an industrial environment is the way that
short-term actions become important. For example, in Formula One, the
teams are working on a two-week cycle to digest information from the
previous week’s race and incorporate updates to the car or their
strategy.

However, businesses must also think about more medium-term horizons.
For example, in Formula 1 you need to worry about next year’s car. So
while you’re working on updating this year’s car, you also need to
think about what will happen next year and prioritize these conflicting
needs appropriately.

In the Amazon supply chain, there are equivalent demands. If we
accept that an artificial intelligence is just an automated
decision-making system, and if we measure in terms of money
automatically spent, or goods automatically moved, then Amazon’s buying
system is perhaps the world’s largest AI.

Those decisions are being made on short time schedules; purchases are
made by the system on weekly cycles. But just as in Formula 1, there is
also a need to think about what needs to be done next month, next
quarter and next year. Planning meetings are held not only on a weekly
basis (known as weekly business reviews), but there are also monthly,
quarterly and yearly meetings for planning spends and investments.

Amazon is known for being longer-term in its thinking than many
companies, and a lot of this comes from the CEO. One quote from Jeff
Bezos that stuck with me was the following.

> “I very frequently get the question: ‘What’s going to change in the
> next 10 years?’ And that is a very interesting question; it’s a very
> common one. I almost never get the question: ‘What’s not going to
> change in the next 10 years?’ And I submit to you that that second
> question is actually the more important of the two – because you can
> build a business strategy around the things that are stable in time. …
> \[I\]n our retail business, we know that customers want low prices,
> and I know that’s going to be true 10 years from now. They want fast
> delivery; they want vast selection. It’s impossible to imagine a
> future 10 years from now where a customer comes up and says, ‘Jeff I
> love Amazon; I just wish the prices were a little higher,’ \[or\] ‘I
> love Amazon; I just wish you’d deliver a little more slowly.’
> Impossible. And so the effort we put into those things, spinning those
> things up, we know the energy we put into it today will still be
> paying off dividends for our customers 10 years from now. When you
> have something that you know is true, even over the long term, you can
> afford to put a lot of energy into it.”

This quote is incredibly important for long-term thinking. Indeed, it’s
a failure of many of our simulations that they focus on what is going to
happen, not on what will not happen. In Amazon, this meant that there
was constant focus on three areas: keeping costs low, making delivery
fast and improving selection. For example, shortly before I left,
Amazon moved its entire US network from two-day delivery to one-day
delivery. This involved changing the way the entire buying system
operates. Or, more recently, the company has had to radically change
the portfolio of products it buys in the face of Covid-19.

Figure: Experiment, analyze and design is a flywheel of knowledge
that is the dual of the model, data and compute. By running through this
spiral, we refine our hypothesis/model and develop new experiments which
can be analyzed to further refine our hypothesis.

From the perspective of the team that we had in the supply chain, we
looked at what we most needed to focus on.
Amazon moves very quickly, but we could also take a leaf out of Jeff’s
book, and instead of worrying about what was going to change, remember
what wasn’t going to change.

> We don’t know what science we’ll want to do in five years’ time, but
> we won’t want slower experiments, we won’t want more expensive
> experiments and we won’t want a narrower selection of experiments.

As a result, our focus was on how to speed up the process of
experiments, increase the diversity of experiments that we can do, and
keep the price of experiments as low as possible.

The faster the innovation flywheel can be iterated, the quicker we
can ask about different parts of the supply chain, and the better we can
tailor systems to answering those questions.

We need faster, cheaper and more diverse experiments, which implies we
need better ecosystems for experimentation. This has led us to focus on
the software frameworks we’re using to develop machine learning systems,
including data-oriented architectures (Borchert, 2020;
(**Lawrence-doa19?**); Vorhemus and Schikuta, 2017; Joshi, 2007), data
maturity assessments (Lawrence et al., 2020) and data readiness levels
(see this blog post on [Data Readiness
Levels](http://inverseprobability.com/2017/01/12/data-readiness-levels),
Lawrence, 2017, and The DELVE Initiative, 2020).

## Statistical Emulation

\[edit\]

Figure: The UK Met Office runs a shared code base for its simulations
of climate and the weather. This plot shows the different spatial and
temporal scales used.

In many real-world systems, decisions are made through simulating the
environment. Simulations may operate at different granularities. For
example, simulations are used in weather forecasts and climate
forecasts. Interestingly, the UK Met Office uses the same code for both;
it has a [“Unified Model”
approach](https://www.metoffice.gov.uk/research/approach/modelling-systems/unified-model/index),
but the climate simulations are run at different spatial and temporal
scales from the weather simulations.

Figure: Real-world systems consist of simulators that capture our
domain knowledge about how our systems operate. Different simulators run
at different speeds and granularities.

Figure: A statistical emulator is a system that reconstructs the
simulation with a statistical model.

A statistical emulator is a data-driven model that learns about the
underlying simulation. Importantly, it learns with uncertainty, so it
‘knows what it doesn’t know.’ In practice, we can call the emulator in
place of the simulator. If the emulator ‘doesn’t know,’ it can call the
simulator for the answer.

Figure: A statistical emulator is a system that reconstructs the
simulation with a statistical model. As well as reconstructing the
simulation, a statistical emulator can be used to correlate with the
real world.

As well as reconstructing an individual simulator, the emulator can
calibrate the simulation to the real world, by monitoring differences
between the simulator and real data.
This allows the emulator to characterize where the simulation can be
relied on, i.e., we can validate the simulator.

Similarly, the emulator can adjudicate between simulations. This is
known as *multi-fidelity emulation*. The emulator characterizes which
simulations perform well where.

If all this modelling is done with judicious handling of the
uncertainty, the *computational doubt*, then the emulator can assist in
deciding what experiment should be run next to aid a decision: should
we run a simulator, and if so which one, or should we attempt to
acquire data from a real-world intervention?

```python
%pip install gpy
```

## GPy and Emulation

\[edit\]

Let $\mathbf{x}$ be a random variable defined over the real numbers,
$\Re$, and $f(\cdot)$ be a function mapping between the real numbers,
$\Re \rightarrow \Re$.

The problem of *uncertainty propagation* is the study of the
distribution of the random variable $f(\mathbf{x})$.

We’re going to address this problem using emulation and GPy. We will see
in this section the advantage of using a model when only a few
observations of $f$ are available.

Firstly, we’ll make use of a test function known as the Branin test
function, $$
f(\mathbf{x}) = a(x_2 - bx_1^2 + cx_1 - r)^2 + s(1-t)\cos(x_1) + s,
$$ where we are setting $a=1$, $b=5.1/(4\pi^2)$, $c=5/\pi$, $r=6$,
$s=10$ and $t=1/(8\pi)$.

```python
import numpy as np
```

```python
def branin(X):
    # Branin function with a=1, b=5.1/(4 pi^2), c=5/pi, r=6, s=10, t=1/(8 pi)
    y = ((X[:,1]-5.1/(4*np.pi**2)*X[:,0]**2+5*X[:,0]/np.pi-6)**2 
         + 10*(1-1/(8*np.pi))*np.cos(X[:,0])+10)
    return(y)
```

We’ll define a training set of twenty-five observations, laid out as a
$5 \times 5$ grid over $[-5, 10] \times [0, 15]$.

```python
# Training set defined as a 5x5 grid:
xg1 = np.linspace(-5,10,5)
xg2 = np.linspace(0,15,5)
X = np.zeros((xg1.size * xg2.size,2))
for i,x1 in enumerate(xg1):
    for j,x2 in enumerate(xg2):
        X[i+xg1.size*j,:] = [x1,x2]

Y = branin(X)[:,np.newaxis]
```

The task here will be to consider the distribution of $f(U)$, where $U$
is a random variable with uniform distribution over the input space of
$f$. We focus on the computation of two quantities: the expectation of
$f(U)$, $\left\langle f(U)\right\rangle$, and the probability that the
value is greater than 200.

## Computation of $\left\langle f(U)\right\rangle$

The expectation of $f(U)$ is given by
$\int_{\mathbf{x}} f(\mathbf{x}) p(\mathbf{x}) \text{d}\mathbf{x}$,
where $p(\mathbf{x})$ is the uniform density over the input domain. A
basic approach to approximate this integral is to compute the mean of
the 25 observations: `np.mean(Y)`. Since the points are distributed on
a grid, this can be seen as the approximation of the integral by a
rough Riemann sum.
```python
print('Estimate of the expectation is given by: {mean}'.format(mean=Y.mean()))
```

The result can be compared with the actual mean of the Branin function,
which is 54.31.

Alternatively, we can fit a GP model and compute the integral of the
best predictor by Monte Carlo sampling.

Firstly, we create the covariance function. Here we’re going to use an
exponentiated quadratic, but we’ll augment it with the ‘bias’ covariance
function. This covariance function represents a single fixed bias that
is added to the overall covariance. It allows us to deal with
non-zero-mean emulations.

```python
import GPy
```

```python
# Create an exponentiated quadratic plus bias covariance function
kern_eq = GPy.kern.RBF(input_dim=2, ARD = True)
kern_bias = GPy.kern.Bias(input_dim=2)
kern = kern_eq + kern_bias
```

Now we construct the Gaussian process regression model in GPy.

```python
# Build a GP model
model = GPy.models.GPRegression(X,Y,kern)
```

In the sinusoid example above, we learnt the variance of the process.
But in this example, we are fitting an emulator to a function we know is
noise-free. However, we don’t fix the noise value to precisely zero, as
this can lead to some numerical errors. Instead, we fix the variance of
the Gaussian noise to a very small value.

```python
# Fix the noise variance to a small value to avoid numerical problems
model.likelihood.variance.fix(1e-5)
```

Now we fit the model. Note that the initial values for the length scale
are not appropriate, so first the length scale of the model needs to be
reset.

```python
kern.rbf.lengthscale = np.asarray([3, 3])
```

It’s a common error in Gaussian process fitting to initialize the length
scale too small or too big. The challenge is that the error surface is
normally multimodal, and the final solution can be very sensitive to
this initialization. If the length scale is initialized too small, the
solution can converge on a place where the signal isn’t extracted by
the covariance function. If the length scale is initialized too large,
then the variations of the function are often missed. Here the length
scale is set to 3 for each input dimension. Now that’s done, we can
optimize the model.
```python
# Optimize the model hyperparameters
model.optimize(messages=True)
```

```python
import matplotlib.pyplot as plt
import mlai.plot as plot
import mlai
```

```python
fig, ax = plt.subplots(figsize=plot.big_wide_figsize)
model.plot(ax=ax)

mlai.write_figure('branin-gp-optimized-fit.svg', directory='./gp')
```

Figure: A Gaussian process fit to the Branin test function, used to
assess the mean of the function by emulation.

Finally, we can compute the mean of the model predictions using very
many Monte Carlo samples.

Note that in this example, because we’re using a test function, we
could simply have done the Monte Carlo estimation directly on the Branin
function. However, imagine instead that we were trying to understand the
results of a complex computational fluid dynamics simulation, where each
run of the simulation (which is equivalent to our test function) took
many hours. In that case the advantage of the emulator is clear.

```python
# Compute the mean of model prediction on 1e5 Monte Carlo samples
Xp = np.random.uniform(size=(int(1e5),2))
Xp[:,0] = Xp[:,0]*15-5
Xp[:,1] = Xp[:,1]*15
mu, var = model.predict(Xp)
print('The estimate of the mean of the Branin function is {mean}'.format(mean=np.mean(mu)))
```

### Exercise 2

Now think about how to make use of the variance estimation from the
Gaussian process to obtain error bars around your estimate.

```python
# Write your answer to Exercise 2 here
```

### Exercise 3

You’ve seen how the Monte Carlo estimates work with the Gaussian
process. Now make your estimate of the probability that the Branin
function is greater than 200 with the uniform random inputs.

```python
# Write your answer to Exercise 3 here
```

## Uncertainty Quantification

We’re introducing you to the optimization and analysis of real-world
models through emulation; this domain is part of a broader field known
as surrogate modelling.

Although we’re approaching this from the machine learning perspective,
with a computer scientist’s approach, you won’t be surprised to find out
that this field is not new and there are a range of research groups
interested in this domain.

The challenge of where to run the simulation to get the answer you
require is an old one. One classic paper, McKay et al. (1979),
reviews three different methods for designing these inputs. They are
*random sampling*, *stratified sampling* and *Latin hypercube sampling*.

> Let the input values $\mathbf{x}_1, \dots, \mathbf{x}_n$ be a random
> sample from $f(\mathbf{x})$.
> This method of sampling is perhaps the
> most obvious, and an entire body of statistical literature may be used
> in making inferences regarding the distribution of $Y(t)$.

> Using stratified sampling, all areas of the sample space of
> $\mathbf{x}$ are represented by input values. Let the sample space
> $S$ of $\mathbf{x}$ be partitioned into $I$ disjoint strata $S_i$.
> Let $\pi_i = P(\mathbf{x}\in S_i)$ represent the size of $S_i$. Obtain
> a random sample $\mathbf{x}_{ij}$, $j = 1, \dots, n_i$ from $S_i$. Then
> of course the $n_i$ sum to $n$. If $I = 1$, we have random sampling
> over the entire sample space.

> The same reasoning that led to stratified sampling, ensuring that all
> portions of $S$ were sampled, could lead further. If we wish to ensure
> also that each of the input variables $\mathbf{x}_k$ has all portions
> of its distribution represented by input values, we can divide the
> range of each $\mathbf{x}_k$ into $n$ strata of equal marginal
> probability $1/n$, and sample once from each stratum. Let this sample
> be $\mathbf{x}_{kj}$, $j = 1, \dots, n$. These form the
> $\mathbf{x}_k$ component, $k = 1, \dots , K$, in $\mathbf{x}_i$,
> $i = 1, \dots, n$. The components of the various $\mathbf{x}_k$’s are
> matched at random. This method of selecting input values is an
> extension of quota sampling (Steinberg 1963), and can be viewed as a
> $K$-dimensional extension of Latin square sampling (Raj 1968).

The paper’s rather dated reference to “Output from a Computer Code” does
carry forward through this literature, which has continued to be a focus
of interest for statisticians. [Tony
O’Hagan](http://www.tonyohagan.co.uk/academic/), who was a colleague in
Sheffield but is also one of the pioneers of Gaussian process models,
was developing these methods when I first arrived there (Kennedy and
O’Hagan, 2001), and continued with a large EPSRC-funded project for
managing uncertainty in computational models.
You can see a list of [their technical reports
here](http://www.mucm.ac.uk/Pages/Dissemination/TechnicalReports.html).

Another important group, based in France, is the “MASCOT-NUM Research
Group.” These researchers bring together statisticians, applied
mathematicians and engineers in solving these problems.
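The Latin hypercube construction described in the quote above is
straightforward to implement. Here is a minimal NumPy sketch (my own
illustration, not code from McKay et al.): each dimension is split into
$n$ equal-probability strata, one sample is drawn from each stratum,
and the strata are matched at random across dimensions.

```python
import numpy as np

def latin_hypercube(n, k, rng=None):
    """Draw n points in [0, 1]^k with one sample in each of the
    n equal-probability strata of every dimension."""
    rng = np.random.default_rng() if rng is None else rng
    # One point per stratum, jittered uniformly within the stratum.
    samples = (np.arange(n)[:, None] + rng.uniform(size=(n, k))) / n
    # Match the strata at random across the k dimensions.
    for j in range(k):
        rng.shuffle(samples[:, j])
    return samples

# For example, a 25-point design on the Branin domain [-5, 10] x [0, 15].
X_design = latin_hypercube(25, 2)
X_design[:, 0] = X_design[:, 0] * 15 - 5
X_design[:, 1] = X_design[:, 1] * 15
```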
# Emukit

\[edit\]

The Emukit software we will be using across the next part of this module
is a Python software library that facilitates emulation of systems. The
software’s origins go back to work done by [Javier
Gonzalez](https://javiergonzalezh.github.io/) as part of his
post-doctoral project at the University of Sheffield, where Javier led
the design and build of a Bayesian optimization package. That package,
`GPyOpt`, worked with the SheffieldML software GPy for performing
Bayesian optimization.

`GPyOpt` has a modular design that allows the user to provide their own
surrogate models. The package is built with `GPy` as the surrogate model
in mind, but other surrogate models can also be wrapped and integrated.

However, `GPyOpt` doesn’t allow the full flexibility of surrogate
modelling for domains like experimental design, sensitivity analysis
etc.

Emukit was designed and built for a more general approach. The software
is MIT licensed and its design and implementation was led by Javier
Gonzalez and [Andrei Paleyes](https://www.linkedin.com/in/andreipaleyes)
at Amazon. Building on the experience of `GPyOpt`, the aim with Emukit
was to use the modularisation ideas embedded in `GPyOpt`, but to extend
them beyond the modularisation of the surrogate models to modularisation
of the acquisition function.

Figure: The Emukit software is a set of software tools for emulation
and surrogate modeling.

```python
%pip install pyDOE
```

```python
%pip install emukit
```

The software was initially built by the team in Amazon. As well as
Javier Gonzalez (machine learning side) and Andrei Paleyes (software
engineering), the team included Mark Pullin, Maren Mahsereci, Alex
Gessner, Aaron Klein, Henry Moss and David-Elias Künstle, as well as
management input from Cliff McCollum and myself.

# Emukit Vision

\[edit\]

## Preface about emulation

We see emulation as comprising three main parts:

**Models**. This is a probabilistic, data-driven representation of the
process/simulator that the user is working with. There is normally a
modelling framework that is used to create a model. Examples: neural
network, Gaussian process, random forest.

**Methods**. Relatively low-level techniques that are aimed at either
understanding, quantifying or using the uncertainty that the model
provides. Examples: Bayesian optimization, experimental design.

**Tasks**. High-level goals that owners of the process/simulator might
actually be interested in. Examples: measure quality of a simulator,
explain complex system behavior.

The typical workflow that we envision for a user interested in emulation
is:

1. Figure out which questions/tasks are important for them in regard to
   their process/simulation.

2. Understand which emulation techniques are needed to accomplish the
   chosen task.

3. Build an emulator of the process.
   That can be a very involved step that may include a lot of fine
   tuning and validation.

4. Feed the emulator to the chosen technique and use it to answer the
   question/complete the task.

## Emukit and Emulation

Figure: The Emukit approach to the three parts of emulation.

## Methods

This is the main focus of Emukit. Emukit defines a general structure for
a decision-making method, called `OuterLoop`, and then offers
implementations of a few such methods: Bayesian optimization and
experimental design. In addition to providing a framework for
decision-making, Emukit provides other tools, like sensitivity analysis,
that help to debug and interpret emulators. All methods in Emukit are
model-agnostic.

## Models

Generally speaking, Emukit does not provide modelling capabilities,
instead expecting users to bring their own models. Because of the
variety of modelling frameworks out there, Emukit does not mandate or
make any assumptions about a particular modelling technique or library.
Instead it asks the user to implement a subset of defined model
interfaces that are required to use a particular method. Nevertheless,
there are a few model-related functionalities in Emukit:

- Example models, which give users something to play with to explore
  Emukit.
- Model wrappers, which are designed to help adapt models in particular
  modelling frameworks to Emukit interfaces.
- Multi-fidelity models, implemented based on GPy.

## Tasks

Emukit does not contribute much to this part at the moment. However, the
Emukit team is on the lookout for typical use cases for Emukit, and if a
recurring pattern emerges, it may become a part of the library.

    while stopping condition is not met:
        optimize acquisition function
        evaluate user function
        update model with new observation

Emukit is built in a modular way so that each component in this loop can
be swapped out. This means that scientists, applied mathematicians,
machine learning practitioners and statisticians can swap out the
relevant part of their method and build on the underlying structure. You
just need to pick out the part that requires implementation.

## Loop

The `emukit.core.loop.OuterLoop` class is the abstract loop where the
different components come together. There are more specific loops for
Bayesian optimization and experimental design that construct some of the
component parts for you.

## Model

All `Emukit` loops need a probabilistic model of the underlying system.
Emukit does not provide functionality to build models as there are
already many good modelling frameworks available in Python. Instead, we
provide a way of interfacing third-party modelling libraries with
Emukit. We already provide a wrapper for using a model created with
`GPy`. For instructions on how to include your own model please [see
this
notebook](https://emukit.readthedocs.io/en/latest/notebooks/Emukit-tutorial-custom-model.html).

Different models and modelling frameworks will provide different
functionality. For instance, a Gaussian process will usually have
derivatives of the predictions available, but a random forest will not.
These different functionalities are represented by a set of interfaces
which a model implements. The basic interface that all models must
implement is `IModel`, which implements functionality to make
predictions and update the model, but a model may implement any number
of other interfaces, such as `IDifferentiable`, which indicates that a
model has prediction derivatives available. A sketch of a custom
wrapper is given below.
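As an illustration, here is a minimal sketch of wrapping a scikit-learn
Gaussian process in the `IModel` interface. The method names follow the
interface as documented in the Emukit custom-model tutorial; treat the
details as an assumption to check against the notebook linked above.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from emukit.core.interfaces import IModel

class SklearnGPModel(IModel):
    """Wrap a scikit-learn GP so that Emukit loops can use it."""
    def __init__(self, X, Y):
        self.gp = GaussianProcessRegressor()
        self._X, self._Y = X, Y
        self.gp.fit(X, Y)

    def predict(self, X):
        # Emukit expects a (mean, variance) pair, each of shape (n, 1).
        mean, std = self.gp.predict(X, return_std=True)
        return mean.reshape(-1, 1), (std ** 2).reshape(-1, 1)

    def set_data(self, X, Y):
        self._X, self._Y = X, Y
        self.gp.fit(X, Y)

    def optimize(self):
        pass  # hyperparameters are refitted inside set_data

    @property
    def X(self):
        return self._X

    @property
    def Y(self):
        return self._Y
```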
## Candidate Point Calculator

This class decides which point to evaluate next. The simplest
implementation, `SequentialPointCalculator`, collects one point at a
time by finding where the acquisition is at a maximum, applying the
acquisition optimizer to the acquisition function. More complex
implementations will enable batches of points to be collected so that
the user function can be evaluated in parallel.

## Acquisition

The acquisition is a heuristic quantification of how valuable collecting
a future point might be. It is used by the candidate point calculator to
decide which point(s) to collect next.

## Acquisition Optimizer

The `AcquisitionOptimizer` optimizes the acquisition function to find
the point at which the acquisition is a maximum. This will use the
acquisition function gradients if they are available. If gradients of
the acquisition function are not available, it will either estimate them
numerically or use a gradient-free optimizer.

## User Function

This is the function that we are trying to reason about. It can be
either evaluated by the user or it can be passed into the loop and
evaluated by Emukit.

## Model Updater

The `ModelUpdater` class updates the model with new training data after
a new point is observed and optimizes any hyper-parameters of the model.
It can decide whether hyper-parameters need updating based on some
internal logic.

## Stopping Condition

The `StoppingCondition` class chooses when we should stop collecting
points. The most commonly used example is to stop when a set number of
iterations have been reached.
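These components slot together into the outer loop. The following
sketch assembles them by hand for an experimental-design loop on a toy
sine function; the class names are taken from the Emukit code base, but
the module layout is an assumption to check against the current
documentation.

```python
import numpy as np
from emukit.core import ContinuousParameter, ParameterSpace
from emukit.core.loop import (FixedIntervalUpdater, OuterLoop,
                              SequentialPointCalculator, UserFunctionWrapper)
from emukit.core.loop.loop_state import create_loop_state
from emukit.core.loop.stopping_conditions import FixedIterationsStoppingCondition
from emukit.core.optimization import GradientAcquisitionOptimizer
from emukit.experimental_design.acquisitions import ModelVariance
from emukit.model_wrappers import SimpleGaussianProcessModel

# Initial design and model for the toy problem.
x = np.random.uniform(-30.0, 30.0, (10, 1))
y = np.sin(x)
model = SimpleGaussianProcessModel(x, y)

space = ParameterSpace([ContinuousParameter('x', -30.0, 30.0)])
acquisition = ModelVariance(model)               # where is the model most uncertain?
optimizer = GradientAcquisitionOptimizer(space)  # maximizes the acquisition
point_calculator = SequentialPointCalculator(acquisition, optimizer)
model_updater = FixedIntervalUpdater(model)      # refit after every observation

loop = OuterLoop(point_calculator, model_updater, create_loop_state(x, y))
loop.run_loop(UserFunctionWrapper(np.sin),
              FixedIterationsStoppingCondition(10))
```

The more specific `ExperimentalDesignLoop` used below constructs these
same parts for you.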
## Emukit Playground

\[edit\]

Emukit playground is a software toolkit for exploring the use of
statistical emulation as a tool. It was built by [Leah
Hirst](https://www.linkedin.com/in/leahhirst/), during her software
engineering internship at Amazon, supervised by [Cliff
McCollum](https://www.linkedin.com/in/cliffmccollum/).

Figure: Emukit playground is a tutorial for understanding the
simulation/emulation relationship.

Figure: Tutorial on Bayesian optimization of the number of taxis
deployed, from Emukit playground.

You can explore Bayesian optimization of a taxi simulation.

## Emukit Tutorial

\[edit\]

```python
%pip install mlai
```

Set up the Python imports that Emukit will use.

```python
import numpy as np
import GPy
```

Now set up Emukit to run.

```python
from emukit.experimental_design.experimental_design_loop import ExperimentalDesignLoop
```

Let’s check the help function for the experimental design loop. This is
the outer loop that provides all the decision-making parts of Emukit.

```python
ExperimentalDesignLoop?
```

Now let’s load in the model wrapper for our probabilistic model. In this
case, instead of using GPy, we’ll make use of a simple model wrapper
Emukit provides for a basic form of Gaussian process.

```python
from emukit.model_wrappers import SimpleGaussianProcessModel
```

Let’s have a quick look at how the included GP model works.

```python
SimpleGaussianProcessModel?
```

Now let’s create the data.

```python
x_min = -30.0
x_max = 30.0

x = np.random.uniform(x_min, x_max, (10, 1))
y = np.sin(x) + np.random.randn(10, 1) * 0.05
```

To learn about how to include your own model in Emukit, check [this
notebook](https://github.com/EmuKit/emukit/blob/master/notebooks/Emukit-tutorial-custom-model.ipynb),
which shows how to include a `sklearn` GP model.
```python
emukit_model = SimpleGaussianProcessModel(x, y)
```

```python
from emukit.core import ParameterSpace, ContinuousParameter
from emukit.core.loop import UserFunctionWrapper
```

```python
p = ContinuousParameter('c', x_min, x_max)
space = ParameterSpace([p])
```

```python
loop = ExperimentalDesignLoop(space, emukit_model)
loop.run_loop(np.sin, 30)
```

```python
plot_min = -40.0
plot_max = 40.0

real_x = np.arange(plot_min, plot_max, 0.2)
real_y = np.sin(real_x)
```

```python
import matplotlib.pyplot as plt
import mlai.plot as plot
import mlai
```

```python
fig, ax = plt.subplots(figsize=plot.big_wide_figsize)

ax.plot(real_x, real_y, c='r', linewidth=2)
ax.scatter(loop.loop_state.X[:, 0].tolist(), 
           loop.loop_state.Y[:, 0].tolist(), s=50)

ax.set_xlabel('$x$')
ax.set_ylabel('$y$', rotation=None)
ax.set_ylim([-2.5, 2.5])

ax.legend(['true function', 'acquired data points'], loc='lower right')

mlai.write_figure('emukit-sine-function.svg', directory='./uq')
```

Figure: Experimental design in Emukit using the
`ExperimentalDesignLoop`: learning the function $\sin(x)$ with Emukit.

Compute the predictions from the Emukit model.

```python
predicted_y = []
predicted_std = []
# Use distinct names so the training data (x, y) isn't overwritten.
for x_test in real_x:
    y_mean, var = emukit_model.predict(np.array([[x_test]]))
    std = np.sqrt(var)
    predicted_y.append(y_mean)
    predicted_std.append(std)

predicted_y = np.array(predicted_y).flatten()
predicted_std = np.array(predicted_std).flatten()
```

```python
fig, ax = plt.subplots(figsize=plot.big_wide_figsize)

ax.plot(real_x, predicted_y, linewidth=3)
ax.plot(real_x, real_y, c='r', linewidth=2)

ax.set_ylim([-2.5, 2.5])

ax.fill_between(real_x, predicted_y - 2 * predicted_std, 
                predicted_y + 2 * predicted_std, alpha=.5)

ax.set_xlabel('$x$')
ax.set_ylabel('$y$')
# Labels follow plotting order: the estimate is drawn first.
ax.legend(['estimated function', 'true function'], loc='lower right')

mlai.write_figure('emukit-sine-function-fit.svg', directory='./uq')
```

Figure: The fit to the sine function after running the Emukit
`ExperimentalDesignLoop`.

### Exercise 4

Repeat the above experiment but using the Gaussian process model from
`sklearn`. You can see step-by-step instructions on how to do this in
[this
notebook](https://github.com/EmuKit/emukit/blob/master/notebooks/Emukit-tutorial-custom-model.ipynb).
```python
# Write your answer to Exercise 4 here
```

## Emukit Overview Summary

The aim is to provide a suite where different approaches to emulation
are assimilated under one roof. The current version of Emukit includes
*multi-fidelity emulation* for building surrogate models when data is
obtained from multiple information sources that have different fidelity
and/or cost; *Bayesian optimisation* for optimising physical experiments
and tuning parameters of machine learning algorithms or other
computational simulations; *experimental design and active learning* for
designing the most informative experiments and performing active
learning with machine learning models; *sensitivity analysis* for
analysing the influence of inputs on the outputs of a given system; and
*Bayesian quadrature* for efficiently computing the integrals of
functions that are expensive to evaluate. But it’s easy to extend.

## Emukit Sensitivity Analysis

\[edit\]

This introduction is based on [Introduction to Global Sensitivity
Analysis with
Emukit](https://github.com/EmuKit/emukit/blob/master/notebooks/Emukit-tutorial-sensitivity-montecarlo.ipynb),
written by Mark Pullin, Javier Gonzalez, Juan Emmanuel Johnson and
Andrei Paleyes. Some references include (Kennedy and O’Hagan, 2000;
Saltelli et al., 2010, 2008, 2004; Sobol, 2001, 1990).

> A possible definition of sensitivity analysis is the following: The
> study of how uncertainty in the output of a model (numerical or
> otherwise) can be apportioned to different sources of uncertainty in
> the model input (Saltelli et al., 2004). A related practice is
> ‘uncertainty analysis,’ which focuses rather on quantifying
> uncertainty in model output. Ideally, uncertainty and sensitivity
> analyses should be run in tandem, with uncertainty analysis preceding
> in current practice.
>
> — Saltelli et al. (2008), Chapter 1

```python
import numpy as np
import matplotlib.pyplot as plt

from matplotlib import colors as mcolors
from matplotlib import cm
```

```python
%pip install pyDOE
```

```python
import mlai
import mlai.plot as plot
```

Sensitivity analysis is a statistical technique widely used to test the
reliability of real systems. Imagine a simulator of taxis picking up
customers in a city, like the one shown in the [Emukit
playground](https://github.com/amzn/emukit-playground). The profit of
the taxi company depends on factors like the number of taxis on the road
and the price per trip. In this example, a global sensitivity analysis
of the simulator could be useful to decompose the variance of the profit
in a way that can be assigned to the input variables of the simulator.

There are different ways of doing a sensitivity analysis of the
variables of a simulator.
In this notebook we will start with an\n", "approach based on Monte Carlo sampling that is useful when evaluating\n", "the simulator is cheap. If evaluating the simulator is expensive,\n", "emulators can then be used to speed up computations. We will show this\n", "in the last part of the notebook. Next, we start with a few formal\n", "definitions and a brief literature review so we can understand the\n", "basics of sensitivity analysis and how it can be performed with Emukit." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Local Sensitivity\n", "\n", "Given any function, $g(\\cdot)$, we might be interested in how sensitive\n", "that function is to variations in its input space. One route to\n", "determining this is to compute the partial derivatives of that function\n", "with respect to its inputs,\n", "$$\n", "\\frac{\\partial}{\\partial x_i} g(\\mathbf{ x}).\n", "$$\n", "The matrix of all these partial derivatives is known as the Jacobian.\n", "\n", "Local sensitivity analyses like this can be used for determining\n", "the effect of changing an input variable around an operating point. But\n", "they don’t give us an understanding of the response of the target\n", "function to variations in the input across the domain of inputs. For\n", "this, we need to look to *global sensitivity analysis*." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Global Sensitivity Analysis\n", "\n", "In global sensitivity analysis, rather than looking around a single\n", "operating point, we’re interested in the overall sensitivity of a\n", "function to its inputs, or combinations of inputs, across its entire\n", "domain. The key tool in determining this sensitivity is known as the\n", "ANOVA decomposition, or the *Hoeffding-Sobol decomposition*.\n", "\n", "For global sensitivity analysis, we need to make an assumption about how\n", "inputs are going to vary to create different values of the function. The\n", "fundamental object we’re interested in is the total variance of the\n", "function,\n", "$$\n", "\\text{var}\\left(g(\\mathbf{ x})\\right) = \\left\\langle g(\\mathbf{ x})^2 \\right\\rangle _{p(\\mathbf{ x})} - \\left\\langle g(\\mathbf{ x}) \\right\\rangle _{p(\\mathbf{ x})}^2,\n", "$$\n", "where\n", "$$\n", "\\left\\langle h(\\mathbf{ x}) \\right\\rangle _{p(\\mathbf{ x})} = \\int_\\mathbf{ x}h(\\mathbf{ x}) p(\\mathbf{ x}) \\text{d}\\mathbf{ x}\n", "$$\n", "is the expectation of the function $h(\\mathbf{ x})$ under the density\n", "$p(\\mathbf{ x})$, which represents the probability distribution of\n", "inputs we’re interested in.\n", "\n", "The total variance of the function gives us the overall variation of the\n", "function across the domain of inputs, as represented by the probability\n", "density, $p(\\mathbf{ x})$. Normally, we perform analysis by assuming\n", "that\n", "$$\n", "p(\\mathbf{ x}) = \\prod_{i=1}^pp(x_i)\n", "$$\n", "and that each $p(x_i)$ is *uniformly distributed* across its input\n", "domain. Assuming we scale the input domain down to the interval\n", "$[0, 1]$, that gives us\n", "$$\n", "x_i \\sim \\mathcal{U}\\left(0,1\\right).\n", "$$"
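 ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As a concrete illustration of these definitions, the following sketch\n", "(an addition to the original notes, using a toy function) estimates the\n", "total variance by Monte Carlo under independent uniform inputs on\n", "$[0, 1]$." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "# Toy function of three inputs, each uniform on [0, 1].\n", "def g(x):\n", "    return np.sin(2 * np.pi * x[:, 0]) + 0.5 * x[:, 1]**2 + 0.1 * x[:, 2]\n", "\n", "# Monte Carlo estimate of var(g(x)) = <g^2> - <g>^2.\n", "x_samples = np.random.uniform(0, 1, (100000, 3))\n", "g_values = g(x_samples)\n", "print(np.mean(g_values**2) - np.mean(g_values)**2)"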
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Hoeffding-Sobol Decomposition\n", "\n", "The Hoeffding-Sobol, or ANOVA, decomposition of a function allows us to\n", "write it as\n", "$$\n", "\\begin{align*}\n", "g(\\mathbf{ x}) = & g_0 + \\sum_{i=1}^pg_i(x_i) + \\sum_{i<j} g_{i,j}(x_i, x_j) + \\dots \\\\\n", "& + g_{1,2,\\dots,p}(x_1, x_2, \\dots, x_p),\n", "\\end{align*}\n", "$$\n", "where $g_0$ is a constant and each of the other components has zero\n", "expectation under $p(\\mathbf{ x})$. The components are mutually\n", "orthogonal, so the total variance decomposes into a sum of the variances\n", "of the individual components,\n", "$$\n", "\\text{var}\\left(g(\\mathbf{ x})\\right) = \\sum_{i=1}^p\\text{var}\\left(g_i(x_i)\\right) + \\sum_{i<j} \\text{var}\\left(g_{i,j}(x_i, x_j)\\right) + \\dots,\n", "$$\n", "and it is the relative sizes of these variance components that the\n", "Sobol indices measure." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Ishigami Function\n", "\n", "\\[edit\\]\n", "\n", "The Ishigami function (Ishigami and Homma, 1989) is a well-known test\n", "function for uncertainty and sensitivity analysis methods because of its\n", "strong nonlinearity and peculiar dependence on $x_3$. More details of\n", "this function can be found in (Sobol and Levitan, 1999).\n", "\n", "Mathematically, the form of the Ishigami function is\n", "$$\n", "g(\\textbf{x}) = \\sin(x_1) + a \\sin^2(x_2) + b x_3^4 \\sin(x_1).\n", "$$\n", "We will set the parameters to be $a = 5$ and $b=0.1$. The input\n", "variables are sampled randomly\n", "$x_i \\sim \\mathcal{U}\\left(-\\pi,\\pi\\right)$.\n", "\n", "Next, we create the function object and visualize its shape marginally\n", "for each one of its three inputs.\n", "\n", "Load the Ishigami function." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from emukit.test_functions.sensitivity import Ishigami" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ishigami = Ishigami(a=5, b=0.1)\n", "target_function = ishigami.fidelity1" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "That gives us the target function; next we define the input space for\n", "the simulator." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from emukit.core import ContinuousParameter, ParameterSpace" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "variable_domain = (-np.pi, np.pi)\n", " \n", "space = ParameterSpace(\n", "    [ContinuousParameter('x1', *variable_domain), \n", "     ContinuousParameter('x2', *variable_domain),\n", "     ContinuousParameter('x3', *variable_domain)])" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Before moving to any further analysis, we first plot the non-zero\n", "components of $g(\\mathbf{ x})$. 
These components are\n", "$$\n", "\\begin{align*}\n", "g_1(x_1) & = \\sin(x_1) \\\\\n", "g_2(x_2) & = a \\sin^2 (x_2) \\\\\n", "g_{13}(x_1,x_3) & = b x_3^4 \\sin(x_1)\n", "\\end{align*}\n", "$$" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x_grid = np.linspace(*variable_domain, 100)\n", "target_simulator = ishigami.fidelity1\n", "f1 = ishigami.f1(x_grid)\n", "f2 = ishigami.f2(x_grid)\n", "\n", "# Evaluate the interaction component on the full grid so that the\n", "# surface plot below receives a two-dimensional array.\n", "X13, Y13 = np.meshgrid(x_grid, x_grid)\n", "F13 = ishigami.f13(np.vstack([X13.ravel(), Y13.ravel()]).T).reshape(X13.shape)" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from mpl_toolkits.mplot3d import Axes3D" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, axs = plt.subplots(2, 2, figsize=plot.big_wide_figsize)\n", "gs = axs[1, 1].get_gridspec()\n", "for ax in axs[1, 0:]:\n", "    ax.remove()\n", "\n", "ax2 = fig.add_subplot(gs[1, 0:], projection='3d')\n", "\n", "axs[0,0].plot(x_grid, f1, '-r')\n", "axs[0,0].set_xlabel('$x_1$')\n", "axs[0,0].set_ylabel('$f_1$')\n", "\n", "axs[0,1].plot(x_grid, f2, '-r')\n", "axs[0,1].set_xlabel('$x_2$')\n", "axs[0,1].set_ylabel('$f_2$')\n", "\n", "X, Y = np.meshgrid(x_grid, x_grid)\n", "surf = ax2.plot_surface(X, Y, F13, cmap=cm.coolwarm, linewidth=0, antialiased=False)\n", "\n", "ax2.set_xlabel('$x_1$')\n", "ax2.set_ylabel('$x_3$')\n", "ax2.set_zlabel('$f_{13}$')\n", "\n", "mlai.write_figure(filename='non-zero-sobol-ishigami.svg', directory='./uq')" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: The non-zero components of the Ishigami function." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Total Variance\n", "\n", "The total variance $\\text{var}(y)$ in this example is" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(ishigami.variance_total)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "which is the sum of the variances $\\text{var}\\left(g_1(x_1)\\right)$,\n", "$\\text{var}\\left(g_2(x_2)\\right)$ and\n", "$\\text{var}\\left(g_{13}(x_1, x_3)\\right)$." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(ishigami.variance_x1, ishigami.variance_x2, ishigami.variance_x13)\n", "print(ishigami.variance_x1 + ishigami.variance_x2 + ishigami.variance_x13)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## First Order Sobol Indices using Monte Carlo\n", "\n", "The first order Sobol indices are a measure of “first order sensitivity”\n", "of each input variable. They account for the proportion of variance of\n", "$y$ explained by changing each variable alone while marginalizing over\n", "the rest. Recall that the Sobol index of the $i$th variable is computed\n", "as\n", "$$\n", "S_i = \\frac{\\text{var}\\left(g_i(x_i)\\right)}{\\text{var}\\left(g(\\mathbf{ x})\\right)}.\n", "$$\n", "This value is standardized using the total variance, so it is\n", "possible to account for the fractional contribution of each variable to\n", "the total variance of the output.\n", "\n", "The Sobol indices for higher order interactions $S_{i,j}$ are computed\n", "similarly. Due to the normalization by the total variance, the sum of\n", "all Sobol indices equals one."
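 ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since the component variances of the Ishigami function were printed\n", "above, the first-order indices can be checked directly as ratios. This\n", "small check is an addition to the original notebook and only uses the\n", "`ishigami` attributes introduced earlier." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# First-order indices: each component variance over the total variance.\n", "print(ishigami.variance_x1 / ishigami.variance_total,\n", "      ishigami.variance_x2 / ishigami.variance_total)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "In most cases we are interested in the first order indices. The Ishigami\n", "function has the benefit that these can be computed analytically. In\n", "`EmuKit` you can extract these values with the following code."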
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ishigami.main_effects" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But in general, these indices need to be sampled using Monte Carlo or\n", "one of the quasi-Monte Carlo methods we’ve seen in the model-free\n", "experimental design. Details are given in (Sobol, 2001).\n", "\n", "With Emukit, the first-order Sobol indices can be easily computed. We\n", "first need to define the space where of target simulator is analyzed." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from emukit.sensitivity.monte_carlo import ModelFreeMonteCarloSensitivity" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.random.seed(10) # for reproducibility\n", "\n", "num_monte_carlo_points = 10000 # Number of MC samples\n", "senstivity_ishigami = ModelFreeMonteCarloSensitivity(target_simulator, space)\n", "main_effects, total_effects, _ = senstivity_ishigami.compute_effects(num_monte_carlo_points = num_monte_carlo_points)\n", "print(main_effects)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We compare the true effects with the Monte Carlo effects in a bar-plot.\n", "The total effects are discussed later." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pandas as pd" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "\n", "d = {'Sobol True': ishigami.main_effects,\n", " 'Monte Carlo': main_effects}\n", "\n", "pd.DataFrame(d).plot(kind='bar', ax=ax)\n", "ax.set_title('First-order Sobol indices - Ishigami')\n", "ax.set_ylabel('% of explained output variance')\n", "\n", "mlai.write_figure(filename='first-order-sobol-indices-ishigami.svg', directory='./uq')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: The non-zero components of the Ishigami function." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Total Effects Using Monte Carlo\n", "\n", "Computing high order sensitivity indices can be computationally very\n", "demanding in high dimensional scenarios and measuring the total\n", "influence of each variable on the variance of the output is infeasible.\n", "To solve this issue the *total* indices are used which account for the\n", "contribution to the output variance of $x_i$ including all variance\n", "caused by the variable alone and all its interactions of any order.\n", "\n", "The total effect for $x_i$ is given by: $$ \n", "S_{Ti} = \\frac{\\left\\langle \\text{var}_{x_i} (y\\mid \\mathbf{ x}_{\\sim i}) \\right\\rangle _{p(\\mathbf{ x}_{\\sim i})}}{\\text{var}\\left(g(\\mathbf{ x})\\right)} = 1 - \\frac{\\text{var}_{\\mathbf{ x}_{\\sim i}} \\left\\langle y\\mid \\mathbf{ x}_{\\sim i} \\right\\rangle _{p(\\mathbf{ x}_{\\sim i})}}{\\text{var}\\left(g(\\mathbf{ x})\\right)}\n", "$$\n", "\n", "Note that the sum of $S_{Ti}$ is not necessarily one in this case unless\n", "the model is additive. In the Ishigami example the value of the total\n", "effects is" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ishigami.total_effects" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As in the previous example, the total effects can be computed with Monte\n", "Carlo. 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "\n", "d = {'Sobol True': ishigami.total_effects,\n", "     'Monte Carlo': total_effects}\n", "\n", "pd.DataFrame(d).plot(kind='bar', ax=ax)\n", "ax.set_title('Total effects - Ishigami')\n", "ax.set_ylabel('Effects value')\n", "\n", "mlai.write_figure(filename='total-effects-ishigami.svg', directory='./uq')" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: The total effects of the Ishigami function as computed via a\n", "Monte Carlo estimate, alongside the true total effects." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Computing the Sensitivity Indices Using the Output of a Model\n", "\n", "In the example used above the Ishigami function is very cheap to\n", "evaluate. However, in most real scenarios the functions of interest are\n", "expensive, and we need to limit ourselves to a small number of\n", "evaluations. Using Monte Carlo methods is infeasible in these scenarios\n", "as a large number of samples are typically required to provide good\n", "estimates of the Sobol indices.\n", "\n", "An alternative in these cases is to use a Gaussian process emulator of\n", "the function of interest trained on a few inputs and outputs (Marrel et\n", "al., 2009). If the model is properly trained, its mean prediction, which\n", "is cheap to evaluate, can be used to compute the Monte Carlo estimates\n", "of the Sobol indices. The variance of the GP emulator can also be used\n", "to assess our uncertainty about the Sobol indices. Let’s see how we can\n", "do this in Emukit.\n", "\n", "We start by generating 500 samples in the input domain. Note that this\n", "is just 5% of the number of samples that we used to compute the Sobol\n", "coefficients using Monte Carlo." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from emukit.core.initial_designs import RandomDesign" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "design = RandomDesign(space)\n", "x = design.get_samples(500)\n", "y = ishigami.fidelity1(x)[:, np.newaxis]" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Now, we fit a standard Gaussian process to the samples, and we wrap it\n", "as an Emukit model." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from GPy.models import GPRegression\n", "from emukit.model_wrappers import GPyModelWrapper\n", "from emukit.sensitivity.monte_carlo import MonteCarloSensitivity" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model_gpy = GPRegression(x, y)\n", "model_emukit = GPyModelWrapper(model_gpy)\n", "model_emukit.optimize()"
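 ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before using the emulator for sensitivity analysis, it is worth a quick\n", "sanity check of its fit. This check is an addition to the original\n", "notebook: it holds out a few fresh random samples (the Ishigami function\n", "is cheap here, so we can afford them) and compares the emulator’s mean\n", "prediction with the true simulator output." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Hold-out check of the emulator's predictive mean.\n", "x_test = design.get_samples(50)\n", "y_test = ishigami.fidelity1(x_test)[:, np.newaxis]\n", "y_pred, y_var = model_emukit.predict(x_test)\n", "rmse = np.sqrt(np.mean((y_pred - y_test)**2))\n", "print('Hold-out RMSE:', rmse)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "The final step is to compute the coefficients using the class\n", "`MonteCarloSensitivity`, which directly calls the model and uses its\n", "predictive mean to compute the Monte Carlo estimates of the Sobol\n", "indices. We plot the true estimates, those computed using 10000 direct\n", "evaluations of the target function using Monte Carlo, and those computed\n", "using a Gaussian process model trained on the 500 evaluations."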
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "num_mc = 10000\n", "senstivity_ishigami_gpbased = MonteCarloSensitivity(model = model_emukit, input_domain = space)\n", "main_effects_gp, total_effects_gp, _ = senstivity_ishigami_gpbased.compute_effects(num_monte_carlo_points = num_mc)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "\n", "main_effects_gp = {ivar: main_effects_gp[ivar][0] for ivar in main_effects_gp}\n", "\n", "d = {'Sobol True': ishigami.main_effects,\n", " 'Monte Carlo': main_effects,\n", " 'GP Monte Carlo':main_effects_gp}\n", "\n", "pd.DataFrame(d).plot(kind='bar', ax=ax)\n", "plt.title('First-order Sobol indices - Ishigami')\n", "plt.ylabel('% of explained output variance')\n", "\n", "mlai.write_figure(filename='first-order-sobol-indices-gp-ishigami.svg', directory='./uq')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: First order Sobol indices as estimated by Monte Carlo and\n", "GP-emulator based Monte Carlo." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "\n", "total_effects_gp = {ivar: total_effects_gp[ivar][0] for ivar in total_effects_gp}\n", "\n", "d = {'Sobol True': ishigami.total_effects,\n", " 'Monte Carlo': total_effects,\n", " 'GP Monte Carlo':total_effects_gp}\n", "\n", "pd.DataFrame(d).plot(kind='bar', ax=ax)\n", "ax.set_title('Total effects - Ishigami')\n", "ax.set_ylabel('% of explained output variance')\n", "\n", "mlai.write_figure(filename='total-effects-sobol-indices-gp-ishigami.svg', directory='./uq')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: Total effects as estimated by Monte Carlo and GP based Monte\n", "Carlo.\n", "\n", "We observe some discrepancies with respect to the real value of the\n", "Sobol index when using the Gaussian process, but we get a fairly good\n", "approximation with a very reduced number of evaluations of the original\n", "target function." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Conclusions\n", "\n", "The Sobol indices are a tool for explaining the variance of the output\n", "of a function as components of the input variables. Monte Carlo is an\n", "approach for computing these indices if the function is cheap to\n", "evaluate. Other approaches are needed when $g(\\cdot)$ is expensive to\n", "compute." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Catapult Simulation\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", " \n", "\n", "\n", "\n", " \n", "\n", "\n", "\n", "Nicolas Durrande\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "As a worked example we’re going to introduce a catapult simulation\n", "written by Nicolas Durrande, .\n", "\n", "\n", "\n", "Figure: A catapult simulation for experimenting with surrogate\n", "models, kindly provided by Nicolas Durrande\n", "\n", "The simulator allows you to set various parameters of the catapult\n", "including the axis of rotation, `roation_axis`, the position of the arm\n", "stop, `arm_stop`, and the location of the two bindings of the catapult’s\n", "spring, `spring_binding_1` and `spring_binding_2`.\n", "\n", "These parameters are then collated in a vector, $$\n", "\\mathbf{ x}_i = \\begin{bmatrix}\n", "\\texttt{rotation_axis} \\\\\n", "\\texttt{arm_stop} \\\\\n", "\\texttt{spring_binding_1} \\\\\n", "\\texttt{spring_binding_2}\n", "\\end{bmatrix}\n", "$$\n", "\n", "Having set those parameters, you can run an experiment, by firing the\n", "catapult. This will show you how far it goes.\n", "\n", "Because you will need to operate the catapult yourself, we’ll create a\n", "function to query you about the result of an individual firing." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def catapult_distance(x):\n", " \"\"\"Request user input for the catapult.\"\"\"\n", " y = np.zeros((x.shape[0], 1))\n", " for i in range(x.shape[0]):\n", " rotation_axis=x[i, 0]\n", " arm_stop=x[i, 1]\n", " spring_binding_1=x[i, 2]\n", " spring_binding_2=x[i, 3]\n", " \n", " print('Please set the following values:')\n", " print('x_1 = {rotation_axis:.2f} (rotation axis)'.format(rotation_axis=rotation_axis))\n", " print('x_2 = {arm_stop:.2f} (arm stop)'.format(arm_stop=arm_stop))\n", " print('x_3 = {spring_binding_1:.2f} (spring binding 1)'.format(spring_binding_1=spring_binding_1))\n", " print('x_4 = {spring_binding_2:.2f} (spring binding 2)'.format(spring_binding_2=spring_binding_2))\n", " y[i, 0] = float(input('What is the distance? '))\n", " return y" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also set the parameter space for the model. Each of these\n", "variables is scaled to operate $\\in [0, 1]$." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from emukit.core import ContinuousParameter, ParameterSpace" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "variable_domain = [0,1]\n", " \n", "space = ParameterSpace(\n", " [ContinuousParameter('rotation_axis', *variable_domain), \n", " ContinuousParameter('arm_stop', *variable_domain),\n", " ContinuousParameter('spring_binding_1', *variable_domain),\n", " ContinuousParameter('spring_binding_2', *variable_domain)])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before we perform sensitivity analysis, we need to build an emulator of\n", "the catapulter, which we do using our experimental design process." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Experimental Design for the Catapult\n", "\n", "\\[edit\\]\n", "\n", "Now we will build an emulator for the catapult using the experimental\n", "design loop.\n", "\n", "We’ll start with a small model-free design, we’ll use a random design\n", "for initializing our model." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from emukit.core.initial_designs import RandomDesign" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "design = RandomDesign(space)\n", "x = design.get_samples(5)\n", "y = catapult_distance(x)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from GPy.models import GPRegression\n", "from emukit.model_wrappers import GPyModelWrapper\n", "from emukit.sensitivity.monte_carlo import MonteCarloSensitivity" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Set up the GPy model. The variance of the RBF kernel is set to $150^2$\n", "because that’s roughly the square of the range of the catapult. We set\n", "the noise variance to a small value." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model_gpy = GPRegression(x,y)\n", "model_gpy.kern.variance = 150**2\n", "model_gpy.likelihood.variance.fix(1e-5)\n", "display(model_gpy)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Wrap the model for EmuKit." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model_emukit = GPyModelWrapper(model_gpy)\n", "model_emukit.optimize()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "display(model_gpy)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we set up the model loop. We’ll use integrated variance reduction as\n", "the acquisition function for our model-based design loop.\n", "\n", "*Warning*: This loop runs much slower on Google `colab` than on a local\n", "machine." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from emukit.experimental_design.experimental_design_loop import ExperimentalDesignLoop" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from emukit.experimental_design.acquisitions import IntegratedVarianceReduction, ModelVariance" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "integrated_variance = IntegratedVarianceReduction(space=space,\n", " model=model_emukit)\n", "ed = ExperimentalDesignLoop(space=space, \n", " model=model_emukit, \n", " acquisition = integrated_variance)\n", "ed.run_loop(catapult_distance, 10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Sensitivity Analysis of a Catapult Simulation\n", "\n", "The final step is to compute the coefficients using the class\n", "`ModelBasedMonteCarloSensitivity` which directly calls the model and\n", "uses its predictive mean to compute the Monte Carlo estimates of the\n", "Sobol indices. We plot the estimates of the Sobol indices computed using\n", "a Gaussian process model trained on the observations we’ve acquired." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "num_mc = 10000\n", "senstivity = MonteCarloSensitivity(model = model_emukit, input_domain = space)\n", "main_effects_gp, total_effects_gp, _ = senstivity.compute_effects(num_monte_carlo_points = num_mc)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import mlai.plot as plot\n", "import mlai" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pandas as pd" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "\n", "main_effects_gp_plot = {ivar: main_effects_gp[ivar][0] for ivar in main_effects_gp}\n", "\n", "d = {'GP Monte Carlo':main_effects_gp_plot}\n", "\n", "pd.DataFrame(d).plot(kind='bar', ax=ax)\n", "plt.ylabel('% of explained output variance')\n", "\n", "mlai.write_figure(filename='first-order-sobol-indices-gp-catapult.svg', directory='./uq')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: First Order sobol indices as estimated by GP-emulator based\n", "Monte Carlo on the catapult." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "\n", "total_effects_gp_plot = {ivar: total_effects_gp[ivar][0] for ivar in total_effects_gp}\n", "\n", "d = {'GP Monte Carlo':total_effects_gp_plot}\n", "\n", "pd.DataFrame(d).plot(kind='bar', ax=ax)\n", "ax.set_ylabel('% of explained output variance')\n", "\n", "mlai.write_figure(filename='total-effects-sobol-indices-gp-catapult.svg', directory='./uq')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: Total effects as estimated by GP based Monte Carlo on the\n", "catapult." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Thanks!\n", "\n", "For more information on these subjects and more you might want to check\n", "the following resources.\n", "\n", "- twitter: [@lawrennd](https://twitter.com/lawrennd)\n", "- podcast: [The Talking Machines](http://thetalkingmachines.com)\n", "- newspaper: [Guardian Profile\n", " Page](http://www.theguardian.com/profile/neil-lawrence)\n", "- blog:\n", " [http://inverseprobability.com](http://inverseprobability.com/blog.html)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## References" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Borchert, T., 2020. Milan: An evolution of data-oriented programming.\n", "\n", "Durrande, N., Ginsbourger, D., Olivier, Carraro, L., 2013. ANOVA kernels\n", "and RKHS of zero mean functions for model-based sensitivity analysis.\n", "Journal of Multivariate Analysis 115, 57–67.\n", "https://doi.org/\n", "\n", "Ishigami, T., Homma, T., 1989. An importance quantification technique in\n", "uncertainty analysis for computer models. \\[1990\\] Proceedings. First\n", "International Symposium on Uncertainty Modeling and Analysis 398–403.\n", "\n", "Joshi, R., 2007. A loosely-coupled real-time SOA. Real-Time Innovations\n", "Inc.\n", "\n", "Kennedy, M.C., O’Hagan, A., 2001. Bayesian calibration of computer\n", "models. Journal of the Royal Statistical Society: Series B (Statistical\n", "Methodology) 63, 425–464. \n", "\n", "Kennedy, M.C., O’Hagan, A., 2000. Predicting the output from a complex\n", "computer code when fast approximations are available. 
Biometrika 87,\n", "1–13.\n", "\n", "Laplace, P.S., 1814. Essai philosophique sur les probabilités, 2nd ed.\n", "Courcier, Paris.\n", "\n", "Lawrence, N.D., 2017. Data readiness levels. arXiv.\n", "\n", "Lawrence, N.D., Montgomery, J., Paquet, U., 2020. Organisational data\n", "maturity. The Royal Society.\n", "\n", "Marrel, A., Iooss, B., Laurent, B., Roustant, O., 2009. Calculations of\n", "Sobol indices for the Gaussian process metamodel. Reliability\n", "Engineering & System Safety 94, 742–751.\n", "\n", "McKay, M.D., Beckman, R.J., Conover, W.J., 1979. A comparison of three\n", "methods for selecting values of input variables in the analysis of\n", "output from a computer code. Technometrics 21, 239–245.\n", "\n", "Saltelli, A., Annoni, P., Azzini, I., Campolongo, F., Ratto, M.,\n", "Tarantola, S., 2010. Variance based sensitivity analysis of model\n", "output. Design and estimator for the total sensitivity index. Computer\n", "Physics Communications 181, 259–270.\n", "\n", "Saltelli, A., Ratto, M., Andres, T., Campolongo, F., Cariboni, J.,\n", "Gatelli, D., Saisana, M., Tarantola, S., 2008. Global sensitivity\n", "analysis: The primer. Wiley.\n", "\n", "Saltelli, A., Tarantola, S., Campolongo, F., Ratto, M., 2004.\n", "Sensitivity analysis in practice: A guide to assessing scientific\n", "methods. Wiley.\n", "\n", "Sobol, I.M., 2001. Global sensitivity indices for nonlinear mathematical\n", "models and their Monte Carlo estimates. Mathematics and Computers in\n", "Simulation 55, 271–280.\n", "\n", "Sobol, I.M., 1990. On sensitivity estimation for nonlinear mathematical\n", "models. Matematicheskoe Modelirovanie 2, 112–118.\n", "\n", "Sobol, I.M., Levitan, Y.L., 1999. On the use of variance reducing\n", "multipliers in Monte Carlo computations of a global sensitivity index.\n", "Computer Physics Communications 117, 52–61.\n", "\n", "The DELVE Initiative, 2020. Data readiness: Lessons from an emergency.\n", "The Royal Society.\n", "\n", "Vorhemus, C., Schikuta, E., 2017. A data-oriented architecture for\n", "loosely coupled real-time information systems, in: Proceedings of the\n", "19th International Conference on Information Integration and Web-Based\n", "Applications & Services, iiWAS ’17. Association for Computing Machinery,\n", "New York, NY, USA, pp. 472–481." ] } ], "nbformat": 4, "nbformat_minor": 5, "metadata": {} }