{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Machine Learning and the Physical World\n", "\n", "### [Neil D. Lawrence](http://inverseprobability.com), University of Cambridge\n", "\n", "### 2021-07-07" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Abstract**: Machine learning technologies have underpinned the recent\n", "revolution in artificial intelligence. But at their heart, they are\n", "simply data driven decision making algorithms. While the popular press\n", "is filled with the achievements of these algorithms in important domains\n", "such as object detection in images, machine translation and speech\n", "recognition, there are still many open questions about how these\n", "technologies might be implemented in domains where we have existing\n", "solutions but we are constantly looking for improvements. Roughly\n", "speaking, we characterise this domain as “machine learning in the\n", "physical world.” How do we design, build and deploy machine learning\n", "algorithms that are part of a decision making system that interacts with\n", "the physical world around us? In particular, machine learning is a data\n", "driven endeavour, but real world systems are physical and mechanistic.\n", "In this talk we will introduce some of the challenges for this domain\n", "and propose some ways forward in terms of solutions." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Emergent Behaviour" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Life Rules\n", "\n", "John Conway’s game of life is a cellular automaton where the cells obey\n", "three very simple rules. 
The cells live on a rectangular grid, so that\n", "each cell has 8 possible neighbours.\n", "\n", "\n", "\n", "\n", "\n", "\n", "
\n", "\n", "Figure: ‘Death’ through loneliness in Conway’s game of life. If a\n", "live cell is surrounded by fewer than two live cells, it ‘dies’ through\n", "loneliness.\n", "\n", "The game proceeds in turns, and each location in the grid is either\n", "alive or dead. Each turn, a cell counts its live neighbours. If a live\n", "cell has fewer than two live neighbours, it ‘dies’ of ‘loneliness.’ With\n", "exactly two live neighbours, a live cell simply persists.\n", "\n", "
\n", "\n", "Figure: ‘Death’ through overcrowding in Conway’s game of life. If a\n", "cell is surrounded by more than three live cells, it ‘dies’ through\n", "overcrowding.\n", "\n", "If there are four or more neighbours, the cell ‘dies’ from\n", "‘overcrowding.’ If there are three neighbours, the cell persists, or if\n", "it is currently dead, a new cell is born.\n", "\n", "
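The rules can be written down in a few lines of Python. The following is a minimal sketch, not code from the talk: it assumes the standard statement of Conway's rules (a live cell survives with two or three live neighbours, a dead cell is born with exactly three), and the names `next_state` and `life_step`, as well as the set-of-coordinates representation of the grid, are hypothetical choices made for illustration.

```python
from collections import Counter

def next_state(alive, live_neighbours):
    '''Return True if a cell is alive next turn under the standard rules.'''
    if alive:
        # Fewer than two: loneliness. More than three: overcrowding.
        return live_neighbours in (2, 3)
    # Birth: a dead cell with exactly three live neighbours.
    return live_neighbours == 3

def life_step(live_cells):
    '''Advance one turn; live_cells is a set of (row, col) tuples.'''
    # Count, for every location, how many live cells touch it.
    counts = Counter(
        (r + dr, c + dc)
        for r, c in live_cells
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Live cells with no live neighbours never appear in counts, so they die.
    return {cell for cell, n in counts.items()
            if next_state(cell in live_cells, n)}

# A row of three cells (a 'blinker') flips to a column and back again.
blinker = {(1, 0), (1, 1), (1, 2)}
print(sorted(life_step(blinker)))
```

Storing only the live cells keeps the sketch independent of any particular board size; a fixed rectangular array would work equally well but needs a choice about what happens at the edges.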
\n", "\n", "Figure: Birth in Conway’s life. Any position surrounded by precisely\n", "three live cells will give birth to a new cell at the next turn." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loafers and Gliders\n", "\n", "John Horton Conway, as the creator of the game of life, could be seen\n", "somehow as the god of this small universe. He created the rules. The\n", "rules are so simple that in many senses he, and we, are all-knowing in\n", "this space. But despite our knowledge, this world can still ‘surprise’\n", "us. From the simple rules, emergent patterns of behaviour arise. These\n", "include static patterns that don’t change from one turn to the next.\n", "They also include oscillators, which pulse between different forms\n", "across different periods of time. A particular form of oscillator is\n", "known as a ‘spaceship’: one that moves across the board as the\n", "game evolves. One of the simplest and earliest spaceships to be\n", "discovered is known as the glider.\n", "\n", "
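To see a spaceship move, one can simply run the rules. The sketch below is an illustration rather than anything from the talk: it assumes the standard Life rules (survival with two or three live neighbours, birth with exactly three), stores live cells as a set of (row, col) pairs on an unbounded grid, and uses the hypothetical helper names `step` and `run`. It checks that after four turns the glider reappears shifted by one cell diagonally.

```python
from itertools import product

def step(live):
    '''One turn of Life on an unbounded grid of (row, col) live cells.'''
    counts = {}
    for (r, c), (dr, dc) in product(live, product((-1, 0, 1), repeat=2)):
        if (dr, dc) != (0, 0):
            cell = (r + dr, c + dc)
            counts[cell] = counts.get(cell, 0) + 1
    # Birth with exactly three neighbours, survival with two or three.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

def run(live, turns):
    '''Apply step repeatedly and return the final set of live cells.'''
    for _ in range(turns):
        live = step(live)
    return live

# The glider, drawn with row 0 at the top:
#   . O .
#   . . O
#   O O O
glider = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}

after_four = run(glider, 4)
shifted = {(r + 1, c + 1) for r, c in glider}
print(after_four == shifted)
```

The final line prints `True`: the same five-cell shape returns every four turns, one row down and one column to the right. Nothing in the rules mentions motion, which is the sense in which the glider is emergent.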
\n", "\n", "Figure: *Left*: A Glider pattern, discovered in 1969 by Richard K. Guy.\n", "*Right*: John Horton Conway (1937-2020), creator of *Life*.\n", "\n", "The glider was ‘discovered’ in 1969 by Richard K. Guy. What do we mean\n", "by discovered in this context? Well, as soon as the game of life is\n", "defined, objects such as the glider do somehow exist, but the many\n", "configurations of the game mean that it takes some time for us to see\n", "one and know it exists. This means that, despite Conway being the\n", "creator, despite the rules of the game being simple, and despite the\n", "rules being deterministic, we are not ‘omniscient’ in any simplistic\n", "sense. It requires computation to ‘discover’ what can exist in this\n", "universe once it’s been defined.\n", "\n", "Another spaceship is known as the ‘loafer.’ It was ‘discovered’ in 2013\n", "by Josh Ball. So despite the game having existed for over forty years,\n", "and the rules of the game being simple, there are still emergent\n", "behaviours that remain unknown.\n", "\n", "
\n", "\n", "Figure: *Left*: A Loafer pattern, discovered by Josh Ball in 2013.\n", "*Right*: John Horton Conway (1937-2020), creator of *Life*.\n", "\n", "Contrast this with our situation in ‘real life,’ where we don’t know the\n", "simple rules of the game, the state space is larger, and emergent\n", "behaviours (hurricanes, earthquakes, volcanoes, climate change) have\n", "direct consequences for our daily lives, and we understand why the\n", "process of ‘understanding’ the physical world is so difficult. We also\n", "see immediately how much easier we might expect the physical sciences to\n", "be than the social sciences, where the emergent behaviours are\n", "contingent on highly complex human interactions." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The Centrifugal Governor\n", "\n", "Figure: Centrifugal governor as held by “Science” on Holborn\n", "Viaduct" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Boulton and Watt’s Steam Engine\n", "\n", "Figure: Watt’s Steam Engine which made Steam Power Efficient and\n", "Practical.\n", "\n", "James Watt’s steam engine contained an early machine learning device. In\n", "the same way that modern systems are component based, his engine was\n", "composed of components. One of these is a speed regulator sometimes\n", "known as *Watt’s governor*. The two balls in the center of the image,\n", "when spun fast, rise and, through a linkage mechanism, close the valve\n", "on the engine to slow it down.\n", "\n", "The centrifugal governor was made famous by Boulton and Watt when it was\n", "deployed in the steam engine. Studying stability in the governor is the\n", "main subject of James Clerk Maxwell’s paper on the theoretical analysis\n", "of governors (Maxwell, 1867). This paper is a founding paper of control\n", "theory. 
In an acknowledgment of its influence, Wiener used the name\n", "[*cybernetics*](https://en.wikipedia.org/wiki/Cybernetics) to describe\n", "the field of control and communication in animals and the machine\n", "(Wiener, 1948). The word cybernetics derives from the Greek for helmsman\n", "(*kybernetes*), which, via the Latin *gubernator*, is also the root of\n", "the word governor.\n", "\n", "A governor is one of the simplest artificial intelligence systems. It\n", "senses the speed of an engine, and acts to change the position of the\n", "valve on the engine to slow it down.\n", "\n", "Although it’s a mechanical system, a governor can be seen as automating a\n", "role that a human would have traditionally played. It is an early\n", "example of artificial intelligence.\n", "\n", "The centrifugal governor has several parameters: the weight of the balls\n", "used, the length of the linkages and the limits on the balls’ movement.\n", "\n", "Two principal differences exist between the centrifugal governor and\n", "artificial intelligence systems of today.\n", "\n", "1. The centrifugal governor is a physical system and it is an integral\n", " part of a wider physical system that it regulates (the engine).\n", "2. The parameters of the governor were set by hand, whereas our modern\n", " artificial intelligence systems have their parameters set by *data*.\n", "\n", "Figure: The centrifugal governor, an early example of a decision\n", "making system. The parameters of the governor include the lengths of the\n", "linkages (which affect how far the throttle opens in response to\n", "movement in the balls), the weight of the balls (which affects inertia)\n", "and the limits to which the balls can rise.\n", "\n", "This has the basic components of sense and act that we expect in an\n", "intelligent system, and this system saved the need for a human operator\n", "to manually adjust the system in the case of overspeed. 
Overspeed has\n", "the potential to destroy an engine, so the governor operates as a safety\n", "device.\n", "\n", "The first wave of automation did bring about sabotage as a response from\n", "workers. But if machinery was sabotaged, for example if the linkage\n", "between sensor (the spinning balls) and action (the valve closure) was\n", "broken, this would be obvious to the engine operator at start-up time.\n", "The machine could be repaired before operation.\n", "\n", "- There is a gap between the world of data science and AI.\n", "- The mapping of the virtual onto the physical world.\n", "- E.g. causal understanding." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prime Air\n", "\n", "One project where a number of components of machine learning and the\n", "physical world come together is Amazon’s Prime Air drone delivery\n", "system.\n", "\n", "Automating the process of moving physical goods through autonomous\n", "vehicles completes the loop between the ‘bits’ and the ‘atoms.’ In other\n", "words, the information and the ‘stuff.’ The idea of the drone is to\n", "complete a component of package delivery, the notion of last mile\n", "movement of goods, but in a fully autonomous way.\n", "\n", "Figure: Gur Kimchi, Paul Viola and David Moro." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Embed the YouTube video described in the caption below.\n", "from IPython.lib.display import YouTubeVideo\n", "YouTubeVideo('3HJtmx5f1Fc')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Figure: An actual Santa’s sleigh. 
Amazon’s new delivery drone.\n", "Machine learning algorithms are used across various systems including\n", "sensing (computer vision for detection of wires, people, dogs etc.) and\n", "piloting. The technology is necessarily a combination of old and new\n", "ideas. The transition from vertical to horizontal flight is vital for\n", "efficiency and requires sophisticated machine learning to achieve.\n", "\n", "As Jeff Wilke (CEO of Amazon Worldwide Consumer) [announced in June\n", "2019](https://blog.aboutamazon.com/transportation/a-drone-program-taking-flight)\n", "the technology is ready, but still needs operationalisation, including\n", "regulatory approval." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.lib.display import YouTubeVideo\n", "YouTubeVideo('wa8DU-Sui8Q')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Figure: Jeff Wilke (CEO of Amazon Worldwide Consumer) announcing the new\n", "drone at the Amazon 2019 re:MARS event alongside the scale of the Amazon\n", "supply chain.\n", "\n", "> When we announced earlier this year that we were evolving our Prime\n", "> two-day shipping offer in the U.S. to a one-day program, the response\n", "> was terrific. But we know customers are always looking for something\n", "> better, more convenient, and there may be times when one-day delivery\n", "> may not be the right choice. Can we deliver packages to customers even\n", "> faster? We think the answer is yes, and one way we’re pursuing that\n", "> goal is by pioneering autonomous drone technology.\n", "\n", "> Today at Amazon’s re:MARS Conference (Machine Learning, Automation,\n", "> Robotics and Space) in Las Vegas, we unveiled our latest Prime Air\n", "> drone design. We’ve been hard at work building fully electric drones\n", "> that can fly up to 15 miles and deliver packages under five pounds to\n", "> customers in less than 30 minutes. 
And, with the help of our\n", "> world-class fulfillment and delivery network, we expect to scale Prime\n", "> Air both quickly and efficiently, delivering packages via drone to\n", "> customers within months.\n", "\n", "Covering 15 miles (about 24 kilometers) in less than 30 minutes implies\n", "an average speed of at least around 48 kilometers per hour.\n", "\n", "> Our newest drone design includes advances in efficiency, stability\n", "> and, most importantly, in safety. It is also unique, and it advances\n", "> the state of the art. How so? First, it’s a hybrid design. It can do\n", "> vertical takeoffs and landings – like a helicopter. And it’s efficient\n", "> and aerodynamic – like an airplane. It also easily transitions between\n", "> these two modes – from vertical-mode to airplane mode, and back to\n", "> vertical mode.\n", "\n", "> It’s fully shrouded for safety. The shrouds are also the wings, which\n", "> makes it efficient in flight.\n", "\n", "Figure: Picture of the drone from the Amazon re:MARS event in 2019.\n", "\n", "> Our drones need to be able to identify static and moving objects\n", "> coming from any direction. We employ diverse sensors and advanced\n", "> algorithms, such as multi-view stereo vision, to detect static objects\n", "> like a chimney. To detect moving objects, like a paraglider or\n", "> helicopter, we use proprietary computer-vision and machine learning\n", "> algorithms.\n", "\n", "> A customer’s yard may have clotheslines, telephone wires, or\n", "> electrical wires. Wire detection is one of the hardest challenges for\n", "> low-altitude flights. Through the use of computer-vision techniques\n", "> we’ve invented, our drones can recognize and avoid wires as they\n", "> descend into, and ascend out of, a customer’s yard." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Buying System\n", "\n", "An example of a complex decision making system might be an automated\n", "buying system. 
In such a system, the idea is to match demand for\n", "products to supply of products.\n", "\n", "The matching of demand and supply is a recurring theme for decision\n", "making systems. Not only does it occur in automated buying, but also in\n", "the allocation of drivers to riders in a ride sharing system, or in the\n", "allocation of compute resource to users in a cloud system.\n", "\n", "The components of any of these systems include predictions of the demand\n", "for the product, or the drivers or the compute, and then predictions of\n", "the supply. Decisions are then made for how much material to keep in\n", "stock, or how many drivers to have on the road, or how much computer\n", "capacity to have in your data centres. These decisions have cost\n", "implications. The optimal amount of product will depend on the cost of\n", "making it available. For a buying system this is the storage cost.\n", "\n", "Decisions are made on the basis of the supply and demand to make new\n", "orders, to encourage more drivers to come into the system or to build\n", "new data centres or rent more computational power.\n", "\n", "Figure: The components of a putative automated buying system" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Monolithic System\n", "\n", "The classical approach to building these systems was a ‘monolithic\n", "system.’ Built in a similar way to successful applications software\n", "such as Excel or Word, or large operating systems, a single code base\n", "was constructed. Such code bases are large and complex.\n", "\n", "In practice, shared dynamically linked libraries may be used for aspects\n", "such as user interface, or networking, but the software often has many\n", "millions of lines of code. For example, the Microsoft Office suite is\n", "said to contain over 30 million lines of code.\n", "\n", "Figure: A potential path of models in a machine learning system." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Service Oriented Architecture\n", "\n", "Such software is not only difficult to develop, it is difficult to scale\n", "when computation demands increase. Amazon’s original website software\n", "(called Obidos) was a [monolithic\n", "design](https://en.wikipedia.org/wiki/Obidos_(software)) but by the\n", "early noughties it was becoming difficult to sustain and maintain. The\n", "software was phased out in 2006 and replaced by modularized software\n", "known as a ‘service oriented architecture.’\n", "\n", "In Service Oriented Architecture, sometimes loosely described as\n", "“Software as a Service,” the idea is\n", "that code bases are modularized and communicate with one another using\n", "network requests. A standard approach is to use a [REST\n", "API](https://en.wikipedia.org/wiki/Representational_state_transfer). So,\n", "rather than a single monolithic code base, the code is developed with\n", "individual services that handle the different requests.\n", "\n", "Figure: A potential path of models in a machine learning system.\n", "\n", "This is the landscape we now find ourselves in with regard to software\n", "development. In practice, each of these services is often ‘owned’ and\n", "maintained by an individual team. The team is judged by the quality of\n", "their service provision. They work to detailed specifications on what\n", "their service should output, what its availability should be and other\n", "objectives like speed of response. This allows for conditional\n", "independence between teams and for faster development." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Intellectual Debt\n", "\n", "Figure: Jonathan Zittrain’s term to describe the challenges of\n", "explanation that come with AI is Intellectual Debt.\n", "\n", "In computer systems the concept of *technical debt* has been surfaced by\n", "authors including Sculley et al. (2015). 
It is an important concept\n", "that I think is somewhat hidden from the academic community, because it\n", "is a phenomenon that occurs when a computer software system is deployed." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Separation of Concerns\n", "\n", "To construct such complex systems an approach known as “separation of\n", "concerns” has been developed. The idea is that you decompose your\n", "system, which addresses a large-scale complex task, into a set of\n", "simpler tasks. Each of these tasks is separately implemented. This is\n", "known as the decomposition of the task.\n", "\n", "This is where Jonathan Zittrain’s beautifully named term “intellectual\n", "debt” rises to the fore. Separation of concerns enables the construction\n", "of a complex system. But who is concerned with the overall system?\n", "\n", "- Technical debt is the inability to *maintain* your complex software\n", " system.\n", "\n", "- Intellectual debt is the inability to *explain* your software\n", " system.\n", "\n", "It is right there in our approach to software engineering. “Separation\n", "of concerns” means no one is concerned about the overall system itself." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Statistical Emulation\n", "\n", "Figure: The UK Met Office runs a shared code base for its simulations\n", "of climate and the weather. This plot shows the different spatial and\n", "temporal scales used.\n", "\n", "In many real world systems, decisions are made through simulating the\n", "environment. Simulations may operate at different granularities. For\n", "example, simulations are used in weather forecasts and climate\n", "forecasts. 
Interestingly, the UK Met Office uses the same code for both:\n", "it has a [“Unified Model”\n", "approach](https://www.metoffice.gov.uk/research/approach/modelling-systems/unified-model/index),\n", "but it runs the climate simulations at coarser spatial and temporal\n", "resolutions.\n", "\n", "Figure: Real world systems consist of simulators that capture our\n", "domain knowledge about how our systems operate. Different simulators run\n", "at different speeds and granularities.\n", "\n", "Figure: A statistical emulator is a system that reconstructs the\n", "simulation with a statistical model.\n", "\n", "A statistical emulator is a data-driven model that learns about the\n", "underlying simulation. Importantly, it learns with uncertainty, so it\n", "‘knows what it doesn’t know.’ In practice, we can call the emulator in\n", "place of the simulator. If the emulator ‘doesn’t know,’ it can call the\n", "simulator for the answer.\n", "\n", "Figure: A statistical emulator is a system that reconstructs the\n", "simulation with a statistical model. As well as reconstructing the\n", "simulation, a statistical emulator can be used to correlate with the\n", "real world.\n", "\n", "As well as reconstructing an individual simulator, the emulator can\n", "calibrate the simulation to the real world, by monitoring differences\n", "between the simulator and real data. This allows the emulator to\n", "characterise where the simulation can be relied on, i.e. we can validate\n", "the simulator.\n", "\n", "Similarly, the emulator can adjudicate between simulations. This is\n", "known as *multi-fidelity emulation*. 
The emulator characterizes which\n", "simulations perform well where.\n", "\n", "If all this modelling is done with judicious handling of the\n", "uncertainty, the *computational doubt*, then the emulator can assist in\n", "deciding what experiment should be run next to aid a decision: should\n", "we run a simulator, in which case which one, or should we attempt to\n", "acquire data from a real world intervention?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Auto AI\n", "\n", "Supervised machine learning models are data-driven statistical\n", "functional estimators. Each ML model is trained to perform a task.\n", "Machine learning systems are created when these models are integrated as\n", "interacting components in a more complex system that carries out a\n", "larger scale task, e.g. an autonomous drone delivery system.\n", "\n", "Artificial Intelligence can also be seen as *algorithmic\n", "decision-making*. ML systems are *data driven* algorithmic\n", "decision-makers. Designing decision-making engines requires us to\n", "first decompose the system into its component parts. The\n", "decompositions are driven by (1) system performance requirements, (2) the\n", "suite of ML algorithms at our disposal and (3) the data availability.\n", "Performance requirements could be computational speed, accuracy,\n", "interpretability, and ‘fairness.’ The current generation of ML systems\n", "is often based around *supervised learning* and human annotated data.\n", "But in the future, we may expect more use of *reinforcement learning*\n", "and automated knowledge discovery using *unsupervised learning*.\n", "\n", "The classical systems approach assumes decomposability of components. In\n", "ML, upstream components (e.g. a pedestrian detector in an autonomous\n", "vehicle) make decisions that require revisiting once a fuller picture is\n", "realized at a downstream stage (e.g. vehicle path planning). 
The\n", "relative weaknesses and strengths of the different component parts need\n", "to be assessed when resolving conflicts.\n", "\n", "In long-term planning, e.g. logistics and supply chain, a plan may be\n", "computed multiple times under different constraints as data evolves. In\n", "logistics, an initial plan for delivery may be computed when an item is\n", "viewed on a webpage. Webpage waiting-time constraints dominate the\n", "solution we choose. However, when an order is placed the time constraint\n", "may be relaxed and an accuracy constraint or a cost constraint may now\n", "dominate.\n", "\n", "Such sub-systems will make inconsistent decisions, but we should monitor\n", "and control the extent of the inconsistency.\n", "\n", "One solution to aid with both the lack of decomposability of the\n", "components and the inconsistency between components is *end-to-end*\n", "learning of the system. End-to-end learning is when we use ML techniques\n", "to fit parameters across the entire decision pipeline. We exploit\n", "gradient descent and automated differentiation software to achieve this.\n", "However, components in the system may themselves be running a\n", "*simulation* (e.g. a transport delivery-time simulation) or\n", "*optimization* (e.g. a linear program) as a subroutine. This limits the\n", "universality of automatic differentiation. Another alternative is to\n", "replace the entire system with a single ML model, such as in Deep\n", "Reinforcement Learning. However, this can severely limit the\n", "interpretability of the resulting system.\n", "\n", "We envisage AutoAI as allowing us to take advantage of end-to-end\n", "learning without sacrificing the interpretability of the underlying\n", "system. Instead of optimizing each component individually, we introduce\n", "*Bayesian system optimization* (BSO). 
We will make use of the end-to-end\n", "learning signals and attribute them to the system sub-components through\n", "the construction of an interconnected network of *surrogate models*,\n", "known as emulators, each of which is associated with an individual\n", "component from the underlying ML system. Rather than tuning each\n", "component in isolation (e.g. by classical Bayesian optimization), in BSO\n", "we account for upstream and downstream interactions in the optimization,\n", "leveraging our end-to-end knowledge without damaging the\n", "interpretability of the underlying system." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Deep Emulation\n", "\n", "Figure: A potential path of models in a machine learning system.\n", "\n", "As a solution we can make use of *emulators*. When constructing an ML system,\n", "software engineers, ML engineers, economists and operations researchers\n", "are explicitly defining relationships between variables of interest in\n", "the system. That implicitly defines a joint distribution,\n", "$p(\\mathbf{ y}^*, \\mathbf{ y})$. In a decomposable system any\n", "sub-component may be defined as\n", "$p(\\mathbf{ y}_\\mathbf{i}|\\mathbf{ y}_\\mathbf{j})$ where\n", "$\\mathbf{ y}_\\mathbf{i}$ and $\\mathbf{ y}_\\mathbf{j}$ represent sub-sets\n", "of the full set of variables\n", "$\\left\\{\\mathbf{ y}^*, \\mathbf{ y}\\right\\}$. In those cases where the\n", "relationship is deterministic, the probability density would collapse to\n", "a vector-valued deterministic function,\n", "$\\mathbf{ f}_\\mathbf{i}\\left(\\mathbf{ y}_\\mathbf{j}\\right)$.\n", "\n", "Inter-variable relationships could be defined by, for example, a neural\n", "network (machine learning), an integer program (operational research),\n", "or a simulation (supply chain). 
This makes probabilistic inference in\n", "this joint density for real world systems either very hard or\n", "impossible.\n", "\n", "Emulation is a form of meta-modelling: we construct a model of the\n", "model. We can define the joint density of an emulator as\n", "$s(\\mathbf{ y}^*, \\mathbf{ y})$, but if this probability density is to be\n", "an accurate representation of our system, it is likely to be\n", "prohibitively complex. Current practice is to design an emulator to deal\n", "with a specific question. This is done by fitting an ML model to a\n", "simulation from the appropriate conditional distribution,\n", "$p(\\mathbf{ y}_\\mathbf{i}|\\mathbf{ y}_\\mathbf{j})$, which is\n", "intractable. The emulator provides an approximated answer of the form\n", "$s(\\mathbf{ y}_\\mathbf{i}|\\mathbf{ y}_\\mathbf{j})$. Critically, an\n", "emulator should incorporate its uncertainty about its approximation. So\n", "the emulator answer will be less certain than direct access to the\n", "conditional $p(\\mathbf{ y}_\\mathbf{i}|\\mathbf{ y}_\\mathbf{j})$, but it may be sufficiently\n", "confident to act upon. Careful design of emulators to answer a given\n", "question leads to efficient diagnostics and understanding of the system.\n", "But in a complex interacting system an exponentially increasing number\n", "of questions can be asked. This calls for a system of automated\n", "construction of emulators which selects the right structure and\n", "redeploys the emulator as necessary. Rapid redeployment of emulators\n", "could exploit pre-existing emulators through *transfer learning*.\n", "\n", "Automatically deploying these families of emulators for full system\n", "understanding is highly ambitious. It requires advances in engineering\n", "infrastructure, emulation and Bayesian optimization. However, the\n", "intermediate steps of developing this architecture also allow for\n", "automated monitoring of system accuracy and fairness. 
This facilitates\n", "AutoML on a component-wise basis, which we can see as a simple\n", "implementation of AutoAI. The proposal is structured so that despite its\n", "technical ambition there is a smooth ramp of benefits to be derived\n", "across the programme of work.\n", "\n", "In Applied Mathematics, the field studying these techniques is known as\n", "*uncertainty quantification*. The new challenge is the automation of\n", "emulator creation on demand to answer questions of interest and\n", "facilitate the system design, i.e. AutoAI through BSO.\n", "\n", "At design stage, any particular AI task could be decomposed in multiple\n", "ways. Bayesian system optimization will assist both in determining the\n", "large-scale system design through exploring different decompositions and\n", "in refinement of the deployed system.\n", "\n", "So far, most work on emulators has focussed on emulating a single\n", "component. Automated deployment and maintenance of ML systems requires\n", "networks of emulators that can be deployed and redeployed on demand\n", "depending on the particular question of interest. Therefore, the\n", "technical innovations we require are in the mathematical composition of\n", "emulator models (Damianou and Lawrence, 2013; Perdikaris et al., 2017).\n", "Different chains of emulators will need to be rapidly composed to make\n", "predictions of downstream performance. This requires rapid retraining of\n", "emulators and *propagation of uncertainty* through the emulation\n", "pipeline, a process we call *deep emulation*.\n", "\n", "Recomposing the ML system requires structural learning of the network.\n", "By parameterizing covariance functions appropriately this can be done\n", "through Gaussian processes (e.g. Damianou et al., n.d.), but one could\n", "also consider Bayesian neural networks and other generative models,\n", "e.g. 
Generative Adversarial Networks (Goodfellow et al., 2014).\n", "\n", "Figure: A potential path of models in a machine learning system." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The Accelerate Programme\n", "\n", "Figure: The Accelerate Programme for Scientific Discovery covers\n", "research, education and training, and engagement. Our aim is to bring about\n", "a step change in scientific discovery through AI.\n", "\n", "We’re now in a new phase of the development of computing, with rapid\n", "advances in machine learning. But we see some of the same issues –\n", "researchers across disciplines hope to make use of machine learning, but\n", "need access to skills and tools to do so, while the field of machine\n", "learning itself will need to develop new methods to tackle some complex,\n", "‘real world’ problems.\n", "\n", "It is with these challenges in mind that the Computer Lab has started\n", "the Accelerate Programme for Scientific Discovery. This new Programme is\n", "seeking to support researchers across the University to develop the\n", "skills they need to be able to use machine learning and AI in their\n", "research.\n", "\n", "To do this, the Programme is developing three areas of activity:\n", "\n", "- Research: we’re developing a research agenda that develops and\n", " applies cutting-edge machine learning methods to scientific\n", " challenges, with four Accelerate Research fellows working directly\n", " on issues relating to computational biology, psychiatry, string\n", " theory and materials science.
While we’re concentrating on STEM\n", " subjects for now, in the longer term our ambition is to build links\n", " with the social sciences and humanities.\n", "\n", "- Teaching and learning: building on the teaching activities already\n", " delivered through University courses, we’re creating a pipeline of\n", " learning opportunities to help PhD students and postdocs better\n", " understand how to use data science and machine learning in their\n", " work. Our programme with Spark is one element of this, and we’ll be\n", " announcing further activities soon.\n", "\n", "- Engagement: we hope that Accelerate will help build a community of\n", " researchers working across the University at the interface of\n", " machine learning and the sciences, helping to share best practice\n", " and new methods, and support each other in advancing their research.\n", " Over the coming years, we’ll be running a variety of events and\n", " activities in support of this, and would welcome your ideas about\n", " what might be most useful." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## ML and the Physical World Course\n", "\n", "Figure: Machine Learning and the Physical World is a course focussed\n", "on teaching the principles and techniques of emulation. It’s freely\n", "available online.\n", "\n", "The [ML and the Physical World\n", "course](http://mlatcl.github.io/mlphysical/) is focussed on machine\n", "learning systems that interact directly with the real world. Building\n", "artificial systems that interact with the physical world poses\n", "significantly different challenges compared to the purely digital\n", "domain. In the real world data is scarce and often uncertain, and decisions\n", "can have costly and irreversible consequences. However, we also have the\n", "benefit of centuries of scientific knowledge that we can draw from.
This\n", "module will provide the methodological background to machine learning\n", "applied in this scenario. We will study how we can build models with a\n", "principled treatment of uncertainty, allowing us to leverage prior\n", "knowledge and provide decisions that can be interrogated." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Thanks!\n", "\n", "For more information on these subjects, you might want to check\n", "the following resources.\n", "\n", "- twitter: [@lawrennd](https://twitter.com/lawrennd)\n", "- podcast: [The Talking Machines](http://thetalkingmachines.com)\n", "- newspaper: [Guardian Profile\n", " Page](http://www.theguardian.com/profile/neil-lawrence)\n", "- blog:\n", " [http://inverseprobability.com](http://inverseprobability.com/blog.html)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## References" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Damianou, A., Ek, C.H., Titsias, M.K., Lawrence, N.D., n.d. Manifold\n", "relevance determination.\n", "\n", "Damianou, A., Lawrence, N.D., 2013. Deep Gaussian processes. pp.\n", "207–215.\n", "\n", "Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D.,\n", "Ozair, S., Courville, A., Bengio, Y., 2014. Generative adversarial nets,\n", "in: Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N.D., Weinberger,\n", "K.Q. (Eds.), Advances in Neural Information Processing Systems 27.\n", "Curran Associates, Inc., pp. 2672–2680.\n", "\n", "Maxwell, J.C., 1867. On governors. Proceedings of the Royal Society of\n", "London 16, 270–283.\n", "\n", "Perdikaris, P., Raissi, M., Damianou, A., Lawrence, N.D., Karniadakis,\n", "G.E., 2017. Nonlinear information fusion algorithms for data-efficient\n", "multi-fidelity modelling. Proc. R. Soc. A 473.\n", "\n", "Sculley, D., Holt, G., Golovin, D., Davydov, E., Phillips, T., Ebner,\n", "D., Chaudhary, V., Young, M., Crespo, J.-F., Dennison, D., 2015.
Hidden\n", "technical debt in machine learning systems, in: Cortes, C., Lawrence,\n", "N.D., Lee, D.D., Sugiyama, M., Garnett, R. (Eds.), Advances in Neural\n", "Information Processing Systems 28. Curran Associates, Inc., pp.\n", "2503–2511.\n", "\n", "Wiener, N., 1948. Cybernetics: Control and communication in the animal\n", "and the machine. MIT Press, Cambridge, MA." ] } ], "nbformat": 4, "nbformat_minor": 5, "metadata": {} }