{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# The Atomic Human\n", "\n", "### [Neil D. Lawrence](http://inverseprobability.com), University of Cambridge\n", "\n", "### 2024-03-12" ], "id": "6811d68c-73f1-49b5-be2a-23bbdf3d091b" }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Abstract**: A vital perspective is missing from the discussions we’re\n", "having about Artificial Intelligence: what does it mean for our\n", "identity?\n", "\n", "Our fascination with AI stems from the perceived uniqueness of human\n", "intelligence. We believe it’s what differentiates us. Fears of AI not\n", "only concern how it invades our digital lives, but also the implied\n", "threat of an intelligence that displaces us from our position at the\n", "centre of the world.\n", "\n", "Atomism, proposed by Democritus, suggested it was impossible to continue\n", "dividing matter down into ever smaller components: eventually we reach a\n", "point where a cut cannot be made (the Greek for uncuttable is ‘atom’).\n", "In the same way, by slicing away at the facets of human intelligence\n", "that can be replaced by machines, AI uncovers what is left: an\n", "indivisible core that is the essence of humanity.\n", "\n", "By contrasting our own (evolved, locked-in, embodied) intelligence with\n", "the capabilities of machine intelligence through history, The Atomic\n", "Human reveals the technical origins, capabilities and limitations of AI\n", "systems, and how they should be wielded. Not just by the experts, but\n", "ordinary people. Either AI is a tool for us, or we become a tool of AI.\n", "Understanding this will enable us to choose the future we want.\n", "\n", "This talk is based on Neil’s forthcoming book to be published with Allen\n", "Lane in June 2024. 
Machine learning solutions, in particular those based\n", "on deep learning methods, form an underpinning of the current revolution\n", "in “artificial intelligence” that has dominated popular press headlines\n", "and is having a significant influence on the wider tech agenda.\n", "\n", "In this talk I will give an overview of where we are now with machine\n", "learning solutions, and what challenges we face both in the near and far\n", "future. These include practical application of existing algorithms in\n", "the face of the need to explain decision making, mechanisms for\n", "improving the quality and availability of data, and dealing with large\n", "unstructured datasets." ], "id": "3a1a72e9-b560-41d5-a982-93d11c98282c" }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\n", "$$" ], "id": "ad249886-9168-4297-b74b-cff606cda5ac" }, { "cell_type": "markdown", "metadata": {}, "source": [ "::: {.cell .markdown}\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "" ], "id": "28f7b572-2faf-41c7-9611-639decabf1a6" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Henry Ford’s Faster Horse\n", "\n", "Figure: A 1925 Ford Model T built at Henry Ford’s Highland Park Plant\n", "in Dearborn, Michigan. This example now resides in Australia, owned by\n", "the founder of FordModelT.net. From\n", "\n", "\n", "It’s said that Henry Ford’s customers wanted “a faster horse”. If\n", "Henry Ford were selling us artificial intelligence today, what would the\n", "customer call for, “a smarter human”? That’s certainly the picture of\n", "machine intelligence we find in science fiction narratives, but the\n", "reality of what we’ve developed is much more mundane.\n", "\n", "Car engines produce prodigious power from petrol. Machine intelligences\n", "deliver decisions derived from data. In both cases the scale of\n", "consumption enables a speed of operation that is far beyond the\n", "capabilities of their natural counterparts. 
Unfettered energy\n", "consumption has consequences in the form of climate change. Does\n", "unbridled data consumption also have consequences for us?\n", "\n", "If we devolve decision making to machines, we depend on those machines\n", "to accommodate our needs. If we don’t understand how those machines\n", "operate, we lose control over our destiny. Our mistake has been to see\n", "machine intelligence as a reflection of our intelligence. We cannot\n", "understand the smarter human without understanding the human. To\n", "understand the machine, we need to better understand ourselves." ], "id": "3a03d6d9-9ce4-4b70-929b-47a610b8d654" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The Diving Bell and the Butterfly\n", "\n", "Figure: The Diving Bell and the Butterfly is the autobiography of Jean\n", "Dominique Bauby.\n", "\n", "[The Diving Bell and the\n", "Butterfly](https://www.penguinrandomhouse.com/books/9616/the-diving-bell-and-the-butterfly-by-jean-dominique-bauby/)\n", "is the autobiography of Jean Dominique Bauby. Jean Dominique, the editor\n", "of French Elle magazine, suffered a major stroke at the age of 43 in\n", "1995. The stroke paralyzed him and rendered him speechless. He was only\n", "able to blink his left eyelid; he became a sufferer of locked-in\n", "syndrome.\n", "\n", "E S A R I N T U L
O M D P C F B V
H G J Q Z Y X K W\n", "\n", "Figure: The ordering of the letters that Bauby used for writing his\n", "autobiography.\n", "\n", "How could he do that? Well, first, they set up a mechanism where he\n", "could scan across letters and blink at the letter he wanted to use. In\n", "this way, he was able to write each letter.\n", "\n", "It took him 10 months of four hours a day to write the book. Each word\n", "took two minutes to write.\n", "\n", "Imagine doing all that thinking, but so little speaking, having all\n", "those thoughts and so little ability to communicate.\n", "\n", "The idea behind this talk is that we are all in that situation. While\n", "not as extreme as for Bauby, we all have somewhat of a locked-in\n", "intelligence.\n", "\n", "\n", "\n", "Figure: Jean Dominique Bauby was the Editor in Chief of the French\n", "Elle Magazine. He suffered a stroke that destroyed his brainstem,\n", "leaving him only capable of moving one eye. Jean Dominique became a\n", "victim of locked-in syndrome.\n", "\n", "Incredibly, Jean Dominique wrote his book after he became locked in.\n", "\n", "The idea behind embodiment factors is that we are all in that situation.\n", "While not as extreme as for Bauby, we all have somewhat of a locked-in\n", "intelligence.\n", "\n", "\n", "\n", "\n", "\n", "\n", "
\n", "\n", "Figure: Claude Shannon developed information theory which allows us\n", "to quantify how much Bauby can communicate. This allows us to compare\n", "how locked in he is to us." ], "id": "3e4ca90e-544c-4e9e-b122-15a31763ffb5" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The Atomic Human\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "Figure: The Atomic Eye, by slicing away aspects of the human that we\n", "used to believe to be unique to us, but are now the preserve of the\n", "machine, we learn something about what it means to be human.\n", "\n", "\n", "\n", "Figure: A Colossus Mark 2 codebreaking computer being operated by\n", "Dorothy Du Boisson (left) and Elsie Booker (right). Colossus was\n", "designed by Tommy Flowers, but programmed and operated by groups of\n", "Wrens based at Bletchley Park." ], "id": "89097b99-cae2-47ec-81e6-c9e8bd0192f1" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Embodiment Factors\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
\n", "| | computer | human |\n", "|---|---|---|\n", "| bits/min | billions | 2,000 |\n", "| billion calculations/s | ~100 | a billion |\n", "| embodiment | 20 minutes | 5 billion years |\n", "
\n", "\n", "Figure: Embodiment factors are the ratio between our ability to\n", "compute and our ability to communicate. Relative to the machine we are\n", "also locked in. In the table we represent embodiment as the length of\n", "time it would take to communicate one second’s worth of computation. For\n", "computers it is a matter of minutes, but for a human, it is a matter of\n", "thousands of millions of years. See also “Living Together: Mind and\n", "Machine Intelligence” Lawrence (2017).\n", "\n", "There is a fundamental limit placed on our intelligence based on our\n", "ability to communicate. Claude Shannon founded the field of information\n", "theory. The clever part of this theory is that it allows us to separate\n", "our measurement of information from what the information pertains to.[1]\n", "\n", "Shannon measured information in bits. One bit of information is the\n", "amount of information I pass to you when I give you the result of a coin\n", "toss. Shannon was also interested in the amount of information in the\n", "English language. He estimated that on average a word in the English\n", "language contains 12 bits of information.\n", "\n", "Given typical speaking rates, that gives us an estimate of our ability\n", "to communicate of around 100 bits per second (Reed and Durlach, 1998).\n", "Computers, on the other hand, can communicate much more rapidly. Current\n", "wired network speeds are around a billion bits per second, ten million\n", "times faster.\n", "\n", "When it comes to compute though, our best estimates indicate our\n", "computers are slower. A typical modern computer can perform around\n", "100 billion floating-point operations per second, and each floating-point\n", "operation involves a 64-bit number. 
So the computer is processing around\n", "6,400 billion bits per second.\n", "\n", "It’s difficult to get similar estimates for humans, but by some\n", "estimates the amount of compute we would require to *simulate* a human\n", "brain is equivalent to that in the UK’s fastest computer\n", "(Ananthanarayanan et al., 2009), the MET office machine in Exeter, which\n", "in 2018 ranked as the 11th fastest computer in the world. That machine\n", "simulates the world’s weather each morning, and then simulates the\n", "world’s climate in the afternoon. It is a 16-petaflop machine,\n", "processing around 1,000 *trillion* bits per second.\n", "\n", "[1] the challenge of understanding what information pertains to is known\n", "as knowledge representation." ], "id": "c04374cb-c30c-44e2-80d5-c56a96f913f4" }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Bandwidth Constrained Conversations\n", "\n", "\\[edit\\]" ], "id": "8aa07774-0570-4384-94eb-8dce92b57d68" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import notutils as nu\n", "from ipywidgets import IntSlider" ], "id": "9453f6ca-d251-4f44-9fc2-f550d9bdf41f" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import notutils as nu" ], "id": "e202db91-7ea4-4f63-a390-551358858ef0" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "nu.display_plots('anne-bob-conversation{sample:0>3}.svg', \n", " 'https://inverseprobability.com/talks/../slides/diagrams/', sample=IntSlider(0, 0, 7, 1))" ], "id": "48d2a8fb-d89f-4c18-8333-6397d4b00880" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: Conversation relies on internal models of other\n", "individuals.\n", "\n", "\n", "\n", "Figure: Misunderstanding of context and who we are talking to leads\n", "to arguments.\n", "\n", "Embodiment factors imply that, in our communication between humans, what\n", "is *not* 
said is, perhaps, more important than what is said. To\n", "communicate with each other we need to have a model of who each of us\n", "is.\n", "\n", "To aid this, in society, we are required to perform roles, whether as a\n", "parent, a teacher, an employee or a boss. Each of these roles requires\n", "that we conform to certain standards of behaviour to facilitate\n", "communication between ourselves.\n", "\n", "Control of self is vitally important to these communications.\n", "\n", "The high availability of data undermines\n", "human-to-human communication channels by providing new routes to\n", "undermining our control of self.\n", "\n", "The consequences of this mismatch between power and delivery are to be\n", "seen all around us. Just as driving an F1 car with bicycle\n", "wheels would be a fine art, so is the process of communication between\n", "humans.\n", "\n", "If I have a thought and I wish to communicate it, I first need to have a\n", "model of what you think. I should think before I speak. When I speak,\n", "you may react. You have a model of who I am and what I was trying to\n", "say, and why I chose to say what I said. Now we begin this dance, where\n", "we are each trying to better understand each other and what we are\n", "saying. When it works, it is beautiful, but when mis-deployed, just like\n", "a badly driven F1 car, there is a horrible crash, an argument." ], "id": "3efcdd3d-0ed9-4008-836d-04805e81ec5b" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Sistine Chapel Ceiling\n", "\n", "Figure: The ceiling of the Sistine Chapel.\n", "\n", "[Patrick Boyde](https://www.mmll.cam.ac.uk/pb127)’s talks on the Sistine\n", "Chapel focussed on both the structure of the chapel ceiling, describing\n", "the impression of height it was intended to give, and the\n", "significance and positioning of each of the panels and the meaning of\n", "the individual figures."
], "id": "3f282f14-e8c1-466b-abe8-7a2914310df0" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The Creation of Man\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "Figure: Photo of Detail of Creation of Man from the Sistine chapel\n", "ceiling.\n", "\n", "One of the most famous panels is central in the ceiling, it’s the\n", "creation of man. Here, God in the guise of a pink-robed bearded man\n", "reaches out to a languid Adam.\n", "\n", "The representation of God in this form seems typical of the time,\n", "because elsewhere in the Vatican Museums there are similar\n", "representations.\n", "\n", "\n", "\n", "Figure: Photo detail of God.\n", "\n", "" ], "id": "7cc82c4a-7470-4ae6-b4c9-cb98bee69b17" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## New Flow of Information\n", "\n", "\\[edit\\]\n", "\n", "Classically the field of statistics focused on mediating the\n", "relationship between the machine and the human. Our limited bandwidth of\n", "communication means we tend to over-interpret the limited information\n", "that we are given, in the extreme we assign motives and desires to\n", "inanimate objects (a process known as anthropomorphizing). Much of\n", "mathematical statistics was developed to help temper this tendency and\n", "understand when we are valid in drawing conclusions from data.\n", "\n", "\n", "\n", "Figure: The trinity of human, data, and computer, and highlights the\n", "modern phenomenon. The communication channel between computer and data\n", "now has an extremely high bandwidth. The channel between human and\n", "computer and the channel between data and human is narrow. New direction\n", "of information flow, information is reaching us mediated by the\n", "computer. The focus on classical statistics reflected the importance of\n", "the direct communication between human and data. 
The modern challenges\n", "of data science emerge when that relationship is being mediated by the\n", "machine.\n", "\n", "Data science brings new challenges. In particular, there is a very large\n", "bandwidth connection between the machine and data. This means that our\n", "relationship with data is now commonly being mediated by the machine,\n", "whether in the acquisition of new data, which now happens by\n", "happenstance rather than with purpose, or in the interpretation of that\n", "data, where we are increasingly relying on machines to summarize what the\n", "data contains. This is leading to the emerging field of data science,\n", "which must not only deal with the same challenges that mathematical\n", "statistics faced in tempering our tendency to over-interpret data but\n", "must also deal with the possibility that the machine has either\n", "inadvertently or maliciously misrepresented the underlying data." ], "id": "42a9000e-91e5-4098-bd4e-2e499797df01" }, { "cell_type": "markdown", "metadata": {}, "source": [ "### A Six Word Novel\n", "\n", "Figure: Consider the six-word novel, apocryphally credited to Ernest\n", "Hemingway, “For sale: baby shoes, never worn”. To understand what that\n", "means to a human, you need a great deal of additional context. That\n", "context is not directly accessible to a machine that has not got both the\n", "evolved and contextual understanding of our own condition to realize\n", "both the implication of the advert and what that implication means\n", "emotionally to the previous owner.\n", "\n", "But machine intelligence is a very different kind of intelligence from\n", "ours. A computer cannot understand the depth of Ernest Hemingway’s\n", "apocryphal six-word novel, “For sale: baby shoes, never worn”, because it\n", "isn’t equipped with the ability to model the complexity of humanity that\n", "underlies that statement."
], "id": "c634a03d-f1f3-4b4f-824b-5206cc8441a2" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Revolution\n", "\n", "Arguably the information revolution we are experiencing is unprecedented\n", "in history. But changes in the way we share information have a long\n", "history. Over 5,000 years ago in the city of Uruk, on the banks of the\n", "Euphrates, communities which relied on the water to irrigate their crops\n", "developed an approach to recording transactions in clay. Eventually the\n", "recording system became sophisticated enough that their oral\n", "histories could be recorded in the form of the first epic: Gilgamesh.\n", "\n", "\n", "\n", "Figure: Chicago Stone, side 2, recording sale of a number of fields,\n", "probably from Isin, Early Dynastic Period, c. 2600 BC, black basalt\n", "\n", "Writing was initially developed for people as a record of who owed what to\n", "whom, expanding individuals’ capacity to remember. But over a five\n", "hundred year period writing evolved to become a tool for literature as\n", "well. More pithily put, writing was invented by accountants not poets\n", "(see e.g. [this piece by Tim\n", "Harford](https://www.bbc.co.uk/news/business-39870485)).\n", "\n", "In some respects today’s revolution is different, because it also\n", "involves the creation of stories as well as their curation. But in some\n", "fundamental ways we can see what we have produced as another tool for us\n", "in the information revolution." ], "id": "c0a3b6f7-adca-4dc3-b50d-545bef351306" }, { "cell_type": "markdown", "metadata": {}, "source": [ "# The Future of Professions\n", "\n", "Figure: [The Future of\n", "Professions](https://www.amazon.co.uk/Future-Professions-Technology-Transform-Experts/dp/0198713398)\n", "(Susskind and Susskind, 2015) is a 2015 book focussed on how the next\n", "wave of technology revolution is going to affect the professions."
], "id": "e558461f-2376-469c-ba2f-59852626eabb" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Coin Pusher\n", "\n", "Disruption of society is like a coin pusher: it’s those who are already\n", "on the edge who are most likely to be affected by disruption.\n", "\n", "\n", "\n", "Figure: A coin pusher is a game where coins are dropped into the top\n", "of the machine, and they disrupt those on the existing steps. With any\n", "coin drop, many coins move, but it is those on the edge, who are often\n", "only indirectly affected, who are most traumatically affected by the\n", "change.\n", "\n", "One danger of the current hype around ChatGPT is that we are overly\n", "focussing on the fact that it seems to have a significant effect on\n", "professional jobs; people are naturally asking the question “what does\n", "it do for my role?”. No doubt, there will be disruption, but the coin\n", "pusher hypothesis suggests that the disruption will likely involve\n", "movement on the same step. However, it is those on the edge already, who\n", "are often not working directly in the information economy and who often\n", "have less of a voice in the policy conversation, who are likely to be\n", "most disrupted." ], "id": "8adfdc57-b553-482c-bbda-eebcc496ac79" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Royal Society Report\n", "\n", "Figure: The Royal Society report on Machine Learning was released on\n", "25th April 2017\n", "\n", "A useful reference for the state of the art in machine learning is the UK\n", "Royal Society Report, [Machine Learning: Power and Promise of Computers\n", "that Learn by\n", "Example](https://royalsociety.org/~/media/policy/projects/machine-learning/publications/machine-learning-report.pdf)."
], "id": "e5ed94fb-1532-4f3b-86a4-9621bed38108" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Public Research\n", "\n", "Figure: The Royal Society commissioned [public research from\n", "Mori](https://royalsociety.org/-/media/policy/projects/machine-learning/publications/public-views-of-machine-learning-ipsos-mori.pdf)\n", "as part of the machine learning review.\n", "\n", "\n", "\n", "Figure: One of the questions focussed on machine learning\n", "applications.\n", "\n", "\n", "\n", "Figure: The public were broadly supportive of a range of application\n", "areas.\n", "\n", "\n", "\n", "Figure: But they failed to see the point in AIs that could produce\n", "poetry." ], "id": "37009cdb-8231-45c2-9dfa-8ea3798bfaf5" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Mercutio\n", "\n", "It’s ironic that despite the public focus on transport, health, cities\n", "and crime, and their confusion as to why we might want AI to do art, one\n", "area where significant progress has been made is on AIs that can create\n", "literature.\n", "\n", "\n", "\n", "Figure: Despite the public’s desire to see challenges in health,\n", "finance, transport and crime solved by machine learning and their\n", "skepticism around poetry, one of the problems we have solved seems to be\n", "poetry.\n", "\n", "> **User**: Hi ChatGPT, Could you write soliloquy that introduces what a\n", "> Large Language Model is as spoken by Mercutio?\n", ">\n", "> **Machine**: Ah, friends, lend me your ears, I prithee, For now I must\n", "> utter a tale of wonder, Of a creation divine, called the Large\n", "> Language Model, An oracle, a sorcerer, a wordsmith, yea verily, a\n", "> master of the quill.\n", ">\n", "> From the depths of this scholarly dominion, A creature of artifice did\n", "> rise; Twas wrought of ones and zeroes, woven as if by Puck’s own hand,\n", "> To mimic our thoughts, our voice, our very essence.\n", ">\n", "> A 
thousand minds, nay, a millionfold, Conjoined in this digital\n", "> Athenaeum, Words and phrases, like soldiers in battle, Marching to\n", "> form the verses of our age.\n", ">\n", "> This Large Language Model, a changeling so fair, Hath the power to\n", "> bring forth wisdom, To transmute our queries, our curiosities, Into\n", "> answers, clear as a midsummer’s day.\n", ">\n", "> Yet, beware, my companions, for magic hath its limits, Not unlike the\n", "> mortal bounds of we who dwell in flesh, The Large Language Model hath\n", "> its strengths, But ’tis not an all-knowing oracle, a god it is not.\n", ">\n", "> As Mercutio, your humble servant, doth stand before thee, I marvel at\n", "> the art and science that hath brought forth this wondrous creation,\n", "> This Large Language Model, a testament to man’s yearning, To\n", "> comprehend the world and make meaning of the great tapestry of\n", "> existence.\n", ">\n", "> In faith, dear friends, let us embrace the magic and beauty of this\n", "> digital bard, And mayhaps, by its grace, we shall glean some wisdom,\n", "> As we wander together, hand in hand, Through this boundless realm of\n", "> knowledge and wonder.\n", ">\n", "> ChatGPT transcript, 14th April 2023." ], "id": "1c7b52ce-8da4-4f5c-89cc-44d13b7e6056" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The Blue Marble\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "Figure: The Blue Marble, a photo of Earth taken from Apollo 17." ], "id": "61dbc5fb-4be2-48f7-a8ed-1148bb191db3" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Eagle from Columbia\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "Figure: Eagle photographed from Columbia on its return from the Lunar\n", "surface." ], "id": "64a66320-56bd-4588-99b4-b576ff40d323" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Amelia Earhart\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "Figure: The Little Red Bus, Amelia Earhart’s plane in Derry after\n", "landing." 
], "id": "d3b2bc1d-338a-4b32-a914-c02d6f1d6414" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## NACA Langley\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "Figure: 1945 photo of the NACA test pilots, from left Mel Gough, Herb\n", "Hoover, Jack Reeder, Stefan Cavallo and Bill Gray (photo NASA, NACA LMAL\n", "42612)\n", "\n", "The NACA Langley Field proving ground tested US aircraft. Bob Gilruth\n", "worked on the [flying qualities of\n", "aircraft](https://ntrs.nasa.gov/search.jsp?R=19930091834). One of his\n", "collaborators suggested that\n", "\n", "> Hawker Hurricane airplane. A heavily armed fighter airplane noted for\n", "> its role in the Battle of Britain, the Hurricane’s flying qualities\n", "> were found to be generally satisfactory. The most notable deficiencies\n", "> were heavy aileron forces at high speeds and large friction in the\n", "> controls.\n", ">\n", "> W. Hewitt Phillips[1]\n", "\n", "and\n", "\n", "> Supermarine Spitfire airplane. A high-performance fighter noted for\n", "> its role in the Battle of Britain and throughout WW II, the Spitfire\n", "> had desirably light elevator control forces in maneuvers and near\n", "> neutral longitudinal stability. Its greatest deficiency from the\n", "> combat standpoint was heavy aileron forces and sluggish roll response\n", "> at high speeds.\n", ">\n", "> W. Hewitt Phillips[2]\n", "\n", "Gilruth went beyond the reports of feel to characterise how the plane\n", "should respond to different inputs on the control stick. 
In other words\n", "he quantified that feel of the plane.\n", "\n", "[1] monographs\n", "\n", "[2] monographs" ], "id": "cfbcd422-83ea-44ff-8b95-da23dd9dede5" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Computer Conversations\n", "\n", "\\[edit\\]" ], "id": "8e7f5aeb-4d8c-43b6-8a17-f50c28f4b2ab" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import notutils as nu\n", "from ipywidgets import IntSlider" ], "id": "24bef4b9-35f8-42c2-a1e4-b73e9a311e32" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import notutils as nu" ], "id": "6094cab8-8bac-438b-ae62-bea35fcc1792" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "nu.display_plots('anne-bob-conversation{sample:0>3}.svg', \n", " 'https://inverseprobability.com/talks/../slides/diagrams/', sample=IntSlider(0, 0, 7, 1))" ], "id": "e9ef068d-439b-4e2e-84db-454633d7f982" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: Conversation relies on internal models of other\n", "individuals.\n", "\n", "\n", "\n", "Figure: Misunderstanding of context and who we are talking to leads\n", "to arguments.\n", "\n", "Similarly, we find it difficult to comprehend how computers are making\n", "decisions. Because they do so with more data than we can possibly\n", "imagine.\n", "\n", "In many respects, this is not a problem, it’s a good thing. Computers\n", "and us are good at different things. But when we interact with a\n", "computer, when it acts in a different way to us, we need to remember\n", "why.\n", "\n", "Just as the first step to getting along with other humans is\n", "understanding other humans, so it needs to be with getting along with\n", "our computers.\n", "\n", "Embodiment factors explain why, at the same time, computers are so\n", "impressive in simulating our weather, but so poor at predicting our\n", "moods. 
Our complexity is greater than that of our weather, and each of\n", "us is tuned to read and respond to one another.\n", "\n", "Their intelligence is different. It is based on very large quantities of\n", "data that we cannot absorb. Our computers don’t have a complex internal\n", "model of who we are. They don’t understand the human condition. They are\n", "not tuned to respond to us as we are to each other.\n", "\n", "Embodiment factors encapsulate a profound thing about the nature of\n", "humans. Our locked-in intelligence means that we are striving to\n", "communicate, so we put a lot of thought into what we’re communicating\n", "with. And if we’re communicating with something complex, we naturally\n", "anthropomorphize it.\n", "\n", "We give our dogs, our cats, and our cars human motivations. We do the\n", "same with our computers. We anthropomorphize them. We assume that they\n", "have the same objectives as us and the same constraints. They don’t.\n", "\n", "This means that when we worry about artificial intelligence, we worry\n", "about the wrong things. We fear computers that behave like more powerful\n", "versions of ourselves that will strive to outcompete us.\n", "\n", "In reality, the challenge is that our computers cannot be human enough.\n", "They cannot understand us with the depth we understand one another. They\n", "drop below our cognitive radar and operate outside our mental models.\n", "\n", "The real danger is that computers don’t anthropomorphize. They’ll make\n", "decisions in isolation from us without our supervision because they\n", "can’t communicate truly and deeply with us.\n", "\n", "" ], "id": "8172f65e-d4dd-48dc-a958-efda2d60044a" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## A Question of Trust\n", "\n", "In Baroness Onora O’Neill’s Reith Lectures from 2002, she raises the\n", "challenge of trust. 
There are many aspects to her arguments, but one of\n", "the key points she makes is that we cannot trust without the notion of\n", "duty. O’Neill is bemoaning the substitution of duty with process. The\n", "idea is that processes and transparency are supposed to hold us to\n", "account by measuring outcomes. But these processes themselves overwhelm\n", "decision makers and undermine their professional duty to deliver the\n", "right outcome.\n", "\n", "\n", "\n", "Figure: [A Question of Trust by Onora\n", "O’Neill](https://www.bbc.co.uk/programmes/p00gpzfq), which examines the\n", "nature of trust and its role in society.\n", "\n", "> Again Universities are to treat each applicant fairly on the basis of\n", "> ability and promise, but they are supposed also to admit a socially\n", "> more representative intake.\n", ">\n", "> There’s no guarantee that the process meets the target.\n", ">\n", "> Onora O’Neill *A Question of Trust: Called to Account* Reith Lectures\n", "> 2002 O’Neill (2002)\n", "\n", "O’Neill was speaking in 2002, in the early days of the internet and\n", "before social media. Many of her thoughts are even more relevant\n", "today than they were when she spoke. This is because the increased\n", "availability of information and machine-driven decision-making make the\n", "mistaken premise, that process is an adequate substitute for duty, more\n", "apparently plausible. 
But this undermines what O’Neill calls\n", "“intelligent accountability”, which is not accounting by the numbers,\n", "but through professional education and institutional safeguards.\n", "\n", "" ], "id": "5c2501a9-31f1-4712-9494-862dd15a241d" }, { "cell_type": "markdown", "metadata": {}, "source": [ "# The Structure of Scientific Revolutions\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "Figure: [The Structure of Scientific Revolutions by Thomas S.\n", "Kuhn](https://en.wikipedia.org/wiki/The_Structure_of_Scientific_Revolutions)\n", "suggests scientific paradigms are recorded in books." ], "id": "f1d2acc4-3de1-4fb5-a141-57ee654c8e94" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Blake’s Newton\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "Figure: William Blake’s *Newton*." ], "id": "f8f6806b-9a11-4c37-a270-ae72200ce366" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Lunette Rehoboam Abijah\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "Figure: Lunette containing Rehoboam and Abijah." ], "id": "e48cc3ea-cc06-41f9-9847-68af299a716c" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Elohim Creating Adam\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "Figure: William Blake’s *Elohim Creating Adam*." 
], "id": "e9b710e1-5900-46eb-b685-10c63f6a1735" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fall and Expulsion from Garden of Eden\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "Figure: Photo of detail of the fall and expulsion from the Garden of\n", "Eden.\n", "\n", "\n", "\n", "Figure: People communicate through artifacts and culture.\n", "\n", "" ], "id": "ee4f6a22-df97-415c-8df5-1e95533eb0f2" }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Lecture 2\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "Figure: Bandwidth vs Complexity.\n", "\n", "" ], "id": "1919e079-e090-40c6-b5d1-9c00ad5d45d2" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Artificial Intelligence\n", "\n", "\\[edit\\]\n", "\n", "One of the struggles of artificial intelligence is that the term means\n", "different things to different people. Our intelligence is precious to\n", "us, and the notion that it can be easily recreated is disturbing to us.\n", "This leads to some dystopian notions of artificial intelligence, such as\n", "the singularity.\n", "\n", "Depending on whether this powerful technology is viewed as beneficent or\n", "maleficent, it can be viewed either as a helpful assistant, in the\n", "manner of Jeeves, or a tyrannical dictator.\n", "\n", "" ], "id": "8f6ad27b-c8a9-4f7f-a1bb-5dafcefd816b" }, { "cell_type": "markdown", "metadata": {}, "source": [ "# The Great AI Fallacy\n", "\n", "\\[edit\\]\n", "\n", "There is a lot of variation in the use of the term artificial\n", "intelligence. I’m sometimes asked to define it, but depending on whether\n", "you’re speaking to a member of the public, a fellow machine learning\n", "researcher, or someone from the business community, the sense of the\n", "term differs.\n", "\n", "However, underlying its use I’ve detected one disturbing trend. 
A trend\n", "I’m beginning to think of as “The Great AI Fallacy”.\n", "\n", "The fallacy is associated with an implicit promise that is embedded in\n", "many statements about Artificial Intelligence. Artificial Intelligence,\n", "as it currently exists, is merely a form of automated decision making.\n", "The implicit promise of Artificial Intelligence is that it will be the\n", "first wave of automation where the machine adapts to the human, rather\n", "than the human adapting to the machine.\n", "\n", "How else can we explain the suspension of sensible business judgment\n", "that is accompanying the hype surrounding AI?\n", "\n", "This fallacy is particularly pernicious because there are serious\n", "benefits to society in deploying this new wave of data-driven automated\n", "decision making. But the AI Fallacy is causing us to suspend the\n", "calibrated skepticism that is needed to deploy these systems safely and\n", "efficiently.\n", "\n", "The problem is compounded because many of the techniques that we’re\n", "speaking of were originally developed in academic laboratories in\n", "isolation from real-world deployment.\n", "\n", "\n", "\n", "Figure: We seem to have fallen for a perspective on AI that suggests\n", "it will adapt to our schedule, rather in the manner of a 1930s\n", "manservant.\n", "\n", "The history of automation and technology is a history of us adapting to\n", "technological change. The invention of the railways brought the need for\n", "consistent national times to timetable our movements. 
The development of\n", "the factory system in the mills of Derbyshire required workers to\n", "operate and maintain the machines that replaced them.\n", "\n", "Listening to modern conversations about artificial intelligence, I\n", "think the use of the term *intelligence* has given rise to an idea that\n", "this technology will be the first technology to adapt to us.\n", "\n", "In Greek mythology, Panacea was the goddess of the universal remedy. One\n", "consequence of the pervasive potential of AI is that it is positioned,\n", "like Panacea, as the purveyor of a universal solution. Whether it is\n", "overcoming industry’s productivity challenges, or as a salve for\n", "strained public sector services, or a remedy for pressing global\n", "challenges in sustainable development, AI is presented as an elixir to\n", "resolve society’s problems.\n", "\n", "In practice, translation of AI technology into practical benefit is not\n", "simple. Moreover, a growing body of evidence shows that risks and\n", "benefits from AI innovations are unevenly distributed across society.\n", "\n", "When carelessly deployed, AI risks exacerbating existing social and\n", "economic inequalities.\n", "\n", "I’m reminded of this because from 2015 to 2017 I was on the Working\n", "Group that compiled the Royal Society’s machine learning report. The\n", "process of constructing the report spanned the UK Referendum and\n", "the 2016 US election. I remember vividly a meeting we convened at the\n", "Society in London which had experts alongside MPs from all parties,\n", "policy advisors and civil servants. One of the MPs (likely correctly)\n", "pointed out “I suspect no one around this table voted for Brexit” to\n", "which I replied “But isn’t that the problem? 
There are a large number of\n", "people who aren’t empowered who are experiencing quite a different\n", "reality than us. And they aren’t represented in these forums.” So it’s no\n", "surprise that so much of the press conversation around AI is still\n", "focussed on how it is likely to affect middle class jobs. We shouldn’t\n", "underestimate these effects, but it’s often the case that better\n", "educated people are better placed to deal with such challenges. For\n", "example, when stock brokers’ roles disappeared they simply moved on to\n", "other roles in banks and related industries." ], "id": "b1c06d64-6297-40d9-a87e-a5596e264359" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Intellectual Debt\n", "\n", "\n", "\n", "Figure: Jonathan Zittrain’s term to describe the challenges of\n", "explanation that come with AI is Intellectual Debt.\n", "\n", "In the context of machine learning and complex systems, Jonathan\n", "Zittrain has coined the term [“Intellectual\n", "Debt”](https://medium.com/berkman-klein-center/from-technical-debt-to-intellectual-debt-in-ai-e05ac56a502c)\n", "to describe the challenge of understanding what you’ve created. In [the\n", "ML@CL group we’ve been focussing on developing the notion of a\n", "*data-oriented\n", "architecture*](https://mlatcl.github.io/projects/data-oriented-architectures-for-ai-based-systems.html)\n", "to deal with intellectual debt (Cabrera et al., 2023).\n", "\n", "Zittrain points out the challenge around the lack of interpretability of\n", "individual ML models as the origin of intellectual debt. In machine\n", "learning I refer to work in this area as fairness, interpretability and\n", "transparency or FIT models. To an extent I agree with Zittrain, but if\n", "we understand the context and purpose of the decision making, I believe\n", "this is readily put right by the correct monitoring and retraining\n", "regime around the model. 
A concept I refer to as “progression testing”.\n", "Indeed, the best teams do this at the moment, and their failure to do it\n", "feels more of a matter of technical debt rather than intellectual,\n", "because arguably it is a maintenance task rather than an explanation\n", "task. After all, we have good statistical tools for interpreting\n", "individual models and decisions when we have the context. We can\n", "linearise around the operating point, we can perform counterfactual\n", "tests on the model. We can build empirical validation sets that explore\n", "fairness or accuracy of the model." ], "id": "380a713f-3463-4d1a-b6cb-4a2505848570" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Technical Debt\n", "\n", "In computer systems the concept of *technical debt* has been surfaced by\n", "authors including Sculley et al. (2015). It is an important concept,\n", "that I think is somewhat hidden from the academic community, because it\n", "is a phenomenon that occurs when a computer software system is deployed." ], "id": "4dd93528-d63b-4d21-b99a-29125e7ff6b4" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Separation of Concerns\n", "\n", "\\[edit\\]\n", "\n", "To construct such complex systems an approach known as “separation of\n", "concerns” has been developed. The idea is that you architect your\n", "system, which consists of a large-scale complex task, into a set of\n", "simpler tasks. Each of these tasks is separately implemented. This is\n", "known as the decomposition of the task.\n", "\n", "This is where Jonathan Zittrain’s beautifully named term “intellectual\n", "debt” rises to the fore. Separation of concerns enables the construction\n", "of a complex system. 
But who is concerned with the overall system?\n", "\n", "- Technical debt is the inability to *maintain* your complex software\n", " system.\n", "\n", "- Intellectual debt is the inability to *explain* your software\n", " system.\n", "\n", "It is right there in our approach to software engineering. “Separation\n", "of concerns” means no one is concerned about the overall system itself." ], "id": "a0fc0ccd-31ff-4f55-a0df-8bb3426f2f21" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The Horizon Scandal\n", "\n", "\\[edit\\]\n", "\n", "In the UK we saw these effects play out in the Horizon scandal: the\n", "accounting system of the national postal service was computerized by\n", "Fujitsu and first installed in 1999, but neither the Post Office nor\n", "Fujitsu were able to control the system they had deployed. When it went\n", "wrong individual sub postmasters were blamed for the systems’ errors.\n", "Over the next two decades they were prosecuted and jailed leaving lives\n", "ruined in the wake of the machine’s mistakes.\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
\n", "
\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\n", "\n", "\n", "
\n", "
\n", "\n", "Figure: The still is from [the 2015 select\n", "committee](https://www.parliamentlive.tv/Event/Index/d05cb9e7-04d0-4d05-8a43-ddd74b1eecc0)." ], "id": "7f73d8fa-c7f3-4ece-a5b8-55394b88c81e" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The Mythical Man-month\n", "\n", "\n", "\n", "Figure: The Mythical Man-month (Brooks, n.d.) is a 1975 book focussed\n", "on the challenges of software project coordination.\n", "\n", "However, when managing systems in production, you soon discover\n", "maintenance of a rapidly deployed system is not your only problem.\n", "\n", "To deploy large and complex software systems, an engineering approach\n", "known as “separation of concerns” is taken. Frederick Brooks’ book “The\n", "Mythical Man-month” (Brooks, n.d.), has itself gained almost mythical\n", "status in the community. It focuses on what has become known as Brooks’\n", "law: “adding manpower to a late software project makes it later”.\n", "\n", "Adding people (men or women!) to a project delays it because of the\n", "communication overhead required to get people up to speed." ], "id": "3a78de55-8412-4b6a-9558-67805d4b5c42" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Artificial vs Natural Systems\n", "\n", "Let’s take a step back from artificial intelligence, and consider\n", "natural intelligence. Or even more generally, let’s consider the\n", "contrast between an artificial *system* and a natural system. The key\n", "difference between the two is that artificial systems are *designed*\n", "whereas natural systems are *evolved*.\n", "\n", "Systems design is a major component of all Engineering disciplines. The\n", "details differ, but there is a single common theme: achieve your\n", "objective with the minimal use of resources to do the job. That provides\n", "efficiency. The engineering designer imagines a solution that requires\n", "the minimal set of components to achieve the result. 
A water pump has\n", "one route through the pump. That minimises the number of components\n", "needed. Redundancy is introduced only in safety critical systems, such\n", "as aircraft control systems. Students of biology, however, will be aware\n", "that in nature system-redundancy is everywhere. Redundancy leads to\n", "robustness. For an organism to survive in an evolving environment it\n", "must first be robust, then it can consider how to be efficient. Indeed,\n", "organisms that evolve to be too efficient at a particular task, like\n", "those that occupy a niche environment, are particularly vulnerable to\n", "extinction.\n", "\n", "This notion is akin to the idea that only the best will survive,\n", "popularly encoded into a notion of evolution by Herbert Spencer’s\n", "quote.\n", "\n", "> Survival of the fittest\n", ">\n", "> [Herbert Spencer](https://en.wikipedia.org/wiki/Herbert_Spencer), 1864\n", "\n", "Darwin himself never said “Survival of the Fittest”; he talked about\n", "evolution by natural selection.\n", "\n", "> Non-survival of the non-fit\n", "\n", "Evolution is better described as “non-survival of the non-fit”. You\n", "don’t have to be the fittest to survive, you just need to avoid the\n", "pitfalls of life. This is the first priority.\n", "\n", "So it is with natural vs artificial intelligences. Any natural\n", "intelligence that was not robust to changes in its external environment\n", "would not survive, and therefore not reproduce. In contrast the\n", "artificial intelligences we produce are designed to be efficient at one\n", "specific task: control, computation, playing chess. They are *fragile*.\n", "\n", "The first rule of a natural system is not “be intelligent”, it is “don’t\n", "be stupid”.\n", "\n", "A mistake we make in the design of our systems is to equate fitness with\n", "the objective function, and to assume it is known and static. 
In\n", "practice, a real environment would have an evolving fitness function\n", "which would be unknown at any given time.\n", "\n", "You can also read this blog post on [Natural and Artificial\n", "Intelligence](http://inverseprobability.com/2018/02/06/natural-and-artificial-intelligence).\n", "\n", "The first criterion of a natural intelligence is *don’t fail*, not\n", "because it has a will or intent of its own, but because if it had failed\n", "it wouldn’t have stood the test of time. It would no longer exist. In\n", "contrast, the mantra for artificial systems is to be more efficient. Our\n", "artificial systems are often given a single objective (in machine\n", "learning it is encoded in a mathematical function) and they aim to\n", "achieve that objective efficiently. These are different characteristics.\n", "Even if we wanted to incorporate *don’t fail* in some form, it is\n", "difficult to design for. To design for “don’t fail”, you have to\n", "consider every way in which things can go wrong; if you miss one\n", "you fail. These cases are sometimes called corner cases. But in a real,\n", "uncontrolled environment, almost everything is a corner. It is difficult\n", "to imagine everything that can happen. This is why most of our automated\n", "systems operate in controlled environments, for example in a factory, or\n", "on a set of rails. Deploying automated systems in an uncontrolled\n", "environment requires a different approach to systems design. One that\n", "accounts for uncertainty in the environment and is robust to unforeseen\n", "circumstances." ], "id": "f6f2ef95-1e4f-478b-9515-f9fa3b20d2ff" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Today’s Artificial Systems\n", "\n", "The systems we produce today only work well when their tasks are\n", "pigeonholed, bounded in their scope. 
To achieve robust artificial\n", "intelligences we need new approaches to both the design of the\n", "individual components, and the combination of components within our AI\n", "systems. We need to deal with uncertainty and increase robustness.\n", "Today, it is easy to make a fool of an artificially intelligent agent;\n", "technology needs to address the challenge of the uncertain environment\n", "to achieve robust intelligences.\n", "\n", "However, even if we find technological solutions for these challenges,\n", "it may be that the essence of human intelligence remains out of reach.\n", "It may be that the most quintessential element of our intelligence is\n", "defined by limitations. Limitations that computers have never\n", "experienced.\n", "\n", "Claude Shannon developed the idea of information theory: the mathematics\n", "of information. He defined the amount of information we gain when we\n", "learn the result of a coin toss as a “bit” of information. A typical\n", "computer can communicate with another computer with a billion bits of\n", "information per second, equivalent to a billion coin tosses per second.\n", "So how does this compare to us? Well, we can also estimate the amount of\n", "information in the English language. Shannon estimated that the average\n", "English word contains around 12 bits of information, twelve coin tosses;\n", "this means our verbal communication rates are only of the order of\n", "tens to hundreds of bits per second. Computers communicate tens of\n", "millions of times faster than us; in relative terms we are constrained\n", "to a bit of pocket money, while computers are corporate billionaires.\n", "\n", "Our intelligence is not an island, it interacts, it infers the goals or\n", "intent of others, it predicts our own actions and how we will respond to\n", "others. We are social animals, and together we form a communal\n", "intelligence that characterises our species. 
For intelligence to be\n", "communal, our ideas need to be shared somehow. We need to overcome this\n", "bandwidth limitation. The ability to share and collaborate, despite such\n", "constrained ability to communicate, characterises us. We must\n", "intellectually commune with one another. We cannot communicate all of\n", "what we saw, or the details of how we are about to react. Instead, we\n", "need a shared understanding. One that allows us to infer each other’s\n", "intent through context and a common sense of humanity. This\n", "characteristic is so strong that we anthropomorphise any object with\n", "which we interact. We apply moods to our cars, our cats, our\n", "environment. We seed the weather, volcanoes, trees with intent. Our\n", "desire to communicate renders us intellectually animist.\n", "\n", "But our limited bandwidth doesn’t constrain us in our imaginations. Our\n", "consciousness, our sense of self, allows us to play out different\n", "scenarios. To internally observe how our self interacts with others. To\n", "learn from an internal simulation of the wider world. Empathy allows us\n", "to understand others’ likely responses without having the full detail of\n", "their mental state. We can infer their perspective. Self-awareness also\n", "allows us to understand our own likely future responses, to look forward\n", "in time, play out a scenario. Our brains contain a sense of self and a\n", "sense of others. Because our communication cannot be complete it is both\n", "contextual and cultural. When driving a car in the UK a flash of the\n", "lights at a junction concedes the right of way and invites another road\n", "user to proceed, whereas in Italy, the same flash asserts the right of\n", "way and warns another road user to remain.\n", "\n", "Our main intelligence is our social intelligence, intelligence that is\n", "dedicated to overcoming our bandwidth limitation. 
We are individually\n", "complex, but as a society we rely on shared behaviours and\n", "oversimplification of our selves to remain coherent.\n", "\n", "This nugget of our intelligence seems impossible for a computer to\n", "recreate directly, because it is a consequence of our evolutionary\n", "history. The computer, on the other hand, was born into a world of data,\n", "of high bandwidth communication. It was not there through the genesis of\n", "our minds and the cognitive compromises we made are lost to time. To be\n", "a truly human intelligence you need to have shared that journey with us.\n", "\n", "Of course, none of this prevents us emulating those aspects of human\n", "intelligence that we observe in humans. We can form those emulations\n", "based on data. But even if an artificial intelligence can emulate humans\n", "to a high degree of accuracy it is a different type of intelligence. It\n", "is not constrained in the way human intelligence is. You may ask does it\n", "matter? Well, it is certainly important to us in many domains that\n", "there’s a human pulling the strings. Even in pure commerce it matters:\n", "the narrative story behind a product is often as important as the\n", "product itself. Handmade goods attract a price premium over factory\n", "made. Or alternatively in entertainment: people pay more to go to a live\n", "concert than for streaming music over the internet. People will also pay\n", "more to go to see a play in the theatre rather than a movie in the\n", "cinema.\n", "\n", "In many respects I object to the use of the term Artificial\n", "Intelligence. It is poorly defined and means different things to\n", "different people. But there is one way in which the term is very\n", "accurate. The term artificial is appropriate in the same way we can\n", "describe a plastic plant as an artificial plant. It is often difficult\n", "to pick out from afar whether a plant is artificial or not. 
A plastic\n", "plant can fulfil many of the functions of a natural plant, and plastic\n", "plants are more convenient. But they can never replace natural plants.\n", "\n", "In the same way, our natural intelligence is an evolved thing of beauty,\n", "a consequence of our limitations. Limitations which don’t apply to\n", "artificial intelligences and can only be emulated through artificial\n", "means. Our natural intelligence, just like our natural landscapes,\n", "should be treasured and can never be fully replaced." ], "id": "8be838e3-2569-4502-9053-4bb18db95a5d" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Technical Consequence\n", "\n", "Classical systems design assumes that the system is decomposable. That\n", "we can decompose the complex decision making process into distinct and\n", "independently designable parts. The composition of these parts gives us\n", "our final system.\n", "\n", "Nicholas Negroponte, the original founder of MIT’s Media Lab, used to\n", "write a column called ‘bits and atoms’. This referred to the ability of\n", "information to effect movement of goods in the physical world. It is\n", "this interaction where machine learning technologies have the\n", "possibility to bring most benefit.\n", "\n", "\n", "\n", "Figure: Some software components in a ride allocation system. Circled\n", "components are hypothetical, rectangles represent actual data." ], "id": "8221308b-f462-4576-a53e-e9a1757c2d45" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data Oriented Architectures\n", "\n", "In a streaming architecture we shift from management of services, to\n", "management of data streams. Instead of worrying about availability of\n", "the services we shift to worrying about the quality of the data those\n", "services are producing.\n", "\n", "Historically we’ve been *software first*; this is a necessary but\n", "insufficient condition for *data first*. 
We need to move from\n", "software-as-a-service to data-as-a-service, from service oriented\n", "architectures to *data oriented architectures*." ], "id": "9d77868e-c68a-4ee8-8899-f0b5a787f8bf" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data Oriented Principles\n", "\n", "\n", "\n", "Figure: For an overview of data oriented principles see Cabrera et\n", "al. (2023).\n", "\n", "Our work comes from surveying machine learning case studies\n", "(**Paleyes-challenges22?**) identifying the challenges and then\n", "surveying papers that focus on deployment (Cabrera et al., 2023) and\n", "identifying the principles they use.\n", "\n", "The philosophy of DOA is also possible with more standard data\n", "infrastructures, such as SQL databases, but more work has to be done\n", "to ensure that book-keeping around data provenance and origin\n", "is recorded, as well as approaches for snapshotting the data ecosystem.\n", "Our studies (**Paleyes-flow22?**) have made a lot of use of flow based\n", "programming (**Paleyes-flow22?**)." ], "id": "7b595cec-6836-4cf0-9c70-6bc5d08439c5" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## How LLMs are Different\n", "\n", "So far our description has provided an understanding of how digital\n", "computers present problems for our understanding. So what do LLMs do\n", "that’s different?" ], "id": "6d6378c0-fc15-401e-a2a2-6776ddb795bf" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The MONIAC\n", "\n", "[The MONIAC](https://en.wikipedia.org/wiki/MONIAC) was an analogue\n", "computer designed to simulate the UK economy. Analogue computers work\n", "through analogy; the analogy in the MONIAC is that both money and water\n", "flow. 
The MONIAC exploits this through a system of tanks, pipes, valves\n", "and floats that represent the flow of money through the UK economy.\n", "Water flowed from the treasury tank at the top of the model to other\n", "tanks representing government spending, such as health and education.\n", "The machine was initially designed for teaching support but was also\n", "found to be a useful economic simulator. Several were built and today\n", "you can see the original at Leeds Business School; there is also one in\n", "the London Science Museum and one [in the University of Cambridge’s\n", "economics\n", "faculty](https://www.econ.cam.ac.uk/economics-alumni/drip-down-economics-phillips-machine).\n", "\n", "\n", "\n", "Figure: Bill Phillips and his MONIAC (completed in 1949). The machine\n", "is an analogue computer designed to simulate the workings of the UK\n", "economy." ], "id": "eefaed29-c7c5-4efa-8879-906bd821530d" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Donald MacKay\n", "\n", "\n", "\n", "Figure: Donald M. MacKay (1922-1987), a physicist who was an early\n", "member of the cybernetics community and member of the Ratio Club.\n", "\n", "Donald MacKay was a physicist who worked on naval gun targeting during\n", "the second world war. The challenge with gun targeting for ships is\n", "that both the target and the gun platform are moving. The challenge was\n", "tackled using analogue computers, for example in the US the [Mark I fire\n", "control\n", "computer](https://en.wikipedia.org/wiki/Mark_I_Fire_Control_Computer)\n", "which was a mechanical computer. MacKay worked on radar systems for gun\n", "laying, here the velocity and distance of the target could be assessed\n", "through radar and an electro-mechanical analogue computer." 
], "id": "99a0ff62-50b2-4d9a-9b01-24c1703a04d3" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fire Control Systems\n", "\n", "Naval gunnery systems deal with targeting guns while taking into account\n", "movement of ships. The Royal Navy’s Gunnery Pocket Book (The Admiralty,\n", "1945) gives details of one system for gun laying.\n", "\n", "Like many challenges we face today, in the second world war, fire\n", "control was handled by a hybrid system of humans and computers. This\n", "means deploying human beings for the tasks that they can manage, and\n", "machines for the tasks that are better performed by a machine. This\n", "leads to a division of labour between the machine and the human that can\n", "still be found in our modern digital ecosystems.\n", "\n", "\n", "\n", "Figure: The fire control computer set at the centre of a system of\n", "observation and tracking (The Admiralty, 1945).\n", "\n", "As analogue computers, fire control computers from the second world war\n", "would contain components that directly represented the different\n", "variables that were important in the problem to be solved, such as the\n", "inclination between two ships.\n", "\n", "\n", "\n", "Figure: Measuring inclination between two ships (The Admiralty,\n", "1945). Sophisticated fire control computers allowed the ship to continue\n", "to fire while under maneuvers.\n", "\n", "The fire control systems were electro-mechanical analogue computers that\n", "represented the “state variables” of interest, such as inclination and\n", "ship speed with gears and cams within the machine.\n", "\n", "\n", "\n", "Figure: A second world war gun computer’s control table (The\n", "Admiralty, 1945).\n", "\n", "For more details on fire control computers, you can watch a 1953 film on\n", "the US [Mark IA fire control\n", "computer](https://en.wikipedia.org/wiki/Mark_I_Fire_Control_Computer)\n", "from Periscope Film." 
], "id": "fbc1767a-f468-447e-bf72-2c5379ed48bc" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.lib.display import YouTubeVideo\n", "YouTubeVideo('gwf5mAlI7Ug')" ], "id": "4f2b7646-4100-4cd1-89cb-17a3e573cc4a" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Figure: U.S. Navy training film MN-6783a. Basic Mechanisms of Fire\n", "Control Computers. Mechanical Computer Instructional Film 27794 (1953)\n", "for the Mk 1A Fire Control Computer." ], "id": "264e4558-dc77-44c6-846c-a15ae2e4e7f9" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Behind the Eye\n", "\n", "\n", "\n", "Figure: [Behind the\n", "Eye](https://www.amazon.co.uk/Behind-Eye-Gifford-Lectures-MACKAY/dp/0631173323)\n", "(MacKay, 1991) summarises MacKay’s Gifford Lectures, where MacKay uses\n", "the operation of the eye as a window on the operation of the brain.\n", "\n", "Donald MacKay was at King’s College for his PhD. He was just down the\n", "road from Bill Phillips at LSE who was building the MONIAC. He was part\n", "of the Ratio Club, a group of early career scientists who were\n", "interested in communication and control in animals and humans, or more\n", "specifically they were interested in computers and brains. They were part\n", "of an international movement known as cybernetics.\n", "\n", "Donald MacKay wrote of the influence that his own work on radar had on\n", "his interest in the brain.\n", "\n", "> … during the war I had worked on the theory of automated and\n", "> electronic computing and on the theory of information, all of which\n", "> are highly relevant to such things as automatic pilots and automatic\n", "> gun direction. 
I found myself grappling with problems in the design of\n", "> artificial sense organs for naval gun-directors and with the\n", "> principles on which electronic circuits could be used to simulate\n", "> situations in the external world so as to provide goal-directed\n", "> guidance for ships, aircraft, missiles and the like.\n", "\n", "> Later in the 1940’s, when I was doing my Ph.D. work, there was much\n", "> talk of the brain as a computer and of the early digital computers\n", "> that were just making the headlines as “electronic brains.” As an\n", "> analogue computer man I felt strongly convinced that the brain,\n", "> whatever it was, was not a digital computer. I didn’t think it was an\n", "> analogue computer either in the conventional sense.\n", "\n", "> But this naturally rubbed under my skin the question: well, if it is\n", "> not either of these, what kind of system is it? Is there any way of\n", "> following through the kind of analysis that is appropriate to their\n", "> artificial automata so as to understand better the kind of system the\n", "> human brain is? That was the beginning of my slippery slope into brain\n", "> research.\n", ">\n", "> *Behind the Eye* pg 40. Edited version of the 1986 Gifford Lectures\n", "> given by Donald M. MacKay and edited by Valerie MacKay\n", "\n", "Importantly, MacKay distinguishes between the *analogue* computer and\n", "the *digital* computer. As he mentions, his experience was with analogue\n", "machines. An analogue machine is *literally* an analogue. The radar\n", "systems that Wiener and MacKay both worked on were made up of electronic\n", "components such as resistors, capacitors, inductors and/or mechanical\n", "components such as cams and gears. 
Together these components could\n", "represent a physical system, such as an anti-aircraft gun and a plane.\n", "The design of the analogue computer required the engineer to simulate\n", "the real world in analogue electronics, using dualities that exist\n", "between e.g. mechanical circuits (mass, damper, spring) and electronic\n", "circuits (inductor, resistor, capacitor). The analogy between mass and an\n", "inductor, between damper and a resistor and between spring and a capacitor\n", "works because the underlying mathematics is approximated with the same\n", "linear system: a second order differential equation. This mathematical\n", "analogy allowed the designer to map from the real world, through\n", "mathematics, to a virtual world where the components reflected the real\n", "world through analogy." ], "id": "d617c945-7baf-4c7f-9e1d-cc8ac751393e" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Human Analogue Machine\n", "\n", "The machine learning systems we have built today that can reconstruct\n", "human text, or human classification of images, necessarily must have\n", "some aspects to them that are analogous to our understanding. As MacKay\n", "suggests the brain is neither a digital nor an analogue computer, and the\n", "same can be said of the modern neural network systems that are being\n", "tagged as “artificial intelligence”.\n", "\n", "I believe a better term for them is “human-analogue machines”, because\n", "what we have built is not a system that can make intelligent decisions\n", "from first principles (a rational approach) but one that observes how\n", "humans have made decisions through our data and reconstructs that\n", "process. 
Machine learning is more empiricist than rational, but now we have an\n", "empirical approach that distils our evolved intelligence.\n", "\n", "\n", "\n", "Figure: The human analogue machine creates a feature space which is\n", "analogous to the one we use to reason. One way of doing this is to have a\n", "machine attempt to compress all human-generated text in an\n", "auto-regressive manner." ], "id": "23eca322-7ef5-43a3-af90-d4c4432ca5c1" }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Heider and Simmel (1944)" ], "id": "5160511c-5054-4c79-aaa2-3fcab7e0682e" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.lib.display import YouTubeVideo\n", "YouTubeVideo('8FIEZXMUM2I')" ], "id": "4de861bd-f77d-4c9e-8a98-63ea94f91c8b" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Figure: Fritz Heider and Marianne Simmel’s video of shapes from\n", "Heider and Simmel (1944).\n", "\n", "[Fritz Heider](https://en.wikipedia.org/wiki/Fritz_Heider) and [Marianne\n", "Simmel](https://en.wikipedia.org/wiki/Marianne_Simmel)’s experiments\n", "with animated shapes date from 1944 (Heider and Simmel, 1944). Our\n", "interpretation of these objects as showing motives and even emotion is a\n", "combination of our desire for narrative, a need for understanding of\n", "each other, and our ability to empathize. At one level, these are\n", "crudely drawn objects, but in another way, the animator has communicated\n", "a story through simple facets such as their relative motions, their\n", "sizes and their actions. 
We apply our psychological representations to\n", "these faceless shapes to interpret their actions.\n", "\n", "See also a recent review paper on Human Cooperation by Henrich and\n", "Muthukrishna (2021).\n", "\n", "The perils of developing this capability include counterfeit people, a\n", "notion that the philosopher [Daniel Dennett has described in *The\n", "Atlantic*](https://www.theatlantic.com/technology/archive/2023/05/problem-counterfeit-people/674075/).\n", "This is where computers can represent themselves as human and fool\n", "people into doing things on that basis." ], "id": "cc5de544-057f-4eab-b481-2fed08aaa707" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## LLM Conversations\n", "\n", "\n", "\n", "Figure: The focus so far has been on reducing uncertainty to a few\n", "representative values and sharing numbers with human beings. We forget\n", "that most people can be confused by basic probabilities, for example the\n", "prosecutor’s fallacy." ], "id": "284efb86-8249-4a10-bfb6-b2ef296084a0" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.lib.display import YouTubeVideo\n", "YouTubeVideo('0sJjdxn5kcI')" ], "id": "60fa105a-d087-454b-8337-07dc2bc2a4e3" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Figure: The Inner Monologue paper suggests using LLMs for robotic\n", "planning (Huang et al., 2023).\n", "\n", "By interacting directly with machines that have an understanding of\n", "human cultural context, it should be possible to share the nature of\n", "uncertainty in the same way humans do. See for example the paper [Inner\n", "Monologue: Embodied Reasoning through\n", "Planning](https://innermonologue.github.io/) Huang et al. 
(2023).\n", "\n", "But if we can avoid the pitfalls of counterfeit people, this also offers\n", "us an opportunity to *psychologically represent* (Heider, 1958) the\n", "machine in a manner where humans can communicate without special\n", "training. This in turn offers the opportunity to overcome the challenge\n", "of *intellectual debt*.\n", "\n", "Despite the lack of interpretability of machine learning models, they\n", "allow us access to what the machine is doing in a way that bypasses many\n", "of the traditional techniques developed in statistics. But understanding\n", "this new route for access is a major new challenge." ], "id": "98c1705b-509e-4152-887c-983b7bb1258c" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## HAM\n", "\n", "\n", "\n", "Figure: The trinity of human, data, and computer, highlighting the\n", "modern phenomenon. The communication channel between computer and data\n", "now has an extremely high bandwidth. The channel between human and\n", "computer and the channel between data and human are narrow. There is a\n", "new direction of information flow: information is reaching us mediated by the\n", "computer. The focus on classical statistics reflected the importance of\n", "the direct communication between human and data. The modern challenges\n", "of data science emerge when that relationship is being mediated by the\n", "machine.\n", "\n", "\n", "\n", "Figure: The HAM now sits between us and the traditional digital\n", "computer." ], "id": "74f62c15-6529-47ca-b399-5a7044a8be18" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Networked Interactions\n", "\n", "Our modern society intertwines the machine with human interactions. 
The\n", "key question is who has control over these interfaces between humans and\n", "machines.\n", "\n", "\n", "\n", "Figure: Humans and computers interacting should be a major focus of\n", "our research and engineering efforts.\n", "\n", "So the real challenge that we face for society is understanding which\n", "systemic interventions will encourage the right interactions between the\n", "humans and the machine at all of these interfaces.\n", "\n", "\n", "\n", "Figure: Humans use culture, facts and ‘artefacts’ to communicate." ], "id": "bb2f9cad-48c2-4f10-8c84-fa68dd4a6ea0" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Known Biases" ], "id": "c7a7bebd-fc06-45e9-a96c-cfdb6df1601d" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Complexity in Action\n", "\n", "As an exercise in understanding complexity, watch the following video.\n", "You will see the basketball being bounced around, and the players\n", "moving. Your job is to count the passes of those dressed in white and\n", "ignore those of the individuals dressed in black." ], "id": "6b204a0b-269d-4a29-82c0-9d0de00e97fb" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.lib.display import YouTubeVideo\n", "YouTubeVideo('vJG698U2Mvo')" ], "id": "cc4a6b6f-5cad-4e47-b9b3-d62810bc23a3" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Figure: Daniel Simons’s famous illusion “monkey business”. Focus on\n", "the movement of the ball distracts the viewer from seeing other aspects\n", "of the image.\n", "\n", "In a classic study Simons and Chabris (1999) ask subjects to count the\n", "number of passes of the basketball between players on the team wearing\n", "white shirts. 
Fifty percent of the time, these subjects don’t notice the\n", "gorilla moving across the scene.\n", "\n", "The phenomenon of inattentional blindness is well known, e.g. in their\n", "paper Simons and Chabris quote the Hungarian neurologist, Rezsö Bálint:\n", "\n", "> It is a well-known phenomenon that we do not notice anything happening\n", "> in our surroundings while being absorbed in the inspection of\n", "> something; focusing our attention on a certain object may happen to\n", "> such an extent that we cannot perceive other objects placed in the\n", "> peripheral parts of our visual field, although the light rays they\n", "> emit arrive completely at the visual sphere of the cerebral cortex.\n", ">\n", "> Rezsö Bálint 1907 (translated in Husain and Stein 1988, page 91)\n", "\n", "When we combine the complexity of the world with our relatively low\n", "bandwidth for information, problems can arise. Our focus on what we\n", "perceive to be the most important problem can cause us to miss other\n", "(potentially vital) contextual information.\n", "\n", "This phenomenon is known as selective attention or ‘inattentional\n", "blindness’." ], "id": "61c07f94-04e3-422a-97a4-4b5b3dafa760" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.lib.display import YouTubeVideo\n", "YouTubeVideo('_oGAzq5wM_Q')" ], "id": "81f005a9-58e4-4362-a1e7-9f2a8d7ebc9e" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Figure: For a longer talk on inattentional blindness from Daniel Simons\n", "see this video." 
], "id": "396d052d-20bc-4122-8351-c424026890d3" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## A Hypothesis as a Liability\n", "\n", "This analysis is from an article titled “A Hypothesis as a Liability”\n", "(Yanai and Lercher, 2020). They start their article with the following\n", "quote from Hermann Hesse.\n", "\n", "> “ ‘When someone seeks,’ said Siddhartha, ‘then it easily happens that\n", "> his eyes see only the thing that he seeks, and he is able to find\n", "> nothing, to take in nothing. \\[…\\] Seeking means: having a goal. But\n", "> finding means: being free, being open, having no goal.’ ”\n", ">\n", "> Hermann Hesse\n", "\n", "Their idea is that having a hypothesis can constrain our thinking.\n", "However, in answer to their paper Felin et al. (2021) argue that some\n", "form of hypothesis is always necessary, suggesting that a hypothesis\n", "*can* be a liability.\n", "\n", "My view is captured in the introductory chapter to an edited volume on\n", "computational systems biology that I worked on with Mark Girolami,\n", "Magnus Rattray and Guido Sanguinetti.\n", "\n", "\n", "\n", "Figure: Quote from Lawrence (2010) highlighting the importance of\n", "interaction between data and hypothesis.\n", "\n", "Popper nicely captures the interaction between hypothesis and data by\n", "relating it to the chicken and the egg. The important thing is that\n", "these two co-evolve.\n", "\n", "\n", "" ], "id": "39ba5ced-3f56-40b3-83b3-3f4bf42540f1" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Conclusions\n", "\n", "We’re not yet in the position where we understand our own systems.\n", "Intellectual debt is pervasive today, sowing the seeds for many more\n", "Horizon challenges. Part of the difficulty is that shown by Blake with\n", "Newton’s focus on the simple while ignoring the complex. LLMs offer us\n", "the dual potential to either fix the problem or make it much worse. 
In\n", "practice it’s likely they’ll do both.\n", "\n", "" ], "id": "04081e8b-12db-4289-8102-105f5846c2bd" }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Lecture 3" ], "id": "31ec5c42-dab6-4746-82cb-69745535adf9" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Richard Feynman on Doubt\n", "\n", "> One thing is I can live with doubt, and uncertainty and not\n", "> knowing. I think it’s much more interesting to live with not knowing\n", "> than to have an answer that might be wrong.\n", ">\n", "> Richard P. Feynman in *The Pleasure of Finding Things Out* 1981.\n", "\n", "" ], "id": "894f6fd0-7eaf-4595-8ecb-cd7471bc42b5" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Probability Conversations\n", "\n", "\n", "\n", "Figure: The focus so far has been on reducing uncertainty to a few\n", "representative values and sharing numbers with human beings. We forget\n", "that most people can be confused by basic probabilities, for example the\n", "prosecutor’s fallacy.\n", "\n", "In practice we know that probabilities can be very unintuitive. For\n", "example, in court there is a fallacy known as the “prosecutor’s fallacy”\n", "that confuses conditional probabilities. This can cause problems in jury\n", "trials (Thompson, 1989).\n", "\n", "\n", "" ], "id": "f4e4b026-3c12-47ac-b997-5a832ca08b87" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Lies and Damned Lies\n", "\n", "> There are three types of lies: lies, damned lies and statistics\n", ">\n", "> Arthur Balfour 1848-1930\n", "\n", "Arthur Balfour was quoting the lawyer James Munro[1] when he said in 1892\n", "that there are three types of lies: lies, damned lies and statistics.\n", "This is 20 years before the first academic department of applied\n", "statistics was founded at UCL. 
If Balfour were alive today, it is likely\n", "that he’d rephrase his quote:\n", "\n", "> There are three types of lies: lies, damned lies and *big data*.\n", "\n", "Why? Because the challenges of understanding and interpreting big data\n", "today are similar to those that Balfour (who was a Conservative\n", "politician and statesman and would later become Prime Minister) faced in\n", "governing an empire through statistics in the latter part of the 19th\n", "century.\n", "\n", "The quote ‘lies, damned lies and statistics’ was also credited to Benjamin\n", "Disraeli by Mark Twain in Twain’s autobiography.[2] It characterizes the\n", "idea that statistics can be made to prove anything. But Disraeli died in\n", "1881 and Mark Twain died in 1910. The important breakthrough in\n", "overcoming our tendency to over-interpret data came with the\n", "formalization of the field through the development of *mathematical\n", "statistics*.\n", "\n", "Data has an elusive quality: it promises so much but can deliver little,\n", "and it can mislead and misrepresent. To harness it, it must be tamed. In\n", "Balfour and Disraeli’s time during the second half of the 19th century,\n", "numbers and data were being accumulated, the social sciences were being\n", "developed. There was a large-scale collection of data for the purposes\n", "of government.\n", "\n", "The modern ‘big data era’ is on the verge of delivering the same sense\n", "of frustration that Balfour experienced: the early promise of big data\n", "as a panacea is evolving to demands for delivery. For me, personally,\n", "peak-hype coincided with an email I received inviting collaboration on a\n", "project to deploy “*Big Data* and *Internet of Things* in an *Industry\n", "4.0* environment”. 
Further questioning revealed that the actual project\n", "was optimization of the efficiency of a manufacturing production line, a\n", "far more tangible and *realizable* goal.\n", "\n", "The antidote to this verbiage is found in increasing awareness. When\n", "dealing with data the first trap to avoid is the game of buzzword bingo\n", "that we are wont to play. The first goal is to quantify what challenges\n", "can be addressed and what techniques are required. Behind the hype,\n", "fundamentals are changing. The phenomenon is about the increasing access\n", "we have to data: the way customers’ information is recorded and\n", "processes are codified and digitized with little overhead. The Internet of\n", "Things is about the increasing number of cheap sensors that can be\n", "easily interconnected through our modern network structures. But\n", "businesses are about making money, and these phenomena need to be recast\n", "in those terms before their value can be realized.\n", "\n", "For more thoughts on the challenges that statistics brings see Chapter 8\n", "of Lawrence (2024).\n", "\n", "[1] The quote is reported in the *Manchester Guardian* on 29th June\n", "1892.\n", "\n", "[2] Although Twain attributes the quote to Disraeli in this way, there’s [no record of\n", "him having said\n", "this](https://en.wikipedia.org/wiki/Lies,_damned_lies,_and_statistics)." ], "id": "6084aacc-47c8-4ca6-8fd8-275a4c7beb91" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## *Mathematical* Statistics\n", "\n", "[Karl Pearson](https://en.wikipedia.org/wiki/Karl_Pearson) (1857-1936),\n", "[Ronald Fisher](https://en.wikipedia.org/wiki/Ronald_Fisher) (1890-1962)\n", "and others considered the question of what conclusions can truly be\n", "drawn from data. Their mathematical studies act as a restraint on our\n", "tendency to over-interpret and see patterns where there are none. 
They\n", "introduced concepts such as randomized controlled trials that form a\n", "mainstay of our decision making today, from government, to clinicians, to\n", "large scale A/B testing that determines the nature of the web interfaces\n", "we interact with on social media and shopping.\n", "\n", "\n", "\n", "Figure: Karl Pearson (1857-1936), one of the founders of Mathematical\n", "Statistics.\n", "\n", "Their movement did the most to put statistics to rights, to eradicate\n", "the ‘damned lies’. It was known as [‘mathematical\n", "statistics’](https://en.wikipedia.org/wiki/Mathematical_statistics).\n", "Today I believe we should look to the emerging field of *data science*\n", "to play the same role. Data science is an amalgam of statistics, data\n", "mining, computer systems, databases, computation, machine learning and\n", "artificial intelligence. Spread across these fields are the tools we\n", "need to realize data’s potential. For many businesses this might be\n", "thought of as the challenge of ‘converting bits into atoms’. Bits: the\n", "data stored on computer, atoms: the physical manifestation of what we\n", "do; the transfer of goods, the delivery of service. From fungible to\n", "tangible. When solving a challenge through data there is a series of\n", "obstacles that need to be addressed.\n", "\n", "Firstly, data awareness: what data you have and where it’s stored.\n", "Sometimes this includes changing your conception of what data is and how\n", "it can be obtained. From automated production lines to apps on employee\n", "smart phones. Often data is locked away: manual logbooks, confidential\n", "data, personal data. For increasing awareness an internal audit can\n", "help. The website [data.gov.uk](https://data.gov.uk/) hosts data made\n", "available by the UK government. To create this website the government’s\n", "departments went through an audit of what data they each hold and what\n", "data they could make available. 
Similarly, within private businesses\n", "this type of audit could be useful for understanding their internal\n", "digital landscape: after all the key to any successful campaign is a\n", "good map.\n", "\n", "Secondly, availability. How well are the data sources interconnected?\n", "How well curated are they? The curse of Disraeli was associated with\n", "unreliable data and *unreliable statistics*. The misrepresentations this\n", "leads to are worse than the absence of data as they give a false sense\n", "of confidence to decision making. Understanding how to avoid these\n", "pitfalls involves an improved sense of data and its value, one that\n", "needs to permeate the organization.\n", "\n", "The final challenge is analysis, the accumulation of the necessary\n", "expertise to digest what the data tells us. Data requires\n", "interpretation, and interpretation requires experience. Analysis is\n", "providing a bottleneck due to a skill shortage, a skill shortage made\n", "more acute by the fact that, ideally, analysis should be carried out by\n", "individuals not only skilled in data science but also equipped with the\n", "domain knowledge to understand the implications in a given application,\n", "and to see opportunities for improvements in efficiency." ], "id": "6ac15a09-a375-459a-8356-c91b38d27086" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## ‘Mathematical Data Science’\n", "\n", "As a term ‘big data’ promises much and delivers little, to get true\n", "value from data, it needs to be curated and evaluated. The three stages\n", "of awareness, availability and analysis provide a broad framework\n", "through which organizations should be assessing the potential in the\n", "data they hold. Hand waving about big data solutions will not do, it\n", "will only lead to self-deception. The castles we build on our data\n", "landscapes must be based on firm foundations, process and scientific\n", "analysis. 
If we do things right, those are the foundations that will be\n", "provided by the new field of data science.\n", "\n", "Today the statement “There are three types of lies: lies, damned lies\n", "and ‘big data’” may be more apt. We are revisiting many of the mistakes\n", "made in interpreting data from the 19th century. Big data is laid down\n", "by happenstance, rather than actively collected with a particular\n", "question in mind. That means it needs to be treated with care when\n", "conclusions are being drawn. For data science to succeed it needs the\n", "same form of rigor that Pearson and Fisher brought to statistics: a\n", "“mathematical data science” is needed.\n", "\n", "You can also check my blog post on [Lies, Damned Lies and Big\n", "Data](http://inverseprobability.com/2016/11/19/lies-damned-lies-big-data)." ], "id": "f2ef0451-e8a0-4291-bd59-1f20d9d5f176" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Laplace’s Demon" ], "id": "6ff3b89e-4358-42d7-9d16-a4444713b9af" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import notutils as nu\n", "nu.display_google_book(id='1YQPAAAAQAAJ', page='PR17-IA2')" ], "id": "31e250f0-665b-4fe6-8902-cc93b1767c07" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: English translation of Laplace’s demon, taken from the\n", "Philosophical Essay on probabilities Laplace (1814) pg 3.\n", "\n", "One way of viewing what Laplace is saying is that we can take “the\n", "forces by which nature is animated”, or our best\n", "mathematical/computational abstraction of that, which we would call the\n", "*model*, and combine it with the “respective situation of the beings who\n", "compose it”, which I would refer to as the *data*, and if we have an\n", "“intelligence sufficiently vast to submit these data to\n", "analysis”, or sufficient *compute*, then we would have a system for which\n", "“nothing would be uncertain and the 
future, as the past, would be\n", "present in its eyes”, or in other words we can make a *prediction*. Or\n", "more succinctly put we have\n", "\n", "
\n", "\n", "$$\n", "\\text{model} + \\text{data} \\stackrel{\\text{compute}}{\\rightarrow} \\text{prediction}.$$\n", "\n", "
\n", "\n", "Laplace’s demon has been a recurring theme in science; we can also find\n", "it in Stephen Hawking’s book *A Brief History of Time* (Hawking,\n", "1988).\n", "\n", "> If we do discover a theory of everything … it would be the ultimate\n", "> triumph of human reason – for then we would truly know the mind of God\n", ">\n", "> Stephen Hawking in *A Brief History of Time* 1988\n", "\n", "But is it really that simple? Do we just need more and more accurate\n", "models and more and more data?" ], "id": "fb2dddf8-4eaa-48a3-a46c-8484650478ea" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Game of Life\n", "\n", "[John Horton Conway](https://en.wikipedia.org/wiki/John_Horton_Conway)\n", "was a mathematician who developed a game known as the Game of Life. He\n", "died in April 2020, but since he invented the game, he was in effect\n", "‘god’ for this game. But as we will see, just inventing the rules\n", "doesn’t give you omniscience in the game.\n", "\n", "The Game of Life is played on a grid of squares, or pixels. Each pixel\n", "is either on or off. The game has no players, but a set of simple rules\n", "that are followed at each turn." ], "id": "63218fb0-d43a-4d6e-b9b1-d54370f137bc" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Life Rules\n", "\n", "John Conway’s game of life is a cellular automaton where the cells obey\n", "three very simple rules. The cells live on a rectangular grid, so that\n", "each cell has 8 possible neighbors.\n", "\n", "\n", "\n", "\n", "\n", "\n", "
\n", "*loneliness*\n", "\n", "
\n", "\n", "Figure: ‘Death’ through loneliness in Conway’s game of life. If a\n", "cell is surrounded by fewer than two live cells, it ‘dies’ through\n", "loneliness.\n", "\n", "The game proceeds in turns, and each location in the grid is either\n", "alive or dead. Each turn, a cell counts its live neighbors. If there are\n", "fewer than two, the cell ‘dies’ of ‘loneliness’.\n", "\n", "\n", "\n", "\n", "\n", "\n", "
\n", "*overcrowding*\n", "\n", "
\n", "\n", "Figure: ‘Death’ through overpopulation in Conway’s game of life. If a\n", "cell is surrounded by more than three cells, it ‘dies’ through\n", "overcrowding.\n", "\n", "If there are four or more neighbors, the cell ‘dies’ from\n", "‘overcrowding’. If there are two or three neighbors, a live cell persists;\n", "if there are exactly three and the position is currently dead, a new cell is born.\n", "\n", "\n", "\n", "\n", "\n", "\n", "
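The rule for a single cell can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard statement of Conway's rules: a live cell survives with two or three live neighbours, and an empty position gives birth with exactly three. The function name next_state is hypothetical, chosen only for this sketch.

```python
def next_state(alive, live_neighbours):
    # Assumed standard Life rule: survival on 2 or 3 neighbours, birth on exactly 3.
    if alive:
        return live_neighbours in (2, 3)
    return live_neighbours == 3


# Tabulate the outcome for every possible neighbour count (0 to 8).
for n in range(9):
    live_outcome = 'survives' if next_state(True, n) else 'dies'
    empty_outcome = 'birth' if next_state(False, n) else 'stays empty'
    print(n, '| live cell:', live_outcome, '| empty position:', empty_outcome)
```

Everything else in the game follows from applying this one function to every position on the grid at every turn.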
\n", "*birth*\n", "\n", "
\n", "\n", "Figure: Birth in Conway’s life. Any position surrounded by precisely\n", "three live cells will give birth to a new cell at the next turn.\n", "\n", "And that’s it. Those are the simple ‘physical laws’ for Conway’s game.\n", "\n", "The game leads to patterns emerging. Some of these patterns are static,\n", "but some oscillate in place, with varying periods. Others oscillate, but\n", "when they complete their cycle they’ve translated to a new location, in\n", "other words they move. In Life the former are known as\n", "[oscillators](https://conwaylife.com/wiki/Oscillator) and the latter as\n", "[spaceships](https://conwaylife.com/wiki/Spaceship)." ], "id": "a51ba582-724c-4784-b256-90c4aac3435c" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loafers and Gliders\n", "\n", "John Horton Conway, as the creator of the game of life, could be seen\n", "somehow as the god of this small universe. He created the rules. The\n", "rules are so simple that in many senses he, and we, are all-knowing in\n", "this space. But despite our knowledge, this world can still ‘surprise’\n", "us. From the simple rules, emergent patterns of behaviour arise. These\n", "include static patterns that don’t change from one turn to the next.\n", "They also include oscillators, which pulse between different forms\n", "across different periods of time. A particular form of oscillator is\n", "known as a ‘spaceship’: this is one that moves across the board as the\n", "game evolves. One of the simplest and earliest spaceships to be\n", "discovered is known as the glider.\n", "\n", "\n", "\n", "\n", "\n", "\n", "
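The glider's motion can be checked by direct computation. The sketch below is illustrative rather than a reference implementation: it assumes the standard Life rules (survival on two or three live neighbours, birth on exactly three), represents the board as a set of live (row, col) positions on an unbounded grid, and uses hypothetical names step and glider.

```python
from collections import Counter


def step(live):
    # Count, for every position, how many live neighbours it has.
    counts = Counter((r + dr, c + dc)
                     for (r, c) in live
                     for dr in (-1, 0, 1)
                     for dc in (-1, 0, 1)
                     if (dr, dc) != (0, 0))
    # Birth on exactly three neighbours, survival on two or three.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}


# One phase of the glider, written as (row, col) positions.
glider = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}

state = glider
for _ in range(4):
    state = step(state)

# After four turns the same shape reappears one row down and one column across.
shifted = {(r + 1, c + 1) for (r, c) in glider}
print(state == shifted)  # prints True
```

Storing only the live cells keeps the grid unbounded, so the glider can travel indefinitely without reaching an edge.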
\n", "*Glider (1969)*\n", "\n", "
\n", "\n", "Figure: *Left* A Glider pattern discovered in 1969 by Richard K. Guy.\n", "*Right* John Horton Conway, creator of *Life* (1937-2020). The glider\n", "is an oscillator that moves diagonally after creation. From the simple\n", "rules of Life it’s not obvious that such an object does exist, until you\n", "do the necessary computation.\n", "\n", "The glider was ‘discovered’ in 1969 by Richard K. Guy. What do we mean\n", "by discovered in this context? Well, as soon as the game of life is\n", "defined, objects such as the glider do somehow exist, but the many\n", "configurations of the game mean that it takes some time for us to see\n", "one and know it exists. This means that, despite Conway being the creator,\n", "despite the rules of the game being simple, and despite the\n", "rules being deterministic, we are not ‘omniscient’ in any simplistic\n", "sense. It requires computation to ‘discover’ what can exist in this\n", "universe once it’s been defined.\n", "\n", "\n", "\n", "Figure: The Gosper glider gun is a configuration that creates\n", "gliders. A new glider is released after every 30 turns.\n", "\n", "These patterns had to be discovered, in the same way that a scientist\n", "might discover a disease, or an explorer a new land. For example, the\n", "Gosper glider gun was [discovered by Bill Gosper in\n", "1970](https://conwaylife.com/wiki/Bill_Gosper). It is a pattern that\n", "creates a new glider every 30 turns of the game.\n", "\n", "Despite widespread interest in Life, some of its patterns were only very\n", "recently discovered, like the Loafer, discovered in 2013 by Josh Ball.\n", "So, despite the game having existed for over forty years, and the rules\n", "of the game being simple, there are emergent behaviors that are unknown.\n", "\n", "\n", "\n", "\n", "\n", "\n", "
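The sense in which patterns have to be discovered by computation can be illustrated with a small brute-force search. The sketch below is an illustration under stated assumptions, not how any of the patterns above were actually found: it assumes the standard Life rules, tries every non-empty pattern that fits in a 3 x 3 box, and runs each for at most eight turns (an arbitrary cut-off). The names step, corner, shape and classify are hypothetical.

```python
from collections import Counter
from itertools import product


def step(live):
    # One turn of Life on an unbounded grid; live is a set of (row, col) pairs.
    counts = Counter((r + dr, c + dc)
                     for (r, c) in live
                     for dr in (-1, 0, 1)
                     for dc in (-1, 0, 1)
                     if (dr, dc) != (0, 0))
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}


def corner(live):
    # Top-left corner of the bounding box of a non-empty pattern.
    return min(r for r, _ in live), min(c for _, c in live)


def shape(live):
    # The pattern translated so that its bounding box starts at (0, 0).
    r0, c0 = corner(live)
    return frozenset((r - r0, c - c0) for r, c in live)


def classify(pattern, max_turns=8):
    # Run the rules and report how (if at all) the starting shape recurs.
    state = set(pattern)
    for turn in range(1, max_turns + 1):
        state = step(state)
        if not state:
            return ('dies out', turn)
        if shape(state) == shape(pattern):
            if corner(state) != corner(pattern):
                return ('spaceship', turn)
            return ('still life', 1) if turn == 1 else ('oscillator', turn)
    return ('no recurrence found', max_turns)


# Exhaustively try every non-empty pattern that fits in a 3 x 3 box.
box = list(product(range(3), repeat=2))
tally = Counter()
for mask in range(1, 2 ** 9):
    pattern = {box[i] for i in range(9) if (mask >> i) & 1}
    tally[classify(pattern)[0]] += 1
print(tally)

glider = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}
print(classify(glider))  # prints ('spaceship', 4)
```

Even this tiny search turns up still lifes, oscillators and the glider without being told they exist. Larger patterns such as the Loafer sit in a search space that grows exponentially with the size of the box, which is one way to see why they can stay hidden for decades.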
\n", "
\n", "\n", "*Loafer (2013)*\n", "\n", "
\n", "
\n", "\n", "\n", "\n", "
\n", "
\n", "\n", "\n", "\n", "
\n", "\n", "Figure: *Left* A Loafer pattern discovered by Josh Ball in 2013.\n", "*Right* John Horton Conway, creator of *Life* (1937-2020).\n", "\n", "Once these patterns are discovered, they are combined (or engineered) to\n", "create new Life patterns that do some remarkable things. For example,\n", "there’s a Life pattern that runs a Turing machine, or, more remarkably,\n", "there’s a Life pattern that runs Life itself.\n", "\n", "\n", "\n", "Figure: The Game of Life running in Life. The video zooms out\n", "recursively, showing cells that are formed by filling regions with\n", "moving spaceships. Each individual cell in this Game of Life is an [OTCA\n", "metapixel](https://www.conwaylife.com/wiki/OTCA_metapixel) made up\n", "of $2048 \\times 2048$ cells.\n", "\n", "To find out more about the Game of Life you can watch this video by Alan\n", "Zucconi or read his [associated blog\n", "post](https://www.alanzucconi.com/2020/10/13/conways-game-of-life/)." ], "id": "1611d858-9e89-4ff6-b5c8-791547bd62b1" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.lib.display import YouTubeVideo\n", "YouTubeVideo('Kk2MH9O4pXY')" ], "id": "64cb9fc4-b5b9-4d2b-b24a-5feae6c1a442" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Figure: An introduction to the Game of Life by Alan Zucconi.\n", "\n", "Contrast this with our situation in ‘real life’, where we don’t know the\n", "simple rules of the game, the state space is larger, and emergent\n", "behaviors (hurricanes, earthquakes, volcanoes, climate change) have\n", "direct consequences for our daily lives. We can then understand why the\n", "process of ‘understanding’ the physical world is so difficult. We also\n", "see immediately how much easier we might expect the physical sciences to\n", "be than the social sciences, where the emergent behaviors are contingent\n", "on highly complex human interactions." 
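The rules of Life are simple enough to sketch in a few lines of code. The snippet below is a minimal illustration rather than code from these notes: it assumes the standard rules (birth on exactly three live neighbours, survival on two or three), a small wrap-around board, and a hypothetical helper named `life_step`. It checks by computation that a glider reappears one cell further along the diagonal after four turns.

```python
import numpy as np


def life_step(grid):
    """Advance a Life grid by one turn on a wrap-around (toroidal) board."""
    # Count the eight neighbours of every cell by summing shifted copies.
    neighbours = sum(
        np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (dy, dx) != (0, 0)
    )
    # Birth or survival with exactly three live neighbours,
    # survival of a live cell with exactly two.
    return ((neighbours == 3) | ((grid == 1) & (neighbours == 2))).astype(int)


# The five-cell glider, placed on an otherwise empty 8 x 8 board.
glider = np.array([[0, 1, 0],
                   [0, 0, 1],
                   [1, 1, 1]])
board = np.zeros((8, 8), dtype=int)
board[1:4, 1:4] = glider

state = board.copy()
for _ in range(4):
    state = life_step(state)

# Compare with the starting pattern shifted one row down and one column right.
moved = np.array_equal(state, np.roll(np.roll(board, 1, axis=0), 1, axis=1))
print(moved)
```

Nothing in `life_step` mentions gliders. The diagonal motion only shows up once the rules are actually run, which is the sense in which the pattern has to be discovered.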
], "id": "90a759b0-e003-48d9-8913-df0420197b66" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Emergent Behaviour" ], "id": "8ea3dd10-8d07-4070-bdc8-ff26617a4be1" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Laplace’s Gremlin\n", "\n", "> The curve described by a simple molecule of air or vapor is regulated\n", "> in a manner just as certain as the planetary orbits; the only\n", "> difference between them is that which comes from our ignorance.\n", "> Probability is relative, in part to this ignorance, in part to our\n", "> knowledge. We know that of three or greater number of events a single\n", "> one ought to occur; but nothing induces us to believe that one of them\n", "> will occur rather than the others. In this state of indecision it is\n", "> impossible for us to announce their occurrence with certainty. It is,\n", "> however, probable that one of these events, chosen at will, will not\n", "> occur because we see several cases equally possible which exclude its\n", "> occurrence, while only a single one favors it.\n", ">\n", "> — Pierre-Simon Laplace (Laplace, 1814), pg 5\n", "\n", "The representation of ignorance through probability is the true message\n", "of Laplace. I refer to this message as “Laplace’s gremlin”, because it\n", "is the gremlin of uncertainty that interferes with the demon of\n", "determinism to mean that our predictions are not deterministic.\n", "\n", "Our separation of the uncertainty into the data, the model and the\n", "computation gives us three domains in which our doubts can creep into our\n", "ability to predict. Over the last three lectures we’ve introduced some\n", "of the basic tools we can use to unpick this uncertainty. You’ve been\n", "introduced to (or have now reviewed) *Bayes’ rule*. 
The rule, which is\n", "a simple consequence of the product rule of probability, is the\n", "foundation of how we update our beliefs in the presence of new\n", "information.\n", "\n", "The real point of Laplace’s essay was that we don’t have access to all\n", "the data, we don’t have access to a complete physical understanding, and\n", "as the example of the Game of Life shows, even if we did have access to\n", "both (as we do for “Conway’s universe”) we still don’t have access to\n", "all the compute that we need to make deterministic predictions. There is\n", "uncertainty in the system, which means we can’t make precise predictions.\n", "\n", "Gremlins are imaginary creatures used as an explanation of failure in\n", "aircraft, causing crashes. In that sense the Gremlin represents the\n", "uncertainty that a pilot felt about what might go wrong in a plane which\n", "might be “theoretically sound” but in practice is poorly maintained or\n", "exposed to conditions that take it beyond its design criteria. Laplace’s\n", "gremlin is all the things that your model, data and ability to compute\n", "don’t account for, bringing about failures in your ability to predict.\n", "Laplace’s gremlin is the uncertainty in the system.\n", "\n", "\n", "\n", "Figure: Gremlins are seen as the cause of a number of challenges in\n", "this World War II poster." ], "id": "970a56b4-f617-4940-b826-0dfed04eea3f" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Abstraction and Emergent Properties\n", "\n", "\n", "\n", "Figure: A scale of different simulations we might be interested in\n", "when modelling the physical world. The scale is $\\log_{10}$ meters. 
The\n", "scale reflects something about the level of granularity where we might\n", "choose to know “all positions of all items of which nature is\n", "composed”.\n", "\n", "Unfortunately, even if such an equation were to exist, we would be\n", "unlikely to know “all positions of all items of which nature is\n", "composed”. A good example here is computational systems biology. In that\n", "domain we are interested in understanding the underlying function of the\n", "cell. These systems sit somewhere between the two extremes that Laplace\n", "described: “the movements of the greatest bodies of the universe and\n", "those of the smallest atom”.\n", "\n", "When the smallest atom is considered, we need to introduce uncertainty.\n", "We again turn to a different work of Maxwell: building on Bernoulli’s\n", "kinetic theory of gases, we end up with probabilities for representing\n", "the location of the ‘molecules of air’. Instead of a deterministic\n", "location for these particles we represent our belief about their\n", "location in a distribution.\n", "\n", "Computational systems biology is a world of micro-machines, built of\n", "three-dimensional foldings of strings of amino acids into proteins. There are spindles\n", "(stators) and rotors (e.g. [ATP\n", "Synthase](https://en.wikipedia.org/wiki/ATP_synthase)), there are small\n", "copying machines (e.g. [RNA\n", "Polymerase](https://en.wikipedia.org/wiki/RNA_polymerase)), and there are\n", "sequence to sequence translators\n", "([Ribosomes](https://en.wikipedia.org/wiki/Ribosome)). The cells store\n", "information in DNA but have an ecosystem of structures and messages\n", "being built and sent in proteins and RNA. Unpicking these structures has\n", "been a major preoccupation of biology. 
That means knowing where the atoms\n", "of these molecules are in the structure, and how the parts of the\n", "structure move when these small micro-machines are carrying out their\n", "roles.\n", "\n", "We understand most (if not all) of the physical laws that drive the\n", "movements of these molecules, but we don’t understand all the actions of\n", "the cell, nor can we intervene reliably to improve things. So, even in\n", "the case where we have a good understanding of the physical laws,\n", "Laplace’s gremlin emerges in our knowledge of “the positions of all\n", "items of which nature is composed”." ], "id": "204a2d81-9885-4d54-b271-ea4f0cbba951" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Molecular Dynamics Simulations\n", "\n", "By understanding and simulating the physics, we can recreate operations\n", "that are happening at the level of proteins in the human cell.\n", "[V-ATPase](https://en.wikipedia.org/wiki/V-ATPase) is an enzyme that\n", "pumps protons. But at the microscopic level it’s a small machine. It\n", "uses the energy released by ATP hydrolysis to drive protons across a\n", "membrane. A paper in *Science\n", "Advances* (Roh et al., 2020) simulates the functioning of these proteins,\n", "which operate across the cell membrane. This makes these proteins\n", "difficult to crystallize; the response to this challenge is to use a\n", "simulation which (somewhat) abstracts the processes. You can also check\n", "this [blog\n", "post](https://www6.slac.stanford.edu/news/2020-10-07-first-detailed-look-how-molecular-ferris-wheel-delivers-protons-cellular-factories)\n", "from the paper’s press release.\n", "\n", "\n", "\n", "Figure: The V-ATPase enzyme pumps protons across membranes. This\n", "molecular dynamics simulation was published in *Science Advances* (Roh\n", "et al., 2020). The scale is roughly $10^{-8} m$." 
], "id": "c06bfbad-8401-48a4-89dd-83a25f20619b" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Quantum Mechanics\n", "\n", "Alternatively, we can drop down a few scales and consider simulation of\n", "the Schrödinger equation. Intractabilities in the many-electron\n", "Schrödinger equation have been addressed using deep neural networks to\n", "speed up the solution, enabling simulation of chemical bonds (Pfau et\n", "al., 2020). The [PR-blog post is also\n", "available](https://deepmind.com/blog/article/FermiNet). The paper uses a\n", "neural network to model the quantum state of a number of electrons.\n", "\n", "\n", "\n", "Figure: The many-electron Schrödinger equation is important in\n", "understanding how chemical bonds are formed.\n", "\n", "Each of these simulations has the same property of being based on a set\n", "of (physical) rules about how particles interact. But one of the\n", "interesting characteristics of such systems is how the properties of the\n", "system are emergent as the dynamics are allowed to continue.\n", "\n", "These properties cannot be predicted without running the physics, or\n", "equivalently solving the equation. Computation is required. And often the amount\n", "of computation that is required is prohibitive." 
], "id": "a735a6b2-4b7e-45c5-be76-22f6bb38b727" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Lenox Globe\n", "\n", "\n", "\n", "Figure: [The Lenox\n", "globe](http://www.myoldmaps.com/renaissance-maps-1490-1800/314-the-lenox-globe/314-lenox.pdf),\n", "which dates from the early 16th century, one of the earliest known\n", "globes.\n", "\n", "\n", "\n", "Figure: Drawing of the Lenox Globe by the historian for the Magazine\n", "of American History in September 1879.\n", "\n", "\n", "\n", "Figure: Detail from the Lenox globe located in the region of China,\n", "“hic sunt dracones”" ], "id": "24f153cc-f65e-467f-a602-4896331638b2" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Weather\n", "\n", "So what’s going on here? One analogy I like to use is with weather\n", "forecasting. Historically, before the use of computer-driven weather\n", "forecasting, we used a process of interpolation to estimate the pressure.\n", "\n", "\n", "\n", "Figure: Forecast from UK Met Office on 5th June 1944. (detail from\n", ")\n", "\n", "This was problematic for German forces in the Second World War because\n", "they had no ability to predict the weather when it was coming in from\n", "across the UK. Conversely, the Allies had a number of weather stations in\n", "the UK, and some information (perhaps from spies or Enigma decrypts)\n", "about weather on the mainland.\n", "\n", "\n", "\n", "Figure: Forecast from Deutscher Wetterdienst on 5th June 1944.\n", "(detail from\n", ").\n", "Note the lack of measurements within the UK. This is the direction that\n", "weather was coming from, so the location of weather fronts (and\n", "associated storms) was harder for the Deutscher Wetterdienst to predict\n", "than for the Met Office.\n", "\n", "This meant that more accurate forecasts were possible for D-Day for the\n", "Allies than for the defending forces. 
As a result, on the morning that\n", "Eisenhower invaded, Rommel was back in Germany attending his wife’s 50th\n", "birthday party.\n", "\n", "Modern artificial intelligence solutions are using very large amounts of\n", "data to build a landscape in which this interpolation can take place.\n", "Tools like ChatGPT are allowing us to interpolate between different\n", "human concepts. This is an amazing achievement, but it is also a\n", "challenge.\n", "\n", "" ], "id": "ba06783a-3c6a-45c3-a8ce-5504dc1ff66d" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Boulton and Watt’s Lap Engine\n", "\n", "\n", "\n", "\n", "\n", "\n", "
\n", "
\n", "\n", "\n", "\n", "
\n", "
\n", "\n", "*Lap Engine (1788)*\n", "\n", "
\n", "
\n", "
\n", "\n", "total energy = available energy + temperature $\\times$ entropy\n", "\n", "
\n", "
\n", "\n", "Figure: James Watt’s Lap Engine, which incorporates many of his\n", "innovations to the steam engine, making it more efficient.\n", "\n", "" ], "id": "c168b5d3-c91a-42c5-bea6-18cf2aef5e52" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Brownian Motion and Wiener\n", "\n", "Robert Brown was a botanist who was studying plant pollen in 1827 when\n", "he noticed a trembling motion of very small particles contained within\n", "cavities within the pollen. He worked hard to eliminate the potential\n", "source of the movement by exploring other materials where he found it to\n", "be continuously present. Thus, the movement was not associated, as he\n", "originally thought, with life.\n", "\n", "In 1905 Albert Einstein produced the first mathematical explanation of\n", "the phenomenon. This can be seen as our first model of a ‘curve of a\n", "simple molecule of air’. To model the phenomenon Einstein introduced\n", "stochasticity to a differential equation. The particles were being\n", "peppered with high-speed water molecules, which was triggering the\n", "motion. Einstein modelled this as a stochastic process.\n", "\n", "\n", "\n", "Figure: Albert Einstein’s 1905 paper on Brownian motion introduced\n", "stochastic differential equations which can be used to model the ‘curve\n", "of a simple molecule of air’.\n", "\n", "Norbert Wiener was a child prodigy, whose father had schooled him in\n", "philosophy. His father was keen to have him work with the leading\n", "philosophers of the age, so at the age of 18 Wiener arrived in Cambridge\n", "(already with a PhD). He was despatched to study with Bertrand Russell\n", "but Wiener and Russell didn’t get along. Wiener wasn’t persuaded by\n", "Russell’s ideas for theories of knowledge through logic. He was more\n", "aligned with Laplace and his desire for a theory of ignorance. 
In his\n", "autobiography he relates it as the first thing he could see his father\n", "was proud of (at around the age of 10 or 11) (Wiener, 1953).\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
\n", "
\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\n", "\n", "\n", "
\n", "
\n", "
\n", "\n", "\n", "\n", "
\n", "
\n", "\n", "Figure: Bertrand Russell (1872-1970), Albert Einstein (1879-1955),\n", "Norbert Wiener, (1894-1964)\n", "\n", "But Russell (despite also not getting along well with Wiener) introduced\n", "Wiener to Einstein’s works, and Wiener also met G. H. Hardy. He left\n", "Cambridge for Göttingen where he studied with Hilbert. He developed the\n", "underlying mathematics for proving the existence of the solutions to\n", "Einstein’s equation, which are now known as Wiener processes.\n", "\n", "\n", "\n", "Figure: Brownian motion of a large particle in a group of smaller\n", "particles. The movement is known as a *Wiener process* after Norbert\n", "Wiener.\n", "\n", "\n", "\n", "\n", "\n", "\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
\n", "\n", "Figure: Norbert Wiener (1894 - 1964). Founder of cybernetics and the\n", "information era. He used Gibbs’s ideas to develop a “theory of\n", "ignorance” that he deployed in early communication. On the right is\n", "Wiener’s wartime report that used stochastic processes in forecasting\n", "with applications in radar control (image from Coales and Kane\n", "(2014)).\n", "\n", "Wiener himself used the processes in his work. He was focused on\n", "mathematical theories of communication. Between the world wars he was\n", "based at the Massachusetts Institute of Technology, where the burgeoning\n", "theory of electrical engineering was emerging, with a particular focus\n", "on communication lines. Wiener developed theories of communication that\n", "used Gibbs’s entropy to encode information. He also used the ideas\n", "behind the Wiener process for developing tracking methods for radar\n", "systems in the Second World War. These processes are what we now know\n", "as Gaussian processes (Wiener (1949)).\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
\n", "
\n", "\n", "\n", "\n", "
\n", "
\n", "\n", "*James Clerk Maxwell*\n", "\n", "
\n", "
\n", "
\n", "\n", "\n", "\n", "
\n", "
\n", "\n", "*Ludwig Boltzmann*\n", "\n", "
\n", "
\n", "\n", "\n", "\n", "
\n", "\n", "*Josiah Willard Gibbs*\n", "\n", "
\n", "
\n", "\n", "Figure: James Clerk Maxwell (1831-1879), Ludwig Boltzmann (1844-1906)\n", "Josiah Willard Gibbs (1839-1903)" ], "id": "47f4fb70-2111-4933-bab3-b5f5e88aedaf" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Entropy Billiards\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "\n", "Entropy:\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "Figure: Bernoulli’s simple kinetic models of gases assume that the\n", "molecules of air operate like billiard balls." ], "id": "7475d43d-039a-4d71-8cef-70be093364df" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np" ], "id": "7a966b15-4dd8-4538-8835-9b65833bf012" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "p = np.random.randn(10000, 1)\n", "xlim = [-4, 4]\n", "x = np.linspace(xlim[0], xlim[1], 200)\n", "y = 1/np.sqrt(2*np.pi)*np.exp(-0.5*x*x)" ], "id": "961e90a7-59ea-43df-82a6-1f1f8c724ac1" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import mlai.plot as plot\n", "import mlai" ], "id": "5d075c24-3efe-4c07-b0c7-11faf1d7d91e" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "ax.plot(x, y, 'r', linewidth=3)\n", "ax.hist(p, 100, density=True)\n", "ax.set_xlim(xlim)\n", "\n", "mlai.write_figure('gaussian-histogram.svg', directory='./ml')" ], "id": "5195621e-7b9e-4ab6-81b4-8fd4c0f70c8e" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another important figure for Cambridge was the first to derive the\n", "probability distribution that results from small balls banging together\n", "in this manner. 
In doing so, James Clerk Maxwell founded the field of\n", "statistical physics.\n", "\n", "\n", "\n", "Figure: James Clerk Maxwell 1831-1879 Derived distribution of\n", "velocities of particles in an ideal gas (elastic fluid)." ], "id": "d86aa386-6dd9-4078-9799-536f60bb2917" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Maxwell’s Demon\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "Figure: Maxwell’s demon opens and closes a door which allows fast\n", "particles to pass from left to right and slow particles to pass from\n", "right to left. This makes the left hand side colder than the right.\n", "\n", "\n", "\n", "\n", "Entropy:\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "Figure: Maxwell’s Demon. The demon decides balls are either cold\n", "(blue) or hot (red) according to their velocity. Balls are allowed to\n", "pass the green membrane from right to left only if they are cold, and\n", "from left to right, only if they are hot.\n", "\n", "\n", "\n", "Figure: Rabbit and Pooh watch the result of Pooh’s hooshing idea to\n", "move Eeyore towards the shore.\n", "\n", "> When you are a Bear of Very Little Brain, and you Think of Things, you\n", "> find sometimes that a Thing which seemed very Thingish inside you is\n", "> quite different when it gets out into the open and has other people\n", "> looking at it.\n", ">\n", "> A.A. Milne as Winnie-the-Pooh in *The House at Pooh Corner*, 1928\n", "\n", "This comment from Pooh bear comes just as he’s tried to rescue his\n", "donkey friend, Eeyore, from a river by dropping a large stone on him\n", "from a bridge. Pooh’s idea had been to create a wave to push the donkey\n", "to the shore, a process that Pooh’s rabbit friend calls “hooshing”.\n", "\n", "Hooshing is a technique many children will have tried to retrieve a ball\n", "from a river. It can work, so Pooh’s idea wasn’t a bad one, but the\n", "challenge he faced was in its execution. 
Pooh aimed to the side of\n", "Eeyore, unfortunately the stone fell directly on the stuffed donkey. But\n", "where is Laplace’s demon in hooshing? Just as we can talk about Gliders\n", "and Loafers in Conway’s Game of Life, we talk about stones and donkeys\n", "in our Universe. Pooh’s prediction that he can hoosh the donkey with the\n", "stone is not based on the Theory, it comes from observing the way\n", "objects interact in the actual Universe. Pooh is like the mice in\n", "Douglas Adams’s Earth. He is observing his environment. He looks for\n", "patterns in that environment. Pooh then borrows the computation that the\n", "Universe has already done for us. He has seen similar situations before,\n", "perhaps he once used a stone to hoosh a ball. He is then generalising\n", "from these previous circumstances to suggest that he can also hoosh the\n", "donkey. Despite being a bear of little brain, like the mice on Adams’s\n", "Earth, Pooh can answer questions about his universe by observing the\n", "results of the Theory of Everything playing out around him." ], "id": "cbe98b95-2fbe-4133-9eee-b4d54656c88c" }, { "cell_type": "markdown", "metadata": {}, "source": [ "# What is Machine Learning?\n", "\n", "\\[edit\\]\n", "\n", "What is machine learning? At its most basic level machine learning is a\n", "combination of\n", "\n", "$$\\text{data} + \\text{model} \\stackrel{\\text{compute}}{\\rightarrow} \\text{prediction}$$\n", "\n", "where *data* is our observations. They can be actively or passively\n", "acquired (meta-data). The *model* contains our assumptions, based on\n", "previous experience. That experience can be other data, it can come from\n", "transfer learning, or it can merely be our beliefs about the\n", "regularities of the universe. In humans our models include our inductive\n", "biases. The *prediction* is an action to be taken or a categorization or\n", "a quality score. 
The reason that machine learning has become a mainstay\n", "of artificial intelligence is the importance of predictions in\n", "artificial intelligence. The data and the model are combined through\n", "computation.\n", "\n", "In practice we normally perform machine learning using two functions. To\n", "combine data with a model we typically make use of:\n", "\n", "**a prediction function**: it is used to make the predictions. It\n", "includes our beliefs about the regularities of the universe, our\n", "assumptions about how the world works, e.g., smoothness, spatial\n", "similarities, temporal similarities.\n", "\n", "**an objective function**: it defines the ‘cost’ of misprediction.\n", "Typically, it includes knowledge about the world’s generating processes\n", "(probabilistic objectives) or the costs we pay for mispredictions\n", "(empirical risk minimization).\n", "\n", "The combination of data and model through the prediction function and\n", "the objective function leads to a *learning algorithm*. The class of\n", "prediction functions and objective functions we can make use of is\n", "restricted by the algorithms they lead to. If the prediction function or\n", "the objective function is too complex, then it can be difficult to find\n", "an appropriate learning algorithm. Much of the academic field of machine\n", "learning is the quest for new learning algorithms that allow us to bring\n", "different types of models and data together.\n", "\n", "You can also check my blog post on [What is Machine\n", "Learning?](http://inverseprobability.com/2017/07/17/what-is-machine-learning)." 
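As a rough sketch of how these two functions combine into a learning algorithm, the example below assumes a straight-line model, a squared-error cost, and plain gradient descent. All three are illustrative choices rather than anything prescribed in these notes, the names `prediction_function` and `objective_function` are hypothetical, and the data are synthetic.

```python
import numpy as np


def prediction_function(x, w, b):
    """Straight-line prediction: assumes y changes smoothly (linearly) with x."""
    return w * x + b


def objective_function(w, b, x, y):
    """Mean squared error: the 'cost' paid for mispredictions."""
    return np.mean((y - prediction_function(x, w, b)) ** 2)


# Synthetic data: a noisy line with slope 2.0 and offset -0.5.
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 50)
y = 2.0 * x - 0.5 + 0.1 * rng.standard_normal(50)

# Learning algorithm: gradient descent on the objective.
w, b = 0.0, 0.0
learning_rate = 0.1
initial_cost = objective_function(w, b, x, y)
for _ in range(500):
    residual = y - prediction_function(x, w, b)
    # Step against the gradient of the mean squared error.
    w += learning_rate * 2 * np.mean(residual * x)
    b += learning_rate * 2 * np.mean(residual)
final_cost = objective_function(w, b, x, y)

print(w, b, initial_cost, final_cost)
```

Swapping in a more flexible prediction function or a different objective changes which learning algorithms are practical, which is the restriction described above.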
], "id": "d9192d65-a0dd-42cb-b098-6fc15b380f6f" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Kappenball\n", "\n", "\\[edit\\]\n", "\n", "Score:\n", "\n", "\n", "\n", "\n", " Energy:\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "Figure: Kappen Ball\n", "\n", "If you want to complete a task, should you do it now or should you put\n", "it off until tomorrow? Despite being told to not delay tasks, many of us\n", "are deadline driven. Why is this?\n", "\n", "Kappenball is a simple game that illustrates that this behaviour can be\n", "optimal. It is inspired by an example in stochastic optimal control by\n", "[Bert Kappen](https://www.snn.ru.nl/~bertk/). The game is as follows:\n", "you need to place a falling balloon into one of two holes, but if the\n", "balloon misses the holes it will pop on pins placed in the ground. In\n", "‘deterministic mode’, the balloon falls straight towards the ground and\n", "the game is easy. You simply choose which hole to place the ball in, and\n", "you can start to place it there as soon as the ball appears at the top\n", "of the screen. The game becomes more interesting as you increase the\n", "uncertainty. In Kappenball, the uncertainty takes the form of the\n", "balloon being blown left and right as it falls. This movement means that\n", "it is not sensible to decide early on which hole to place the balloon\n", "in. A better strategy is to wait and see which hole the ball falls\n", "towards. You can then place it in that hole using less energy than in\n", "deterministic mode. Sometimes, the ball even falls into the hole on its\n", "own, and you don’t have to expend any energy, but it requires some skill\n", "to judge when you need to intervene. For this system Bert Kappen has\n", "shown mathematically that the best solution is to wait until the ball is\n", "close to the hole before you push it in. 
In other words, you should be\n", "deadline driven.\n", "\n", "In fact, it seems that here uncertainty is a good thing, because on average\n", "you’ll get the ball into the hole with less energy (by playing\n", "intelligently, and being deadline driven!) than you do with\n", "‘deterministic mode’. It requires some skill to do this, more than the\n", "deterministic system, but by using your resources intelligently you can\n", "get more out of the system. However, if the uncertainty increases too\n", "much then regardless of your skill, you can’t control the ball at all.\n", "\n", "This simple game explains many of the behaviours we exhibit in real\n", "life. If a system is completely deterministic, then we can make a\n", "decision early on and be sure that the ball will ‘drop in the hole’.\n", "However, if there is uncertainty in a system, it can make sense to delay\n", "our decision making until we’ve seen how events ‘pan out’. Be careful\n", "though, as we also see that when the uncertainty is large, if you don’t\n", "have the resources or the skill to be deadline-driven the uncertainty\n", "can overwhelm you and events can quickly move beyond your control." ], "id": "9a40f0e4-96a4-4a30-96b8-4a1e5c9620ff" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prime Air\n", "\n", "One project where the components of machine learning and the physical\n", "world come together is Amazon’s Prime Air drone delivery system.\n", "\n", "Automating the process of moving physical goods through autonomous\n", "vehicles completes the loop between the ‘bits’ and the ‘atoms’. In other\n", "words, the information and the ‘stuff’. 
The idea of the drone is to\n", "complete a component of package delivery, the notion of last mile\n", "movement of goods, but in a fully autonomous way.\n", "\n", "\n", "\n", " \n", "\n", "\n", "\n", " \n", "\n", "\n", "\n", "Gur Kimchi\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", " \n", "\n", "\n", "\n", " \n", "\n", "\n", "\n", "Paul Viola\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", " \n", "\n", "\n", "\n", " \n", "\n", "\n", "\n", "David Moro\n", "\n", "\n", "\n", "\n", "\n", "" ], "id": "efe9c0c5-c3fa-4399-bd05-d963df9d214c" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.lib.display import YouTubeVideo\n", "YouTubeVideo('3HJtmx5f1Fc')" ], "id": "0f778fc2-95fc-429d-a63b-2e3b99896ef9" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Figure: An actual ‘Santa’s sleigh’. Amazon’s prototype delivery\n", "drone. Machine learning algorithms are used across various systems\n", "including sensing (computer vision for detection of wires, people, dogs\n", "etc) and piloting. The technology is necessarily a combination of old\n", "and new ideas. The transition from vertical to horizontal flight is\n", "vital for efficiency and uses sophisticated machine learning to\n", "achieve.\n", "\n", "As Jeff Wilke (who was CEO of Amazon Retail at the time) [announced in\n", "June\n", "2019](https://blog.aboutamazon.com/transportation/a-drone-program-taking-flight)\n", "the technology is ready, but still needs operationalization including\n", "e.g. regulatory approval." 
], "id": "ee6f38c6-9cf1-425d-a1d1-458737a0b832" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.lib.display import YouTubeVideo\n", "YouTubeVideo('wa8DU-Sui8Q')" ], "id": "32360b21-3bfb-4425-9bef-4ec71bdd0082" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Figure: Jeff Wilke (CEO Amazon Consumer) announcing the new drone at\n", "the Amazon 2019 re:MARS event alongside the scale of the Amazon supply\n", "chain.\n", "\n", "> When we announced earlier this year that we were evolving our Prime\n", "> two-day shipping offer in the U.S. to a one-day program, the response\n", "> was terrific. But we know customers are always looking for something\n", "> better, more convenient, and there may be times when one-day delivery\n", "> may not be the right choice. Can we deliver packages to customers even\n", "> faster? We think the answer is yes, and one way we’re pursuing that\n", "> goal is by pioneering autonomous drone technology.\n", "\n", "> Today at Amazon’s re:MARS Conference (Machine Learning, Automation,\n", "> Robotics and Space) in Las Vegas, we unveiled our latest Prime Air\n", "> drone design. We’ve been hard at work building fully electric drones\n", "> that can fly up to 15 miles and deliver packages under five pounds to\n", "> customers in less than 30 minutes. And, with the help of our\n", "> world-class fulfillment and delivery network, we expect to scale Prime\n", "> Air both quickly and efficiently, delivering packages via drone to\n", "> customers within months.\n", "\n", "Covering 15 miles (roughly 24 kilometers) in less than 30 minutes implies\n", "an average speed of around 50 kilometers per hour.\n", "\n", "> Our newest drone design includes advances in efficiency, stability\n", "> and, most importantly, in safety. It is also unique, and it advances\n", "> the state of the art. How so? First, it’s a hybrid design. It can do\n", "> vertical takeoffs and landings – like a helicopter. 
And it’s efficient\n", "> and aerodynamic—like an airplane. It also easily transitions between\n", "> these two modes—from vertical-mode to airplane mode, and back to\n", "> vertical mode.\n", "\n", "> It’s fully shrouded for safety. The shrouds are also the wings, which\n", "> makes it efficient in flight.\n", "\n", "\n", "\n", "Figure: Picture of the drone from Amazon Re-MARS event in 2019.\n", "\n", "> Our drones need to be able to identify static and moving objects\n", "> coming from any direction. We employ diverse sensors and advanced\n", "> algorithms, such as multi-view stereo vision, to detect static objects\n", "> like a chimney. To detect moving objects, like a paraglider or\n", "> helicopter, we use proprietary computer-vision and machine learning\n", "> algorithms.\n", "\n", "> A customer’s yard may have clotheslines, telephone wires, or\n", "> electrical wires. Wire detection is one of the hardest challenges for\n", "> low-altitude flights. Through the use of computer-vision techniques\n", "> we’ve invented, our drones can recognize and avoid wires as they\n", "> descend into, and ascend out of, a customer’s yard." ], "id": "3beab2ad-f9a7-4942-8de2-8d15ff083681" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bayesian Inference by Rejection Sampling\n", "\n", "\\[edit\\]\n", "\n", "One view of Bayesian inference is to assume we are given a mechanism for\n", "generating samples, where we assume that mechanism is representing an\n", "accurate view on the way we believe the world works.\n", "\n", "This mechanism is known as our *prior* belief.\n", "\n", "We combine our prior belief with our observations of the real world by\n", "discarding all those prior samples that are inconsistent with our\n", "observations. The *likelihood* defines mathematically what we mean by\n", "inconsistent with the observations. 
The higher the noise level in the\n", "likelihood, the looser the notion of consistent.\n", "\n", "The samples that remain are samples from the *posterior*.\n", "\n", "This approach to Bayesian inference is closely related to two sampling\n", "techniques known as *rejection sampling* and *importance sampling*. It\n", "is realized in practice in an approach known as *approximate Bayesian\n", "computation* (ABC) or likelihood-free inference.\n", "\n", "In practice, the algorithm is often too slow to be practical, because\n", "most samples will be inconsistent with the observations and as a result\n", "the mechanism must be operated many times to obtain a few posterior\n", "samples.\n", "\n", "However, in the Gaussian process case, when the likelihood also assumes\n", "Gaussian noise, we can operate this mechanism mathematically, and obtain\n", "the posterior density *analytically*. This is the benefit of Gaussian\n", "processes.\n", "\n", "First, we will load in two python functions for computing the covariance\n", "function." 
], "id": "bd0d63a1-f077-4837-b061-9618e61ad6e8" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import mlai" ], "id": "8e4c4cf0-922c-4247-93f7-8968eb87ddbd" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%load -n mlai.Kernel" ], "id": "4b84d8b5-1465-4edb-817f-74eab9534ac8" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# %load -n mlai.Kernel\n", "class Kernel():\n", " \"\"\"Covariance function\n", " :param function: covariance function\n", " :type function: function\n", " :param name: name of covariance function\n", " :type name: string\n", " :param shortname: abbreviated name of covariance function\n", " :type shortname: string\n", " :param formula: latex formula of covariance function\n", " :type formula: string\n", " :param function: covariance function\n", " :type function: function\n", " :param \\**kwargs:\n", " See below\n", "\n", " :Keyword Arguments:\n", " * \"\"\"\n", "\n", " def __init__(self, function, name=None, shortname=None, formula=None, **kwargs): \n", " self.function=function\n", " self.formula = formula\n", " self.name = name\n", " self.shortname = shortname\n", " self.parameters=kwargs\n", " \n", " def K(self, X, X2=None):\n", " \"\"\"Compute the full covariance function given a kernel function for two data points.\"\"\"\n", " if X2 is None:\n", " X2 = X\n", " K = np.zeros((X.shape[0], X2.shape[0]))\n", " for i in np.arange(X.shape[0]):\n", " for j in np.arange(X2.shape[0]):\n", " K[i, j] = self.function(X[i, :], X2[j, :], **self.parameters)\n", "\n", " return K\n", "\n", " def diag(self, X):\n", " \"\"\"Compute the diagonal of the covariance function\"\"\"\n", " diagK = np.zeros((X.shape[0], 1))\n", " for i in range(X.shape[0]): \n", " diagK[i] = self.function(X[i, :], X[i, :], **self.parameters)\n", " return diagK\n", "\n", " def _repr_html_(self):\n", " raise NotImplementedError" ], "id": 
"1a026937-eaf6-4da6-9c3f-e0c749394365" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import mlai" ], "id": "8a6714ba-468a-4e7b-b4e9-a33e81c78cb0" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%load -n mlai.eq_cov" ], "id": "1a80ce36-b634-4a54-a410-1bbf7fce7168" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# %load -n mlai.eq_cov\n", "def eq_cov(x, x_prime, variance=1., lengthscale=1.):\n", " \"\"\"Exponentiated quadratic covariance function.\"\"\"\n", " diffx = x - x_prime\n", " return variance*np.exp(-0.5*np.dot(diffx, diffx)/lengthscale**2)" ], "id": "26ea6dd3-7dba-4fe4-b3e6-7aa306ae305d" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "kernel = Kernel(function=eq_cov,\n", " name='Exponentiated Quadratic',\n", " shortname='eq', \n", " lengthscale=0.25)" ], "id": "8d09bc88-4b14-4b3b-b0a5-0cda7ce285b5" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we sample from a multivariate normal density (a multivariate\n", "Gaussian), using the covariance function as the covariance matrix." 
], "id": "e61ab77d-a5f5-48e7-b2aa-3f1db0c1b9a8" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "np.random.seed(10)\n", "import mlai.plot as plot" ], "id": "75a649f3-bcc8-4f19-9755-1420d5652c74" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plot.rejection_samples(kernel=kernel, \n", " diagrams='./gp')" ], "id": "f331fa98-9ad3-4424-8fe1-f26c8eb71a6f" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import notutils as nu\n", "from ipywidgets import IntSlider" ], "id": "4e969820-9d52-4504-89ac-a3fdd2e0e01c" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "nu.display_plots('gp_rejection_sample{sample:0>3}.png', \n", " directory='./gp', \n", " sample=IntSlider(1,1,5,1))" ], "id": "bc2ad66d-eabd-4078-98df-64d47ddc179b" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "\n", "\n", "Figure: One view of Bayesian inference is we have a machine for\n", "generating samples (the *prior*), and we discard all samples\n", "inconsistent with our data, leaving the samples of interest (the\n", "*posterior*). This is a rejection sampling view of Bayesian inference.\n", "The Gaussian process allows us to do this analytically by multiplying\n", "the *prior* by the *likelihood*." ], "id": "ccb56247-a25f-4190-8f70-f2825044c698" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## DeepFace\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "Figure: The DeepFace architecture (Taigman et al., 2014), visualized\n", "through colors to represent the functional mappings at each layer. 
There\n", "are 120 million parameters in the model.\n", "\n", "The DeepFace architecture (Taigman et al., 2014) consists of layers that\n", "deal with *translation* invariances, known as convolutional layers.\n", "These layers are followed by three locally-connected layers and two\n", "fully-connected layers. Color illustrates feature maps produced at each\n", "layer. The neural network includes more than 120 million parameters,\n", "where more than 95% come from the local and fully connected layers." ], "id": "3cfbcf30-11bc-41a0-89f0-46b75e4d9bdf" }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Deep Learning as Pinball\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "Figure: Deep learning models are compositions of simple functions. We\n", "can think of a pinball machine as an analogy. Each layer of pins\n", "corresponds to one of the layers of functions in the model. Input data\n", "is represented by the location of the ball from left to right when it is\n", "dropped in from the top. Output class comes from the position of the\n", "ball as it leaves the pins at the bottom.\n", "\n", "Sometimes deep learning models are described as being like the brain, or\n", "too complex to understand, but one analogy I find useful to help convey the\n", "gist of these models is to think of them as being similar to early\n", "pinball machines.\n", "\n", "In a deep neural network, we input a number (or numbers), whereas in\n", "pinball, we input a ball.\n", "\n", "Think of the location of the ball on the left-right axis as a single\n", "number. Our simple pinball machine can only take one number at a time.\n", "As the ball falls through the machine, each layer of pins can be thought\n", "of as a different layer of ‘neurons’. 
Each layer acts to move the ball\n", "from left to right.\n", "\n", "In a pinball machine, when the ball gets to the bottom it might fall\n", "into a hole defining a score. In a neural network, that is equivalent to\n", "the decision: a classification of the input object.\n", "\n", "An image has more than one number associated with it, so it is like\n", "playing pinball in a *hyper-space*.\n", "\n", "\n", "\n", "Figure: At initialization, the pins, which represent the parameters\n", "of the function, aren’t in the right place to bring the balls to the\n", "correct decisions.\n", "\n", "\n", "\n", "Figure: After learning, the pins are now in the right place to bring\n", "the balls to the correct decisions.\n", "\n", "Learning involves moving all the pins to be in the correct position, so\n", "that the ball ends up in the right place when it’s fallen through the\n", "machine. But moving all these pins in hyper-space can be difficult.\n", "\n", "In a hyper-space you have to put a lot of data through the machine\n", "to explore the positions of all the pins. Even when you feed many\n", "millions of data points through the machine, there are likely to be\n", "regions in the hyper-space where no ball has passed. When future test\n", "data passes through the machine along a new route, unusual things can\n", "happen.\n", "\n", "*Adversarial examples* exploit this high-dimensional space. If you have\n", "access to the pinball machine, you can use gradient methods to find a\n", "position for the ball in the hyper-space where the image looks like one\n", "thing, but will be classified as another.\n", "\n", "Probabilistic methods explore more of the space by considering a range\n", "of possible paths for the ball through the machine. 
This helps to make\n", "them more data efficient and gives some robustness to adversarial\n", "examples.\n", "\n", "\n", "\n", "\n", "\n", "" ], "id": "43c03e22-826c-4ed6-a9bf-55d3622188d4" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Deep Neural Network\n", "\n", "\\[edit\\]" ], "id": "6102cb6a-965b-477e-9fc8-d5b7fa2cda75" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install daft" ], "id": "f64a3444-cb52-49ed-855d-c17eacd2ca3f" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib\n", "# Comment for google colab (no latex available)\n", "#matplotlib.rc('text', usetex=True)\n", "#matplotlib.rcParams['text.latex.preamble']=[r\"\\usepackage{amsmath}\"]" ], "id": "270cc1b6-3906-4f4e-ad5f-f5217916e7fc" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import mlai.plot as plot" ], "id": "3d32be86-5b75-409d-bd22-269fada28e2e" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#plot.deep_nn(diagrams='./deepgp/')" ], "id": "38f5cb44-bc96-4098-8893-d6c566edcf94" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: A deep neural network. Input nodes are shown at the bottom.\n", "Each hidden layer is the result of applying an affine transformation to\n", "the previous layer and passing the result through an activation function.\n", "\n", "Mathematically, each layer of a neural network is given through\n", "computing the activation function, $\\phi(\\cdot)$, contingent on the\n", "previous layer, or the inputs. In this way the activation functions are\n", "composed to generate more complex interactions than would be possible\n", "with any single layer. 
$$\n", "\\begin{align*}\n", " \\mathbf{ h}_{1} &= \\phi\\left(\\mathbf{W}_1 \\mathbf{ x}\\right)\\\\\n", " \\mathbf{ h}_{2} &= \\phi\\left(\\mathbf{W}_2\\mathbf{ h}_{1}\\right)\\\\\n", " \\mathbf{ h}_{3} &= \\phi\\left(\\mathbf{W}_3 \\mathbf{ h}_{2}\\right)\\\\\n", " f&= \\mathbf{ w}_4 ^\\top\\mathbf{ h}_{3}\n", "\\end{align*}\n", "$$" ], "id": "0ead82fb-f5cb-4648-8ae7-3f3bb1bb7b1f" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bottleneck Layers in Deep Neural Networks\n", "\n", "\\[edit\\]" ], "id": "4a7e924a-fb29-4e63-8b5d-33f57681ed84" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import mlai.plot as plot" ], "id": "d32ae859-ff77-4ac4-8773-34520e808797" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plot.deep_nn_bottleneck(diagrams='./deepgp')" ], "id": "30e7616b-3b73-4e8d-99be-7f54bc70983b" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: Inserting the bottleneck layers introduces a new set of\n", "variables.\n", "\n", "Including the low rank decomposition of $\\mathbf{W}$ in the neural\n", "network, we obtain a new mathematical form. Effectively, we are adding\n", "additional *latent* layers, $\\mathbf{ z}$, in between each of the\n", "existing hidden layers. In a neural network these are sometimes known as\n", "*bottleneck* layers. 
The network can now be written mathematically as $$\n", "\\begin{align}\n", " \\mathbf{ z}_{1} &= \\mathbf{V}^\\top_1 \\mathbf{ x}\\\\\n", " \\mathbf{ h}_{1} &= \\phi\\left(\\mathbf{U}_1 \\mathbf{ z}_{1}\\right)\\\\\n", " \\mathbf{ z}_{2} &= \\mathbf{V}^\\top_2 \\mathbf{ h}_{1}\\\\\n", " \\mathbf{ h}_{2} &= \\phi\\left(\\mathbf{U}_2 \\mathbf{ z}_{2}\\right)\\\\\n", " \\mathbf{ z}_{3} &= \\mathbf{V}^\\top_3 \\mathbf{ h}_{2}\\\\\n", " \\mathbf{ h}_{3} &= \\phi\\left(\\mathbf{U}_3 \\mathbf{ z}_{3}\\right)\\\\\n", " \\mathbf{ y}&= \\mathbf{ w}_4^\\top\\mathbf{ h}_{3}.\n", "\\end{align}\n", "$$\n", "\n", "Substituting each $\\mathbf{ h}_i$ into the line that follows it leaves a\n", "model written only in terms of the latent layers,\n", "\n", "$$\n", "\\begin{align}\n", " \\mathbf{ z}_{1} &= \\mathbf{V}^\\top_1 \\mathbf{ x}\\\\\n", " \\mathbf{ z}_{2} &= \\mathbf{V}^\\top_2 \\phi\\left(\\mathbf{U}_1 \\mathbf{ z}_{1}\\right)\\\\\n", " \\mathbf{ z}_{3} &= \\mathbf{V}^\\top_3 \\phi\\left(\\mathbf{U}_2 \\mathbf{ z}_{2}\\right)\\\\\n", " \\mathbf{ y}&= \\mathbf{ w}_4 ^\\top \\phi\\left(\\mathbf{U}_3 \\mathbf{ z}_{3}\\right)\n", "\\end{align}\n", "$$" ], "id": "4b979686-d5ce-46a0-a057-7e387ae0236a" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Cascade of Gaussian Processes\n", "\n", "\\[edit\\]\n", "\n", "Now suppose we replace each of these neural networks with a Gaussian\n", "process. This is equivalent to taking the limit as the width of each layer goes\n", "to infinity, while appropriately scaling down the outputs.\n", "\n", "$$\n", "\\begin{align}\n", " \\mathbf{ z}_{1} &= \\mathbf{ f}_1\\left(\\mathbf{ x}\\right)\\\\\n", " \\mathbf{ z}_{2} &= \\mathbf{ f}_2\\left(\\mathbf{ z}_{1}\\right)\\\\\n", " \\mathbf{ z}_{3} &= \\mathbf{ f}_3\\left(\\mathbf{ z}_{2}\\right)\\\\\n", " \\mathbf{ y}&= \\mathbf{ f}_4\\left(\\mathbf{ z}_{3}\\right)\n", "\\end{align}\n", "$$" ], "id": "2bec7af8-6ebf-4ad2-b1e4-db153ea0980a" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install gpy" ], "id": "262f715b-ef25-4244-917c-ace8cc4e3c1c" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## GPy: A 
Gaussian Process Framework in Python\n", "\n", "\\[edit\\]\n", "\n", "Gaussian processes are a flexible tool for non-parametric analysis with\n", "uncertainty. The GPy software was started in Sheffield to provide an\n", "easy-to-use interface to GPs, one which allows the user to focus on the\n", "modelling rather than the mathematics.\n", "\n", "\n", "\n", "Figure: GPy is a BSD licensed software code base for implementing\n", "Gaussian process models in Python. It is designed for teaching and\n", "modelling. We welcome contributions which can be made through the GitHub\n", "repository.\n", "\n", "GPy is a BSD licensed software code base for implementing Gaussian\n", "process models in Python. This allows GPs to be combined with a wide\n", "variety of software libraries.\n", "\n", "The software itself is available on\n", "[GitHub](https://github.com/SheffieldML/GPy) and the team welcomes\n", "contributions.\n", "\n", "The aim for GPy is to be a probabilistic-style programming language,\n", "i.e., you specify the model rather than the algorithm. As well as a\n", "large range of covariance functions the software allows for non-Gaussian\n", "likelihoods, multivariate outputs, dimensionality reduction and\n", "approximations for larger data sets.\n", "\n", "The documentation for GPy can be found\n", "[here](https://gpy.readthedocs.io/en/latest/).\n", "\n", "This notebook depends on PyDeepGP. This library can be installed via\n", "pip." 
], "id": "45fc28a9-7d1c-47bd-9f6e-3fdd1d8dee6f" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install --upgrade git+https://github.com/SheffieldML/PyDeepGP.git" ], "id": "639ced78-be62-429f-9547-67702b47cba7" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install mlai" ], "id": "34abb5c6-eeb8-4078-8206-6d0e945e26ab" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Late bind setup methods to DeepGP object\n", "from mlai.deepgp_tutorial import initialize\n", "from mlai.deepgp_tutorial import staged_optimize\n", "from mlai.deepgp_tutorial import posterior_sample\n", "from mlai.deepgp_tutorial import visualize\n", "from mlai.deepgp_tutorial import visualize_pinball\n", "\n", "import deepgp\n", "deepgp.DeepGP.initialize=initialize\n", "deepgp.DeepGP.staged_optimize=staged_optimize\n", "deepgp.DeepGP.posterior_sample=posterior_sample\n", "deepgp.DeepGP.visualize=visualize\n", "deepgp.DeepGP.visualize_pinball=visualize_pinball" ], "id": "be643d58-c72b-4ece-afc3-159fa4b4948a" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Olympic Marathon Data\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "\n", "\n", "\n", "
\n", "\n", "- Gold medal times for the Olympic Marathon since 1896.\n", "- Marathons before 1924 didn’t have a standardized distance.\n", "- Present results using pace per km.\n", "- The 1904 Marathon was badly organized, leading to very slow times.\n", "\n", "\n", "\n", "\n", "Image from Wikimedia Commons \n", "\n", "
\n", "\n", "The first thing we will do is load a standard data set for regression\n", "modelling. The data consists of the pace of Olympic Gold Medal Marathon\n", "winners for the Olympics from 1896 to the present. Let’s load in the data\n", "and plot." ], "id": "839be5f4-f535-483c-917b-16a4e62e8e31" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install pods" ], "id": "7b507213-8ef2-4224-8031-767c3dc015e0" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pods" ], "id": "ab04efa7-e3a6-40a3-9508-d3ab62487097" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data = pods.datasets.olympic_marathon_men()\n", "x = data['X']\n", "y = data['Y']\n", "\n", "# Standardize the targets to zero mean and unit variance\n", "offset = y.mean()\n", "scale = np.sqrt(y.var())\n", "yhat = (y - offset)/scale" ], "id": "7624cefc-f95d-4c20-8dc7-d4e59712ee2c" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import mlai.plot as plot\n", "import mlai" ], "id": "7d1ef276-dca7-4ad7-9063-3bbe037ab048" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\n", "xlim = (1875,2030)\n", "ylim = (2.5, 6.5)\n", "\n", "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "_ = ax.plot(x, y, 'r.',markersize=10)\n", "ax.set_xlabel('year', fontsize=20)\n", "ax.set_ylabel('pace min/km', fontsize=20)\n", "ax.set_xlim(xlim)\n", "ax.set_ylim(ylim)\n", "\n", "mlai.write_figure(filename='olympic-marathon.svg', \n", " directory='./datasets')" ], "id": "e45dbfed-25a8-4863-9256-01a1d25d8935" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: Olympic marathon pace times since 1896.\n", "\n", "Things to notice about the data include the outlier in 1904. In that\n", "year the Olympics was in St Louis, USA. 
Organizational problems and\n", "challenges with dust kicked up by the cars following the race meant that\n", "participants got lost, and very few completed the course. More\n", "recent years see more consistently quick marathons." ], "id": "b66fa302-a896-4ec4-a7c7-40ead53fb6fb" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Alan Turing\n", "\n", "\\[edit\\]\n", "\n", "\n", "\n", "\n", "\n", "\n", "
\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
\n", "\n", "Figure: Alan Turing. In 1946 he was only 11 minutes slower than the\n", "winner of the 1948 games. Would he have won a hypothetical games held in\n", "1946? Source: Alan\n", "Turing Internet Scrapbook.\n", "\n", "If we had to summarise the objectives of machine learning in one word, a\n", "very good candidate for that word would be *generalization*. What is\n", "generalization? From a human perspective it might be summarised as the\n", "ability to take lessons learned in one domain and apply them to another\n", "domain. If we accept the definition given in the first session for\n", "machine learning, $$\n", "\\text{data} + \\text{model} \\stackrel{\\text{compute}}{\\rightarrow} \\text{prediction}\n", "$$ then we see that without a model we can’t generalize: we only have\n", "data. Data is fine for answering very specific questions, like “Who won\n", "the Olympic Marathon in 2012?”, because we have that answer stored.\n", "However, we are not given the answer to many other questions. For\n", "example, Alan Turing was a formidable marathon runner. In 1946 he ran a\n", "time of 2 hours 46 minutes (just under four minutes per kilometer, faster\n", "than I and most of the other [Endcliffe Park\n", "Run](http://www.parkrun.org.uk/sheffieldhallam/) runners can do 5 km).\n", "What is the probability he would have won an Olympics if one had been\n", "held in 1946?\n", "\n", "To answer this question we need to generalize, so let’s first\n", "introduce a formal representation\n", "of what it means to generalize in machine learning." ], "id": "9d82a572-c97a-499d-9c14-777ccf0c107b" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Gaussian Process Fit\n", "\n", "\\[edit\\]\n", "\n", "Our first objective will be to perform a Gaussian process fit to the\n", "data. We’ll do this using the [GPy\n", "software](https://github.com/SheffieldML/GPy)." 
], "id": "68ecea03-8eac-42f4-8b48-3ccdce13c4e4" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import GPy" ], "id": "d5270d5e-2024-49d3-88d4-703f5da4c889" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m_full = GPy.models.GPRegression(x,yhat)\n", "_ = m_full.optimize() # Optimize parameters of covariance function" ], "id": "21378c04-d1c5-40bf-94fd-c59fe66a7ef1" }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first command sets up the model, then `m_full.optimize()` optimizes\n", "the parameters of the covariance function and the noise level of the\n", "model. Once the fit is complete, we’ll try creating some test points,\n", "and computing the output of the GP model in terms of the mean and\n", "standard deviation of the posterior functions between 1870 and 2030. We\n", "plot the mean function and the standard deviation at 200 locations. We\n", "can obtain the predictions using `yt_mean, yt_var = m_full.predict(xt)`,\n", "which returns the predictive mean and variance." ], "id": "1b6be7b5-4ded-4119-b058-dbc6c22ec3de" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "xt = np.linspace(1870,2030,200)[:,np.newaxis]\n", "yt_mean, yt_var = m_full.predict(xt)\n", "yt_sd=np.sqrt(yt_var)" ], "id": "e4a033aa-0155-4181-b566-d005e3d86594" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we plot the results using the helper function in `mlai.plot`." 
], "id": "9e8c7671-2524-4900-9f55-c2048d8d22df" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import mlai.plot as plot\n", "import mlai" ], "id": "84ecf224-8a65-4ce4-8e60-bf77c2963ad8" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "plot.model_output(m_full, scale=scale, offset=offset, ax=ax, xlabel=\"year\", ylabel=\"pace min/km\", fontsize=20, portion=0.2)\n", "ax.set_xlim(xlim)\n", "ax.set_ylim(ylim)\n", "mlai.write_figure(figure=fig,\n", " filename=\"olympic-marathon-gp.svg\", \n", " directory = \"./gp\",\n", " transparent=True, frameon=True)" ], "id": "ef64a9e5-230a-4122-ab44-14f4e6b15efe" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: Gaussian process fit to the Olympic Marathon data. The error\n", "bars are too large, perhaps due to the outlier from 1904." ], "id": "b3f8d1ed-010f-4276-8531-809b15b51b8f" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fit Quality\n", "\n", "In the fit we see that the error bars (coming mainly from the noise\n", "variance) are quite large. This is likely due to the outlier point in\n", "1904. Ignoring that point, a tighter fit is obtained. To\n", "see this, make a version of the model, `m_clean`, where that point is\n", "removed." 
], "id": "2f5abca9-cc6b-44a9-80f9-6404dde7dd1c" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x_clean=np.vstack((x[0:2, :], x[3:, :]))\n", "y_clean=np.vstack((yhat[0:2, :], yhat[3:, :]))\n", "\n", "m_clean = GPy.models.GPRegression(x_clean,y_clean)\n", "_ = m_clean.optimize()" ], "id": "31b4fb9f-ab4f-400b-ab2c-6f008007f5ca" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import mlai.plot as plot\n", "import mlai" ], "id": "333e9ee2-2770-4bae-a27b-b5dca6f85fde" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "plot.model_output(m_clean, scale=scale, offset=offset, ax=ax, xlabel='year', ylabel='pace min/km', fontsize=20, portion=0.2)\n", "ax.set_xlim(xlim)\n", "ax.set_ylim(ylim)\n", "mlai.write_figure(figure=fig,\n", " filename='./gp/olympic-marathon-gp.svg', \n", " transparent=True, frameon=True)" ], "id": "6fe0f993-0794-4ab1-8a7e-8de4ee0e633f" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Deep GP Fit\n", "\n", "\\[edit\\]\n", "\n", "Let’s see if a deep Gaussian process can help here. We will construct a\n", "deep Gaussian process with one hidden layer (i.e. one Gaussian process\n", "feeding into another).\n", "\n", "Build a Deep GP with an additional hidden layer (one dimensional) to fit\n", "the model." 
], "id": "e4781c4d-597b-45b5-904e-bce722008c90" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import GPy\n", "import deepgp" ], "id": "dd921dda-917c-4a6e-be35-ee20d1787428" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "hidden = 1\n", "m = deepgp.DeepGP([y.shape[1],hidden,x.shape[1]],Y=yhat, X=x, inits=['PCA','PCA'], \n", " kernels=[GPy.kern.RBF(hidden,ARD=True),\n", " GPy.kern.RBF(x.shape[1],ARD=True)], # the kernels for each layer\n", " num_inducing=50, back_constraint=False)" ], "id": "3a9591bc-ae3c-45a4-bf8c-8f0a97ac7f17" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Call the initialization\n", "m.initialize()" ], "id": "bcb0fe56-a801-4e5e-a5cb-6f3798a5dcef" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now optimize the model." ], "id": "208581a0-9ad5-4de6-ad52-654b73fa26b9" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for layer in m.layers:\n", " layer.likelihood.variance.constrain_positive(warning=False)\n", "m.optimize(messages=True,max_iters=10000)" ], "id": "cd6b2def-a8f2-47d9-8460-c2ead13cc947" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m.staged_optimize(messages=(True,True,True))" ], "id": "cc83838d-a448-420e-9d33-d75e3b6b83b7" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import mlai.plot as plot\n", "import mlai" ], "id": "139b57ab-2e43-4a80-b568-e6575242f27c" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "plot.model_output(m, scale=scale, offset=offset, ax=ax, xlabel='year', ylabel='pace min/km', \n", " fontsize=20, portion=0.2)\n", "ax.set_xlim(xlim)\n", "\n", "ax.set_ylim(ylim)\n", 
"mlai.write_figure(figure=fig, filename='./deepgp/olympic-marathon-deep-gp.svg', \n", " transparent=True, frameon=True)" ], "id": "3b8d24f8-87b7-4d43-ad93-b04bebf68938" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Olympic Marathon Data Deep GP\n", "\n", "\n", "\n", "Figure: Deep GP fit to the Olympic marathon data. Error bars now\n", "change as the prediction evolves." ], "id": "65b01be9-8a0c-458f-b51f-d2afabb587c9" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "plot.model_sample(m, scale=scale, offset=offset, samps=10, ax=ax, \n", " xlabel='year', ylabel='pace min/km', portion = 0.225)\n", "ax.set_xlim(xlim)\n", "ax.set_ylim(ylim)\n", "mlai.write_figure(figure=fig, filename='./deepgp/olympic-marathon-deep-gp-samples.svg', \n", " transparent=True, frameon=True)" ], "id": "a34163df-995e-461e-98bb-a4cba1c68ae5" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Olympic Marathon Data Deep GP\n", "\n", "\n", "\n", "Figure: Point samples run through the deep Gaussian process show the\n", "distribution of output locations." ], "id": "537b7aec-8309-4ad6-891d-2bbe63521a05" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fitted GP for each layer\n", "\n", "Now we explore the GPs the model has used to fit each layer. First of\n", "all, we look at the hidden layer." 
], "id": "16fb9909-a537-42dc-ab82-f6fa68433279" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m.visualize(scale=scale, offset=offset, xlabel='year',\n", " ylabel='pace min/km',xlim=xlim, ylim=ylim,\n", " dataset='olympic-marathon',\n", " diagrams='./deepgp')" ], "id": "7cfe22e4-08af-492d-86ab-b31d318894b3" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import notutils as nu" ], "id": "787ad581-adb4-4543-9b59-f3ba6d9afa8a" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "nu.display_plots('olympic-marathon-deep-gp-layer-{sample:0>1}.svg', \n", " './deepgp', sample=(0,1))" ], "id": "d13b59d9-0c5b-4c61-bd65-26c23c008225" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: The mapping from input to the latent layer is broadly linear, with\n", "some flattening as time goes on. Variance is high across the input\n", "range.\n", "\n", "\n", "\n", "Figure: The mapping from the latent layer to the output layer." ], "id": "c47ecc8c-b6fc-4762-b0b5-80526ca37c5f" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "m.visualize_pinball(ax=ax, scale=scale, offset=offset, points=30, portion=0.1,\n", " xlabel='year', ylabel='pace min/km', vertical=True)\n", "mlai.write_figure(figure=fig, filename='./deepgp/olympic-marathon-deep-gp-pinball.svg', \n", " transparent=True, frameon=True)" ], "id": "1e35af24-6327-424e-8f21-3daec2263b22" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Olympic Marathon Pinball Plot\n", "\n", "\n", "\n", "Figure: A pinball plot shows the movement of the ‘ball’ as it passes\n", "through each layer of the Gaussian processes. Mean directions of\n", "movement are shown by lines. Shading gives one standard deviation of\n", "movement position. At each layer, the uncertainty is reset. 
The overall\n", "uncertainty is the cumulative uncertainty from all the layers. There is\n", "some grouping of later points towards the right in the first layer,\n", "which also injects a large amount of uncertainty. Due to flattening of\n", "the curve in the second layer towards the right, the uncertainty is\n", "reduced in the final output.\n", "\n", "The pinball plot shows the flow of any input ball through the deep\n", "Gaussian process. In a pinball plot a series of vertical parallel lines\n", "would indicate a purely linear function. For the Olympic marathon data\n", "we can see the first layer begins to shift from input towards the right.\n", "Note it also does so with some uncertainty (indicated by the shaded\n", "backgrounds). The second layer has less uncertainty, but bunches the\n", "inputs more strongly to the right. This input layer of uncertainty,\n", "followed by a layer that pushes inputs to the right, is what gives the\n", "heteroscedastic noise." ], "id": "91428a0d-db93-4057-9c9a-00450124a32a" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step Function\n", "\n", "\\[edit\\]\n", "\n", "Next we consider a simple step function data set." 
], "id": "f15e085a-521e-455f-8c21-c40c3ff1e14c" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "num_low=25\n", "num_high=25\n", "gap = -.1\n", "noise=0.0001\n", "x = np.vstack((np.linspace(-1, -gap/2.0, num_low)[:, np.newaxis],\n", " np.linspace(gap/2.0, 1, num_high)[:, np.newaxis]))\n", "y = np.vstack((np.zeros((num_low, 1)), np.ones((num_high,1))))\n", "scale = np.sqrt(y.var())\n", "offset = y.mean()\n", "yhat = (y-offset)/scale" ], "id": "69dff1c3-8a25-41e8-ada9-c78e2cb7bcfc" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=plot.big_wide_figsize)\n", "_ = ax.plot(x, y, 'r.',markersize=10)\n", "_ = ax.set_xlabel('$x$', fontsize=20)\n", "_ = ax.set_ylabel('$y$', fontsize=20)\n", "xlim = (-2, 2)\n", "ylim = (-0.6, 1.6)\n", "ax.set_ylim(ylim)\n", "ax.set_xlim(xlim)\n", "mlai.write_figure(figure=fig, filename='./datasets/step-function.svg', \n", " transparent=True, frameon=True)" ], "id": "039d5232-7a8e-44c6-87c5-1c7bd8b6c4f9" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step Function Data\n", "\n", "\n", "\n", "Figure: Simulation study of step function data artificially\n", "generated. Here there is a small overlap between the two lines." ], "id": "fa73c23a-43a0-43b2-aa63-bbf979b108a2" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step Function Data GP\n", "\n", "We can fit a Gaussian process to the step function data using `GPy` as\n", "follows." 
], "id": "7613a0a9-b7f6-4673-abf5-1f9c78cff7b6" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m_full = GPy.models.GPRegression(x,yhat)\n", "_ = m_full.optimize() # Optimize parameters of covariance function" ], "id": "778bd9ac-18b9-45f0-8e27-f58dcffd9261" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here `GPy.models.GPRegression()` gives us a standard GP regression\n", "model with an exponentiated quadratic covariance function.\n", "\n", "The model is optimized using `m_full.optimize()`, which calls an L-BFGS\n", "gradient-based solver in Python." ], "id": "90d5d0a5-32b8-47f0-bd7a-d1244b686c17" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax=plt.subplots(figsize=plot.big_wide_figsize)\n", "plot.model_output(m_full, scale=scale, offset=offset, ax=ax, fontsize=20, portion=0.5)\n", "ax.set_ylim(ylim)\n", "ax.set_xlim(xlim)\n", "\n", "mlai.write_figure(figure=fig,filename='./gp/step-function-gp.svg', \n", " transparent=True, frameon=True)" ], "id": "dbcf23ac-bf2c-455d-8dea-08fad74e319e" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: Gaussian process fit to the step function data. Note the\n", "large error bars and the over-smoothing of the discontinuity. Error bars\n", "are shown at two standard deviations.\n", "\n", "The resulting fit to the step function data shows some challenges. In\n", "particular, the over-smoothing at the discontinuity. If we know how many\n", "discontinuities there are, we can parameterize them in the step\n", "function. But by doing this, we form a semi-parametric model. The\n", "parameters indicate how many discontinuities there are, and where they are.\n", "They can be optimized as part of the model fit. But if new, unforeseen,\n", "discontinuities arise when the model is being deployed in practice,\n", "these won’t be accounted for in the predictions." 
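, "\n", "\n", "As a hedged sketch of what such a semi-parametric model could look like\n", "(the symbols $g$, $H$, $a_k$, $c_k$ and $K$ are illustrative assumptions,\n", "not notation from the model code above), a smooth Gaussian process\n", "component $g(\\cdot)$ can be combined with $K$ Heaviside steps\n", "$H(\\cdot)$,\n", "\n", "$$\n", "f(x) = g(x) + \\sum_{k=1}^{K} a_k H(x - c_k),\n", "$$\n", "\n", "where the step heights $a_k$ and locations $c_k$ would be fitted alongside\n", "the covariance parameters. Any discontinuity outside these $K$ steps is\n", "left to $g(\\cdot)$, which smooths over it as in the figure above."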
], "id": "54c5c915-ddd2-4e7a-a921-ffc413f106cd" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step Function Data Deep GP\n", "\n", "First we initialize a deep Gaussian process with three latent layers\n", "(four layers total). Within each layer we create a GP with an\n", "exponentiated quadratic covariance (`GPy.kern.RBF`).\n", "\n", "At each layer we use 20 inducing points for the variational\n", "approximation." ], "id": "68375c29-c4fa-4c3e-a40f-497d1fb10f91" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "layers = [y.shape[1], 1, 1, 1,x.shape[1]]\n", "inits = ['PCA']*(len(layers)-1)\n", "kernels = []\n", "for i in layers[1:]:\n", " kernels += [GPy.kern.RBF(i)]\n", " \n", "m = deepgp.DeepGP(layers,Y=yhat, X=x, \n", " inits=inits, \n", " kernels=kernels, # the kernels for each layer\n", " num_inducing=20, back_constraint=False)" ], "id": "12d6ac47-cd81-4adb-b8fa-17f0404651eb" }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once the model is constructed we initialize the parameters, and perform\n", "the staged optimization which starts by optimizing variational\n", "parameters with a low noise and proceeds to optimize the whole model." ], "id": "2033a130-c4d5-4fab-bfdd-971f5540bfd8" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m.initialize()\n", "m.staged_optimize()" ], "id": "6ddac1ee-cfc3-4be1-bded-d872b594a303" }, { "cell_type": "markdown", "metadata": {}, "source": [ "We plot the output of the deep Gaussian process fitted to the step data\n", "as follows." 
], "id": "e7d9709c-499c-40f2-9de1-4596bb33a345" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax=plt.subplots(figsize=plot.big_wide_figsize)\n", "plot.model_output(m, scale=scale, offset=offset, ax=ax, fontsize=20, portion=0.5)\n", "ax.set_ylim(ylim)\n", "ax.set_xlim(xlim)\n", "mlai.write_figure(figure=fig, filename='./deepgp/step-function-deep-gp.svg', \n", " transparent=True, frameon=True)" ], "id": "3028ba41-42a9-4b9b-9d9d-734dc47dcc67" }, { "cell_type": "markdown", "metadata": {}, "source": [ "The deep Gaussian process does a much better job of fitting the data. It\n", "handles the discontinuity easily, and error bars drop to smaller values\n", "in the regions of data.\n", "\n", "\n", "\n", "Figure: Deep Gaussian process fit to the step function data." ], "id": "c7efb8f0-4d03-444d-b540-81a6da1678ee" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Step Function Data Deep GP\n", "\n", "Samples from the model can be plotted with the `model_sample` helper\n", "function from `mlai.plot`." ], "id": "6d3b7e3f-4725-4da4-9151-07c2643d21f3" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import mlai.plot as plot" ], "id": "13dae01f-5303-459d-b050-380842cc88bd" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax=plt.subplots(figsize=plot.big_wide_figsize)\n", "\n", "plot.model_sample(m, scale=scale, offset=offset, samps=10, ax=ax, portion = 0.5)\n", "ax.set_ylim(ylim)\n", "ax.set_xlim(xlim)\n", "mlai.write_figure(figure=fig, filename='./deepgp/step-function-deep-gp-samples.svg', \n", " transparent=True, frameon=True)" ], "id": "6174a89f-7837-4019-ad50-5959d69ed8e9" }, { "cell_type": "markdown", "metadata": {}, "source": [ "The samples from the model show that the error bars, which are\n", "informative for Gaussian outputs, are less informative for this model.\n", "They make clear that the data points lie, in output, 
mainly at 0 or 1, or\n", "occasionally in between.\n", "\n", "\n", "\n", "Figure: Samples from the deep Gaussian process model for the step\n", "function fit.\n", "\n", "The visualize code allows us to inspect the intermediate layers in the\n", "deep GP model to understand how it has reconstructed the step function." ], "id": "0d0353d0-2c36-4bd1-bd49-a6fd8f629bc2" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "m.visualize(offset=offset, scale=scale, xlim=xlim, ylim=ylim,\n", " dataset='step-function',\n", " diagrams='./deepgp')" ], "id": "6a984ca9-840e-4754-bc25-3295f93bda2b" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "\n", "\n", "\n", "Figure: From top to bottom, the Gaussian process mapping function\n", "that makes up each layer of the resulting deep Gaussian process.\n", "\n", "A pinball plot can be created for the resulting model to understand how\n", "the input is being translated to the output across the different layers." ], "id": "5858f004-5c6f-4b43-a7a9-885fa8ea74d1" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import mlai.plot as plot\n", "import mlai" ], "id": "0f8a3572-5278-4af9-b3f0-3be396a3eb27" }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax=plt.subplots(figsize=plot.big_wide_figsize)\n", "m.visualize_pinball(offset=offset, ax=ax, scale=scale, xlim=xlim, ylim=ylim, portion=0.1, points=50)\n", "mlai.write_figure(figure=fig, filename='./deepgp/step-function-deep-gp-pinball.svg', \n", " transparent=True, frameon=True)" ], "id": "bf02ab7d-9dcf-47dc-8b89-cc39b4ae11d8" }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "Figure: Pinball plot of the deep GP fitted to the step function data.\n", "Each layer of the model pushes the ‘ball’ towards the left or right,\n", "saturating at 1 and 0. 
This causes the final density to be peaked at\n", "0 and 1. Transitions are driven by the uncertainty of the mapping in\n", "each layer." ], "id": "c38e2101-8503-4e07-9360-fc97595fbcdb" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The Future\n", "\n", "What does this mean for the future? Although I’ve focussed mainly on\n", "uncertainty, I think that the main area for future research is\n", "interaction. Coming back to dynamic systems, but not just interaction\n", "between planes and people, interactions in the form of conversation.\n", "Interventions intelligently selected rather than emulated." ], "id": "cb49b653-4ec6-451f-a259-50f01ebd1304" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Conclusions\n", "\n", "The probabilistic modelling community has evolved in an era where the\n", "assumption was that ambiguous conclusions are best shared with a\n", "(trained) professional through probabilities. Recent advances in\n", "generative AI offer the possibility of machines that have a better\n", "understanding of human subjective ambiguities and therefore machines\n", "that can summarise information in a way that can be interrogated rather\n", "than just through a series of numbers." 
], "id": "faf23115-3094-4406-96c9-ce6bf086aa5a" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Thanks!\n", "\n", "For more information on these subjects and more you might want to check\n", "the following resources.\n", "\n", "- book: [The Atomic\n", " Human](https://www.penguin.co.uk/books/455130/the-atomic-human-by-lawrence-neil-d/9780241625248)\n", "- twitter: [@lawrennd](https://twitter.com/lawrennd)\n", "- podcast: [The Talking Machines](http://thetalkingmachines.com)\n", "- newspaper: [Guardian Profile\n", " Page](http://www.theguardian.com/profile/neil-lawrence)\n", "- blog:\n", " [http://inverseprobability.com](http://inverseprobability.com/blog.html)" ], "id": "c9b0e3c1-01a8-44ca-92cb-b93b4b1d6555" }, { "cell_type": "markdown", "metadata": {}, "source": [ "## References" ], "id": "e64e8daf-7973-4179-b233-ae2c64de5cda" }, { "cell_type": "markdown", "metadata": {}, "source": [ "A brief history of time, 1988. Bantam Dell Publishing Group, London.\n", "\n", "Ananthanarayanan, R., Esser, S.K., Simon, H.D., Modha, D.S., 2009. The\n", "cat is out of the bag: Cortical simulations with $10^9$ neurons,\n", "$10^{13}$ synapses, in: Proceedings of the Conference on High\n", "Performance Computing Networking, Storage and Analysis - SC ’09.\n", "\n", "\n", "Brooks, F., n.d. The mythical man-month. Addison-Wesley.\n", "\n", "Cabrera, C., Paleyes, A., Thodoroff, P., Lawrence, N.D., 2023.\n", "[Real-world machine learning systems: A survey from a data-oriented\n", "architecture perspective](https://arxiv.org/abs/2302.04810).\n", "\n", "Coales, J.F., Kane, S.J., 2014. The “yellow peril” and after. IEEE\n", "Control Systems Magazine 34, 65–69.\n", "\n", "\n", "Felin, T., Koenderink, J., Krueger, J.I., Noble, D., Ellis, G.F.R.,\n", "2021. The data-hypothesis relationship. Genome Biology 22.\n", "\n", "\n", "Heider, F., 1958. The psychology of interpersonal relations. John Wiley.\n", "\n", "Henrich, J., Muthukrishna, M., 2021. 
The origins and psychology of human\n", "cooperation. Annual Review of Psychology 72, 207–240.\n", "\n", "\n", "Huang, W., Xia, F., Xiao, T., Chan, H., Liang, J., Florence, P., Zeng,\n", "A., Tompson, J., Mordatch, I., Chebotar, Y., Sermanet, P., Jackson, T.,\n", "Brown, N., Luu, L., Levine, S., Hausman, K., Ichter, B., 2023. [Inner\n", "monologue: Embodied reasoning through planning with language\n", "models](https://proceedings.mlr.press/v205/huang23c.html), in: Liu, K.,\n", "Kulic, D., Ichnowski, J. (Eds.), Proceedings of the 6th Conference on\n", "Robot Learning, Proceedings of Machine Learning Research. PMLR, pp.\n", "1769–1782.\n", "\n", "Laplace, P.S., 1814. Essai philosophique sur les probabilités, 2nd ed.\n", "Courcier, Paris.\n", "\n", "Lawrence, N.D., 2024. The atomic human: Understanding ourselves in the\n", "age of AI. Allen Lane.\n", "\n", "Lawrence, N.D., 2017. [Living together: Mind and machine\n", "intelligence](https://arxiv.org/abs/1705.07996). arXiv.\n", "\n", "Lawrence, N.D., 2010. Introduction to learning and inference in\n", "computational systems biology.\n", "\n", "MacKay, D.M., 1991. Behind the eye. Basil Blackwell.\n", "\n", "O’Neill, O., 2002. A question of trust. Cambridge University Press.\n", "\n", "Pfau, D., Spencer, J.S., Matthews, A.G.D.G., Foulkes, W.M.C., 2020. Ab\n", "initio solution of the many-electron Schrödinger equation with deep\n", "neural networks. Phys. Rev. Research 2, 033429.\n", "\n", "\n", "Reed, C., Durlach, N.I., 1998. Note on information transfer rates in\n", "human communication. Presence Teleoperators & Virtual Environments 7,\n", "509–518. \n", "\n", "Roh, S.-H., Shekhar, M., Pintilie, G., Chipot, C., Wilkens, S.,\n", "Singharoy, A., Chiu, W., 2020. Cryo-EM and MD infer water-mediated\n", "proton transport and autoinhibition mechanisms of Vo complex. Science\n", "Advances 6. 
\n", "\n", "Sculley, D., Holt, G., Golovin, D., Davydov, E., Phillips, T., Ebner,\n", "D., Chaudhary, V., Young, M., Crespo, J.-F., Dennison, D., 2015. [Hidden\n", "technical debt in machine learning\n", "systems](http://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf),\n", "in: Cortes, C., Lawrence, N.D., Lee, D.D., Sugiyama, M., Garnett, R.\n", "(Eds.), Advances in Neural Information Processing Systems 28. Curran\n", "Associates, Inc., pp. 2503–2511.\n", "\n", "Simons, D.J., Chabris, C.F., 1999. Gorillas in our midst: Sustained\n", "inattentional blindness for dynamic events. Perception 28, 1059–1074.\n", "\n", "\n", "Susskind, R.E., Susskind, D., 2015. The future of the professions: How\n", "technology will transform the work of human experts. Oxford University\n", "Press.\n", "\n", "Taigman, Y., Yang, M., Ranzato, M., Wolf, L., 2014. DeepFace: Closing\n", "the gap to human-level performance in face verification, in: Proceedings\n", "of the IEEE Computer Society Conference on Computer Vision and Pattern\n", "Recognition. \n", "\n", "The Admiralty, 1945. [The gunnery pocket book, B.R.\n", "224/45](https://www.maritime.org/doc/br224/).\n", "\n", "Thompson, W.C., 1989. [Are juries competent to evaluate statistical\n", "evidence?](http://www.jstor.org/stable/1191906) Law and Contemporary\n", "Problems 52, 9–41.\n", "\n", "Wiener, N., 1953. Ex-prodigy: My childhood and youth. MIT Press, Cambridge,\n", "MA.\n", "\n", "Wiener, N., 1949. The extrapolation, interpolation and smoothing of\n", "stationary time series with engineering applications. Wiley.\n", "\n", "Yanai, I., Lercher, M., 2020. A hypothesis is a liability. Genome\n", "Biology 21." ], "id": "94057003-cf87-4333-bcce-19abe77a62da" } ], "nbformat": 4, "nbformat_minor": 5, "metadata": {} }