{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Gaussian Processes for Machine Learning\n", "\n", "#### OxWaSP Symposium, Warwick\n", "\n", "### Neil D. Lawrence\n", "\n", "### 28th January 2016" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# The Data are Not Enough\n", "\n", "* Four pillars:\n", " * Deterministic/Stochastic\n", " * Mechanistic/Empirical\n", " \n", "* **Goal**: *model complex phenomena over time*\n", "* **Problem**: \n", " * Mechanistic models are often inaccurate\n", " * Data is often not rich enough for a purely *empirical* approach\n", "\n", "* **Question 1**: How do we combine *inaccurate physical models* with machine learning?" 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Central Dogma" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Decision: Transcription Factors" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Mechanistic Model" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Need to Model $p_\\text{TF}(t)$\n", "\n", "* Gaussian process: a *probabilistic* model for functions.\n", "\n", "* Formally known as a *stochastic process*.\n", "\n", "* Multivariate Gaussian is normally defined as a *mean vector*, $\\boldsymbol{\\mu}$, and a *covariance matrix*, $\\mathbf{C}$.\n", "$$\n", "\\mathbf{y} \\sim \\mathcal{N}(\\boldsymbol{\\mu}, \\mathbf{C})\n", "$$\n", "\n", "* Gaussian process defined by a *mean function*, $\\mu(t)$ and a covariance function, $c(t, t^\\prime)$. \n", "$$\n", "y(t) \\sim \\mathcal{N}(\\mu(t), c(t, t^\\prime))\n", "$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Zero Mean Gaussian Sample" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Zero Mean Gaussian Process Sample" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Gaussian Processes" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Gaussian Processes" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Results" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Further Challenges\n", "\n", "* This model inter-relates different functions with mechanistic understanding.\n", "\n", "* 
What if you need to inter-relate across different modalities of data at different scales?\n", "\n", "* *E.g.* biopsy images + genetic test + mammogram for breast cancer diagnostics." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# The Data are Not Enough\n", "\n", "* Four pillars:\n", " * Deterministic/Stochastic\n", " * Mechanistic/Empirical\n", " \n", "* **Goal**: *model complex phenomena over time*\n", "\n", "* **Problem**:\n", " * *Mechanistic* models are often inaccurate.\n", " * Data is often not rich enough for a purely *empirical* approach.\n", " \n", "* **Question 2**: How do we formulate the right representations to integrate different data modalities?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Classical Latent Variables\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Classical Treatment\n", "\n", "* Assume *a priori* that *e.g.*\n", "$$\n", "\\mathbf{x} \\sim \\mathcal{N}(\\mathbf{0}, \\mathbf{I})\n", "$$\n", "* Relate linearly to $\\mathbf{y}$.\n", "$$\n", "\\mathbf{y} = \\mathbf{W} \\mathbf{x} + \\boldsymbol{\\epsilon}\n", "$$\n", "\n", "* Framework covers many classical models such as PCA, Factor Analysis and ICA. " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Render Gaussian Non Gaussian" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Stochastic Process Composition \n", "\n", "* A new approach to forming stochastic processes.\n", "* Mathematical composition:\n", "$$\n", "y(x) = f_1(f_2(f_3(x)))\n", "$$\n", "* Properties of resulting process highly non-Gaussian.\n", "* Allows for hierarchical structured form of model.\n", "* Learning in models of this type has also become known as *deep learning*. 
" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Use Abstraction for Complex Systems" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Biology and Health " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Neuroscience" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Example: Motion Capture Modelling" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Modelling Digits " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Health" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Summary\n", "\n", "* Complex systems:\n", " * 'big data' is too 'small'.\n", " * The data are not enough.\n", " \n", "* Solutions:\n", " * Hybrid mechanistic-empirical models.\n", " * Structured model composition for automated data assimilation." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.4.3" } }, "nbformat": 4, "nbformat_minor": 0 }