{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Regression with Outliers\n", "\n", "In the standard __Gaussian process regression__ setting it is assumed that the observations are __Normally distributed__ about the latent function. In the package this can applied using either the `GP` or `GPE` functions with which *exact Gaussian process* models.\n", "\n", "One of the drawbacks of exact GP regression is that by assuming Normal noise the GP is __not robust to outliers__. In this setting, it is more appropriate to assume that the distribution of the noise is heavy tailed. For example, with a __Student-t distribution__,\n", "$$\n", "\\mathbf{y} \\ | \\ \\mathbf{f},\\nu,\\sigma \\sim \\prod_{i=1}^n \\frac{\\Gamma(\\nu+1)/2}{\\Gamma(\\nu/2)\\sqrt{\\nu\\pi}\\sigma}\\left(1+\\frac{(y_i-f_i)^2}{\\nu\\sigma^2}\\right)^{-(\\nu+1)/2}\n", "$$\n", "\n", "Moving away from the Gaussian likelihood function (i.e. Normally distributed noise) and using the Student-t likelihood means that we can no longer analytically calculate the GP marginal likelihood. We can take a Bayesian perspective and sample from the joint distribution of the latent function and model parameters." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Load functions from packages\n", "using GaussianProcesses, Plots\n", "using Distributions:Normal, TDist\n", "using Klara:MALA\n", "\n", "#Simulate the data\n", "srand(112233)\n", "n = 20\n", "X = collect(linspace(-3,3,n))\n", "sigma = 1.0\n", "Y = X + sigma*rand(TDist(3),n);\n", "\n", "# Plots observations\n", "pyplot()\n", "scatter(X,Y;fmt=:png, leg=false)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We fit a standard (exact) Gaussian process model to the Student-t data and compare this against the Monte Carlo GP which is applicable for non-Gaussian observations models." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "GP Monte Carlo object:\n", " Dim = 1\n", " Number of observations = 20\n", " Mean function:\n", " Type: GaussianProcesses.MeanZero, Params: Float64[]\n", " Kernel:\n", " Type: GaussianProcesses.Mat32Iso, Params: [0.0, 0.0]\n", " Likelihood:\n", " Type: GaussianProcesses.StuTLik, Params: [0.1]\n", " Input observations = \n", "[-3.0 -2.68421 … 2.68421 3.0]\n", " Output observations = [-3.98707, -2.7855, -1.74722, -2.27384, -7.12662, -1.43191, -0.840899, 0.305618, -0.444213, -0.924319, -0.0303161, -0.368707, -0.706642, 2.07517, 5.68693, -1.02634, 3.25863, 5.46091, 4.45861, 4.60741]\n", " Log-posterior = -77.863" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Build the models\n", "\n", "gpe = GPE(X,vec(Y),MeanZero(),Matern(3/2,0.0,0.0),0.5) #Exact GP assuming Gaussian noise\n", "\n", "l = StuTLik(3,0.1)\n", "gpmc = GPMC(X, vec(Y), MeanZero(), Matern(3/2,0.0,0.0), l) #Monte Carlo GP with student-t likelihood" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Estimate the parameters of 
the exact GP through maximum likelihood estimation." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Dict(:mean=>true,:kern=>true,:noise=>true)\n" ] }, { "data": { "text/plain": [ "Results of Optimization Algorithm\n", " * Algorithm: L-BFGS\n", " * Starting Point: [0.5,0.0,0.0]\n", " * Minimizer: [0.6566151884189473,1.7118340256787345, ...]\n", " * Minimum: 4.578633e+01\n", " * Iterations: 11\n", " * Convergence: true\n", " * |x - x'| < 1.0e-32: false\n", " * |f(x) - f(x')| / |f(x)| < 1.0e-32: false\n", " * |g(x)| < 1.0e-08: true\n", " * f(x) > f(x'): false\n", " * Reached Maximum Number of Iterations: false\n", " * Objective Function Calls: 46\n", " * Gradient Calls: 46" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "optimize!(gpe)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Taking a Bayesian perspective, we can place prior distributions on the model parameters and sample from the posterior distribution using Markov chain Monte Carlo through the `mcmc` function. This function builds on the [Klara](https://github.com/JuliaStats/Klara.jl) package; a list of available samplers can be found in that package's documentation."
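, "\n", "The posterior draws are returned as a matrix with one parameter per row and one retained iteration per column, so simple posterior summaries can be computed directly from it. A minimal sketch, assuming the sample matrix is bound to `samples` as in the cell below:\n", "\n", "```julia\n", "post_mean = vec(mean(samples,2))  #posterior mean of each parameter\n", "post_sd   = vec(std(samples,2))   #posterior standard deviation of each parameter\n", "```"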
] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "BasicMCJob:\n", " Variable [1]: p (BasicContMuvParameter)\n", " GenericModel: 1 variables, 0 dependencies (directed graph)\n", " MALA sampler: drift step = 0.05\n", " VanillaMCTuner: period = 100, verbose = false\n", " BasicMCRange: number of steps = 49991, burnin = 10000, thinning = 10" ] }, { "data": { "text/plain": [ "23×4000 Array{Float64,2}:\n", " -1.44975 -1.44975 -1.02479 … -1.43093 -1.23015 -1.23015 \n", " 0.588195 0.588195 0.332809 1.08002 1.17153 1.17153 \n", " 0.718085 0.718085 0.211443 0.493442 0.385335 0.385335\n", " -1.32609 -1.32609 -1.15774 -0.333319 -0.200674 -0.200674\n", " -0.721646 -0.721646 -0.855224 -0.55112 -0.794651 -0.794651\n", " 0.851406 0.851406 0.986268 … 1.32336 1.21235 1.21235 \n", " -0.250237 -0.250237 -0.296884 -0.298572 -0.166915 -0.166915\n", " 0.154486 0.154486 0.387819 0.0714513 0.162703 0.162703\n", " -0.0969148 -0.0969148 -0.324955 -0.446952 -0.267948 -0.267948\n", " -0.380338 -0.380338 0.133223 0.765023 0.42936 0.42936 \n", " 0.378311 0.378311 0.555143 … -0.226185 -0.211079 -0.211079\n", " -0.918741 -0.918741 -0.744673 -0.589133 -0.436455 -0.436455\n", " -0.136525 -0.136525 0.123217 1.7611 1.83721 1.83721 \n", " 1.77061 1.77061 1.61585 -0.923194 -0.784206 -0.784206\n", " 1.73116 1.73116 1.87613 0.82272 0.86416 0.86416 \n", " -0.678761 -0.678761 -0.545683 … 0.386 0.503511 0.503511\n", " 0.830114 0.830114 0.782332 2.30027 2.27326 2.27326 \n", " 1.28389 1.28389 1.61891 2.18992 2.05585 2.05585 \n", " 0.247051 0.247051 0.492288 -0.479602 -0.876566 -0.876566\n", " -0.385056 -0.385056 -0.53043 -0.780789 -0.959797 -0.959797\n", " -0.372006 -0.372006 -0.116689 … -0.138475 -0.17703 -0.17703 \n", " -0.0997188 -0.0997188 -0.338012 0.869903 1.11409 1.11409 \n", " 1.1369 1.1369 1.10389 1.05892 1.07458 1.07458 " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } 
], "source": [ "set_priors!(gpmc.lik,[Normal(-2.0,4.0)])\n", "set_priors!(gpmc.k,[Normal(-2.0,4.0),Normal(-2.0,4.0)])\n", "\n", "samples = mcmc(gpmc;sampler=MALA(0.05),nIter=50000, thin=10, burnin=10000)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Plot posterior samples\n", "xtest = linspace(minimum(gpmc.X),maximum(gpmc.X),50);\n", "\n", "#Set the parameters to the posterior values the sample random function\n", "fsamples = [];\n", "for i in 1:size(samples,2)\n", " set_params!(gpmc,samples[:,i])\n", " update_target!(gpmc)\n", " push!(fsamples, rand(gpmc,xtest))\n", "end\n", "\n", "#Predict\n", "p1=plot(gpe,leg=false, title=\"Exact GP\") #Exact GP (assuming Gaussian noise)\n", "\n", "sd = [std(fsamples[i]) for i in 1:50]\n", "p2=plot(xtest,mean(fsamples),ribbon=2*sd,leg=false, title=\"Monte Carlo GP\") #GP Monte Carlo with student-t noise\n", "scatter!(X,Y)\n", "\n", "plot(p1,p2;fmt=:png)" ] } ], "metadata": { "kernelspec": { "display_name": "Julia 0.6.0", "language": "julia", "name": "julia-0.6" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "0.6.0" } }, "nbformat": 4, "nbformat_minor": 2 }