{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Regression with Outliers\n",
"\n",
"In the standard __Gaussian process regression__ setting it is assumed that the observations are __Normally distributed__ about the latent function. In the package this model can be fitted using either the `GP` or `GPE` functions, which construct *exact Gaussian process* models.\n",
"\n",
"One of the drawbacks of exact GP regression is that, by assuming Normal noise, the GP is __not robust to outliers__. In this setting it is more appropriate to assume that the noise distribution is heavy-tailed, for example a __Student-t distribution__,\n",
"$$\n",
"\\mathbf{y} \\ | \\ \\mathbf{f},\\nu,\\sigma \\sim \\prod_{i=1}^n \\frac{\\Gamma((\\nu+1)/2)}{\\Gamma(\\nu/2)\\sqrt{\\nu\\pi}\\sigma}\\left(1+\\frac{(y_i-f_i)^2}{\\nu\\sigma^2}\\right)^{-(\\nu+1)/2}\n",
"$$\n",
"\n",
"Moving away from the Gaussian likelihood function (i.e. Normally distributed noise) and using the Student-t likelihood means that we can no longer calculate the GP marginal likelihood analytically. Instead, we take a Bayesian perspective and sample from the joint posterior distribution of the latent function and the model parameters."
]
},
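{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see why the Student-t likelihood is more robust, compare the tail densities of the two distributions (a quick sketch using `Distributions`; the evaluation point 4.0 and the 3 degrees of freedom are arbitrary choices for illustration):\n",
"\n",
"```julia\n",
"using Distributions: Normal, TDist\n",
"\n",
"# Density four standard deviations from the mean: the Student-t with\n",
"# 3 degrees of freedom places roughly 70 times more mass here than the\n",
"# Normal, so outlying observations are penalised far less severely.\n",
"pdf(Normal(0,1), 4.0)   # ~ 0.00013\n",
"pdf(TDist(3), 4.0)      # ~ 0.0092\n",
"```"
]
},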
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
""
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#Load functions from packages\n",
"using GaussianProcesses, Plots\n",
"using Distributions:Normal, TDist\n",
"using Klara:MALA\n",
"\n",
"#Simulate the data\n",
"srand(112233)\n",
"n = 20\n",
"X = collect(linspace(-3,3,n))\n",
"sigma = 1.0\n",
"Y = X + sigma*rand(TDist(3),n);\n",
"\n",
"# Plot the observations\n",
"pyplot()\n",
"scatter(X,Y;fmt=:png, leg=false)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We fit a standard (exact) Gaussian process model to the Student-t data and compare this against the Monte Carlo GP, which is applicable to non-Gaussian observation models."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"GP Monte Carlo object:\n",
" Dim = 1\n",
" Number of observations = 20\n",
" Mean function:\n",
" Type: GaussianProcesses.MeanZero, Params: Float64[]\n",
" Kernel:\n",
" Type: GaussianProcesses.Mat32Iso, Params: [0.0, 0.0]\n",
" Likelihood:\n",
" Type: GaussianProcesses.StuTLik, Params: [0.1]\n",
" Input observations = \n",
"[-3.0 -2.68421 … 2.68421 3.0]\n",
" Output observations = [-3.98707, -2.7855, -1.74722, -2.27384, -7.12662, -1.43191, -0.840899, 0.305618, -0.444213, -0.924319, -0.0303161, -0.368707, -0.706642, 2.07517, 5.68693, -1.02634, 3.25863, 5.46091, 4.45861, 4.60741]\n",
" Log-posterior = -77.863"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#Build the models\n",
"\n",
"gpe = GPE(X,vec(Y),MeanZero(),Matern(3/2,0.0,0.0),0.5) #Exact GP assuming Gaussian noise\n",
"\n",
"l = StuTLik(3,0.1)\n",
"gpmc = GPMC(X, vec(Y), MeanZero(), Matern(3/2,0.0,0.0), l) #Monte Carlo GP with Student-t likelihood"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We estimate the parameters of the exact GP through maximum likelihood estimation."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Dict(:mean=>true,:kern=>true,:noise=>true)\n"
]
},
{
"data": {
"text/plain": [
"Results of Optimization Algorithm\n",
" * Algorithm: L-BFGS\n",
" * Starting Point: [0.5,0.0,0.0]\n",
" * Minimizer: [0.6566151884189473,1.7118340256787345, ...]\n",
" * Minimum: 4.578633e+01\n",
" * Iterations: 11\n",
" * Convergence: true\n",
" * |x - x'| < 1.0e-32: false\n",
" * |f(x) - f(x')| / |f(x)| < 1.0e-32: false\n",
" * |g(x)| < 1.0e-08: true\n",
" * f(x) > f(x'): false\n",
" * Reached Maximum Number of Iterations: false\n",
" * Objective Function Calls: 46\n",
" * Gradient Calls: 46"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"optimize!(gpe)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Taking a Bayesian perspective, we can place prior distributions on the model parameters and sample from their posterior distribution using Markov chain Monte Carlo through the `mcmc` function. This function builds on the [Klara](https://github.com/JuliaStats/Klara.jl) package; a list of available samplers can be found in that package's documentation."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"BasicMCJob:\n",
" Variable [1]: p (BasicContMuvParameter)\n",
" GenericModel: 1 variables, 0 dependencies (directed graph)\n",
" MALA sampler: drift step = 0.05\n",
" VanillaMCTuner: period = 100, verbose = false\n",
" BasicMCRange: number of steps = 49991, burnin = 10000, thinning = 10"
]
},
{
"data": {
"text/plain": [
"23×4000 Array{Float64,2}:\n",
" -1.44975 -1.44975 -1.02479 … -1.43093 -1.23015 -1.23015 \n",
" 0.588195 0.588195 0.332809 1.08002 1.17153 1.17153 \n",
" 0.718085 0.718085 0.211443 0.493442 0.385335 0.385335\n",
" -1.32609 -1.32609 -1.15774 -0.333319 -0.200674 -0.200674\n",
" -0.721646 -0.721646 -0.855224 -0.55112 -0.794651 -0.794651\n",
" 0.851406 0.851406 0.986268 … 1.32336 1.21235 1.21235 \n",
" -0.250237 -0.250237 -0.296884 -0.298572 -0.166915 -0.166915\n",
" 0.154486 0.154486 0.387819 0.0714513 0.162703 0.162703\n",
" -0.0969148 -0.0969148 -0.324955 -0.446952 -0.267948 -0.267948\n",
" -0.380338 -0.380338 0.133223 0.765023 0.42936 0.42936 \n",
" 0.378311 0.378311 0.555143 … -0.226185 -0.211079 -0.211079\n",
" -0.918741 -0.918741 -0.744673 -0.589133 -0.436455 -0.436455\n",
" -0.136525 -0.136525 0.123217 1.7611 1.83721 1.83721 \n",
" 1.77061 1.77061 1.61585 -0.923194 -0.784206 -0.784206\n",
" 1.73116 1.73116 1.87613 0.82272 0.86416 0.86416 \n",
" -0.678761 -0.678761 -0.545683 … 0.386 0.503511 0.503511\n",
" 0.830114 0.830114 0.782332 2.30027 2.27326 2.27326 \n",
" 1.28389 1.28389 1.61891 2.18992 2.05585 2.05585 \n",
" 0.247051 0.247051 0.492288 -0.479602 -0.876566 -0.876566\n",
" -0.385056 -0.385056 -0.53043 -0.780789 -0.959797 -0.959797\n",
" -0.372006 -0.372006 -0.116689 … -0.138475 -0.17703 -0.17703 \n",
" -0.0997188 -0.0997188 -0.338012 0.869903 1.11409 1.11409 \n",
" 1.1369 1.1369 1.10389 1.05892 1.07458 1.07458 "
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"set_priors!(gpmc.lik,[Normal(-2.0,4.0)])\n",
"set_priors!(gpmc.k,[Normal(-2.0,4.0),Normal(-2.0,4.0)])\n",
"\n",
"samples = mcmc(gpmc;sampler=MALA(0.05),nIter=50000, thin=10, burnin=10000)"
]
},
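{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before using the samples it is worth checking that the chain has mixed well. A minimal diagnostic is a trace plot of one of the sampled parameters (a sketch; the rows of `samples` hold the likelihood, kernel and latent-function parameters, and row 1 is assumed here to be the Student-t scale parameter on the log scale):\n",
"\n",
"```julia\n",
"#Trace plot of the first sampled parameter as a quick mixing check\n",
"plot(samples[1,:]; leg=false, title=\"Trace plot\")\n",
"```"
]
},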
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
""
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#Plot posterior samples\n",
"xtest = linspace(minimum(gpmc.X),maximum(gpmc.X),50);\n",
"\n",
"#Set the parameters to the posterior values and sample the random function\n",
"fsamples = [];\n",
"for i in 1:size(samples,2)\n",
" set_params!(gpmc,samples[:,i])\n",
" update_target!(gpmc)\n",
" push!(fsamples, rand(gpmc,xtest))\n",
"end\n",
"\n",
"#Predict\n",
"p1=plot(gpe,leg=false, title=\"Exact GP\") #Exact GP (assuming Gaussian noise)\n",
"\n",
"sd = [std([fs[i] for fs in fsamples]) for i in 1:50] #Pointwise standard deviation across posterior samples\n",
"p2=plot(xtest,mean(fsamples),ribbon=2*sd,leg=false, title=\"Monte Carlo GP\") #GP Monte Carlo with Student-t noise\n",
"scatter!(X,Y)\n",
"\n",
"plot(p1,p2;fmt=:png)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Julia 0.6.0",
"language": "julia",
"name": "julia-0.6"
},
"language_info": {
"file_extension": ".jl",
"mimetype": "application/julia",
"name": "julia",
"version": "0.6.0"
}
},
"nbformat": 4,
"nbformat_minor": 2
}