{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "KmfnvFsvamAS" }, "source": [ "# Gaussian Processes" ] }, { "cell_type": "markdown", "metadata": { "id": "XrcDRfPBamAb" }, "source": [ "skorch supports integration with the fantastic [GPyTorch](https://gpytorch.ai/) library. GPyTorch implements various Gaussian Process (GP) techniques on top of PyTorch." ] }, { "cell_type": "markdown", "metadata": { "id": "kaSDDW4QamAf" }, "source": [ "GPyTorch adopts many patterns from PyTorch, thus making it easy to pick up for seasoned PyTorch users. Similarly, the skorch GPyTorch integration should look familiar to seasoned skorch users. However, GPs are a different beast than the more common, non-probabilistic machine learning techniques. It is important to understand the basic concepts before using them in practice." ] }, { "cell_type": "markdown", "metadata": { "id": "aMZJJ0o4amAh" }, "source": [ "This notebook is not the place to learn about GPs in general; instead, a basic understanding is assumed. If you're looking for an introduction to probabilistic programming and GPs, here are some pointers:\n", "\n", "- The GPyTorch [documentation](https://docs.gpytorch.ai/en/stable/)\n", "- The book [Gaussian Processes for Machine Learning](http://gaussianprocess.org/gpml/chapters/) by Carl Edward Rasmussen and Christopher K. I. Williams\n", "- The lecture series [Probabilistic Machine Learning](https://www.youtube.com/playlist?list=PL05umP7R6ij1tHaOFY96m5uX3J21a6yNd) by Philipp Hennig" ] }, { "cell_type": "markdown", "metadata": { "id": "EfNQLyUHamAo" }, "source": [ "Below, we will show you how to use skorch for Gaussian Processes through GPyTorch. We assume that you are familiar with how skorch and PyTorch work and we will focus on how using GPs differs from using non-probabilistic deep learning techniques with skorch. 
For a discussion on when and when not to use GPyTorch with skorch, please have a look at our [documentation](https://skorch.readthedocs.io/en/latest/user/probabilistic.html)." ] }, { "cell_type": "markdown", "metadata": { "id": "CimimrA2amAx" }, "source": [ "
Run in Google Colab | View source on GitHub\n",
train_split or use skorch.helper.predefined_split if you have already split your data beforehand (see the example further below).\n",
"criterion__num_data=int(0.8 * len(y_train)), i.e. to 80% of the total data. This is because, above, we split off 20% of the training data for validation. If this number is different (e.g. because you perform a grid search), you should adjust the ratio accordingly."
]
},
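To make the `criterion__num_data` bookkeeping concrete, here is a minimal, dependency-free sketch. The sample count (1,000) and the 20% validation fraction are assumptions chosen for illustration; `criterion__num_data` itself is the real skorch parameter that gets routed to the criterion.

```python
# Minimal sketch of how criterion__num_data relates to the validation split.
# Assumptions (for illustration only): 1,000 training samples and a 20%
# validation split, mirroring the 80/20 split discussed above.
n_total = 1000               # assumed len(y_train)
valid_fraction = 0.2         # fraction held out by the train_split
num_data = int((1 - valid_fraction) * n_total)
# num_data is the value the criterion should be told about, since only
# 80% of the samples are actually seen during training.
print(num_data)
```

If you instead pass your own validation set via `skorch.helper.predefined_split`, no data is split off from the training set, so `num_data` should simply equal `len(y_train)`.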
{
"cell_type": "markdown",
"metadata": {
"id": "KtTYgLZJamCn"
},
"source": [
"X into several folds, some of which might not contain the samples X[:500], which can lead to data leakage. In this case, you might want to set aside those inducing points completely, not using them for training at all.\n",
"