{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Basic usage" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*`skorch`* is designed to maximize interoperability between `sklearn` and `pytorch`. The aim is to keep 99% of the flexibility of `pytorch` while being able to leverage most features of `sklearn`. Below, we show the basic usage of `skorch` and how it can be combined with `sklearn`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This notebook shows you how to use the basic functionality of `skorch`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Table of contents" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* [Definition of the pytorch module](#Definition-of-the-pytorch-module)\n", "* [Training a classifier](#Training-a-classifier-and-making-predictions)\n", " * [Dataset](#A-toy-binary-classification-task)\n", " * [pytorch module](#Definition-of-the-pytorch-classification-module)\n", " * [Model training](#Defining-and-training-the-neural-net-classifier)\n", " * [Inference](#Making-predictions,-classification)\n", "* [Training a regressor](#Training-a-regressor)\n", " * [Dataset](#A-toy-regression-task)\n", " * [pytorch module](#Definition-of-the-pytorch-regression-module)\n", " * [Model training](#Defining-and-training-the-neural-net-regressor)\n", " * [Inference](#Making-predictions,-regression)\n", "* [Saving and loading a model](#Saving-and-loading-a-model)\n", " * [Whole model](#Saving-the-whole-model)\n", " * [Only parameters](#Saving-only-the-model-parameters)\n", "* [Usage with an sklearn Pipeline](#Usage-with-an-sklearn-Pipeline)\n", "* [Callbacks](#Callbacks)\n", "* [Grid search](#Usage-with-sklearn-GridSearchCV)\n", " * [Special prefixes](#Special-prefixes)\n", " * [Performing a grid search](#Performing-a-grid-search)" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import torch\n", "from torch import nn\n", "import torch.nn.functional as F" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "torch.manual_seed(0);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Training a classifier and making predictions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### A toy binary classification task" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We load a toy classification task from `sklearn`." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import numpy as np\n", "from sklearn.datasets import make_classification" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": true }, "outputs": [], "source": [ "X, y = make_classification(1000, 20, n_informative=10, random_state=0)\n", "X = X.astype(np.float32)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "((1000, 20), (1000,), 0.5)" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X.shape, y.shape, y.mean()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Definition of the `pytorch` classification `module`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We define a vanilla neural network with two hidden layers. The output layer should have 2 output units since there are two classes. In addition, it should have a softmax nonlinearity, because later, when calling `predict_proba`, the output from the `forward` call will be used." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": true }, "outputs": [], "source": [ "class ClassifierModule(nn.Module):\n", " def __init__(\n", " self,\n", " num_units=10,\n", " nonlin=F.relu,\n", " dropout=0.5,\n", " ):\n", " super(ClassifierModule, self).__init__()\n", " self.num_units = num_units\n", " self.nonlin = nonlin\n", " self.dropout = dropout\n", "\n", " self.dense0 = nn.Linear(20, num_units)\n", " self.nonlin = nonlin\n", " self.dropout = nn.Dropout(dropout)\n", " self.dense1 = nn.Linear(num_units, 10)\n", " self.output = nn.Linear(10, 2)\n", "\n", " def forward(self, X, **kwargs):\n", " X = self.nonlin(self.dense0(X))\n", " X = self.dropout(X)\n", " X = F.relu(self.dense1(X))\n", " X = F.softmax(self.output(X), dim=-1)\n", " return X" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Defining and training the neural net classifier" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We use `NeuralNetClassifier` because we're dealing with a classifcation task. The first argument should be the `pytorch module`. As additional arguments, we pass the number of epochs and the learning rate (`lr`), but those are optional.\n", "\n", "*Note*: To use the cuda backend, pass `use_cuda=True` as an additional argument." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from skorch.net import NeuralNetClassifier" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": true }, "outputs": [], "source": [ "net = NeuralNetClassifier(\n", " ClassifierModule,\n", " max_epochs=20,\n", " lr=0.1,\n", " # use_cuda=True, # uncomment this to train with CUDA\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As in `sklearn`, we call `fit` passing the input data `X` and the targets `y`. By default, `NeuralNetClassifier` makes a `StratifiedKFold` split on the data (80/20) to track the validation loss. This is shown, as well as the train loss and the accuracy on the validation set." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Automatic pdb calling has been turned ON\n" ] } ], "source": [ "pdb on" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " epoch train_loss valid_acc valid_loss dur\n", "------- ------------ ----------- ------------ ------\n", " 1 \u001b[36m0.6868\u001b[0m \u001b[32m0.6000\u001b[0m \u001b[35m0.6740\u001b[0m 0.0793\n", " 2 \u001b[36m0.6706\u001b[0m \u001b[32m0.6400\u001b[0m \u001b[35m0.6617\u001b[0m 0.0686\n", " 3 \u001b[36m0.6637\u001b[0m \u001b[32m0.6650\u001b[0m \u001b[35m0.6504\u001b[0m 0.0541\n", " 4 \u001b[36m0.6548\u001b[0m \u001b[32m0.7000\u001b[0m \u001b[35m0.6418\u001b[0m 0.0535\n", " 5 \u001b[36m0.6340\u001b[0m \u001b[32m0.7100\u001b[0m \u001b[35m0.6272\u001b[0m 0.0539\n", " 6 \u001b[36m0.6219\u001b[0m \u001b[32m0.7150\u001b[0m \u001b[35m0.6124\u001b[0m 0.0574\n", " 7 \u001b[36m0.6058\u001b[0m 0.7100 \u001b[35m0.5980\u001b[0m 0.0530\n", " 8 \u001b[36m0.5964\u001b[0m \u001b[32m0.7200\u001b[0m \u001b[35m0.5875\u001b[0m 0.0646\n", " 9 \u001b[36m0.5901\u001b[0m 0.7100 \u001b[35m0.5760\u001b[0m 0.0572\n", " 10 \u001b[36m0.5716\u001b[0m \u001b[32m0.7250\u001b[0m \u001b[35m0.5651\u001b[0m 0.0460\n", " 11 \u001b[36m0.5633\u001b[0m 0.7250 \u001b[35m0.5580\u001b[0m 0.0471\n", " 12 0.5652 \u001b[32m0.7300\u001b[0m \u001b[35m0.5529\u001b[0m 0.0453\n", " 13 \u001b[36m0.5462\u001b[0m \u001b[32m0.7350\u001b[0m \u001b[35m0.5426\u001b[0m 0.0500\n", " 14 \u001b[36m0.5407\u001b[0m 0.7300 \u001b[35m0.5407\u001b[0m 0.0448\n", " 15 \u001b[36m0.5360\u001b[0m 0.7300 \u001b[35m0.5373\u001b[0m 0.0464\n", " 16 0.5517 \u001b[32m0.7400\u001b[0m \u001b[35m0.5328\u001b[0m 0.0448\n", " 17 \u001b[36m0.5351\u001b[0m \u001b[32m0.7450\u001b[0m \u001b[35m0.5277\u001b[0m 0.0460\n", " 18 \u001b[36m0.5280\u001b[0m 0.7400 \u001b[35m0.5260\u001b[0m 0.0530\n", " 19 \u001b[36m0.5148\u001b[0m 0.7450 0.5264 0.0583\n", " 20 0.5309 0.7400 \u001b[35m0.5210\u001b[0m 0.0740\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "net.fit(X, y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Also, as in `sklearn`, you may call `predict` or `predict_proba` on the fitted model." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Making predictions, classification" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0, 0, 0, 0, 0])" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "y_pred = net.predict(X[:5])\n", "y_pred" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 0.54967159, 0.45032838],\n", " [ 0.7842356 , 0.2157644 ],\n", " [ 0.67652136, 0.32347867],\n", " [ 0.88522649, 0.1147735 ],\n", " [ 0.68577141, 0.31422859]], dtype=float32)" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "y_proba = net.predict_proba(X[:5])\n", "y_proba" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Training a regressor" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### A toy regression task" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from sklearn.datasets import make_regression" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": true }, "outputs": [], "source": [ "X_regr, y_regr = make_regression(1000, 20, n_informative=10, random_state=0)\n", "X_regr = X_regr.astype(np.float32)\n", "y_regr = y_regr.astype(np.float32) / 100\n", "y_regr = y_regr.reshape(-1, 1)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "((1000, 20), (1000, 1), -6.4901485, 6.1545048)" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X_regr.shape, y_regr.shape, y_regr.min(), y_regr.max()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Note*: Regression currently requires the target to be 2-dimensional, hence the need to reshape. This should be fixed with an upcoming version of pytorch." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Definition of the `pytorch` regression `module`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Again, define a vanilla neural network with two hidden layers. The main difference is that the output layer only has one unit and does not apply a softmax nonlinearity." ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": true }, "outputs": [], "source": [ "class RegressorModule(nn.Module):\n", " def __init__(\n", " self,\n", " num_units=10,\n", " nonlin=F.relu,\n", " ):\n", " super(RegressorModule, self).__init__()\n", " self.num_units = num_units\n", " self.nonlin = nonlin\n", "\n", " self.dense0 = nn.Linear(20, num_units)\n", " self.nonlin = nonlin\n", " self.dense1 = nn.Linear(num_units, 10)\n", " self.output = nn.Linear(10, 1)\n", "\n", " def forward(self, X, **kwargs):\n", " X = self.nonlin(self.dense0(X))\n", " X = F.relu(self.dense1(X))\n", " X = self.output(X)\n", " return X" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Defining and training the neural net regressor" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Training a regressor is almost the same as training a classifier. Mainly, we use `NeuralNetRegressor` instead of `NeuralNetClassifier` (this is the same terminology as in `sklearn`)." ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from skorch.net import NeuralNetRegressor" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": true }, "outputs": [], "source": [ "net_regr = NeuralNetRegressor(\n", " RegressorModule,\n", " max_epochs=20,\n", " lr=0.1,\n", " # use_cuda=True, # uncomment this to train with CUDA\n", ")" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " epoch train_loss valid_loss dur\n", "------- ------------ ------------ ------\n", " 1 \u001b[36m4.6059\u001b[0m \u001b[32m3.5860\u001b[0m 0.0264\n", " 2 \u001b[36m3.5021\u001b[0m \u001b[32m1.3814\u001b[0m 0.0421\n", " 3 \u001b[36m1.1019\u001b[0m \u001b[32m0.5334\u001b[0m 0.0436\n", " 4 \u001b[36m0.7071\u001b[0m \u001b[32m0.2994\u001b[0m 0.0414\n", " 5 \u001b[36m0.5654\u001b[0m 0.4141 0.0248\n", " 6 \u001b[36m0.3179\u001b[0m \u001b[32m0.1574\u001b[0m 0.0242\n", " 7 \u001b[36m0.2476\u001b[0m 0.1906 0.0269\n", " 8 \u001b[36m0.1302\u001b[0m \u001b[32m0.1049\u001b[0m 0.0250\n", " 9 0.1373 0.1124 0.0240\n", " 10 \u001b[36m0.0728\u001b[0m \u001b[32m0.0737\u001b[0m 0.0265\n", " 11 0.0839 \u001b[32m0.0727\u001b[0m 0.0247\n", " 12 \u001b[36m0.0435\u001b[0m \u001b[32m0.0513\u001b[0m 0.0267\n", " 13 0.0508 \u001b[32m0.0483\u001b[0m 0.0268\n", " 14 \u001b[36m0.0279\u001b[0m \u001b[32m0.0371\u001b[0m 0.0261\n", " 15 0.0322 \u001b[32m0.0335\u001b[0m 0.0275\n", " 16 \u001b[36m0.0193\u001b[0m \u001b[32m0.0282\u001b[0m 0.0260\n", " 17 0.0224 \u001b[32m0.0247\u001b[0m 0.0257\n", " 18 \u001b[36m0.0148\u001b[0m \u001b[32m0.0221\u001b[0m 0.0271\n", " 19 0.0167 \u001b[32m0.0198\u001b[0m 0.0264\n", " 20 \u001b[36m0.0122\u001b[0m \u001b[32m0.0182\u001b[0m 0.0268\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "net_regr.fit(X_regr, y_regr)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Making predictions, regression" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You may call `predict` or `predict_proba` on the fitted model. For regressions, both methods return the same value." ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 0.52162153],\n", " [-1.50998139],\n", " [-0.90007448],\n", " [-0.08845913],\n", " [-0.52214217]], dtype=float32)" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "y_pred = net_regr.predict(X_regr[:5])\n", "y_pred" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Saving and loading a model" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "Save and load either the whole model by using pickle or just the learned model parameters by calling `save_params` and `load_params`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Saving the whole model" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import pickle" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": true }, "outputs": [], "source": [ "file_name = '/tmp/mymodel.pkl'" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/home/bbossan_dev/anaconda3/envs/skorch/lib/python3.6/site-packages/torch/serialization.py:158: UserWarning: Couldn't retrieve source code for container of type ClassifierModule. It won't be checked for correctness upon loading.\n", " \"type \" + obj.__name__ + \". It won't be checked \"\n" ] } ], "source": [ "with open(file_name, 'wb') as f:\n", " pickle.dump(net, f)" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "collapsed": true }, "outputs": [], "source": [ "with open(file_name, 'rb') as f:\n", " new_net = pickle.load(f)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Saving only the model parameters" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This only saves and loads the proper `module` parameters, meaning that hyperparameters such as `lr` and `max_epochs` are not saved. Therefore, to load the model, we have to re-initialize it beforehand." ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "collapsed": true }, "outputs": [], "source": [ "net.save_params(file_name) # a file handler also works" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# first initialize the model\n", "new_net = NeuralNetClassifier(\n", " ClassifierModule,\n", " max_epochs=20,\n", " lr=0.1,\n", ").initialize()" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "collapsed": true }, "outputs": [], "source": [ "new_net.load_params(file_name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Usage with an `sklearn Pipeline`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is possible to put the `NeuralNetClassifier` inside an `sklearn Pipeline`, as you would with any `sklearn` classifier." ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from sklearn.pipeline import Pipeline\n", "from sklearn.preprocessing import StandardScaler" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "collapsed": true }, "outputs": [], "source": [ "pipe = Pipeline([\n", " ('scale', StandardScaler()),\n", " ('net', net),\n", "])" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Re-initializing module!\n", " epoch train_loss valid_acc valid_loss dur\n", "------- ------------ ----------- ------------ ------\n", " 1 \u001b[36m0.6891\u001b[0m \u001b[32m0.5550\u001b[0m \u001b[35m0.6853\u001b[0m 0.1064\n", " 2 \u001b[36m0.6826\u001b[0m \u001b[32m0.5600\u001b[0m \u001b[35m0.6825\u001b[0m 0.0576\n", " 3 0.6873 \u001b[32m0.5900\u001b[0m \u001b[35m0.6801\u001b[0m 0.0514\n", " 4 \u001b[36m0.6797\u001b[0m \u001b[32m0.6000\u001b[0m \u001b[35m0.6776\u001b[0m 0.0448\n", " 5 \u001b[36m0.6772\u001b[0m \u001b[32m0.6150\u001b[0m \u001b[35m0.6751\u001b[0m 0.0434\n", " 6 \u001b[36m0.6748\u001b[0m \u001b[32m0.6200\u001b[0m \u001b[35m0.6723\u001b[0m 0.0429\n", " 7 \u001b[36m0.6682\u001b[0m 0.6200 \u001b[35m0.6691\u001b[0m 0.0429\n", " 8 \u001b[36m0.6645\u001b[0m 0.6200 \u001b[35m0.6654\u001b[0m 0.0473\n", " 9 \u001b[36m0.6623\u001b[0m \u001b[32m0.6300\u001b[0m \u001b[35m0.6613\u001b[0m 0.0524\n", " 10 \u001b[36m0.6464\u001b[0m 0.6200 \u001b[35m0.6555\u001b[0m 0.0612\n", " 11 0.6471 0.6300 \u001b[35m0.6491\u001b[0m 0.0490\n", " 12 \u001b[36m0.6449\u001b[0m \u001b[32m0.6600\u001b[0m \u001b[35m0.6424\u001b[0m 0.0493\n", " 13 \u001b[36m0.6285\u001b[0m 0.6500 \u001b[35m0.6341\u001b[0m 0.0520\n", " 14 \u001b[36m0.6265\u001b[0m 0.6500 \u001b[35m0.6261\u001b[0m 0.0487\n", " 15 \u001b[36m0.6252\u001b[0m 0.6600 \u001b[35m0.6193\u001b[0m 0.0341\n", " 16 \u001b[36m0.6148\u001b[0m \u001b[32m0.6750\u001b[0m \u001b[35m0.6102\u001b[0m 0.0282\n", " 17 \u001b[36m0.6039\u001b[0m \u001b[32m0.6850\u001b[0m \u001b[35m0.6017\u001b[0m 0.0475\n", " 18 \u001b[36m0.5979\u001b[0m \u001b[32m0.6900\u001b[0m \u001b[35m0.5949\u001b[0m 0.0435\n", " 19 \u001b[36m0.5794\u001b[0m \u001b[32m0.7000\u001b[0m \u001b[35m0.5849\u001b[0m 0.0440\n", " 20 \u001b[36m0.5596\u001b[0m \u001b[32m0.7050\u001b[0m \u001b[35m0.5758\u001b[0m 0.0484\n" ] }, { "data": { "text/plain": [ "Pipeline(memory=None,\n", " steps=[('scale', StandardScaler(copy=True, with_mean=True, with_std=True)), ('net', )])" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pipe.fit(X, y)" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 0.39650354, 0.60349649],\n", " [ 0.73950195, 0.26049808],\n", " [ 0.72104084, 0.27895918],\n", " [ 0.71111423, 0.2888858 ],\n", " [ 0.66332674, 0.33667326]], dtype=float32)" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "y_proba = pipe.predict_proba(X[:5])\n", "y_proba" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To save the whole pipeline, including the pytorch module, use `pickle`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Callbacks" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Adding a new callback to the model is straightforward. Below we show how to add a new callback that determines the area under the ROC (AUC) score." ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from skorch.callbacks import EpochScoring" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There is a scoring callback in skorch, `EpochScoring`, which we use for this. We have to specify which score to calculate. We have 3 choices:\n", "\n", "* Passing a string: This should be a valid `sklearn` metric. For a list of all existing scores, look [here](http://scikit-learn.org/stable/modules/classes.html#sklearn-metrics-metrics).\n", "* Passing `None`: If you implement your own `.score` method on your neural net, passing `scoring=None` will tell `skorch` to use that.\n", "* Passing a function or callable: If we want to define our own scoring function, we pass a function with the signature `func(model, X, y) -> score`, which is then used.\n", "\n", "Note that this works exactly the same as scoring in `sklearn` does." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For our case here, since `sklearn` already implements AUC, we just pass the correct string `'roc_auc'`. We should also tell the callback that higher scores are better (to get the correct colors printed below -- by default, lower scores are assumed to be better). Furthermore, we may specify a `name` argument for `EpochScoring`, and whether to use training data (by setting `on_train=True`) or validation data (which is the default)." ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "collapsed": true }, "outputs": [], "source": [ "auc = EpochScoring(scoring='roc_auc', lower_is_better=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we pass the scoring callback to the `callbacks` parameter as a list and then call `fit`. Notice that we get the printed scores and color highlighting for free." ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "collapsed": true }, "outputs": [], "source": [ "net = NeuralNetClassifier(\n", " ClassifierModule,\n", " max_epochs=20,\n", " lr=0.1,\n", " callbacks=[auc],\n", ")" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " epoch roc_auc train_loss valid_acc valid_loss dur\n", "------- --------- ------------ ----------- ------------ ------\n", " 1 \u001b[36m0.5911\u001b[0m \u001b[32m0.7204\u001b[0m \u001b[35m0.5000\u001b[0m \u001b[31m0.6948\u001b[0m 0.0590\n", " 2 \u001b[36m0.6524\u001b[0m \u001b[32m0.6925\u001b[0m \u001b[35m0.5300\u001b[0m \u001b[31m0.6881\u001b[0m 0.0502\n", " 3 \u001b[36m0.6700\u001b[0m \u001b[32m0.6867\u001b[0m \u001b[35m0.6000\u001b[0m \u001b[31m0.6857\u001b[0m 0.0321\n", " 4 \u001b[36m0.6854\u001b[0m \u001b[32m0.6820\u001b[0m \u001b[35m0.6400\u001b[0m \u001b[31m0.6832\u001b[0m 0.0364\n", " 5 0.6829 \u001b[32m0.6801\u001b[0m 0.6050 \u001b[31m0.6812\u001b[0m 0.0377\n", " 6 0.6757 \u001b[32m0.6742\u001b[0m 0.6100 \u001b[31m0.6796\u001b[0m 0.0541\n", " 7 0.6808 0.6762 0.6100 \u001b[31m0.6776\u001b[0m 0.0528\n", " 8 0.6759 \u001b[32m0.6576\u001b[0m 0.6350 \u001b[31m0.6747\u001b[0m 0.0317\n", " 9 0.6813 0.6661 0.6350 \u001b[31m0.6707\u001b[0m 0.0525\n", " 10 \u001b[36m0.6903\u001b[0m \u001b[32m0.6548\u001b[0m \u001b[35m0.6450\u001b[0m \u001b[31m0.6655\u001b[0m 0.0467\n", " 11 \u001b[36m0.6929\u001b[0m \u001b[32m0.6500\u001b[0m 0.6400 \u001b[31m0.6611\u001b[0m 0.0495\n", " 12 0.6920 \u001b[32m0.6445\u001b[0m \u001b[35m0.6500\u001b[0m \u001b[31m0.6571\u001b[0m 0.0314\n", " 13 \u001b[36m0.7095\u001b[0m \u001b[32m0.6372\u001b[0m \u001b[35m0.6650\u001b[0m \u001b[31m0.6509\u001b[0m 0.0390\n", " 14 \u001b[36m0.7155\u001b[0m \u001b[32m0.6288\u001b[0m \u001b[35m0.6700\u001b[0m \u001b[31m0.6446\u001b[0m 0.0532\n", " 15 \u001b[36m0.7265\u001b[0m \u001b[32m0.6268\u001b[0m 0.6700 \u001b[31m0.6390\u001b[0m 0.0494\n", " 16 \u001b[36m0.7398\u001b[0m \u001b[32m0.6150\u001b[0m \u001b[35m0.6900\u001b[0m \u001b[31m0.6308\u001b[0m 0.0609\n", " 17 \u001b[36m0.7487\u001b[0m 0.6221 \u001b[35m0.7000\u001b[0m \u001b[31m0.6246\u001b[0m 0.0540\n", " 18 0.7473 0.6168 \u001b[35m0.7250\u001b[0m \u001b[31m0.6187\u001b[0m 0.0529\n", " 19 \u001b[36m0.7588\u001b[0m \u001b[32m0.5945\u001b[0m \u001b[35m0.7400\u001b[0m \u001b[31m0.6100\u001b[0m 0.0522\n", " 20 \u001b[36m0.7664\u001b[0m 0.6000 \u001b[35m0.7650\u001b[0m \u001b[31m0.6026\u001b[0m 0.0524\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "net.fit(X, y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For information on how to write custom callbacks, have a look at the [Advanced_Usage](https://nbviewer.jupyter.org/github/dnouri/skorch/blob/master/notebooks/Advanced_Usage.ipynb) notebook." ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "## Usage with sklearn `GridSearchCV`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Special prefixes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `NeuralNet` class allows to directly access parameters of the `pytorch module` by using the `module__` prefix. So e.g. if you defined the `module` to have a `num_units` parameter, you can set it via the `module__num_units` argument. This is exactly the same logic that allows to access estimator parameters in `sklearn Pipeline`s and `FeatureUnion`s." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This feature is useful in several ways. For one, it allows to set those parameters in the model definition. Furthermore, it allows you to set parameters in an `sklearn GridSearchCV` as shown below." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In addition to the parameters prefixed by `module__`, you may access a couple of other attributes, such as those of the optimizer by using the `optimizer__` prefix (again, see below). All those special prefixes are stored in the `prefixes_` attribute:" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "module, iterator_train, iterator_valid, optimizer, criterion, callbacks, dataset\n" ] } ], "source": [ "print(', '.join(net.prefixes_))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Performing a grid search" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Below we show how to perform a grid search over the learning rate (`lr`), the module's number of hidden units (`module__num_units`), the module's dropout rate (`module__dropout`), and whether the SGD optimizer should use Nesterov momentum or not (`optimizer__nesterov`)." ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from sklearn.model_selection import GridSearchCV" ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "collapsed": true }, "outputs": [], "source": [ "net = NeuralNetClassifier(\n", " ClassifierModule,\n", " max_epochs=20,\n", " lr=0.1,\n", " verbose=0,\n", " optimizer__momentum=0.9,\n", ")" ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "collapsed": true }, "outputs": [], "source": [ "params = {\n", " 'lr': [0.05, 0.1],\n", " 'module__num_units': [10, 20],\n", " 'module__dropout': [0, 0.5],\n", " 'optimizer__nesterov': [False, True],\n", "}" ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "collapsed": true }, "outputs": [], "source": [ "gs = GridSearchCV(net, params, refit=False, cv=3, scoring='accuracy', verbose=2)" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Fitting 3 folds for each of 16 candidates, totalling 48 fits\n", "[CV] lr=0.05, module__dropout=0, module__num_units=10, optimizer__nesterov=False \n", "[CV] lr=0.05, module__dropout=0, module__num_units=10, optimizer__nesterov=False, total= 0.9s\n", "[CV] lr=0.05, module__dropout=0, module__num_units=10, optimizer__nesterov=False \n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.9s remaining: 0.0s\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "[CV] lr=0.05, module__dropout=0, module__num_units=10, optimizer__nesterov=False, total= 1.1s\n", "[CV] lr=0.05, module__dropout=0, module__num_units=10, optimizer__nesterov=False \n", "[CV] lr=0.05, module__dropout=0, module__num_units=10, optimizer__nesterov=False, total= 2.4s\n", "[CV] lr=0.05, module__dropout=0, module__num_units=10, optimizer__nesterov=True \n", "[CV] lr=0.05, module__dropout=0, module__num_units=10, optimizer__nesterov=True, total= 2.4s\n", "[CV] lr=0.05, module__dropout=0, module__num_units=10, optimizer__nesterov=True \n", "[CV] lr=0.05, module__dropout=0, module__num_units=10, optimizer__nesterov=True, total= 1.0s\n", "[CV] lr=0.05, module__dropout=0, module__num_units=10, optimizer__nesterov=True \n", "[CV] lr=0.05, module__dropout=0, module__num_units=10, optimizer__nesterov=True, total= 0.8s\n", "[CV] lr=0.05, module__dropout=0, module__num_units=20, optimizer__nesterov=False \n", "[CV] lr=0.05, module__dropout=0, module__num_units=20, optimizer__nesterov=False, total= 0.8s\n", "[CV] lr=0.05, module__dropout=0, module__num_units=20, optimizer__nesterov=False \n", "[CV] lr=0.05, module__dropout=0, module__num_units=20, optimizer__nesterov=False, total= 0.8s\n", "[CV] lr=0.05, module__dropout=0, module__num_units=20, optimizer__nesterov=False \n", "[CV] lr=0.05, module__dropout=0, module__num_units=20, optimizer__nesterov=False, total= 0.8s\n", "[CV] lr=0.05, module__dropout=0, module__num_units=20, optimizer__nesterov=True \n", "[CV] lr=0.05, module__dropout=0, module__num_units=20, optimizer__nesterov=True, total= 1.0s\n", "[CV] lr=0.05, module__dropout=0, module__num_units=20, optimizer__nesterov=True \n", "[CV] lr=0.05, module__dropout=0, module__num_units=20, optimizer__nesterov=True, total= 0.9s\n", "[CV] lr=0.05, module__dropout=0, module__num_units=20, optimizer__nesterov=True \n", "[CV] lr=0.05, module__dropout=0, module__num_units=20, optimizer__nesterov=True, total= 0.8s\n", "[CV] lr=0.05, module__dropout=0.5, module__num_units=10, optimizer__nesterov=False \n", "[CV] lr=0.05, module__dropout=0.5, module__num_units=10, optimizer__nesterov=False, total= 0.8s\n", "[CV] lr=0.05, module__dropout=0.5, module__num_units=10, optimizer__nesterov=False \n", "[CV] lr=0.05, module__dropout=0.5, module__num_units=10, optimizer__nesterov=False, total= 0.8s\n", "[CV] lr=0.05, module__dropout=0.5, module__num_units=10, optimizer__nesterov=False \n", "[CV] lr=0.05, module__dropout=0.5, module__num_units=10, optimizer__nesterov=False, total= 0.9s\n", "[CV] lr=0.05, module__dropout=0.5, module__num_units=10, optimizer__nesterov=True \n", "[CV] lr=0.05, module__dropout=0.5, module__num_units=10, optimizer__nesterov=True, total= 1.0s\n", "[CV] lr=0.05, module__dropout=0.5, module__num_units=10, optimizer__nesterov=True \n", "[CV] lr=0.05, module__dropout=0.5, module__num_units=10, optimizer__nesterov=True, total= 1.0s\n", "[CV] lr=0.05, module__dropout=0.5, module__num_units=10, optimizer__nesterov=True \n", "[CV] lr=0.05, module__dropout=0.5, module__num_units=10, optimizer__nesterov=True, total= 0.9s\n", "[CV] lr=0.05, module__dropout=0.5, module__num_units=20, optimizer__nesterov=False \n", "[CV] lr=0.05, module__dropout=0.5, module__num_units=20, optimizer__nesterov=False, total= 1.0s\n", "[CV] lr=0.05, module__dropout=0.5, module__num_units=20, optimizer__nesterov=False \n", "[CV] lr=0.05, module__dropout=0.5, module__num_units=20, optimizer__nesterov=False, total= 0.8s\n", "[CV] lr=0.05, module__dropout=0.5, module__num_units=20, optimizer__nesterov=False \n", "[CV] lr=0.05, module__dropout=0.5, module__num_units=20, optimizer__nesterov=False, total= 0.8s\n", "[CV] lr=0.05, module__dropout=0.5, module__num_units=20, optimizer__nesterov=True \n", "[CV] lr=0.05, module__dropout=0.5, module__num_units=20, optimizer__nesterov=True, total= 0.8s\n", "[CV] lr=0.05, module__dropout=0.5, module__num_units=20, optimizer__nesterov=True \n", "[CV] lr=0.05, module__dropout=0.5, module__num_units=20, optimizer__nesterov=True, total= 0.8s\n", "[CV] lr=0.05, module__dropout=0.5, module__num_units=20, optimizer__nesterov=True \n", "[CV] lr=0.05, module__dropout=0.5, module__num_units=20, optimizer__nesterov=True, total= 0.9s\n", "[CV] lr=0.1, module__dropout=0, module__num_units=10, optimizer__nesterov=False \n", "[CV] lr=0.1, module__dropout=0, module__num_units=10, optimizer__nesterov=False, total= 0.9s\n", "[CV] lr=0.1, module__dropout=0, module__num_units=10, optimizer__nesterov=False \n", "[CV] lr=0.1, module__dropout=0, module__num_units=10, optimizer__nesterov=False, total= 0.8s\n", "[CV] lr=0.1, module__dropout=0, module__num_units=10, optimizer__nesterov=False \n", "[CV] lr=0.1, module__dropout=0, module__num_units=10, optimizer__nesterov=False, total= 0.8s\n", "[CV] lr=0.1, module__dropout=0, module__num_units=10, optimizer__nesterov=True \n", "[CV] lr=0.1, module__dropout=0, module__num_units=10, optimizer__nesterov=True, total= 0.8s\n", "[CV] lr=0.1, module__dropout=0, module__num_units=10, optimizer__nesterov=True \n", "[CV] lr=0.1, module__dropout=0, module__num_units=10, optimizer__nesterov=True, total= 0.8s\n", "[CV] lr=0.1, module__dropout=0, module__num_units=10, optimizer__nesterov=True \n", "[CV] lr=0.1, module__dropout=0, module__num_units=10, optimizer__nesterov=True, total= 0.8s\n", "[CV] lr=0.1, module__dropout=0, module__num_units=20, optimizer__nesterov=False \n", "[CV] lr=0.1, module__dropout=0, module__num_units=20, optimizer__nesterov=False, total= 0.8s\n", "[CV] lr=0.1, module__dropout=0, module__num_units=20, optimizer__nesterov=False \n", "[CV] lr=0.1, module__dropout=0, module__num_units=20, optimizer__nesterov=False, total= 0.7s\n", "[CV] lr=0.1, module__dropout=0, module__num_units=20, optimizer__nesterov=False \n", "[CV] lr=0.1, module__dropout=0, module__num_units=20, optimizer__nesterov=False, total= 0.8s\n", "[CV] lr=0.1, module__dropout=0, module__num_units=20, optimizer__nesterov=True \n", "[CV] lr=0.1, module__dropout=0, module__num_units=20, optimizer__nesterov=True, total= 0.8s\n", "[CV] lr=0.1, module__dropout=0, module__num_units=20, optimizer__nesterov=True \n", "[CV] lr=0.1, module__dropout=0, module__num_units=20, optimizer__nesterov=True, total= 0.8s\n", "[CV] lr=0.1, module__dropout=0, module__num_units=20, optimizer__nesterov=True \n", "[CV] lr=0.1, module__dropout=0, module__num_units=20, optimizer__nesterov=True, total= 0.8s\n", "[CV] lr=0.1, module__dropout=0.5, module__num_units=10, optimizer__nesterov=False \n", "[CV] lr=0.1, module__dropout=0.5, module__num_units=10, optimizer__nesterov=False, total= 0.8s\n", "[CV] lr=0.1, module__dropout=0.5, module__num_units=10, optimizer__nesterov=False \n", "[CV] lr=0.1, module__dropout=0.5, module__num_units=10, optimizer__nesterov=False, total= 0.7s\n", "[CV] lr=0.1, module__dropout=0.5, module__num_units=10, optimizer__nesterov=False \n", "[CV] lr=0.1, module__dropout=0.5, module__num_units=10, optimizer__nesterov=False, total= 0.8s\n", "[CV] lr=0.1, module__dropout=0.5, module__num_units=10, optimizer__nesterov=True \n", "[CV] lr=0.1, module__dropout=0.5, module__num_units=10, optimizer__nesterov=True, total= 0.9s\n", "[CV] lr=0.1, module__dropout=0.5, module__num_units=10, optimizer__nesterov=True \n", "[CV] lr=0.1, module__dropout=0.5, module__num_units=10, optimizer__nesterov=True, total= 0.8s\n", "[CV] lr=0.1, module__dropout=0.5, module__num_units=10, optimizer__nesterov=True \n", "[CV] lr=0.1, module__dropout=0.5, module__num_units=10, optimizer__nesterov=True, total= 0.8s\n", "[CV] lr=0.1, module__dropout=0.5, module__num_units=20, optimizer__nesterov=False \n", "[CV] lr=0.1, module__dropout=0.5, module__num_units=20, optimizer__nesterov=False, total= 0.8s\n", "[CV] lr=0.1, module__dropout=0.5, module__num_units=20, optimizer__nesterov=False \n", "[CV] lr=0.1, module__dropout=0.5, module__num_units=20, optimizer__nesterov=False, total= 0.8s\n", "[CV] lr=0.1, module__dropout=0.5, module__num_units=20, optimizer__nesterov=False \n", "[CV] lr=0.1, module__dropout=0.5, module__num_units=20, optimizer__nesterov=False, total= 0.8s\n", "[CV] lr=0.1, module__dropout=0.5, module__num_units=20, optimizer__nesterov=True \n", "[CV] lr=0.1, module__dropout=0.5, module__num_units=20, optimizer__nesterov=True, total= 0.8s\n", "[CV] lr=0.1, module__dropout=0.5, module__num_units=20, optimizer__nesterov=True \n", "[CV] lr=0.1, module__dropout=0.5, module__num_units=20, optimizer__nesterov=True, total= 0.9s\n", "[CV] lr=0.1, module__dropout=0.5, module__num_units=20, optimizer__nesterov=True \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "[CV] lr=0.1, module__dropout=0.5, module__num_units=20, optimizer__nesterov=True, total= 0.9s\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "[Parallel(n_jobs=1)]: Done 48 out of 48 | elapsed: 44.6s finished\n" ] }, { "data": { "text/plain": [ "GridSearchCV(cv=3, error_score='raise',\n", " estimator=,\n", " fit_params=None, iid=True, n_jobs=1,\n", " param_grid={'lr': [0.05, 0.1], 'module__num_units': [10, 20], 'module__dropout': [0, 0.5], 'optimizer__nesterov': [False, True]},\n", " pre_dispatch='2*n_jobs', refit=False, return_train_score=True,\n", " scoring='accuracy', verbose=2)" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gs.fit(X, y)" ] }, { "cell_type": "code", "execution_count": 42, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.856 {'lr': 0.1, 'module__dropout': 0, 'module__num_units': 20, 'optimizer__nesterov': False}\n" ] } ], "source": [ "print(gs.best_score_, gs.best_params_)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "Of course, we could further nest the `NeuralNetClassifier` within an `sklearn Pipeline`, in which case we just prefix the parameter by the name of the net (e.g. `net__module__num_units`)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.2" } }, "nbformat": 4, "nbformat_minor": 2 }