{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " \n", " \n", "## [mlcourse.ai](https://mlcourse.ai) – Open Machine Learning Course \n", "###
Author: Tatyana Kudasova, ODS Slack @kudasova\n", " \n", "##
Tutorial\n", "##
Nested cross-validation\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Why nested cross-validation?\n", "\n", "Often we want to tune the hyperparameters of a model. That is, we want to find the parameter values that minimize our loss function. The best way to do this, as we already know, is cross-validation.\n", "\n", "However, as Cawley and Talbot pointed out in their [2010 paper](http://jmlr.org/papers/volume11/cawley10a/cawley10a.pdf), if we use the same test set both to select the parameter values and to evaluate the model, we risk optimistically biasing our model evaluations. For this reason, if a test set is used to select model parameters, then we need a different test set to get an unbiased evaluation of that selected model. Essentially, we can think of model selection as another training procedure, and hence we would need a decently-sized, independent test set that we have not seen before to get an unbiased estimate of the models’ performance. Often, this is not affordable. A good way to overcome this problem is to use nested cross-validation." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Nested cross-validation explained\n", "\n", "Nested cross-validation consists of an inner cross-validation nested inside an outer cross-validation. First, the inner cross-validation is used to tune the hyperparameters and select the best model. Second, the outer cross-validation is used to evaluate the model selected by the inner cross-validation.\n", "\n", " \n", "\n", "Imagine that we have _N_ models and we want to use _L_-fold inner cross-validation to tune hyperparameters and _K_-fold outer cross-validation to evaluate the models. Then the algorithm is as follows:\n", "\n", "1. Divide the dataset into _K_ cross-validation folds at random.\n", "2. For each fold _k=1,2,…,K_: (outer loop, for evaluating the model with the selected hyperparameters)
\n", "\n", " 2.1. Let `test` be fold _k_
\n", " 2.2. Let `trainval` be all the data except those in fold _k_
\n", " 2.3. Randomly split `trainval` into _L_ folds
\n", "    2.4. For each fold _l=1,2,…,L_: (inner loop for hyperparameter tuning)
\n", "> 2.4.1. Let `val` be fold _l_
\n", "> 2.4.2. Let `train` be all the data except those in `test` or `val`
\n", "> 2.4.3. Train each of the _N_ models with each hyperparameter setting on `train`, and evaluate it on `val`. Keep track of the performance metrics
\n", "\n", "    2.5. For each hyperparameter setting, calculate the average metric score over the _L_ folds, and choose the best hyperparameter setting.
\n", "    2.6. Train each of the _N_ models with its best hyperparameter setting on `trainval`. Evaluate its performance on `test` and save the score for fold _k_
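\n", "\n", "The loops above can be sketched with scikit-learn, where `GridSearchCV` plays the role of the inner loop and `cross_val_score` the role of the outer loop (a minimal sketch; the `SVC` estimator, the parameter grid, and the fold counts are illustrative assumptions, not part of the algorithm):\n", "\n", "```python\n", "from sklearn.datasets import load_iris\n", "from sklearn.model_selection import GridSearchCV, KFold, cross_val_score\n", "from sklearn.svm import SVC\n", "\n", "X, y = load_iris(return_X_y=True)\n", "\n", "# Inner loop (L = 3 folds): tunes hyperparameters on `trainval`\n", "inner_cv = KFold(n_splits=3, shuffle=True, random_state=42)\n", "# Outer loop (K = 5 folds): evaluates the tuned model on each held-out `test` fold\n", "outer_cv = KFold(n_splits=5, shuffle=True, random_state=42)\n", "\n", "param_grid = {\"C\": [0.1, 1, 10], \"gamma\": [0.01, 0.1, 1]}\n", "clf = GridSearchCV(SVC(), param_grid, cv=inner_cv)\n", "\n", "# One score per outer fold; their mean estimates the selected model's performance\n", "scores = cross_val_score(clf, X, y, cv=outer_cv)\n", "print(scores.mean(), scores.std())\n", "```\n", "\n", "Note that `cross_val_score` refits the `GridSearchCV` from scratch on each outer training set, so the hyperparameters selected by the inner loop may differ from fold to fold.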