{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "\n# Nested versus non-nested cross-validation\n\nThis example compares non-nested and nested cross-validation strategies on a\nclassifier of the iris data set. Nested cross-validation (CV) is often used to\ntrain a model in which hyperparameters also need to be optimized. Nested CV\nestimates the generalization error of the underlying model and its\n(hyper)parameter search. Choosing the parameters that maximize non-nested CV\nbiases the model to the dataset, yielding an overly-optimistic score.\n\nModel selection without nested CV uses the same data to tune model parameters\nand evaluate model performance. Information may thus \"leak\" into the model\nand overfit the data. The magnitude of this effect is primarily dependent on\nthe size of the dataset and the stability of the model. See Cawley and Talbot\n[1]_ for an analysis of these issues.\n\nTo avoid this problem, nested CV effectively uses a series of\ntrain/validation/test set splits. In the inner loop (here executed by\n:class:`GridSearchCV <sklearn.model_selection.GridSearchCV>`), the score is\napproximately maximized by fitting a model to each training set, and then\ndirectly maximized in selecting (hyper)parameters over the validation set. In\nthe outer loop (here in :func:`cross_val_score\n<sklearn.model_selection.cross_val_score>`), generalization error is estimated\nby averaging test set scores over several dataset splits.\n\nThe example below uses a support vector classifier with a non-linear kernel to\nbuild a model with optimized hyperparameters by grid search. We compare the\nperformance of non-nested and nested CV strategies by taking the difference\nbetween their scores.\n\n.. seealso::\n\n    - `cross_validation`\n    - `grid_search`\n\n.. rubric:: References\n\n.. [1] [Cawley, G.C.; Talbot, N.L.C. On over-fitting in model selection and\n    subsequent selection bias in performance evaluation.\n    J. Mach. Learn. Res 2010,11, 2079-2107.](http://jmlr.csail.mit.edu/papers/volume11/cawley10a/cawley10a.pdf)\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "# Authors: The scikit-learn developers\n# SPDX-License-Identifier: BSD-3-Clause\n\nimport numpy as np\nfrom matplotlib import pyplot as plt\n\nfrom sklearn.datasets import load_iris\nfrom sklearn.model_selection import GridSearchCV, KFold, cross_val_score\nfrom sklearn.svm import SVC\n\n# Number of random trials\nNUM_TRIALS = 30\n\n# Load the dataset\niris = load_iris()\nX_iris = iris.data\ny_iris = iris.target\n\n# Set up possible values of parameters to optimize over\np_grid = {\"C\": [1, 10, 100], \"gamma\": [0.01, 0.1]}\n\n# We will use a Support Vector Classifier with \"rbf\" kernel\nsvm = SVC(kernel=\"rbf\")\n\n# Arrays to store scores\nnon_nested_scores = np.zeros(NUM_TRIALS)\nnested_scores = np.zeros(NUM_TRIALS)\n\n# Loop for each trial\nfor i in range(NUM_TRIALS):\n    # Choose cross-validation techniques for the inner and outer loops,\n    # independently of the dataset.\n    # E.g \"GroupKFold\", \"LeaveOneOut\", \"LeaveOneGroupOut\", etc.\n    inner_cv = KFold(n_splits=4, shuffle=True, random_state=i)\n    outer_cv = KFold(n_splits=4, shuffle=True, random_state=i)\n\n    # Non_nested parameter search and scoring\n    clf = GridSearchCV(estimator=svm, param_grid=p_grid, cv=outer_cv)\n    clf.fit(X_iris, y_iris)\n    non_nested_scores[i] = clf.best_score_\n\n    # Nested CV with parameter optimization\n    clf = GridSearchCV(estimator=svm, param_grid=p_grid, cv=inner_cv)\n    nested_score = cross_val_score(clf, X=X_iris, y=y_iris, cv=outer_cv)\n    nested_scores[i] = nested_score.mean()\n\nscore_difference = non_nested_scores - nested_scores\n\nprint(\n    \"Average difference of {:6f} with std. dev. of {:6f}.\".format(\n        score_difference.mean(), score_difference.std()\n    )\n)\n\n# Plot scores on each trial for nested and non-nested CV\nplt.figure()\nplt.subplot(211)\n(non_nested_scores_line,) = plt.plot(non_nested_scores, color=\"r\")\n(nested_line,) = plt.plot(nested_scores, color=\"b\")\nplt.ylabel(\"score\", fontsize=\"14\")\nplt.legend(\n    [non_nested_scores_line, nested_line],\n    [\"Non-Nested CV\", \"Nested CV\"],\n    bbox_to_anchor=(0, 0.4, 0.5, 0),\n)\nplt.title(\n    \"Non-Nested and Nested Cross Validation on Iris Dataset\",\n    x=0.5,\n    y=1.1,\n    fontsize=\"15\",\n)\n\n# Plot bar chart of the difference.\nplt.subplot(212)\ndifference_plot = plt.bar(range(NUM_TRIALS), score_difference)\nplt.xlabel(\"Individual Trial #\")\nplt.legend(\n    [difference_plot],\n    [\"Non-Nested CV - Nested CV Score\"],\n    bbox_to_anchor=(0, 1, 0.8, 0),\n)\nplt.ylabel(\"score difference\", fontsize=\"14\")\n\nplt.show()"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.9.21"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}