{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Cross-Validation and the Test Set\n", "\n", "In the last lecture, we saw how keeping some data hidden from our model helps us judge whether the model is overfitting. This time, we'll introduce a common automated framework for this task, called **cross-validation**. We'll also set aside a designated **test set**, which we won't touch until the very end of our analysis, when we use it to get an overall view of our model's performance. " ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from matplotlib import pyplot as plt\n", "import pandas as pd" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|  | Survived | Pclass | Name | Sex | Age | Siblings/Spouses Aboard | Parents/Children Aboard | Fare |
|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 3 | Mr. Owen Harris Braund | male | 22.0 | 1 | 0 | 7.2500 |
| 1 | 1 | 1 | Mrs. John Bradley (Florence Briggs Thayer) Cum... | female | 38.0 | 1 | 0 | 71.2833 |
| 2 | 1 | 3 | Miss. Laina Heikkinen | female | 26.0 | 0 | 0 | 7.9250 |
| 3 | 1 | 1 | Mrs. Jacques Heath (Lily May Peel) Futrelle | female | 35.0 | 1 | 0 | 53.1000 |
| 4 | 0 | 3 | Mr. William Henry Allen | male | 35.0 | 0 | 0 | 8.0500 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 882 | 0 | 2 | Rev. Juozas Montvila | male | 27.0 | 0 | 0 | 13.0000 |
| 883 | 1 | 1 | Miss. Margaret Edith Graham | female | 19.0 | 0 | 0 | 30.0000 |
| 884 | 0 | 3 | Miss. Catherine Helen Johnston | female | 7.0 | 1 | 2 | 23.4500 |
| 885 | 1 | 1 | Mr. Karl Howell Behr | male | 26.0 | 0 | 0 | 30.0000 |
| 886 | 0 | 3 | Mr. Patrick Dooley | male | 32.0 | 0 | 0 | 7.7500 |

887 rows × 8 columns
\n", "