{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Neural network methods\n", "\n", "Author: Gaurav Vaidya" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "## Learning objectives\n", "* Understand what an artificial neural network (ANN) is and how it can be used.\n", "* Implement ANNs for use in prediction and classification based on multiple input features.\n", "\n", "## Learning deeply\n", "[Artificial Neural Networks (ANNs)](https://en.wikipedia.org/wiki/Artificial_neural_network) and [deep learning](https://en.wikipedia.org/wiki/Deep_learning) are currently getting a lot of interest, both as a subject of research and as a tool for analyzing datasets. A big difference from other machine learning techniques we've looked at so far is that ANNs can identify characteristics of interest by themselves, rather than having to be chosen by data scientists. Some of the other advantages of ANNs are related specifically to interpreting video and audio data, such as by using [convolutional neural networks](https://en.wikipedia.org/wiki/Convolutional_neural_network), but today we will focus on simple ANNs so you understand their struction and function.\n", "\n", "### Units\n", "\n", "ANNs are designed as *layers* of *units* (or nodes, or [artificial neurons](https://en.wikipedia.org/wiki/Artificial_neuron)). Each unit accepts multiple inputs, each of which has a different weight, including one bias input, which it combines into a single value. That single value is passed to an [activation function](https://en.wikipedia.org/wiki/Activation_function), which provides an output only if the combined value is greater than a particular threshold (usually, zero).\n", "\n", "In effect, each unit focuses on a particular aspect of the layer underneath it, and then summarizing that information for the layer above it. Every input to every unit has a *weight* and the unit has a *bias* input, and so is effectively doing a linear regression on the incoming data to obtain an output. The use of a non-linear activation function allows the ANN to predict and classify data that are not [linearly separable](https://en.wikipedia.org/wiki/Linear_separability). ANNs used to use the same sigmoid function we saw before, but these days the [rectified linear unit (ReLU) function](https://en.wikipedia.org/wiki/Rectifier_(neural_networks)) has become much more popular. It simply returns zero when the real output value is less than zero.\n", "\n", "![ReLU and softmax rectifiers](../nb-images/800px-Rectifier_and_softplus_functions.svg.png)\n", "\n", "### Layers\n", "\n", "![A diagram showing input, output and hidden layers in an ANN](../nb-images/colored_neural_network.svg.png)\n", "\n", "Every neural network has three layers:\n", "\n", "* An input layer, where each unit corresponds to a particular input feature. This could be categorical data, continuous data, or even colour values from images.\n", "* An output layer. We will be running two examples today: in the first, we will use a single output unit (the predicted price for a particular house in California). In the second, we will use seven output units, each corresponding to a particular type of forest cover.\n", "* A hidden layer. Without a hidden layer, an ANN can only pick up linear relationships: how changes in the input layer correspond to values in the output layer. 
Thanks to the hidden layer, an ANN can also pick up non-linear relationships, where different groups of input values interact in complicated ways to produce the correct response on the output layer. The \"deep\" in [deep learning](https://en.wikipedia.org/wiki/Deep_learning) refers to the hidden layers that allow the model to identify intermediate patterns between the input and output layers.\n", "\n", "Putting it all together, we end up with a type of ANN called a *multilayer perceptron (MLP) network*, which looks like the following:\n", "\n", "![A visualization of a multi-layer perceptron (MLP) from the Scikit-learn manual](../nb-images/multilayerperceptron_network.png)\n", "\n", "[Google's Machine Learning course](https://developers.google.com/machine-learning/crash-course/introduction-to-neural-networks/anatomy) puts this very nicely: \"Stacking nonlinearities on nonlinearities lets us model very complicated relationships between the inputs and the predicted outputs. In brief, each layer is effectively learning a more complex, higher-level function over the raw inputs.\"" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "## So what do we need?\n", "To create an ANN, we need to choose:\n", "1. The number of input units (= the number of input features).\n", "2. The number of output units:\n", " - When using the ANN to predict, we generally only need a single output unit.\n", " - When using the ANN to classify, we generally set the number of output units to the number of possible labels.\n", "3. The number of hidden layers.\n", " - More hidden layers allow for more complex models -- which, as you've learned, also increases your risk of overfitting! So you want to go for the simplest model that meets your needs.\n", "4. A loss function. For classification, scikit-learn's MLP classes use [logistic loss](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.log_loss.html), also known as cross-entropy loss; for regression, as in our first example below, they minimize squared error.\n", "5. The solver to use. The solver controls learning by searching for minima in the parameter space. ANNs are generally trained with [stochastic gradient descent (SGD)](https://en.wikipedia.org/wiki/Stochastic_gradient_descent) or one of its variants, such as [RMSProp](https://en.wikipedia.org/wiki/Stochastic_gradient_descent#RMSProp); today we will use [Adaptive Moment Estimation (Adam)](https://en.wikipedia.org/wiki/Stochastic_gradient_descent#Adam).\n", "6. The regularization protocol. We will use L2 regularization.\n", "\n", "Note that these are the [hyperparameters](https://en.wikipedia.org/wiki/Hyperparameter_(machine_learning)) of our model: we will adjust these hyperparameters to improve how quickly and accurately we can determine the actual parameters of our model, which is the set of weights and biases on all units across all layers." ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "## Reminders of the ground rules\n", "* Always have *training* data and *testing* data, and make sure the ANN *never* sees the testing data.\n", "* Always shuffle your data.\n", "* ANNs don't work well when the features are in different ranges: it's usually a good idea to *normalize* them before use."
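 ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before we start, it may help to see the computation a single unit performs. Below is a minimal NumPy sketch of one unit with a ReLU activation; the weights, bias and inputs are made-up numbers, not values from any trained model." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "\n", "def relu(z):\n", "    # ReLU activation: zero below the threshold, the value itself above it.\n", "    return np.maximum(0, z)\n", "\n", "# Made-up weights and bias for a unit with three inputs.\n", "weights = np.array([0.5, -1.2, 0.8])\n", "bias = 0.1\n", "\n", "def unit_output(inputs):\n", "    # Weighted sum of the inputs plus the bias, passed through the activation.\n", "    return relu(np.dot(weights, inputs) + bias)\n", "\n", "print(unit_output(np.array([1.0, 0.5, 2.0])))  # combined value 1.6 -> output 1.6\n", "print(unit_output(np.array([0.0, 2.0, 0.0])))  # combined value -2.3 -> output 0.0" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An ANN is simply many of these units wired together in layers, with the weights and biases learned from data rather than written by hand." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## ANN for prediction: how much might this house cost?"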
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There aren't very good datasets for showcasing prediction on biological data, so we will use one of the classic machine learning datasets: a dataset of [California house prices](https://www.kaggle.com/camnugent/california-housing-prices), based on the 1990 census and published in [Pace and Barry, 1997](https://doi.org/10.1016/S0167-7152(96)00140-X). Scikit-Learn can download this dataset for us, so let's start with that." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Help on function fetch_california_housing in module sklearn.datasets.california_housing:\n", "\n", "fetch_california_housing(data_home=None, download_if_missing=True, return_X_y=False)\n", " Load the California housing dataset (regression).\n", " \n", " ============== ==============\n", " Samples total 20640\n", " Dimensionality 8\n", " Features real\n", " Target real 0.15 - 5.\n", " ============== ==============\n", " \n", " Read more in the :ref:`User Guide `.\n", " \n", " Parameters\n", " ----------\n", " data_home : optional, default: None\n", " Specify another download and cache folder for the datasets. By default\n", " all scikit-learn data is stored in '~/scikit_learn_data' subfolders.\n", " \n", " download_if_missing : optional, default=True\n", " If False, raise a IOError if the data is not locally available\n", " instead of trying to download the data from the source site.\n", " \n", " \n", " return_X_y : boolean, default=False.\n", " If True, returns ``(data.data, data.target)`` instead of a Bunch\n", " object.\n", " \n", " .. versionadded:: 0.20\n", " \n", " Returns\n", " -------\n", " dataset : dict-like object with the following attributes:\n", " \n", " dataset.data : ndarray, shape [20640, 8]\n", " Each row corresponding to the 8 feature values in order.\n", " \n", " dataset.target : numpy array of shape (20640,)\n", " Each value corresponds to the average house value in units of 100,000.\n", " \n", " dataset.feature_names : array of length 8\n", " Array of ordered feature names used in the dataset.\n", " \n", " dataset.DESCR : string\n", " Description of the California housing dataset.\n", " \n", " (data, target) : tuple if ``return_X_y`` is True\n", " \n", " .. versionadded:: 0.20\n", " \n", " Notes\n", " ------\n", " \n", " This dataset consists of 20,640 samples and 9 features.\n", "\n" ] } ], "source": [ "from sklearn import datasets\n", "help(datasets.fetch_california_housing)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Shape of California housing data: (20640, 8)\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
MedIncHouseAgeAveRoomsAveBedrmsPopulationAveOccupLatitudeLongitude
08.325241.06.9841271.023810322.02.55555637.88-122.23
18.301421.06.2381370.9718802401.02.10984237.86-122.22
27.257452.08.2881361.073446496.02.80226037.85-122.24
35.643152.05.8173521.073059558.02.54794537.85-122.25
43.846252.06.2818531.081081565.02.18146737.85-122.25
\n", "
" ], "text/plain": [ " MedInc HouseAge AveRooms AveBedrms Population AveOccup Latitude \\\n", "0 8.3252 41.0 6.984127 1.023810 322.0 2.555556 37.88 \n", "1 8.3014 21.0 6.238137 0.971880 2401.0 2.109842 37.86 \n", "2 7.2574 52.0 8.288136 1.073446 496.0 2.802260 37.85 \n", "3 5.6431 52.0 5.817352 1.073059 558.0 2.547945 37.85 \n", "4 3.8462 52.0 6.281853 1.081081 565.0 2.181467 37.85 \n", "\n", " Longitude \n", "0 -122.23 \n", "1 -122.22 \n", "2 -122.24 \n", "3 -122.25 \n", "4 -122.25 " ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import numpy as np\n", "import pandas as pd\n", "\n", "# Fetch California housing dataset. This will be downloaded to your computer.\n", "calif = datasets.fetch_california_housing()\n", "print(\"Shape of California housing data: \", calif.data.shape)\n", "califdf = pd.DataFrame.from_records(calif.data, columns=calif.feature_names)\n", "califdf.head()" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[4.526 3.585 3.521 3.413 3.422]\n" ] } ], "source": [ "# What do the house prices look like?\n", "print(calif.target[0:5]) # Units: $100,000x" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's start by shuffling our data and splitting into testing data and training data." ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Train data shape: (15480, 8)\n", "Train label shape: (15480,)\n", "Test data shape: (5160, 8)\n", "Test label shape: (5160,)\n" ] } ], "source": [ "from sklearn import model_selection\n", "X_train, X_test, y_train, y_test = model_selection.train_test_split(\n", " califdf, # Input features (X)\n", " calif.target, # Output features (y)\n", " test_size=0.25, # Put aside 25% of data for testing.\n", " shuffle=True # Shuffle inputs.\n", ")\n", "\n", "# Did we err?\n", "\n", "print(\"Train data shape: \", X_train.shape)\n", "print(\"Train label shape: \", y_train.shape)\n", "\n", "print(\"Test data shape: \", X_test.shape)\n", "print(\"Test label shape: \", y_test.shape)" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
MedIncHouseAgeAveRoomsAveBedrmsPopulationAveOccupLatitudeLongitude
count15480.00000015480.00000015480.00000015480.00000015480.00000015480.00000015480.00000015480.000000
mean3.86212428.6416675.4210701.0958261424.5291342.99134235.632852-119.577271
std1.89539912.6161852.5563000.4911321141.5456124.9726822.1345062.003396
min0.4999001.0000000.8461540.3750005.0000000.69230832.540000-124.250000
25%2.55620018.0000004.4338671.005435788.0000002.43260733.930000-121.810000
50%3.53380029.0000005.2250131.0484331164.0000002.82174434.260000-118.500000
75%4.73750037.0000006.0415041.0994761722.0000003.28337237.710000-118.020000
max15.00010052.000000141.90909134.06666735682.000000599.71428641.950000-114.310000
\n", "
" ], "text/plain": [ " MedInc HouseAge AveRooms AveBedrms Population \\\n", "count 15480.000000 15480.000000 15480.000000 15480.000000 15480.000000 \n", "mean 3.862124 28.641667 5.421070 1.095826 1424.529134 \n", "std 1.895399 12.616185 2.556300 0.491132 1141.545612 \n", "min 0.499900 1.000000 0.846154 0.375000 5.000000 \n", "25% 2.556200 18.000000 4.433867 1.005435 788.000000 \n", "50% 3.533800 29.000000 5.225013 1.048433 1164.000000 \n", "75% 4.737500 37.000000 6.041504 1.099476 1722.000000 \n", "max 15.000100 52.000000 141.909091 34.066667 35682.000000 \n", "\n", " AveOccup Latitude Longitude \n", "count 15480.000000 15480.000000 15480.000000 \n", "mean 2.991342 35.632852 -119.577271 \n", "std 4.972682 2.134506 2.003396 \n", "min 0.692308 32.540000 -124.250000 \n", "25% 2.432607 33.930000 -121.810000 \n", "50% 2.821744 34.260000 -118.500000 \n", "75% 3.283372 37.710000 -118.020000 \n", "max 599.714286 41.950000 -114.310000 " ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Let's have a look at the data. Is it in similar ranges?\n", "import pandas as pd\n", "pd.DataFrame(X_train).describe()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The input data comes in many different ranges: compare the ranges of latitude (32.54 to 41.95), longitude (-124.35 to -114.31), median income (0.50 to 15.0) and population (3 to 35682). As we described earlier, it's a good idea to normalize these values. Scikit-Learn has several built-in scalers that do just this. We will use the [StandardScaler](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html), which changes the data so it is normally distributed, with a mean of 0 and a variance of 1, but other [standardization methods are available](https://scikit-learn.org/stable/modules/preprocessing.html#standardization-or-mean-removal-and-variance-scaling)." ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
01234567
count1.548000e+041.548000e+041.548000e+041.548000e+041.548000e+041.548000e+041.548000e+041.548000e+04
mean-1.468822e-167.527714e-171.383906e-163.442552e-172.754042e-184.039261e-17-6.334296e-16-3.809758e-16
std1.000032e+001.000032e+001.000032e+001.000032e+001.000032e+001.000032e+001.000032e+001.000032e+00
min-1.773945e+00-2.191040e+00-1.789721e+00-1.467731e+00-1.243555e+00-4.623478e-01-1.449025e+00-2.332479e+00
25%-6.890194e-01-8.435205e-01-3.861969e-01-1.840523e-01-5.576209e-01-1.123645e-01-7.977993e-01-1.114508e+00
50%-1.732274e-012.840359e-02-7.669808e-02-9.650046e-02-2.282323e-01-3.410707e-02-6.431918e-015.377399e-01
75%4.618574e-016.625302e-012.427153e-017.433478e-032.605945e-015.872883e-029.731598e-017.773408e-01
max5.876513e+001.851518e+005.339453e+016.713457e+013.001070e+011.200041e+022.959632e+002.629256e+00
\n", "
" ], "text/plain": [ " 0 1 2 3 4 \\\n", "count 1.548000e+04 1.548000e+04 1.548000e+04 1.548000e+04 1.548000e+04 \n", "mean -1.468822e-16 7.527714e-17 1.383906e-16 3.442552e-17 2.754042e-18 \n", "std 1.000032e+00 1.000032e+00 1.000032e+00 1.000032e+00 1.000032e+00 \n", "min -1.773945e+00 -2.191040e+00 -1.789721e+00 -1.467731e+00 -1.243555e+00 \n", "25% -6.890194e-01 -8.435205e-01 -3.861969e-01 -1.840523e-01 -5.576209e-01 \n", "50% -1.732274e-01 2.840359e-02 -7.669808e-02 -9.650046e-02 -2.282323e-01 \n", "75% 4.618574e-01 6.625302e-01 2.427153e-01 7.433478e-03 2.605945e-01 \n", "max 5.876513e+00 1.851518e+00 5.339453e+01 6.713457e+01 3.001070e+01 \n", "\n", " 5 6 7 \n", "count 1.548000e+04 1.548000e+04 1.548000e+04 \n", "mean 4.039261e-17 -6.334296e-16 -3.809758e-16 \n", "std 1.000032e+00 1.000032e+00 1.000032e+00 \n", "min -4.623478e-01 -1.449025e+00 -2.332479e+00 \n", "25% -1.123645e-01 -7.977993e-01 -1.114508e+00 \n", "50% -3.410707e-02 -6.431918e-01 5.377399e-01 \n", "75% 5.872883e-02 9.731598e-01 7.773408e-01 \n", "max 1.200041e+02 2.959632e+00 2.629256e+00 " ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn.preprocessing import StandardScaler\n", "\n", "scaler = StandardScaler()\n", "\n", "# Figure out how to scale all the input features in the training dataset.\n", "scaler.fit(X_train)\n", "X_train_scaled = scaler.transform(X_train)\n", "\n", "# Also tranform our validation and testing data in the same way.\n", "X_test_scaled = scaler.transform(X_test)\n", "\n", "# Did that work?\n", "pd.DataFrame(X_train_scaled).describe()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We won't normalize the output labels for simplicity and because it's usually not necessary if you have a single output. However, if you have multiple output labels, you will want to normalize them to each other." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alright, we're ready to run our model! We will use the [MLPRegressor](https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPRegressor.html) -- MLP stands for [multilevel perceptron](https://en.wikipedia.org/wiki/Multilayer_perceptron), a description of this kind of neural network." 
] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Iteration 1, loss = 1.78395319\n", "Validation score: 0.269671\n", "Iteration 2, loss = 0.40048710\n", "Validation score: 0.496960\n", "Iteration 3, loss = 0.30416513\n", "Validation score: 0.568650\n", "Iteration 4, loss = 0.26045892\n", "Validation score: 0.611116\n", "Iteration 5, loss = 0.23380253\n", "Validation score: 0.641203\n", "Iteration 6, loss = 0.21744966\n", "Validation score: 0.659125\n", "Iteration 7, loss = 0.20602815\n", "Validation score: 0.671523\n", "Iteration 8, loss = 0.19877348\n", "Validation score: 0.677612\n", "Iteration 9, loss = 0.19326475\n", "Validation score: 0.684275\n", "Iteration 10, loss = 0.18926257\n", "Validation score: 0.690023\n", "Iteration 11, loss = 0.18622392\n", "Validation score: 0.693421\n", "Iteration 12, loss = 0.18364689\n", "Validation score: 0.696335\n", "Iteration 13, loss = 0.18100021\n", "Validation score: 0.702679\n", "Iteration 14, loss = 0.17879836\n", "Validation score: 0.705411\n", "Iteration 15, loss = 0.17674397\n", "Validation score: 0.707982\n", "Iteration 16, loss = 0.17475785\n", "Validation score: 0.709925\n", "Iteration 17, loss = 0.17325396\n", "Validation score: 0.712813\n", "Iteration 18, loss = 0.17186204\n", "Validation score: 0.714069\n", "Iteration 19, loss = 0.17057713\n", "Validation score: 0.718573\n", "Iteration 20, loss = 0.16939444\n", "Validation score: 0.716752\n", "Iteration 21, loss = 0.16808567\n", "Validation score: 0.722229\n", "Iteration 22, loss = 0.16696640\n", "Validation score: 0.721332\n", "Iteration 23, loss = 0.16616058\n", "Validation score: 0.725791\n", "Iteration 24, loss = 0.16496505\n", "Validation score: 0.724879\n", "Iteration 25, loss = 0.16343029\n", "Validation score: 0.728570\n", "Iteration 26, loss = 0.16266397\n", "Validation score: 0.728714\n", "Iteration 27, loss = 0.16140440\n", "Validation score: 0.733954\n", "Iteration 28, loss = 0.15990542\n", "Validation score: 0.732648\n", "Iteration 29, loss = 0.15905258\n", "Validation score: 0.737474\n", "Iteration 30, loss = 0.15700726\n", "Validation score: 0.736568\n", "Iteration 31, loss = 0.15603395\n", "Validation score: 0.740053\n", "Iteration 32, loss = 0.15512096\n", "Validation score: 0.739522\n", "Iteration 33, loss = 0.15510144\n", "Validation score: 0.739999\n", "Iteration 34, loss = 0.15324322\n", "Validation score: 0.743014\n", "Iteration 35, loss = 0.15267468\n", "Validation score: 0.744724\n", "Iteration 36, loss = 0.15275319\n", "Validation score: 0.746584\n", "Iteration 37, loss = 0.15069781\n", "Validation score: 0.745608\n", "Iteration 38, loss = 0.15100557\n", "Validation score: 0.746861\n", "Iteration 39, loss = 0.14962103\n", "Validation score: 0.748189\n", "Iteration 40, loss = 0.14930208\n", "Validation score: 0.748062\n", "Iteration 41, loss = 0.14768952\n", "Validation score: 0.752246\n", "Iteration 42, loss = 0.14709483\n", "Validation score: 0.750436\n", "Iteration 43, loss = 0.14619772\n", "Validation score: 0.754742\n", "Iteration 44, loss = 0.14698924\n", "Validation score: 0.753296\n", "Iteration 45, loss = 0.14502456\n", "Validation score: 0.757108\n", "Iteration 46, loss = 0.14455279\n", "Validation score: 0.754906\n", "Iteration 47, loss = 0.14357148\n", "Validation score: 0.756072\n", "Iteration 48, loss = 0.14328472\n", "Validation score: 0.754611\n", "Iteration 49, loss = 0.14265293\n", "Validation score: 0.759212\n", "Iteration 50, loss = 0.14184246\n", 
"Validation score: 0.761116\n", "Iteration 51, loss = 0.14245981\n", "Validation score: 0.762705\n", "Iteration 52, loss = 0.14170275\n", "Validation score: 0.760689\n", "Iteration 53, loss = 0.14162699\n", "Validation score: 0.762154\n", "Iteration 54, loss = 0.14181699\n", "Validation score: 0.760913\n", "Iteration 55, loss = 0.14113317\n", "Validation score: 0.761710\n", "Iteration 56, loss = 0.14059386\n", "Validation score: 0.757093\n", "Iteration 57, loss = 0.14002683\n", "Validation score: 0.763510\n", "Iteration 58, loss = 0.13988032\n", "Validation score: 0.762324\n", "Iteration 59, loss = 0.13886572\n", "Validation score: 0.762611\n", "Iteration 60, loss = 0.13948545\n", "Validation score: 0.760565\n", "Iteration 61, loss = 0.13796855\n", "Validation score: 0.764551\n", "Iteration 62, loss = 0.13749400\n", "Validation score: 0.761634\n", "Iteration 63, loss = 0.13776950\n", "Validation score: 0.762150\n", "Iteration 64, loss = 0.13747980\n", "Validation score: 0.760548\n", "Iteration 65, loss = 0.13769208\n", "Validation score: 0.762841\n", "Iteration 66, loss = 0.13705944\n", "Validation score: 0.760266\n", "Iteration 67, loss = 0.13760656\n", "Validation score: 0.763218\n", "Iteration 68, loss = 0.13663838\n", "Validation score: 0.765209\n", "Iteration 69, loss = 0.13600413\n", "Validation score: 0.762145\n", "Iteration 70, loss = 0.13653847\n", "Validation score: 0.760569\n", "Iteration 71, loss = 0.13724037\n", "Validation score: 0.761683\n", "Iteration 72, loss = 0.13620033\n", "Validation score: 0.766285\n", "Iteration 73, loss = 0.13539298\n", "Validation score: 0.761398\n", "Iteration 74, loss = 0.13654450\n", "Validation score: 0.765581\n", "Iteration 75, loss = 0.13572818\n", "Validation score: 0.764635\n", "Iteration 76, loss = 0.13481307\n", "Validation score: 0.765610\n", "Iteration 77, loss = 0.13539711\n", "Validation score: 0.763212\n", "Iteration 78, loss = 0.13486477\n", "Validation score: 0.766516\n", "Iteration 79, loss = 0.13455420\n", "Validation score: 0.764574\n", "Iteration 80, loss = 0.13329891\n", "Validation score: 0.765145\n", "Iteration 81, loss = 0.13359457\n", "Validation score: 0.761580\n", "Iteration 82, loss = 0.13324071\n", "Validation score: 0.767093\n", "Iteration 83, loss = 0.13334167\n", "Validation score: 0.768302\n", "Iteration 84, loss = 0.13305772\n", "Validation score: 0.765975\n", "Iteration 85, loss = 0.13302991\n", "Validation score: 0.765658\n", "Iteration 86, loss = 0.13235215\n", "Validation score: 0.768617\n", "Iteration 87, loss = 0.13205636\n", "Validation score: 0.762191\n", "Iteration 88, loss = 0.13251709\n", "Validation score: 0.767893\n", "Iteration 89, loss = 0.13216362\n", "Validation score: 0.764732\n", "Iteration 90, loss = 0.13183369\n", "Validation score: 0.769772\n", "Iteration 91, loss = 0.13174595\n", "Validation score: 0.769393\n", "Iteration 92, loss = 0.13176786\n", "Validation score: 0.769545\n", "Iteration 93, loss = 0.13167039\n", "Validation score: 0.767980\n", "Iteration 94, loss = 0.13133042\n", "Validation score: 0.764848\n", "Iteration 95, loss = 0.13721522\n", "Validation score: 0.766268\n", "Iteration 96, loss = 0.13296603\n", "Validation score: 0.770427\n", "Iteration 97, loss = 0.13028083\n", "Validation score: 0.766183\n", "Iteration 98, loss = 0.13026281\n", "Validation score: 0.768261\n", "Iteration 99, loss = 0.12995922\n", "Validation score: 0.771497\n", "Iteration 100, loss = 0.13009093\n", "Validation score: 0.770901\n", "Iteration 101, loss = 0.12921611\n", "Validation score: 0.773096\n", 
"Iteration 102, loss = 0.12937084\n", "Validation score: 0.771619\n", "Iteration 103, loss = 0.12943766\n", "Validation score: 0.768302\n", "Iteration 104, loss = 0.12919797\n", "Validation score: 0.773403\n", "Iteration 105, loss = 0.12894905\n", "Validation score: 0.771725\n", "Iteration 106, loss = 0.13001117\n", "Validation score: 0.773794\n", "Iteration 107, loss = 0.12871301\n", "Validation score: 0.769237\n", "Iteration 108, loss = 0.12970825\n", "Validation score: 0.771186\n", "Iteration 109, loss = 0.12829932\n", "Validation score: 0.771517\n", "Iteration 110, loss = 0.12854296\n", "Validation score: 0.773027\n", "Iteration 111, loss = 0.12832205\n", "Validation score: 0.773978\n", "Iteration 112, loss = 0.12837127\n", "Validation score: 0.769361\n", "Iteration 113, loss = 0.13116502\n", "Validation score: 0.772059\n", "Iteration 114, loss = 0.12819269\n", "Validation score: 0.772295\n", "Iteration 115, loss = 0.12759859\n", "Validation score: 0.769477\n", "Iteration 116, loss = 0.12792852\n", "Validation score: 0.771334\n", "Iteration 117, loss = 0.12850087\n", "Validation score: 0.772960\n", "Iteration 118, loss = 0.12810867\n", "Validation score: 0.772603\n", "Iteration 119, loss = 0.12881906\n", "Validation score: 0.771638\n", "Iteration 120, loss = 0.12917700\n", "Validation score: 0.767134\n", "Iteration 121, loss = 0.12729273\n", "Validation score: 0.769327\n", "Iteration 122, loss = 0.12702171\n", "Validation score: 0.771709\n", "Validation score did not improve more than tol=0.000100 for 10 consecutive epochs. Stopping.\n" ] }, { "data": { "text/plain": [ "MLPRegressor(activation='relu', alpha=0.001, batch_size='auto', beta_1=0.9,\n", " beta_2=0.999, early_stopping=True, epsilon=1e-08,\n", " hidden_layer_sizes=(50, 20), learning_rate='constant',\n", " learning_rate_init=0.001, max_iter=200, momentum=0.9,\n", " n_iter_no_change=10, nesterovs_momentum=True, power_t=0.5,\n", " random_state=None, shuffle=True, solver='adam', tol=0.0001,\n", " validation_fraction=0.1, verbose=True, warm_start=False)" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn.neural_network import MLPRegressor\n", "\n", "ann = MLPRegressor(\n", " activation='relu', # The activation function to use: ReLU\n", " solver='adam', # The solver to use\n", " alpha=0.001, # The L2 regularization rate: higher values increase cost for larger weights\n", " hidden_layer_sizes=(50, 20),\n", " # The number of units in each hidden layer.\n", " # Note that we don't need to specify input and output neuron numbers:\n", " # MLPClassifier determines this based on the shape of the features and labels\n", " # being fitted.\n", " verbose=True, # Report on progress.\n", " batch_size='auto', # Process dataset in batches of 200 rows at a time.\n", " early_stopping=True # This activates two features:\n", " # - We will hold 10% of data aside as validation data. At the end of each\n", " # iteration, we will test the validation data to see how well we're doing.\n", " # - If learning slows below a pre-determined level, we stop early rather than\n", " # overtraining on our data.\n", ")\n", "ann.fit(X_train_scaled, y_train)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Three things to note:\n", " - Some models will never converge, in which case you will see a warning. This suggests that learning is *not* completely, and is likely because something is wrong with learning using this dataset with these hyperparameters.\n", " - We learn iteratively. 
In Scikit-Learn, each iteration is further broken up into \"batches\" of data.\n", " - We expect to see loss going down over time and validation score going up over time. We can visualize these in a graph if we want." ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAX4AAAD8CAYAAABw1c+bAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAIABJREFUeJzt3Xt8lOWd///XZ+6ZnBMSIBwkIHioCiEEiICHKohFtF21rW5FtmpdS631Z3/bb13tdld36bZaa9WvrdZaa922eKjs1vpt7dd6QNF6KEFFBUSOSgyQEE45JzPz+f5xTYYhZCZDCCS583k+HvNI5j5e9xze93Vfc801oqoYY4wZPAJ9XQBjjDFHlwW/McYMMhb8xhgzyFjwG2PMIGPBb4wxg4wFvzHGDDIW/MYYM8hY8BtjzCBjwW+MMYNMsK8L0JXhw4fr+PHj+7oYxhgzYKxcuXKnqhans2y/DP7x48dTWVnZ18UwxpgBQ0Q+SndZa+oxxphBxoLfGGMGGQt+Y4wZZPplG78xva29vZ2qqipaWlr6uijGHJasrCxKSkoIhUI93oYFvxkUqqqqyM/PZ/z48YhIXxfHmB5RVerq6qiqqmLChAk93o419ZhBoaWlhWHDhlnomwFNRBg2bNhhX7la8JtBw0Lf+EFvvI59Ffz3vrCelz+s7etiGGNMv+ar4P/ZSxt5db0Fv+mf8vLy+mzf99xzD01NTV3Ou+aaa1izZg0AP/jBD3p1v4888gjV1dVd7sv0nW6DX0QeFpEaEXk/yfwbReSd2O19EYmIyNDYvC0i8l5s3hH/Km4wIESiR3ovxgw8qYL/oYceYuLEiUDPgj8SiSSd1zn4E/dl+k46Nf5HgPnJZqrqj1S1XFXLge8AL6vqroRF5sTmVxxeUbvneUIkaslvBo6PPvqIuXPnUlZWxty5c/n4448BePLJJyktLWXKlCmcddZZAKxevZoZM2ZQXl5OWVkZ69evP2h7X//616moqGDSpEnceuutANx7771UV1czZ84c5syZc9A6s2fPprKykptvvpnm5mbKy8tZuHAhAL/97W/j+/za174WD/m8vDxuueUWZs6cyeuvv87ixYs59dRTKS0tZdGiRagqS5cupbKykoULF1JeXk5zc3N8XwCPPfYYkydPprS0lJtuuilenry8PL773e8yZcoUZs2axY4dO3rxETeQRndOVV0uIuPT3N4C4LHDKdDhCAaEcFT7avdmgPiP/7OaNdX7enWbE48p4Na/m3TI611//fVcccUVXHnllTz88MPccMMNPPXUUyxevJhnn32WMWPGsGfPHgAeeOABvvnNb7Jw4ULa2tq6rGl///vfZ+jQoUQiEebOncu7777LDTfcwF133cWyZcsYPnx40rLcfvvt/PSnP+Wdd94BYO3atTzxxBP89a9/JRQKcd1117FkyRKuuOIKGhsbKS0tZfHixe74J07klltuAeDLX/4yf/zjH7nkkkv46U9/yp133klFxYH1vurqam666SZWrlxJUVER8+bN46mnnuLiiy+msbGRWbNm8f3vf59//ud/5he/+AX/+q//esiPrUmu19r4RSQHd2Xw3wmTFfiLiKwUkUXdrL9IRCpFpLK2tmft9F5AiFjwmwHk9ddf5/LLLwdcYL766qsAnHHGGVx11VX84he/iAf8aaedxg9+8AN++MMf8tFHH5GdnX3Q9n73u98xbdo0pk6dyurVqw+rPf2FF15g5cqVnHrqqZSXl/PCCy+wadMmADzP44tf/GJ82WXLljFz5kwmT57Miy++yOrVq1Nue8WKFcyePZvi4mKCwSALFy5k+fLlAGRkZPC5z30OgOnTp7Nly5YeH4PpWm9+gevvgL92auY5Q1WrRWQE8JyIfKCqy7taWVUfBB4EqKio6FF6BwMBq/GbbvWkZn60dHTVe+CBB3jzzTf505/+RHl5Oe+88w6XX345M2fO5E9/+hPnnXceDz30EOecc0583c2bN3PnnXeyYsUKioqKuOqqqw6rv7eqcuWVV3LbbbcdNC8rKwvP8wD3HYnrrruOyspKxo4dy7//+793u1/V5O/TUCgUfxw8zyMcDvf4GEzXerNXz2V0auZR1erY3xrg98CMXtzfQazGbwaa008/nccffxyAJUuWcOaZZwKwceNGZs6cyeLFixk+fDhbt25l06ZNHHfccdxwww1ceOGFvPvuuwdsa9++feTm5jJkyBB27NjBn//85/i8/Px86uvruy1PKBSivb0dgLlz57J06VJqamoA2LVrFx99dPDIvx0hP3z4cBoaGli6dGm3+505cyYvv/wyO3fuJBKJ8Nhjj3H22Wd3Wz7TO3qlxi8iQ4CzgX9ImJYLBFS1Pvb/PGBxb+wvGWvjN/1ZU1MTJSUl8fvf+ta3uPfee7n66qv50Y9+RHFxMb/61a8AuPHGG1m/fj2qyty5c5kyZQq33347v/3tbwmFQowaNSrept5hypQpTJ06lUmTJnHcccdxxhlnxOctWrSI888/n9GjR7Ns2bKkZVy0aBFlZWVMmzaNJUuW8J//+Z/MmzePaDRKKBTivvvu49hjjz1gncLCQr761a8yefJkxo8fz6mnnhqfd9VVV3HttdeSnZ3N66+/Hp8+evRobrvtNubMmYOqcsEFF3DRRRf17IE1h0xSXXIBiMhjwGxgOLADuBUIAajqA7FlrgLmq+plCesdh6vlgzvBPKqq30+nUBUVFdqTH2L5zF0vc+LIPO5fOP2Q1zX+tnbtWk455ZS+LoYxvaKr17OIrEy392Q6vXoWpLHMI7hun4nTNgFT0ilEb/ECQjhiNX5jjEnFV9/ctTZ+Y4zpnq+CPxgQIt00XRljzGDnq+C3Gr8xxnTPV8EfDASsjd8YY7rhq+C3Gr8xxnTPV8Ef9ISwDdJm+qHZs2fz7LPPHjDtnnvu4brrrku5XsdQztXV1VxyySVJt91d9+fOo3NecMEF8TGA/G7Pnj3cf//9SeeffvrpAGzZsoVHH320V/fdebTTjn31NV8Fv9X4TX+1YMGC+Dd0Ozz++OMsWNBtb2kAjjnmmAO+EXuoOgf/M888Q2FhYY+3d6SkGuK5p7oL/tdeew3oWfB3V97Owd+xr77mq+C3b+6a/uqSSy7hj3/8I62trYALmerqas4880waGhqYO3cu06ZNY/LkyfzhD384aP0tW7ZQWloKQHNzM5dddhllZWV86Ut
form5Ob5cusMyjx8/np07dwJw1113UVpaSmlpKffcc098f6eccgpf/epXmTRpEvPmzTtgPx26Gj46Eonw7W9/m8mTJ1NWVsZPfvITwA36NnXqVCZPnszVV18dfyzGjx/P4sWLOfPMM3nyySfZuHEj8+fPZ/r06Xz605/mgw8+SLqvRMkex5tvvpmNGzdSXl7OjTfeeNB6HVdVN998M6+88grl5eXcfffdRCIRbrzxRk499VTKysr4+c9/DsBLL73EnDlzuPzyy5k8eTIAF198MdOnT2fSpEk8+OCD8e11Hua6Y1+qyo033khpaSmTJ0/miSeeiG979uzZXHLJJZx88sksXLgw5bhGPaaq/e42ffp07YlFv16h5939co/WNf62Zs2a/XeeuUn14Qt69/bMTd2W4YILLtCnnnpKVVVvu+02/fa3v62qqu3t7bp3715VVa2trdXjjz9eo9Goqqrm5uaqqurmzZt10qRJqqr64x//WL/yla+oquqqVavU8zxdsWKFqqrW1dWpqmo4HNazzz5bV61apaqqxx57rNbW1sbL0nG/srJSS0tLtaGhQevr63XixIn61ltv6ebNm9XzPH377bdVVfXSSy/V3/zmNwcdU2lpqVZVVamq6u7du1VV9f7779cvfOEL2t7eHi9Tc3OzlpSU6Lp161RV9ctf/rLefffd8bL88Ic/jG/znHPO0Q8//FBVVd944w2dM2dO0n0lSvY4Jj52Xel4jJctW6af/exn49N//vOf6/e+9z1VVW1padHp06frpk2bdNmyZZqTk6ObNm2KL9vxuDc1NemkSZN0586dB2y7876WLl2q5557robDYd2+fbuOHTtWq6urddmyZVpQUKBbt27VSCSis2bN0ldeeeWgMh/weo4BKjXNjPVVjd+zGr/pxxKbexKbeVSVf/mXf6GsrIxzzz2XTz75JOWPjyxfvpx/+Ac3LFZZWRllZWXxeYc6LPOrr77K5z//eXJzc8nLy+MLX/gCr7zyCgATJkygvLwcSD48clfDRz///PNce+21BINuYIChQ4eybt06JkyYwKc+9SkArrzyyvgwzABf+tKXAFdrf+2117j00kvjP/6ybdu2pPtKdKiPY3f+8pe/8Otf/5ry8nJmzpxJXV1d/MdvZsyYwYQJE+LL3nvvvfEfjtm6dWuXP5KT6NVXX2XBggV4nsfIkSM5++yzWbFiRXzbJSUlBAIBysvLj8iw1L05LHOf8wIBa+M33Tv/9j7Z7cUXX8y3vvUt3nrrLZqbm5k2bRrgRuWsra1l5cqVhEIhxo8f3+2wxh3DFifqybDMmqIZITMzM/6/53ldNvV0NXy0qh5UvlT7AcjNzQUgGo1SWFgY/zGY7vY1bNiw+PyePI6pqCo/+clPOO+88w6Y/tJLL8XL23H/+eef5/XXXycnJ4fZs2f36uN+JIal9lWN37XxW68e0z/l5eUxe/Zsrr766gM+1N27dy8jRowgFAqxbNmyLoc+TnTWWWexZMkSAN5///348Mw9GZb5rLPO4qmnnqKpqYnGxkZ+//vf8+lPfzrtY+pq+Oh58+bxwAMPxANr165dnHzyyWzZsoUNGzYA8Jvf/KbLYZgLCgqYMGECTz75JOACctWqVUn3lSjZ45jukNSdlzvvvPP42c9+Fh+m+sMPP6SxsfGg9fbu3UtRURE5OTl88MEHvPHGG/F5icNcJzrrrLN44okniEQi1NbWsnz5cmbMOKKj1h/AV8HvBQTLfdOfLViwgFWrVnHZZfGBbFm4cCGVlZVUVFSwZMkSTj755JTb+PrXv05DQwNlZWXccccd8cBIHJb56quv7nJY5s6/uTtt2jSuuuoqZsyYwcyZM7nmmmuYOnVq2sdz4403xn8396yzzmLKlClcc801jBs3jrKyMqZMmcKjjz5KVlYWv/rVr7j00kuZPHkygUCAa6+9tsttLlmyhF/+8pdMmTKFSZMmxT+k7WpfiZI9jsOGDeOMM86gtLS0yw93O5SVlREMBpkyZQp3330311xzDRMnTmTatGmUlpbyta99rcva9/z58wmHw5SVlfFv//ZvzJo1Kz6vY5jrjg93O3z+85+PPz7nnHMOd9xxB6NGjUrvQe8F3Q7L3Bd6Oizzzf/9LsvW1fDmv5x7BEplBjIbltn4yeEOy+y7Gr+18RtjTGq+Cn7rx2+MMd3zVfB7gQARG6TNJNEfmzWNOVS98Tr2VfC7sXrszW0OlpWVRV1dnYW/GdBUlbq6OrKysg5rOz7rx29t/KZrJSUlVFVVUVtb29dFMeawZGVlUVJScljb8FXwWz9+k0woFDrgm5bGDGbdNvWIyMMiUiMi7yeZP1tE9orIO7HbLQnz5ovIOhHZICI392bBuxIQIaoQtVq/McYklU4b/yPA/G6WeUVVy2O3xQAi4gH3AecDE4EFIjLxcArbnWDAfU3cfnfXGGOS6zb4VXU5sKsH254BbFDVTaraBjwOXNSD7aTN82LBbzV+Y4xJqrd69ZwmIqtE5M8iMik2bQyQOJhGVWzaEROv8VvwG2NMUr3x4e5bwLGq2iAiFwBPAScCBw8fCEkTWUQWAYsAxo0b16OCeAF3HrMuncYYk9xh1/hVdZ+qNsT+fwYIichwXA1/bMKiJUB1iu08qKoVqlpRXFzco7JYjd8YY7p32MEvIqMkNvi2iMyIbbMOWAGcKCITRCQDuAx4+nD3l4oXC37r0mmMMcl129QjIo8Bs4HhIlIF3AqEAFT1AeAS4OsiEgaagctiPwMWFpHrgWcBD3hYVVcfkaOIsRq/McZ0r9vgV9UF3cz/KfDTJPOeAZ7pWdEOXbzGb+P1GGNMUr4bqwesxm+MMan4KvgD0tHGb8FvjDHJ+Cr4g7HunFbjN8aY5HwV/Narxxhjuuer4O/o1WO5b4wxyfkq+DvG6rEavzHGJOer4Ld+/MYY0z1fBf/+Nn4LfmOMScZXwW+9eowxpnu+Cn6r8RtjTPd8Ffz72/jtw11jjEnGV8FvY/UYY0z3fBn81sZvjDHJ+Sr4g9bGb4wx3fJV8HfU+KNqwW+MMcn4Kvg7unNaG78xxiTnq+D3bDx+Y4zplq+C39r4jTGme74Kfs/68RtjTLd8FfxW4zfGmO75KvitH78xxnSv2+AXkYdFpEZE3k8yf6GIvBu7vSYiUxLmbRGR90TkHRGp7M2Cd8XG6jHGmO6lU+N/BJifYv5m4GxVLQO+BzzYaf4cVS1X1YqeFTF9VuM3xpjuBbtbQFWXi8j4FPNfS7j7BlBy+MXqGevHb4wx3evtNv5/BP6ccF+Bv4jIShFZ1Mv7Okiswk/EvrlrjDFJdVvjT5eIzMEF/5kJk89Q1WoRGQE8JyIfqOryJOsvAhYBjBs3rqdlIBgQ685pjDEp9EqNX0TKgIeAi1S1rmO6qlbH/tYAvwdmJNuGqj6oqhWqWlFcXNzjsngBsQ93jTEmhcMOfhEZB/wP8GVV/TBheq6I5Hf8D8wDuuwZ1JuCASFibfzGGJNUt009IvIYMBsYLiJVwK1ACEBVHwBuAYYB94sIQDjWg2ck8PvYtCDwqK
r+3yNwDAewGr8xxqSWTq+eBd3Mvwa4povpm4ApB69xZAW9gHXnNMaYFHz1zV2wGr8xxnTHf8Ev1qvHGGNS8V/wW43fGGNS8l3wBz2xNn5jjEnBd8FvNX5jjEnNd8EfDAhRC35jjEnKd8HvBQJW4zfGmBR8F/xurB4LfmOMScZ3wW9t/MYYk5rvgt9G5zTGmNR8F/xeQOyHWIwxJgXfBb/14zfGmNR8F/wBsTZ+Y4xJxXfBb716jDEmNd8Fv/XjN8aY1HwX/PbNXWOMSc13we95Qti6cxpjTFK+C35r4zfGmNR8F/z2zV1jjEnNd8FvNX5jjEnNd8FvvXqMMSa1tIJfRB4WkRoReT/JfBGRe0Vkg4i8KyLTEuZdKSLrY7cre6vgyViN3xhjUku3xv8IMD/F/POBE2O3RcDPAERkKHArMBOYAdwqIkU9LWw63Fg91qvHGGOSSSv4VXU5sCvFIhcBv1bnDaBQREYD5wHPqeouVd0NPEfqE8hh86zGb4wxKfVWG/8YYGvC/arYtGTTj5ig9eoxxpiUeiv4pYtpmmL6wRsQWSQilSJSWVtb2+OCWI3fGGNS663grwLGJtwvAapTTD+Iqj6oqhWqWlFcXNzjggQDQkQt+I0xJpneCv6ngStivXtmAXtVdRvwLDBPRIpiH+rOi007YrxAAFVsvB5jjEkimM5CIvIYMBsYLiJVuJ46IQBVfQB4BrgA2AA0AV+JzdslIt8DVsQ2tVhVU31IfNiCnmtdCkeVjEBXLU3GGDO4pRX8qrqgm/kKfCPJvIeBhw+9aD3jxcLe2vmNMaZrvvvmbjDQUeO3vvzGGNMV3wW/1fiNMSY13wa/9eU3xpiu+Tb4rcZvjDFd813wB63Gb4wxKfku+L2AO6RIxILfGGO64rvg76jx27d3jTGma74L/v1t/Nad0xhjuuK74Lc2fmOMSc13wR/vzmlt/MYY0yXfBX/HWD3WndMYY7rmu+Dv6NVjTT3GGNM1/wW/WI3fGGNS8V/w2yBtxhiTku+C39r4jTEmNd8Fvw3SZowxqfku+OPf3LXunMYY0yXfBb9nQzYYY0xKvgv+YMcgbdbUY4wxXfJd8FsbvzHGpOa74A/aIG3GmN7i0xwJprOQiMwH/jfgAQ+p6u2d5t8NzIndzQFGqGphbF4EeC8272NVvbA3Cp6MjdVjBjxVaK2HrIKDp8e+oBgXjUIgof4WCUNjLWQNgYyc5PuIhGHnOmiqc/sSDwqOcbdgFgQ8N33XJtj3CRwzFYYe59Zta4TqdyC7yE3TCFStgKpKiLRBIOTKpFFQIDMPsodC3ggYdgIUjHHH0dYAez+Bbe/AjvchZ7jbT9Gx0LgT6re7/e/80P3vZUAwA5r3QP02aNnrjjO7CIaUwPCTILvQleXj1yHcCjnDICPXPSb1OyBnKIyZDiMnQcs+aNjhjicadmVv2QNNu6F1n5uuERgxEUoqILMA9nwE+6rdstGoO7ai8e6YWuvdfpp2QtMuV04ALwhepitHZj6MOAXGVED+KHdc9dvcY7y3yj0uX/rt4b6CutVt8IuIB9wHfAaoAlaIyNOquqZjGVX9p4Tl/z9gasImmlW1vPeKnJr14/exlr2w/T33ZmmocW+y5l3uDZxdCHmj9gdDy14IZbtgCAQh3OLerMNOhJJTXbCt+zNseglQ94YMZrs3ejTiwiRvpNvvjvegZq1bZuhxbnqk3W0vI8+FiZcRexNXu2mF4/YHxa7NbnpDDbQ1uXnDjoP8Y/YHU3ujC46atS5Am3e5QBk7C6LtLmh3bXL7ychxJ4G2RjcvdwQMGQPtLbBroysXQEY+hLIg3OZCeOh4KD7FBe6WV124HYrhJ0FuMVT9bf8+EJCAe9zS5WXGHudwwrSMhG12klvsgjUacc9jVoE7gWQVumNo3g0fvwnvPRlbfgQce5qb33FiG1PhnreG7e7xXfs0hHIhr9g9T4EgeCH3Gio+xT3XmXnucd7+Lrz/ewg3u+duSAkEh7mTZcte2LzcnQwy811Zc4dD4bEweoorTzTsyt3W5E4sb/0a3nzgwGPMyHfP4bAT0n8cD0M6Nf4ZwAZV3QQgIo8DFwFrkiy/ALi1d4p36DqGbLA2/j7WvBu2rXJhlVXo3gwNNVD9NtRtcGHX8cbcsRp2b3EBmj/ahUBrvQvDzAIXwo21ULsOV4WMEc+tk1ng9te8y033Mty0cIsLOXDhFAgeHC4jJkIox9U8w81uGSRhe+LCfuREF7SfvOXK4oVczbat0ZUT3HbyR8dqfjX7pxVNcGExqsydjHZvge3vw4YXoa1+f1m8DLevky+AwvGuJrxpmZs+egpMutiFSFujO/aMXFfehu2utpifBZ+a58KpZZ97vMMtEMx026/bAB+95so+6fMw/kxX3sx8dyKrr3Ynr0ib208ox5Untxg+fgPW/ckF3cyvwbFnuse2boNbd9wsGDvDPe7RsAvpgOf221rvasD11VC30b0mAsH9J+vRZe6E3LrPvT72feLCO3+kO/llF6X3mmtrdPsZUnLwlVFn4TZXSUhXR5NPIEnreOcrr1QiYahd6177+aNdzT9rSPpl6QXpBP8YYGvC/SpgZlcLisixwATgxYTJWSJSCYSB21X1qR6WNS32Y+uHKdIOez52b8767e6ytXGnC7vGWkBcjbMjnFv2uTdsy15o7QhZ3P2ueJmuVhNuceGamQ+jJsNJ57va0L5tLnjyR7rgaW1wAVw4Dkq/CMdMg8KxLoyyiw58g4fbXA04lLN/ejTijimY6WpvuzbC1r+5/Z/4GbfdZMKtbv1UTSbgatqRVhd6Hfttb3GPT+7w1CEUbnMBmpF3aEF0tB1TDrOuTW9ZL+RuHXKGutvwE2DCWcnXyxkKJ8zteRkzct0tHYf6WHcX6umGPrimn1GTD23/vSyd4O/qVZssVS8DlqoecN03TlWrReQ44EUReU9VNx60E5FFwCKAceNSvBm7ERzso3PWb4c9W12gZuRAzQew9U1Xywxlu9veKqj9wIVsx5tUoy4g25vc/4mC2e6SOGe4qznvq44FXb4Lu6HHudp7Rq4LOVVXizmmPFaTi7V9Zhe6GnZiKPSmYAbQ6Q0d8PbXPEVg+Inultb2MtNbLpTlbt1N63IfGRAcmt5+jOkl6QR/FTA24X4JUJ1k2cuAbyROUNXq2N9NIvISrv3/oOBX1QeBBwEqKip6nNqe58NePdGoq6nWb3e12WCGazeuWeNCWKOulvzJW1C3/uD1xXO15HCrC/b80a7Z4aQL3LrhVhfoXoYL76ETXPNEwTGuxppuLcoYMyCkE/wrgBNFZALwCS7cL++8kIicBBQBrydMKwKaVLVVRIYDZwB39EbBk9nfnfNI7qWXNNS4D5pa9rimi4xcd3/zctfOGcp2baE1H0BrV00n4tYLBF2tdsREmHYFFJ/kmg9a62Ho8TBmmoW3MSau2+BX1bCIXA88i+vO+bCqrhaRxUClqj4dW3QB8LjqAWMlnAL8XESiuO8M3J7YG+hI6Bc/tq4a6z62zYV7/TbX1LJrs7vfujfWXW1b1+sPP8k1R
7Q3u9p46RdcF7SiY13bcXuTq8EXn2yBbow5ZGn141fVZ4BnOk27pdP9f+9ivdeAo/opxlHv1bOvGj581nUz3PMR7P7ItaGHmw9cTjz3QWLBMVBQAiMnu77EJRWuf3Njnav5jyyFgtFHp+zGmEEpreAfSAIBISBHoFdPezPsXO++TNLRJa1mtQt8cN2xisa7L2d86jzXj7dgtOs7nDfC9UNO9aFmx5djjDHmCPNd8IPr2XPYNf5oFD6phA3Pw8YX4ZOVB/Z2KRgDw46Hube6rojFJ3ffd9gYY/oBXwa/F5Ce1fjbmmDLK+4bneuecV/nloDrO37mP7lmmOKTXO08lN37BTfGmKPAt8F/SGP17NoEz/+HC/xIq/sq9wlz4ZS/gxPOdV8sMcYYn/Bt8KfVq6e9GZbfCa/9xHWJrPiKa58fd3p6X74xxpgByJfBHwxI9238DTXw2ALXjj/57+Ezi603jTFmUPBl8Hfbxr9jDTz6JdfX/u9/AxOP6EjRxhjTr/gy+IOpgr9+B/zX59zIil95xn2r1RhjBhFfBr/nJQl+VfjDN9zwrV9b7nroGGPMIOPL4E/aj7/yl7DhObjgTgt9Y8yg5bvf3IUkbfw7N8Cz/+q6Z556Td8UzBhj+gFfBr/r1dOpO+eLi92QCRfdZ9+wNcYMar4M/oNq/DvWwJo/uJ+Myx/VdwUzxph+wLfBf0Ab//I73I8Zz7qu7wpljDH9hG+DP17jr1kLq5+CmYts6AVjjMGnwR9MHKvn5Tvcj5Wcdn3fFsrw5rx+AAARrklEQVQYY/oJXwZ/vMbfshfWPg3Tr7LavjHGxPgy+F0//qgbRz8adqNsGmOMAXwa/F5AiCjuJxGzi6Dk1L4ukjHG9Bu+DP5gQIhGwrD+OTjhMxDw+rpIxhjTb/gy+L2AcFzbh9C0042vb4wxJi6t4BeR+SKyTkQ2iMjNXcy/SkRqReSd2O2ahHlXisj62O3K3ix8MkFPOLXtb+5nE48/52js0hhjBoxuB2kTEQ+4D/gMUAWsEJGnVXVNp0WfUNXrO607FLgVqAAUWBlbd3evlD4JLxBgZrgSxs6y3jzGGNNJOjX+GcAGVd2kqm3A48BFaW7/POA5Vd0VC/vngPk9K2r6hoZrOTG6CT4170jvyhhjBpx0gn8MsDXhflVsWmdfFJF3RWSpiIw9xHURkUUiUikilbW1tWkUK7lTmle6f0609n1jjOksneDvaijLzoPd/x9gvKqWAc8D/3UI67qJqg+qaoWqVhQXF6dRrOSK27cRJmBj7htjTBfSCf4qYGzC/RKgOnEBVa1T1dbY3V8A09Nd90goiOxiN0OsG6cxxnQhneBfAZwoIhNEJAO4DHg6cQERGZ1w90Jgbez/Z4F5IlIkIkXAvNi0I6ogvIudFB7p3RhjzIDUba8eVQ2LyPW4wPaAh1V1tYgsBipV9WngBhG5EAgDu4CrYuvuEpHv4U4eAItVddcROI4D5IXr2MyQI70bY4wZkNL6zV1VfQZ4ptO0WxL+/w7wnSTrPgw8fBhlPGR54d3U6slHc5fGGDNg+O+bu6rktddRo9bUY4wxXfFf8DfvxtMwNVFr6jHGmK74L/gbagCoiRag2mXPUWOMGdT8F/yNLvhrKTzwB9eNMcYAfgz+WI2/Vocc+IPrxhhjAF8G/w4AatVq/MYY0xVfBn9EQuwjh12NbX1dGmOM6Xd8GPy1RHJHAMI7W/f0dWmMMabf8WHw7yBYMIrMYIC3P7bgN8aYznwY/DUE8kdSVjKEt7ce0d97McaYAcmHwb8DcouZOq6I1Z/sozUc6esSGWNMv+Kv4I9G3A+s541k6thC2iJR1lTv6+tSGWNMv+Kv4G+qA41C3gimjisCsHZ+Y4zpxF/BH+vDT95IRg3JYvSQLN62nj3GGHMAnwb/CACmjivk7Y/tA15jjEnks+B3wzXEg39sEVW7m6mpb+nDQhljTP/iz+DP3V/jB2vnN8aYRP4L/ow8yMwDoHTMEEKe8Mamuj4umDHG9B8+C37Xh79DVsjjMxNHsrSyir3N7X1YMGOM6T/8F/x5Iw+Y9I05J1DfGubXr23pmzIZY0w/46/gb6yNf7DbYdIxQzjn5BH88q+baWwN91HBjDGm/0gr+EVkvoisE5ENInJzF/O/JSJrRORdEXlBRI5NmBcRkXdit6d7s/AH6aLGD67Wv6epnUff/PiI7t4YYwaCboNfRDzgPuB8YCKwQEQmdlrsbaBCVcuApcAdCfOaVbU8druwl8p9MFU46bMwbtZBs6YfW8Tpxw/jwVc2sa/F2vqNMYNbOjX+GcAGVd2kqm3A48BFiQuo6jJVbYrdfQMo6d1ipkEELr4PJl/S5ewbzzuJ3Y1t/K/frSJqv8xljBnE0gn+McDWhPtVsWnJ/CPw54T7WSJSKSJviMjFyVYSkUWx5Spra2vTKNahmTquiO9+9hSeW7OD+5Zt6PXtG2PMQBFMYxnpYlqXVWYR+QegAjg7YfI4Va0WkeOAF0XkPVXdeNAGVR8EHgSoqKg4IlXyq04fz6qte7jr+Q85cWQ+80tHHYndGGNMv5ZOjb8KGJtwvwSo7ryQiJwLfBe4UFVbO6aranXs7ybgJWDqYZT3sIgIt32hjMljhvD1JSu55/kPrdnHGDPopBP8K4ATRWSCiGQAlwEH9M4RkanAz3GhX5MwvUhEMmP/DwfOANb0VuF7IjvD44lFp/H58jHc8/x6rvzV3/iorrEvi2SMMUdVt8GvqmHgeuBZYC3wO1VdLSKLRaSjl86PgDzgyU7dNk8BKkVkFbAMuF1V+zT4wYX/j/9+Cv95cSmVW3Yz98cvc+sf3mf7XhvMzRjjf6La/5o6KioqtLKy8qjsq2ZfC/e8sJ4nVmwlqsqM8UP5XNlozvpUMeOG5iDS1UccxhjTv4jISlWtSGvZwR78HT6qa+Spt6t5etUnbKx1TT9jCrOpGF/EpGMKmDh6CBOKcxldkEUgYCcDY0z/YsF/GFSVjbWNvLZxJ3/dsJN3q/ayLaEJKDMYYOzQHMYUZlNSlE1JUQ4lRdkcU5jFyIIsRuRnkRH010gYxpj+71CCP53unIOKiHDCiDxOGJHHFaeNB6CuoZV12+vZXNfIlp2NbN3VTNWeJlZV7WFP08HfBC7KCTEiP4sRBZkU52VSnJ9JYU4GhTkhCrNDDMkOUdDxNytEXlYQz64ijDFHiQV/GoblZXL6CZmcfsLwg+Y1tIap2t3Etr0t1OxrYfveVmrqW6ipb6W2vpVNtY3UNrTSFo6m3EdOhkd+VpD8rBAFsb95WUHyMoJkZ3hkhbz4MnmZQQpzMijKcSeQrKBHVkaAvMwg2SHPPpcwxqRkwX+Y8jKDnDyqgJNHFSRdRlVpbo+wu6mdvU3t7G12t/qWdva1hKlvaaehJUx9S5j61nb2NYfZ09RG1e4mGlrDNLdFaGmP0hZJffIACAaEgmx3ZVGYEyI3M0hWyCM/M0hxvrv6KMgOkZsRJCfDIzvDnVA6rjhCXoBhuRkU5WTYZxnG+JQF/1EgIuRkBMnJCDKmMLvH22kLR2lsdSeI
Pc1t7GlqZ19LOy3tUZrbI7GTx/4Ty56mdhpbw9TWt1Lf4v6mc/IACAjkZAQJeULIC8S/qp2fFaQ4L5OhuRlEVYlEITMUYEh2iPysIKFAAC8g5GUGKcrNYEh2iKxQgAwvEL9qyQx6tIYjNLdHyAp5DM/LpDA7ZCcaY44SC/4BJCMYICOYQVFuBuPIOeT1VZV9ze6qoqktQmPsaqKpLUIk9iF/WzhKXUMrOxvaaGqL0B6J0h6J0tF6tLe5nZp9rayvacATIRAQWtsj7gqmNUwkqkR68G3ogLjjCwUCZIY8cjM9skMeIS9AICBkeEJGMEBm0E3PCnkEBFrDrnwZwQBZQY9QUBAEEYiqElW37cygR2Yw4PbhuRNRRuy+J275kBcgN9NdCUWiSls4ikL8s5mskEdGMEBAhKgq4agSEPAC4qZF3f6CsbIGA0J7WGmLRMnwAmRneEk/+I9EldZwBC8gZAa9Q378BiJV91oJetYZ4miz4B9ERIQhOSGG5ISO6H5UlYbWMLsb3ZVHazhCW9hdlTS1RWgNR8kKuaBuCUeorW+lrqGNtkiUtnCU1nCUprawOyFFXcC2h6O0tEfZ2xy7wmmLEFUlMxbkbZEoLe0R2iOKqqJAQCQe0m1hNz/cx0N0BAMSPwEp0B52TXjtkf3lCnnuCjE75JriwtEoDS1hmtsj8RNfdkbHCTBAOKq0tkfjJ6+QJwRjV15ZoUD88yJViETdFV/Qcyc8AI1dzwUD7qTWcZLvKGtmyDUFBmNXZO0RJRyJEghIfHow9r8krC8Jw3yJgCduGQXe+ng3L6+rZfu+Fk4ckcfkMUM4fkQeY4tyGFmQGf+cygtIrILhRl5XhVBQyAkFyQwFUHUneMW97sA97x0nYy/gStHYFqax1Z1YC7Ld1ber9ITxAsKQ7BB5mS4OI1FFRAgI8XKoKnub2/l4VxM1+1oZWZDFscNzKMhK/l6KRhVJ2EY44l7b4YgSjkYJBQPkZwb75DM5C37T60SE/KwQ+SneFH0lGnU18LZINB66Ud0/vTl2JRT0hAzPQ1H2NLWzp7md1tiJJRKN4gUCeLEwCkeVqGr8RBOJxt7gUY1dXQjtEaWpLUxjmzsJtoWjBDqCOna1khEMEIlGaYyVoSV2ogx5sQ/uMzzaEk6KzbGTaCh2hQHuis2V0YXLzoYwm3c20tAaRkTiYR+OavwkICLx2nfiebE9FlRHQl5mkNOOH8bnykazdns9L3xQw5Mrq47Ivg6HiHuOk+k4iWcGA0TVhXt7xL2WOq58O06YXVU6MrwAQ3MzyAy5E/GwvAyevPb0I3IsiSz4zaASCAhZAddUZLqnuj/EOoIrw3NXE+6k54IuGlXao9H4uL16wDbcVUUkqkSjEFGlpCibUKcmnr3N7VTtbmJnQ1t834knVsGdpNoj0diVY8RdoeBq+R2iqkTUlSkSdVcDuRlBcjJdE96+ljDNbeHYlVOQSNRdSTa0hCF2chQhdiJ0+0WE/Mwg44blMCI/kx37WtlS18iuxjZ3Io9E41c0HSfikLf/ZADEmwo7rspa26PsbGxlV+xqNxLV+FXHkWbBb4xJSiT1Zw4Zvfiz3UOyQwzJHtJr2zPJ2acqxhgzyFjwG2PMIGPBb4wxg4wFvzHGDDIW/MYYM8hY8BtjzCBjwW+MMYOMBb8xxgwy/fIXuESkFvioh6sPB3b2YnH6ih1H/+OXY7Hj6H9641iOVdXidBbsl8F/OESkMt2fH+vP7Dj6H78cix1H/3O0j8WaeowxZpCx4DfGmEHGj8H/YF8XoJfYcfQ/fjkWO47+56gei+/a+I0xxqTmxxq/McaYFHwT/CIyX0TWicgGEbm5r8tzKERkrIgsE5G1IrJaRL4Zmz5URJ4TkfWxv0V9XdZ0iIgnIm+LyB9j9yeIyJux43hCRDL6uozdEZFCEVkqIh/EnpfTBuLzISL/FHtNvS8ij4lI1kB5PkTkYRGpEZH3E6Z1+RyIc2/s/f+uiEzru5IfKMlx/Cj22npXRH4vIoUJ874TO451InLekSiTL4JfRDzgPuB8YCKwQEQm9m2pDkkY+F+qegowC/hGrPw3Ay+o6onAC7H7A8E3gbUJ938I3B07jt3AP/ZJqQ7N/wb+r6qeDEzBHc+Aej5EZAxwA1ChqqWAB1zGwHk+HgHmd5qW7Dk4HzgxdlsE/OwolTEdj3DwcTwHlKpqGfAh8B2A2Pv+MmBSbJ37Y/nWq3wR/MAMYIOqblLVNuBx4KI+LlPaVHWbqr4V+78eFzJjcMfwX7HF/gu4uG9KmD4RKQE+CzwUuy/AOcDS2CL9/jhEpAA4C/glgKq2qeoeBuDzgfuVvWwRCQI5wDYGyPOhqsuBXZ0mJ3sOLgJ+rc4bQKGIjD46JU2tq+NQ1b+oajh29w2gJPb/RcDjqtqqqpuBDbh861V+Cf4xwNaE+1WxaQOOiIwHpgJvAiNVdRu4kwMwou9KlrZ7gH8GOn6lexiwJ+FFPhCem+OAWuBXsSarh0QklwH2fKjqJ8CdwMe4wN8LrGTgPR+Jkj0HAzkDrgb+HPv/qByHX4Jfupg24LoriUge8N/A/6+q+/q6PIdKRD4H1KjqysTJXSza35+bIDAN+JmqTgUa6efNOl2JtX9fBEwAjgFycU0infX35yMdA/F1hoh8F9fUu6RjUheL9fpx+CX4q4CxCfdLgOo+KkuPiEgIF/pLVPV/YpN3dFyuxv7W9FX50nQGcKGIbME1t52DuwIojDU1wMB4bqqAKlV9M3Z/Ke5EMNCej3OBzapaq6rtwP8ApzPwno9EyZ6DAZcBInIl8Dlgoe7vV39UjsMvwb8CODHWWyED9+HI031cprTF2sF/CaxV1bsSZj0NXBn7/0rgD0e7bIdCVb+jqiWqOh73HLyoqguBZcAlscUGwnFsB7aKyEmxSXOBNQyw5wPXxDNLRHJir7GO4xhQz0cnyZ6Dp4ErYr17ZgF7O5qE+iMRmQ/cBFyoqk0Js54GLhORTBGZgPuw+m+9XgBV9cUNuAD36fhG4Lt9XZ5DLPuZuMu5d4F3YrcLcO3jLwDrY3+H9nVZD+GYZgN/jP1/XOzFuwF4Esjs6/KlUf5yoDL2nDwFFA3E5wP4D+AD4H3gN0DmQHk+gMdwn02042rC/5jsOcA1kdwXe/+/h+vJ1OfHkOI4NuDa8jve7w8kLP/d2HGsA84/EmWyb+4aY8wg45emHmOMMWmy4DfGmEHGgt8YYwYZC35jjBlkLPiNMWaQseA3xphBxoLfGGMGGQt+Y4wZZP4fwit2qj2RLJ8AAAAASUVORK5CYII=\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "import matplotlib.pyplot as plt\n", "\n", "# Visualize the loss curve and validation scores over iterations.\n", "plt.plot(ann.loss_curve_, label='Loss at iteration')\n", "plt.plot(ann.validation_scores_, label='Validation scores at iteration')\n", "plt.legend(loc='best')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we can use our test data to check how our model performs on data that it has not been previously exposed to. Let's see how we did!" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.7933064999109457" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ann.score(X_test_scaled, y_test)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Not bad, but it could definitely be better.\n", "\n", "### Let's make a prediction\n", "\n", "Note that we can use this ANN to make a prediction. To come up with one, let's look at those values again:" ] }, { "cell_type": "code", "execution_count": 222, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
MedIncHouseAgeAveRoomsAveBedrmsPopulationAveOccupLatitudeLongitude
count20640.00000020640.00000020640.00000020640.00000020640.00000020640.00000020640.00000020640.000000
mean3.87067128.6394865.4290001.0966751425.4767443.07065535.631861-119.569704
std1.89982212.5855582.4741730.4739111132.46212210.3860502.1359522.003532
min0.4999001.0000000.8461540.3333333.0000000.69230832.540000-124.350000
25%2.56340018.0000004.4407161.006079787.0000002.42974133.930000-121.800000
50%3.53480029.0000005.2291291.0487801166.0000002.81811634.260000-118.490000
75%4.74325037.0000006.0523811.0995261725.0000003.28226137.710000-118.010000
max15.00010052.000000141.90909134.06666735682.0000001243.33333341.950000-114.310000
\n", "
" ], "text/plain": [ " MedInc HouseAge AveRooms AveBedrms Population \\\n", "count 20640.000000 20640.000000 20640.000000 20640.000000 20640.000000 \n", "mean 3.870671 28.639486 5.429000 1.096675 1425.476744 \n", "std 1.899822 12.585558 2.474173 0.473911 1132.462122 \n", "min 0.499900 1.000000 0.846154 0.333333 3.000000 \n", "25% 2.563400 18.000000 4.440716 1.006079 787.000000 \n", "50% 3.534800 29.000000 5.229129 1.048780 1166.000000 \n", "75% 4.743250 37.000000 6.052381 1.099526 1725.000000 \n", "max 15.000100 52.000000 141.909091 34.066667 35682.000000 \n", "\n", " AveOccup Latitude Longitude \n", "count 20640.000000 20640.000000 20640.000000 \n", "mean 3.070655 35.631861 -119.569704 \n", "std 10.386050 2.135952 2.003532 \n", "min 0.692308 32.540000 -124.350000 \n", "25% 2.429741 33.930000 -121.800000 \n", "50% 2.818116 34.260000 -118.490000 \n", "75% 3.282261 37.710000 -118.010000 \n", "max 1243.333333 41.950000 -114.310000 " ] }, "execution_count": 222, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.DataFrame(calif.data, columns=calif.feature_names).describe()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So what if we knew that a house was in a block that:\n", " - had a median income of 30,000 USD\n", " - had a median house age of 12 years\n", " - had an average of 2 rooms\n", " - had an average of 1 bedroom\n", " - had a population of 2,000 in the block\n", " - had an average occupancy of 2\n", " - was located at (33.93 N, -118.49 E)\n", " \n", "How much would our model predict it would cost?" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Scaled values: [[-0.45486585 -1.31911544 -1.33833304 -0.19511847 0.50413181 -0.19936405\n", " -0.7977993 0.54273159]]\n", "Predicted cost: [2.6131105]\n" ] }, { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAYAAAAD8CAYAAAB+UHOxAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAE8JJREFUeJzt3X+MXeV95/H3pw6QKMkWUibIaztr1HqzJZXqoFnCCmnFQgoGoppKjWTUTawIyV0JJKKttjX9hyYpEpG2IYqUILnFG+hm46L8EBZ4S70EFEUqP0ziEIzDMgveMLWFpzWQoKisIN/94z5uLzD23BmP78U875d0dc/5nuec8zxE8WfOj3tOqgpJUn9+adIdkCRNhgEgSZ0yACSpUwaAJHXKAJCkThkAktQpA0CSOmUASFKnDABJ6tQ7Jt2B4zn77LNr7dq1k+6GdGxPPTX4/uAHJ9sPachjjz3291U1tVC7t3QArF27lj179ky6G9KxXXzx4PvBByfZC+l1kvzfUdp5CkiSOmUASFKnDABJ6pQBIEmdMgAkqVMGgCR1ygCQpE4ZAJLUKQNAkjr1lv4lsBZv7dZ7J7LfA7dcNZH9Slq6kY8AkqxI8oMk97T5c5M8nOTpJH+V5PRWP6PNz7Tla4e2cWOrP5Xk8uUejCRpdIs5BXQDsH9o/vPArVW1DngBuLbVrwVeqKpfA25t7UhyHrAJ+BCwAfhKkhUn1n1J0lKNFABJVgNXAX/R5gNcAnyjNbkDuLpNb2zztOWXtvYbgR1V9UpVPQvMABcsxyAkSYs36hHAF4E/BH7R5n8FeLGqXm3zs8CqNr0KeA6gLX+ptf+n+jzrSJLGbMEASPIx4HBVPTZcnqdpLbDseOsM729Lkj1J9szNzS3UPUnSEo1yBHAR8NtJDgA7GJz6+SJwZpKjdxGtBg626VlgDUBb/svAkeH6POv8k6raVlXTVTU9NbXg+wwkSUu0YABU1Y1Vtbqq1jK4iPudqvo94AHgd1uzzcDdbXpnm6ct/05VVatvancJnQusAx5ZtpFIkhblRH4H8EfAjiR/CvwAuL3Vbwf+MskMg7/8NwFU1b4kdwFPAq8C11XVayewf0nSCVhUAFTVg8CDbfoZ5rmLp6r+Efj4Mda/Gbh5sZ2UJC0/HwUhSZ0yACSpUwaAJHXKAJCkThkAktQpA0CSOuX7ALQsfA+BdOrxCECSOmUASFKnDABJ6pQBIEmdMgAkqVMGgCR1ygCQpE75O4CTYFL3xEvSYngEIEmdGuWl8O9M8kiSHybZl+Qzrf7VJM8m2ds+61s9Sb6UZCbJ40nOH9rW5iRPt8/mY+1TknTyjXIK6BXgkqp6OclpwPeS/M+27L9U1Tfe0P4KBu/7XQd8BLgN+EiS9wE3AdNAAY8l2VlVLyzHQCRJizPKS+Grql5us6e1Tx1nlY3AnW29h4Azk6wELgd2V9WR9o/+bmDDiXVfkrRUI10DSLIiyV7gMIN/xB9ui25up3luTXJGq60CnhtafbbVjlWXJE3ASAFQVa9V1XpgNXBBkt8AbgT+DfBvgfcBf9SaZ75NHKf+Okm2JNmTZM/c3Nwo3ZMkLcGi7gKqqheBB4ENVXWoneZ5BfhvwAWt2SywZmi11cDB49TfuI9tVTVdVdNTU1OL6Z4kaRFGuQtoKsmZbfpdwEeBH7fz+iQJcDXwRFtlJ/DJdjfQhcBLVXUIuA+4LMlZSc4CLms1SdIEjHIX0ErgjiQrGATGXVV1T5LvJJlicGpnL/CfWvtdwJXADPBz4FMAVXUkyeeAR1u7z1bVkeUbiiRpMRYMgKp6HPjwPPVLjtG+gOuOsWw7sH2RfZQknQT+EliSOmUASFKnDABJ6pQBIEmdMgAkqVMGgCR1ygCQpE4ZAJLUKQNAkjplAEhSpwwASeqUASBJnTIAJKlTBoAkdcoAkKROGQCS1CkDQJI6Nco7gd+Z5JEkP0yyL8lnWv3cJA8neTrJXyU5vdXPaPMzbfnaoW3d2OpPJbn8ZA1KkrSwUY4AXgEuqarfBNYDG9rL3j8P3FpV64AXgGtb+2uBF6rq14BbWzuSnAdsAj4EbAC+0t4zLEmagAUDoAZebrOntU8BlwDfaPU7gKvb9MY2T1t+aZK0+o6qeqWqnmXw0vgLlmUUkqRFG+kaQJIVSfYCh4HdwP8BXqyqV1uTWWBVm14FPAfQlr8E/MpwfZ51hve1JcmeJHvm5uYWPyJJ0khGCoCqeq2q1gOrGfzV/uvzNWvfOcayY9XfuK9tVTVdVdNTU1OjdE+StASLuguoql4EHgQuBM5M8o62aDVwsE3PAmsA2vJfBo4M1+dZR5I0ZqPcBTSV5Mw2/S7go8B+4AHgd1uzzcDdbXpnm6ct/05VVatvancJnQusAx5ZroFIkhbnHQs3YSVwR7tj55eAu6rqniRPAjuS/CnwA+D21v524C+TzDD4y38TQFXtS3IX8CTwKnBdVb22vMORJI1qwQCoqseBD89Tf4Z57uKpqn8EPn6Mbd0M3Lz4bkqSlpu/BJakThkAktQpA0CSOmUASFKnRrkLSHrLWrv13ont+8AtV01s39Jy8AhAkjplAEhSpwwASeqUASBJnTIAJKlTBoAkdcoAkKROGQCS1CkDQJI6ZQBIUqcMAEnq1CivhFyT5IEk+5PsS3JDq/9Jkr9Lsrd9rhxa58YkM0meSnL5UH1Dq80k2XpyhiRJGsUoD4N7FfiDqvp+kvcCjyXZ3ZbdWlX/dbhxkvMYvAbyQ8C/BP5Xkn/dFn8Z+C0GL4h/NMnOqnpyOQYiSVqcUV4JeQg41KZ/lmQ/sOo4q2wEdlTVK8Cz7d3AR18dOdNeJUmSHa2tASBJE7CoawBJ1jJ4P/DDrXR9kseTbE9yVqutAp4bWm221Y5Vf+M+tiTZk2TP3NzcYronSVqEkQMgyXuAbwKfrqqfArcBvwqsZ3CE8GdHm86zeh2n/vpC1baqmq6q6ampqVG7J0lapJFeCJPkNAb/+H+tqr4FUFXPDy3/c+CeNjsLrBlafTVwsE0fqy5JGrNR7gIKcDuwv6q+MFRfOdTsd4An2vROYFOSM5KcC6wDHgEeBdYlOTfJ6QwuFO9cnmFIkhZrlCOAi4BPAD9KsrfV/hi4Jsl6BqdxDgC/D1BV+5LcxeDi7qvAdVX1GkCS64H7gBXA9qrat4xjkSQtwih3AX2P+c/f7zrOOjcDN89T33W89SRJ4+MvgSWpUwaAJHXKAJCkThkAktQpA0CSOmUASFKnDABJ6pQBIEmdMgAkqVMGgCR1ygCQpE4ZAJLUKQNAkjplAEhSpwwASeqUASBJnRrllZBrkjyQZH+SfUluaPX3Jdmd5On2fVarJ8mXkswkeTzJ+UPb2tzaP51k88kbliRpIaMcAbwK/EFV/TpwIXBdkvOArcD9VbUOuL/NA1zB4D3A64AtwG0wCAzgJuAjwAXATUdDQ5I0fgsGQFUdqqrvt+mfAfuBVcBG4I7W7A7g6ja9EbizBh4CzmwvkL8c2F1VR6rqBWA3sGFZRyNJGtmirgEkWQ
t8GHgYOKeqDsEgJID3t2argOeGVptttWPVJUkTMHIAJHkP8E3g01X10+M1nadWx6m/cT9bkuxJsmdubm7U7kmSFmmkAEhyGoN//L9WVd9q5efbqR3a9+FWnwXWDK2+Gjh4nPrrVNW2qpququmpqanFjEWStAij3AUU4HZgf1V9YWjRTuDonTybgbuH6p9sdwNdCLzUThHdB1yW5Kx28feyVpMkTcA7RmhzEfAJ4EdJ9rbaHwO3AHcluRb4CfDxtmwXcCUwA/wc+BRAVR1J8jng0dbus1V1ZFlGIUlatAUDoKq+x/zn7wEunad9AdcdY1vbge2L6aAk6eTwl8CS1CkDQJI6ZQBIUqcMAEnqlAEgSZ0yACSpU6P8DkDSPNZuvZcdz/wDAJu23ju2/R645aqx7Utvbx4BSFKnDABJ6pQBIEmdMgAkqVMGgCR1ygCQpE4ZAJLUKQNAkjplAEhSp0Z5JeT2JIeTPDFU+5Mkf5dkb/tcObTsxiQzSZ5KcvlQfUOrzSTZuvxDkSQtxihHAF8FNsxTv7Wq1rfPLoAk5wGbgA+1db6SZEWSFcCXgSuA84BrWltJ0oSM8krI7yZZO+L2NgI7quoV4NkkM8AFbdlMVT0DkGRHa/vkonssSVoWJ3IN4Pokj7dTRGe12irguaE2s612rLokaUKWGgC3Ab8KrAcOAX/W6vO9PL6OU3+TJFuS7EmyZ25ubondkyQtZEkBUFXPV9VrVfUL4M/559M8s8CaoaargYPHqc+37W1VNV1V01NTU0vpniRpBEsKgCQrh2Z/Bzh6h9BOYFOSM5KcC6wDHgEeBdYlOTfJ6QwuFO9cerclSSdqwYvASb4OXAycnWQWuAm4OMl6BqdxDgC/D1BV+5LcxeDi7qvAdVX1WtvO9cB9wApge1XtW/bRSJJGNspdQNfMU779OO1vBm6ep74L2LWo3kmSThp/CSxJnTIAJKlTBoAkdcoAkKROGQCS1CkDQJI6ZQBIUqcMAEnq1II/BJP01rJ2670T2/eBW66a2L4nZVL/vcfx39ojAEnqlAEgSZ0yACSpUwaAJHXKAJCkThkAktQpA0CSOmUASFKnFgyAJNuTHE7yxFDtfUl2J3m6fZ/V6knypSQzSR5Pcv7QOptb+6eTbD45w5EkjWqUI4CvAhveUNsK3F9V64D72zzAFQxeBL8O2ALcBoPAYPAu4Y8AFwA3HQ0NSdJkLBgAVfVd4MgbyhuBO9r0HcDVQ/U7a+Ah4MwkK4HLgd1VdaSqXgB28+ZQkSSN0VKvAZxTVYcA2vf7W30V8NxQu9lWO1b9TZJsSbInyZ65ubkldk+StJDlvgiceWp1nPqbi1Xbqmq6qqanpqaWtXOSpH+21AB4vp3aoX0fbvVZYM1Qu9XAwePUJUkTstQA2AkcvZNnM3D3UP2T7W6gC4GX2imi+4DLkpzVLv5e1mqSpAlZ8H0ASb4OXAycnWSWwd08twB3JbkW+Anw8dZ8F3AlMAP8HPgUQFUdSfI54NHW7rNV9cYLy5KkMVowAKrqmmMsunSetgVcd4ztbAe2L6p3kqST5m39RrBJvjlJkt7qfBSEJHXKAJCkThkAktQpA0CSOmUASFKnDABJ6tTb+jZQSctrUrdWH7jlqons9+3OIwBJ6pQBIEmdMgAkqVMGgCR1ygCQpE4ZAJLUKQNAkjplAEhSp04oAJIcSPKjJHuT7Gm19yXZneTp9n1WqyfJl5LMJHk8yfnLMQBJ0tIsxxHAf6iq9VU13ea3AvdX1Trg/jYPcAWwrn22ALctw74lSUt0Mk4BbQTuaNN3AFcP1e+sgYeAM5OsPAn7lySN4EQDoIC/SfJYki2tdk5VHQJo3+9v9VXAc0PrzraaJGkCTvRhcBdV1cEk7wd2J/nxcdpmnlq9qdEgSLYAfOADHzjB7kmSjuWEAqCqDrbvw0m+DVwAPJ9kZVUdaqd4Drfms8CaodVXAwfn2eY2YBvA9PT0mwJCUn8m9RTSt7slnwJK8u4k7z06DVwGPAHsBDa3ZpuBu9v0TuCT7W6gC4GXjp4qkiSN34kcAZwDfDvJ0e38j6r66ySPAncluRb4CfDx1n4XcCUwA/wc+NQJ7FuSdIKWHABV9Qzwm/PU/wG4dJ56AdctdX+SpOXlL4ElqVMGgCR1ygCQpE4ZAJLUKQNAkjplAEhSpwwASeqUASBJnTIAJKlTBoAkdcoAkKROGQCS1CkDQJI6ZQBIUqcMAEnqlAEgSZ0aewAk2ZDkqSQzSbaOe/+SpIGxBkCSFcCXgSuA84Brkpw3zj5IkgbGfQRwATBTVc9U1f8DdgAbx9wHSRLjD4BVwHND87OtJkkasyW/FH6JMk+tXtcg2QJsabMvJ3lqnnXOBv5+mft2qnDsbyH/7ujE5z92snf1lhv7mPQ6bvL5Exr7vxql0bgDYBZYMzS/Gjg43KCqtgHbjreRJHuqanr5u/fW59gde096HTeMZ+zjPgX0KLAuyblJTgc2ATvH3AdJEmM+AqiqV5NcD9wHrAC2V9W+cfZBkjQw7lNAVNUuYNcJbua4p4je5hx7n3ode6/jhjGMPVW1cCtJ0tuOj4KQpE6dcgHQ66MkkmxPcjjJE5PuyzglWZPkgST7k+xLcsOk+zQuSd6Z5JEkP2xj/8yk+zRuSVYk+UGSeybdl3FKciDJj5LsTbLnpO3nVDoF1B4l8b+B32JwS+mjwDVV9eREOzYGSf498DJwZ1X9xqT7My5JVgIrq+r7Sd4LPAZc3cn/5gHeXVUvJzkN+B5wQ1U9NOGujU2S/wxMA/+iqk76jy3eKpIcAKar6qT+BuJUOwLo9lESVfVd4Mik+zFuVXWoqr7fpn8G7KeTX4/XwMtt9rT2OXX+YjtBSVYDVwF/Mem+vF2dagHgoyQ6lmQt8GHg4cn2ZHzaKZC9wGFgd1V1M3bgi8AfAr+YdEcmoIC/SfJYezrCSXGqBcCCj5LQ21OS9wDfBD5dVT+ddH/Gpapeq6r1DH41f0GSLk7/JfkYcLiqHpt0Xybkoqo6n8GTk69rp4CX3akWAAs+SkJvP+389zeBr1XVtybdn0moqheBB4ENE+7KuFwE/HY7F74DuCTJf59sl8anqg6278PAtxmc/l52p1oA+CiJzrQLobcD+6vqC5PuzzglmUpyZpt+F/BR4MeT7dV4VNWNVbW6qtYy+P/5d6rqP064W2OR5N3thgeSvBu4DDgpd/+dUgFQVa8CRx8lsR+4q5dHSST5OvC3wAeTzCa5dtJ9GpOLgE8w+Atwb/tcOelOjclK4IEkjzP442d3VXV1O2SnzgG+l+SHwCPAvVX11ydjR6fUbaCSpOVzSh0BSJKWjwEgSZ0yACSpUwaAJHXKAJCkThkAktQpA0CSOmUASFKn/j/EQq+Fs/Dz2wAAAABJRU5ErkJggg==\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "house_blocks = [[\n", " 3.,\n", " 12,\n", " 2,\n", " 1,\n", " 2000,\n", " 2.,\n", " 33.93,\n", " -118.49\n", "]]\n", "house_blocks_scaled = scaler.transform(house_blocks)\n", "print(\"Scaled values: \", house_blocks_scaled)\n", "\n", "predicted_costs = ann.predict(house_blocks_scaled)\n", "print(\"Predicted cost: \", predicted_costs)\n", "\n", "plt.hist(calif.target)\n", "plt.axvline(x=predicted_costs, c='red')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "# Backpropagation\n", "\n", "The heart of neural networks is [backpropagation algorithms](https://en.wikipedia.org/wiki/Backpropagation), which are an efficient way to change the weights and biases in the ANN based on the size of the loss.\n", "\n", "In effect, an ANN is trained by:\n", "1. Setting all weights and biases randomly.\n", "2. For each row in the test data:\n", " 1. Set the input units to the input features.\n", " 2. Use unit weights and biases, passing through the activation function, to calculate the output value of each unit -- right through to the output units.\n", " 3. Use a loss function to compare the output units with the expected output.\n", " 4. Use a backpropagation algorithm to update all the weights and biases to reduce the loss.\n", "\n", "Google has a [nice visual explanation](https://google-developers.appspot.com/machine-learning/crash-course/backprop-scroll/) of backpropagation. [More detailed explanations](http://neuralnetworksanddeeplearning.com/chap2.html) are also available.\n", " \n", "## When backpropagation goes wrong\n", "\n", "* Vanishing gradients: when weights for lower levels (closer to the input) become very small, gradients become very small too, making it hard or impossible to train these layers. The ReLU activation function can help prevent vanishing gradients.\n", "\n", "* Exploding gradients: when weights become very large, the gradients for lower layers can become very large, making it hard for these gradients to converge. Batch normalization can help prevent exploding gradients, as can lowering the learning rate.\n", "\n", "* Dead ReLU units: once the weighted sum for a ReLU activation function falls below 0, the ReLU unit can get stuck -- without an output, it doesn't contribute to the network output, and gradients can't flow through it in backpropagation. Lowering the learning rate can help keep ReLU units from dying.\n", "\n", "* Dropout regularization: in this form of regularization, a proportion of unit activations are randomly dropped out. This prevents overfitting and so helps create a better model." 
] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "## ANN for classification: what sort of forest is this?\n", "Let's jump in with a dataset called [Covertype](https://archive.ics.uci.edu/ml/datasets/Covertype), where we try to predict forest cover type based on a number of features of a 30x30m area of forest as follows:\n", "\n", "| Column | Feature | Units | Description | How measured |\n", "|---|--------|-------|-------------|--------------|\n", "| 1 | Aspect | degrees azimuth | Aspect in degrees azimuth | Quantitative |\n", "| 2 | Slope | degrees | Slope in degrees | Quantitative |\n", "| 3 | Horizontal_Distance_To_Hydrology | meters | Horz Dist to nearest surface water features | Quantitative |\n", "| 4 | Vertical_Distance_To_Hydrology | meters | Vert Dist to nearest surface water features | Quantitative |\n", "| 5 | Horizontal_Distance_To_Roadways | meters | Horz Dist to nearest roadway | Quantitative |\n", "| 6 | Hillshade_9am | 0 to 255 index | Hillshade index at 9am, summer solstice | Quantitative |\n", "| 7 | Hillshade_Noon | 0 to 255 index | Hillshade index at noon, summer soltice | Quantitative |\n", "| 8 | Hillshade_3pm | 0 to 255 index | Hillshade index at 3pm, summer solstice | Quantitative |\n", "| 9 | Horizontal_Distance_To_Fire_Points | meters | Horz Dist to nearest wildfire ignition points | Quantitative |\n", "| 10-14 | Wilderness_Area | 4 binary columns with 0 (absence) or 1 (presence) | Which wilderness area this plot is in | Qualitative |\n", "| 14-54 | Soil_Type | 40 binary columns with 0 (absence) or 1 (presence) | Soil Type designation | Qualitative |" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using this information, we are trying to classify each 30x30m plot as one of seven forest types.\n", "\n", "This dataset is built into Scikit, so we can use it to download and load the dataset for use." ] }, { "cell_type": "code", "execution_count": 243, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Help on function fetch_covtype in module sklearn.datasets.covtype:\n", "\n", "fetch_covtype(data_home=None, download_if_missing=True, random_state=None, shuffle=False, return_X_y=False)\n", " Load the covertype dataset (classification).\n", " \n", " Download it if necessary.\n", " \n", " ================= ============\n", " Classes 7\n", " Samples total 581012\n", " Dimensionality 54\n", " Features int\n", " ================= ============\n", " \n", " Read more in the :ref:`User Guide `.\n", " \n", " Parameters\n", " ----------\n", " data_home : string, optional\n", " Specify another download and cache folder for the datasets. By default\n", " all scikit-learn data is stored in '~/scikit_learn_data' subfolders.\n", " \n", " download_if_missing : boolean, default=True\n", " If False, raise a IOError if the data is not locally available\n", " instead of trying to download the data from the source site.\n", " \n", " random_state : int, RandomState instance or None (default)\n", " Determines random number generation for dataset shuffling. Pass an int\n", " for reproducible output across multiple function calls.\n", " See :term:`Glossary `.\n", " \n", " shuffle : bool, default=False\n", " Whether to shuffle dataset.\n", " \n", " return_X_y : boolean, default=False.\n", " If True, returns ``(data.data, data.target)`` instead of a Bunch\n", " object.\n", " \n", " .. 
versionadded:: 0.20\n", " \n", " Returns\n", " -------\n", " dataset : dict-like object with the following attributes:\n", " \n", " dataset.data : numpy array of shape (581012, 54)\n", " Each row corresponds to the 54 features in the dataset.\n", " \n", " dataset.target : numpy array of shape (581012,)\n", " Each value corresponds to one of the 7 forest covertypes with values\n", " ranging between 1 to 7.\n", " \n", " dataset.DESCR : string\n", " Description of the forest covertype dataset.\n", " \n", " (data, target) : tuple if ``return_X_y`` is True\n", " \n", " .. versionadded:: 0.20\n", "\n" ] } ], "source": [ "from sklearn import datasets\n", "help(datasets.fetch_covtype)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So we don't need to provide any arguments, but it warns us that it will need to download this dataset. It also describes the the returned dataset object will have the following properties:\n", "- .data: a numpy array with the features.\n", "- .target: a numpy array with the target labels. Note that each plot is classified into only one of these values.\n", "- .DESCR: describe this forest covertype." ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
0123456789...44454647484950515253
03220.0336.09.0648.076.06220.0199.0227.0167.01323.0...0.00.00.00.00.00.00.00.00.00.0
13054.0322.011.0285.063.02609.0192.0229.0177.01932.0...0.01.00.00.00.00.00.00.00.00.0
22746.031.028.0268.016.01986.0200.0168.090.02058.0...0.00.00.00.00.00.00.00.00.00.0
33195.092.020.0313.033.01206.0246.0205.079.02400.0...0.00.00.00.00.00.00.00.00.00.0
43108.0177.020.0722.0107.01661.0226.0247.0145.0624.0...0.00.00.00.00.00.00.00.00.00.0
53247.0342.010.060.015.0663.0199.0225.0164.01235.0...0.00.00.00.00.00.00.00.00.00.0
63237.017.020.0600.0161.0612.0199.0194.0125.01967.0...0.00.01.00.00.00.00.00.00.00.0
73387.0346.011.0874.0116.01265.0199.0223.0162.01789.0...0.00.00.00.00.00.00.01.00.00.0
82884.031.06.0361.061.02512.0219.0228.0145.02142.0...0.00.00.00.00.00.00.00.00.00.0
93258.0291.017.0360.074.0300.0171.0235.0204.01544.0...0.00.00.00.00.00.00.00.00.00.0
102658.019.040.0342.0192.0808.0161.0128.077.01288.0...0.00.00.00.00.00.00.00.00.00.0
113265.0339.023.0190.077.01661.0164.0200.0170.02644.0...0.00.01.00.00.00.00.00.00.00.0
123013.0280.024.067.017.0525.0146.0234.0224.02305.0...0.00.00.00.00.00.00.00.00.00.0
133024.060.09.030.0-2.06382.0227.0221.0127.03141.0...0.00.00.00.00.00.00.00.00.00.0
142848.0316.015.060.016.02505.0180.0226.0186.02227.0...0.00.00.00.00.00.00.00.00.00.0
152963.055.020.0201.045.0577.0228.0194.091.01779.0...0.00.01.00.00.00.00.00.00.00.0
162902.012.021.0212.050.03841.0193.0192.0130.02708.0...0.00.00.00.00.00.00.00.00.00.0
173043.0162.019.0395.0-23.01584.0235.0241.0128.01262.0...0.00.00.00.00.00.00.00.00.00.0
183146.0134.017.0124.019.02121.0244.0230.0109.03160.0...0.00.01.00.00.00.00.00.00.00.0
192410.0124.036.0247.0138.01172.0252.0187.028.0785.0...0.00.00.00.00.00.00.00.00.00.0
203114.028.010.060.00.05160.0216.0217.0136.02255.0...0.00.00.00.00.00.00.00.00.00.0
212509.0306.09.0391.084.030.0196.0235.0179.01698.0...0.00.00.00.00.00.00.00.00.00.0
222889.0319.014.0242.023.03408.0183.0226.0183.02868.0...0.00.00.00.00.00.00.00.00.00.0
233122.0344.07.0474.0-55.0824.0206.0229.0162.01074.0...0.00.00.00.00.00.00.00.00.00.0
242816.046.016.0247.036.01325.0223.0202.0108.02557.0...0.00.01.00.00.00.00.00.00.00.0
253167.0280.07.0570.059.05708.0201.0241.0179.01273.0...0.00.00.00.00.00.00.00.00.00.0
262982.021.010.030.01.03671.0214.0219.0141.05117.0...0.00.00.00.00.00.00.00.00.00.0
273154.0163.024.0268.015.01824.0235.0238.0120.02058.0...0.00.01.00.00.00.00.00.00.00.0
282977.038.08.01020.0179.02689.0220.0223.0138.0824.0...1.00.00.00.00.00.00.00.00.00.0
292636.0344.012.0153.021.0750.0195.0220.0164.01572.0...0.00.00.00.00.00.00.00.00.00.0
..................................................................
5809822482.0235.035.0120.017.0552.0138.0244.0224.01064.0...0.00.00.00.00.00.00.00.00.00.0
5809832890.096.014.0488.037.01694.0242.0218.0103.02753.0...1.00.00.00.00.00.00.00.00.00.0
5809843085.00.02.0319.031.03089.0216.0234.0156.01486.0...0.01.00.00.00.00.00.00.00.00.0
5809853283.015.019.0330.013.0916.0199.0196.0129.02431.0...0.00.00.00.00.00.00.01.00.00.0
5809862635.065.07.030.06.0977.0227.0226.0134.01883.0...0.00.00.00.00.00.00.00.00.00.0
5809873012.029.026.0488.0-15.02432.0202.0175.098.02758.0...1.00.00.00.00.00.00.00.00.00.0
5809882573.0201.027.0120.0-63.0638.0198.0252.0174.0953.0...0.00.00.00.00.00.00.00.00.00.0
5809893117.0187.014.0384.026.02106.0221.0249.0158.0842.0...0.01.00.00.00.00.00.00.00.00.0
5809903097.0319.013.0216.060.0636.0186.0227.0181.01012.0...0.00.00.00.00.00.00.00.00.00.0
5809912880.0102.015.085.020.05057.0244.0219.099.04273.0...0.00.00.00.00.00.00.00.00.00.0
5809923290.0215.015.0631.0129.01084.0205.0253.0181.01745.0...0.00.00.00.00.00.00.00.00.00.0
5809933223.0343.06.0671.079.06106.0207.0230.0162.01291.0...0.00.00.00.00.00.00.00.00.00.0
5809943398.0242.010.0309.071.01846.0200.0249.0186.02134.0...0.01.00.00.00.00.00.00.00.00.0
5809953101.0248.08.0313.047.0641.0204.0246.0180.01034.0...0.01.00.00.00.00.00.00.00.00.0
5809962607.0274.012.0362.0132.0560.0189.0243.0193.0488.0...0.00.00.00.00.00.00.00.00.00.0
5809973270.07.012.0631.0-3.0618.0205.0215.0147.02991.0...0.00.00.00.00.00.00.00.00.00.0
5809983236.0340.026.0417.094.03812.0156.0192.0169.02056.0...1.00.00.00.00.00.00.00.00.00.0
5809993026.097.025.085.0-11.0612.0251.0195.058.0782.0...0.00.00.00.00.00.00.00.00.00.0
5810003196.0265.08.0362.049.02463.0199.0244.0184.0379.0...0.01.00.00.00.00.00.00.00.00.0
5810013063.0183.012.0150.022.0708.0223.0248.0156.01465.0...0.00.00.00.00.00.00.00.00.00.0
5810023074.0325.08.042.03.02022.0202.0232.0170.01266.0...0.00.00.00.00.00.00.00.00.00.0
5810033348.0337.019.0785.056.02274.0175.0210.0172.01008.0...0.01.00.00.00.00.00.00.00.00.0
5810043096.0275.010.0190.030.05667.0192.0243.0190.02279.0...0.00.00.00.00.00.00.00.00.00.0
5810052936.0255.013.0400.072.01794.0190.0248.0195.01084.0...1.00.00.00.00.00.00.00.00.00.0
5810062923.074.012.0170.013.02248.0234.0217.0114.02320.0...0.00.00.00.00.00.00.00.00.00.0
5810073142.092.017.0285.0-6.06134.0245.0211.089.01158.0...0.00.00.00.00.00.00.00.00.00.0
5810082576.0145.010.042.01.01897.0234.0239.0136.01958.0...0.00.00.00.00.00.00.00.00.00.0
5810093137.0236.03.0309.025.03435.0213.0242.0166.01008.0...0.00.00.00.00.00.00.00.00.00.0
5810103042.0284.04.0426.046.01127.0209.0240.0169.01758.0...0.01.00.00.00.00.00.00.00.00.0
5810113391.0205.015.0524.090.02873.0210.0253.0174.01371.0...0.01.00.00.00.00.00.00.00.00.0
\n", "

581012 rows × 54 columns

\n", "
" ], "text/plain": [ " 0 1 2 3 4 5 6 7 8 \\\n", "0 3220.0 336.0 9.0 648.0 76.0 6220.0 199.0 227.0 167.0 \n", "1 3054.0 322.0 11.0 285.0 63.0 2609.0 192.0 229.0 177.0 \n", "2 2746.0 31.0 28.0 268.0 16.0 1986.0 200.0 168.0 90.0 \n", "3 3195.0 92.0 20.0 313.0 33.0 1206.0 246.0 205.0 79.0 \n", "4 3108.0 177.0 20.0 722.0 107.0 1661.0 226.0 247.0 145.0 \n", "5 3247.0 342.0 10.0 60.0 15.0 663.0 199.0 225.0 164.0 \n", "6 3237.0 17.0 20.0 600.0 161.0 612.0 199.0 194.0 125.0 \n", "7 3387.0 346.0 11.0 874.0 116.0 1265.0 199.0 223.0 162.0 \n", "8 2884.0 31.0 6.0 361.0 61.0 2512.0 219.0 228.0 145.0 \n", "9 3258.0 291.0 17.0 360.0 74.0 300.0 171.0 235.0 204.0 \n", "10 2658.0 19.0 40.0 342.0 192.0 808.0 161.0 128.0 77.0 \n", "11 3265.0 339.0 23.0 190.0 77.0 1661.0 164.0 200.0 170.0 \n", "12 3013.0 280.0 24.0 67.0 17.0 525.0 146.0 234.0 224.0 \n", "13 3024.0 60.0 9.0 30.0 -2.0 6382.0 227.0 221.0 127.0 \n", "14 2848.0 316.0 15.0 60.0 16.0 2505.0 180.0 226.0 186.0 \n", "15 2963.0 55.0 20.0 201.0 45.0 577.0 228.0 194.0 91.0 \n", "16 2902.0 12.0 21.0 212.0 50.0 3841.0 193.0 192.0 130.0 \n", "17 3043.0 162.0 19.0 395.0 -23.0 1584.0 235.0 241.0 128.0 \n", "18 3146.0 134.0 17.0 124.0 19.0 2121.0 244.0 230.0 109.0 \n", "19 2410.0 124.0 36.0 247.0 138.0 1172.0 252.0 187.0 28.0 \n", "20 3114.0 28.0 10.0 60.0 0.0 5160.0 216.0 217.0 136.0 \n", "21 2509.0 306.0 9.0 391.0 84.0 30.0 196.0 235.0 179.0 \n", "22 2889.0 319.0 14.0 242.0 23.0 3408.0 183.0 226.0 183.0 \n", "23 3122.0 344.0 7.0 474.0 -55.0 824.0 206.0 229.0 162.0 \n", "24 2816.0 46.0 16.0 247.0 36.0 1325.0 223.0 202.0 108.0 \n", "25 3167.0 280.0 7.0 570.0 59.0 5708.0 201.0 241.0 179.0 \n", "26 2982.0 21.0 10.0 30.0 1.0 3671.0 214.0 219.0 141.0 \n", "27 3154.0 163.0 24.0 268.0 15.0 1824.0 235.0 238.0 120.0 \n", "28 2977.0 38.0 8.0 1020.0 179.0 2689.0 220.0 223.0 138.0 \n", "29 2636.0 344.0 12.0 153.0 21.0 750.0 195.0 220.0 164.0 \n", "... ... ... ... ... ... ... ... ... ... 
\n", "580982 2482.0 235.0 35.0 120.0 17.0 552.0 138.0 244.0 224.0 \n", "580983 2890.0 96.0 14.0 488.0 37.0 1694.0 242.0 218.0 103.0 \n", "580984 3085.0 0.0 2.0 319.0 31.0 3089.0 216.0 234.0 156.0 \n", "580985 3283.0 15.0 19.0 330.0 13.0 916.0 199.0 196.0 129.0 \n", "580986 2635.0 65.0 7.0 30.0 6.0 977.0 227.0 226.0 134.0 \n", "580987 3012.0 29.0 26.0 488.0 -15.0 2432.0 202.0 175.0 98.0 \n", "580988 2573.0 201.0 27.0 120.0 -63.0 638.0 198.0 252.0 174.0 \n", "580989 3117.0 187.0 14.0 384.0 26.0 2106.0 221.0 249.0 158.0 \n", "580990 3097.0 319.0 13.0 216.0 60.0 636.0 186.0 227.0 181.0 \n", "580991 2880.0 102.0 15.0 85.0 20.0 5057.0 244.0 219.0 99.0 \n", "580992 3290.0 215.0 15.0 631.0 129.0 1084.0 205.0 253.0 181.0 \n", "580993 3223.0 343.0 6.0 671.0 79.0 6106.0 207.0 230.0 162.0 \n", "580994 3398.0 242.0 10.0 309.0 71.0 1846.0 200.0 249.0 186.0 \n", "580995 3101.0 248.0 8.0 313.0 47.0 641.0 204.0 246.0 180.0 \n", "580996 2607.0 274.0 12.0 362.0 132.0 560.0 189.0 243.0 193.0 \n", "580997 3270.0 7.0 12.0 631.0 -3.0 618.0 205.0 215.0 147.0 \n", "580998 3236.0 340.0 26.0 417.0 94.0 3812.0 156.0 192.0 169.0 \n", "580999 3026.0 97.0 25.0 85.0 -11.0 612.0 251.0 195.0 58.0 \n", "581000 3196.0 265.0 8.0 362.0 49.0 2463.0 199.0 244.0 184.0 \n", "581001 3063.0 183.0 12.0 150.0 22.0 708.0 223.0 248.0 156.0 \n", "581002 3074.0 325.0 8.0 42.0 3.0 2022.0 202.0 232.0 170.0 \n", "581003 3348.0 337.0 19.0 785.0 56.0 2274.0 175.0 210.0 172.0 \n", "581004 3096.0 275.0 10.0 190.0 30.0 5667.0 192.0 243.0 190.0 \n", "581005 2936.0 255.0 13.0 400.0 72.0 1794.0 190.0 248.0 195.0 \n", "581006 2923.0 74.0 12.0 170.0 13.0 2248.0 234.0 217.0 114.0 \n", "581007 3142.0 92.0 17.0 285.0 -6.0 6134.0 245.0 211.0 89.0 \n", "581008 2576.0 145.0 10.0 42.0 1.0 1897.0 234.0 239.0 136.0 \n", "581009 3137.0 236.0 3.0 309.0 25.0 3435.0 213.0 242.0 166.0 \n", "581010 3042.0 284.0 4.0 426.0 46.0 1127.0 209.0 240.0 169.0 \n", "581011 3391.0 205.0 15.0 524.0 90.0 2873.0 210.0 253.0 174.0 \n", "\n", " 9 ... 44 45 46 47 48 49 50 51 52 53 \n", "0 1323.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "1 1932.0 ... 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "2 2058.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "3 2400.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "4 624.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "5 1235.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "6 1967.0 ... 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "7 1789.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 \n", "8 2142.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "9 1544.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "10 1288.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "11 2644.0 ... 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "12 2305.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "13 3141.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "14 2227.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "15 1779.0 ... 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "16 2708.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "17 1262.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "18 3160.0 ... 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "19 785.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "20 2255.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "21 1698.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "22 2868.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "23 1074.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "24 2557.0 ... 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "25 1273.0 ... 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "26 5117.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "27 2058.0 ... 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "28 824.0 ... 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "29 1572.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "... ... ... ... ... ... ... ... ... ... ... ... ... \n", "580982 1064.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "580983 2753.0 ... 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "580984 1486.0 ... 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "580985 2431.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 \n", "580986 1883.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "580987 2758.0 ... 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "580988 953.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "580989 842.0 ... 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "580990 1012.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "580991 4273.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "580992 1745.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "580993 1291.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "580994 2134.0 ... 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "580995 1034.0 ... 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "580996 488.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "580997 2991.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "580998 2056.0 ... 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "580999 782.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "581000 379.0 ... 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "581001 1465.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "581002 1266.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "581003 1008.0 ... 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "581004 2279.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "581005 1084.0 ... 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "581006 2320.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "581007 1158.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "581008 1958.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "581009 1008.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "581010 1758.0 ... 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "581011 1371.0 ... 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n", "\n", "[581012 rows x 54 columns]" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Let's get the data pre-shuffled.\n", "covtype = datasets.fetch_covtype(shuffle=True)\n", "covtypedf = pd.DataFrame(covtype.data)\n", "covtypedf" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Target: [1 2 2 ... 2 2 1]\n", "Target shape: (581012,)\n" ] } ], "source": [ "print(\"Target: \", covtype.target)\n", "print(\"Target shape: \", covtype.target.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As before, we start by splitting our data into test and training data." 
] }, { "cell_type": "code", "execution_count": 37, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Train data shape: (435759, 54)\n", "Train label shape: (435759,)\n", "Test data shape: (145253, 54)\n", "Test label shape: (145253,)\n" ] } ], "source": [ "X_train, X_test, y_train, y_test = model_selection.train_test_split(\n", " covtypedf, # Input features (X)\n", " covtype.target, # Output features (y)\n", " test_size=0.25, # Put aside 25% of data for testing.\n", " shuffle=True # Shuffle inputs.\n", ")\n", "\n", "# Did we err?\n", "\n", "print(\"Train data shape: \", X_train.shape)\n", "print(\"Train label shape: \", y_train.shape)\n", "\n", "print(\"Test data shape: \", X_test.shape)\n", "print(\"Test label shape: \", y_test.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Our data is ready for processing. But remember that we have a variety of different input types: binary (0, 1), continuous in small ranges (0-255) and in large ranges (elevations). Before we process this data, we should normalize them into a standard range. We'll use a MinMaxScaler: it reduces the range of all data so they fit into the range 0 to 1." ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
0123456789...44454647484950515253
count435759.000000435759.000000435759.000000435759.000000435759.000000435759.000000435759.000000435759.000000435759.000000435759.000000...435759.000000435759.000000435759.000000435759.000000435759.000000435759.000000435759.000000435759.000000435759.000000435759.000000
mean0.5505530.4320600.2136150.1927770.2769450.3300100.8353820.8792290.5609650.275952...0.0441530.0902430.0776320.0028460.0032720.0001840.0005030.0267350.0237450.015027
std0.1399330.3107390.1134390.1521290.0760920.2190650.1052730.0778300.1506860.184484...0.2054350.2865290.2675920.0532680.0571120.0135480.0224120.1613080.1522530.121659
min0.0000000.0000000.0000000.0000000.0000000.0000000.0000000.0000000.0000000.000000...0.0000000.0000000.0000000.0000000.0000000.0000000.0000000.0000000.0000000.000000
25%0.4757380.1611110.1363640.0773090.2255540.1549810.7795280.8385830.4685040.142479...0.0000000.0000000.0000000.0000000.0000000.0000000.0000000.0000000.0000000.000000
50%0.5687840.3527780.1969700.1560490.2555410.2801740.8582680.8897640.5629920.238394...0.0000000.0000000.0000000.0000000.0000000.0000000.0000000.0000000.0000000.000000
75%0.6523260.7222220.2727270.2748750.3063890.4674020.9094490.9330710.6614170.355500...0.0000000.0000000.0000000.0000000.0000000.0000000.0000000.0000000.0000000.000000
max1.0000001.0000001.0000001.0000001.0000001.0000001.0000001.0000001.0000001.000000...1.0000001.0000001.0000001.0000001.0000001.0000001.0000001.0000001.0000001.000000
\n", "

8 rows × 54 columns

\n", "
" ], "text/plain": [ " 0 1 2 3 \\\n", "count 435759.000000 435759.000000 435759.000000 435759.000000 \n", "mean 0.550553 0.432060 0.213615 0.192777 \n", "std 0.139933 0.310739 0.113439 0.152129 \n", "min 0.000000 0.000000 0.000000 0.000000 \n", "25% 0.475738 0.161111 0.136364 0.077309 \n", "50% 0.568784 0.352778 0.196970 0.156049 \n", "75% 0.652326 0.722222 0.272727 0.274875 \n", "max 1.000000 1.000000 1.000000 1.000000 \n", "\n", " 4 5 6 7 \\\n", "count 435759.000000 435759.000000 435759.000000 435759.000000 \n", "mean 0.276945 0.330010 0.835382 0.879229 \n", "std 0.076092 0.219065 0.105273 0.077830 \n", "min 0.000000 0.000000 0.000000 0.000000 \n", "25% 0.225554 0.154981 0.779528 0.838583 \n", "50% 0.255541 0.280174 0.858268 0.889764 \n", "75% 0.306389 0.467402 0.909449 0.933071 \n", "max 1.000000 1.000000 1.000000 1.000000 \n", "\n", " 8 9 ... 44 \\\n", "count 435759.000000 435759.000000 ... 435759.000000 \n", "mean 0.560965 0.275952 ... 0.044153 \n", "std 0.150686 0.184484 ... 0.205435 \n", "min 0.000000 0.000000 ... 0.000000 \n", "25% 0.468504 0.142479 ... 0.000000 \n", "50% 0.562992 0.238394 ... 0.000000 \n", "75% 0.661417 0.355500 ... 0.000000 \n", "max 1.000000 1.000000 ... 1.000000 \n", "\n", " 45 46 47 48 \\\n", "count 435759.000000 435759.000000 435759.000000 435759.000000 \n", "mean 0.090243 0.077632 0.002846 0.003272 \n", "std 0.286529 0.267592 0.053268 0.057112 \n", "min 0.000000 0.000000 0.000000 0.000000 \n", "25% 0.000000 0.000000 0.000000 0.000000 \n", "50% 0.000000 0.000000 0.000000 0.000000 \n", "75% 0.000000 0.000000 0.000000 0.000000 \n", "max 1.000000 1.000000 1.000000 1.000000 \n", "\n", " 49 50 51 52 \\\n", "count 435759.000000 435759.000000 435759.000000 435759.000000 \n", "mean 0.000184 0.000503 0.026735 0.023745 \n", "std 0.013548 0.022412 0.161308 0.152253 \n", "min 0.000000 0.000000 0.000000 0.000000 \n", "25% 0.000000 0.000000 0.000000 0.000000 \n", "50% 0.000000 0.000000 0.000000 0.000000 \n", "75% 0.000000 0.000000 0.000000 0.000000 \n", "max 1.000000 1.000000 1.000000 1.000000 \n", "\n", " 53 \n", "count 435759.000000 \n", "mean 0.015027 \n", "std 0.121659 \n", "min 0.000000 \n", "25% 0.000000 \n", "50% 0.000000 \n", "75% 0.000000 \n", "max 1.000000 \n", "\n", "[8 rows x 54 columns]" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn.preprocessing import MinMaxScaler\n", "\n", "scaler = MinMaxScaler()\n", "\n", "# Figure out how to scale all the input features in the training dataset.\n", "scaler.fit(X_train)\n", "X_train_scaled = scaler.transform(X_train)\n", "\n", "# Also tranform our validation and testing data in the same way.\n", "X_test_scaled = scaler.transform(X_test)\n", "\n", "# Did that work?\n", "pd.DataFrame(X_train_scaled).describe()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Classifying among multiple categories\n", "\n", "Having multiple output units usually would result in each unit being considered independently, allowing you to assign multiple labels for a particular input (for instance, a single image might be classified as containing both a cloud as well as a bird). 
However, we use scikit-learn's [MLPClassifier](https://scikit-learn.org/stable/modules/neural_networks_supervised.html#classification), which automatically uses *softmax* to treat labels as exclusive to each other.\n", "\n", "### The power of softmax\n", "\n", "[Softmax](https://developers.google.com/machine-learning/crash-course/multi-class-neural-networks/softmax) is an additional layer added just before the output units that ensures that the sum of the outputs of all units in the output layer is 100%, in proportion to their inputs. The output of each individual unit is therefore the probability that it is the category into which the input should be categorized.\n", "\n", "The result of this is that MLPClassifier can provide both a predicted label for a set of input features, as well as the probability that represents how certain the model is about this prediction." ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Iteration 1, loss = 0.74538627\n", "Validation score: 0.723334\n", "Iteration 2, loss = 0.62624445\n", "Validation score: 0.735428\n", "Iteration 3, loss = 0.60337486\n", "Validation score: 0.749082\n", "Iteration 4, loss = 0.58838067\n", "Validation score: 0.753373\n", "Iteration 5, loss = 0.57566493\n", "Validation score: 0.756999\n", "Iteration 6, loss = 0.56735281\n", "Validation score: 0.762392\n", "Iteration 7, loss = 0.56085212\n", "Validation score: 0.761520\n", "Iteration 8, loss = 0.55497659\n", "Validation score: 0.766615\n", "Iteration 9, loss = 0.55008011\n", "Validation score: 0.767556\n", "Iteration 10, loss = 0.54529532\n", "Validation score: 0.771342\n", "Iteration 11, loss = 0.54217079\n", "Validation score: 0.767877\n", "Iteration 12, loss = 0.53793139\n", "Validation score: 0.773499\n", "Iteration 13, loss = 0.53475472\n", "Validation score: 0.771985\n", "Iteration 14, loss = 0.53031985\n", "Validation score: 0.775863\n", "Iteration 15, loss = 0.52697044\n", "Validation score: 0.775197\n", "Iteration 16, loss = 0.52361881\n", "Validation score: 0.779718\n", "Iteration 17, loss = 0.52059385\n", "Validation score: 0.778961\n", "Iteration 18, loss = 0.51789796\n", "Validation score: 0.781921\n", "Iteration 19, loss = 0.51442908\n", "Validation score: 0.782197\n", "Iteration 20, loss = 0.51194400\n", "Validation score: 0.782977\n", "Iteration 21, loss = 0.50958305\n", "Validation score: 0.782403\n", "Iteration 22, loss = 0.50737415\n", "Validation score: 0.781761\n", "Iteration 23, loss = 0.50501049\n", "Validation score: 0.785983\n", "Iteration 24, loss = 0.50316584\n", "Validation score: 0.785845\n", "Iteration 25, loss = 0.50125281\n", "Validation score: 0.786281\n", "Iteration 26, loss = 0.49912198\n", "Validation score: 0.786465\n", "Iteration 27, loss = 0.49724330\n", "Validation score: 0.782931\n", "Iteration 28, loss = 0.49499658\n", "Validation score: 0.786534\n", "Iteration 29, loss = 0.49373495\n", "Validation score: 0.791009\n", "Iteration 30, loss = 0.49090734\n", "Validation score: 0.788484\n", "Iteration 31, loss = 0.48937117\n", "Validation score: 0.788553\n", "Iteration 32, loss = 0.48838294\n", "Validation score: 0.781072\n", "Iteration 33, loss = 0.48736092\n", "Validation score: 0.791491\n", "Iteration 34, loss = 0.48571495\n", "Validation score: 0.794474\n", "Iteration 35, loss = 0.48455416\n", "Validation score: 0.795117\n", "Iteration 36, loss = 0.48313223\n", "Validation score: 0.793602\n", "Iteration 37, loss = 0.48223945\n", 
"Validation score: 0.794336\n", "Iteration 38, loss = 0.48115867\n", "Validation score: 0.797159\n", "Iteration 39, loss = 0.48043221\n", "Validation score: 0.795805\n", "Iteration 40, loss = 0.47956617\n", "Validation score: 0.796516\n", "Iteration 41, loss = 0.47831751\n", "Validation score: 0.795392\n", "Iteration 42, loss = 0.47753692\n", "Validation score: 0.791973\n", "Iteration 43, loss = 0.47647902\n", "Validation score: 0.792064\n", "Iteration 44, loss = 0.47582413\n", "Validation score: 0.799500\n", "Iteration 45, loss = 0.47475863\n", "Validation score: 0.800487\n", "Iteration 46, loss = 0.47429392\n", "Validation score: 0.799660\n", "Iteration 47, loss = 0.47313433\n", "Validation score: 0.800808\n", "Iteration 48, loss = 0.47215544\n", "Validation score: 0.796402\n", "Iteration 49, loss = 0.47180058\n", "Validation score: 0.800739\n", "Iteration 50, loss = 0.47093229\n", "Validation score: 0.800716\n", "Iteration 51, loss = 0.47008697\n", "Validation score: 0.799821\n", "Iteration 52, loss = 0.46900274\n", "Validation score: 0.801634\n", "Iteration 53, loss = 0.46844796\n", "Validation score: 0.803263\n", "Iteration 54, loss = 0.46802252\n", "Validation score: 0.801932\n", "Iteration 55, loss = 0.46711878\n", "Validation score: 0.803217\n", "Iteration 56, loss = 0.46625471\n", "Validation score: 0.794428\n", "Iteration 57, loss = 0.46581739\n", "Validation score: 0.803103\n", "Iteration 58, loss = 0.46514544\n", "Validation score: 0.804181\n", "Iteration 59, loss = 0.46444367\n", "Validation score: 0.803171\n", "Iteration 60, loss = 0.46371748\n", "Validation score: 0.799890\n", "Iteration 61, loss = 0.46314583\n", "Validation score: 0.805168\n", "Iteration 62, loss = 0.46200141\n", "Validation score: 0.791674\n", "Iteration 63, loss = 0.46186650\n", "Validation score: 0.801221\n", "Iteration 64, loss = 0.46155958\n", "Validation score: 0.804938\n", "Iteration 65, loss = 0.46142637\n", "Validation score: 0.806660\n", "Iteration 66, loss = 0.45994016\n", "Validation score: 0.805283\n", "Iteration 67, loss = 0.45935021\n", "Validation score: 0.806866\n", "Iteration 68, loss = 0.45905481\n", "Validation score: 0.804066\n", "Iteration 69, loss = 0.45879284\n", "Validation score: 0.800349\n", "Iteration 70, loss = 0.45762973\n", "Validation score: 0.805283\n", "Iteration 71, loss = 0.45731931\n", "Validation score: 0.808335\n", "Iteration 72, loss = 0.45603330\n", "Validation score: 0.802621\n", "Iteration 73, loss = 0.45613109\n", "Validation score: 0.806361\n", "Iteration 74, loss = 0.45515546\n", "Validation score: 0.805604\n", "Iteration 75, loss = 0.45518311\n", "Validation score: 0.799890\n", "Iteration 76, loss = 0.45458607\n", "Validation score: 0.807555\n", "Iteration 77, loss = 0.45412957\n", "Validation score: 0.808335\n", "Iteration 78, loss = 0.45344754\n", "Validation score: 0.808358\n", "Iteration 79, loss = 0.45278155\n", "Validation score: 0.807119\n", "Iteration 80, loss = 0.45286773\n", "Validation score: 0.808151\n", "Iteration 81, loss = 0.45294406\n", "Validation score: 0.810561\n", "Iteration 82, loss = 0.45149287\n", "Validation score: 0.808909\n", "Iteration 83, loss = 0.45161436\n", "Validation score: 0.808954\n", "Iteration 84, loss = 0.45109228\n", "Validation score: 0.808817\n", "Iteration 85, loss = 0.45089302\n", "Validation score: 0.807164\n", "Iteration 86, loss = 0.45065411\n", "Validation score: 0.810561\n", "Iteration 87, loss = 0.45052591\n", "Validation score: 0.811410\n", "Iteration 88, loss = 0.44976987\n", "Validation score: 0.808243\n", 
"Iteration 89, loss = 0.44937019\n", "Validation score: 0.810308\n", "Iteration 90, loss = 0.44877227\n", "Validation score: 0.811295\n", "Iteration 91, loss = 0.44923448\n", "Validation score: 0.810538\n", "Iteration 92, loss = 0.44821595\n", "Validation score: 0.807142\n", "Iteration 93, loss = 0.44860073\n", "Validation score: 0.808564\n", "Iteration 94, loss = 0.44804955\n", "Validation score: 0.808977\n", "Iteration 95, loss = 0.44745104\n", "Validation score: 0.810813\n", "Iteration 96, loss = 0.44755406\n", "Validation score: 0.809092\n", "Iteration 97, loss = 0.44716123\n", "Validation score: 0.811158\n", "Iteration 98, loss = 0.44686319\n", "Validation score: 0.808266\n", "Validation score did not improve more than tol=0.000100 for 10 consecutive epochs. Stopping.\n" ] }, { "data": { "text/plain": [ "MLPClassifier(activation='relu', alpha=0.001, batch_size='auto', beta_1=0.9,\n", " beta_2=0.999, early_stopping=True, epsilon=1e-08,\n", " hidden_layer_sizes=(40, 20), learning_rate='constant',\n", " learning_rate_init=0.001, max_iter=200, momentum=0.9,\n", " n_iter_no_change=10, nesterovs_momentum=True, power_t=0.5,\n", " random_state=None, shuffle=True, solver='adam', tol=0.0001,\n", " validation_fraction=0.1, verbose=True, warm_start=False)" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn.neural_network import MLPClassifier\n", "\n", "classifier = MLPClassifier(\n", " activation='relu',\n", " solver='adam',\n", " alpha=0.001,\n", " hidden_layer_sizes=(40, 20),\n", " batch_size='auto',\n", " verbose=True,\n", " early_stopping=True\n", ")\n", "classifier.fit(X_train_scaled, y_train)" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAX4AAAD8CAYAAABw1c+bAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAIABJREFUeJzt3Xl8VNXZwPHfyWTfSEjCmkDCJksIAcImlFUBbVWsWAWqUKpora9tfaXat636Yl1r1bqLuNWiqLwtomJdQUEBCQrIvgSEEJaQQPY9z/vHmYQhJGSAhAkzz/fzyYfMvefee+5MeO6ZsxoRQSmllO/w83QGlFJKnVsa+JVSysdo4FdKKR+jgV8ppXyMBn6llPIxGviVUsrHaOBXSikfo4FfKaV8jAZ+pZTyMf6ezkBdsbGxkpiY6OlsKKXUeWXt2rVHRCTOnbQtLvAnJiaSnp7u6WwopdR5xRjzg7tptapHKaV8jAZ+pZTyMRr4lVLKx2jgV0opH6OBXymlfIwGfqWU8jEa+JVSyse0uH78SinVKBEw5syPL86FnJ32p+gIJF8FrTqenK6sALK3weEt9nrRiRDVGSI7gp+b5WYROLYXIjuAI+D49qpK2LsSItpBbPczv5czoIFfKXX+yPoOvpkHG//PBupuF0P3i6BVAki1/TEOG2AdgRAaA4Ghx4/P2w+f3A0bF5543qX3w9BfwYjf2QfBhrdh07/gyPb68xHVGSY8AD1/fOIDSATKC6HkqA322z6ELe/BsR8gqBV0Gwtdx8GhjfYeirIBA32uhFG/hza9mvwtq49paYutp6WliY7cVeocO50SdGkefPUk7E+HK56tv6TcmIMbYesHcHQ3HN1jgy1i8xEYCu1ToeNAWxLO2QWHN8PeVXBgHQSEQZ9JUHgI9qyAytKGr+MIhE7DoPt4qCiBFY/Zh8OQm6DThRDTFYwffPEwbHgLAkKhohgwkDgCuoyCNr0hrqdNd3QP5GbANy9C9hboMgYGTrcPpD1fwcENUFV+4vW7jIauY+HQJtj+ERQdttt7TLTfNA6sh2/mQnkRpFwDVz5/Rt9mjDFrRSTNrbQa+JU6T1VV2n8djXxxL8yGHR/D9v/A/m+hTU8bDDukwqHNsOtzW+XQvh+MnA3dLqo/8FSU2AC1/DEoPQb+wRDRHma8D63ibZp938Dyv9kAGt0ZopOg9xUQEnX8PLm74YVRUJZnq0yiEyEszgZWY6DkGGR9a0vNNQJCoW0f6Hs19LsWglvZ7eXFNu8lR53H+9nAXlUBVWW2xL7jUxukAXpdDuP/YvNWV9Y6WPMixPaA5MmnfqBVVUL6S/abQmke+AXYB1V8GoS3gZDW9p46XwjBkcePq66G7K222sf1PSnOhZVPQ2UZTLj/lB9nQ5o88BtjJgJ/BxzAPBF5qM7+TsBrQJQzzV0issS57w/AL4Eq4DYR+ehU19LAr7xCRYkt1fk5TtxeXQ3IydsBygptqfDQ9zY4tkqAtr2hTR8IizkxbfY2+OdkG2iufqX+PJQcg0/vgbWv2WtGdICEwTbwZG89ni6uF3QaCjs/hbx99gEw7m77AKiRnwVvXGNLtN0ugrF/hupKeP1KCImGqW/D2ldg9Qs28AWG2aqO6kqb/+mLISzWBraXxtuS/qwvoHVS/XkXsSXr3N0Q0wWiEt2vU6/PsX1Qlm8fHk2pONc+XNqlnFil5AFNGviNMQ5gO3AxkAmsAaaIyGaXNHOB70TkOWNMb2CJiCQ6f38TGAx0AD4FeohIVUPX08CvzksiNmjv/AR2fmZLof7BNjAnDLWl0MxvIHOtLR3PWgoBIceP3/Rv+L8bbKAE+9BwrTLodRlcPAdad4HMdJg/2QZ2gNu+tdtdbXkPPrjDVisMngWpU21wqinJF+XYIB53gS19AlSW2+qO5X+zgbnvz2DiQ5CfaYN+WQFcNQ8uuOT4dfavtcG/NA8wMPhG+9AIioDqKvtt4q3rbKl++mL44hFbqr72DVs/rppMUwf+YcC9IjLB+foPACLyoEuaF4AMEXnYmf5vInJh3bTGmI+c51rZ0PU08KtmUVYAh7far/xHttuGtpguNmAW59iv+Qc3QEw3GHUn+AcdP3bzYvjudRvIjAFHkC2pxnaH8Lawezlsfc+WcAHa9oWuY2xd8d7VtiHPGFvajOsF378NF/6XrXIAKDgIzwyx5xx1J7Tra6tACg/D4U227njVc1BdASk/g43/sted9Cy8dhkMvgkmPnA8v18+Cp/fZ/Nx+ZPQccDpvVeVZTb4L3/MVlNUlNiqi6lvQbvkk9NnrbPph94CnYedvH/3l/bBERwFBVkw7NYzrs5QDWvqwD8ZmCgiNzhfXwcMEZFbXdK0Bz4GooEw4CIRWWuMeRpYJSL/dKZ7CfhQRBbWucYsYBZAp06dBv7wg9uziyrVuL2rbam0osi+rluarhHVyQbvDgPg6ldtN7uP/2TrtaM626oKEduYmJtxvFHREWQb8Hr+2DYiRrY/8bxlBYCBoHD7evFt9kHyy09svfCCqbZkfPOKhrv15R+Az/8C6+bb4Pvzf9kqlYUzbR327Zvt+XN2wbNDbcPh5JdP7D54ug5thg9ut99WfvYP+36cqT0rYP7P7MPvF0vOLl+qXqcT+N3pzllf83Ldp8UU4FUR+ZuzxP+6MSbZzWMRkbnAXLAlfjfypLyZiA0U37wAgeG2/rnTMFt1cmyvrYf287dVF7Hd668vr1F4GN6ZboPkhPttd7moRBu0j+62ATy4lT1XSJStIll0C7ww0lbJHNpoS6jj7gH/wOPnra62+cjfb0voQREN56HuvvH32fr0d39tz71tiS39n6ovd2R7mPQMjLzDBuCaaqLBN9lugRsWQNovYcls+z5d+tezD65te8PM/5zdOWokjrBVUsGtNOi3AO4E/kwgweV1PJBVJ80vgYkAIrLSGBMMxLp5rPIFhYdtqTZ/v+1LXVFsGxqTRtnqFqmGggNwYAN8/aStIw+Lsw+B9W82fN6AUOg8HC7+35Mb7qoqbYm45Bjc8OmJ1RSBzl4idY/pdZnd9vZ0+5BpqC7az8/Za6We3iGNCW4Fl/3d1tMvvhXiB9tqEnfUbQxNGGy7Pq5+AUJjYddnMPHhsyudN5eWmCcf5U5Vjz+2cXccsB/buDtVRDa5pPkQeEtEXjXG9AI+AzoCvYE3ON64+xnQXRt3fUhxLnz1BKyeC5UldltItC2xF2Xb16ExUJpv67DB1m8P/y0MuM6WXnMzbB9uqbLVMa0SbD30wQ22fnnDW7ZxcdANMOYP9vwidqDO10/ClS/YLoCno7rKfisIDGu696Kud39t6+tv+vLsRm6uexMW3Wz7t8d0gRuXNd7FU3md5ujOeSnwBLar5ssicr8xZg6QLiKLnb13XgTCsVU5vxeRj53H/hGYCVQCvxWRD091LQ38LVR5se0HnjDE/QE76xfYniXlhbZRctittvE0MNQG5iM7YPcXdlBOWJwN6tGJtgTv2rjamOJc2586/eXj26Ta/pv2S/jJY+6f61yqrr
b94UNbn915Ksvg8T72QfrLT+y3AOVzdACXalo5u+Dt6529Uxx2QM6Qm221Q7Xzy1tEuxMH/RQehif722qTy560g4aa24ENsHkRYOw3irBYGHD96T1Ezleb/m0bgIe5WWWkvE5TN+4qX7b1A/j3zbYB9acv2uHl3/7DzmPiqt9U272wJvgve9BWlUx6zg6LPxfap9gfX9TnSk/nQJ1HNPB7s/ws+Ow+20+9ZtKqimJbNVKcY+cGqSyxVQXRiXawTtcx9tjiXPj0Xvj2NejQ33bni+pkq2xG32UfCGUF9oFwYIMdtdlxgB3Ac3grrH3VDhw6V0FfKeU2Dfznq6pKKC+wg2Lqm1dl24e2W2JlqZ17pGbukoAQ25jaKt72+/YPsd0Ut7wPr0+yk0YljbQPjJKjzq6Md59YXRIUcWJjaXW1fcj85w/2IfHFIxAYASN/3/zvg1LqtGkdf0tWXQWb37UTVpUX2blcio/Y+Uvy9tnh/Y5ACG9n69gjO9geMaV5sO6ftm/65Jfd6zFSUWp73yx/zD4gOg6EnzzhftVJyVHb972swP5+8RwY/puzu3+llNu0cfd8tP9bO5q0ZqbCLe/ZevLsrXZkaFCELaGHRNs00Um2N0hRNhQcskPh87NsH/nKUtv4evH/nn7DZs4uOPi97c9+qoFR9claZyfgCm8Lt66BgODTO14pdca0cfd8UlUBH/8ZVj93fJufvy3Nx/aAya9A70mnt9pPVfmZ92SJ6Xrm9fIdUmHmh3YeHA36SrVYGvjPpepqu6BEcKSdx7w4x44Q3bcKhvzKTnd7dLcdMdqur61vP91StzGe7b7YcaDnrq2UcosG/nPl6A92pOae5fa18bOLN/g54KqXoO9kz+ZPKeUzNPA3tepq+OTPtrtjwhC7dFt5ke0aCTD+fltXn7ffNoKmzbSTYSml1Dmigf9MFWbDZ/farpE/usNW31RXwXu/sVPudrrQLsqxYYFNnzQKrnja9oVXSikP0sB/ukTg+3fgwztt18XqSlj/lnOq3c9soB/5exjzPzbt4U12Iekuo89oAWWllGpqGvjdcXgLZH1n1znd9w3s/RriB8HlT9tqnCX/Df+60aYd8ycYNdv+boxtpFVKqRZEA/+piMCKx+CzOfa1X4AdDDXhQRhy0/EeNzd8ZmeiNMaubaqUUi2YBv6GVFXCh7PtVL99r4ZRd9mBU/XNc+7ngP7TznkWlVLqTGjgr5H+Mqx4wvavj+0GeZmQsQxG/A7G3u3+ACqllGrhNPCLwPK/wef3Qcc0W3rf/jGU5cOlj9rZJpVSyov4duAXsX3uv34KUq6BK545vhB0dbWW8pVSXsl3A39epl0WcPuHMOhGuOSREwO9Bn2llJdyK/AbYyYCf8euuTtPRB6qs/9xwLmCB6FAGxGJcu6rAr537tsrIpc3RcbPWHUVrH4BPv8LIDDhARh6i/axV0r5jEYDvzHGATwDXAxkAmuMMYtFZHNNGhH5nUv6/wL6u5yiRERSmy7LZ+HYXlj4S8j8BrqPt3X40Z09nSullDqn3CnxDwZ2ikgGgDFmAXAFsLmB9FOAe5ome01oy/vw7i22Xv+n8+ykaFrKV0r5IHcqsjsC+1xeZzq3ncQY0xlIAj532RxsjEk3xqwyxkw645yejWUPw1vT7OIlN30BKVdr0FdK+Sx3Svz1RciGlu26FlgoIlUu2zqJSJYxpgvwuTHmexHZdcIFjJkFzALo1KmJJzErOARf/tUuZvLTuZ6dq14ppVoAd0r8mUCCy+t4IKuBtNcCb7puEJEs578ZwDJOrP+vSTNXRNJEJC0uLs6NLJ3sWHE5M19dw2dbDp24Y+2rUF1x8oLhSinlo9wJ/GuA7saYJGNMIDa4L66byBhzARANrHTZFm2MCXL+HgsMp+G2gbNijOHzrYfZfaTo+MbKcjsit9tFZ76coFJKeZlGq3pEpNIYcyvwEbY758sisskYMwdIF5Gah8AUYIGcuHp7L+AFY0w19iHzkGtvoKYUFmgnTCsqc6ll2voeFB6EwU81xyWVUuq85FY/fhFZAiyps+3uOq/vree4r4FzMi+xv8OPIH8/issrj29cPdc26Ha76FxkQSmlzgteNTw1LMifoprAf2C9XcR88I06ClcppVx4VUQMC3Icr+r5Zi4EhEKqTpeslFKuvCvwB/pTVFYJhYfh+4V24rWQKE9nSymlWhSvCvyhgQ5b1bPyaagqhwv/y9NZUkqpFserAn9YkD+UHIM1L0GfK7ULp1JK1cOrpmUOC/Rn2OF3obwQRtzu6ewopVSL5FUl/uiAciaVLYYel0C7ZE9nRymlWiSvCvyj8t+nFYXwo//2dFaUUqrF8p7AX1HK8OwFfF3dBxIGeTo3SinVYnlP4C/KJj+0E09VTqK8strTuVFKqRbLewJ/VAIfD36ZldW9bV9+pZRS9fKewI/t1QPm+LQNSimlTuJVgT80qJ4ZOpVSSp3AqwJ/WJAdlqAlfqWUaph3Bf5AG/iLtcSvlFIN8q7A76zqKdTGXaWUapB3Bf6aEr9W9SilVIO8K/DX1PFriV8ppRrkVuA3xkw0xmwzxuw0xtxVz/7HjTHrnD/bjTHHXPZNN8bscP5Mb8rM11VT1VNUrnX8SinVkEZn5zTGOIBngIuBTGCNMWax66LpIvI7l/T/BfR3/t4auAdIAwRY6zz2aJPehVNIgANjoFhL/Eop1SB3SvyDgZ0ikiEi5cAC4IpTpJ8CvOn8fQLwiYjkOoP9J8DEs8nwqRhjCAv0p1B79SilVIPcCfwdgX0urzOd205ijOkMJAGfn+6xTSU00KGNu0opdQruBH5TzzZpIO21wEIRqSlyu3WsMWaWMSbdGJOenZ3tRpYaFh7kr905lVLqFNwJ/JlAgsvreCCrgbTXcryax+1jRWSuiKSJSFpcXJwbWWpYaJCDYm3cVUqpBrkT+NcA3Y0xScaYQGxwX1w3kTHmAiAaWOmy+SNgvDEm2hgTDYx3bms2to5fS/xKKdWQRgO/iFQCt2ID9hbgbRHZZIyZY4y53CXpFGCBiIjLsbnAfdiHxxpgjnNbswkL8tc6fqWUOgW3FlsXkSXAkjrb7q7z+t4Gjn0ZePkM83fawoL8KT6iVT1KKdUQrxq5CxAW6NCqHqWUOgWvC/yhgf7auKuUUqfgdYE/PMhBUXklLk0NSimlXHhd4A8N8kcESiq01K+UUvXxusBfM0On1vMrpVT9vC/wB9oZOnUVLqWUqp/3BX5dd1cppU7J+wJ/YM1iLFriV0qp+nhf4K9djEVL/EopVR8vDPy6/KJSSp2K1wX+UG3cVUqpU/K6wB+u3TmVUuqUvC7whzobd3WGTqWUqp/XBf5Afz8CHX4U6Xw9SilVL68L/GBX4dLGXaWUqp9XBv6wQH/tx6+UUg3wzsCvJX6llGqQWytwnW/Cgvx1AJc6QUVFBZmZmZSWlno6K0qdleDgYOLj4wkICDjjc7gV+I0xE4G/Aw5gnog8VE+anwH3AgKsF5Gpzu1VwPfOZHtF5PK6xza1MF2MRdWRmZlJREQEiYmJGGM8nR2lz
[... remainder of base64-encoded PNG data truncated; the rendered figure plots the loss curve and the validation scores against training iteration ...]\n", "text/plain": [ "<Figure size 432x288 with 1 Axes>" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Visualize the loss curve and validation scores over iterations.\n", "plt.plot(classifier.loss_curve_, label='Loss at iteration')\n", "plt.plot(classifier.validation_scores_, label='Validation scores at iteration')\n", "plt.legend(loc='best')\n", "plt.show()" ] },
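{ "cell_type": "markdown", "metadata": {}, "source": [ "A quick note on the two attributes we just plotted: `loss_curve_` records the training loss at each iteration (scikit-learn only tracks it for the `sgd` and `adam` solvers), while `validation_scores_` is only recorded when the model was created with `early_stopping=True`, which holds out part of the training data as a validation set. The cell below is a minimal sketch that summarizes those attributes for our fitted `classifier`, assuming it was indeed trained with early stopping." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# A minimal sketch: summarize training progress using attributes of the fitted model.\n", "# This assumes classifier was created with early_stopping=True, so that\n", "# validation_scores_ was recorded alongside loss_curve_.\n", "print(f'Stopped after {classifier.n_iter_} iterations')\n", "print(f'Final training loss: {classifier.loss_curve_[-1]:.4f}')\n", "print(f'Best validation score: {max(classifier.validation_scores_):.4f}')" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Finally, let's score the trained network on the test set. For a classifier, `score` returns the fraction of samples labelled correctly:" ] },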
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Visualize the loss curve and validation scores over iterations.\n", "plt.plot(classifier.loss_curve_, label='Loss at iteration')\n", "plt.plot(classifier.validation_scores_, label='Validation scores at iteration')\n", "plt.legend(loc='best')\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "0.8111501999958692" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "classifier.score(X_test_scaled, y_test)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# What's next?\n", "\n", "In the next part of this course, we will discuss the landscape of machine learning methods. Artificial neural networks are a valuable part of this landscape, and -- as you can see -- very easy to set up and try, but will not always be the best solution to the problem." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" } }, "nbformat": 4, "nbformat_minor": 2 }