{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Random Forests == Awesome"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Random Forests should be the hammer of your data science toolkit. \n",
"\n",
"What are they?\n",
"* A machine learning algorithm built for prediction tasks"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Pros\n",
"* Automatically model non-linear relationships and interactions between variables. Perfect collinearity doesn't matter.\n",
"* Easy to tune\n",
"* Relatively easy to understand everything about them\n",
"* Flexible enough to handle both regression and classification tasks\n",
"* Useful as a step in exploratory data analysis\n",
"* Can handle high-dimensional data\n",
"* Have a built-in way to estimate model accuracy (out-of-bag predictions)\n",
"* In general, hard to beat on most prediction tasks"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Cons\n",
"* ?\n",
"* ?\n",
"* ?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Simple example: Boston Housing dataset"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"%matplotlib inline"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Load the Boston Housing dataset.\n",
"# Note: load_boston was removed in scikit-learn 1.2, so this cell needs an older version.\n",
"from sklearn.datasets import load_boston\n",
"boston = load_boston()\n",
"X, y = boston.data, boston.target\n",
"\n",
"# Make train and test datasets (train_test_split lives in sklearn.model_selection since 0.18)\n",
"from sklearn.model_selection import train_test_split\n",
"import numpy as np\n",
"np.random.seed(100)\n",
"X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.2)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"(506, 13)"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"X.shape"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Linear Regression"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"R^2: 0.76\n"
]
}
],
"source": [
"from sklearn.linear_model import LinearRegression\n",
"model = LinearRegression()\n",
"model.fit(X_train, y_train)\n",
"print(\"R^2:\", round(model.score(X_test, y_test), 2))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Decision Tree"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"R^2: 0.8\n"
]
}
],
"source": [
"from sklearn.tree import DecisionTreeRegressor\n",
"model = DecisionTreeRegressor()\n",
"model.fit(X_train, y_train)\n",
"print(\"R^2:\", round(model.score(X_test, y_test), 2))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Random Forest with defaults"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"R^2: 0.87\n"
]
}
],
"source": [
"from sklearn.ensemble import RandomForestRegressor\n",
"model = RandomForestRegressor(random_state=42)\n",
"model.fit(X_train, y_train)\n",
"print(\"R^2:\", round(model.score(X_test, y_test), 2))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Prerequisites\n",
"* Decision trees\n",
"* Bootstrap sampling"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## The Random Forest Algorithm\n",
"\n",
"The big idea: Combine a bunch of terrible decision trees into one awesome model. \n",
"\n",
"For each tree in the forest:\n",
"1. Take a bootstrap sample of the data.\n",
"2. Randomly select some variables.\n",
"3. For each variable selected, find the split point that minimizes MSE (or, for classification, Gini impurity or entropy).\n",
"4. Split the data using the variable with the lowest MSE (or other criterion).\n",
"5. Repeat steps 2 through 4 (randomly selecting a new set of variables at each split) until some stopping condition is satisfied or the data cannot be split any further.\n",
"\n",
"Repeat this process to build several trees. \n",
"\n",
"To make a prediction, run an observation down every tree and average the predicted values from all the trees (for regression) or take the most popular predicted class (for classification)."
]
},
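{
"cell_type": "markdown",
"metadata": {},
"source": [
"The loop described above can be sketched by hand. This is a minimal illustration, not scikit-learn's actual implementation. It assumes synthetic stand-in data (`X_demo`, `y_demo`) and two hypothetical helpers (`fit_toy_forest`, `predict_toy_forest`). Each tree is a `DecisionTreeRegressor` whose `max_features` setting makes it consider a random subset of variables at every split."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import numpy as np\n",
"from sklearn.tree import DecisionTreeRegressor\n",
"\n",
"# Synthetic stand-in data (an assumption for illustration, not the Boston data)\n",
"demo_rng = np.random.RandomState(0)\n",
"X_demo = demo_rng.uniform(-3, 3, size=(300, 5))\n",
"y_demo = np.sin(X_demo[:, 0]) + 0.5 * X_demo[:, 1] ** 2 + demo_rng.normal(0, 0.1, size=300)\n",
"\n",
"def fit_toy_forest(X, y, n_trees=50, max_features=\"sqrt\", seed=0):\n",
"    \"\"\"Hypothetical helper: fit each tree on its own bootstrap sample.\"\"\"\n",
"    rng = np.random.RandomState(seed)\n",
"    trees = []\n",
"    for _ in range(n_trees):\n",
"        # Step 1: bootstrap sample (draw rows with replacement)\n",
"        rows = rng.randint(0, len(X), size=len(X))\n",
"        # Steps 2-5: max_features makes the tree consider a random subset of variables at every split\n",
"        tree = DecisionTreeRegressor(max_features=max_features, random_state=int(rng.randint(10**6)))\n",
"        trees.append(tree.fit(X[rows], y[rows]))\n",
"    return trees\n",
"\n",
"def predict_toy_forest(trees, X):\n",
"    \"\"\"Hypothetical helper: average the per-tree predictions (regression).\"\"\"\n",
"    return np.mean([tree.predict(X) for tree in trees], axis=0)\n",
"\n",
"toy_forest = fit_toy_forest(X_demo, y_demo)\n",
"toy_predictions = predict_toy_forest(toy_forest, X_demo)\n",
"print(\"trees:\", len(toy_forest), \"predictions:\", toy_predictions.shape)"
]
},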
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Most important parameters (and what they mean)\n",
"* ### Parameters that will make your model better\n",
"    * n_estimators: The number of trees in the forest. Choose as high a number as your computer can handle.\n",
"    * max_features: The number of features to consider when looking for the best split. Try \"auto\" (older scikit-learn versions only), None, \"sqrt\", \"log2\", 0.9, and 0.2.\n",
"    * min_samples_leaf: The minimum number of samples in newly created leaves. Try 1, 2, and 3. If 3 is the best, try higher numbers.\n",
"* ### Parameters that will make it easier to train your model\n",
"    * n_jobs: The number of processor cores used to train the model and make predictions. Set this to -1 to use all cores, and compare with %%timeit against n_jobs=1. It should be much faster (especially when many trees are trained).\n",
"    * random_state: Set this to 42 if you want to be cool AND want others to be able to replicate your results.\n",
"    * oob_score: THE BEST THING EVER. Random Forest's custom validation method: out-of-bag predictions."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## OOB predictions\n",
"About a third of observations (roughly 37%) don't show up in a given bootstrap sample. \n",
"\n",
"Because each tree in the forest is built from a bootstrap sample, about a third of the data was not used to build that tree. We can track which observations were used to build which trees.\n",
"\n",
"#### Here is the magic. \n",
"\n",
"After the forest is built, we take each observation in the dataset and identify which trees used it and which did not (based on the bootstrap samples). We then predict that observation using only the trees that never saw it. For any specific observation, that is about a third of the trees in the forest. \n",
"\n",
"OOB predictions are similar to the following awesome, but computationally expensive, method: \n",
"1. Train a model with n_estimators trees, but exclude one observation from the dataset.\n",
"2. Use the trained model to predict the excluded observation. Record the prediction.\n",
"3. Repeat this process for every single observation in the dataset.\n",
"4. Collect all your final predictions. These will be similar to your oob predictions. \n",
"\n",
"The leave-one-out method will take `n_estimators * time_to_train_one_tree * n_observations` to run. \n",
"\n",
"The oob method will take `n_estimators * time_to_train_one_tree * 3` to run (the `* 3` is because if you want an accuracy estimate for a 100 tree forest, you need to train 300 trees. Why? Because with 300 trees, each observation will have about 100 trees it was not used to build, and those can be used for the oob predictions).\n",
"\n",
"This means the oob method is about n_observations/3 times faster to train than the leave-one-out method."
]
},
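{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough check of the \"about a third\" figure, the sketch below simulates bootstrap samples and counts how often an observation is left out. The sizes (`n_obs`, `n_trees`) are hypothetical values chosen for illustration, and the simulation only mimics the sampling step; it does not train any trees. The left-out fraction should land near 1/e."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"sim_rng = np.random.RandomState(42)\n",
"n_obs, n_trees = 891, 300  # hypothetical sizes for illustration\n",
"\n",
"# in_bag[t, i] is True when observation i was drawn into tree t's bootstrap sample\n",
"in_bag = np.zeros((n_trees, n_obs), dtype=bool)\n",
"for t in range(n_trees):\n",
"    in_bag[t, sim_rng.randint(0, n_obs, size=n_obs)] = True\n",
"\n",
"# Fraction of (tree, observation) pairs where the observation was left out\n",
"oob_fraction = 1.0 - in_bag.mean()\n",
"# Number of trees available for each observation's oob prediction\n",
"oob_trees_per_obs = (~in_bag).sum(axis=0)\n",
"\n",
"print(\"fraction out-of-bag:\", round(float(oob_fraction), 3))\n",
"print(\"limit 1/e:\", round(float(np.exp(-1)), 3))\n",
"print(\"average oob trees per observation:\", oob_trees_per_obs.mean())"
]
},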
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Full example: Titanic dataset"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Your first goal should always be getting a generalized prediction as fast as possible.\n",
"* This doesn't mean skipping exploratory data analysis (EDA). It just means not getting caught up in it. Initially do only what is needed to get a generalized prediction.\n",
"* Getting a prediction first lets you set a benchmark for yourself. As you make improvements to the model, you should be able to see your desired error metric improve."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# With the goal above, I will import just what I need. \n",
"# The model to use (I already imported it above, but will do it again here so each example is self-contained)\n",
"from sklearn.ensemble import RandomForestRegressor\n",
"\n",
"# The error metric. In this case, we will use c-stat (aka ROC/AUC)\n",
"from sklearn.metrics import roc_auc_score\n",
"\n",
"# An efficient data structure. \n",
"import pandas as pd\n",
"\n",
"# Import the data\n",
"X = pd.read_csv(\"../data/train.csv\")\n",
"y = X.pop(\"Survived\")"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>PassengerId</th><th>Pclass</th><th>Age</th><th>SibSp</th><th>Parch</th><th>Fare</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>count</th><td>891.000000</td><td>891.000000</td><td>714.000000</td><td>891.000000</td><td>891.000000</td><td>891.000000</td></tr>\n",
"    <tr><th>mean</th><td>446.000000</td><td>2.308642</td><td>29.699118</td><td>0.523008</td><td>0.381594</td><td>32.204208</td></tr>\n",
"    <tr><th>std</th><td>257.353842</td><td>0.836071</td><td>14.526497</td><td>1.102743</td><td>0.806057</td><td>49.693429</td></tr>\n",
"    <tr><th>min</th><td>1.000000</td><td>1.000000</td><td>0.420000</td><td>0.000000</td><td>0.000000</td><td>0.000000</td></tr>\n",
"    <tr><th>25%</th><td>223.500000</td><td>2.000000</td><td>20.125000</td><td>0.000000</td><td>0.000000</td><td>7.910400</td></tr>\n",
"    <tr><th>50%</th><td>446.000000</td><td>3.000000</td><td>28.000000</td><td>0.000000</td><td>0.000000</td><td>14.454200</td></tr>\n",
"    <tr><th>75%</th><td>668.500000</td><td>3.000000</td><td>38.000000</td><td>1.000000</td><td>0.000000</td><td>31.000000</td></tr>\n",
"    <tr><th>max</th><td>891.000000</td><td>3.000000</td><td>80.000000</td><td>8.000000</td><td>6.000000</td><td>512.329200</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" PassengerId Pclass Age SibSp Parch Fare\n",
"count 891.000000 891.000000 714.000000 891.000000 891.000000 891.000000\n",
"mean 446.000000 2.308642 29.699118 0.523008 0.381594 32.204208\n",
"std 257.353842 0.836071 14.526497 1.102743 0.806057 49.693429\n",
"min 1.000000 1.000000 0.420000 0.000000 0.000000 0.000000\n",
"25% 223.500000 2.000000 20.125000 0.000000 0.000000 7.910400\n",
"50% 446.000000 3.000000 28.000000 0.000000 0.000000 14.454200\n",
"75% 668.500000 3.000000 38.000000 1.000000 0.000000 31.000000\n",
"max 891.000000 3.000000 80.000000 8.000000 6.000000 512.329200"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"X.describe()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I know that there are categorical variables in the dataset, but I will skip them for the moment. I will impute age though, because it will be fast."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>PassengerId</th><th>Pclass</th><th>Age</th><th>SibSp</th><th>Parch</th><th>Fare</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>count</th><td>891.000000</td><td>891.000000</td><td>891.000000</td><td>891.000000</td><td>891.000000</td><td>891.000000</td></tr>\n",
"    <tr><th>mean</th><td>446.000000</td><td>2.308642</td><td>29.699118</td><td>0.523008</td><td>0.381594</td><td>32.204208</td></tr>\n",
"    <tr><th>std</th><td>257.353842</td><td>0.836071</td><td>13.002015</td><td>1.102743</td><td>0.806057</td><td>49.693429</td></tr>\n",
"    <tr><th>min</th><td>1.000000</td><td>1.000000</td><td>0.420000</td><td>0.000000</td><td>0.000000</td><td>0.000000</td></tr>\n",
"    <tr><th>25%</th><td>223.500000</td><td>2.000000</td><td>22.000000</td><td>0.000000</td><td>0.000000</td><td>7.910400</td></tr>\n",
"    <tr><th>50%</th><td>446.000000</td><td>3.000000</td><td>29.699118</td><td>0.000000</td><td>0.000000</td><td>14.454200</td></tr>\n",
"    <tr><th>75%</th><td>668.500000</td><td>3.000000</td><td>35.000000</td><td>1.000000</td><td>0.000000</td><td>31.000000</td></tr>\n",
"    <tr><th>max</th><td>891.000000</td><td>3.000000</td><td>80.000000</td><td>8.000000</td><td>6.000000</td><td>512.329200</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" PassengerId Pclass Age SibSp Parch Fare\n",
"count 891.000000 891.000000 891.000000 891.000000 891.000000 891.000000\n",
"mean 446.000000 2.308642 29.699118 0.523008 0.381594 32.204208\n",
"std 257.353842 0.836071 13.002015 1.102743 0.806057 49.693429\n",
"min 1.000000 1.000000 0.420000 0.000000 0.000000 0.000000\n",
"25% 223.500000 2.000000 22.000000 0.000000 0.000000 7.910400\n",
"50% 446.000000 3.000000 29.699118 0.000000 0.000000 14.454200\n",
"75% 668.500000 3.000000 35.000000 1.000000 0.000000 31.000000\n",
"max 891.000000 3.000000 80.000000 8.000000 6.000000 512.329200"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Impute Age with mean (assign back instead of chained inplace, which newer pandas versions warn about)\n",
"X[\"Age\"] = X[\"Age\"].fillna(X[\"Age\"].mean())\n",
"\n",
"# Confirm the code is correct\n",
"X.describe()"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>PassengerId</th><th>Pclass</th><th>Age</th><th>SibSp</th><th>Parch</th><th>Fare</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1</td><td>3</td><td>22.0</td><td>1</td><td>0</td><td>7.2500</td></tr>\n",
"    <tr><th>1</th><td>2</td><td>1</td><td>38.0</td><td>1</td><td>0</td><td>71.2833</td></tr>\n",
"    <tr><th>2</th><td>3</td><td>3</td><td>26.0</td><td>0</td><td>0</td><td>7.9250</td></tr>\n",
"    <tr><th>3</th><td>4</td><td>1</td><td>35.0</td><td>1</td><td>0</td><td>53.1000</td></tr>\n",
"    <tr><th>4</th><td>5</td><td>3</td><td>35.0</td><td>0</td><td>0</td><td>8.0500</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" PassengerId Pclass Age SibSp Parch Fare\n",
"0 1 3 22.0 1 0 7.2500\n",
"1 2 1 38.0 1 0 71.2833\n",
"2 3 3 26.0 0 0 7.9250\n",
"3 4 1 35.0 1 0 53.1000\n",
"4 5 3 35.0 0 0 8.0500"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Get just the numeric variables by selecting only the variables that are not \"object\" datatypes.\n",
"numeric_variables = list(X.dtypes[X.dtypes != \"object\"].index)\n",
"X[numeric_variables].head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I notice PassengerId looks like a worthless variable. I leave it in for two reasons. First, I don't want to go through the effort of dropping it (although that would be very easy). Second, I am interested in seeing if it is useful for prediction. It might be useful if the PassengerId was assigned in some non-random way. For example, perhaps PassengerId was assigned based on when the ticket was purchased in which case there might be something predictive about people who purchased their tickets early or late."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"RandomForestRegressor(bootstrap=True, criterion='mse', max_depth=None,\n",
" max_features='auto', max_leaf_nodes=None, min_samples_leaf=1,\n",
" min_samples_split=2, min_weight_fraction_leaf=0.0,\n",
" n_estimators=100, n_jobs=1, oob_score=True, random_state=42,\n",
" verbose=0, warm_start=False)"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Let's build our first model. I always set oob_score=True. It is a good idea to increase n_estimators to a number higher than \n",
"# the default. With 100 trees, each oob prediction is based on roughly a third of them (about 33 to 37 trees). I set random_state=42\n",
"# so that you all can replicate the model exactly.\n",
"model = RandomForestRegressor(n_estimators=100, oob_score=True, random_state=42)\n",
"\n",
"# I only use numeric_variables because I have yet to dummy out the categorical variables\n",
"model.fit(X[numeric_variables], y)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"0.1361695005913669"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# For regression, the oob_score_ attribute gives the R^2 based on the oob predictions. We want to use c-stat, but I mention this \n",
"# for awareness. By the way, attributes in sklearn that have a trailing underscore are only available after the model has been fit.\n",
"model.oob_score_"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"c-stat: 0.73995515504\n"
]
}
],
"source": [
"y_oob = model.oob_prediction_\n",
"print(\"c-stat: \", roc_auc_score(y, y_oob))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We now have a benchmark. A c-stat of 0.74 isn't very good for this dataset, but it gives us a baseline to improve on. Before changing parameters for the Random Forest, let's whip this dataset into shape."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Here is a simple function to show descriptive stats on the categorical variables\n",
"def describe_categorical(X):\n",
"    \"\"\"\n",
"    Just like .describe(), but returns the results for\n",
"    categorical variables only.\n",
"    \"\"\"\n",
"    from IPython.display import display, HTML\n",
"    display(HTML(X[X.columns[X.dtypes == \"object\"]].describe().to_html()))"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>Name</th><th>Sex</th><th>Ticket</th><th>Cabin</th><th>Embarked</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>count</th><td>891</td><td>891</td><td>891</td><td>204</td><td>889</td></tr>\n",
"    <tr><th>unique</th><td>891</td><td>2</td><td>681</td><td>147</td><td>3</td></tr>\n",
"    <tr><th>top</th><td>Andersson, Mr. Anders Johan</td><td>male</td><td>CA. 2343</td><td>B96 B98</td><td>S</td></tr>\n",
"    <tr><th>freq</th><td>1</td><td>577</td><td>7</td><td>4</td><td>644</td></tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"describe_categorical(X)"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Drop the variables I don't feel like dealing with for this tutorial\n",
"X.drop([\"Name\", \"Ticket\", \"PassengerId\"], axis=1, inplace=True)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Change the Cabin variable to be only the first letter or None\n",
"def clean_cabin(x):\n",
"    try:\n",
"        return x[0]\n",
"    except TypeError:\n",
"        return \"None\"\n",
"\n",
"X[\"Cabin\"] = X.Cabin.apply(clean_cabin)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"categorical_variables = ['Sex', 'Cabin', 'Embarked']\n",
"\n",
"for variable in categorical_variables:\n",
"    # Fill missing data with the word \"Missing\" (assign back instead of chained inplace)\n",
"    X[variable] = X[variable].fillna(\"Missing\")\n",
"    # Create array of dummies\n",
"    dummies = pd.get_dummies(X[variable], prefix=variable)\n",
"    # Update X to include dummies and drop the main variable\n",
"    X = pd.concat([X, dummies], axis=1)\n",
"    X.drop([variable], axis=1, inplace=True)"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Pclass | \n",
" Age | \n",
" SibSp | \n",
" Parch | \n",
" Fare | \n",
" Sex_female | \n",
" Sex_male | \n",
" Cabin_A | \n",
" Cabin_B | \n",
" Cabin_C | \n",
" Cabin_D | \n",
" Cabin_E | \n",
" Cabin_F | \n",
" Cabin_G | \n",
" Cabin_None | \n",
" Cabin_T | \n",
" Embarked_C | \n",
" Embarked_Missing | \n",
" Embarked_Q | \n",
" Embarked_S | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 3 | \n",
" 22.000000 | \n",
" 1 | \n",
" 0 | \n",
" 7.2500 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" 1 | \n",
" 1 | \n",
" 38.000000 | \n",
" 1 | \n",
" 0 | \n",
" 71.2833 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" 2 | \n",
" 3 | \n",
" 26.000000 | \n",
" 0 | \n",
" 0 | \n",
" 7.9250 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" 3 | \n",
" 1 | \n",
" 35.000000 | \n",
" 1 | \n",
" 0 | \n",
" 53.1000 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" 4 | \n",
" 3 | \n",
" 35.000000 | \n",
" 0 | \n",
" 0 | \n",
" 8.0500 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" 5 | \n",
" 3 | \n",
" 29.699118 | \n",
" 0 | \n",
" 0 | \n",
" 8.4583 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" 6 | \n",
" 1 | \n",
" 54.000000 | \n",
" 0 | \n",
" 0 | \n",
" 51.8625 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" 7 | \n",
" 3 | \n",
" 2.000000 | \n",
" 3 | \n",
" 1 | \n",
" 21.0750 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" 8 | \n",
" 3 | \n",
" 27.000000 | \n",
" 0 | \n",
" 2 | \n",
" 11.1333 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" 9 | \n",
" 2 | \n",
" 14.000000 | \n",
" 1 | \n",
" 0 | \n",
" 30.0708 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" 10 | \n",
" 3 | \n",
" 4.000000 | \n",
" 1 | \n",
" 1 | \n",
" 16.7000 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" 11 | \n",
" 1 | \n",
" 58.000000 | \n",
" 0 | \n",
" 0 | \n",
" 26.5500 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" 12 | \n",
" 3 | \n",
" 20.000000 | \n",
" 0 | \n",
" 0 | \n",
" 8.0500 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" 13 | \n",
" 3 | \n",
" 39.000000 | \n",
" 1 | \n",
" 5 | \n",
" 31.2750 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" 14 | \n",
" 3 | \n",
" 14.000000 | \n",
" 0 | \n",
" 0 | \n",
" 7.8542 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" 15 | \n",
" 2 | \n",
" 55.000000 | \n",
" 0 | \n",
" 0 | \n",
" 16.0000 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" 16 | \n",
" 3 | \n",
" 2.000000 | \n",
" 4 | \n",
" 1 | \n",
" 29.1250 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" 17 | \n",
" 2 | \n",
" 29.699118 | \n",
" 0 | \n",
" 0 | \n",
" 13.0000 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" 18 | \n",
" 3 | \n",
" 31.000000 | \n",
" 1 | \n",
" 0 | \n",
" 18.0000 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" 19 | \n",
" 3 | \n",
" 29.699118 | \n",
" 0 | \n",
" 0 | \n",
" 7.2250 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" 20 | \n",
" 2 | \n",
" 35.000000 | \n",
" 0 | \n",
" 0 | \n",
" 26.0000 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" 21 | \n",
" 2 | \n",
" 34.000000 | \n",
" 0 | \n",
" 0 | \n",
" 13.0000 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" 22 | \n",
" 3 | \n",
" 15.000000 | \n",
" 0 | \n",
" 0 | \n",
" 8.0292 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" 23 | \n",
" 1 | \n",
" 28.000000 | \n",
" 0 | \n",
" 0 | \n",
" 35.5000 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" 24 | \n",
" 3 | \n",
" 8.000000 | \n",
" 3 | \n",
" 1 | \n",
" 21.0750 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" 25 | \n",
" 3 | \n",
" 38.000000 | \n",
" 1 | \n",
" 5 | \n",
" 31.3875 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" 26 | \n",
" 3 | \n",
" 29.699118 | \n",
" 0 | \n",
" 0 | \n",
" 7.2250 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" 27 | \n",
" 1 | \n",
" 19.000000 | \n",
" 3 | \n",
" 2 | \n",
" 263.0000 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" 28 | \n",
" 3 | \n",
" 29.699118 | \n",
" 0 | \n",
" 0 | \n",
" 7.8792 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" 29 | \n",
" 3 | \n",
" 29.699118 | \n",
" 0 | \n",
" 0 | \n",
" 7.8958 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
"
\n",
" \n",
" 861 | \n",
" 2 | \n",
" 21.000000 | \n",
" 1 | \n",
" 0 | \n",
" 11.5000 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" 862 | \n",
" 1 | \n",
" 48.000000 | \n",
" 0 | \n",
" 0 | \n",
" 25.9292 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" 863 | \n",
" 3 | \n",
" 29.699118 | \n",
" 8 | \n",
" 2 | \n",
" 69.5500 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" 864 | \n",
" 2 | \n",
" 24.000000 | \n",
" 0 | \n",
" 0 | \n",
" 13.0000 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 1.0 | \n",
"
\n",
"    <tr><th>865</th><td>2</td><td>42.000000</td><td>0</td><td>0</td><td>13.0000</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td></tr>\n",
"    <tr><th>866</th><td>2</td><td>27.000000</td><td>1</td><td>0</td><td>13.8583</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td></tr>\n",
"    <tr><th>867</th><td>1</td><td>31.000000</td><td>0</td><td>0</td><td>50.4958</td><td>0.0</td><td>1.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td></tr>\n",
"    <tr><th>868</th><td>3</td><td>29.699118</td><td>0</td><td>0</td><td>9.5000</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td></tr>\n",
"    <tr><th>869</th><td>3</td><td>4.000000</td><td>1</td><td>1</td><td>11.1333</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td></tr>\n",
"    <tr><th>870</th><td>3</td><td>26.000000</td><td>0</td><td>0</td><td>7.8958</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td></tr>\n",
"    <tr><th>871</th><td>1</td><td>47.000000</td><td>1</td><td>1</td><td>52.5542</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td></tr>\n",
"    <tr><th>872</th><td>1</td><td>33.000000</td><td>0</td><td>0</td><td>5.0000</td><td>0.0</td><td>1.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td></tr>\n",
"    <tr><th>873</th><td>3</td><td>47.000000</td><td>0</td><td>0</td><td>9.0000</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td></tr>\n",
"    <tr><th>874</th><td>2</td><td>28.000000</td><td>1</td><td>0</td><td>24.0000</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td></tr>\n",
"    <tr><th>875</th><td>3</td><td>15.000000</td><td>0</td><td>0</td><td>7.2250</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td></tr>\n",
"    <tr><th>876</th><td>3</td><td>20.000000</td><td>0</td><td>0</td><td>9.8458</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td></tr>\n",
"    <tr><th>877</th><td>3</td><td>19.000000</td><td>0</td><td>0</td><td>7.8958</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td></tr>\n",
"    <tr><th>878</th><td>3</td><td>29.699118</td><td>0</td><td>0</td><td>7.8958</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td></tr>\n",
"    <tr><th>879</th><td>1</td><td>56.000000</td><td>0</td><td>1</td><td>83.1583</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td></tr>\n",
"    <tr><th>880</th><td>2</td><td>25.000000</td><td>0</td><td>1</td><td>26.0000</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td></tr>\n",
"    <tr><th>881</th><td>3</td><td>33.000000</td><td>0</td><td>0</td><td>7.8958</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td></tr>\n",
"    <tr><th>882</th><td>3</td><td>22.000000</td><td>0</td><td>0</td><td>10.5167</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td></tr>\n",
"    <tr><th>883</th><td>2</td><td>28.000000</td><td>0</td><td>0</td><td>10.5000</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td></tr>\n",
"    <tr><th>884</th><td>3</td><td>25.000000</td><td>0</td><td>0</td><td>7.0500</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td></tr>\n",
"    <tr><th>885</th><td>3</td><td>39.000000</td><td>0</td><td>5</td><td>29.1250</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td></tr>\n",
"    <tr><th>886</th><td>2</td><td>27.000000</td><td>0</td><td>0</td><td>13.0000</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td></tr>\n",
"    <tr><th>887</th><td>1</td><td>19.000000</td><td>0</td><td>0</td><td>30.0000</td><td>1.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td></tr>\n",
"    <tr><th>888</th><td>3</td><td>29.699118</td><td>1</td><td>2</td><td>23.4500</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td></tr>\n",
"    <tr><th>889</th><td>1</td><td>26.000000</td><td>0</td><td>0</td><td>30.0000</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td></tr>\n",
"    <tr><th>890</th><td>3</td><td>32.000000</td><td>0</td><td>0</td><td>7.7500</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td></tr>\n",
"<p>891 rows × 20 columns</p>"
],
"text/plain": [
" Pclass Age SibSp Parch Fare Sex_female Sex_male Cabin_A \\\n",
"0 3 22.000000 1 0 7.2500 0.0 1.0 0.0 \n",
"1 1 38.000000 1 0 71.2833 1.0 0.0 0.0 \n",
"2 3 26.000000 0 0 7.9250 1.0 0.0 0.0 \n",
"3 1 35.000000 1 0 53.1000 1.0 0.0 0.0 \n",
"4 3 35.000000 0 0 8.0500 0.0 1.0 0.0 \n",
"5 3 29.699118 0 0 8.4583 0.0 1.0 0.0 \n",
"6 1 54.000000 0 0 51.8625 0.0 1.0 0.0 \n",
"7 3 2.000000 3 1 21.0750 0.0 1.0 0.0 \n",
"8 3 27.000000 0 2 11.1333 1.0 0.0 0.0 \n",
"9 2 14.000000 1 0 30.0708 1.0 0.0 0.0 \n",
"10 3 4.000000 1 1 16.7000 1.0 0.0 0.0 \n",
"11 1 58.000000 0 0 26.5500 1.0 0.0 0.0 \n",
"12 3 20.000000 0 0 8.0500 0.0 1.0 0.0 \n",
"13 3 39.000000 1 5 31.2750 0.0 1.0 0.0 \n",
"14 3 14.000000 0 0 7.8542 1.0 0.0 0.0 \n",
"15 2 55.000000 0 0 16.0000 1.0 0.0 0.0 \n",
"16 3 2.000000 4 1 29.1250 0.0 1.0 0.0 \n",
"17 2 29.699118 0 0 13.0000 0.0 1.0 0.0 \n",
"18 3 31.000000 1 0 18.0000 1.0 0.0 0.0 \n",
"19 3 29.699118 0 0 7.2250 1.0 0.0 0.0 \n",
"20 2 35.000000 0 0 26.0000 0.0 1.0 0.0 \n",
"21 2 34.000000 0 0 13.0000 0.0 1.0 0.0 \n",
"22 3 15.000000 0 0 8.0292 1.0 0.0 0.0 \n",
"23 1 28.000000 0 0 35.5000 0.0 1.0 1.0 \n",
"24 3 8.000000 3 1 21.0750 1.0 0.0 0.0 \n",
"25 3 38.000000 1 5 31.3875 1.0 0.0 0.0 \n",
"26 3 29.699118 0 0 7.2250 0.0 1.0 0.0 \n",
"27 1 19.000000 3 2 263.0000 0.0 1.0 0.0 \n",
"28 3 29.699118 0 0 7.8792 1.0 0.0 0.0 \n",
"29 3 29.699118 0 0 7.8958 0.0 1.0 0.0 \n",
".. ... ... ... ... ... ... ... ... \n",
"861 2 21.000000 1 0 11.5000 0.0 1.0 0.0 \n",
"862 1 48.000000 0 0 25.9292 1.0 0.0 0.0 \n",
"863 3 29.699118 8 2 69.5500 1.0 0.0 0.0 \n",
"864 2 24.000000 0 0 13.0000 0.0 1.0 0.0 \n",
"865 2 42.000000 0 0 13.0000 1.0 0.0 0.0 \n",
"866 2 27.000000 1 0 13.8583 1.0 0.0 0.0 \n",
"867 1 31.000000 0 0 50.4958 0.0 1.0 1.0 \n",
"868 3 29.699118 0 0 9.5000 0.0 1.0 0.0 \n",
"869 3 4.000000 1 1 11.1333 0.0 1.0 0.0 \n",
"870 3 26.000000 0 0 7.8958 0.0 1.0 0.0 \n",
"871 1 47.000000 1 1 52.5542 1.0 0.0 0.0 \n",
"872 1 33.000000 0 0 5.0000 0.0 1.0 0.0 \n",
"873 3 47.000000 0 0 9.0000 0.0 1.0 0.0 \n",
"874 2 28.000000 1 0 24.0000 1.0 0.0 0.0 \n",
"875 3 15.000000 0 0 7.2250 1.0 0.0 0.0 \n",
"876 3 20.000000 0 0 9.8458 0.0 1.0 0.0 \n",
"877 3 19.000000 0 0 7.8958 0.0 1.0 0.0 \n",
"878 3 29.699118 0 0 7.8958 0.0 1.0 0.0 \n",
"879 1 56.000000 0 1 83.1583 1.0 0.0 0.0 \n",
"880 2 25.000000 0 1 26.0000 1.0 0.0 0.0 \n",
"881 3 33.000000 0 0 7.8958 0.0 1.0 0.0 \n",
"882 3 22.000000 0 0 10.5167 1.0 0.0 0.0 \n",
"883 2 28.000000 0 0 10.5000 0.0 1.0 0.0 \n",
"884 3 25.000000 0 0 7.0500 0.0 1.0 0.0 \n",
"885 3 39.000000 0 5 29.1250 1.0 0.0 0.0 \n",
"886 2 27.000000 0 0 13.0000 0.0 1.0 0.0 \n",
"887 1 19.000000 0 0 30.0000 1.0 0.0 0.0 \n",
"888 3 29.699118 1 2 23.4500 1.0 0.0 0.0 \n",
"889 1 26.000000 0 0 30.0000 0.0 1.0 0.0 \n",
"890 3 32.000000 0 0 7.7500 0.0 1.0 0.0 \n",
"\n",
" Cabin_B Cabin_C Cabin_D Cabin_E Cabin_F Cabin_G Cabin_None \\\n",
"0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"1 0.0 1.0 0.0 0.0 0.0 0.0 0.0 \n",
"2 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"3 0.0 1.0 0.0 0.0 0.0 0.0 0.0 \n",
"4 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"5 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"6 0.0 0.0 0.0 1.0 0.0 0.0 0.0 \n",
"7 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"8 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"9 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"10 0.0 0.0 0.0 0.0 0.0 1.0 0.0 \n",
"11 0.0 1.0 0.0 0.0 0.0 0.0 0.0 \n",
"12 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"13 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"14 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"15 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"16 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"17 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"18 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"19 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"20 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"21 0.0 0.0 1.0 0.0 0.0 0.0 0.0 \n",
"22 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"23 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n",
"24 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"25 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"26 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"27 0.0 1.0 0.0 0.0 0.0 0.0 0.0 \n",
"28 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"29 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
".. ... ... ... ... ... ... ... \n",
"861 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"862 0.0 0.0 1.0 0.0 0.0 0.0 0.0 \n",
"863 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"864 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"865 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"866 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"867 0.0 0.0 0.0 0.0 0.0 0.0 0.0 \n",
"868 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"869 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"870 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"871 0.0 0.0 1.0 0.0 0.0 0.0 0.0 \n",
"872 1.0 0.0 0.0 0.0 0.0 0.0 0.0 \n",
"873 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"874 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"875 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"876 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"877 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"878 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"879 0.0 1.0 0.0 0.0 0.0 0.0 0.0 \n",
"880 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"881 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"882 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"883 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"884 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"885 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"886 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"887 1.0 0.0 0.0 0.0 0.0 0.0 0.0 \n",
"888 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"889 0.0 1.0 0.0 0.0 0.0 0.0 0.0 \n",
"890 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \n",
"\n",
" Cabin_T Embarked_C Embarked_Missing Embarked_Q Embarked_S \n",
"0 0.0 0.0 0.0 0.0 1.0 \n",
"1 0.0 1.0 0.0 0.0 0.0 \n",
"2 0.0 0.0 0.0 0.0 1.0 \n",
"3 0.0 0.0 0.0 0.0 1.0 \n",
"4 0.0 0.0 0.0 0.0 1.0 \n",
"5 0.0 0.0 0.0 1.0 0.0 \n",
"6 0.0 0.0 0.0 0.0 1.0 \n",
"7 0.0 0.0 0.0 0.0 1.0 \n",
"8 0.0 0.0 0.0 0.0 1.0 \n",
"9 0.0 1.0 0.0 0.0 0.0 \n",
"10 0.0 0.0 0.0 0.0 1.0 \n",
"11 0.0 0.0 0.0 0.0 1.0 \n",
"12 0.0 0.0 0.0 0.0 1.0 \n",
"13 0.0 0.0 0.0 0.0 1.0 \n",
"14 0.0 0.0 0.0 0.0 1.0 \n",
"15 0.0 0.0 0.0 0.0 1.0 \n",
"16 0.0 0.0 0.0 1.0 0.0 \n",
"17 0.0 0.0 0.0 0.0 1.0 \n",
"18 0.0 0.0 0.0 0.0 1.0 \n",
"19 0.0 1.0 0.0 0.0 0.0 \n",
"20 0.0 0.0 0.0 0.0 1.0 \n",
"21 0.0 0.0 0.0 0.0 1.0 \n",
"22 0.0 0.0 0.0 1.0 0.0 \n",
"23 0.0 0.0 0.0 0.0 1.0 \n",
"24 0.0 0.0 0.0 0.0 1.0 \n",
"25 0.0 0.0 0.0 0.0 1.0 \n",
"26 0.0 1.0 0.0 0.0 0.0 \n",
"27 0.0 0.0 0.0 0.0 1.0 \n",
"28 0.0 0.0 0.0 1.0 0.0 \n",
"29 0.0 0.0 0.0 0.0 1.0 \n",
".. ... ... ... ... ... \n",
"861 0.0 0.0 0.0 0.0 1.0 \n",
"862 0.0 0.0 0.0 0.0 1.0 \n",
"863 0.0 0.0 0.0 0.0 1.0 \n",
"864 0.0 0.0 0.0 0.0 1.0 \n",
"865 0.0 0.0 0.0 0.0 1.0 \n",
"866 0.0 1.0 0.0 0.0 0.0 \n",
"867 0.0 0.0 0.0 0.0 1.0 \n",
"868 0.0 0.0 0.0 0.0 1.0 \n",
"869 0.0 0.0 0.0 0.0 1.0 \n",
"870 0.0 0.0 0.0 0.0 1.0 \n",
"871 0.0 0.0 0.0 0.0 1.0 \n",
"872 0.0 0.0 0.0 0.0 1.0 \n",
"873 0.0 0.0 0.0 0.0 1.0 \n",
"874 0.0 1.0 0.0 0.0 0.0 \n",
"875 0.0 1.0 0.0 0.0 0.0 \n",
"876 0.0 0.0 0.0 0.0 1.0 \n",
"877 0.0 0.0 0.0 0.0 1.0 \n",
"878 0.0 0.0 0.0 0.0 1.0 \n",
"879 0.0 1.0 0.0 0.0 0.0 \n",
"880 0.0 0.0 0.0 0.0 1.0 \n",
"881 0.0 0.0 0.0 0.0 1.0 \n",
"882 0.0 0.0 0.0 0.0 1.0 \n",
"883 0.0 0.0 0.0 0.0 1.0 \n",
"884 0.0 0.0 0.0 0.0 1.0 \n",
"885 0.0 0.0 0.0 1.0 0.0 \n",
"886 0.0 0.0 0.0 0.0 1.0 \n",
"887 0.0 0.0 0.0 0.0 1.0 \n",
"888 0.0 0.0 0.0 0.0 1.0 \n",
"889 0.0 1.0 0.0 0.0 0.0 \n",
"890 0.0 0.0 0.0 1.0 0.0 \n",
"\n",
"[891 rows x 20 columns]"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"X"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>Pclass</th><th>Age</th><th>SibSp</th><th>Parch</th><th>Fare</th><th>Sex_female</th><th>Sex_male</th><th>Cabin_A</th><th>Cabin_B</th><th>Cabin_C</th><th>Cabin_D</th><th>Cabin_E</th><th>Cabin_F</th><th>Cabin_G</th><th>Cabin_None</th><th>Cabin_T</th><th>Embarked_C</th><th>Embarked_Missing</th><th>Embarked_Q</th><th>Embarked_S</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>3</td><td>22.000000</td><td>1</td><td>0</td><td>7.2500</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td></tr>\n",
"    <tr><th>1</th><td>1</td><td>38.000000</td><td>1</td><td>0</td><td>71.2833</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td></tr>\n",
"    <tr><th>2</th><td>3</td><td>26.000000</td><td>0</td><td>0</td><td>7.9250</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td></tr>\n",
"    <tr><th>3</th><td>1</td><td>35.000000</td><td>1</td><td>0</td><td>53.1000</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td></tr>\n",
"    <tr><th>4</th><td>3</td><td>35.000000</td><td>0</td><td>0</td><td>8.0500</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td></tr>\n",
"    <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"    <tr><th>886</th><td>2</td><td>27.000000</td><td>0</td><td>0</td><td>13.0000</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td></tr>\n",
"    <tr><th>887</th><td>1</td><td>19.000000</td><td>0</td><td>0</td><td>30.0000</td><td>1.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td></tr>\n",
"    <tr><th>888</th><td>3</td><td>29.699118</td><td>1</td><td>2</td><td>23.4500</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td></tr>\n",
"    <tr><th>889</th><td>1</td><td>26.000000</td><td>0</td><td>0</td><td>30.0000</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td></tr>\n",
"    <tr><th>890</th><td>3</td><td>32.000000</td><td>0</td><td>0</td><td>7.7500</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>1.0</td><td>0.0</td></tr>\n",
"  </tbody>\n",
"</table>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from IPython.display import display, HTML\n",
"\n",
"# Look at all the columns in the dataset by rendering the DataFrame as an HTML table;\n",
"# max_rows limits how many rows are displayed\n",
"def printall(X, max_rows=10):\n",
"    display(HTML(X.to_html(max_rows=max_rows)))\n",
"\n",
"printall(X)"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"C-stat: 0.863521128261\n"
]
}
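],
"source": [
"# Fit a forest on the binary target; the regressor's out-of-bag predictions are continuous scores\n",
"model = RandomForestRegressor(n_estimators=100, oob_score=True, n_jobs=-1, random_state=42)\n",
"model.fit(X, y)\n",
"# C-stat (area under the ROC curve) computed from the out-of-bag predictions\n",
"print(\"C-stat: \", roc_auc_score(y, model.oob_prediction_))"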
]
},
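{
"cell_type": "markdown",
"metadata": {},
"source": [
"The C-stat printed above is the area under the ROC curve, scored on out-of-bag (OOB) predictions: each row is predicted only by the trees whose bootstrap sample left that row out, so the score behaves like a validation estimate without a separate hold-out set. The cell below is a minimal sketch of the same calculation on synthetic data. The names `X_demo`, `y_demo`, `demo_model`, `oob_scores`, and `demo_c_stat`, along with the `make_classification` settings, are illustrative assumptions and not part of the Titanic analysis."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"from sklearn.datasets import make_classification\n",
"from sklearn.ensemble import RandomForestRegressor\n",
"from sklearn.metrics import roc_auc_score\n",
"\n",
"# Synthetic binary-outcome data (an assumption; it only stands in for the Titanic features)\n",
"X_demo, y_demo = make_classification(n_samples=500, n_features=10,\n",
"                                     n_informative=5, random_state=0)\n",
"\n",
"# Fit a regressor on the 0/1 target so that each prediction is a score between 0 and 1\n",
"demo_model = RandomForestRegressor(n_estimators=100, oob_score=True, random_state=0)\n",
"demo_model.fit(X_demo, y_demo)\n",
"\n",
"# oob_prediction_[i] averages only the trees whose bootstrap sample did not contain row i\n",
"oob_scores = demo_model.oob_prediction_\n",
"\n",
"# The C-stat is the ROC AUC of those out-of-bag scores against the true labels\n",
"demo_c_stat = roc_auc_score(y_demo, oob_scores)\n",
"print('OOB C-stat on synthetic data:', demo_c_stat)"
]
},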
{
"cell_type": "markdown",
"metadata": {},
"source": [
"An out-of-bag C-stat of about 0.86 makes this a pretty good baseline model. Before we try different parameters for the model, let's use the Random Forest to help us with some exploratory data analysis (EDA)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Variable importance measures"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([ 9.11384671e-02, 2.38891052e-01, 4.43567267e-02,\n",
" 2.15831071e-02, 2.15047796e-01, 1.43423437e-01,\n",
" 1.58822440e-01, 2.95342368e-03, 3.79055011e-03,\n",
" 6.47116172e-03, 4.30998991e-03, 8.59480266e-03,\n",
" 1.02403226e-03, 8.12054428e-04, 2.67741854e-02,\n",
" 6.64265010e-05, 1.06189189e-02, 0.00000000e+00,\n",
" 6.00379221e-03, 1.53176370e-02])"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model.feature_importances_"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAgEAAAFrCAYAAABIYVrAAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3XucXVV9///XGwQDQcALTKTfrxMrFKRRSQiChcIBxYp+\nS7mqXKpWufQnKlZsbdFKgkrsRbQF+SpI02oUgQpt5SYiHC+ESy4kBEKU/AigrRlqqzUG5JK8v3+c\nNeEwnDMzZ+YMc9nv5+NxmH1Za+219znkfM5aa+8l20RERET1bDXeFYiIiIjxkSAgIiKiohIERERE\nVFSCgIiIiIpKEBAREVFRCQIiIiIq6nnjXYEYnKTcwxkREc9iW6MtIy0Bk4DtvEbwOuecc8a9DpPx\nleuW65brNvFf3ZIgICIioqISBERERFSUutmsUHWSjgKuAvay/aMulZk3KCJiCunp6WX9+gdHVYYk\n3IUxAQkCukjS14GXAjfbnt+lMg15jyIipg6Nul+/W0FAugO6RNJ04EDgPcAJZZskXSRptaRvSbpW\n0jFl3xxJdUlLJF0vqWccqx8RERWUIKB7/gC4wfZa4GeSZgPHAC+zvTfwDuB1AJKeB1wAHGt7P2Ah\ncN74VDsiIqoqzwnonhOAz5Xly4ETaVzfKwFs90m6pezfE5gFfFuSaARj/9G+6HlNy7XyioiIqqjX\n69Tr9a6XmzEBXSDphcBPgEdodOBvXf5eDayw/U8l3TeArwI/Ar5o+8BhlJ0xARERU0rGBEw1xwNf\ntv1y279puxdYB/wcOLaMDejh6Z/wPwR2kXQANLoHJO09HhWPiIjqShDQHW+j8au/2TeAHhotBPcC\nXwaWAf9j+0ngOOCvJK0A7qKMF4iIiHiupDtgjEmabnujpBcBdwAH2n6kg/zpDoiImFImTndABgaO\nvWsk7QxsA5zbSQDwtFG/zxERMUH09PSOdxW2SEvABCfJeY8iIqJZBgZGRETEqCQIiIiIqKgEARER\nERWVICAiIqKiEgRERERUVG4RHAVJm4CVNO7hM3CU7YfHt1YRERHDk1sER0HSL23vOIJ8W9veNMy0\neYMiIkagp6eX9esfHO9qjIk8LGhieNYbIKkX+Aqwfdn0Ptu3SzoE+ASN+QT2BPaSdBLwARoPEroD\neG/rhwIkDoiI6FRfXx60NpQEAaOznaTlNIKBB2wfC/QBb7D9hKTdgcuA/Ur62cBv235Y0l405hz4\nHdubJH0eOAlY9NyfRkREVFGCgNF51PacAdu2BS6UtA+wCdijad+dTWMGXg/MAZZIEjCNRgARERHx\nnEgQ0H1/Aqy3/WpJWwOPNe3b2LQs4J9sf3ToIuc1Ldd4ekbiiIiognq9Tr1e73q5GRg4CpI22H7B\ngG3nAz+2/VlJfwR8yfbWZUzAWbaPLOleCfwLcJDt/5T0QuAFA+8uyCyCEREjNfrZ+iaqzB0wMbT6\ndF0EvEvSXcBv8cxf/09ntO8DPgbcKGklcCMwY6wqGhERMVBaAia4tARERIxUWgKGkjEBk0Juc4mI\n6FRPT+94V2HCSxAwCUzVSDYiIsZXxgRERERUVIKAiIiIikoQEBERUVEJAiIiIioqQUBERERFJQiI\niIioqAQBERERFTVlnxMg6aPACTRm8tsEnG57yfjWqvV8A8PIM1bViYhx1tPTy/r1D453NaKipmQQ\nIOkA4M3APrafkvQiGlP8TgQjePJPHhYUMVX19SXIj/EzVbsDXgr8zPZTALb/2/Z6SXMk1SUtkXS9\npB5JW0u6U9LBAJIWSPpEu4IlrZN0nqS7Sr7Zkm6QdL+k00ua6ZJukrRU0kpJR7Yp68OljBWSzhmD\n6xAREdHWVA0CbgReJmmNpM9LOljS84ALgGNt7wcsBM6zvQl4F/B/Jb0eeCMwf4jyH7Q9G/hBKecY\n4HVN+X4NHGV7LnAY8JmBBUg6HNjD9muB2cBc
SQeN6qwjIiI6MCW7A2xvlDQH+F0aX8JfBz4FzAK+\nrUYn+1bAT0v61ZIWAdcA+/e3IAzim+XvKmC67UeBRyX9WtKOwKPAgtK6sBnYTdKuth9pKuONwOGS\nltOYIWg6sAeNwGKAeU3LtfKKiIiqqNfr1Ov1rpdbiamEJR0LnAE83/aBbdJ8jca36zttf3uQstYB\n+9r+b0nvLMsfKPseAOYCvw+8CTjJ9uaS5xDbD0v6pe0dJf0t8EPblwxR90wlHDGlTd3pbmPsdGsq\n4SnZHSDptyTt3rRpH2A1sEsZNIik50nauywfA7wQOBi4sPyaH9Ghy9+dgEdKAHAo0NsizbeAd0ua\nXuqwm6RdRnjciIiIjk3J7gBgB+ACSTsBTwFrgdOAi5u2bw18TlIfcB5wmO3/kHQB8HfAH7Upe7CQ\nvX/fV4FvSloJLAXuG5jG9rcl7QXcVm4B3ACcDPxnpycbERExEpXoDpjMGt0BETFV5TkBMRLd6g6Y\nqi0BU0oCtYiIGAsJAtqQdBUws3+VRjP+RwYbNBgRETGZpDtggpPkvEcREdEsdwdERETEqCQIiIiI\nqKgEARERERWVICAiIqKiJt3dAZI+CpwAbCqv020v6UK5BwFfAJ4AXmf78dGW2eIYhwAftv37Hebr\ndlUiYphyH39MZZMqCCiP/H0zsI/tpyS9CNi2S8WfRGNWwa91qbx2RjDUP3cHRIyXvr4E4TF1Tbbu\ngJcCP+uf5c/2f9teL2mOpLqkJZKul9QjaWtJd5aZ/JC0QNInWhUq6T3AW4FPSPpK2fbhkn+FpHPK\ntl5J90laKOmHkhZJer2kH5T1uSXdfpIWS1pW9u3R4pjbS7pU0u0lXUetAxEREaM1qZ4TUCbb+QGw\nHfAd4HJgMfBd4Ejb/yXprcDv2X5PmSDoSuADwF8zyDTBkhYC37R9laTDgeNsn16mHf434K+AHwP3\n02iJWC1pKbDC9imSjgT+yPbRknYAHi0TCL0e+P9sH1e6A86yfaSkTwH32v5amcvgzlLuYwPqlVkE\nI8ZVZvmLiaeSjw22vVHSHOB3gcOArwOfAmYB3y5f2FsBPy3pV0taBFzDIAFAC28EDpe0nMbTAqcD\ne9AIAtbZXl3S3UsjGAFYxdOzBe4MfLm0AJjW1/mNwO9L+tOyvi3wMuCHw6xjRETEqEyqIACgPD7v\ne8D3JK0CzgDusX1gmyyvAn4O9HRwGAELbF/yjI1SL9A8YHBz0/pmnr6enwButn1MyXNLm2Mca/v+\noaszr2m5Vl4REVEV9Xqder3e9XInVRAg6beAzbbXlk37AKuBN0o6wPbtkp4H/FZpBTgGeCFwMHCt\npP1s/3IYh/oWcK6kr5XWh92AJ/urMYz8OwH/XpbbTUn8LRrdFO8v57aP7RWtk84bxiEjImKqqtVq\n1Gq1Levz58/vSrmTbWDgDsA/SbpH0grglcDHgeOAvyrb7gJeJ+nFwHnAe0rQcAHwd4OUvaXTr0wS\n9DXgNkl30xhXsMPAdLTvrP9r4NOSltH+Gn8C2EbS3aVF49xB6hYREdF1k2pgYBVlYGDEeMvAwJh4\nKjkwsLpyn3LEeOnp6R06UcQkVbkgQNJVwMz+VRo/sz9SugAmpPwKiYiIsZDugAlOkvMeRUREs251\nB0y2gYERERHRJQkCIiIiKipBQEREREUlCIiIiKioBAEREREVVblbBIdL0iZgJbANjUcTv9P2r9uk\nPQfYYPv8MarLWBQbU1xPTy/r1z843tWIiAksQUB7G23PASgzEf4x8LnxqUpuEYzO9fUleIyIwaU7\nYHi+D+wOIOkdklZKukvSPw1MKOkUSXeW/VdKmla2Hy9pVdleL9v2lnSHpOWSVkh6xXN5UhERUW1p\nCWhPAGVWwiOA6yXtDXwUOMD2zyXt3CLfN2x/qeT9BPAe4PPAXwJvtP1TSTuWtH8MfM72ZeU4W4/t\nKUVERDwt
QUB720laXpa/B1xK40v7Cts/B7D9ixb5Xl2+/HcGptOYMhjgBzRmQLwCuKpsuw34qKT/\nBVzdNEXyAPOalmvlFRERVVGv16nX610vN48NbkPSL23vOGDb+4Ae2385YPuWgYGSHgCOtH2PpHcC\nh9h+d0m3H/B/gHcAc0prwsvLtvcDp9muDyg7swjGCGX2u4ipKo8NHnutLu7NwPGSXgQg6YUt0uwA\nrJe0DXDSlsKk37S9xPY5wCPA/5b0ctvrbF8A/Cvw6q6fRURERBvpDmjvWT+hbK+W9Cngu5KeAu4C\n3j0g2ceBO2l80d8BvKBs/xtJe5Tlm2zfLekjkv4QeBL4KfCpMTiPiIiIltIdMMGlOyBGLt0BEVNV\nt7oD0hIwKeR+7+hcT0/veFchIia4BAGTQH7NRUTEWMjAwIiIiIpKEBAREVFRCQIiIiIqKkFARERE\nRSUIiIiIqKgEAcMk6aOS7ikzCC6X9FpJF0vaq+zf0Cbf/pJuL7MH3ivp489tzSMiIlrLLYLDIOkA\n4M3APrafKo8N3tb2aU3J2t3H90/AcWUuAQF7juD4Hdd5quvp6WX9+gfHuxoREZNaWgKG56XAz2w/\nBWD7v22vl3SLpDkljSSdX1oLvi3pxWX7LkBfyWfba0ricyR9WdJiST+UdEr7wzuvAa++vofaX66I\niBiWBAHDcyPwMklrJH1e0sEt0kwH7rQ9i8bUw+eU7Z8DfijpG5JOk/T8pjyvojEv8O8AH5c0Y+xO\nISIi4pkSBAyD7Y3AHOA04D+Br5dpgpttAq4oy4uAg0reTwD70ggkTgSub8rzr7afsP1fNGYofO2Y\nnURERMQAGRMwTG48u/d7wPckrQLeyeAz+2zZZ3sd8EVJXwL+s2kK4ub8al/evKblWnlFRERV1Ot1\n6vV618vNLILDIOm3gM2215b1TwA7AbOAD9teLmkz8HbbV0j6GLCL7TMlvdn2dSXfK4HvAj00phz+\nA+AAGtMNLwMOsL1+wLEzi2BLmSEvIqorswg+t3YALpC0E/AUsJZG18A/N6X5FfBaSX9JYyDg28r2\nP5R0PvBoyXuibZcR/3cDdeDFwLkDA4CIiIixlJaAcSLpHGCD7fOHSJeWgJbSEhAR1ZWWgErJcwIG\n6unpHe8qRERMemkJmOAkOe9RREQ061ZLQG4RjIiIqKgEARERERWVICAiIqKiEgRERERUVIKAiIiI\nikoQEBERUVGTJgiQ1CPpMkn3S1oi6RpJu7dJ21ue799q38WS9hrB8edJ2ijpJU3bNnRaTkRExEQx\naYIA4GrgZtt72N4P+Asaz+Bvp+XN9bZPs71mBMc3jRkEzxrqGN0macK/ZsyY+VxcioiI6KJJEQRI\nOhR4wvYl/dtsrwJWSLpJ0lJJKyUd2ZRtG0mLJK2WdIWkaaWsWyTNKcsbJH1S0gpJiyXtMkRVFgJv\nk7Rzizp+SNIqSXdLOrNs6y3Hv1jSPZJukPT8su83JV1fWjW+WyYpasMT/tXX99AQly4iIiaaSREE\n0Jitb1mL7Y8BR9meCxwGfKZp357Ahbb3BjYA722Rfzqw2PY+wPeBU4eoxwbgH4APlnUBSNqXxtTC\n+wGvA06V9JqSZnfgAtuzgP8Bji3bLwbeV1o1/hT4v0McOyIioqsmSxDQzlbAAkkrgZuA3STtWvY9\nbPv2srwIOKhF/sf7p/mlEWTMHMYxLwDeIWkHnu4OOBC42vavbW8ErgJ+t+xbV1otthxD0nTgd4Ar\nJd0FfJHBuzYiIiK6brJMIHQvcFyL7ScBLwFm294saR0wrewb2F/fqv/+yablTQzjetj+H0lfA84Y\nstYNjw84xjQawcvPbc8ZXhHzmpZr5RUREVVRr9ep1+tdL3dSBAG2b5b0KUmn2P4SgKRXAb3AIyUA\nOLSs9+uVtL/tO4ATaTT3DzTSyRc+Cyzh6ev3fWChpE8DWwNHAye3O4btDZ
LWSTrO9j+X83m17btb\nH27eCKsZERFTQa1Wo1arbVmfP39+V8qdTN0BRwOHS1pbbv87D7gW2K90B5wM3NeUfg1whqTVwM7A\nF8r25haBEY3ut/1fNO5W2Las3wX8I43A4DbgYtsrhzjGycB7yqDEe4Aj26SLiIgYE5lKeIKT5Ofo\nTsRREvksRUQ8N7o1lfCk6A6IUb/PY66np3foRBERMaEkCBhA0tnA8TR+fqv8vdL2gvGqU35hR0TE\nWEh3wAQnyXmPIiKiWbe6AybTwMCIiIjoogQBERERFZUgICIioqISBERERFRUgoCIiIiKShAwDJI2\nSVpepgq+vH9a4lGW+U5JFwwz7bi/ZsyYOdpTjoiICSZBwPBstD3H9qtoTDr0x8PNKGmwazzMe/88\n7q++voeGV9WIiJg0EgR07vvA7gCSrpa0pLQQnNKfQNIGSX9bpgk+QNJcSbeWeQJuL1MJA/yGpOsl\n/VDSX43DuURERIXliYHDIwBJzwOOAK4v2//I9i9K98ASSd+w/XNgOnCb7Q9L2obGZEbH214uaQfg\n1yX/a4B9aLQu/FDS39v+9+fwvCIiosLSEjA820laDtwJPARcWrZ/UNIK4HbgfwF7lO1PAVeV5T2B\n/7C9HMD2r2xvKvu+U9YfB1bzzKmQIyIixlRaAobnUdtzmjdIOgQ4DNjf9uOSbgH6Bwz+esCzfts9\n2vHxpuVNtH0/5jUt18orIiKqol6vU6/Xu15ugoDhafUlvhPw8xIA7AUc0Cb9D4EZkva1vax0BzzW\n2eHndZY8IiKmlFqtRq1W27I+f/78rpSbIGB4Wo3ivwH4Y0n30viiv61VettPSnobcKGk7YBHgTcM\n8xgRERFjJrMITnCSPDHiA2VK44iICaJbswimJWBSGPX7PGo9PRmzGBEx1SQImATyCzwiIsZCbhGM\niIioqAQBERERFZUgICIioqISBERERFRUgoCIiIiKShAQERFRURM+CJC0SdJySXeVv3/WQd5DJH1z\nlMe/RdKcoVN2fnxJu0r6Zpli+F5J17RJN2avGTNmjuTUIiJiCpgMzwnYOHDyng6N+CZ7Sd0IkgY7\n/rnAjbYvKMeb1XkRo9PXN/4PIoqIiPEx4VsCaPO4PEnrJJ1XWgjulDRb0g2S7pd0WlPSnSRdI2mN\npIua8l9U8q2SdM6Acj8taSlwfNN2SVoo6dyyfrikxZKWSrpc0vZl+5sk3VfyHzPEub0U+En/iu17\nOrguERERozIZgoDtBnQHHN+070Hbs4EfAAtpfOm+jsYv7H77AWcArwR2l9T/xXy27dcCrwFqA36F\n/8z2XNuXl/VtgK8CP7L9cUkvBj4GvN72XGAZ8CFJzwcuBt5Sts8Y4tw+D/yDpO9IOlvSSzu5MBER\nEaMxGboDHh2kO6C/v30VMN32o8Cjkn4tacey707bDwFIugw4CLgKeLukU2lcgxnA3kD/L/H+L/9+\nXwQut72grB9Q0t8qSTSChNuAvYAHbD9Q0i0CTm13YrZvlPRy4E3Am4HlkmbZ/q9nppzXtFwrr4iI\nqIp6vU69Xu96uZMhCBjM4+Xv5qbl/vX+cxvYoW5JM4GzgH1t/1LSQmBaU5qNA/LcChwq6Xzbj9Po\norjR9knNiSS9hg5n+7H9C+DrwNfLIMKDgaufmWpeJ0VGRMQUU6vVqNVqW9bnz5/flXInQ3fASEau\nNefZX1JvGeT3NhpdBzsCvwI2SOoBjhiivEuB64ErSjm3AwdKegWApO0l7QGsAXrLr3uAEwatpHSo\npO3K8guAVwAPd3CeERERIzYZWgKmSVpO44vdwA22z2bwIfPN++4ELgR2B262fTWApBXAfcCPaQQG\nrfJuWbf9WUk7AV+xfZKkdwGXlXEABj5m+35JpwPXSdoIfB/YYZB67gtcKOlJGgHZxbaXDZI+IiKi\na5Rpaic2SWP6BvX09LJ+/YNjeYiIiO
gySdge9T3ek6EloPISqEVExFhIEPAcKF0HZ/LMroZbbb9/\nfGoUERGR7oAJT5LzHkVERLNudQdMhrsDIiIiYgwkCIiIiKioBAEREREVlSAgIiKiohIEREREVNSE\nDgIkbRowg+CfdZD3kPIs/tEc/xZJ7SYvGvXxJR0haYmkeyQtk/Q3bdKN6jVjxsyRnEJERExxE/05\nARsHmUFwOEZ8b12ZI2C02h6/TF18AXBEedywgNM6LGZY+vpGfRdJRERMQRO6JYA2kwdJWifpvNJC\ncKek2ZJukHS/pOYv0p0kXSNpjaSLmvJfVPKtknTOgHI/LWkpcHzTdklaKOncsn64pMWSlkq6XNL2\nZfubJN1X8h8zxLn9KfBJ2/cDuOGLHV6fiIiIEZvoQcB2A7oDjm/a96Dt2TQm/1lI40v3dcC5TWn2\nA84AXgnsLqn/i/ls268FXgPUyq/yfj+zPdf25WV9G+CrwI9sf1zSi4GPAa+3PRdYBnyoTCR0MfCW\nsn3GEOc2q+SNiIgYFxO9O+DRQboD+vvbVwHTbT8KPCrp15J2LPvutP0QgKTLgIOAq4C3SzqVxvnP\nAPYG7il5+r/8+30RuNz2grJ+QEl/a2nC3wa4DdgLeMD2AyXdIuDUkZz0s81rWq6VV0REVEW9Xqde\nr3e93IkeBAzm8fJ3c9Ny/3r/eT1rWmBJM4GzgH1t/1LSQmBaU5qNA/LcChwq6Xzbj9PoorjR9knN\niSS9hjbdF23cA8ylEcQMYV4HxUZExFRTq9Wo1Wpb1ufPn9+Vcid6d8BIRrQ159lfUm8Z5Pc2Gl0H\nOwK/AjZI6gGOGKK8S4HrgStKObcDB0p6BYCk7SXtAawBeiW9vOQ7YYhy/xb4i5IXSVtJOn3YZxkR\nETFKE70lYJqk5TS+2A3cYPtsBh8u37zvTuBCYHfgZttXA0haAdwH/JhGYNAq75Z125+VtBPwFdsn\nlVkBLyvjAAx8rIzwPx24TtJG4PvADm0raa+S9MFSznalnGsGOa+IiIiuyiyCE5ykUb9BPT29rF//\nYBdqExERE0G3ZhGc6C0BASRQi4iIsZAgYIyVroMzeWZXw6223z8+NYqIiGhId8AEJ8l5jyIiolm3\nugMm+t0BERERMUYSBERERFRUgoCIiIiKShAQERFRUZUNAiT1SLqszDy4pMw2uHubtL2SWj7eV9LF\nkvYawfHPkfSTARMk7Th0zoiIiO6o8i2CVwMLbZ8AIOlVQA+wtk36lkP0bZ/WavswnW/7/KESNeYp\n6kweEBQREUOpZEuApEOBJ2xf0r/N9ipghaSbJC2VtFLSkU3ZtpG0SNJqSVdImlbKukXSnLK8QdIn\nJa2QtFjSLkNVZXg1dsevvr6Hhld0RERUViWDAGAWsKzF9seAo2zPBQ4DPtO0b0/gQtt7AxuA97bI\nPx1YbHsfGnMHDDWV8J80dQd8p9OTiIiIGI2qBgHtbAUskLQSuAnYTdKuZd/Dtm8vy4uAg1rkf9z2\ndWV5GTBziOOdb3uO7dm2Xz/KukdERHSkqmMC7gWOa7H9JOAlwGzbmyWtA6aVfS1nGBzgyablTXTt\n+s5rWq6VV0REVEW9Xqder3e93EoGAbZvlvQpSafY/hJsGRjYCzxSAoBDy3q/Xkn7274DOJFGc/9A\nnY7gG2b6eR0WGxERU0mtVqNWq21Znz9/flfKrXJ3wNHA4ZLWltv/zgOuBfYr3QEnA/c1pV8DnCFp\nNbAz8IWyvblFoNOH/H9wwC2CLxvRmURERIxAJhCa4CS589gCQJmCOCJiiurWBEKV7A6YfEb2nICI\niIjBJAgYY5LOBo6n8XNe5e+VthcMt4z8oo+IiLGQ7oAJTpLzHkVERLNudQdUeWBgREREpSUIiIiI\nqKgEARERERWVICAiIqKiEgRERERUVGWDAEk9ki6TdL+kJZKukbR7m7S95amCrfZdLGmvEdbhHZJW\nlW
mLl0n6UJt0w3rNmDFzJNWIiIiKqvJzAq4GFto+AbbMHdADrG2TvuV9erZPG8nBJR0BfAB4g+0+\nSdsA7+jg0M/S1zfqu0UiIqJCKtkSUCYHesL2Jf3bbK8CVki6SdLS8uv8yKZs20haJGm1pCskTStl\n3SJpTlneIOmTklZIWixpl0Gq8efAWbb7yvGftH1p1082IiKijUoGAcAsYFmL7Y8BR9meCxwGfKZp\n357Ahbb3BjYA722Rfzqw2PY+NGYZPHWIOiwfQd0jIiK6oqpBQDtbAQvKLII3AbtJ2rXse9j27WV5\nEXBQi/yP276uLC8DZg5yrDwGMCIixlVVxwTcCxzXYvtJwEuA2bY3S1oHTCv7Bn5pt/oSf7JpeROD\nX997gX2B+tDVnde0XCuviIioinq9Tr1e73q5lZ07QNJtwKW2v1TWXwUcDbzY9pll3MB3aPyaF7AO\neJ3tOyRdAtxr+3OSbqHRt79c0gbbLyjlHQu8xfa72xz/COBc4P+UgYHbAn84cFxAZ1MJZ/rgiIgq\nyNwBo3c0cLikteX2v/OAa4H9SnfAycB9TenXAGdIWg3sDHyhbG/+1h32N7Dt64ELgZvK8ZcCLxjp\nyURERHSqsi0Bk0VaAiIiYqButQRUdUzAJDO897mnp3eM6xEREVNJgoAxJuls4HgaP+dV/l5pe8Fw\ny8iv+4iIGAvpDpjgJDnvUURENMvAwIiIiBiVBAEREREVlSAgIiKiohIEREREVFSCgIiIiIqasEGA\npE2Slku6q/z9sw7yHiLpm6M8/pYpgkeQd8jjSzqqTFe8WtLd5THD7dIO+ZoxY+ZIqhoRERU2kZ8T\nsNH2iL6EixHfVyepG8FR2+NLeg3w18AbbD8saSaNxwc/YPuuDoraoq9v1HeKRERExUzYlgDaPCZP\n0jpJ55UWgjslzZZ0g6T7JZ3WlHQnSddIWiPpoqb8F5V8qySdM6DcT0taSuPhPv3bJWmhpHPL+uGS\nFktaKulySduX7W+SdF/Jf8wQ53YWcJ7thwFsP0hj7oIPd3KBIiIiRmMiBwHbDegOOL5p34O2ZwM/\nABbS+NJ9HY1Z+frtB5wBvBLYXVL/F/PZtl8LvAaoSZrVlOdntufavrysbwN8FfiR7Y9LejHwMeD1\ntucCy4APSXo+cDGNWQPnAjOGOLffLnmbLS11jYiIeE5M5O6ARwfpDujvb18FTLf9KPCopF9L2rHs\nu9P2QwCSLgMOAq4C3i7pVBrnPgPYG7in5On/8u/3ReDypkf8HlDS3ypJNIKE24C9gAdsP1DSLQJO\nHclJtzbKjmQOAAAWoUlEQVSvablWXhERURX1ep16vd71cidyEDCYx8vfzU3L/ev95zSwI92l7/0s\nYF/bv5S0EJjWlGbjgDy3AodKOt/24zS6KG60fVJzotLH30mn/L3AXBpBTL+5NFoDWpjXQdERETHV\n1Go1arXalvX58+d3pdyJ3B0wkpFuzXn2l9RbBvm9jUbXwY7Ar4ANknqAI4Yo71LgeuCKUs7twIGS\nXgEgaXtJewBrgF5JLy/5Thii3M8Afy6pt5QzE/gA8DfDOsuIiIgumMgtAdMkLefpmfdusH02gw+V\nb953J3AhsDtws+2rASStAO4DfkwjMGiVd8u67c9K2gn4iu2TJL0LuKyMAzDwMdv3SzoduE7SRuD7\nwA5tK2mvlPQR4JulnF7gUNv3D3JuERERXZVZBCcASecB+wO/Z/upAfuG9Qb19PSyfv2DY1C7iIiY\naLo1i2CCgAkuUwlHRMRA3QoCJnJ3wKRXug7O5JldDbfafv/41CgiIuJpaQmY4NISEBERA3WrJWAi\n3x0QERERYyhBQEREREUlCIiIiKioBAEREREVlSAgIiKioioZBEjqkXRZmX54SZlyePc2aXslrWqz\n72JJe43g+OdI+kmZHfGHkv5ZUtsZBCW1fM2YMbPTQ0dERGxR1ecE
XA0stH0CgKRXAT3A2jbpW96j\nZ/u0UdThfNvnl+O/FbhZ0izb/zXMw9PXN+q7QyIiosIq1xIg6VDgCduX9G+zvQpYIekmSUslrZR0\nZFO2bSQtkrRa0hWSppWybpE0pyxvkPRJSSskLZa0y3DrZPsK4FvAiV05yYiIiGGoXBAAzAKWtdj+\nGHCU7bnAYTRm+uu3J3Ch7b2BDcB7W+SfDiy2vQ+NCYRO7bBedwEddy1ERESMVFW7A1rZClgg6WBg\nM7CbpF3Lvodt316WFwHvB84fkP9x29eV5WXAGzo8/iBt+/OalmvlFRERVVGv16nX610vt4pBwL3A\ncS22nwS8BJhte7OkdcC0sq/lNMMDPNm0vInOr+1sYEnrXfM6LCoiIqaSWq1GrVbbsj5//vyulFu5\n7gDbNwPbSjqlf1sZGNgLPFICgEPLer9eSfuX5RNpNPcP1OkovS3pJR0LHA5c1mEZERERI1a5IKA4\nGjhc0tpy+995wLXAfpJWAicD9zWlXwOcIWk1sDPwhbK9uUWg01l+Pth/iyCNwOKw1ncGREREjI3M\nIjjBSWr7BvX09LJ+/YPPYW0iImIi6NYsglUcEzDpJFCLiIixkCBgDEk6GzieRleByt8rbS8Y14pF\nRESQ7oAJT5LzHkVERLNudQdUdWBgRERE5SUIiIiIqKgEARERERWVICAiIqKiKhsESOqRdJmk+yUt\nkXSNpN3bpO0tDxVqte9iSR1P/CPpHEk/KQ8MWi3p852WERERMRqVDQKAq4Gbbe9hez/gL4CeQdK3\nHKJv+zTba0ZYh/NtzymzE75a0iGtEkl61mvGjJkjPGRERERDJYOAMjfAE7Yv6d9mexWwQtJNkpZK\nWinpyKZs20haVH61XyFpWinrFklzyvIGSZ+UtELSYkm7DFWVkm8a8Hzg562T+Vmvvr6HRnLqERER\nW1QyCABm0Zjud6DHgKNszwUOAz7TtG9P4MLyq30D8N4W+acDi23vQ2OSoVOHqMefSFoO/DvwI9t3\nd3YaERERI1fVIKCdrYAFZRKhm4DdJO1a9j1s+/ayvAg4qEX+x21fV5aXATOHON75tucAuwI7SHrr\nqGofERHRgao+Nvhe4LgW208CXgLMLlMKrwOmlX0DxwS0GiPwZNPyJoZ5fW1vknQDcDBwxbNTzGta\nrpVXRERURb1ep16vd73cSgYBtm+W9ClJp9j+EoCkVwG9wCMlADi0rPfrlbS/7TtoTP37/RZFd/oI\nx/4xAQIOBJa3Tjavw2IjImIqqdVq1Gq1Levz58/vSrlV7g44Gjhc0tpy+995wLXAfqU74GTgvqb0\na4AzJK0Gdga+ULY3twh0+pD/D5YxAXfTeC8u6vw0IiIiRiYTCE1wktw6tlCmGI6IqKhuTSBUye6A\nyefZ73NPT2+LdBEREcOXIGCMSTobOJ7Gz3mVv1faXjDcMvKLPyIixkK6AyY4Sc57FBERzbrVHVDl\ngYERERGVliAgIiKiohIEREREVFSCgIiIiIpKEBAREVFRlQ0CJPVIukzS/ZKWSLpG0u5t0vaWpwq2\n2nexpL1GUY8Vkr42RJpnvGbMmDnSw0VERGxR5ecEXA0stH0CbJk7oAdY2yZ9y/v0bJ820gqU4GEr\n4HclbWf7seEcuq9v1HeFREREVLMloEwO9ITtS/q32V4FrJB0k6SlklZKOrIp2zaSFklaLekKSdNK\nWbdImlOWN0j6ZPl1v1jSLkNU5QTgy8CNwB909SQjIiKGUMkgAJgFLGux/THgKNtzgcOAzzTt2xO4\n0PbewAbgvS3yTwcW296HxiyDpw5Rj7cBXy+vEzs6g4iIiFGqcndAK1sBCyQdDGwGdpO0a9n3sO3b\ny/Ii4P3A+QPyP277urK8DHhDuwNJ2hf4me2fSPop8A+Sdrb9i2ennte0XOvohCIiYvKr1+vU6/Wu\nl1vVIOBe4LgW208CXgLMtr1Z
0jpgWtk3cExAqzECTzYtb2Lw63sCsKekB2jMKfAC4Fjg0mcnnTdI\nMRERMdXVajVqtdqW9fnz53el3Ep2B9i+GdhW0in928rAwF7gkRIAHFrW+/VK2r8sn0ijuX+gYY3Y\nkyTgrcAs279p++XAUaRLICIinkOVDAKKo4HDJa0tt/+dB1wL7CdpJXAycF9T+jXAGZJWAzsDXyjb\nm1sEhjvTz+8CP7Hd17Tte8ArJfV0fioRERGdyyyCE5wkPzu2UKYXjoiosG7NIljVMQGTzDPf556e\n3jbpIiIihi9BwBiTdDZwPI2f8yp/r7S9YLhl5Fd/RESMhXQHTHCSnPcoIiKadas7oMoDAyMiIiot\nQUBERERFJQiIiIioqAQBERERFZUgICIioqIqHQRI6pF0maT7JS2RdI2k3duk7S1PFmy172JJe43g\n+OdI+omk5eV1Xpt0zJgxs9PiIyIiBlX15wRcDSy0fQJsmT+gB1jbJn3Le/VsnzaKOpxve+BshM86\nbF/fqO8EiYiIeIbKtgSUCYKesH1J/zbbq4AVkm6StFTSSklHNmXbRtIiSaslXSFpWinrFklzyvIG\nSZ+UtELSYkm7DFWVbp9bRETEcFQ2CABmActabH8MOMr2XOAw4DNN+/YELrS9N7ABeG+L/NOBxbb3\noTHT4KlD1ONPmroDDu/0JCIiIkaq6t0BrWwFLJB0MLAZ2E3SrmXfw7ZvL8uLgPcDA5vyH7d9XVle\nBrxhiOMNoztgXuO/8+Y9a07piIiY+ur1OvV6vevlVjkIuBc4rsX2k4CXALNtb5a0DphW9g0cE9Bq\njMCTTcub6Mo1ngfMZ968eaMvKiIiJp2BPwDnz5/flXIr2x1g+2ZgW0mn9G8rAwN7gUdKAHBoWe/X\nK2n/snwijeb+gdLHHxERk0Jlg4DiaOBwSWvL7X/nAdcC+0laCZwM3NeUfg1whqTVwM7AF8r25haB\nzPYTERGTQmYRnOAkGaCnp5f16x8c59pERMRE0K1ZBKs8JmDSSKAWERFjIUHAc0DS2cDxNLoKVP5e\naXvBuFYsIiIqLd0BE5wk5z2KiIhm3eoOqPrAwIiIiMpKEBAREVFRCQIiIiIqKkFARERERSUIiIiI\nqKhKBgGSeiRdJul+SUskXSNp9zZpe8vTBFvtu1jSXiOsw8llquJVku4qZe3YJi0zZswcyWEiIiLa\nqupzAq4GFto+AbbMGdADrG2TvuU9erZPG8nBJb0JOBP4PdvrJQl4Z6nDL1sdvq8vUxJERER3Va4l\noEwK9ITtS/q32V4FrJB0k6Sl5Rf6kU3ZtpG0SNJqSVdImlbKukXSnLK8QdInJa2QtFjSLoNU42zg\nLNvry/Ft+x9t39/1E46IiGijckEAMAtY1mL7Y8BRtucChwGfadq3J3Ch7b2BDcB7W+SfDiy2vQ+N\n2QVPHaQOvw3cNYK6R0REdE1VuwNa2QpYIOlgYDOwm6Rdy76Hbd9elhcB7wfOH5D/cdvXleVlwBsG\nOdaW7gVJs4CvAC8A/sL2lc9OPq/x33nznjWndERETH31ep16vd71civ32GBJhwHn2D5kwPZ3Am8C\nTrK9WdI64BAaz/qv2355SXco8D7bx0q6hUaz/nJJv7S9Y0lzLPAW2+9uU4fvAh+3/d2mbRcAS2x/\neUBa9085ULX3KiIiWstjg0fI9s3AtpJO6d9WBgb2Ao+UAODQst6vV9L+ZflEGs39A3XyZnwa+FtJ\nv9G0bbsO8kdERIxa5YKA4mjgcElry+1/5wHXAvtJWgmcDNzXlH4NcIak1cDOwBfK9uaf5sP+mW77\neuDvgesl3SPpB8BTwLdGekIRERGdqlx3wGTT6A6Anp5e1q9/cJxrExERE0G3ugMyMHASSKAWERFj\nIUHAGJJ0NnA8/SP7Gn+vtL1gXCsWERFBugMmPEnOexQREc1yd0BERESMSoKAiIiIikoQEBERUV
EJ\nAiIiIiqqskGApB5Jl0m6X9ISSddI2r1N2t7yUKFW+y6WtFeHxz5b0l3l9ZSk5eX1vpGcS0RExEhU\n9u4ASYuBhf1TCpdHB+9o+9YWaXuBb9p+9RjUY8ucA232Ow8KioiIZrk7YBTK3ABP9AcAALZXASsk\n3SRpqaSVko5syraNpEWSVku6QtK0UtYtkuaU5Q2SPilphaTFknbpRn37+h7qRjERERHPUMkgAJhF\nY7rfgR4DjrI9FzgM+EzTvj2BC23vDWwA3tsi/3Rgse19aEwydGpXax0REdFFVQ0C2tkKWFAmEboJ\n2E3SrmXfw7ZvL8uLgINa5H/c9nVleRkwcywrGxERMRpVfWzwvcBxLbafBLwEmF2mFF4HTCv7Bg6e\naDWY4smm5U108frOmzcPgFqtRq1W61axERExCdTrder1etfLrfLAwNuAS21/qay/isYUwy+2fWYZ\nN/AdGr/mBawDXmf7DkmXAPfa/pykW4CzbC+XtMH2C0p5xwJvsf3uIeqxJU+b/YZMIhQREU/LwMDR\nOxo4XNLacvvfecC1wH6lO+Bk4L6m9GuAMyStBnYGvlC2N387j+SbOt/uERExLirbEjBZpCUgIiIG\nSktAhfT09I53FSIiYgpKS8AYk3Q2cDyNZn+Vv1faXjDM/JlKOCIinqFbLQEJAia4BAERETFQugMi\nIiJiVBIEREREVFSCgIiIiIpKEBAREVFRCQIiIiIqasggQNImScsl3VX+/tlwC5d0iKRvjqaCzVP1\njiDvoMeX9E5JmyUd1rTtqLLtmLJ+iaS9Ojzu73dynSIiIsbDcCa42Wh7RF/CxYjvb5PUjZaKoY5/\nN/B24Oay/nZgxZbMdsfTAdv+JjCq4CciImKsDedLtuV9iJLWSTqvtBDcKWm2pBsk3S/ptKakO0m6\nRtIaSRc15b+o5Fsl6ZwB5X5a0lIaD9np3y5JCyWdW9YPl7RY0lJJl0vavmx/k6T7Sv5jhnF+PwBe\nK2lrSdOB3WkKAvpbIiRtVY5/t6SVks4s+z8g6V5JKyR9rWx7p6QLyvJCSX8n6dYyT0F/C4PKNVgt\n6VuSru3fFxER8VwYTkvAdpKW8/TT7hbYvrLse9D2bEnnAwuB3wG2B+4BLi5p9gNeCTwMfEvSMbav\nAs62/Yvya/87kr5h+56S52e25wJI+mNgG+CrwCrbCyS9GPgY8Hrbj5Wm9w9J+pty3JrtByRdPozz\nM3AT8CZgJ+BfgZe3SLcP8Bu2X13qtWPZ/hFgpu0nm7b1l9tvhu0DJb0S+DfgKuBY4GW295bUQ2Oy\nokuHUd+IiIiuGE4Q8Ogg3QH9Td6rgOm2HwUelfTrpi/EO20/BCDpMuAgGl+Cb5d0aqnDDGBvGsED\nwMAv7y8Clzc9aveAkv5WSaIRJNwG7AU8YPuBkm4RMFRzvoGvA2cCOwJnAR9tke4B4OWS/g64Drix\nbF8JfE3SvwD/0uYY/wJg+z5Ju5ZtBwJXlu19ZUrilubNm7dluVarUavVhjiliIiYSur1OvV6vevl\nDicIGMzj5e/mpuX+9f6yB/bJW9JMGl+2+9r+paSFwLSmNBsH5LkVOFTS+bYfp9EqcaPtk5oTSXoN\nbbovBmN7qaRXAb+yvbYRVzwrzS9K+b8HnA68FXgP8BbgYOBI4KOSZrU4RPO16bh+zUFARERUz8Af\ngPPnz+9KuSMeE9BBnv0l9ZZm/7fR6IPfEfgVsKE0hR8xRHmXAtcDV5RybgcOlPQKAEnbS9oDWAP0\nSupvzj+hgzp/hNYtAJRjvBjY2vbVwF8Cs8uul9n+LvDn5bx2GOI4/dfmVuDYMjagB6h1UNeIiIhR\nG05LwLQBYwJusH02g4+6b953J3AhjQF3N5cvUSStoNEP/mMagUGrvFvWbX9W0k7AV2yfJOldwGWS\nnl/SfMz2/ZJOB66TtBH4PkN/KVPK/1abOvQv/wawsAQhBv
5c0vOARaXrQ8DflZaNdteief0bwGHA\nveUaLAP+Zzh1jYiI6IbMIjiOJE23vVHSi4A7gANtPzIgTWYRjIiIZ1BmEZwSrpF0F/A94NyBAUCM\nzlgMoqmCXLeRyXUbmVy38VWJIEDSu/T0Ew/7XxeMd71sH2p7tu1Ztr8y3vWZavKPy8jkuo1MrtvI\n5LqNr9HeHTAp2P5H4B/HuRoRERETSiVaAiIiIuLZMjBwgpOUNygiIp6lGwMDEwRERERUVLoDIiIi\nKipBQEREREUlCBhHZdrjNZJ+JOkjbdL8vRrTM6+QtE8neaeqEVy32U3bHyxTQd8l6c7nrtbjb6jr\nJmlPNabn/rWkD3WSdyob5XXL5639dTuxXJuVkn4g6dXDzTuVjfK6df55s53XOLxoBGBrgV4asyCu\nAPYakOYI4NqyvD9w+3DzTtXXaK5bWX8AeOF4n8cEvW4vAfYFPgF8qJO8U/U1muuWz9uQ1+0AYKey\n/Kb8+za66zbSz1taAsbPa4H7bT9k+0ka0xn/wYA0fwB8GcD2HcBOZbKh4eSdqkZz3aAxx0MVP/dD\nXjfbP7O9DHiq07xT2GiuG+TzNth1u912/3wpt9OYn2VYeaew0Vw3GMHnrYofzoniN2hMHNTvJzzz\nzRwszXDyTlUjuW7/3pTGwLclLZF06pjVcuIZzWcmn7endXru+bw1DHXdTqExU+xI8k4lo7luMILP\nWyWeGDiFjPqe0OBA2z+VtAuN/1nus/2DIXNFjEw+b0OQdCjwR8BB412XyaTNdev485aWgPHz78DL\nmtb/V9k2MM3/bpFmOHmnqtFcN2z/tPz9T+BqGs1vVTCaz0w+b0/r6Nzzedui5XUrg9ouBo60/fNO\n8k5Ro7luI/q8JQgYP0uA3SX1StoWeDvwbwPS/BvwDgBJBwC/sN03zLxT1Yivm6TtJe1Qtk8H3gjc\n89xVfVx1+plpbnXK520E1y2ft8Gvm6SXAd8A/tD2/99J3ilsxNdtpJ+3dAeME9ubJL0PuJFGMHap\n7fsknd7Y7YttXyfpzZLWAhtpNP20zTtOp/KcGs11A3qAq9V4FPPzgK/avnE8zuO5NpzrVgZPLgVe\nAGyWdCawt+1f5fPW+XUDdiGft7bXDfhL4EXARZIEPGn7tfn3bWTXjRH++5bHBkdERFRUugMiIiIq\nKkFARERERSUIiIiIqKgEARERERWVICAiIqKiEgRERERUVIKAiIiIikoQEBERUVH/D0fp/Hca+rNg\nAAAAAElFTkSuQmCC\n",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Simple version that shows all of the variables\n",
"feature_importances = pd.Series(model.feature_importances_, index=X.columns)\n",
"feature_importances.sort_values(inplace=True)\n",
"feature_importances.plot(kind=\"barh\", figsize=(7,6));"
]
},
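{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of what `feature_importances_` contains, assuming synthetic data from `make_regression` instead of the dataset used in this notebook. The names `X_demo`, `y_demo`, `demo_columns`, and `demo_model` are hypothetical and exist only for this illustration. The importances of a fitted forest are normalized, so they should sum to roughly 1, and sorting them gives the same ranking that the bar chart shows."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Illustration only: synthetic data, hypothetical names\n",
"import pandas as pd\n",
"from sklearn.datasets import make_regression\n",
"from sklearn.ensemble import RandomForestRegressor\n",
"\n",
"# 6 features, of which only 3 actually drive the target\n",
"X_demo, y_demo = make_regression(n_samples=200, n_features=6, n_informative=3, random_state=0)\n",
"demo_columns = [\"f%d\" % i for i in range(X_demo.shape[1])]\n",
"\n",
"demo_model = RandomForestRegressor(n_estimators=50, random_state=0)\n",
"demo_model.fit(X_demo, y_demo)\n",
"\n",
"# Label each importance with its column name, then rank\n",
"demo_importances = pd.Series(demo_model.feature_importances_, index=demo_columns)\n",
"print(demo_importances.sort_values(ascending=False).head(3))\n",
"print(\"Sum of importances:\", demo_importances.sum())"
]
},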
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAnMAAACMCAYAAAAX6y/rAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAE/VJREFUeJzt3XuwZlV55/HvT8CALSAacpjRsZ0MIiIoNLdWmeFIboZM\ngdeoMAoa6Tjq6Iw6k6qo0EjUyWSGWIrWhEhRGlIIiZJJECzU8AoIDUgDcg8tF1HpVgxq2y1IN8/8\n8e42r4dzec853We/+5zvp2pX78tae6+997vrPL3W3mulqpAkSVI3PantAkiSJGnuDOYkSZI6zGBO\nkiSpwwzmJEmSOsxgTpIkqcMM5iRJkjps57YL0JYk9skiSZI6o6oy2folXTNXVU4tTaeddlrrZVjq\nk/eg/cl74PVf6pP3YPhpOp0I5pK8P8mtSW5OsjbJ4W2XSZIkaRSMfDNrkpXAscDBVbUlydOBJ7dc\nLEmSpJGQmaru2pbklcDJVXX8hPUrgDOBZcBDwMnNv9cA76uqK5J8FNhSVR+cZL+jfeKSJEkDaop3\n5roQzC0DrgJ2A74KXABcDXwNOK6qfpjk94Hfqao/SHIA8DfAu4D/BRxZVVsm2W/BaJ+7JElSX6YM\n5ka+mbWqNjW1cP8eOAb4HPBh4EDgy0lC/92/B5v0tyc5D7iYKQI5SZKkxWLkgzmA6lcfXgFckeQW\n4B3ArVX10imyHAQ8DIxNv+fVA/PjzSRJktS2XjPNrAvNrPsBj1fVumb5DGAv4LeBN1XVmiQ7A/s1\ntXKvAk4B/gvwReDwqvrJJPu1mVWSJHXE1M2sXQjmVgCfAPYEtgDrgFXAswbW7wR8DPg74OvAMVX1\nvSTvBA6tqjdPsl+DOUmS1BEdDuZ2FIM5SZLUHR3+AGLHmvSaSJIkdcaSDuaWaq2kJEnqln7nHZPr\nxHBekiRJmpzBnCRJUocZzEmSJHVYJ4K5JK9I8njT55wkSZIanQjmgNcDVwJvaLsgkiRJo2Tk+5lL\nsgy4E3gZcHFV7d+Mx/pJ+uNvPUC/M+FzquoLTSfDZwLLgIeAk6tqwyT7He0TlyRpRIyNLWf9+vva\nLsaSlnS7n7njgS9V1bokDyU5BPh14NlVdUCSMeAO4JxmWK9PAMdV1Q+T/D7wEeAPJt+18ZwkSTPZ\nsMF+WUdZF4K5N9AfqgvgAuAE+uX+G4Cq2pDk8mb784ADgS83tXdPAr63sMWVJElaOCMdzCXZCzgG\nOLBpFt2JfnXaRVNlAW6tqpcOd4TVA/PjzSRJktSuXq9Hr9cbKu1IvzOXZBVwSFX954F1lwOXA4fR\nb4L9NeB24BTgH4DbgDdV1Zqm2XW/qrp9kn07NqskSUOJoya1bLp35kb9a9bX8cRauM8DY8B36Adu\nnwVuAH5cVY8BrwH+NMlNwI3AixeuuJIkSQtrpGvmppNkWVVtSvJ04FrgpVX1/Vnkt2ZOkqShWDPX\ntq5/zTqVi5M8DdgF+NBsArl/4dc5kiTNZGxsedtF0DQ6WzM3X0lqqZ67JEnqli6/MydJkqRpGMxJ\nkiR1mMGcJElSh43kBxBJtgI30/9CoYBXVNW32y2VJEnS6BnJDyCS/KSq9phDvp2qauuQaf0AQpIk\ndUIXuyZ5QmGTLAf+CnhKs+qdzSgPRwNnAA/TH5t1/yQnAu+i323JtcDbJ4vc+sO3SpK0MMbGlrN+\n/X1tF0OLzKgGc7slWUs/qLunql4NbAB+s6p+nmRf4Hzg8Cb9IcALqurbSfanP3LES6pqa5JPAicC\n5z3xMNbMSZIWzoYNViJo+xvVYG5zVa2YsO7JwFlJDga2As8d2HbdwDt1vwGsAK5Pv+ptV/qBoCRJ\n0qIzqsHcZP4bsL6qXphkJ+BnA9s2DcwH+ExVvX/mXa4emB9vJkmSpHb1ej16vd5QaUf1A4iNVbX7\nhHVnAg9U1Z8neTPw6araqXln7r1VdVyT7vnA
3wFHVdUPkuwF7D7xa1jHZpUkLTzHONXcdHEEiMl+\n6Z8CTk5yI7Afv1wb9y8Zq+4APgBcluRm4DJgnx1VUEmSpDaNZM3cQrBmTpK08KyZ09x0sWuSBeJX\nRZKkhTM2trztImgRWtLBnP87kiRJXTeq78xJkiRpCAZzkiRJHWYwJ0mS1GELHswl2ZpkbZJbklyQ\nZNdp0p6W5D0LWT5JkqQuaaNmblNVraiqg4DHgLe1UAZJkqRFoe2vWa8EDgJI8ibgvcDjwDer6qTB\nhEneCqwCdgHWAW+sqkeSvBY4FdgC/LiqxpMcAJzbpH0S8Oqq+tbEg/eHbpW6Z2xsOevX39d2MSRJ\nI2DBOw3eNlRXkp2BvwUupR/UXQSsrKqHkzytqn6U5DRgY1WdmWSvqnq42ccZ9Mdp/WSSbwK/U1UP\nJtmjqn6S5OPANVV1fnOcnarq0QnlsNNgdZgdj0rSUjJqw3ntlmQtcB1wH3AOcAxw4bZgrap+NEm+\nFya5ogneTgBe0Ky/CvhMU3O3rabxGuD9Sf478JyJgZwkSdJi0UYz6+aqWjG4YsjmznOB46rq1iQn\nAUcDVNXbkxwO/EfghiQrmhq5Nc26S5KsqqreE3e5emB+vJkkSZLa1ev16PV6Q6VtrZl1wroDgC8A\nL6mqf97WpDqhmfX7wAHAj4EvAt+pqrck+fWquqfZz7XAKU2ee5t1fwY8UFUfn3BMm1nVYTazStJS\nMmpjsz7hL1BV3Z7kw8DXkmwBbgTeMiHZqfSbZr8PXAtsCwj/LMlzm/mvVNU3k/xRkjfS/1r2QeDD\nO+A8JEmSWrfgNXOjwpo5dZs1c5K0lIxazdwIsWsSddPY2PK2iyBJGhFLOpizZkOSJHWdY7NKkiR1\nmMGcJElShxnMSZIkdVirwVySsSTnJ7k7yfVJLk6y7xRplye5ZYptZyfZf8eWVpIkafS0/QHERcC5\nVfUGgCQHAWPAuinST/rFQlWtmsvBhxx5Qi1yQHlJkqbXWs1ckpcBP6+qv9y2rqpuAW5K8pUk30hy\nc5LjBrLtkuS8JLcnuTDJrs2+Lk+yopnfmORPktyU5Ooke09dinIa8WnDhvunvn2SJKnVZtYDgRsm\nWf8z4BVVdRhwDPB/BrY9Dzirqg4ANgJvnyT/MuDqqjoYuJL+8F6SJEmL0ih+APEk4KNJbga+Avzr\nJL/WbPt2Va1p5s8Djpok/6NVdUkzfwPwnB1ZWEmSpDa1+c7cbcBrJll/IvCrwCFV9XiSe4Fdm20T\n35mb7B26xwbmtzLtOa4emB9vJkmSpHb1ej16vd5QaVsdmzXJNcA5VfXpZvkg4JXAM6rq3c17dV+l\nX7sW4F7gxVV1bZK/BG6rqo8luRx4b1WtTbKxqnZv9vdq4Peq6i2THNuxWTvBMUglSZpubNa2m1lf\nCfxWknVNtyMfAb4IHN40s/4n4I6B9HcC70hyO/A04P826wf/2vuXX5IkLRmt1sy1qV8zp1Fn1ySS\nJE1fM9d2P3OtWqqBrCRJWjzabmaVJEnSPBjMSZIkdZjBnCRJUocZzEmSJHVY68FckvcnubUZh3Vt\nkiOSnJ1k/2b7xinyHZlkTZIbk9yW5NSFLbkkSVL7Wv2aNclK4Fjg4KrakuTpwJOratVAsqk+Of0M\n8JqqujVJ6I/bOtvjz7rMmp5diUiStLDarpn7V8BDVbUFoKr+uarWJ7k8yYomTZKc2dTefTnJM5r1\newMbmnxVVXc2iU9L8tkkVye5K8lbpz58OW3nacOG+6e+3JIkabtrO5i7DHh2kjuTfDLJf5gkzTLg\nuqo6ELgCOK1Z/zHgriSfT7Iqya8M5DmI/kCrLwFOTbLPjjsFSZKk9rQazFXVJmAFsAr4AfC5JCdN\nSLYVuLCZPw84qsl7BnAo/YDwBODSgTz/r6p+XlU/BP4ROGKHnYQkSVKLWh8BovrDMFwBXNGMz3oS\n04+v+ott
VXUv8BdJPg38IMleE9MAmXp/qwfmx5tJkiSpXb1ej16vN1TaVsdmTbIf8HhVrWuWzwD2\nBA4E3ldVa5M8Dry+qi5M8gFg76p6d5Jjq+qSJt/zga8BY8CpwPHASmB34AZgZVWtn3Dsmj5m1NzE\nYdIkSdrORnls1qcCn0iyJ7AFWEe/yfVvB9L8FDgiyQfpf/Dwumb9G5OcCWxu8p5QVdV8ofpNoAc8\nA/jQxEBOkiRpsWi1Zm5HSHIasLGqzpwh3eI68RFh1ySSJG1/o1wz16rFFshKkqSlZ9HVzA0rSS3V\nc5ckSd0yXc1c2/3MSZIkaR4M5iRJkjrMYE6SJKnDZgzmkmxNsjbJjc2//2PYnSc5Osk/zKeAE8Zp\nnW3eeR9fkiRplA3zNeumqppTMNWY81cGSbZHzeGUx2/6pFtS7DpEkqTFZZhgadKIJ8m9ST7S1Nhd\nl+SQJF9KcneSVQNJ90xycZI7k3xqIP+nmny3NH3DDe73fyb5BvDagfVJcm6SDzXLv5Xk6iTfSHJB\nkqc061+e5I4m/6umP7VactOGDfdPf0kkSVKnDBPM7TahmfW1A9vuq6pDgKuAc+kHTy8GPjSQ5nDg\nHcDzgX2TbAuw/riqjgBeBIwnOXAgz0NVdVhVXdAs7wL8NfBPVXVqkmcAHwB+o6oOoz9k13uS/Apw\nNvB7zfp9hr4SkiRJHTRMM+vmaZpZt72PdguwrKo2A5uTPJJkj2bbdVV1P0CS84GjgC8Ar09ySlOG\nfYADgFubPNuCuG3+Arigqj7aLK9s0n89/bbSXYBrgP2Be6rqnibdecApQ5yjJElSJ813BIhHm38f\nH5jftrxt3xPfWaskzwHeCxxaVT9Jci6w60CaTRPyfB14WZIzq+pR+k2/l1XViYOJkryIKZqFJ7d6\nYH68mSRJktrV6/Xo9XpDpR0mmJvLVwKDeY5Mshx4AHgd/Vq2PYCfAhuTjAG/C1w+zf7OAY4GLkzy\nSmANcFaSf1dV32rel3smcCewPMm/rap7gTdMX8zVczg1SZKkHWt8fJzx8fFfLJ9++ulTph0mmNs1\nyVr6AVoBX6qqP2b6r1QHt10HnAXsC/xjVV0EkOQm4A76Qd5VU+T9xXJV/XmSPYG/qqoTk5wMnN+8\nJ1fAB6rq7iR/CFySZBNwJfDUIc5RkiSpk5b02Kxtl6ENdk0iSVL3TDc263zfmeu0pRrISpKkxcPh\nvCRJkjrMYE6SJKnDDOYkSZI6zGBOkiSpw1oN5pJsbYYIu6UZX3XXmXPNuM+Tknxie5RPkiRp1LVd\nM7epqlZU1UHAY8Dbhs2YZLqyD/WZapKRn/bZ5znDXhJJkrQEtR3MDbqSfsfCJLkoyfVNjd1btyVI\nsjHJ/05yI7AyyWFJvp7kpiRrkixrkj4zyaVJ7kryp1MfskZ+2rDh/jlcSkmStFS03c9cAJLsTH9I\nr0ub9W+uqh81za7XJ/l8VT0MLAOuqar3JdmF/vBdr62qtUmeCjzS5H8RcDD92r67kny8qr67gOcl\nSZK0INqumdutGSrsOuB++mOwAvzXZrivNcCzgOc267cAX2jmnwd8r6rWAlTVT6tqa7Ptq83yo8Dt\nwPIdfyqSJEkLr+2auc1VtWJwRZKjgWOAI6vq0SSXA9s+jHikfnnYhkmHtQAeHZjfypTnuXpgfryZ\nJEmS2tXr9ej1ekOlbTuYmywY2xN4uAnk9gdWTpH+LmCfJIdW1Q1NM+vPZnf41bNLLkmStADGx8cZ\nHx//xfLpp58+Zdq2g7nJvjr9EvC2JLfRD9iumSx9VT2W5HXAWUl2AzYDvznkMSRJkhaFVt+Zq6o9\nJln386o6tqpeUFWvqqpjquqKydJX1Q1V9eKqOriqXlJVm6vqM1X1roE0x23L/0QZ+WlsbHG+7jds\n1bF2HO9B+7wH7fL6t897sH20/QFEq6pq5Kf16+9r+zLtED7A7fMetM970C
6vf/u8B9vHkg7mJEmS\nus5gTpIkqcPyyz19LB1JluaJS5KkTqqqSbtkW7LBnCRJ0mJgM6skSVKHGcxJkiR12KIL5pK8PMmd\nSf4pyR9NkebjSe5OclOSg2eTVzObwz04ZGD9fUluTnJjkusWrtSLx0zXP8nzklyd5JEk75lNXg1n\nnvfAZ2A7GOIenNBc55uTXJXkhcPm1czmef19Bmar7X7UtudEPzhdBywHdgFuAvafkOZ3gS8280cC\na4bN67Rj70GzfA+wV9vn0dVpyOv/q8ChwBnAe2aT12nH3oNmm8/AwtyDlcCezfzL/VswGte/WfYZ\nmOW02GrmjgDurqr7q+ox4HPA8RPSHA98FqCqrgX2TDI2ZF7NbD73APpDXyy23+VCmvH6V9VDVXUD\nsGW2eTWU+dwD8BnYHoa5B2uq6sfN4hrgmcPm1Yzmc/3BZ2DWFtvFeibwwMDyd/jlH8h0aYbJq5nN\n5R58dyBNAV9Ocn2SU3ZYKRev+fyOfQa2j/leR5+B+ZvtPXgrcOkc8+qJ5nP9wWdg1nZuuwAjYNI+\nW9Sal1bVg0n2pv8w31FVV7VdKGkB+QwsoCQvA94MHNV2WZaiKa6/z8AsLbaaue8Czx5YflazbmKa\nfzNJmmHyambzuQdU1YPNvz8ALqJfXa/hzed37DOwfczrOvoMbBdD3YPmpfuzgeOq6uHZ5NW05nP9\nfQbmYLEFc9cD+yZZnuTJwOuBv5+Q5u+BNwEkWQn8qKo2DJlXM5vzPUjylCRPbdYvA34buHXhir4o\nzPZ3PFgz7TOwfcz5HvgMbDcz3oMkzwY+D7yxqr41m7ya0Zyvv8/A3CyqZtaq2prkncBl9APVc6rq\njiR/2N9cZ1fVJUmOTbIO2ES/enfKvC2dSmfN5x4AY8BF6Q+1tjPw11V1WRvn0VXDXP/mY5NvALsD\njyd5N3BAVf3UZ2D+5nMPgL3xGZi3Ye4B8EHg6cCnkgR4rKqO8G/B/M3n+uPfgTlxOC9JkqQOW2zN\nrJIkSUuKwZwkSVKHGcxJkiR1mMGcJElShxnMSZIkdZjBnCRJUocZzEmSJHWYwZwkSVKH/X+AHV9l\n19yVMQAAAABJRU5ErkJggg==\n",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Complex version that shows the summary view\n",
"\n",
"def graph_feature_importances(model, feature_names, autoscale=True, headroom=0.05, width=10, summarized_columns=None):\n",
"    \"\"\"\n",
"    By Mike Bernico\n",
"\n",
"    Graphs the feature importances of a random decision forest using a horizontal bar chart.\n",
"    Probably works, but is untested, on other sklearn ensembles.\n",
"\n",
"    Parameters\n",
"    ----------\n",
"    model = The fitted ensemble whose feature importances you would like graphed.\n",
"    feature_names = A list of the names of those features, displayed on the Y axis.\n",
"    autoscale = True (automatically adjust the X axis size to the largest feature + headroom) / False (scale from 0 to 1)\n",
"    headroom = Used with autoscale, 0.05 by default.\n",
"    width = Figure width in inches.\n",
"    summarized_columns = A list of column prefixes to summarize on, for dummy variables (e.g. [\"day_\"] would summarize all day_ vars).\n",
"    \"\"\"\n",
"\n",
"    if autoscale:\n",
"        x_scale = model.feature_importances_.max() + headroom\n",
"    else:\n",
"        x_scale = 1\n",
"\n",
"    feature_dict = dict(zip(feature_names, model.feature_importances_))\n",
"\n",
"    if summarized_columns:\n",
"        # some dummy columns need to be summarized\n",
"        for col_name in summarized_columns:\n",
"            # sum all the features that contain col_name, store in temp sum_value\n",
"            sum_value = sum(x for i, x in feature_dict.items() if col_name in i)\n",
"\n",
"            # now remove all keys that are part of col_name\n",
"            keys_to_remove = [i for i in feature_dict.keys() if col_name in i]\n",
"            for i in keys_to_remove:\n",
"                feature_dict.pop(i)\n",
"            # lastly, re-add the summarized field\n",
"            feature_dict[col_name] = sum_value\n",
"\n",
"    results = pd.Series(feature_dict)\n",
"    results.sort_values(inplace=True)\n",
"    results.plot(kind=\"barh\", figsize=(width, len(results)/4), xlim=(0, x_scale))\n",
" \n",
"graph_feature_importances(model, X.columns, summarized_columns=categorical_variables)"
]
},
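{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the `summarized_columns` step in `graph_feature_importances` concrete, here is a minimal sketch of the same grouping logic on a hand-made dictionary. The feature names and importance values are invented for illustration and do not come from the model in this notebook. Note that the match is a substring test (`prefix in name`), so a prefix such as `\"Sex\"` would also absorb any other column whose name happens to contain that text."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Illustration only: made-up importances for one numeric column and two sets of dummy columns\n",
"demo_importances = {\n",
"    \"Age\": 0.30,\n",
"    \"Sex_male\": 0.25,\n",
"    \"Sex_female\": 0.20,\n",
"    \"Cabin_A\": 0.15,\n",
"    \"Cabin_B\": 0.10,\n",
"}\n",
"demo_prefixes = [\"Sex\", \"Cabin\"]\n",
"\n",
"summarized = dict(demo_importances)\n",
"for prefix in demo_prefixes:\n",
"    # total importance of every column whose name contains the prefix\n",
"    total = sum(value for name, value in summarized.items() if prefix in name)\n",
"    # drop the individual dummy columns...\n",
"    for name in [name for name in summarized if prefix in name]:\n",
"        summarized.pop(name)\n",
"    # ...and replace them with a single summarized entry\n",
"    summarized[prefix] = total\n",
"\n",
"print(summarized)"
]
},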
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Parameter tests"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Parameters to test\n",
"\n",
" * ### Parameters that will make your model better\n",
"    * `n_estimators`: The number of trees in the forest. More trees generally give more stable predictions, with diminishing returns, so choose as high a number as your computer can handle.\n",
"    * `max_features`: The number of features to consider when looking for the best split. Try `\"auto\"`, `None`, `\"sqrt\"`, `\"log2\"`, `0.9`, and `0.2`.\n",
"    * `min_samples_leaf`: The minimum number of samples in newly created leaves. Try `[1, 2, 3]`. If 3 is the best, try higher numbers, such as 1 through 10.\n",
" * ### Parameters that will make it easier to train your model\n",
"    * `n_jobs`: The number of processor cores used to train and test the model. Set this to -1 to use all available cores, and use `%%timeit` to compare against `n_jobs=1`. It should be much faster (especially when many trees are trained)."
]
},
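{
"cell_type": "markdown",
"metadata": {},
"source": [
"The parameters above can also be scanned together rather than one at a time. What follows is a rough sketch under stated assumptions, not the method used in the rest of this notebook: it runs on synthetic data from `make_classification`, the names `X_demo`, `y_demo`, `demo_model`, and `scores` are hypothetical, and `\"auto\"` is left out of the `max_features` options because newer scikit-learn releases no longer accept it. As in the cells below, each setting is scored by the C-statistic of the out-of-bag predictions."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Illustration only: synthetic data, hypothetical names\n",
"from itertools import product\n",
"\n",
"from sklearn.datasets import make_classification\n",
"from sklearn.ensemble import RandomForestRegressor\n",
"from sklearn.metrics import roc_auc_score\n",
"\n",
"X_demo, y_demo = make_classification(n_samples=500, n_features=10, random_state=42)\n",
"\n",
"max_features_options = [\"sqrt\", \"log2\", None, 0.9, 0.2]\n",
"min_samples_leaf_options = [1, 2, 3]\n",
"\n",
"scores = {}\n",
"for max_features, min_samples_leaf in product(max_features_options, min_samples_leaf_options):\n",
"    demo_model = RandomForestRegressor(n_estimators=100, oob_score=True, n_jobs=-1, random_state=42,\n",
"                                       max_features=max_features, min_samples_leaf=min_samples_leaf)\n",
"    demo_model.fit(X_demo, y_demo)\n",
"    # score each combination on its out-of-bag predictions\n",
"    scores[(max_features, min_samples_leaf)] = roc_auc_score(y_demo, demo_model.oob_prediction_)\n",
"\n",
"best = max(scores, key=scores.get)\n",
"print(\"Best (max_features, min_samples_leaf):\", best)\n",
"print(\"OOB C-stat:\", scores[best])"
]
},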
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### n_jobs"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1 loop, best of 3: 1.21 s per loop\n"
]
}
],
"source": [
"%%timeit\n",
"model = RandomForestRegressor(1000, oob_score=True, n_jobs=1, random_state=42)\n",
"model.fit(X, y)"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1 loop, best of 3: 708 ms per loop\n"
]
}
],
"source": [
"%%timeit\n",
"model = RandomForestRegressor(1000, oob_score=True, n_jobs=-1, random_state=42)\n",
"model.fit(X, y)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### n_estimators"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"30 trees\n",
"C-stat: 0.853875733657\n",
"\n",
"50 trees\n",
"C-stat: 0.860698345743\n",
"\n",
"100 trees\n",
"C-stat: 0.863521128261\n",
"\n",
"200 trees\n",
"C-stat: 0.862192290076\n",
"\n",
"500 trees\n",
"C-stat: 0.863739494456\n",
"\n",
"1000 trees\n",
"C-stat: 0.864043076726\n",
"\n",
"2000 trees\n",
"C-stat: 0.863449227197\n",
"\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYoAAAEACAYAAACtVTGuAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAHFFJREFUeJzt3XuQFfWd9/H3B7moMKCJBI1GAyIaWa+PQTbejooyZkvZ\njY8JbpUupmIsg6tldrloUsU8te4Cum4kS7Ib4gU33ipmH3WKNRFvJ1tSEvEREHEQNsSRixpFBS8B\nGfg+f3RPbE9mes4MZ+ZwZj6vqlN0//rXfX7dzJzP/H59OYoIzMzM2tOv2g0wM7O9m4PCzMxyOSjM\nzCyXg8LMzHI5KMzMLJeDwszMcpUVFJLqJa2RtFbSjDaWD5XUKGmFpFWSpmSWDZP0oKQmSaslnZpZ\n9rdp+SpJcyqyR2ZmVlH9O6ogqR8wHzgX2Awsk/RIRKzJVJsKrI6IiyQdBLwi6Z6IaAHmAY9GxCWS\n+gP7p9stABcCx0VES7qemZntZcrpUYwD1kVEc0TsBB4AJpXUCaAuna4DtqQf/kOBMyLiLoCIaImI\nbWm9q4E5aZgQEW/v4b6YmVk3KCcoDgU2ZOY3pmVZ84FjJW0GVgLXpeUjgbcl3SXpBUkLJO2XLhsD\nnClpqaSnJZ3S9d0wM7PuUqmT2ROB5RHxeeAk4EeShpAMbZ0M/CgiTgY+Amam6/QHDoyI8cB04OcV\naouZmVVQh+cogE3A4Zn5w9KyrCuA2QAR8VtJvwOOIemJbIiI59N6vwBaT4ZvBP5vus4ySbslfTYi\ntmQ3LMkPozIz64KIUCW2U06PYhkwWtIRkgYCk4HGkjrNwAQASSNIhpXWR8SbwAZJY9J65wIvp9MP\nA+ek64wBBpSGRKuI8KuLr1mzZlW9Db3t5WPqY1oLr0rqsEcREbskXQMsJgmWOyKiSdJVyeJYANwE\nLJT0Yrra9Ih4J52+FrhX0gBgPUnvA+BO4E5Jq4AdwOUV2yszM6uYcoaeiIhfAUeXlP0kM/06yXmK\nttZdCXy5jfKdwGWdaayZmfU835ndyxUKhWo3odfxMa08H9O9myo9llVpkmJvb6OZ2d5GEtGDJ7PN\nzKwPc1CYmVkuB4WZmeVyUJiZWS4HhZmZ5XJQmJlZLgeFmZnlclCYmVkuB4WZmeVyUJiZWS4HhZmZ\n5XJQmJlZLgeFmZnlclCYmVkuB4WZmeVyUJiZWS4HhZmZ5XJQmJlZLgeFmZnlclCYmVkuB4WZmeUq\nKygk1UtaI2mtpBltLB8qqVHSCkmrJE3JLBsm6UFJTZJWSzq1ZN2/k7Rb0mf2eG/MzKziOgwKSf2A\n+cBEYCxwqaRjSqpNBVZHxInA2cCtkvqny+YBj0bEl4ATgKbMtg8DzgOa93RHzMyse5TToxgHrIuI\n5ojYCTwATCqpE0BdOl0HbImIFklDgTMi4i6AiGiJiG2Z9X4ATNujPTAzs27Vv+MqHApsyMxvJAmP\nrPlAo6TNwBDgG2n5SOBtSXeR9CaeB66LiD9IugjYEBGrJO3JPph12a5d8MEHsHVr8tq2re1/W6e3\nbYMBA2Do0OQ1bFjb09n5wYOhn88GWg0rJyjKMRFYHhHnSDoSeFzS8en2TwamRsTzkm4DZkqaA9xI\nMuzUqt20aGho+ON0oVCgUChUqNlWqyJg+/byPtzzyj78EIYM+fQHe/bf1umjjvokAHbu/CQ0tm2D\n5uZPz2dDZds2+Oij5D3aC5JyQ2fwYPDfVNaeYrFIsVjslm0rIvIrSOOBhoioT+dnAhERczN1FgGz\nI2JJOv8kMIOkJ/JsRIxKy09Py28AngA+IgmIw4BNwLiI+H3J+0dHbbTa0tIC77/f9Q/31n/79cv/\ncC9nWV1d9/+1v2tXsr9thUhn5v/wh6S9HYVMR/MOnL5BEhFRkf/pcnoUy4DRko4AXgcmA5eW1GkG\nJgBLJI0AxgDrI+IdSRskjYmItcC5wMsR8RJw
cOvKkn4HnBwR7+75Lll3iUj+Oi73g7y9Za0feB19\nuB9ySP4H/6BB1T4i5dlnHzjggOS1J1oDtqNg+d3v8kNnx45PAmdPQmf//R04fUWHPQpILo8luXqp\nH3BHRMyRdBVJz2KBpEOAhcAh6SqzI+L+dN0TgNuBAcB64IqI2Fqy/fXAKRHxThvv7R5FBWSHS7ry\n4d46PXBg1/5yz/7rMfvqKu3RdaV30xo42QDpaujst58DpztUskdRVlBUk4Oi8yJg8WKYOxeampJf\n8I8/bv+Du9wP/rq6JCjMIPnjoxJDaqU/m10NHQfOpzkorE27d0NjI9x0UzK8c+ONcPbZyS+Uhwls\nb1V6cUBXQ6elZc+H04YOhX337R2/Kw4K+5Rdu+DBB+Ef/zH5i//734dJkzy8Y33Lxx93fkitrWW7\nd+cPqZUbOoMGVTdwHBTteOUVGD4cPtNHHgaycyfcey/80z8l+/3970N9fe/4a8isWnbsqMyQWmvg\ndPYy6NL5rl600dNXPdWM73wnubb+6ad791j69u1w113JOYjRo2HBAjjrLAeEWSUMGpS8Djpoz7az\nY0d5obJxY/vLt25Nfq+7MoRWSb2mRxGR/McefzwcfTT8+7/3QON62IcfJqHwz/8MJ50E3/se/Pmf\nV7tVZtadWgOns0Nq//3f7lH8iY0bk0crPPIIjB+ffKB++9vVblVlbNsGP/oR3HYbnHEGLFqUBIWZ\n9X6DBiVDy8OHd269So4w9JqgWLEi+fAcOhQefhhOPx3+7M/gK1+pdsu6bssW+OEPk5Cor4ennoKx\nY6vdKjPra3rNdTHLl8OJJybTY8YkY/iXXAKbN1e3XV3xxhswfXqyH5s3w9KlcM89Dgkzq45eExQr\nVnwSFAB/8RfJye2LL07G+GrBhg1w7bVw7LHJfRDLl8NPf5qcsDYzq5ZeFRSl4/Y33gif/zxcc01y\nsntvtX59cj7lhBOS8cjVq+Ff/xUOP7zaLTMz6yVB8d578NZbcOSRny6XYOFCePZZ+MlPqtK0XE1N\ncPnlMG4cjBgBa9fCLbckD8MzM9tb9IqgWLkSjjsueUpnqbq65OT2rFnwzDM937a2rFiRnD856yw4\n5hj47W/hH/5hz6/bNjPrDr0iKNoadsoaPRruvhu+/vXkMtpqWboULrwQvvrV5P6H9euT4bFhw6rX\nJjOzjvSKoMhe8dSe+vrkRPHFFyd3NveUCPj1r+G88+Ab34ALLkgC4rvfTb71zMxsb9crgqL0iqf2\nzJgBRxwBU6d2/8ntCPjVr5Ib5L71Lfjrv4Z165Irsfbdt3vf28yskmr+ER4ff5x8c9iWLcnz6Dvy\nwQfJTXhXXZUERqVlH/W9fXvymI1LLoH+vebWRjOrBX4oYMbq1TBqVHkhAclwz8MPJ+cIjjsOzjyz\nMu3wo77NrLeq+aAod9gpa9Qo+NnPYPJk+M1v4Atf6Pr7lz7q++ab/ahvM+tdav7v3Y6ueGrP+efD\n9dfD176W3AXdWdu3w7/9Gxx1VPJ4jQULkstvL7jAIWFmvUvNB0U5Vzy15+//Prl09uqryz+5/eGH\n8IMfJDf3/dd/wf33wxNPQKHggDCz3qmmg2L37uRmu64GhQS33570SubPz6+7dSvMnp0MWy1Zkjzq\ne9Eifx+EmfV+NX2O4tVXk5vVPvvZrm9j8GB46KFPTm4XCp9evmULzJsHP/6xH/VtZn1TTfco9mTY\nKWvkyOQ8w6WXwmuvJWWtj/o+6ih4/XU/6tvM+q6ygkJSvaQ1ktZKmtHG8qGSGiWtkLRK0pTMsmGS\nHpTUJGm1pFPT8pvTshWS/lNSp7/ltStXPLVnwgSYNg3+6q8+/ajvFSv8qG8z69s6DApJ/YD5wERg\nLHCppGNKqk0FVkfEicDZwK2SWoe15gGPRsSXgBOAprR8MTA2XWcdcENnG9/VK57ac/31yX0VgwbB\nyy/7Ud9m
ZlDeOYpxwLqIaAaQ9AAwCViTqRNAXTpdB2yJiJa0l3BGREwBiIgWYFs6/URm/aXAxZ1t\nfKWGnlpJyRVNZmb2iXKGng4FNmTmN6ZlWfOBYyVtBlYC16XlI4G3Jd0l6QVJCyS1dQ/1N4Ffdqbh\nb72VPI7ji1/szFpmZtZZlbrqaSKwPCLOkXQk8Lik49PtnwxMjYjnJd0GzARmta4o6XvAzoi4r72N\nNzQ0/HG6UChQKBT+eFms710wM4NisUixWOyWbXf4UEBJ44GGiKhP52cCERFzM3UWAbMjYkk6/yQw\ng6Qn8mxEjErLTwdmRMSF6fwU4ErgnIho85ut23so4C23wKZNcNttndthM7O+oJIPBSxn6GkZMFrS\nEZIGApOBxpI6zcCEtHEjgDHA+oh4E9ggaUxa71zg5bRePTANuKi9kMhTySuezMysfWU9Zjz9UJ9H\nEix3RMQcSVeR9CwWSDoEWAi0ftvz7Ii4P133BOB2YACwHrgiIrZKWgcMBLak6yyNiO+08d5t9ijG\njoX77oMTTujU/pqZ9QmV7FHU5PdRfPRR8v3S772XPNLbzMw+raeHnvY6L70ERx/tkDAz6wk1GRSV\nvtHOzMzaV5NBUekb7czMrH01GRS+4snMrOfU3MnsXbuSR4tv2pT8a2Zmf6pPn8xetw5GjHBImJn1\nlJoLCg87mZn1rJoMCl/xZGbWc2ouKHzFk5lZz6qpoIhwUJiZ9bSaCoo33kjC4tDSb8MwM7NuU1NB\n0dqb8HdQmJn1nJoKCl/xZGbW82ouKHzFk5lZz6qpoPCJbDOznlczj/B4/304+GDYuhX6V+qbvs3M\neqk++QiP116Dww93SJiZ9bSaCYrt22G//ardCjOzvqemgmLffavdCjOzvsdBYWZmuRwUZmaWy0Fh\nZma5HBRmZparrKCQVC9pjaS1kma0sXyopEZJKyStkjQls2yYpAclNUlaLenUtPxASYslvSLpMUm5\n31nnoDAzq44Og0JSP2A+MBEYC1wq6ZiSalOB1RFxInA2cKuk1jse5gGPRsSXgBOAprR8JvBERBwN\nPAXckNcOB4WZWXWU06MYB6yLiOaI2Ak8AEwqqRNAXTpdB2yJiBZJQ4EzIuIugIhoiYhtab1JwN3p\n9N3AX+Y1wkFhZlYd5QTFocCGzPzGtCxrPnCspM3ASuC6tHwk8LakuyS9IGmBpNbb5j4XEW8CRMQb\nwOfyGuGgMDOrjko9EGMisDwizpF0JPC4pOPT7Z8MTI2I5yXdRjLkNAsofQZJuw+damho4KmnYJ99\noFgsUCgUKtRsM7PeoVgsUiwWu2XbHT4UUNJ4oCEi6tP5mUBExNxMnUXA7IhYks4/Ccwg6Yk8GxGj\n0vLTgRkRcaGkJqAQEW9KOhh4Oj2PUfr+ERFMmwbDh8P06ZXYbTOz3q2nHwq4DBgt6QhJA4HJQGNJ\nnWZgQtq4EcAYYH06tLRB0pi03rnAy+l0IzAlnf4b4JG8RnjoycysOjoceoqIXZKuARaTBMsdEdEk\n6apkcSwAbgIWSnoxXW16RLyTTl8L3CtpALAeuCItnwv8XNI3SYLm63ntcFCYmVVHzXwfxWWXwXnn\nweWXV7tFZmZ7vz75fRTuUZiZVYeDwszMcjkozMwsl4PCzMxyOSjMzCyXg8LMzHI5KMzMLJeDwszM\ncjkozMwsl4PCzMxyOSjMzCxXTQRFSwvs2gUDBlS7JWZmfU9NBMWOHUlvQhV5vJWZmXVGTQWFmZn1\nvJoICp+fMDOrHgeFmZnlclCYmVkuB4WZmeVyUJiZWS4HhZmZ5XJQmJlZLgeFmZnlclCYmVmusoJC\nUr2kNZLWSprRxvKhkholrZC0StKUzLJXJa2UtFzSc5nyEyQ921ou6ZT23t9BYWZWPf07qiCpHzAf\nOBfYDCyT9EhErMlUmwqsjoiLJB0EvCLpnohoAXYDhYh4t2TTNwOzImKxpA
uAW4Cz22qDg8LMrHrK\n6VGMA9ZFRHNE7AQeACaV1AmgLp2uA7akIQGgdt5nNzAsnT4A2NReAxwUZmbV02GPAjgU2JCZ30gS\nHlnzgUZJm4EhwDcyywJ4XNIuYEFE/DQtvx54TNKtJGHylfYa4KAwM6uecoKiHBOB5RFxjqQjSYLh\n+Ij4ADgtIl6XNDwtb4qIZ4Crgesi4mFJ/xu4EzivrY0/9lgD/fpBQwMUCgUKhUKFmm1m1jsUi0WK\nxWK3bFsRkV9BGg80RER9Oj8TiIiYm6mzCJgdEUvS+SeBGRHxfMm2ZgHvR8S/SHovIg7ILNsaEcMo\nISmmTQsOOgimT+/6jpqZ9SWSiIiKfItPOecolgGjJR0haSAwGWgsqdMMTEgbNwIYA6yXtL+kIWn5\nYOB8YFW6ziZJZ6XLzgXWttcADz2ZmVVPh0NPEbFL0jXAYpJguSMimiRdlSyOBcBNwEJJL6arTY+I\ndySNBB6SFOl73RsRj6d1rgR+KGkfYDvw7fba4KAwM6ueDoeeqk1SXHZZMGECXH55tVtjZlYbenro\nqercozAzqx4HhZmZ5XJQmJlZLgeFmZnlclCYmVkuB4WZmeVyUJiZWS4HhZmZ5XJQmJlZLgeFmZnl\nclCYmVmumgiKlhYYMKDarTAz65tqIij23RdUkUdbmZlZZ9VMUJiZWXU4KMzMLJeDwszMcjkozMws\nl4PCzMxyOSjMzCyXg8LMzHI5KMzMLJeDwszMcjkozMwsV1lBIale0hpJayXNaGP5UEmNklZIWiVp\nSmbZq5JWSlou6bmS9f5WUlO6zpz23t9BYWZWPf07qiCpHzAfOBfYDCyT9EhErMlUmwqsjoiLJB0E\nvCLpnohoAXYDhYh4t2S7BeBC4LiIaEnXa5ODwsysesrpUYwD1kVEc0TsBB4AJpXUCaAuna4DtqQh\nAaB23udqYE5rvYh4u70GOCjMzKqnnKA4FNiQmd+YlmXNB46VtBlYCVyXWRbA45KWSboyUz4GOFPS\nUklPSzqlvQY4KMzMqqfDoacyTQSWR8Q5ko4kCYbjI+ID4LSIeF3S8LS8KSKeSd/7wIgYL+nLwM+B\nUW1t/NlnG2hoSKYLhQKFQqFCzTYz6x2KxSLFYrFbtq2IyK8gjQcaIqI+nZ8JRETMzdRZBMyOiCXp\n/JPAjIh4vmRbs4D3I+JfJP2SZOjp1+my/wFOjYgtJevE3LnB9Ol7uqtmZn2HJCKiIt/kU87Q0zJg\ntKQjJA0EJgONJXWagQlp40aQDCutl7S/pCFp+WDgfOCldJ2HgXPSZWOAAaUh0cpDT2Zm1dPh0FNE\n7JJ0DbCYJFjuiIgmSVcli2MBcBOwUNKL6WrTI+IdSSOBhyRF+l73RsTitM6dwJ2SVgE7gMvba4OD\nwsysejoceqo2SXH33cHl7caImZmV6umhp6pzj8LMrHocFGZmlstBYWZmuRwUZmaWy0FhZma5HBRm\nZpbLQWFmZrkcFGZmlstBYWZmuRwUZmaWy0FhZma5aiIoBgyodgvMzPqumggKVeSxVmZm1hU1ERRm\nZlY9DgozM8vloDAzs1wOCjMzy+WgMDOzXA4KMzPL5aAwM7NcDgozM8vloDAzs1wOCjMzy1VWUEiq\nl7RG0lpJM9pYPlRSo6QVklZJmpJZ9qqklZKWS3qujXX/TtJuSZ/Zoz0xM7Nu0b+jCpL6AfOBc4HN\nwDJJj0TEmky1qcDqiLhI0kHAK5LuiYgWYDdQiIh329j2YcB5QHMF9sXMzLpBOT2KccC6iGiOiJ3A\nA8CkkjoB1KXTdcCWNCQAlPM+PwCmda7JZmbWk8oJikOBDZn5jWlZ1nzgWEmbgZXAdZllATwuaZmk\nK1sLJV0EbIiIVV1quZmZ9YgOh57KNBFYHhHnSDqSJBiOj4gPgNMi4nVJw9PyJuD/ATeSDDu1avdh\n4g0NDX+cLhQKFAqFCjXbzKx3KBaLFI
vFbtm2IiK/gjQeaIiI+nR+JhARMTdTZxEwOyKWpPNPAjMi\n4vmSbc0C3gcWA08AH5EExGHAJmBcRPy+ZJ3oqI1mZvZpkoiIinybTzlDT8uA0ZKOkDQQmAw0ltRp\nBiakjRsBjAHWS9pf0pC0fDBwPvBSRLwUEQdHxKiIGEkynHVSaUiYmVn1dTj0FBG7JF1D0gvoB9wR\nEU2SrkoWxwLgJmChpBfT1aZHxDuSRgIPSYr0ve6NiMVtvQ05Q09mZlY9HQ49VZuHnszMOq+nh57M\nzKwPc1CYmVkuB4WZmeVyUJiZWS4HhZmZ5XJQmJlZLgeFmZnlclCYmVkuB4WZmeVyUJiZWS4HhZmZ\n5XJQmJlZLgeFmZnlclCYmVkuB4WZmeVyUJiZWS4HhZmZ5XJQmJlZLgeFmZnlclCYmVkuB4WZmeVy\nUJiZWa6ygkJSvaQ1ktZKmtHG8qGSGiWtkLRK0pTMslclrZS0XNJzmfKbJTWl6/ynpKEV2SMzM6uo\nDoNCUj9gPjARGAtcKumYkmpTgdURcSJwNnCrpP7pst1AISJOiohxmXUWA2PTddYBN+zZrlhbisVi\ntZvQ6/iYVp6P6d6tnB7FOGBdRDRHxE7gAWBSSZ0A6tLpOmBLRLSk82rrfSLiiYjYnc4uBQ7rbOOt\nY/4FrDwf08rzMd27lRMUhwIbMvMb07Ks+cCxkjYDK4HrMssCeFzSMklXtvMe3wR+WV6TzcysJ/Xv\nuEpZJgLLI+IcSUeSBMPxEfEBcFpEvC5peFreFBHPtK4o6XvAzoi4r0JtMTOzSoqI3BcwHvhVZn4m\nMKOkziKSQGidfxI4pY1tzQK+m5mfAiwBBuW8f/jll19++dX5V0ef7+W+yulRLANGSzoCeB2YDFxa\nUqcZmAAskTQCGAOsl7Q/0C8iPpA0GDgf+D+QXEkFTAPOjIgd7b15RKiMNpqZWTdR+ld7fqXkQ30e\nyTmNOyJijqSrSBJrgaRDgIXAIekqsyPifkkjgYdI0q0/cG9EzEm3uQ4YCGxJ11kaEd+p3K6ZmVkl\nlBUUZmbWd/nO7BrX1g2Nkg6UtFjSK5IekzQsU/8GSevSmx3Pr17L9x6S7pD0pqQXM2WdPoaSTpb0\nYnpj6m09vR97k3aO6SxJGyW9kL7qM8t8TDsg6TBJT0land7YfG1a3v0/q5U62eFXdV7AeuDAkrK5\nwPR0egYwJ50+FlhOMgz4ReB/SHuVffkFnA6cCLy4J8cQ+A3w5XT6UWBitfdtLzums8hczJIp/5KP\naVnH9GDgxHR6CPAKcExP/Ky6R1H72rqhcRJwdzp9N/CX6fRFwAMR0RIRr5LcET+OPi6Sy7XfLSnu\n1DGUdDBQFxHL0nr/kVmnz2nnmELy81pqEj6mHYqINyJiRTr9AdBEcqNyt/+sOihqX/DJDY3fSstG\nRMSbkPxwAZ9Ly0tvntzEn948aYnPdfIYHkpyM2qrtm5MNbgmfb7b7ZkhEh/TTpL0RZIe21I6//ve\n6ePqoKh9p0XEycBXgamSziAJjyxfsbDnfAz33I+BUZE83+0N4NYqt6cmSRoC/AK4Lu1ZdPvvu4Oi\nxkXE6+m/bwEPkwwlvZnez0Lazfx9Wn0T8IXM6oelZfanOnsMfWw7EBFvRTooDvyUT4Y9fUzLlD5s\n9RfAzyLikbS4239WHRQ1TNL+6V8XZG5oXAU0ktz1DvA3QOsPVCMwWdLA9B6X0cBzGCRj59nx804d\nw7TLv1XSOEkCLs+s01d96pimH2Ktvga8lE77mJbvTuDliJiXKev+n9Vqn8n3a4+ughgJrCC5smEV\nMDMt/wzwBMlVEYuBAzLr3EBy9UMTcH6192FveAH3AZuBHcBrwBXAgZ09hsD/Sv8f1gHzqr1fe+Ex\n/Q/gxfRn9mGSsXUf0/KP6WnArszv/AtAfVd+3zt7XH3DnZmZ5fLQk5mZ5XJQmJlZLgeFmZnlclCY\nmV
kuB4WZmeVyUJiZWS4HhZmZ5XJQmJlZrv8PkEXkUeq6ZWYAAAAASUVORK5CYII=\n",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"results = []\n",
"n_estimator_options = [30, 50, 100, 200, 500, 1000, 2000]\n",
"\n",
"for trees in n_estimator_options:\n",
"    model = RandomForestRegressor(trees, oob_score=True, n_jobs=-1, random_state=42)\n",
"    model.fit(X, y)\n",
"    print(trees, \"trees\")\n",
"    # C-statistic (ROC AUC) computed on the out-of-bag predictions\n",
"    roc = roc_auc_score(y, model.oob_prediction_)\n",
"    print(\"C-stat: \", roc)\n",
"    results.append(roc)\n",
"    print(\"\")\n",
"\n",
"pd.Series(results, n_estimator_options).plot();"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### max_features"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"auto option\n",
"C-stat: 0.864043076726\n",
"\n",
"None option\n",
"C-stat: 0.864043076726\n",
"\n",
"sqrt option\n",
"C-stat: 0.86337466313\n",
"\n",
"log2 option\n",
"C-stat: 0.86337466313\n",
"\n",
"0.9 option\n",
"C-stat: 0.863534443273\n",
"\n",
"0.2 option\n",
"C-stat: 0.86337466313\n",
"\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYkAAAD7CAYAAACfQGjDAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAEQhJREFUeJzt3XuwbnVdx/H3h9sgd1HZKgw7TA1xEKREFMoT3o6aoNgF\nNEMU8Y8YLM2gmjxHbcbQdLKUJtQhzLyRaJRTYOrDpEmg3JWb4kEuHTA1QCtE/PbHXke3m/M7l72f\ntZ+1n/N+zTzDuq/fb36b83nW7/estVJVSJK0MdtNugCSpOEyJCRJTYaEJKnJkJAkNRkSkqQmQ0KS\n1LTDpAuwMUn8Xa4kLUJVZZzHG+yVRFVN7WfNmjUTL4P1s37bWt22hfr1YbAhIUmaPENCktRkSEzA\nqlWrJl2EXlm/lWua6wbTX78+pK9+rKVIUkMslyQNWRJqWxm4liRNniEhSWoyJCRJTYO8mQ7m+tak\ncZmZmWX9+nWTLoa04gx24BqGVy6tZOntZiNpKBy4liQtK0NCktRkSEiSmsYaEklWJ7k+yY1JTt/I\n+pcmuar7fD7JweM8vyRpvMY2cJ1kO+BG4JnAHcBlwPFVdf28bY4Arququ5OsBtZW1REbOZYD1xoz\nB641/YY+cH04cFNV3VJV9wMfAY6dv0FVXVJVd3ezlwD7jvH8kqQxG2dI7AvcOm/+NjYdAicD/zzG\n80uSxmwiN9Ml+WXgJOCo9lZr502v6j6SpA1GoxGj0ajXc4xzTOII5sYYVnfzZwBVVWcu2O5JwMeB\n1VX19caxHJPQmDkmoek39DGJy4DHJplNshNwPHDB/A2S7M9cQLy8FRCSpOEYW3dTVT2Q5FTgIubC\n5/1VdV2S18ytrrOBPwb2Bs7K3MOZ7q+qw8dVBknSePnsJm0j7G7S9Bt6d5MkacoYEpKkJkNCktRk\nSEiSmgwJSVLTYF9fCr6+VOMzMzM76SJIK9JgQ8KfK0rS5NndJElqMiQkSU2GhCSpyZCQJDUZEpKk\nJkNCktRkSEiSmgwJSVKTISFJajIkJElNhoQkqcmQkCQ1GRKSpCZDQpLUZEhIkpoMCUlS02BfOpT4\nZjqN38zMLOvXr5t0MaQVI0N8A1ySguGVS9MgvvVQUysJVTXWb9h2N0mSmgwJSVKTISFJahprSCRZ\nneT6JDcmOX0j6/dKcn6Sq5JckuSgcZ5fkjReYwuJJNsB7waeCzwROCHJgQs2+0Pgiqo6BDgR+Itx\nnV+SNH7jvJI4HLipqm6pqvuBjwDHLtjmIOCzAFV1A/AzSR4xxjJIksZonCGxL3DrvPnbumXzXQUc\nB5DkcGB/YL8xlkGSNEbLfTPdnwLvSnI5cA1wBfDAxjddO296VfeRJG0wGo0YjUa9nmNsN9MlOQJY\nW1Wru/kzgKqqMzexzzeAg6vqewuWezOdeuLNdJpeQ7+Z7jLgsUlmk+wEHA9cMH+DJHsm2bGbfjVw\n8cKAkCQNx9i6m6rqgSSnAhcxFz7vr6rrkrxmbnWdDTwBODfJj4CvAK8a1/klSePns5u0jbG7SdNr\n6N1NkqQpY0hIkpoMCUlSkyEhSWoyJCRJTYN9fSn4+lKN38zM7KSLIK0ogw0Jf6YoSZNnd5MkqcmQ\nkCQ1GRKSpCZDQpLUZEhIkpoMCUlSkyEhSWoyJCRJTYaEJKnJkJAkNRkSkqQmQ0KS1GRISJKaDAlJ\nUpMhIUlqMiQkSU2DfelQ4pvpND4zM7OsX79u0sWQVpwM8Q1wSQqGVy6tZPFth5p6SaiqsX7DtrtJ\nktRkSEiSmgwJSVLTokIiyb2LPWGSDya5PsnVSd6XZPvFHkuS1K/FXkksZQTwg1V1YFU9CdgFOHkJ\nx5Ik9WjJ3U1J3p7kmiRXJfn1blmSnJXkq0kuTPKpJMcBVNW/zNv9UmC/pZZBktSPJd0nkeQlwJOq\n6uAk+wCXJbkYOArYv6oOSjIDXAe8f8G+OwAv
B05bShkkSf1Z6pXEkcCHAarqLmAEHM5cSJzXLb8T\n+NxG9j0LuLiqvrDEMkiSejLuO67DFoxXJHkj8PCqOqW91dp506u6jyRpg9FoxGg06vUci7rjOsm9\nVbV7khcDpwAvAB7G3BjDU4FfAk4EjgH2Ab4KvLqqzk9yMnAScHRV3dc4vndca8y841rTr487rhd7\nJVEAVfWJJEcAVwE/At5QVXcl+ThwNPAV4Fbgy8Dd3b5/BawDLpkLA86vqj9ZfBUkSX3p7dlNSXat\nqu8n2Rv4D+DIbtxiS/b1SkJj5pWEpt+QriS2xD8l2QvYEXjzlgaEJGk4fAqsthFeSWj6+RRYSdKy\nMiQkSU2GhCSpabCvL527L08aj5mZ2UkXQVqRBhsSDjJK0uTZ3SRJajIkJElNhoQkqcmQkCQ1GRKS\npCZDQpLUZEhIkpoMCUlSkyEhSWoyJCRJTYaEJKnJkJAkNRkSkqQmQ0KS1GRISJKaDAlJUpMhIUlq\nGuyb6RJfX6rxmZmZZf36dZMuhrTiZIivCU1SMLxyaSWLr8TV1EtCVY31G7bdTZKkJkNCktQ0kZBI\nckiS503i3JKkLbfsIZFke+BQ4PnLfW5J0tZZ9MB1kl2AjwH7AtsDbwHuAf4c+D7wBeAxVfXCJGuA\nnwUOAG4FjgR2Bm4H3lpV5y04tgPXGjMHrjX9+hi4XspPYFcDt1fVrwAk2QO4FlhVVTcn+Sg//S/9\nE4Ajq+oHSU4Efr6qTlvC+SVJPVtKd9M1wLOTvDXJUcxdJdxcVTd36z+4YPsLquoHSzifJGmZLfpK\noqpuSnIYc2MLbwE+u5ldvr91Z1g7b3pV95EkbTAajRiNRr2eYyljEo8CvlNV9yV5AXAqc11KR3fd\nTR8CdquqY7oxiXur6p3dvscBx1TVKxrHdkxCY+aYhKbf0G6mOxi4NMkVwBuBPwJOAT6V5EvAnZvY\n93PAQUkuT/JrSyiDJKlHvT2WI8kzgNdX1TGL2NcrCY2ZVxKafkO7kpAkTTkf8KdthFcSmn5eSUiS\nlpUhIUlqMiQkSU2DfTMd+GY6jc/MzOykiyCtSIMNCQcZJWny7G6SJDUZEpKkJkNCktRkSEiSmgwJ\nSVKTISFJajIkJElNhoQkqcmQkCQ1GRKSpCZDQpLUZEhIkpoMCUlSkyEhSWoyJCRJTYaEJKnJkJAk\nNQ32zXSJry/V8pmZmWX9+nWTLoY0OBnia0KTFAyvXJpm8ZW5WvGSUFVj/YZtd5MkqcmQkCQ1GRKS\npKbNhkSSHyV5+7z51yd5Y7/FkiQNwZZcSdwHHJdk774LI0kali0JiR8CZwOvW7giyWySzyS5Msmn\nk+zXLT8nybuSfCHJ15IcN2+f30tyabfPmrHVRJI0dlsSEgW8B3hZkt0XrPtL4JyqOhT4UDe/wSOr\n6kjghcCZAEmeDTyuqg4Hngz8QpKjllgHSVJPtuhmuqr6XpJzgdcC/ztv1dOAF3fTf0sXBp1Pdvte\nl2SfbtlzgGcnuRwIsCvwOODzDz7r2nnTq7qPJGmD0WjEaDTq9RybvZkuyT1VtUeShwKXA+cAVNWb\nk9wFPKqqHkiyA3BHVe2T5BzgH6vq/AXH+DPghqp672bO6c10WmbeTKeVb1I30wWgqr4LfAx41bx1\n/w6c0E3/JvBvmzoGcCHwyiS7AiR5dJJHbG2hJUnLY0vHJDZ4B/CwectOA05KciXwMua6oxbu8+P5\nqvo0c2MXX0xyNXAesNviii5J6pvPbpIAu5s0DXx2kyRpWRkSkqQmQ0KS1GRISJKaBvtmup/8albq\n38zM7KSLIA3SYEPCX5pI0uTZ3SRJajIkJElNhoQkqcmQkCQ1GRKSpCZDQpLUZEhIkpoMCUlSkyEh\nSWoyJCRJTYaEJKnJkJAkNRkSkqQmQ0KS1GRISJKaDAlJUpMhIUlqGuyb6RJfX6rlMzMzy/r16yZd\nDGlwMsTX
hCYpGF65NM3iK3O14iWhqsb6DdvuJklSkyEhSWoyJCRJTb2GRJJjkxzY5zkkSf3p+0ri\nRcATez6HJKknWx0SST6R5LIk1yQ5uVt277z1L0lyTpKnAccAb0tyeZIDkhyS5ItJrkzy8SR7jq8q\nkqRxW8yVxElV9RTgKcBrk+zNg3+vWlX1ReAC4A1VdVhVfQP4QDd/KHAtsHbxRZck9W0xN9P9TpIX\nddP7AY/bkp2S7AHsWVWf7xadC3ysvcfaedOruo8kaYPRaMRoNOr1HFsVEkmeARwNPLWq7kvyOWBn\nfvpKYufxFG3teA4jSVNq1apVrFq16sfzb3rTm8Z+jq3tbtoT+G4XEAcCR3TL70zyc0m2A148b/t7\ngT0Aquoe4LtJjuzWvRy4ePFFlyT1basey5FkJ+CTwCxwA7AXc1/5Hw68DbgL+BKwW1W9MsnTgfcC\n/wf8KrA78NfAQ4CbmRvfuHsj5/GxHFpmPpZDK18fj+Xw2U0SYEhoGvjsJknSsjIkJElNhoQkqcmQ\nkCQ1GRKSpKbBvr4UfH2pls/MzOykiyAN0mBDwp8jStLk2d0kSWoyJCRJTYaEJKnJkJiAvh/tO2nW\nb+Wa5rrB9NevD4bEBEz7H6r1W7mmuW4w/fXrgyEhSWoyJCRJTQN+VLgkaWttE++TkCQNg91NkqQm\nQ0KS1NR7SCRZneT6JDcmOX0j6/dIckGSK5Nck+QV89atS3JVkiuSXDpv+UOTXJTkhiQXJtmz73q0\n9FS/NUluS3J591m9TNV5kCXWb88k5yW5LslXkjy1Wz4t7deq34pvvySP7/4uL+/+e3eS07p1g2i/\nnuq24tuuW/e7Sa5NcnWSv0uyU7d869uuqnr7MBdCXwNmgR2BK4EDF2zzB8Bbu+mHA98GdujmbwYe\nupHjngn8fjd9OvCnfdZjAvVbA7xuEnUac/3+Bjipm94B2GPK2q9Vv6lovwXHuQPYbyjt12PdVnzb\nAY/u/m3ZqVv3UeC3Ftt2fV9JHA7cVFW3VNX9wEeAYxdsU8Du3fTuwLer6ofdfNj41c6xwLnd9LnA\ni8Za6i3XV/02rJu0RdcvyR7AL1bVOQBV9cOquqfbbsW332bqByu8/RZs8yzg61V1Wzc/hPbrq24w\nHW23PbBrkh2AXYDbu+Vb3XZ9h8S+wK3z5m/rls33buCgJHcAVwGvnbeugE8nuSzJq+ct36eq7gSo\nqvXAPmMv+Zbpq34Ap3aXke+bYHfMUup3APBfSc7pLtvPTvKQbt00tN+m6gcrv/3m+w3gw/Pmh9B+\nfdUNVnjbVdUdwDuAbzIXDv9dVZ/p9tnqthvCwPVzgSuq6tHAk4H3JNmtW3dkVR0GPB/47SRHNY4x\n5N/xLqZ+ZwGPqapDgfXAO5e70FuhVb8dgMOA93R1/B/gjG6fhd/UVmL7bap+09B+ACTZETgGOG8T\nxxhq+y2mbiu+7ZLsxdwVwyxzXU+7JXlp4xibbbu+Q+J2YP958/vxk8ueDU4Czgeoqq8D3wAO7Ob/\ns/vvt4BPMHcJBnBnkhmAJI8E7uqp/JvTS/2q6lvVdRoC7wWe0lP5N2cp9bsNuLWqvtRt9/fM/aMK\nsH4K2q9Zvylpvw2eB3y5+xvdYAj///VStylpu2cBN1fVd6rqgW6bp3f7bHXb9R0SlwGPTTLbja4f\nD1ywYJtbmKsUXeEfD9ycZJcNqZ9kV+A5wLXdPhcAr+imTwT+oc9KbEIv9esab4Pj+Em9l9ui69dd\n0t6a5PHdds8EvtpNr/j221T9pqH95q0/gQd3xwyh/Xqp25S03TeBI5LsnCTM/W1e1+2z9W23DKP0\nq4EbgJuAM7plrwFO6aYfBVwIXN19TuiWH8DciP4VwDUb9u3W7Q38a3fci4C9+q7HMtfvA922VwKf\nBGZWWv26dYcw98d+JXPfZvaclvbbTP2mpf12Ab4F7L7gmINov57qNi1tt4
a5YLiauQHqHRfbdj6W\nQ5LUNISBa0nSQBkSkqQmQ0KS1GRISJKaDAlJUpMhIUlqMiQkSU2GhCSp6f8BbkhdwN1JIzQAAAAA\nSUVORK5CYII=\n",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
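   ],
   "source": [
    "# Compare out-of-bag (OOB) performance across max_features settings.\n",
    "# Note: \"auto\" was removed in newer scikit-learn releases; for regressors it meant all features, like None.\n",
    "results = []\n",
    "max_features_options = [\"auto\", None, \"sqrt\", \"log2\", 0.9, 0.2]\n",
    "\n",
    "for max_features in max_features_options:\n",
    "    model = RandomForestRegressor(n_estimators=1000, oob_score=True, n_jobs=-1,\n",
    "                                  random_state=42, max_features=max_features)\n",
    "    model.fit(X, y)\n",
    "    print(max_features, \"option\")\n",
    "    # Score the OOB predictions with the C-statistic (ROC AUC)\n",
    "    roc = roc_auc_score(y, model.oob_prediction_)\n",
    "    print(\"C-stat: \", roc)\n",
    "    results.append(roc)\n",
    "    print(\"\")\n",
    "\n",
    "# Horizontal bar chart, zoomed in so the small differences are visible\n",
    "pd.Series(results, max_features_options).plot(kind=\"barh\", xlim=(.85,.88));"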
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### min_samples_leaf"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1 min samples\n",
"C-stat: 0.864043076726\n",
"\n",
"2 min samples\n",
"C-stat: 0.869654022731\n",
"\n",
"3 min samples\n",
"C-stat: 0.871571384442\n",
"\n",
"4 min samples\n",
"C-stat: 0.873478094142\n",
"\n",
"5 min samples\n",
"C-stat: 0.874269005848\n",
"\n",
"6 min samples\n",
"C-stat: 0.874029335634\n",
"\n",
"7 min samples\n",
"C-stat: 0.873304998988\n",
"\n",
"8 min samples\n",
"C-stat: 0.871866977705\n",
"\n",
"9 min samples\n",
"C-stat: 0.869294517411\n",
"\n",
"10 min samples\n",
"C-stat: 0.867430415748\n",
"\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYQAAAEACAYAAACznAEdAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3Xl8VPW5x/HPAyKKrG7IJooUFa0CKuKCjYCCVnGtgBbX\noq0LtvVWxLaCvd4r2utVW+xtsSAWwu4C4gKiRrCtAgqKCoJAMRBFQRCpioE894/fiYwxkEkyyZnl\n+3695pVzzpxz5hlI5pnfbu6OiIhInbgDEBGR9KCEICIigBKCiIhElBBERARQQhARkYgSgoiIAEkm\nBDPrY2bLzGy5mQ0p5/nGZjbDzBab2RIzuzI63sHMFpnZG9HPz8xscMJ1N5nZ0uiaESl7VyIiUmlW\n0TgEM6sDLAd6AkXAAqC/uy9LOGco0Njdh5rZ/sB7QHN3317mPmuBru6+1szygNuBs919u5nt7+4b\nUvv2REQkWcmUELoCK9x9jbsXA5OA88qc40CjaLsRsDExGUR6ASvdfW20/zNgROl5SgYiIvFKJiG0\nAgoT9tdGxxKNBDqaWRHwJnBzOffpB0xM2O8AnGZmr5rZS2Z2fPJhi4hIqqWqUbk3sMjdWwKdgYfM\nrGHpk2ZWD+gLTE24Zg+gmbt3A24FpqQoFhERqYI9kjhnHXBwwn7r6Fiiq4C7Adx9pZmtBo4AFkbP\nnwW87u6fJFxTCDweXbPAzErMbD9335h4YzPTZEsiIlXg7laZ85MpISwA2ptZWzPbE+gPzChzzhpC\nGwFm1pxQHbQq4fkBfLu6COBJoEd0TQegXtlkUMrd0+oxbNiw2GPIhJjSNS7FpJhyIa6qqLCE4O47\nzOxGYDYhgYx296Vmdl142kcBdwFjzeyt6LJb3f3T6MO+ASFZXFvm1o8AY8xsCbANuLxK70BERFIi\nmSoj3P054PAyx/6SsP0hoR2hvGu/AA4o53gxMLAywYqISM3RSOUqyMvLizuE70jHmCA941JMyVFM\nyUvXuCqrwoFpcTMzT/cYRUTSjZnhNdCoLCIiOUAJQUREACUEERGJKCGIiAighCAiIhElBBERAZQQ\nREQkooQgIiKAEoKIiESUEEREBFBCEBGRiBKCiIgASggiIhJRQhAREUAJQUREIkoIIiICKCGIiEhE\nCUFERAAlBBERiSghiIgIoIQgIiIRJQQREQGUEEREJKKEICIigBKCiIhE9og7AJHa4A4vvwzjxkFJ\nCbRtGx4HHxx+tmkD9evHHaVIvMzdKz7JrA/wAKFEMdrd7ynzfGNgPHAwUBe4z93HmlkHYDLggAHt\ngN+6+x8Srr0F+D2wv7t/Ws5rezIxipTn449h7Fj4619hzz3hmmugUSP44ANYs2bno6gI9ttvZ4JI\nTBal202bxv1uRJJnZri7Veqaij5szawOsBzoCRQBC4D+7r4s4ZyhQGN3H2pm+wPvAc3dfXuZ+6wF\nTnT3wuhYa+CvwOHAcUoIkgolJTBnDjz8MDz/PFxwAVx7LXTrBraLP48dO+DDD7+dKMomjTp1yk8U\npdsHHRTOEUkHVUkIyVQZdQVWuPua6EUmAecByxLOcaBRtN0I2JiYDCK9gJWlySByP/ArYEZlghYp\nT1ERjBkDo0dDs2YwaFAoGTRpUvG1detC69bhcfLJ333eHTZv/m6iWLhw5/amTeH6XSWNNm1gr71S\n/75FUiWZhNAKSPwQX0tIEolGAjPMrAhoCPQr5z79gImlO2bWFyh09yW2q69tIhXYsQOefTaUBubO\nhUsugWnT4LjjUvs6ZiHJNGsGnTqVf85XX0Fh4beTxty5O7fXrg3Xl1e66No1lDBE4pSqRuXewCJ3\n72FmhwHPm9kx7r4VwMzqAX2B26L9vYHbgTMS7qGsIElbsyaUBsaMgVatQmkgPx8aNowvpr32gu99\nLzzKs2MHrF//7SqpZcvguefgiivgxBNh4EA4
/3zYZ5/ajV0EkksI6wiNxaVaR8cSXQXcDeDuK81s\nNXAEsDB6/izgdXf/JNo/DDgEeNNC8aA18LqZdXX3j8sGMHz48G+28/LyyMvLSyJsyTbFxfDUU6E0\nMH8+XHopPP00HHNM3JElp25daNkyPE466dvPffEFTJ8O48fDjTfCueeG5NCjR7hOpCIFBQUUFBRU\n6x7JNCrXJTQS9wQ+BOYDA9x9acI5DwEfu/udZtackAiOLW0kNrOJwHPu/uguXmM10MXdN5XznBqV\nc9zKlaEtYOxYaN8+NBBffDHsvXfckdWM9eth0qTQRbaoKCS+gQPh2GPjjkwySY30Mopu3Ad4kJ3d\nTkeY2XWAu/soM2sBjAVaRJfc7e4To2sbAGuAdu7++S7uvwo4Xr2MpNS2bfDkkzBqFCxZEj4Qf/IT\nOPLIuCOrXUuXhlLD+PGhcXzgwJAgWrWKOzJJdzWWEOKkhJBbli0LVULjxsH3vx9KA+efr0FjJSXw\nyivh3+Wxx6BLl5AcLrwwjKsQKUsJQTLSl1+GnkEPPwwrVsCVV4YBZO3bxx1Zevrqq9CWMn58GH19\n9tkhOZxxBuyhuQckooQgGeWtt0ISmDAhdLscNCg0ptarF3dkmWPDBpg8OSSH1auhf/+QHLp02fUg\nPMkNSgiS9rZuDR9gDz8c+uVfcw1cfXXoiy/Vs2LFzvaG+vV3tjfo3zY3KSFI2nr99ZAEpkyB7t1D\naaBPH1Vx1AR3+Oc/Q3vD1Klw9NEhOVx8cXKjtiU7KCFIWvnss1Ad9PDDYVqHa66Bq65SD5natG1b\nGMk9bhy88AKceWZIDr17h8n+JHspIUjaeOWVMKlcXl4oDfTqpYnf4rZpUyihjR8P770XpvkYODC0\n36i9IfsoIUhamDMHBgwIU0mceWbc0Uh5Vq0K/z/jxoX9H/84PNq1izcuSR0lBIndU0+FqqHHHgtt\nBZLe3GHBgpAYJk+GDh1CqeFHP4J99407OqkOJQSJ1eTJMHgwzJwJJ5wQdzRSWcXFMGtWSA6zZsFF\nF8H990PjxnFHJlVRlYSgWl1JiTFj4Be/CAvSKBlkpnr14JxzQmJfsybsd+oEr74ad2RSW1RCkGr7\n4x/h978PyeDww+OORlLpiSfgpz+Fm26CoUM182omUZWR1LoRI0K30hdegEMOiTsaqQlr14Z2hZKS\n0EOpTZu4I5JkqMpIao07/OY38Le/wbx5SgbZrHXr0HPsrLPg+OPDvFOSnVRCkEpzD+0FL78Ms2fD\nAQfEHZHUltKFiU4/HR54QCu7pTOVEKTG7dgRpqR+7TV46SUlg1zTtSssWgRffx0m0HvjjbgjklRS\nQpCkFRfD5ZfD+++HkkHTpnFHJHFo1AgefRSGDQtTYNx3X2hfkMynKiNJyrZt0K9fSArTpmXv8pVS\nOatXw2WXQcOGIUm0aFHxNVI7VGUkNeKLL6Bv3zAz6RNPKBnIToceCnPnwkknhSqkmTPjjkiqQyUE\n2a0tW8JgpUMOCYPPNF217Mq8eWE+pL59w7iUvfaKO6LcphKCpNSnn4ZZSo86CsaOVTKQ3eveHRYv\nhvXrw2j1t9+OOyKpLCUEKdf69WHq6tNOgz/9SVNXS3KaNQtTX/zyl6Fr6p/+FLopS2ZQlZF8R2Fh\nKBlceinccYfmypeqWb48/A61bBmqG/ffP+6IcouqjKTaVq4MpYJrrw3dCpUMpKo6dIB//AOOOCJM\nkjdnTtwRSUVUQpBvLF0aFrT59a/DhGYiqTJnDlxxReiietddWr6zNqiEIFW2eDH06AH/9V9KBpJ6\nvXqF37Fly+Dkk0N1kqQfJQTh1VfDiNORI8NIZJGacMABMH06XH01nHJKaFdQ4T+9qMooxxUUhOUS\nH30Uzj477mgkV7z9dlh3u2NH+POfQ+8kSS1VGUmlPPNMSAZTpigZSO06+ugwc2rz5qHBed68uCMS\nUAkhZz32
GFx/PTz5ZJh2QCQuM2fCoEHhcccdGgCZKjVWQjCzPma2zMyWm9mQcp5vbGYzzGyxmS0x\nsyuj4x3MbJGZvRH9/MzMBkfP3WtmS6NrHjMzLeVdS8aNgxtvhOeeUzKQ+J1zTphG+9VXQ5fn1avj\njih3VZgQzKwOMBLoDRwFDDCzI8qcdgPwjrt3Ak4H7jOzPdx9ubt3dvcuwHHAv4HHo2tmA0dF16wA\nhqbkHclu/fnPYW3cF16Azp3jjkYkaNEifEG56KKw5sKECXFHlJuSKSF0BVa4+xp3LwYmAeeVOceB\nRtF2I2Cju28vc04vYKW7rwVw9znuXjqL+qtA66q8AUnefffBPfeElc46dow7GpFvq1MHbrkFZs2C\nO+8MPd62bIk7qtySTEJoBRQm7K+NjiUaCXQ0syLgTeDmcu7TD5i4i9e4Gng2iVikCtzDH9ioUWGq\n4sMOizsikV0rXYmtfv2wPX9+3BHljlQ13/QGFrl7DzM7DHjezI5x960AZlYP6AvcVvZCM/s1UOzu\nuywkDh8+/JvtvLw88vLyUhR29nOHW28N37rmzg29OkTS3T77wMMPh84P554LP/95+D2uWzfuyNJX\nQUEBBQUF1bpHhb2MzKwbMNzd+0T7twHu7vcknDMTuNvd/x7tvwAMcfeF0X5f4PrSeyRcdyUwCOjh\n7tt28frqZVRFJSWh8XjhwlA/u+++cUckUnmFhTBwYJhXa9w4aK3K5aTUVC+jBUB7M2trZnsC/YEZ\nZc5ZQ2gjwMyaAx2AVQnPD6BMdZGZ9QF+BfTdVTKQqtu+Ha66KgwAmjNHyUAyV5s2oRNEr15w4olQ\nVBR3RNkrqXEI0Yf3g4QEMtrdR5jZdYSSwigzawGMBUpXVL3b3SdG1zYgJIx27v55wj1XAHsCG6ND\nr7r79eW8tkoIlfT112ESsS1bwpKXDRrEHZFIavzudzB7Nrz0EtSrF3c06a0qJQQNTMsyX34JF18c\n/lgmTw4NcyLZoqQktCl06AD33x93NOlNU1fkuK1b4Yc/hCZNYOpUJQPJPnXqhHaE6dPDlCuSWkoI\nWWLz5rCWwWGHhT8YFaclW+27b+h9dMMN8O67cUeTXZQQssAnn4S1DLp2DWMN1DVPsl3nznDvvWFk\n8+efV3y+JEdtCBnuww9D74vzzw8rUWnJS8kl114LmzaF6iP97n+b2hByzAcfhMnALrssrHSmPwjJ\nNX/4Q5gMTw3MqaESQoZ6//1QMvj5z8NDJFetWRPGJ0yZEr4gSaASQo54913Iy4Pbb1cyEGnbNqz4\nN2BAqEKVqlNCyDCLF0PPnjBiRKg/FZGwJvh118Ell0BxcdzRZC5VGWWQ116Dvn3hoYfC4DMR2al0\n0Nrhh8P//m/c0cRPVUZZbO7c8Ms+ZoySgUh5SgetPfmkBq1VlUoIGWD27NCTaNKkUF0kIrv2xhuh\nCmnuXDjyyLijiY9KCFloxgz48Y/DJHVKBiIV69IlrAx44YUatFZZKiGkscmTYfBgmDkTTjgh7mhE\nMsugQWFKl1wdtKYSQhZ59FH4xS/g+eeVDESq4o9/1KC1ylIJIQ393//Bf/93SAZHHBF3NCKZ61//\nCoPWpk7NvUFrKiFkgfvuC5N2vfyykoFIdR1yiAatVYZKCGnCPUxON358WPKyTZu4IxLJHnfeGf6u\nXnwxd6aG14ppGcodhg6Fp58O1UQHHRR3RCLZpaQEzjknlLpzZdCaqowyUEkJ3HxzSAQFBUoGIjWh\nTp1Q+n7iCQ1a2x2VEGK0Y0eYf+Xdd+GZZ6Bp07gjEsluuTRoTSWEDFJcDJdfDqtWhZHISgYiNa9L\nlzAxpAatlU8lhBhs2xZ6PXz1VVgbdu+9445IJLf85CewZUsY/Jmtg9ZUQsgAX34ZlruEUJ+pZCBS\n+0aOhJUr4YEH4o4kvaiEUIu2bg3TV7doEfpG77FH3BGJ5K7Vq6FbN5g2Db
p3jzua1FMJIY1t3gxn\nngmHHQZ/+5uSgUjcDj0Uxo6F/v01aK2UEkIt2LAhzFR6/PHwl79A3bpxRyQiAGedFVYe7NdPK62B\nEkKN++ijsP7xmWfCgw+G/tAikj5++1to2BBuuy3uSOKnj6caVFgYJtTq3z9MVpetvRlEMlnpoLXH\nHw+T4OWypBKCmfUxs2VmttzMhpTzfGMzm2Fmi81siZldGR3vYGaLzOyN6OdnZjY4eq6Zmc02s/fM\nbJaZNUnpO4vZqlUhGfz0p/Cb3ygZiKSzffcNXcCvvx6WLo07mvhU2MvIzOoAy4GeQBGwAOjv7ssS\nzhkKNHb3oWa2P/Ae0Nzdt5e5z1qgq7uvNbN7gI3ufm+UZJq5+3cKbZnYy2jZMjjjDLj9dvjZz+KO\nRkSSNXp0mHF4/vxQjZTJaqqXUVdghbuvcfdiYBJwXplzHGgUbTcifNBvL3NOL2Clu6+N9s8DHo22\nHwXOr0zg6eqtt6BHjzBzqZKBSGa55ho4+eTwM8O+h6ZEMgmhFVCYsL82OpZoJNDRzIqAN4Gby7lP\nP2Biwv6B7r4ewN0/Ag5MNuh0tWBBKBk88ABccUXc0YhIVYwcCe+/HzqB5JpU9YbvDSxy9x5mdhjw\nvJkd4+5bAcysHtAX2F07/i7z8fDhw7/ZzsvLIy8vLxUxp9Qrr4T5UUaPhnPPjTsaEamqvfYKg9W6\ndQtdxU89Ne6IklNQUEBBQUG17pFMG0I3YLi794n2bwPc3e9JOGcmcLe7/z3afwEY4u4Lo/2+wPWl\n94iOLQXy3H29mR0EvOTu35l/MBPaEObMCXMTTZgQSggikvmefRYGDYKFCzNzWvqaakNYALQ3s7Zm\ntifQH5hR5pw1hDYCzKw50AFYlfD8AL5dXUR0jyuj7SuA6ZUJPF08/TRcemnooaBkIJI9zjorTIKX\nS4PWkprLyMz6AA8SEshodx9hZtcRSgqjzKwFMBZoEV1yt7tPjK5tQEgY7dz984R77gtMAdpEz1/i\n7pvLee20LSFMmwY33ABPPQVdu8YdjYikWkkJ/PCHcNRR8D//E3c0laMlNGvRuHFw662hWNmpU9zR\niEhN2bgxtCX8/vdw8cVxR5M8JYRaMmoU/O53YdnLbF91SUTg9dehTx+YNy+sy5wJNNtpLXj66TAN\nRUGBkoFIrjjuOLj77tCTcOvWuKOpOSohVNJ554VfCo0zEMk911wD//43TJyY/tPRqMqohn36KbRr\nBx98AI0bxx2NiNS2L7+EU04JXwhvLm/4bRqpSkLQMi2VMHUq9O6tZCCSq/beO3Qx79YtVCNlyqC1\nZKkNoRLGj4fLLos7ChGJ06GHwiOPhGntP/oo7mhSS1VGSVqzJnwjKCqCPfeMOxoRiduwYaFzyZw5\nUK9e3NF8l3oZ1aAJE+BHP1IyEJHgjjtCFdLQoXFHkjpKCElwV3WRiHxb3bqQnw/Tp4eu6GlQkVFt\nalROwptvhq5mJ58cdyQikk722w/mzg1rpm/aBPfem/7dUXdHJYQk5OeH0kEd/WuJSBktWsDLL4cp\n8AcNgh074o6o6tSoXIEdO6BtW5g9Gzp2jC0MEUlzW7fCBRdAkybhS2T9+vHGo0blGjB3LhxwgJKB\niOxew4Ywc2ZoS+jbN1QzZxolhAqoMVlEklW/PkyeDK1ahfVRNm2KO6LKUZXRbnz1FbRsCUuWhP9g\nEZFkuMN//EeYEXn27HhWXFOVUYo9/TR07qxkICKVYxYW1OnXL0xvsXp13BElR91Od6O0d5GISGWZ\nwa9/DU2bwmmnwXPPhZXX0pmqjHZh0yY45JAws2mTJrX+8iKSRfLz4ZZbwnK7J5xQO6+pKqMUmjYt\nDDZRMhCR6rrsMnj44bA+80svxR3Nrikh7IKqi0Qklc49N0yh369fmO4iHanKqBwffBAak4uK4h9c\nIiLZZeHCkBzuuQcuv7zmXkcL5KTIxI
lw0UVKBiKSescfDy++GBbb2rwZBg+OO6KdlBDKkZ8PI0fG\nHYWIZKsjj4R583YOXrvjjvSYFE9tCGW89VbI2tm2NJ6IpJe2bUNSeOIJ+PnPoaQk7oiUEL4jPx8u\nvVQzm4pIzWvePKy69vrrcPXVsH17vPGoUTlBSUkYe/D00/D979fKS4qI8MUXO9stJ02Cvfaq/j01\nDqGa5s2DZs2UDESkdjVoELqi1q8fxip8/nk8cSghJNDYAxGJy557hrXb27eHnj1h48bajyGphGBm\nfcxsmZktN7Mh5Tzf2MxmmNliM1tiZlcmPNfEzKaa2VIze8fMToyOH2tm/zSzRWY238yOT9m7qoJt\n2+Cxx2DAgDijEJFcVrcu/PnP0KNHmP9o3braff0KE4KZ1QFGAr2Bo4ABZnZEmdNuAN5x907A6cB9\nZlbapfVB4Bl3PxI4FlgaHb8XGObunYFhwO+r+2aq45lnQlVRmzZxRiEiuc4MRowIg9a6d4eVK2vv\ntZMZh9AVWOHuawDMbBJwHrAs4RwHGkXbjYCN7r7dzBoD3d39SgB33w5sic4rAUpnCmoK1HIu/Lb8\nfPjxj+OMQERkpyFDQpvmaafBs8/CMcfU/GsmkxBaAYUJ+2sJSSLRSGCGmRUBDYF+0fFDgQ1m9gih\ndLAQuNndvwR+Acwys/sAA06u8ruops2bw0IWf/1rXBGIiHzXtdeGCTbPOAOefBJOOqlmXy9VI5V7\nA4vcvYeZHQY8b2bHRPfvAtzg7gvN7AHgNkIV0c8IyeFJM7sYGAOcUd7Nhw8f/s12Xl4eeXl5KQo7\neOyx0IjTtGlKbysiUm39+kHjxmGd5gkTQnIoT0FBAQUFBdV6rQrHIZhZN2C4u/eJ9m8D3N3vSThn\nJnC3u/892n8BGEIoWfzT3dtFx08Fhrj7uWa22d2bJtzjM3f/zmTTtTEOoUcPuOGG0A9YRCQdvfJK\n+Iz605+S+6yqqXEIC4D2ZtbWzPYE+gMzypyzBugVBdEc6ACscvf1QKGZdYjO6wm8G22vM7MfRNf0\nBJZXJvBUWbcOFi8OfX9FRNLVqafCrFlw000wZkzNvEaFVUbuvsPMbgRmExLIaHdfambXhad9FHAX\nMNbM3oouu9XdP422BwP5ZlYPWAVcFR2/FnjQzOoCX0X7tW7iRLjwwtSMDBQRqUmdOoWpLs48M0yK\nd8stqb1/zk9d0akT3H8/nH56jb2EiEhKFRaGpHDRRfCf/1n+TKmauqKS3nkHNmyAH/wg7khERJLX\npg3MnRu6o954Y+pmSs3phKCZTUUkUx1wQFif+e23YeBAKC6u/j1ztsqopATatQsTSh17bMpvLyJS\nK778Ei65BNzDms177x2Oq8qoEv7+d2jYsHZG/4mI1JS994bHHw8D2Pr0gc8+q/q9cjYhlM5smg7L\n1omIVEe9ejBuHBx9dBhX9cknVbtPTlYZff01tGwZVilq2zaltxYRiY07/Pa3MG0avPeeqoyS8txz\n0LGjkoGIZBczuOsu+OUvq3h9LpYQLrkkzF103XUpva2ISNqoSqNyziWELVtCH97Vq2HffVN2WxGR\ntKJeRkl4/PEwKlnJQETk23IuIWjdZBGR8uVUlVFRERx1VPhZOnhDRCQbqcqoApMmwfnnKxmIiJQn\npxKC1k0WEdm1nEkIS5fCRx9BilffFBHJGjmTEPLzoX9/qFs37khERNJThSumZQP3sDj1tGlxRyIi\nkr5yooTwz3+GJTI7d447EhGR9JUTCUEzm4qIVCzrxyEUF4eZTefPh0MPTWFgIiJpTOMQyjFrFnTo\noGQgIlKRrE8IGnsgIpKcrK4y+vzzMLPp++/D/vunODARkTSmKqMynngCundXMhARSUZWJwTNbCoi\nkrysrTL66CM48khYtw4aNKiBwERE0piqjBJMngx9+yoZiIgkK2sTwvjxqi4SEamMpBKCmfUxs2Vm\ntt
zMhpTzfGMzm2Fmi81siZldmfBcEzObamZLzewdMzsx4bmbouNLzGxESt4RsHw5rF0LPXqk6o4i\nItmvwsntzKwOMBLoCRQBC8xsursvSzjtBuAdd+9rZvsD75nZeHffDjwIPOPuPzKzPYAG0X3zgHOB\n77v79ui6lCid2XSPnJi6T0QkNZIpIXQFVrj7GncvBiYB55U5x4FG0XYjYGP0Id8Y6O7ujwC4+3Z3\n3xKd9zNgRJQ0cPcN1XwvIRBX7yIRkapIJiG0AgoT9tdGxxKNBDqaWRHwJnBzdPxQYIOZPWJmb5jZ\nKDMrXcCyA3Camb1qZi+Z2fFVfxs7vfZaWPPguONScTcRkdyRqkbl3sAid28JdAYeMrOGhCqpLsBD\n7t4F+AK4LbpmD6CZu3cDbgWmpCIQzWwqIlI1ydSyrwMOTthvHR1LdBVwN4C7rzSz1cARhJJFobsv\njM6bBpQ2Sq8FHo+uWWBmJWa2n7tvLBvA8OHDv9nOy8sjbxfrYBYXw5Qp8I9/JPGuRESySEFBAQUF\nBdW6R4UD08ysLvAeoVH5Q2A+MMDdlyac8xDwsbvfaWbNgYXAse7+qZm9DAxy9+VmNgxo4O5DzOw6\noKW7DzOzDsDz7t62nNdPemDas8/C734XFsQREcllVRmYVmEJwd13mNmNwGxCFdNod18afaC7u48C\n7gLGmtlb0WW3uvun0fZgIN/M6gGrCKUJgDHAGDNbAmwDLq9M4OXR2AMRkarLmqkrtm6F1q3DGIQD\nD6yFwERE0lhOT10xfTqccoqSgYhIVWVNQtDYAxGR6smKKqOPPw7LZK5bB/vsU0uBiYiksZytMpo8\nGc45R8lARKQ6siIhaN1kEZHqy/gqo/ffD43J69ZpMjsRkVI5WWWUnw/9+ikZiIhUV0YnBM1sKiKS\nOhmdEBYuDEmha9e4IxERyXwZnRA0s6mISOpkbKPy9u1hqop58+B734shMBGRNJZTjcovvAAHH6xk\nICKSKhmbEDT2QEQktTKyyujf/w7VRcuWQfPmMQUmIpLGcqbKaMYMOPFEJQMRkVTKyISgsQciIqmX\ncVVGGzZA+/awdi00bBhjYCIiaSwnqoymTIGzz1YyEBFJtYxLCFo3WUSkZmRUldGqVdCtW5jZtF69\nmAMTEUljWV9lNGECXHKJkoGISE3ImISgmU1FRGpWxiSEN96Ar78OVUYiIpJ6GZMQ8vPh0ks1s6mI\nSE3JiEbl7dudNm3gxRfhiCPijkhEJP1lbaPySy9By5ZKBiIiNSkjEoLGHoiI1LyMqDJq2tR5911o\n0SLuaESOolSeAAAFzUlEQVREMkONVRmZWR8zW2Zmy81sSDnPNzazGWa22MyWmNmVCc81MbOpZrbU\nzN4xsxPLXHuLmZWY2b67ev0TTlAyEBGpaRUmBDOrA4wEegNHAQPMrGxt/g3AO+7eCTgduM/M9oie\nexB4xt2PBI4FlibcuzVwBrBmdzGkW3VRQUFB3CF8RzrGBOkZl2JKjmJKXrrGVVnJlBC6AivcfY27\nFwOTgPPKnONAo2i7EbDR3bebWWOgu7s/AuDu2919S8J19wO/qiiACy5IIspalI7/+ekYE6RnXIop\nOYopeekaV2UlkxBaAYUJ+2ujY4lGAh3NrAh4E7g5On4osMHMHjGzN8xslJntDWBmfYFCd19SUQCN\nGycRpYiIVEuqehn1Bha5e0ugM/CQmTUE9gC6AA+5exfgC+C2KCncDgxLuIeGnImIxMndd/sAugHP\nJezfBgwpc85M4JSE/ReA44HmwKqE46cCTwFHAx8Bq4DVQDHwL+DAcl7f9dBDDz30qPyjos/3so/S\nht/dWQC0N7O2wIdAf2BAmXPWAL2Av5tZc6ADIRF8amaFZtbB3ZcDPYF33f1t4KDSi81sNdDF3TeV\nffHKdpsSEZGqqTAhuPsOM7sRmE2oYhrt7kvN7LrwtI8C7gLGmtlb
0WW3uvun0fZgIN/M6hFKBFeV\n9zKoykhEJFZpPzBNRERqR9pOXWFmo81sfUKpI3Zm1trMXowG2C0xs8FpEFN9M3vNzBZFMQ2LO6ZS\nZlYn6l02I+5YAMzsX2b2ZvRvNT/ueEpVNHgzhng6RP9Gb0Q/P0uT3/VfmNnbZvaWmeWb2Z5pENPN\n0d9dbJ8H5X1WmlkzM5ttZu+Z2Swza5LMvdI2IQCPEHovpZPtwC/d/SjgJOCGcgbp1Sp33wac7u6d\ngU7AWWbWNc6YEtwMvBt3EAlKgDx37+zu6fJvBLsZvBkHd18e/Rt1AY4D/g08EWdMZtYSuInQ1ngM\nobq7f8wxHQVcQ+hA0wk4x8zaxRBKeZ+VtwFz3P1w4EVgaDI3StuE4O6vAN9pZI6Tu3/k7ouj7a2E\nP9yyYzJqnbt/EW3WJ/yhxF4PGI1CPxv4a9yxJDDS7Hc+icGbcesFrHT3wgrPrHl1gX2iWRAaAEUx\nx3Mk8Jq7b3P3HcBc4MLaDmIXn5XnAY9G248C5ydzr7T648gkZnYI4VvBa/FG8k3VzCJCV97n3X1B\n3DGxcxR67MkpgQPPm9kCMxsUdzCRXQ7eTBP9gIlxB+HuRcB9wAfAOmCzu8+JNyreBrpH1TMNCF+A\n2sQcU6kD3X09hC+ywIHJXKSEUAXRoLtpwM1RSSFW7l4SVRm1Bk40s45xxmNmPwTWR6UpI316kJ0S\nVYOcTajuOzXugNjF4M14QwqinoF9galpEEtTwrfetkBLoKGZXRpnTO6+DLgHeB54BlgE7Igzpt1I\n6ouZEkIlRcXVacA4d58edzyJoqqGl4A+MYdyCtDXzFYRvl2ebmZ/izkm3P3D6OcnhDrxdGhHWEuY\nwmVhtD+NkCDSwVnA69G/V9x6EY1tiqpnHgdOjjkm3P0Rdz/e3fOAzcDymEMqtT4aE4aZHQR8nMxF\n6Z4Q0unbZakxhMF1D8YdCICZ7V/agyCqajgDWBZnTO5+u7sf7O7tCA1/L7r75XHGZGYNopIdZrYP\ncCahyB+rqFhfaGYdokM9SZ+G+AGkQXVR5AOgm5ntZWZG+HeKtfEdwMwOiH4eDFwATIgrFL79WTkD\nuDLavgJI6strMiOVY2FmE4A8YD8z+wAYVtrwFmNMpwCXAUuiOnsHbnf352IMqwXwaDRNeR1gsrs/\nE2M86ao58ISZOeH3Pt/dZ8ccU6lkBm/WqqhOvBdwbdyxALj7fDObRqiWKY5+joo3KgAei9ZyKQau\nj6NDQHmflcAIYKqZXU2YSeKSpO6lgWkiIgLpX2UkIiK1RAlBREQAJQQREYkoIYiICKCEICIiESUE\nEREBlBBERCSihCAiIgD8PyYYv/RexgaHAAAAAElFTkSuQmCC\n",
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
   ],
   "source": [
    "# Compare out-of-bag (OOB) performance across min_samples_leaf settings\n",
    "results = []\n",
    "min_samples_leaf_options = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]\n",
    "\n",
    "for min_samples in min_samples_leaf_options:\n",
    "    model = RandomForestRegressor(n_estimators=1000,\n",
    "                                  oob_score=True,\n",
    "                                  n_jobs=-1,\n",
    "                                  random_state=42,\n",
    "                                  max_features=\"auto\",\n",
    "                                  min_samples_leaf=min_samples)\n",
    "    model.fit(X, y)\n",
    "    print(min_samples, \"min samples\")\n",
    "    # Score the OOB predictions with the C-statistic (ROC AUC)\n",
    "    roc = roc_auc_score(y, model.oob_prediction_)\n",
    "    print(\"C-stat: \", roc)\n",
    "    results.append(roc)\n",
    "    print(\"\")\n",
    "\n",
    "# Line plot of the OOB C-statistic against min_samples_leaf\n",
    "pd.Series(results, min_samples_leaf_options).plot();"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Final model"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"C-stat: 0.874269005848\n"
]
}
   ],
   "source": [
    "# Refit with the setting chosen above: min_samples_leaf=5 gave the highest OOB C-stat in the sweep\n",
    "model = RandomForestRegressor(n_estimators=1000,\n",
    "                              oob_score=True,\n",
    "                              n_jobs=-1,\n",
    "                              random_state=42,\n",
    "                              max_features=\"auto\",\n",
    "                              min_samples_leaf=5)\n",
    "model.fit(X, y)\n",
    "roc = roc_auc_score(y, model.oob_prediction_)\n",
    "print(\"C-stat: \", roc)"
]
  }
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.1"
}
},
"nbformat": 4,
"nbformat_minor": 0
}