{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "

# Machine Learning Using Python (MEAFA Workshop)\n",
"\n",
"## Lesson 7: Ensembles and Stacking\n",
"\n",
"- Twitter Airline Sentiment Data\n",
"- Data Preparation\n",
"- Text Classification Methods\n",
"- Voting Classifier\n",
"- Model Stacking\n",
"- Model Evaluation
\n", "\n", "This notebook relies on the following imports and settings." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# Packages\n", "import nltk\n", "import numpy as np\n", "from scipy import stats\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "import seaborn as sns\n", "import warnings\n", "warnings.filterwarnings('ignore')" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# Plot settings\n", "sns.set_context('notebook')\n", "sns.set_style('ticks')\n", "colours = ['#1F77B4', '#FF7F0E', '#2CA02C', '#DB2728', '#9467BD', '#8C564B', '#E377C2','#7F7F7F', '#BCBD22', '#17BECF']\n", "crayon = ['#4E79A7','#F28E2C','#E15759','#76B7B2','#59A14F', '#EDC949','#AF7AA1','#FF9DA7','#9C755F','#BAB0AB']\n", "sns.set_palette(colours)\n", "%matplotlib inline\n", "plt.rcParams['figure.figsize'] = (9, 6)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "from sklearn.model_selection import train_test_split\n", "from sklearn.model_selection import RandomizedSearchCV, GridSearchCV\n", "from sklearn.metrics import accuracy_score, recall_score, precision_score, roc_auc_score, confusion_matrix, log_loss\n", "\n", "from sklearn.naive_bayes import BernoulliNB\n", "from sklearn.linear_model import LogisticRegression, LogisticRegressionCV\n", "from sklearn.tree import DecisionTreeClassifier\n", "from sklearn.ensemble import RandomForestClassifier\n", "import lightgbm as lgb" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Twitter Airline Sentiment Data\n", "\n", "In this lesson we revisit the Twitter airline sentiment dataset. To save time, we directly load the processed dataset that we constructed in the earlier lesson." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<thead>\n",
"<tr><th></th><th>tweet_id</th><th>airline_sentiment</th><th>airline_sentiment_confidence</th><th>negativereason</th><th>negativereason_confidence</th><th>airline</th><th>airline_sentiment_gold</th><th>name</th><th>negativereason_gold</th><th>retweet_count</th><th>text</th><th>tweet_coord</th><th>tweet_created</th><th>tweet_location</th><th>user_timezone</th><th>tokens</th><th>positive</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>3</th><td>570301031407624196</td><td>negative</td><td>1.0</td><td>Bad Flight</td><td>0.7033</td><td>Virgin America</td><td>NaN</td><td>jnardino</td><td>NaN</td><td>0</td><td>@VirginAmerica it's really aggressive to blast...</td><td>NaN</td><td>2015-02-24 11:15:36 -0800</td><td>NaN</td><td>Pacific Time (US &amp; Canada)</td><td>[@virginamerica, it', realli, aggress, blast, ...</td><td>0</td></tr>\n",
"<tr><th>4</th><td>570300817074462722</td><td>negative</td><td>1.0</td><td>Can't Tell</td><td>1.0000</td><td>Virgin America</td><td>NaN</td><td>jnardino</td><td>NaN</td><td>0</td><td>@VirginAmerica and it's a really big bad thing...</td><td>NaN</td><td>2015-02-24 11:14:45 -0800</td><td>NaN</td><td>Pacific Time (US &amp; Canada)</td><td>[@virginamerica, it', realli, big, bad, thing]</td><td>0</td></tr>\n",
"<tr><th>5</th><td>570300767074181121</td><td>negative</td><td>1.0</td><td>Can't Tell</td><td>0.6842</td><td>Virgin America</td><td>NaN</td><td>jnardino</td><td>NaN</td><td>0</td><td>@VirginAmerica seriously would pay $30 a fligh...</td><td>NaN</td><td>2015-02-24 11:14:33 -0800</td><td>NaN</td><td>Pacific Time (US &amp; Canada)</td><td>[@virginamerica, serious, would, pay, 30, flig...</td><td>0</td></tr>\n",
"<tr><th>9</th><td>570295459631263746</td><td>positive</td><td>1.0</td><td>NaN</td><td>NaN</td><td>Virgin America</td><td>NaN</td><td>YupitsTate</td><td>NaN</td><td>0</td><td>@VirginAmerica it was amazing, and arrived an ...</td><td>NaN</td><td>2015-02-24 10:53:27 -0800</td><td>Los Angeles</td><td>Eastern Time (US &amp; Canada)</td><td>[@virginamerica, amaz, arriv, hour, earli, you...</td><td>1</td></tr>\n",
"<tr><th>11</th><td>570289724453216256</td><td>positive</td><td>1.0</td><td>NaN</td><td>NaN</td><td>Virgin America</td><td>NaN</td><td>HyperCamiLax</td><td>NaN</td><td>0</td><td>@VirginAmerica I &amp;lt;3 pretty graphics. so muc...</td><td>NaN</td><td>2015-02-24 10:30:40 -0800</td><td>NYC</td><td>America/New_York</td><td>[@virginamerica, &lt;3, pretti, graphic, much, be...</td><td>1</td></tr>\n",
"</tbody>\n",
"</table>\n", "
" ], "text/plain": [ " tweet_id airline_sentiment airline_sentiment_confidence \\\n", "3 570301031407624196 negative 1.0 \n", "4 570300817074462722 negative 1.0 \n", "5 570300767074181121 negative 1.0 \n", "9 570295459631263746 positive 1.0 \n", "11 570289724453216256 positive 1.0 \n", "\n", " negativereason negativereason_confidence airline \\\n", "3 Bad Flight 0.7033 Virgin America \n", "4 Can't Tell 1.0000 Virgin America \n", "5 Can't Tell 0.6842 Virgin America \n", "9 NaN NaN Virgin America \n", "11 NaN NaN Virgin America \n", "\n", " airline_sentiment_gold name negativereason_gold retweet_count \\\n", "3 NaN jnardino NaN 0 \n", "4 NaN jnardino NaN 0 \n", "5 NaN jnardino NaN 0 \n", "9 NaN YupitsTate NaN 0 \n", "11 NaN HyperCamiLax NaN 0 \n", "\n", " text tweet_coord \\\n", "3 @VirginAmerica it's really aggressive to blast... NaN \n", "4 @VirginAmerica and it's a really big bad thing... NaN \n", "5 @VirginAmerica seriously would pay $30 a fligh... NaN \n", "9 @VirginAmerica it was amazing, and arrived an ... NaN \n", "11 @VirginAmerica I <3 pretty graphics. so muc... NaN \n", "\n", " tweet_created tweet_location user_timezone \\\n", "3 2015-02-24 11:15:36 -0800 NaN Pacific Time (US & Canada) \n", "4 2015-02-24 11:14:45 -0800 NaN Pacific Time (US & Canada) \n", "5 2015-02-24 11:14:33 -0800 NaN Pacific Time (US & Canada) \n", "9 2015-02-24 10:53:27 -0800 Los Angeles Eastern Time (US & Canada) \n", "11 2015-02-24 10:30:40 -0800 NYC America/New_York \n", "\n", " tokens positive \n", "3 [@virginamerica, it', realli, aggress, blast, ... 0 \n", "4 [@virginamerica, it', realli, big, bad, thing] 0 \n", "5 [@virginamerica, serious, would, pay, 30, flig... 0 \n", "9 [@virginamerica, amaz, arriv, hour, earli, you... 1 \n", "11 [@virginamerica, <3, pretti, graphic, much, be... 
1 " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data = pd.read_pickle('Datasets/processed_tweets.pickle')\n", "data.head()" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<thead>\n",
"<tr><th></th><th>text</th><th>tokens</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>14625</th><td>@AmericanAir Flight 236 was great. Fantastic c...</td><td>[@americanair, flight, 236, great, fantast, ca...</td></tr>\n",
"<tr><th>14626</th><td>@AmericanAir Flight 953 NYC-Buenos Aires has b...</td><td>[@americanair, flight, 953, nyc-bueno, air, de...</td></tr>\n",
"<tr><th>14627</th><td>@AmericanAir Flight Cancelled Flightled, can't...</td><td>[@americanair, flight, cancel, flightl, can't,...</td></tr>\n",
"<tr><th>14628</th><td>Thank you. “@AmericanAir: @jlhalldc Customer R...</td><td>[thank, “, @americanair, @jlhalldc, custom, re...</td></tr>\n",
"<tr><th>14629</th><td>@AmericanAir How do I change my flight if the ...</td><td>[@americanair, chang, flight, phone, system, k...</td></tr>\n",
"<tr><th>14630</th><td>@AmericanAir Thanks! He is.</td><td>[@americanair, thank]</td></tr>\n",
"<tr><th>14631</th><td>@AmericanAir thx for nothing on getting us out...</td><td>[@americanair, thx, noth, get, us, countri, ba...</td></tr>\n",
"<tr><th>14633</th><td>@AmericanAir my flight was Cancelled Flightled...</td><td>[@americanair, flight, cancel, flightl, leav, ...</td></tr>\n",
"<tr><th>14636</th><td>@AmericanAir leaving over 20 minutes Late Flig...</td><td>[@americanair, leav, 20, minut, late, flight, ...</td></tr>\n",
"<tr><th>14638</th><td>@AmericanAir you have my money, you change my ...</td><td>[@americanair, money, chang, flight, don't, an...</td></tr>\n",
"</tbody>\n",
"</table>
\n", "
" ], "text/plain": [ " text \\\n", "14625 @AmericanAir Flight 236 was great. Fantastic c... \n", "14626 @AmericanAir Flight 953 NYC-Buenos Aires has b... \n", "14627 @AmericanAir Flight Cancelled Flightled, can't... \n", "14628 Thank you. “@AmericanAir: @jlhalldc Customer R... \n", "14629 @AmericanAir How do I change my flight if the ... \n", "14630 @AmericanAir Thanks! He is. \n", "14631 @AmericanAir thx for nothing on getting us out... \n", "14633 @AmericanAir my flight was Cancelled Flightled... \n", "14636 @AmericanAir leaving over 20 minutes Late Flig... \n", "14638 @AmericanAir you have my money, you change my ... \n", "\n", " tokens \n", "14625 [@americanair, flight, 236, great, fantast, ca... \n", "14626 [@americanair, flight, 953, nyc-bueno, air, de... \n", "14627 [@americanair, flight, cancel, flightl, can't,... \n", "14628 [thank, “, @americanair, @jlhalldc, custom, re... \n", "14629 [@americanair, chang, flight, phone, system, k... \n", "14630 [@americanair, thank] \n", "14631 [@americanair, thx, noth, get, us, countri, ba... \n", "14633 [@americanair, flight, cancel, flightl, leav, ... \n", "14636 [@americanair, leav, 20, minut, late, flight, ... \n", "14638 [@americanair, money, chang, flight, don't, an... " ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data[['text','tokens']].tail(10)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "# Randomly split the row indexes (70% training, 30% test)\n", "index_train, index_test = train_test_split(np.array(data.index), train_size=0.7, random_state=1)\n", "\n", "# Build the training and test sets\n", "train = data.loc[index_train,:].copy()\n", "test = data.loc[index_test,:].copy()\n", "\n", "y_train = train['positive'].values\n", "y_test = test['positive'].values" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data Preparation\n", "\n", "Compute the frequency distribution of tokens in the training set." 
] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "fdist = nltk.FreqDist()\n", "for words in train['tokens']:\n", "    for word in words:\n", "        fdist[word] += 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Discard tokens that appear fewer than five times in the training set, then retrieve the list of remaining tokens." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1551" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "features = pd.Series(dict(fdist))\n", "features = features.sort_values(ascending=False)\n", "features = features[features>=5]\n", "len(features)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Rank the features by univariate performance (the training log-loss of a Naive Bayes classifier fitted on each feature alone), in case we want to include feature screening." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "def univariate_design_matrix(feature, series):\n", "    X = series.apply(lambda tokens: (feature in tokens))\n", "    X = X.astype(int)\n", "    return X.values.reshape((-1,1)) # convert to a two-dimensional NumPy array, as required by scikit-learn\n", "\n", "def training_error(feature):\n", "    X_train = univariate_design_matrix(feature, train['tokens'])\n", "    nbc = BernoulliNB().fit(X_train, np.ravel(y_train))\n", "    prob = nbc.predict_proba(X_train)\n", "    return log_loss(y_train, prob)\n", "\n", "losses = []\n", "for feature in features.index:\n", "    losses.append(training_error(feature))\n", "\n", "ranked = pd.Series(losses, index=features.index)\n", "ranked = ranked.sort_values()\n", "ranked_features = list(ranked.index)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Build the design matrices for the training and test sets (slow to run)." 
] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "from scipy.sparse import lil_matrix\n", "\n", "def design_matrix(features, series):\n", "    X = lil_matrix((len(series),len(features))) # initialise an all-zero sparse matrix\n", "    for i in range(len(series)):\n", "        tokens = series.iloc[i]\n", "        for j, feature in enumerate(features): # scan the list of features\n", "            if feature in tokens: # if the feature is among the tokens of tweet i, flag it\n", "                X[i, j] = 1.0\n", "    return X\n", "\n", "X_train = design_matrix(ranked_features, train['tokens'])\n", "X_test = design_matrix(ranked_features, test['tokens'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Text Classification Methods\n", "\n", "### Naive Bayes" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "BernoulliNB(alpha=1.0, binarize=0.0, class_prior=None, fit_prior=True)" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nbc = BernoulliNB()\n", "nbc.fit(X_train, y_train)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Regularised Logistic Regression" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "LogisticRegressionCV(Cs=50, class_weight=None, cv=None, dual=False,\n", " fit_intercept=True, intercept_scaling=1.0, max_iter=100,\n", " multi_class='ovr', n_jobs=1, penalty='l1', random_state=None,\n", " refit=True, scoring='neg_log_loss', solver='liblinear',\n", " tol=0.0001, verbose=0)" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "logit_l1 = LogisticRegressionCV(Cs = 50, penalty='l1', solver='liblinear', scoring='neg_log_loss')\n", "logit_l1.fit(X_train, y_train.ravel())" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1093" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], 
"source": [ "np.sum(np.abs(logit_l1.coef_) == 0.0)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "LogisticRegressionCV(Cs=50, class_weight=None, cv=None, dual=False,\n", " fit_intercept=True, intercept_scaling=1.0, max_iter=100,\n", " multi_class='ovr', n_jobs=1, penalty='l2', random_state=None,\n", " refit=True, scoring='neg_log_loss', solver='lbfgs', tol=0.0001,\n", " verbose=0)" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "logit_l2= LogisticRegressionCV(Cs = 50, penalty='l2', scoring='neg_log_loss')\n", "logit_l2.fit(X_train, y_train.ravel())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Random Forest" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Best parameters found by randomised search: {'min_samples_leaf': 5, 'max_features': 350} \n", "\n" ] } ], "source": [ "#%%time\n", "\n", "model = RandomForestClassifier(criterion = 'entropy', n_estimators=100)\n", "\n", "tuning_parameters = {\n", " 'min_samples_leaf': [5, 10, 20, 50],\n", " 'max_features': np.arange(50, X_train.shape[1], 50),\n", "}\n", "\n", "rf_search = RandomizedSearchCV(model, tuning_parameters, cv = 5, n_iter= 16, scoring='neg_log_loss',\n", " return_train_score=False, n_jobs=4)\n", "rf_search.fit(X_train, y_train)\n", "\n", "rf = rf_search.best_estimator_\n", "\n", "print('Best parameters found by randomised search:', rf_search.best_params_, '\\n')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Gradient Boosting" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Best parameters found by randomised search: {'reg_alpha': 0.0011613350732448448, 'n_estimators': 1000, 'max_depth': 4, 'learning_rate': 0.1} \n", "\n" ] } ], "source": [ "#%%time \n", "\n", "from xgboost import 
XGBClassifier\n", "\n", "model = XGBClassifier()\n", "\n", "alphas = [0] + list(np.logspace(-10, 10, 81, base=2))\n", "\n", "tuning_parameters = {\n", " 'learning_rate': [0.001, 0.01, 0.05, 0.1],\n", " 'n_estimators' : [250, 500, 750, 1000, 1500, 2000, 2500, 3000, 5000],\n", " 'max_depth' : [1, 2, 3, 4],\n", " 'reg_alpha': alphas,\n", "}\n", "\n", "\n", "gb_search = RandomizedSearchCV(model, tuning_parameters, n_iter = 32, cv = 5, scoring='neg_log_loss',\n", " return_train_score=False, n_jobs=4, random_state = 10)\n", "gb_search.fit(X_train, y_train)\n", "\n", "gb = gb_search.best_estimator_\n", "\n", "print('Best parameters found by randomised search:', gb_search.best_params_, '\\n')" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAlQAAAF5CAYAAABZQrRcAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAAIABJREFUeJzt3XeYXWW1+PFvApEiIAjGgJQg4KII\nAv6QnoA0KVJykSJcjDQRuILilQioiOJVVLxGBCGCQSkBFFFEA9wovUlTELMQBQQk0quUtN8f7x4z\nDJOZSU7mnDl7vp/nmeec2WeXd+9oWFnve9YaMnv2bCRJkjT/hrZ6AJIkSe3OgEqSJKlBBlSSJEkN\nMqCSJElqkAGVJElSgwyoJEmSGmRAJUkDUES8NSKGt3ockvrGgEpSwyJiSkSM72b7kIh4MCI+Nh/n\nzIjYtg/73RARh83ls69GxKT5uPYPI+LL83rcAnYjsGGLxyCpjxZu9QAk1cJZwGkRcUxmTu+0fRvg\nbcDF83rCzIwFNbj5uPbBrbp2J8u1egCS+s6AStKC8HPge8DOwGWdth8InJuZr0TE4sCpwHbA8sCj\nwDGZeXmVifputW0jYFdgEnBwZk6OiO2BE4H3AG8BfgOMzcxXqutsEBF3AysCk4EjM/O5roOMiCOB\no4FlgGuAwzPzn93sdx7waGaOi4gbgF8B+wOrVtf+LnAGsArwS+A/M3NWRDxKCS4PAZas7uGozHwt\nIpYCTgF2B2YDlwOfzcwXIuKrwHrV/b0VeABYAfh5RBwDnAl8BfgP4F3AM8BJmfnDiFgduKV6tkdV\nt/CTzPxsdS8rV2PdEngB+GZmfrf67CPVc10BuB34ZGY+0PV5SOqdU36SGpaZrwPnAv+e2ouIpSnB\nw5nVpmOB1YENKMHGT4DO04RrA+cDK1EChI7zLAn8FDg5M5cD1gE2A/bqdOzOwEeBdwPLUgKeN4iI\nfYHPAh+mBCWPABf08RY/BmwPrAZsC3yfEhi+F9ixet9hb2BzYE3gA8AXqu0/rMa3TvWzEnB6p+M+\nCOwBvDcztwb+AeyRmacDB1CCzC0pz+4LwPiIWKw6dllKULQyMAb4VERsVH12KfB34J3VNY6PiA9G\nxKaU4O9gYDhwJfCriPAf2tJ8MKCSt
KCcBewYEctWv+8H3JqZU6vfx1OCjZcp/+F/kRLYdJgJXJiZ\nL2fmjE7b/wWsn5lXRMTbKNmtJ7sc+93MvC8zX6AEG3tFRNe/3w4Cvp2Zf87MV4FxwJYR8e4+3Nt5\nmfmPzJwG/Bm4IDMfz8yHgamUTFWHkzLz79W+XwP2jYi3UgKd/87MpzPzGUpwt09ELFIdd0c1tue7\nuf7PKIFcx32/AixGybR1+EZmvpaZNwJ/AdaIiDUo67D+OzNfycz7ga2BP1bP45zMvDkzp2fmt4DF\ngVF9eB6SuvBfIpIWiMz8S0TcBOwLnEaZ7jul0y5LUzIyGwF/BR7kjf+oe7rL+quO886MiN0j4mhg\nFvAHyrRY52Mf7vT+UWDR6nqdrQz8T0R8pdO2mZRg6G+93N4zXY7pPJ04q8tYOk+ZPQqMAN4OLNRl\nnA9X25avfp/Ww/XfQplS3YaSbfpDtb3zdZ/s9H569dk7gecz86WODzLzT/DvqcAtIuKgLtdZuYdx\nSJoLM1SSFqSzgAMiYj3Keqafd/psAnAPMDwzNwJ+0OXY2d2dMCJGAccDW2fmyMzcDei67mlEp/er\nAC9VWaDOHqesmVq644eSvbmxD/fV7djmYoUuY3mkuvZ0YGSnz1alBGdP9eEap1ACt+Uzc32gr99A\nfAx4W0Qs0bEhIvar1qQ9Dny9y/NYH7ioj+eW1IkBlaQF6VJK0HAcZTrp9U6fLUWZqppZZUe+DCwU\nEUN6OedSlMDjlYhYOCLGUtYoDeu0z1ERsXpELAOcDJzTzXnOBf47IlaLiKFVxutmytTZgvT5iHhH\nRKwAfJ6yKH8GZb3WKRGxbES8nRIk/bJz9qiL1yj3Dm98dssB36y2D+vuwA6Z+SDlHr8WEYtERADf\npgR35wKfiIj1q/IWe1IC3nfN/YyS5saAStIC02lx+l6UbFVnR1HWEb0A/I7y7bhXgLV6Oe2vKZmu\n+yhZlb2ra3Q+7leURdV/oUy5fb6b8/yo+rmKMmW3L7DTXNYsNeIe4FbKOqXJzAl+PgU8BPypGuPj\nwMd7OM+5wI8iYhxlXdjawLPAXZR1XA/T+7OD8rxWpixyvwr4Ymb+LjN/S/miwAWUP5MvAXv6LT9p\n/gyZPXteMtmSpLmpyiYcnJmTWz0WSc1lhkqSJKlBBlSSJEkNcspPkiSpQdahalBVVXhFSpuKGb3t\nL0mS6seAqnErAg9OmTKl1eOQJEkLRm/lXN7ENVSSJEkNcg1VgyJiJPDgwksux5CFFmr1cCRJqr0V\nRwxn8i9+2p+XmOcMlVN+C8hi232KoUss2/uOkiSpIY/+8qRWD+FNahNQRcSiwP6UNU3TMrNrn7B5\nOdc1wGGZOXUBDU+SJNVYndZQjQAObvUgJEnS4FObDBWlG/3awAeAKyPiI8CywBcy8/KIOJLSR2wY\n8Hz1/qPATsDiwGrANzJzYscJI+LDwGeAPTLzuSbeiyRJaiN1ylCdTGmeehLwWGZuAxwNfDIihlKC\nq20zc0tKULVRddzbMnMXYFdgXKfzjQGOBHYxmJIkST2pU0DV2R3V6zRg8cycBbwOXBgRZ1PWWQ2r\n9rm7en0EWLTTObYB3g5M7//hSpKkdlangGoWc+7nDbUgImI9YPfM3Bv4r2q/Id3t28kRwJWUjJck\nSdJc1WkN1RPAW4DFuvnsAeDliLgdeA14HFihD+c8CbgtIq7IzOt72vGVq8dbh0qSpCZYccTwVg/h\nTSzs2aCOwp5TpkxhxRVXbPVwJElS42w9I0mS1GxmqBpk6xlJUt01odXLQDN4W89ExAbA54GOidVF\ngF9Taku9Po/nujQzx0TEusAymXldb8fYekaSVFcDsdXLQFOLKb+I2A74JqWI51aZuRUwCngO+FlE\nzFOkmZljqrf/QSkWKkmSNFdtn6Gqevh9EdgZ2D8ifkL5Ft904AxKBfRtI+KgzNynOmZaZo6IiImU\nb
/2NBJYHxmbmnRExDXg/MBZ4PSLuzMzbmntnkiSpXdQhQ7U9cDmwPrAXsDmlp9+uwG3AVdVnc/Nw\nZu4AfA84tGNjZj4GTARONZiSJEk9qUNAtSZwD7AvcEZmTgdeBv6UmS9Ssk8Pdjmm8xTgXdVr10rp\nkiRJfVKHgOpfwDLAs8Bbq23jgDsiYjVgb+BRypQeEbEKpaVMh56+5ti5+rokSVK36hAsXA3sR5my\nOygi/g9YgtKL7xTgEEqG6rmIuBX4Mm/OWM3NHcCREbH1Ah+1JEmqjVrUoYqIrwHLAsdl5tOdtgcl\n0Pp2Zl7ZT9ceiXWoJEk1Zh2qPhxQh4AKICLGULJRb6VM4y1EyUR9MTP7mpGan+uOxNYzkiTVyeAt\n7JmZlwKXtnockiRp8KlNhqpVnPKTJPVkEE6X1UH/ZagWZGuXuZx/FPBcZv6xo/Bmo+ecy3VWBt6X\nmZfP5fMRlGnCw+flvLaekSR1x7Ytg0OfvuW3oFu7zMWBwAoL4Dy9+SCl+Ge3MnPavAZTkiRpcOs1\nQ9XH1i57RsQnKQHaMOCwzLwnIo4B9gFmANdl5rERcSIwLTN/EBFrAj8AjgE+BGwYEfcBi0TEBcDK\nwNOUCuj3AmsB76DUlRoOvATcnJkbRsT/UIK8oZTq5pdExOHAxyj1pG6g1KcaByweETcBzwNfqm51\nceAA4HVgUmZuEhH3AvcDr2XmvvP6cCVJ0uDQlwxVX1q7rE0JTnYEPgUsFRHrVvtvVv2sERG7dHeB\nzLwDmAx8LjP/TqkjdVxmbgG8DXgfcD2wKSXwupdSZ2ob4KqI2BFYNTM3B7YGjo+IpYGPA0dl5qbA\n3yhzol8HLsjMXwLrAPtn5geBXwIf6TK0JYCvGExJkqSe9CWg6ktrlz8D1wK/AE6iZITWBG7JzOmZ\nOZsSEK3T5dxzmyp8JjMfqt5Po2SPLgV2AnYAjge2owR1PwPWBd4fEddQArNhwCqUgOqwiLi2+r3r\n9R4DxldNkreujusq5zJGSZIkoG8BVV9au7wEPJ6Z2wNfBb4GTAU2joiFqzVWoyjTZ69StYEBNux0\nnc5tXrr76uHVwGhgOcpi+PcD62fm76tr/a5a2/VB4GJKRuoQyvTjaGADSqas83V+CHw8M8cC/6D7\nAG9Wz49HkiQNdn35lt/VwKmUab6fRsRHgT9SptuWoQQtTwMXRcTRwEzgpGoN1cXAjZQA5gbgMkpG\n6+LqW313dLrOrcDXI6LbIpyZ+VpEPAI8nJmzIiKBJ6qPLwe2iojrKdN0P8/MFyPiHuD3EfEkJRt1\nK/ACZUrwTuAnwK0R8SzwTxpYFP/K1eMtmyBJepMVRwzvfSe1vT7VoWpla5eBzkrpkiTVTv/UocrM\n46rWLudFRNfWLp/oz9YukiRJA12fC3va2kWSJKl7tp5pkK1nJKm5bOWiJhi8zZE7i4iFKN8EfCtw\nBfAbYNfM7Lb+f0SMBdbMzHFdtv+7HU5v17T1jCQ1h61cNBDVMqCilGVYjvLtv2cz827g7vk4z4HA\nJMq3GiVJkrpV14DqLGANSmA1LSK2otSj2iciDgKOBJ6htJm5qDpmk4i4itLa5gxKSYd/t8OpKrhL\nkiS9SZ+aI7ehw4H7KD0H/y0ilgOOpbTP2Z45hUqh9CbcAdgDOLqbdjiSJEndqmtANTerA/dl5r8y\ncyZwU6fP7qxa5HS0upEkSeqTuk75zc0DwJoRsRjwGvABStsa6L7dTec2NZIkSd0aVAFVZj4VEd+g\nNGp+BliMMtXXXVNk6NQOJzP/3NO5bT0jSc1hKxcNRIOqDlVELAwcm5knV79fB5yQmdc1cM6R2HpG\nkqQ6sQ5VTzJzRkS8tWqM/DolA3V9i4clSZLa3KAKqKD0JQSOa/U4JElSfQyqKb/+YOsZSX1huxSp\nrdR3yi8iNgA+D3SsRlyE0l7mG8Da9Nxa5u3AhzLzgoiYCEzKzMm
dPh9Zbdtkfsdn6xlJPbFdilRv\nbRFQRcR2lIKcR2RmVtuGAYcBP6MEUz21llkP2BW4oL/HKkmSBp8BH1BFxKLAF4Gdgf0j4ieUCujT\nKS1iVgO2jYiDqtYyHwE+A8wEbqgaHh8PvC8iDq1Oe3hE/Dfl/g8CZnS63kOURsmvRsTXgamZObH/\n71SSJLWrdihauT2lyfH6wF6UtjEHUzJOtwFXVZ91TO19GdgmM7cA3lVlt04GfpuZZ1XnvCkzt6FM\nF57SxHuRJEk11A4B1ZrAPcC+wBmZOR14GfhTZr4IjAQerPZdndLc+NcRcQ1lbdW7uzlnR92pm4Do\n4drzvChNkiQNPu0QUP0LWAZ4ljnNjMcBd0TEasDewEvV9geBR4DtMnMr4HuUWlNdW8h8oHrdEri3\ny/VeBZaPiCFUmS9JkqSeDPg1VMDVwKmUab6fRsRHgT8C21ACrUOAdQEy88mIOBW4NiIWAh4CLq72\nWzcijq7OuUlE/JbSv+9A3piJOoXy7cGHKEFcn9h6RlJPbJci1Vtb1KGKiK8BywLHZebTnbYHJQv1\nbWC/zDygBWMbia1nJEmqk3rWocrM4yJiDHBeRLyVkllaiDLF93ngHMrCc0mSpKZriwzVQGaldEk9\nsUK61JbqmaFqB1ZKl9QdK6RLg0NTA6qe2sdk5usL4PyTgAMWxLkkSZL6qmkBVW/tYyJi18xsaP4x\nM/dpfKSSJEnzpikB1by2j6mOmZaZI6rF6MdW+z4EHACsUB23KOXbfydl5mUdbWOAH1TblwWeBL6S\nmbdHRALjMvPnEXEV8HFgD2AMMAx4vno/ETg/M6+IiLWAb2Xmzv34iCRJUhtrVmHPPreP6ca+wHeq\nVjJXAUtRgqZvZ+Z2wJHAEd0c99vM3IzSPHnHiFiVUrRzu4h4GyUYe5wSdG2bmVtSgqqNgAnAx6rz\nHAicPf+3LkmS6q5ZAdW8tI/p0LHC/jPAqIi4FtiMUvX8ceATVabrMEog1FVWr5cD2wEfovTu+wCw\nI3B5Zs4CXgcujIizgRWrc10DrBURw5kTDEqSJHWrWQFVX9rHPAosDxARqwBvr/Y7FDgxM0dTgqw9\ngK8AP87M/wR+R/dfb5wFkJnPVtffG5gM/B04Grg0ItYDds/MvYH/ojyPIdVarvOA7wJXVQGgJElS\nt5q1KL0v7WOeB56LiFuBPzMnY3UbcHVEPA28CPyKklUaHxHTKL37luvl+r8APp6Zz0TElcDhmfnX\niFgceDkibgdeo2S+VqiOmVide72+3KCtZyR1x5Yz0uDQtMKefWkfk5lXNmUwfRAR76JkwbbpZb+R\n2HpGkqQ6GbiFPXtpH/OJzOy6hqplIuI/gBOBg1o8FEmS1AZsPdMgW8+ojmyXImmQa16Gqr+rnvdw\n3Q8BK2fmWf10/vWBXTNznvpF2HpGdWK7FEmaN/MVUDWj6vncZObk/jhvp/PfDdzdn9eQJEn1Ms8B\nVR+rnm8fER8BVqIsRP9NZn4hIiZW+61CyWhNAj4MrAzsVn3z7n+AUZQSBqdm5iURcQ2l4vkywIXA\nGpk5LiJOAHav7uOMzDyzOv7/AUsCf87Mj0fEicCqlGzaKsCnM/PKiNiTUhS0I7W3J/Be4LDM3Cci\nHgamVuc5el6flSRJGhzmpw5VX6qerwfckpk7AFsAn+x0/EOZuT2lNMKqmbkTpZr5hyNix2rb5sDW\nwPERsXR13AWZuS0wE/495bgjsDGl4OfaVQX0Z6sK6psBm1Tf1gN4LTN3BI4CPl1tew+wc2ZuRSkE\nukOXe10J+KjBlCRJ6sn8TPm9qep5RPy76nm1SPufwOYRsTXwAiUb1eHO6vU5SvYHSsHPRYF1gfdX\nGSkoVctXqd53VD7vEMBtmTmTUrjzqGracXhEXAi8BCzBnCrqd1Wvj1TXAngCODciXqru6+Yu13iq\nc4kHSZKk7sxPhqovVc+XBp7
LzP2AbwOLR0THtFpPa6umAr+rMkYfBC4G/lZ9NqubfTeMiKERMSwi\nrgZ2AlbKzH2B44DFmDOd94brVtmsLwP7UDJsr/DmVf1drylJkvQm8xNQXQ3sRynGeVBE/B8lE7QN\ncAql6vkdwE4RcRNlXdVfmFOBvCeXAy9FxPXVOWZXvf7epFo8Phm4EbgBOB+4FXh3RNwC/JQSjM3t\nui9Ux94JXE8JqPoyRkmSpDeYrzpU7Vb1vD9Zh0p1ZB0qSYPcPNehmu/CnlXV80Mo036dq55/cSBV\nPe9vtp6RJKl2mlfYMzMvBS6d3+MlSZLqwtYzDXLKTwuCU2ySNKAM3ObIrRARY4E1M3Ncf1/L1jNq\nhK1eJKm9zc+3/CRJktRJrTNUlU0i4irgHZQSDg8CXwVeBZ4GDqRUfT8sM/cBiIhpmTmiapWzbPWz\nc2Y+24LxS5KkAW4wBFTTKS1lVgF+Q6mSvkVmPhYRRwEnAL/q4fjfZuZ3+n+YkiSpXQ2GKb87M3M2\nMI3ShPmFzHys+uw6YJ1ujum8GK1ryxtJkqQ3GAwBVeevMT4FLBURy1e/jwbup0z/LQ8QEasAb+90\njO1nJElSjwbDlF9nsynFSC+NiFmUfoRjKY2an4uIW4E/U9ZZzZNXrh5v2QTNtxVHDG/1ECRJDbAO\nVYOslC5JUu3Mcx2qwTDlJ0mS1K8MqCRJkhrklF+DbD2jeWWbGUka8Gw901Wj7Wci4sjMPK23/Ww9\no76yzYwk1Y9Tfr07odUDkCRJA1vtM1SVTSNiCrAUcCLwEnAyMBP4K/AJYFVgIqWy+gzgAEpJhbdH\nxOmZeXjTRy1JktrCYMlQvQxsC+wMnAZMAMZk5mjgMUrgtB1wR7XfycAymXky8IzBlCRJ6slgCahu\nyMzZmfkE8AqwEnBxRFwDbE9pSXM2pZL6ZOBISpZKkiSpV4MloNoIICJGUJojPwTslplbUbJRvwN2\nA67PzG2AS4Bjq2PneaW/JEkaXAbLGqrFIuK3wBKU1jMLAVdExFDgBcp6qSWB8yJiBqV/36erY++L\niPMyc/+eLmDrGfWVbWYkqX6sQ9UgW89IklQ7tp6RJElqNgMqSZKkBjnl1yBbz6gvbDcjSW3F1jML\nUl/bzoCtZ9Qz281IUr055dcz285IkqReDboMVUQsBvwYWAF4BBgF7ACMp6T4ngYOpBT3tO2MJEnq\n1WDMUB0KPJiZm1P6+r2T0ormiKrQ56+Bz9l2RpIk9dWgy1ABa1Hay5CZUyPiyWrb6REBMAy4v3XD\nkyRJ7WYwZqjuBTYFiIjVgOWABA6oMlSfA66o9rXtjCRJ6tVgzFCdDUyMiOuAh4FXgU8CP46IjroH\nB1WvfWo7A7aeUc9sNyNJ9Tbo6lBFxGbAEpl5VUSsAUzOzNUaON9IbD0jSVKdWIeqD/4GXBgRX6Ks\nlzqixeORJEltbtBlqBY0K6ULrIQuSTVjhqpVrJQ+uFkJXZIGt8H4Lb8+i4gjWz0GSZI08BlQ9czW\nM5IkqVe1mvKbS1uZ+4EngWWAnYHTgTUoweQJmXlNROxJWZzeMWe6J/AJbD0jSZL6oG4Zqu7aygBc\nkJnbUnr0PZWZo4DdgO9Xn78H2Lkq7JnADraekSRJfVWrDBXdt5WBEiQBrAtsGREbV78vHBHLAk8A\n50bES8CawM1NHLMkSWpzdctQdddWBmBW9ToVuLDKRO0IXALMAL4M7AMcDLzCnKk/W89IkqRe1S1D\n1V1bmc7OBCZExLXAUpT1VC8ANwJ3Ai8Dz1LWYIGtZ9RHtpaRpMGtVoU9F3RbmT5ecyS2npEkqU4G\nfWFP28pIkqSmq1WGqhVsPTN42F5GkgaNQZ+h6lZErAy8LzMv769r2Hqm/mwvI0mam7p9y29uPghs\n3upBSJKkemqbDFVVBf1HwCqU9VE/A96WmeMiYlFgamaOjIjDgY9RSiXcAIyrfhaPiJsoFdS/B
8yk\nfAvwEEpgeVH12UhgEvBeYAPgisw8rln3KUmS2k87ZagOAx7KzE2BsZR6Ud35OHBUtd/fKPOgX6dU\nS/8lMAE4MjNHU8omnFod927gIGAX4CvAZ4CNq22SJElz1U4BVVBVMM/Me4HnOn3WefHYx4HDqlpT\nq/DmhWUrZObd1fvrgHWq93/LzOer8/4zM5/JzFcBV+1LkqQetVNA9WdgI4CIeDdwDrB89dmGnfY7\nBDisykBtAGxGmf7ruNd/RMR61fvRlObJYOAkSZLmU9usoaJUOT+nyjwtBHwA+HZE3ADcQal4DnAP\n8Puqj99jwK3VZ8dHxJ2UgOu0iBhCaTvjlJ4kSWqIdagaZB2qwcM6VJI0aFiHqlWuvOwiW89IkjRI\ntdMaKkmSpAHJKb8GOeVXP07tSdKg55TfghQRE4FJmTm5t31tPVMftpiRJM0rp/wkSZIa1JYZqm7a\n0BwDHAEsDSwHTMjMMyLiGuBuShuZpYCPZObDEXECsDvl/s/IzDMj4r+Aj1LqUU3KzPFNvi1JktSm\n2jVD1bUNzfspQdD2lNYxn+m0722ZuS1wNbBvRGwA7EhpK7MZsHZErAPsDWxR/eweEdGsm5EkSe2t\nLTNUlDY0v4HShiYingG+HhFjKEU8h3Xa967q9RFgRHXsbZk5E/gXcFRE7EXJdk2p9l0GWL3f70KS\nJNVCu2aourahOQ24OTP3By7hjavzu36NcSqwYUQMjYhhEXE1kMCfgK0zcytgIqXiuiRJUq/aNUPV\ntQ3NLyiZpv2Ap4EZEbFIdwdm5t0RMRm4kRJQnpGZf4iIKcAN1XG3UdrW9NkrV4+3bEJNrDhieKuH\nIElqM9ahalBHHaopU6ZYKV2SpHqY5zpU7TrlJ0mSNGAYUEmSJDXIKb8G2XqmPmw5I0mq2Hqmq4gY\nC6yZmeP68zq2nml/tpyRJM0vp/wkSZIaVPsMVYeIOAbYB5gBXJeZx0bE7cCemflQRHyEUiX9i8DZ\nQEe66VOZaU0qSZI0V4MlQ7UGsBel1cxmwBoRsQslcDqg2mcsMAE4DpiSmVsDhwJnNH20kiSprQyW\ngGp94JbMnJ6Zs4HrgXWA84E9I2IFYKnMvBdYFziwaqw8gdKGRpIkaa4GS0B1N7BxRCwcEUOAUcD9\nmfkCcAfwHeBH1b5Tge9ULWj2ogRdkiRJczVY1lD9hdJqpqPdzA3AZdVnE4DJwIHV7ycDZ0fEocBS\nwIl9uYCtZ9qfLWckSfPLOlQNsvWMJEm1Y+sZSZKkZjOgkiRJapBTfg2y9Ux7s92MJKkb9W8906xW\nMvPK1jPtyXYzkqQFwSk/SZKkBrVdhqoyMiJuycxNACLiFkpbmZeAC4BFgAQ+mJmrV1XRTwKeB54F\n/gh8BTgTWInSZuY3mfmFiFgdmAhMBx4GRlY1qSRJkrpVtwzV8cBlmTkauARYOCIWAsYDO1btZF6p\n9l2JUj19B0oPv09W278JfK3a98amjl6SJLWlugRUHYvH1gJuqt5fX72+A3ghM//ZZfszwEYRcT6l\nUvoiPZxDkiRprto1oHoOGB4RC0XE0sCq1fZ7gU2r95tUr08AS0bEO7psHws8l5n7Ad8GFq/a0nR3\nDkmSpLlqu7IJHd/yozQt3gh4AFgR+ChlDdVPgEWBfwAfyMw1ImJH5qyhGgpMobSemQS8CLxMmQLc\npjr2HGBGtf+SmbldD+MZiWUT2pZlEyRJ3ah/2YTMnDi3zyJiJ+CLmfn7iNgWWL76aH1gi8x8LSLO\nAx7JzD8B63Zzjv2AgzLzgYg4GNisL+O68rKLbD0jSdIg1XYBVS8eBM6JiBnAQsCnqu0vArdExL+A\nh4CLejjHI8Ckat+ZwEH9N1xJklQHbTflN9A45dfenPKTJHWj/lN+A5WV0tuTldIlSQtCu37Lr0cR\nsWhEPDSXz7aKiEnNHZEkSaqzWgZUkiRJzVSbKb+IWAI4n
1JO4YFq27qUKulDgKeBA7sccyQwBhhG\nKZEwhtJ25vzMvCIi1gK+lZk7N+k2JElSG6pThmoscG9mjqL06AOYABxR9eL7NfC5jp0jYiilh9+2\nmbklJajaqDrmY9VuBwJnN2PwkiSpfdUpoFoHuA0gM2+lNDdeCzg9Iq6hBEcrdOycmbOA14ELI+Js\nSnHQYcA1wFoRMRzYHri8ebfKO8T0AAAXXklEQVQgSZLaUZ0CqqlULWMiYgNKcJTAAVWG6nPAFR07\nR8R6wO6ZuTfwX5RnMSQzZwPnAd8FrsrM6c28CUmS1H5qs4YK+D7wo4i4gRJcvQZ8EvhxRHQUiDqI\nOVmqB4CXI+L2at/HO302kVLgc72+XvyVq8dbh6oNrThieKuHIEmqAQt7diMi3gX8ODO36cO+I4EH\np0yZYusZSZLqYZ4Le9Zpym+BiIj/ACYDn2/1WCRJUnswQ9UgW88MbLaWkSTNB1vPLEgRsSbwg2pR\ne49sPTMw2VpGktQMTvlJkiQ1qHYZqohYDPgx5Rt7jwCjgJ2B7wEzgVeBQzLz7xFxDLAPMAO4LjOP\njYjlKRXXhwDTWnALkiSpzdQxQ3Uo8GBmbg6cCLyTUv38yMwcDZwOnFq1pdkL2Kz6WSMidgGOAS7M\nzK2By1owfkmS1GbqGFCtBdwEkJlTgSeBFTLz7urz6yhV1dcEbsnM6VUxz+ur7f+uuA7c2MyBS5Kk\n9lTHgOpe5lRMXw1YDvhHVRkdYDRwP6X458YRsXBEDKFMDXZs37Tad6NmDlySJLWn2q2hojQznhgR\n1wEPU62ZAk6rAqcZwEGZ+beIuJiShRoK3ECZ4rsauCgi9gEebMUNSJKk9lK7OlQRsRmwRGZeFRFr\nAJMzc7V+vN5IrEM1YFmHSpI0H6xDBfwNuDAivkRpkHxEMy565WUX2XpGkqRBqnYBVWZOA7Zu9Tgk\nSdLgUbspv2Zzym9gcqpPktQAp/zmV0R8CFg5M8+an+NtPTOw2HJGktRMBlSVzJzc6jFIkqT21PYB\nVUS8B5gITKeURDgAOJJSV2oocGpmXhIR11CKfC4DvAj8b2ZeGxEbAScAPwfWzMxxEXECsDvl+ZyR\nmWc2964kSVI7qUNhz+2AO4BtgZOBMcCqVeuZrYHjI2Lpat8LMnNb4CzgY9W2sZTWNABExAbAjsDG\nlJY0a1f1qyRJkrrV9hkqSiHPY4HJwPPA3cD7q4wUlNIJq1Tvs3q9EvhmRLwd2BL4FPCf1WcB3JaZ\nM4F/AUf19w1IkqT2VocM1W7A9Zm5DXAJ8HHgd5m5FfBB4GJKbSqAWQCZOava9wzgsip46jAV2DAi\nhkbEsIi4OiIWac6tSJKkdlSHDNXtwHkRMYMSMO0J7BcR1wNLAD/PzBcjoutx51ACrTU6b8zMuyNi\nMnNa0pyRma/1NohXrh5v2YQBZMURw1s9BEnSIGIdqgZ11KGaMmWKldIlSaqHeV47XYcpP0mSpJYy\noJIkSWqQU34NsvVM69lmRpK0gNl6prOIWBTYn1Lw85nM/GV/XcvWM61jmxlJUqvVOqACRgAHZ+Ym\nrR6IJEmqr7oHVMdTKp3PAg6n1Jj6PPAasBLwA0qtqvcB383MMyJiNKXi+kzgr8AnMnN6KwYvSZLa\nQ90XpZ8M3Ad0nhNaEfgP4JOUHn7/SWk184mqxcwEYExmjgYeo7SmkSRJmqu6B1TdubfKOD0H/DUz\nXweeBRYF3gEsD1xcta7ZHli5VQOVJEntoe5TfrN4c9DY09canwIeBXbLzOcjYlfgpf4anCRJqoe6\nB1RPAG8BFuvLzpk5KyKOAq6IiKHAC8ABfTnW1jOtY5sZSVKrWYeqQbaekSSpdmw9I0mS1GwGVJIk\nSQ1yyq9Btp5pPVvPSJIWMFvPtIqtZ1rH1jOSpFZzyk+SJKlBAzJDFRFLAT8ElgaWAy4BNs/MXSJi\nX2BcZr4vIraglDX4b
+BsoCNF9KnMvCciJgKrUYp2fiszL4qI7YCvAq8CTwMHAuvTS0uaJty2JElq\nUwM1Q7U6MCkztwd2oQQ9q0TEosCHgNkR8U5gV+BS4DhgSmZuDRwKnBERSwJbA2MorWUWqlrLnMWc\n1jLXUtrPQA8taZpwv5IkqY0N1IBqGrB7RJxHCW6GAVcCW1EySOcD2wKjgCnAusCBVbuYCcAymfki\ncCQlgLoIWISS7XohMx+rrnMdsE71vqeWNJIkSXM1UAOqzwI3Z+b+lOm+IcDPgXHAHynB1ZHAX6og\naCrwnczcCtgLOD8ilgfen5l7ADsDp1CCpaWqzwBGA/dX7/26oyRJmi8Dcg0VcDll2m4/yjqnGcDt\nQACnZOYfI2IVSpAEcDJwdkQcCiwFnEjJco2IiLso/fi+lZnTI+IQ4NKImEXJQI0F3tvogG090zq2\nnpEktZp1qBpk6xlJkmrH1jOSJEnNZoaqQVZKbw2ro0uS+pGV0lvFSunNZXV0SdJA4pRfDyJij4hY\nodXjkCRJA5sBVc+OonxrUJIkaa5qM+UXEXdSqqg/Sym1MDoz76q2XwTsSSm/cF1mHhsRJwKbAUsA\nBwHfAN4GLAZ8DngrpSXNjyNii6rQpyRJ0pvUKUN1GbADsAXwILBdRKxdvR9DCZ42A9aIiF2qY/6c\nmZtRnsMI4MPAR4HFM/MK4G7gAIMpSZLUkzoFVJcCO1GyVMdTWtPsCkwCbsnM6Zk5G7ieOe1mEiAz\n/wR8H7gQOJ16PRdJktTPahM4ZOa9wKrAB4BfU6bydqO0pdk4IhaumiOPYk67mVkAEbEusGRm7gx8\nDPhep89r84wkSVL/qM0aqsq1wKqZOSsirgXWzsx7IuJi4EZKcHQDZXrwfZ2O+wvwpYg4AHgd+GK1\n/SbKGqrtM/OZni5s65nmst2MJGkgsbBng2w9I0lS7dh6RpIkqdnMUDXI1jPNZcsZSVIT2HqmVWw9\n0xy2nJEkDURO+UmSJDWoNhmqiFgK+CGwNLAcMAHYG/gD8F7gJUoNqh2qfbYHZnY9JjPPiIhfUKqm\nA2wObJuZ1zbvbiRJUjupU4ZqdWBSZm4P7AJ8ptp+W2ZuAywC/CsztwPuA0bP7ZjM3C0ztwJuBk4x\nmJIkST2pTYYKmAYcHRFjgBeAYdX2O6vX5yiBFJR+f4v2cAwR8VlgeGYe1ISxS5KkNlanDNVngZsz\nc3/gEuas0O/pa4zdHhMRB1F6Ah7af8OVJEl1UacM1eXAGRGxH/A0MIMyzTdPx0TEKsCZlMrq/1e1\nqzkrMy/ov6FLkqR2Zh2qBlmHqrmsQyVJagLrULXKlZddZOsZSZIGqTqtoZIkSWoJp/wa5JRf/3Oa\nT5LUZE75za+IeAhYMzNf7bTtQ8A+mTm2t+NtPdN/bDcjSRronPKTJElqUNtnqCLiTuBDlGKdTwOj\nM/OuavtFwJ6UEgrXZeaxEXEiMC0zfxARawI/qKqid5xvLeAc4OXq59lm3o8kSWo/dchQXUbpz7cF\n8CCwXUSsXb0fA2xW/awREbv04XxfAb6YmdsCN/XPkCVJUp3UIaC6FNiJkqU6HtgW2BWYBNySmdMz\nczalMfI6XY7tbtHZOsBt1fsb+2XEkiSpVto+oMrMe4FVgQ8AvwaWAHYDpgIbR8TCVbXzUcD9wKvA\n8tXhG3ZzyqnAptX7jfpx6JIkqSbaPqCqXAs8mZmzqvdPZOY9wMWULNNtwEOU6cGLgJ0i4nfABt2c\n63DguIiYAmzchLFLkqQ2Zx2qBlmHqv9Zh0qS1GTWoWoVW89IkjR41WXKT5IkqWWc8muQU379w2k+\nSVILOeXXqIhYH9g1M0+KiD2AWzPzH70dZ+uZBct2M5KkdmJA1UVm3g3cXf16FHAY0GtAJUmSBq9B\nH1BFxHuAicB0Souac4APAz8B1gd+HBFbZObrLRukJEka0FyUDtsBd1AqrJ8MLAOQmVd
QMlUHGExJ\nkqSeGFDB2cBTwGTgSEqWSpIkqc8MqEqbmuszcxvgEuDYTp/NwmckSZJ6MejXUAG3A+dFxAxKAPU9\nSl9AgJsoa6i2z8xnejrJK1ePt2zCArTiiOGtHoIkSX1mHaoGddShmjJlipXSJUmqh3muQ+V0liRJ\nUoMMqCRJkhrklF+DbD3TP2w9I0lqIVvPdBURY4E1M3Pcgthvbmw9s2DZekaS1E6c8pMkSWpQ7TNU\nlU0i4irgHcAZQFKqos8E/gp8omPHagrvEuBxYEXgN5l5fLMHLEmS2sdgyVBNB3YA9gA+DUwAxmTm\naOAxYGyX/UdW2zYCPhgRGzZroJIkqf0MloDqzsycDUwDVgGWBy6OiGuA7YGVu+z/h8x8JjNnArcC\n0czBSpKk9jJYpvw6f5XxKeBlYLfMfD4idgVe4o1B1VoRsTjwGrAx8KOmjVSSJLWdwRJQdTYLOAq4\nIiKGAi8AB/DGgOp1yjqqdwI/zcw/9HZSW88sWLaekSS1E+tQdVEtSp+UmZvMw/62npEkqT5sPSNJ\nktRsZqgaZKX0xlkVXZI0wFgpvVWslD7/rIouSWp3tZzyi4ixEfH1eTzmsIg4sZ+GJEmSaqyWAZUk\nSVIz1XrKLyLeAVwGnAOskZnjImJRYGpmjoyILYDvAs9Q2tDcUh13DLAPMAO4LjOPbckNSJKktlDn\nDNU7gV8Cn6EES935DrBvZm4HPAgQEesCewGbVT9rRMQu/T9cSZLUruocUH0IWIQ332Pnlfvvysz7\nq/c3Vq9rArdk5vSqXc31wDr9OlJJktTW6hxQnQvsD/yQ0hx5+Wp750bH0yJirer9RtXrVGDjiFg4\nIoYAo4D7kSRJmotar6HKzPsi4jxgNDAyIm4A7qC0m4EScJ0bES8CLwLPZuY9EXExJWM1FLiBsg6r\nR7aemX+2mZEktTsLezbI1jOSJNWOrWckSZKazQxVg2w90zhbz0iSBpjB0Xqmcy2pfjr/7MwcEhGj\ngOcy84+9HWPrmfln6xlJUrtzyq97d1WvBwIrtHIgkiRp4GubDFVELAGcDywDPFBt2wD4HqVw56vA\nIZn594j4L+CjwGxgUmaOj4gxwLGUEgoPAQcAXwRWBYYDqwCfzswrgR0j4v2UWlYbRsR9mfn3pt2s\nJElqK+2UoRoL3JuZo4Azq20TgCMzczRwOnBqRKwN7A1sUf3sHhEB7At8JzO3AK4ClqrO8Vpm7ggc\nBXwaIDP/mZl3AJOBzxlMSZKknrRTQLUOcBtAZt5KyTStkJl3V59fV+3zXkq2aQrwW2BZYHVKC5pR\nEXEtpaXMrOq4jum9R4BF+/82JElS3bRTQDUV2BT+PdU3DPhHRKxXfT6aUtE8gT8BW2fmVsBE4B7g\nUODEKps1BNijOq6nrznOor2ekSRJaoF2Cha+D7yrqnZ+BPAacAhwWkRcTzVll5l/oGSnboiI24E1\ngMco2a2rI+K3wAjgV3245q3A1zu1p5EkSXoT61A1yDpUjbMOlSRpgBkcdagGoisvu8jWM5IkDVLt\nNOUnSZI0IDnl1yCn/BrnlJ8kaYBxym9BioixwJqZOa63fW09M/9sPSNJandO+UmSJDWoVhmqiBgG\n/IBSKmEocArwdUrl9JnAJEr19FHAl6rD7gIOA7YETq72+yvwiWaOXZIkta+6ZagOBp6q2tPsBvwP\npWXNBOBHlP59/wJOA3bOzI2AR4GVqn3GVIU/H6uOkyRJ6lXdAqp1gZ0i4hrgZ5QM3F+B54B/Vm1q\nlgOezcwnADLzJOAVYHng4urY7YGVmz56SZLUluoWUE0FLqxazuwIXAJsA7wEzIiIPYEngKUj4u0A\nETEeGEnJVO1WHXsy8LtmD16SJLWnWq2hAs4EJlQNkJcCLgO+TFkfNRS4Hvg9cDhwRUTMpKyh+j2l\ndc0VETEUeIEyPWiWSpIk9co6VA2yDlXjrEMlSRp
grEPVKraekSRp8KrbGipJkqSmc8qvQU75zT+n\n+iRJA5RTfvMrIhYFpmbmyPk53tYz886WM5KkunDKT5IkqUG1zlBVzY0PpASOkZnvqLZPorSouR04\nH1gGeKDTcesC4ykpv6eBAzPz+aYOXpIktY3BkKF6NjO3oPTo62oscG/VqubMTtsnAEdURT5/DXyu\nvwcpSZLaV60zVJXsZlvHYrN1gMkAmXlrREyvtq8FnB4RAMOA+/t7kJIkqX0NhoBqVvU6LCKWAF6n\nBFJQWtVsCvwiIjagBE9QgrADMvPvEbE5pc+fJElStwZDQNXhf4FbgL8BD1fbvg/8KCJuoARXr1Xb\nPwn8OCI66iAc1NvJX7l6vGUT5tGKI4a3egiSJC0Q1qFqUEcdqilTplgpXZKkepjnOlSDYVG6JElS\nvzKgkiRJapBTfg2y9cy8sd2MJKkN2HpmQauKgB6Qma/3tJ+tZ/rGdjOSpDoyoOpFZu7T6jFIkqSB\nrW0CqohYCvghsDSwHKWa+d6UcgdrUtJze1fvj6fUnxoBnJWZ34+Ia4AnKW1mdgZOB9agrCM7ITOv\niYhdgC9Vl7wLOIxSZmHNzHy1CbcpSZLaUDstSl8dmJSZ2wO7AJ+ptt9UtYi5CDiu2vYuYFdgE+DT\nEdFR8OiCzNyW0t/vqarlzG7A9yNiYeA0YOfM3Ah4FLAOgiRJ6lXbZKiAacDRETEGeIE5Vc1/W73e\nRAmOoARZrwFExL3AatX2jjY06wJbRsTG1e8LA++k9P17AiAzT6qO75+7kSRJtdFOGarPAjdn5v7A\nJcxZgf/+6nVz4E/V+/UjYqGIWJzSZuYv1faONjRTgQurzNaO1fkeB5aOiLcDRMT4iPhAP96PJEmq\niXbKUF0OnBER+wFPAzOARYCxEfEZ4GXgPynZp2HAb4Blga9m5lNdMk1nAhMi4lpgKeD0zJwVEYcD\nV0TETMoaqt/3dXC2nukb281IkuqoretQVQvND8vMqZ22bVVta8q382w9I0lS7ViHqgUWApg2bVqr\nxyFJkhaAbbbZZiTwaGbO6OsxbZ2hGggiYgvg+laPQ5IkLVCrZuZDfd3ZDFXjOtZZrQ7MbOVABrEH\ngVVbPYhBymffWj7/1vL5t04znv2j87KzGaoFICJmZ+Y8z7dqwfD5t47PvrV8/q3l82+dgfjs26ls\ngiRJ0oBkQCVJktQgAypJkqQGGVAtGF9u9QAGOZ9/6/jsW8vn31o+/9YZcM/eRemSJEkNMkMlSZLU\nIAMqSZKkBhlQSZIkNciASpIkqUEGVJIkSQ0yoJIkSWqQzZEbEBFDgdOB9wGvAQdn5gOtHVV9RcQw\n4BxgJLAI8FXgPmAiMBu4FzgiM2e1aIiDQkQMB+4AtgNm4PNvmoj4PLAr8BbK3z3X4vNviurvn3Mp\nf//MBA7B//33u4jYGPhGZm4VEavTzfOOiC8BO1P+PI7OzNtaMVYzVI3ZHVg0MzcFxgHfbvF46m5/\n4OnM3BLYETgNOBU4odo2BNitheOrveo/KmcCr1SbfP5NEhFbAZsBmwOjgZXw+TfTTsDCmbkZcBJw\nMj7/fhURnwN+CCxabXrT846IDSn/f9gY2Af4fivGCgZUjdoCmAyQmbcA/6+1w6m9S4AvdPp9BvB+\nyr/SAX4DbNvsQQ0y3wJ+APyj+t3n3zw7APcAPwcuB36Fz7+Z7gcWrmYmlgKm4/Pvb38FxnT6vbvn\nvQVwVWbOzsy/U/6M3tHcYRYGVI1ZCni+0+8zI8Jp1H6SmS9l5osRsSTwU+AEYEhmdpT7fxF4W8sG\nWHMRMRZ4MjOv7LTZ5988y1H+0fYR4DDgfGCoz79pXqJM900FJgDj8X///Sozf0YJXDt097y7/ne4\nZX8OBlSNeQFYstPvQzNzRqsGMxhExErA74CfZOYFQOf1CksCz7VkYIPDgcB2EXENsD7wY2B4p899\n/v3raeDKzHw
9MxN4lTf+h8Pn378+TXn+76Gsmz2Xspatg8+//3X3933X/w637M/BgKoxN1Lm1YmI\nTSjpePWTiHgncBVwbGaeU22+q1pbAmVd1fWtGNtgkJmjMnN0Zm4F3A0cAPzG5980NwAfioghEbEC\n8FZgis+/aZ5lTibkGWAY/v3TbN097xuBHSJiaESsTElsPNWKwTk91ZifU/7FfhNlgdzHWzyeujsO\nWAb4QkR0rKU6ChgfEW8B/kyZClTzHANM8Pn3v8z8VUSMAm6j/GP4COBBfP7N8h3gnIi4npKZOg64\nHZ9/M73p75vMnFn9mdzMnP9ftMSQ2bNn976XJEmS5sopP0mSpAYZUEmSJDXIgEqSJKlBBlSSJEkN\nMqCSJElqkAGVJElSgwyoJEmSGvT/AWuhe+oQcvNTAAAAAElFTkSuQmCC\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from statlearning import plot_feature_importance\n", "plot_feature_importance(gb, ranked_features, max_features=30)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### AdaBoost" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Best parameters found by randomised search: {'n_estimators': 500, 'learning_rate': 0.001, 'base_estimator__max_depth': 2} \n", "\n" ] } ], "source": [ "#%%time\n", "\n", "from sklearn.ensemble import AdaBoostClassifier\n", "\n", "learner = DecisionTreeClassifier(criterion='gini')\n", "\n", "model = AdaBoostClassifier(base_estimator = learner)\n", "\n", "tuning_parameters = {\n", " 'base_estimator__max_depth' : [1,2,3,4],\n", " 'learning_rate' : [0.001, 0.01, 0.02, 0.05, 0.1],\n", " 'n_estimators' : [100, 250, 500, 750, 1000, 1500, 2000, 3000],\n", "}\n", "\n", "adaboost_search = RandomizedSearchCV(model, tuning_parameters, n_iter = 4, cv = 5, scoring='neg_log_loss',\n", " return_train_score=False, n_jobs=4, random_state = 1)\n", "adaboost_search.fit(X_train, y_train)\n", "\n", "adaboost = adaboost_search.best_estimator_\n", "\n", "print('Best parameters found by randomised search:', adaboost_search.best_params_, '\\n')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Linear Support Vector Classifier" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": 
"stdout", "output_type": "stream", "text": [ "Best parameters found by grid search: {'C': 0.25} \n", "\n" ] } ], "source": [ "#%%time\n", "\n", "from sklearn.svm import LinearSVC\n", "\n", "Cs = np.logspace(-10, 10, 81, base=2)\n", "\n", "model = LinearSVC(loss='hinge')\n", "\n", "tuning_parameters ={\n", " 'C': Cs,\n", "}\n", "\n", "svm_search = GridSearchCV(model, tuning_parameters, cv=5, return_train_score=False, n_jobs=4)\n", "svm_search.fit(X_train, y_train)\n", "\n", "svm = svm_search.best_estimator_\n", "\n", "print('Best parameters found by grid search:', svm_search.best_params_, '\\n')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Voting Classifier" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "VotingClassifier(estimators=[('clf1', BernoulliNB(alpha=1.0, binarize=0.0, class_prior=None, fit_prior=True)), ('clf2', LogisticRegressionCV(Cs=50, class_weight=None, cv=None, dual=False,\n", " fit_intercept=True, intercept_scaling=1.0, max_iter=100,\n", " multi_class='ovr', n_jobs=1, penalty='l1', random_...e', max_iter=1000, multi_class='ovr',\n", " penalty='l2', random_state=None, tol=0.0001, verbose=0))],\n", " flatten_transform=None, n_jobs=1, voting='hard', weights=None)" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#%%time\n", "from sklearn.ensemble import VotingClassifier\n", "\n", "clfs = [('clf1', nbc), ('clf2', logit_l1), ('clf3', logit_l2), ('clf4', gb), ('clf5', svm) ]\n", "\n", "vhard = VotingClassifier(clfs)\n", "vhard.fit(X_train, y_train)" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "VotingClassifier(estimators=[('clf1', BernoulliNB(alpha=1.0, binarize=0.0, class_prior=None, fit_prior=True)), ('clf2', LogisticRegressionCV(Cs=50, class_weight=None, cv=None, dual=False,\n", " fit_intercept=True, intercept_scaling=1.0, max_iter=100,\n", " multi_class='ovr', 
n_jobs=1, penalty='l1', random_...0011613350732448448, reg_lambda=1, scale_pos_weight=1,\n", " seed=None, silent=True, subsample=1))],\n", " flatten_transform=None, n_jobs=1, voting='soft', weights=None)" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#%%time\n", "clfs = [('clf1', nbc), ('clf2', logit_l1), ('clf3', logit_l2), ('clf4', gb)]\n", "# Soft voting averages predicted probabilities, so we exclude the linear SVC, which has no predict_proba\n", "\n", "vsoft = VotingClassifier(clfs, voting='soft')\n", "vsoft.fit(X_train, y_train)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Model Stacking\n" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Wall time: 25min 5s\n" ] } ], "source": [ "%%time\n", "from mlxtend.classifier import StackingCVClassifier\n", "\n", "stack = StackingCVClassifier([nbc, logit_l1, logit_l2, gb], use_probas=True, meta_classifier = LogisticRegression(C=1e4), cv=5)\n", "stack.fit(X_train.todense(), y_train)\n", "\n", "# The stacking class is not compatible with sparse matrices, so we pass a dense copy, which slows training considerably\n", "# Remove XGBoost, or replace it with LightGBM, for faster results" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Model Evaluation\n" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>Error rate</th><th>Sensitivity</th><th>Specificity</th><th>AUC</th><th>Precision</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>Naive Bayes</th><td>0.055</td><td>0.851</td><td>0.965</td><td>0.979</td><td>0.837</td></tr>\n",
"    <tr><th>Logistic L1</th><td>0.054</td><td>0.808</td><td>0.975</td><td>0.978</td><td>0.872</td></tr>\n",
"    <tr><th>Logistic L2</th><td>0.052</td><td>0.793</td><td>0.980</td><td>0.981</td><td>0.893</td></tr>\n",
"    <tr><th>Random Forest</th><td>0.080</td><td>0.688</td><td>0.969</td><td>0.942</td><td>0.822</td></tr>\n",
"    <tr><th>Gradient Boosting</th><td>0.050</td><td>0.802</td><td>0.981</td><td>0.978</td><td>0.899</td></tr>\n",
"    <tr><th>AdaBoost</th><td>0.109</td><td>0.487</td><td>0.976</td><td>0.803</td><td>0.807</td></tr>\n",
"    <tr><th>Linear SVC</th><td>0.051</td><td>0.804</td><td>0.980</td><td>0.977</td><td>0.894</td></tr>\n",
"    <tr><th>Hard Voting</th><td>0.048</td><td>0.817</td><td>0.980</td><td>0.000</td><td>0.896</td></tr>\n",
"    <tr><th>Soft Voting</th><td>0.046</td><td>0.830</td><td>0.981</td><td>0.983</td><td>0.900</td></tr>\n",
"    <tr><th>Stack</th><td>0.047</td><td>0.836</td><td>0.978</td><td>0.984</td><td>0.888</td></tr>\n",
"  </tbody>\n",
"</table>
\n", "" ], "text/plain": [
"                   Error rate  Sensitivity  Specificity    AUC  Precision\n",
"Naive Bayes             0.055        0.851        0.965  0.979      0.837\n",
"Logistic L1             0.054        0.808        0.975  0.978      0.872\n",
"Logistic L2             0.052        0.793        0.980  0.981      0.893\n",
"Random Forest           0.080        0.688        0.969  0.942      0.822\n",
"Gradient Boosting       0.050        0.802        0.981  0.978      0.899\n",
"AdaBoost                0.109        0.487        0.976  0.803      0.807\n",
"Linear SVC              0.051        0.804        0.980  0.977      0.894\n",
"Hard Voting             0.048        0.817        0.980  0.000      0.896\n",
"Soft Voting             0.046        0.830        0.981  0.983      0.900\n",
"Stack                   0.047        0.836        0.978  0.984      0.888" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [
"columns=['Error rate', 'Sensitivity', 'Specificity', 'AUC', 'Precision']\n",
"rows=['Naive Bayes', 'Logistic L1', 'Logistic L2', 'Random Forest', \n",
"      'Gradient Boosting', 'AdaBoost', 'Linear SVC', 'Hard Voting', 'Soft Voting', 'Stack']\n",
"\n",
"results=pd.DataFrame(0.0, columns=columns, index=rows) \n",
"\n",
"methods=[nbc, logit_l1, logit_l2, rf, gb, adaboost, svm, vhard, vsoft, stack]\n",
"\n",
"y_prob = np.zeros((len(test), len(rows)))\n",
"\n",
"for i, method in enumerate(methods):\n",
"    \n",
"    # The stack was fitted on a dense matrix, so it needs dense test inputs\n",
"    if method is not stack:\n",
"        y_pred = method.predict(X_test)\n",
"    else:\n",
"        y_pred = method.predict(X_test.todense()) \n",
"    \n",
"    # AUC needs a continuous score. Hard voting returns class labels only,\n",
"    # so its AUC entry stays at the 0.0 placeholder\n",
"    if method not in [svm, vhard, stack]: # these models predict probabilities from the sparse matrix\n",
"        y_prob[:, i] = method.predict_proba(X_test)[:,1]\n",
"        results.iloc[i,3]= roc_auc_score(y_test, y_prob[:,i]) \n",
"    elif method is svm: # LinearSVC has no predict_proba, so we rank by decision_function\n",
"        y_df = method.decision_function(X_test)\n",
"        results.iloc[i,3]= roc_auc_score(y_test, y_df) \n",
"    elif method is stack: \n",
"        y_prob[:, i] = method.predict_proba(X_test.todense())[:,1]\n",
"        results.iloc[i,3]= roc_auc_score(y_test, y_prob[:,i])\n",
"\n",
"    confusion = confusion_matrix(y_test, y_pred) \n",
"    results.iloc[i,0]= 1 - accuracy_score(y_test, y_pred)\n",
"    results.iloc[i,1]= confusion[1,1]/np.sum(confusion[1,:]) # sensitivity\n",
"    results.iloc[i,2]= confusion[0,0]/np.sum(confusion[0,:]) # specificity\n",
"    results.iloc[i,4]= precision_score(y_test, y_pred)\n",
"\n",
"results.round(3)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.3" } }, "nbformat": 4, "nbformat_minor": 2 }