{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Exercise 7\n", "\n", "# Part 1 - DT\n", "\n", "## Capital Bikeshare data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Introduction\n", "\n", "- Capital Bikeshare dataset from Kaggle: [data](https://github.com/justmarkham/DAT8/blob/master/data/bikeshare.csv), [data dictionary](https://www.kaggle.com/c/bike-sharing-demand/data)\n", "- Each observation represents the bikeshare rentals initiated during a given hour of a given day" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import pandas as pd\n", "import numpy as np\n", "from sklearn.model_selection import cross_val_score\n", "from sklearn.linear_model import LinearRegression\n", "from sklearn.tree import DecisionTreeRegressor, export_graphviz" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# read the data and set \"datetime\" as the index\n", "bikes = pd.read_csv('../datasets/bikeshare.csv', index_col='datetime', parse_dates=True)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "# \"count\" is a method, so it's best to rename that column\n", "bikes.rename(columns={'count':'total'}, inplace=True)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "# create \"hour\" as its own feature\n", "bikes['hour'] = bikes.index.hour" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>season</th><th>holiday</th><th>workingday</th><th>weather</th><th>temp</th><th>atemp</th><th>humidity</th><th>windspeed</th><th>casual</th><th>registered</th><th>total</th><th>hour</th></tr>\n",
"    <tr><th>datetime</th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>2011-01-01 00:00:00</th><td>1</td><td>0</td><td>0</td><td>1</td><td>9.84</td><td>14.395</td><td>81</td><td>0.0</td><td>3</td><td>13</td><td>16</td><td>0</td></tr>\n",
"    <tr><th>2011-01-01 01:00:00</th><td>1</td><td>0</td><td>0</td><td>1</td><td>9.02</td><td>13.635</td><td>80</td><td>0.0</td><td>8</td><td>32</td><td>40</td><td>1</td></tr>\n",
"    <tr><th>2011-01-01 02:00:00</th><td>1</td><td>0</td><td>0</td><td>1</td><td>9.02</td><td>13.635</td><td>80</td><td>0.0</td><td>5</td><td>27</td><td>32</td><td>2</td></tr>\n",
"    <tr><th>2011-01-01 03:00:00</th><td>1</td><td>0</td><td>0</td><td>1</td><td>9.84</td><td>14.395</td><td>75</td><td>0.0</td><td>3</td><td>10</td><td>13</td><td>3</td></tr>\n",
"    <tr><th>2011-01-01 04:00:00</th><td>1</td><td>0</td><td>0</td><td>1</td><td>9.84</td><td>14.395</td><td>75</td><td>0.0</td><td>0</td><td>1</td><td>1</td><td>4</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " season holiday workingday weather temp atemp \\\n", "datetime \n", "2011-01-01 00:00:00 1 0 0 1 9.84 14.395 \n", "2011-01-01 01:00:00 1 0 0 1 9.02 13.635 \n", "2011-01-01 02:00:00 1 0 0 1 9.02 13.635 \n", "2011-01-01 03:00:00 1 0 0 1 9.84 14.395 \n", "2011-01-01 04:00:00 1 0 0 1 9.84 14.395 \n", "\n", " humidity windspeed casual registered total hour \n", "datetime \n", "2011-01-01 00:00:00 81 0.0 3 13 16 0 \n", "2011-01-01 01:00:00 80 0.0 8 32 40 1 \n", "2011-01-01 02:00:00 80 0.0 5 27 32 2 \n", "2011-01-01 03:00:00 75 0.0 3 10 13 3 \n", "2011-01-01 04:00:00 75 0.0 0 1 1 4 " ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bikes.head()" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>season</th><th>holiday</th><th>workingday</th><th>weather</th><th>temp</th><th>atemp</th><th>humidity</th><th>windspeed</th><th>casual</th><th>registered</th><th>total</th><th>hour</th></tr>\n",
"    <tr><th>datetime</th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>2012-12-19 19:00:00</th><td>4</td><td>0</td><td>1</td><td>1</td><td>15.58</td><td>19.695</td><td>50</td><td>26.0027</td><td>7</td><td>329</td><td>336</td><td>19</td></tr>\n",
"    <tr><th>2012-12-19 20:00:00</th><td>4</td><td>0</td><td>1</td><td>1</td><td>14.76</td><td>17.425</td><td>57</td><td>15.0013</td><td>10</td><td>231</td><td>241</td><td>20</td></tr>\n",
"    <tr><th>2012-12-19 21:00:00</th><td>4</td><td>0</td><td>1</td><td>1</td><td>13.94</td><td>15.910</td><td>61</td><td>15.0013</td><td>4</td><td>164</td><td>168</td><td>21</td></tr>\n",
"    <tr><th>2012-12-19 22:00:00</th><td>4</td><td>0</td><td>1</td><td>1</td><td>13.94</td><td>17.425</td><td>61</td><td>6.0032</td><td>12</td><td>117</td><td>129</td><td>22</td></tr>\n",
"    <tr><th>2012-12-19 23:00:00</th><td>4</td><td>0</td><td>1</td><td>1</td><td>13.12</td><td>16.665</td><td>66</td><td>8.9981</td><td>4</td><td>84</td><td>88</td><td>23</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " season holiday workingday weather temp atemp \\\n", "datetime \n", "2012-12-19 19:00:00 4 0 1 1 15.58 19.695 \n", "2012-12-19 20:00:00 4 0 1 1 14.76 17.425 \n", "2012-12-19 21:00:00 4 0 1 1 13.94 15.910 \n", "2012-12-19 22:00:00 4 0 1 1 13.94 17.425 \n", "2012-12-19 23:00:00 4 0 1 1 13.12 16.665 \n", "\n", " humidity windspeed casual registered total hour \n", "datetime \n", "2012-12-19 19:00:00 50 26.0027 7 329 336 19 \n", "2012-12-19 20:00:00 57 15.0013 10 231 241 20 \n", "2012-12-19 21:00:00 61 15.0013 4 164 168 21 \n", "2012-12-19 22:00:00 61 6.0032 12 117 129 22 \n", "2012-12-19 23:00:00 66 8.9981 4 84 88 23 " ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bikes.tail()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- **hour** ranges from 0 (midnight) through 23 (11pm)\n", "- **workingday** is either 0 (weekend or holiday) or 1 (non-holiday weekday)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Exercise 7.1\n", "\n", "Run these two `groupby` statements and figure out what they tell you about the data." 
] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "workingday\n", "0 188.506621\n", "1 193.011873\n", "Name: total, dtype: float64" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# mean rentals for each value of \"workingday\"\n", "bikes.groupby('workingday').total.mean()" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "hour\n", "0 55.138462\n", "1 33.859031\n", "2 22.899554\n", "3 11.757506\n", "4 6.407240\n", "5 19.767699\n", "6 76.259341\n", "7 213.116484\n", "8 362.769231\n", "9 221.780220\n", "10 175.092308\n", "11 210.674725\n", "12 256.508772\n", "13 257.787281\n", "14 243.442982\n", "15 254.298246\n", "16 316.372807\n", "17 468.765351\n", "18 430.859649\n", "19 315.278509\n", "20 228.517544\n", "21 173.370614\n", "22 133.576754\n", "23 89.508772\n", "Name: total, dtype: float64" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# mean rentals for each value of \"hour\"\n", "bikes.groupby('hour').total.mean()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Exercise 7.2\n", "\n", "Run this plotting code, and make sure you understand the output. Then, split this plot into two plots conditioned on \"workingday\". (In other words, one plot should display the hourly trend for \"workingday=0\", and the other should display the hourly trend for \"workingday=1\".)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# mean rentals for each value of \"hour\"\n", "bikes.groupby('hour').total.mean().plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot for workingday == 0 and workingday == 1" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# hourly rental trend for \"workingday=0\"\n", "bikes[bikes.workingday==0].groupby('hour').total.mean().plot()" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# hourly rental trend for \"workingday=1\"\n", "bikes[bikes.workingday==1].groupby('hour').total.mean().plot()" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXcAAAEGCAYAAACevtWaAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+j8jraAAAgAElEQVR4nOzdd3iUVdrA4d/JpBdSSAIpQBJKaIEAoUgXF0GkqCiCFHtZcVXctWyz7e6nq+6K2EFUVFQsqygqCkoJnUCoIQVCgBAglZBC+vn+OBNqSJ2WmXNfV66ZeWfmfR9C8uTMKc8RUko0TdM0++Jk7QA0TdM009PJXdM0zQ7p5K5pmmaHdHLXNE2zQzq5a5qm2SFnawcAEBgYKCMiIqwdhqZpWquyY8eOXCllUF3P2URyj4iIICEhwdphaJqmtSpCiCNXek53y2iaptkhndw1TdPskE7umqZpdsgm+tzrUllZSWZmJmVlZdYOxWrc3d0JDw/HxcXF2qFomtbK2Gxyz8zMxMfHh4iICIQQ1g7H4qSU5OXlkZmZSWRkpLXD0TStlbHZbpmysjLatm3rkIkdQAhB27ZtHfqTi6ZpzWezyR1w2MRey9H//ZqmNZ9NJ3dN01qxmhrY9RmU5ls7EofkcMk9IiKC3Nzcy44PHTrU7NfQNIeSsR6+fQDeHwcFGdaOxuE4VHKvrq6+4nObNm2yYCSa5gCyk9Vt0Ul473dwfId143EwrSa5v/TSSyxYsACAefPmMWbMGAB+/fVXZs2axWeffUZMTAy9e/fmySefPPc+b29vnn76aQYPHszmzZvPHT979izjx49n0aJF514HsHbtWkaPHs3NN99M9+7dmTlzJrW7Vf344490796d4cOH8/DDDzNx4kQA8vLyuPbaa+nXrx/3338/F+5udcMNNzBgwAB69erFwoULAVi8eDHz5s0795pFixbx2GOPmfx7pmlWlZMMHv5wz6/g4gkfToTkH60dlcNoNcl95MiRxMfHA5CQkEBxcTGVlZVs2LCBrl278uSTT/Lbb7+xa9cutm/fzrfffgtASUkJvXv3ZuvWrQwfPhyA4uJiJk2axG233ca999572bUSExOZP38+SUlJpKens3HjRsrKyrj//vv56aef2LBhAzk5Oede/9xzzzF8+HASExOZPHkyR48ePffc+++/z44dO0hISGDBggXk5eUxffp0vvvuOyorKwH44IMPuPPOO832vdM0q8hJgaDuENQN7lmt7i+bCdsWWTsyh9BqkvuAAQPYsWMHRUVFuLm5cdVVV5GQkEB8fDx+fn6MHj2aoKAgnJ2dmTlzJuvXrwfAYDAwderUi841ZcoU7rzzTubMmVPntQYNGkR4eDhOTk7ExsaSkZFBcnIyUVFR5+acz5gx49zr169fz6xZswC4/vrr8ff3P/fcggUL6Nu3L0OGDOHYsWOkpaXh5eXFmDFjWLFiBcnJyVRWVhITE2PS75emWV1uCgR2U/e9g+GOFdB1HPz4J/jlb2rAVTObVpPcXVxciIiI4IMPPmDo0KGMGDGCNWvWcOjQ
ITp27HjF97m7u2MwGC46NmzYMH766SeutDm4m5vbufsGg4GqqqorvrZWXdMW165dy+rVq9m8eTO7d++mX79+5+at33PPPXz44Ye61a7Zp5JcKM1TrfVarl4wfSkMvBc2vQ5f3QmVeh2HubSa5A6qa+aVV15h5MiRjBgxgnfeeYfY2FiGDBnCunXryM3Npbq6ms8++4xRo0Zd8TzPP/88bdu25cEHH2z0tbt37056ejoZGRkALFu27KK4li5dCsBPP/1EQUEBAIWFhfj7++Pp6UlycjJbtmw5957Bgwdz7NgxPv3004s+BWiaXcgxDqYGRV983MkAE16Ga/8JSd/CR1P0VEkzaVXJfcSIEZw4cYKrrrqKdu3a4e7uzogRIwgJCeGFF17g6quvpm/fvvTv358pU6bUe6758+dTVlbGE0880ahre3h48NZbbzF+/HiGDx9Ou3bt8PX1BeCZZ55h/fr19O/fn19++eXcJ4nx48dTVVVFnz59+Pvf/86QIUMuOue0adMYNmzYRd04mmYXziX37pc/JwQM/QPc8iFkJcLisZCfbtHwHIFoqLsBQAiRARQB1UCVlDJOCBEALAMigAxgmpSyQKj+ideACUApcIeUcmd954+Li5OXbtZx4MABevTo0dR/j1kVFxfj7e2NlJK5c+fStWvXi2a9NNXEiROZN28e11xzzRVfY4vfB01r0I9PwK5P4c/HVDK/kqNb4LPpIAxw2zIIj7NcjHZACLFDSlnnN60pLferpZSxF5zoKeBXKWVX4FfjY4DrgK7Gr/uAt5sXtu1ZtGgRsbGx9OrVi8LCQu6///5mnef06dN069YNDw+PehO7prVaOclqlkxDJTQ6DoG7V4Obt5oqeWCFZeJzAC2pCjkFGG28vwRYCzxpPP6RVB8Jtggh/IQQIVLKEy0J1BbMmzevRS31Wn5+fqSmppogIk2zUTkp0KWRDZfALirBfzYdls2C8S/CkAfMG58DaGzLXQK/CCF2CCHuMx5rV5uwjbfBxuNhwLEL3ptpPHYRIcR9QogEIUTChXPGNU1r5c6ehuKTlw+m1sc7CG7/HrpfDyufhJV/0VMlW6ixyX2YlLI/qstlrhBiZD2vretz2GUd+1LKhVLKOCllXFBQnZt3a5rWGuUaP5XWNZhaH1dPmPYRDP49bHkTVjxq+tgcSKOSu5Qyy3ibDXwDDAJOCSFCAIy32caXZwIdLnh7OJBlqoA1TbNxtTNlahcwNYWTAa57EfrPgd2fQUWpaWNzIA0mdyGElxDCp/Y+cC2wD/gOuN34stuB5cb73wFzhDIEKLSH/nZN0xopJwWcPcDvyosLG9RzClRXwNHNDb9Wq1NjWu7tgA1CiN3ANuAHKeVK4EVgrBAiDRhrfAzwI5AOHAQWAY1fKWQHVq5cSXR0NF26dOHFF19s+A2aZm9ykiGwq2qFN1fHq8DJBdLXmiwsR9PgbBkpZTrQt47jecBlw+HGWTJzTRJdK1NdXc3cuXNZtWoV4eHhDBw4kMmTJ9OzZ09rh6ZplpOTqqY4toSrF3QYDIfXmSYmB9SqVqjaum3bttGlSxeioqJwdXVl+vTpLF++vOE3apq9KC+GwqNqjntLRY2CE3t0eYJmask8d5v13Pf7Sco6Y9Jz9gxtwzOTetX7muPHj9Ohw/mx5PDwcLZu3WrSOGxe4XH4dJqa9dC2s7Wj0SytuTNl6hI1Gtb8Cw6vh143tPx8Dka33E2orlIODrfJdfoaOLUPMuKtHYlmDTkp6tYUyT20P7j66H73ZrLLlntDLWxzCQ8P59ix8+u3MjMzCQ0NtUosVpOVqG5rf8k1x5KbogZC/SNbfi6DM0QM18m9mXTL3YQGDhxIWloahw8fpqKigs8//5zJkydbOyzLOm6sEaeTu2PKSYG2XVRiNoWo0VBwGAqOmOZ8DkQndxNydnbmjTfeYNy4cfTo0YNp06bRq5d1PkVYRVWF6pIBndwdVU5y08oONCTKuC+DnjXTZHbZ
LWNNEyZMYMKECdYOwzqyk9TCk/YxcHIvlBeBm4+1o9IspbIMCjIgZprpzhnUHbzbQfo6tWpVazTdctdMp7a/vc90dZurK186lLyDIGtM23IXQnXNpK/VhcSaSCd3zXSydoK7H3Qbpx7rrhnHcqWt9VoqchSU5qpPhlqj6eSumU5WIoT2UzMlDK46uTuanBQQTmpA1ZR0v3uz6OSumUblWcg+oJK7wVn9guvk7lhykiEgCpzdTHte33D186SnRDaJTu6aaZzaDzVVENZfPQ6KPv8xXXMMuammWbxUl6jRkLERqivNc347pJO7Zhq1g6mh/dRtYDScPqJa9Jr9q65UA6rNqeHeGFGjobIEMhPMc347pJO7Cd11110EBwfTu3dva4diecd3glcQtDHuqBgUrWZO5B20blyaZeSnq09u5mq5RwxX/fm6a6bRdHI3oTvuuIOVK1daOwzrqB1Mra2lU/tLrvvdHYO5ZsrU8vCHkFg9qNoEOrmb0MiRIwkICLB2GJZXXqxqioT2P3+sbWfV0tLJ3THkpALCfN0yoGbNZG5Xi+O0BtnnCtWfnlIrJE2pfYza21G73Mm9qgumtr8d1IyJgCiV9DX7l5MMfh3UJtfmEjUaNrwKRzadX0uhXZFuuWstl2UsFhYae/HxwGjdcncUOSnm62+v1WEIOLurUgRag+yz5a5b2JaVlQg+oeDT/uLjQdGQ9rOaSWFwsU5smvnVVKtpkJ1Hm/c6Lu5q6z09qNoouuWutVxW4vn57RcK6q5mUOQftnxMmuWcPgLV5eZvuYPqmsneD8XZ5r9WK6eTuwnNmDGDq666ipSUFMLDw1m8eLG1QzK/skI13fHSLhk4v4+mXsxk32q73gLNNFPmQudKEaw3/7VaOfvslrGSzz77zNohWF7WLnV74WBqrdqZE3pQ1b6dmwZpxpkytUJiwd1XbecYc7P5r9eK6Za71jK1K1ND6kjurl7g21EPqtq7nBQ15uLua/5rORkgcqQaVK1jz2LtPJ3ctZbJSgS/juDVtu7ndY0Z+5eTYr7FS3WJHAWFx9SqWO2KbDq5Swf/y9wq/v1ZiRcvXrpUUDTkpqkZFZr9kdLyyT3qanWrV6vWy2aTu7u7O3l5ea0jwZmBlJK8vDzc3d2tHcqVleSpmRJ19bfXCoqGqjI4fdRycWmWU5ipCnpZMrm37axqGOkpkfWy2QHV8PBwMjMzycnJsXYoVuPu7k54eLi1w7iyE5dUgqxL7fS43FQIiDR/TJpl1Y6nWGIaZK3arfdSflRb7znZbBvVqmw2ubu4uBAZqZOBTTs3mNr3yq8JvGA6pF4ybn9yrZDcQSX3XUvh5J66p+Fqttsto7UCWbvUDjkefld+jYcfeLfXM2bsVU4yeAaCp4UL5kWOVLe6a+aKGp3chRAGIUSiEGKF8XGkEGKrECJNCLFMCOFqPO5mfHzQ+HyEeULXrK62zG9Dgrrp5G6vLFFTpi4+7SGohx5UrUdTWu6PAAcuePxv4FUpZVegALjbePxuoEBK2QV41fg6zd4UnYIzxxuZ3LurJOCgg+N2S0rVcrfkYOqFokbBkc1QWWad69u4RiV3IUQ4cD3wnvGxAMYAXxlfsgS4wXh/ivExxuevMb5esyeXbqtXn6BoqCiCohPmjUmzrOJsVX7CGi13UP3uVWchc5t1rm/jGttynw88AdQYH7cFTkspq4yPMwHj/mqEAccAjM8XGl9/ESHEfUKIBCFEgiPPiGm1shLVZhzt+zT82tqaI3oxk32xZNmBunQaBsKgSwBfQYPJXQgxEciWUu648HAdL5WNeO78ASkXSinjpJRxQUFBjQpWsyFZiSppu3k3/Fq95Z59ssY0yAu5t4GwAXpQ9Qoa03IfBkwWQmQAn6O6Y+YDfkKI2qmU4UCW8X4m0AHA+LwvkG/CmDVrk1Jt0NGYLhkAr0C1B6ZO7vYlJ1nVk/FuZ70Yokarn8WyQuvFYKMaTO5Syj9LKcOllBHAdOA3KeVM
YA1QW5btdmC58f53xscYn/9NOuoyU3t15jiU5DQ+uQtxflBVo6iskvScYmpqWvmvRW6q+n+15pBa1Ci1xWPGBuvFYKNasojpSeBzIcQ/gUSgtnj5YuBjIcRBVIt9estC1GxOUwZTawVFw4HvzROPDZNScjS/lB1HCs59pZwqQkpo4+7MgE7+xEUEMCgygJgwX9xdDNYOufFykqHbeOvGED4QXDxV10z3660bi41pUnKXUq4F1hrvpwOD6nhNGXCLCWLTbFVWIjg5Q/vejX9PYDSUfggluaqbxk6VVVaz93jhuUS+80gBeSUVAPi4ORPb0Y/xvdsT6utB4rHTJGTksyZFfaJxNTjRJ9yXgZEBDIzwZ0DHAHw9bXR7wpI89enNWv3ttZzdoNNQPahaB5stP6DZsOM7IbgHuHg0/j1BF8yY8Rpunris4GRh2flW+dECkrIKqaxW3S2RgV6Mjg5mQCd/BnTyp0uwNwan810Y0wZ2ACC/pIIdRwpIyMhne0Y+78Wn8/ZadY7odj4MjPRnYEQAcREBhPk14XtuTtYqO1CXyFGw6u9wJgvahFo7Gpuhk7vWNFKqlnvPyU1737nkngIRrT+5F56tZNZ7W9l7XA3kubs40Sfcj3tGRDGgoz/9OvrR1tutUecK8HJlbM92jO2pBibPVlSzO1O16rdlFPBtYhafbFFVNUN93ekR0oYOAZ6E+3vQMcCTDsYvbzcL/jqfmyljpQVMF4oarW7T10HsDGtGYlN0cteapiADyk43rb8dVIlWV2+7GFSVUvLXb/aSdOIMf5nQnSFRbekR0gYXg2lKNXm4GhgS1ZYhUWp5SHWNJPnkGbYfzmf7kQLSc0rYejif4vKqi94X4OVKB38PwgM8VdL396RDgPoDEOrnYbL4APX/6OIFvjZQtbRdb/Bsq0oR6OR+jk7uWtOcG0ytZ4OOugihKkTawX6qX+88zoo9J3h8XDT3jexs9usZnAS9Qn3pFerLHcNUpVQpJadLKzlWUMrR/FKO5Z/lWEEpx/JL2X+8kF/2nzzXPQTgJCDE14OoIC+6BHvTOcj4FexFkLcbTV5EnpOsFi/ZwuJzJyfj1ntr1SdLW4jJBujkrjVN1k4wuEJwz6a/N6i72ti4FcvILeHp5fsYHBnAA6PMn9ivRAiBv5cr/l6u9Am/vCpndY3k5JkyjuWXnvs6kl9Kek4Jy7Yfo7Ti/M5YPu7OdA7yviDpe9E52JuOAZ5Xbu3npKhpiLYichTs/0bt+mWtFbM2Rid3rWmydqmPwc6uTX9vUDTs/lQtOLHEZsomVlFVw8OfJ+JicOLVW2MvGhy1NQYnQZifB2F+Hue6d2pJKTlRWMahnGIOZRdzKKeEQznFxKfl8NWOzHOvc3YSdGrrSZdgb8b1as+N/cJUC7/sDBRl2UZ/e62o0eo2fa1O7kY6uWuNV1Ojknufac17/7lB1VToMNB0cVnIq6tT2ZNZyNsz+xNqK7NWmkEIQaifB6F+HozoenHpjzNllaTnlBiTvvo6cKKIn/ef4rfkbP7vphja5KaqFwfaUHIPiAS/Tiq5D77P2tHYBJ3ctcbLP6SqO4Y1sb+91oXTIVtZct90MJd31h1ixqAOXBcTYu1wzKaNuwuxHfyI7XC+q6emRvL2ukP8d5X647Y0Lk3VF7GlljuobqL9y6G6Cgw6temdmLTGO75T3TZ1pkwtv05gcGt1g6oFJRXM+2IXkYFe/H1iM8YaWjknJ8Hcq7uw7L4hVFXX8POadVQ5uSL9Olk7tItFjYbyQjixy9qR2ASd3LXGy0oEZ4/mfxx3MqgZM61oOqSUkie/3kN+SQULpvfD09VxW4RxEQH8+MgIBnrnkFoVwj0fJ1JgXH1rEyKNA7y6SiSgk7vWFFmJENKnZR95g6JbVV33T7cd5ZekUzw5vju9w1rfILCp+Xm60sftJO4hPYhPy+W61+LZdthGir56BUK7GJ3cjXRy1xqnusq403wzu2RqBUXD6WNQUWKa
uMwo7VQR/1iRxIiugdxlnF/u8CpKEKePEtVzAP97cCjuLk5MX7iZ139No9oWqlxGjYJjW6Gi1NqRWJ1O7lrj5KZCZWnTFy9dKigakGo+sg0rq6zm4c934eXqzH+m9cXJhqc9WlRuGiAhKJreYb58/4fhTOobyn9WpTJ78Vayz1h5P9Oo0VBdAUc3WzcOG6CTu9Y4WS0cTK1V219fO53ORr20MoUDJ87w8i19CPZxt3Y4tuOS3Zd83F2Yf2ssL03tw86jBUxYEM+6VCtum9lpqCpzse9/1ovBRujkrjVOVqL6pWnbpWXnCYhS5YJtuN99TUo27288zB1DIxjT3Yq7DNmi3BT1/xcQde6QEIJpAzvw3UPDCfBy5fb3t/HvlclUVtfUcyIzcfWCXjeq1arlRZa/vg3RyV1rnKxECIlVdTxawtkVAjrb7IyZnKJyHv9yN9HtfHjqOhsoZ2trclLU/5/h8jrz3dr5sHzucGYM6sjbaw9x67ubySywQt93/zlQWaISvAPTyV1rWFUFnNwHYS3skqkVZJvTIWtqJH/6cjdFZVUsmNGvde2KZCk5yfUuXvJwNfDCTTG8PqMfqaeKmfBaPLuPnbZggKjdmQKjYefHlr2ujdHJXWtYdhJUl7e8v71WUHfIT1d/NGzIh5syWJeaw9+u70F0ex9rh2N7qsrV/1sjNuiY1DeUHx4eThsPF+79KIFTlhxoFQL6z4bMbTbZiLAUndy1hjVnz9T6BEaDrFblDGxEUtYZXvwpmd/1CGbWEBtbeWkr8g6pzagbWXagU1svFs2Jo7i8ivs+SqCssrrhN5lKn+lqbGDnR5a7po3RyV1rWFYiuPuBv4nmel9YY8YGnK2o5uHPE/HzdOGlm/s2vba5o6j9/2pCTZkeIW149dZYdmcW8tTXe5DSQnPhvYMg+jrY/bnNfUK0FJ3ctYZlJapWu6mSXmBXQKjqkDbgnz8kcTC7mP9OiyXAqxmljB1FTgoIpybPmBrXqz1/urYb3+7K4p116WYKrg79ZkNpLqSutNw1bYhO7lr9KstUn7upumRAbazt38kmWu6rkk6xdOtR7h8ZxfCugdYOx7blJIN/RNM2Rjeae3UXJvUN5aWfk1mddMr0sdWl8zXgEwKJjjmwqpO7Vr9T+6CmyrTJHdSgnJUHu86UVfLXb/bSI6QNf7zWxsrX2qLc1GYXjRNC8NLUPvQO9eWRzxNJPWWBOegGZ4i9DQ6uhjNZ5r+ejdHJXaufqQdTawV2g7yDqmaNlbzycwq5xeW8eFMMrs76V6Fe1VXGLeya/0fQw9XAojlxeLo5c8+SBMtUlOw3Sw0C71pq/mvZGP0TrdUvKxE8A02/y31QdzW98vQR0563kRKPFvDxliPMuSqCvh0u34NUu0TBYaipbNQ0yPq093Vn4ewBnDxTxu+X7jD/KtaAKIgYAYmfqJ3EHIhO7lr9ju9UOy+ZegbJuRkzlu+aqayu4c//20s7H3f+eK3eb7NRmjFT5kr6dfTnxZti2JKez3Pf72/x+Rq+4GwoyIAjG8x/LRuik7t2ZeXFqpaIqbtkQHXLgFUGVd/fcJjkk0U8N6UXPu6XL6PX6lD7RzjQNH8Mb+ofzv0jo/hky1E+3mLmT289J4Obr8OtWNXJXbuyk3tVf6U5krt7G2gTZvGW+7H8Ul5dncrYnu0Y16u9Ra/dquWkgG8HcPM22SmfGN+dMd2Dee67/Ww+lGey817GxQNiboYD38FZC5dCsKIGk7sQwl0IsU0IsVsIsV8I8ZzxeKQQYqsQIk0IsUwI4Wo87mZ8fND4fIR5/wma2ZhrMLVWYDeL7qcqpeRv3+7DIATPTe5lsevahQZqyjSHwUnw2vRYIgK9+P3SHRzNM2ORsf6zoaoM9n5pvmvYmMa03MuBMVLKvkAsMF4IMQT4N/CqlLIrUADcbXz93UCBlLIL8KrxdVprlJUIPqHgY6YWblB3tZDJQgNdK/acYF1qDn+8NppQv6bP1XZYNTXG
mTKmr5Lp4+7Ce3PikBLu+Wg7RWWVJr8GoCqatotRA6sOosHkLpVi40MX45cExgBfGY8vAW4w3p9ifIzx+WuEXs/dOmXtNF+rHVR1yMoSOHPcfNcwKjxbyXPfJxET5svtQyPMfj27UngUqs6arL/9UhGBXrw9sz+HckqYt2yXebbrqy0mdmKX6m50AI3qcxdCGIQQu4BsYBVwCDgtpaydpJwJhBnvhwHHAIzPFwJtTRm0ZgFlhWoeulmTu7ElaIF+93+vTCa/pJwXborBoLfMa5pLdl8yh6FdAnlmUk9WH8jmlV/M9PMQcwsYXB1mYLVRyV1KWS2ljAXCgUFAj7peZryt6zfnsj/FQoj7hBAJQoiEnBwrbsul1e3EbnVrzuQeaJkCYjuO5PPp1qPcNSyS3mG+Zr2WXTo3DdK800ZnD+nEbYPVRh/fJprh05xnAHSfCHuWqbIadq5Js2WklKeBtcAQwE8I4Wx8KhyoXd+bCXQAMD7vC+TXca6FUso4KWVcUFBQ86LXzMfcg6kAXm3VAikzDqpWVKk57WF+Hswbq+e0N0tOCni3Bw9/s15GGAe6B0cG8MTXe9hljk0++s+GstOQvML057YxjZktEySE8DPe9wB+BxwA1gA3G192O7DceP8742OMz/8mLVbnUzOZ4zvBr6NKwOZk5hozi+LTST1VzPNTeuHl5tzwG7SLSQnHtkFwXR/WTc/F4MTbswbQro0b93+cQG5xuWkvEDkafDs6RDGxxrTcQ4A1Qog9wHZglZRyBfAk8JgQ4iCqT32x8fWLgbbG448BT5k+bM3savdMNbfaLffM8Pf/SF4JC35N47re7bmmh97oulmObIS8NDVP3EICvFx5d1Ycp0sr+cOniVSZskSBkxP0mwnpa6HAOqUvLKUxs2X2SCn7SSn7SCl7SymfNx5Pl1IOklJ2kVLeIqUsNx4vMz7uYnzeggWcNZMoyVM1X8L6m/9aQd3Vx+TibJOetnZOu4vBiWf1nPbm275YbdTS6yaLXrZnaBv+dWMMm9Pz+O8qE9f9j50JCLsvJqZXqGqXO1Hb326B5G6mMgTLd2URn5bLE+OjadfG3aTndhhFp9SqztiZ4Opp8cvfPCCcGYM68tbaQ6wyZQ14vw7Q+WpIXAo1Ftz6z8J0ctcud7w2uVuiW8Y4vS7XdK2z06UV/GNFErEd/Jg5WO+H2myJH6la/nF3WS2EZyb1JCbMl8e+2MWRvBLTnbjfbDiTCelrTHdOG6OTu3a5rES1lZq7BaYN+rRXRZ1M2HJ/4cdkTp+t1HPaW6KmGhI+hKjRENi0bfVMyd3FwFsz++MkBA98stN0m2x3vx48Aux6zrtO7trlsnZapksG1MrB2kFVE9ianseyhGPcMyKSHiFtTHJOh5T6s2rZxt3d8GvNrEOAJ/Onx5J88gx/+3afaTbZdnaDPrdC8g9qjMkO6eSuXezMCSg6Yd757ZcKijZJci+vquYv3+wl3N+DR67paoLAHFjCYrX/aPQEa0cCwNXRwfxhTFe+2pHJsu3HTHPS/rPVBiR7lvYKamwAACAASURBVJnmfDZGJ3ftYrWLlywxU6ZWYDSUZEPpZWvdmuSdtekcyinhHzf0xtNVz2lvtvx0OPgrDLhD7UNqIx65pisjugby9Hf72ZtZ2PITtuulGjGJH5tlKq616eSuXSwrEYQTtI+x3DVNMKianlPMm2sOMrFPCFdHB5soMAeV8IH6Geg/x9qRXESVCO5HoJcrD3yyg9OlJtiDtd9syE5Si/bsjE7u2sWydkJQD3D1stw1g1pWY0ZKyV+/2YebixNPT+ppwsAcUGWZKovbfQK0CbV2NJcJ8HLlrVkDyCkq59Flu6hpaQXJmJvB2cMuV6zq5K6dJ6VqwViyvx3UDj8unqq2ezPEp+WyOT2PJ8ZFE+yj57S3SNJyOJsPA++xdiRXFNvBj6cn9WRtSg5vrDnYspO5+0LPKbDva6gw42YhVqCTu3be6aPqFzvMwsnd
yQkCuza75b4oPp0gHzemDexg4sAc0Pb31DTYyFHWjqReMwd35KZ+Yby6OpX1qS2sKtt/NpSfUX/Y7IhO7tp5WcZ+R0u33EENqjZjxkzyyTPEp+Vyx9AI3JwNZgjMgZzcC5nb1KIlG99fRwjBv26MIbqdD498nsjx02ebf7JOwyAgyu66ZnRy187LSgQnF2jX2/LXDopW86rLi5r0tvfiD+PhYmDm4I5mCsyBbF+s+p9jb7N2JI3i4Wrg7VkDqKqWPLh0J+VVzVzgJAT0m2UsknbItEFakU7u2nnHd0L73mqBh6XVDqo2YcZM9pkylu86zi1x4fh5upopMAdRdgb2fAG9p5q9brspRQZ68fItfdl97DT/XHGg+Sfqe5uaIWRHrXed3DWlpkbtvmSNLhm4YMu9xif3JZszqKqR3DUs0jwxOZI9y9R+tgOtV0emucb3bs/9I6P4eMsRvknMbN5J2oRA12tVMbHy4oZf3wro5K4p+YfUoJKlyg5cyj9SdQk1clC1tKKKT7YcZVzP9kQEWnDapj2SUnXJhPaDsAHWjqZZHh8XzaDIAP78v70knzzTvJMMf0wtplv/smmDsxKd3DXFEtvq1cfgrGZpNLJb5suETArPVnLvSN1qb7EjmyDngE3UkWkuZ4MTb9zWDx93F37/yU7OlFU2/SQdB6vyxpvfhNw00wdpYTq5a8rxnWowzYw73DeoXS/ITGiwxnZ1jWTxhsP06+jHgE4BFgrOjiUsVvO9e0+1diQtEuzjzpu39edofin/+D6peSf53XNqzcWPj7f6kgQ6uWtKViKE9LFuLZEek9TH4sPr633ZqqSTHM0v5d4RURYKzI4VZ0OS9TbkMLVBkQHcPzKKL3dksjalGbt7eQfBmL+qOu8HvjN9gBakk7sG1VXGwVQr9bfX6jYe3NrA3i/rfdmi+MN0CPBgXK/2FgrMju38SFVGtOKGHKb28DVd6RLszV/+t5ei5nTPxN2tpgOv/AtUmHCDEAvTyV2D3BSoOmu9/vZaLu7QY7JqSVbWvShl59ECdhwp4K5hkXojjpaqqYYdH0LkSLVC2E64uxh4+eY+nDxTxgs/NWPVs8EZJryi1l2sf8X0AVqITu7a+Yp4lizzeyV9boGKIkhdWefT78Wn08bdmWlxutRAi6WtgsJjNl1Hprn6dfTn7uGRfLr1KJsO5jb9BJ2ugr4zYNPrkNvC+jVWopO7psoOuLWBgM7WjgQiRoB3e9hzedfMsfxSVu47yW2DO+HlZjt1xlut7e+p77WNbMhhan+8NprIQC+e+HoPJeVVTT/B2OfBxQN+eqJVDq7q5K4ZB1P7qgJe1uZkUGVY036BswUXPbV4w2GchOCOoRHWic2eFGTAwdUw4HYwuFg7GrNwdzHw0s19OH76LC+tbEb3jHcwXP0XOPQrJK8wfYBmZgO/zZpVVZXDyX220SVTK+YWNch3QZW+wtJKvkg4xuS+obT31WV9W+zchhy3WzsSsxoYEcDtV0WwZPMRtqY3Y6/UgfdCcC9Y+edWVxJYJ3dHd2q/SqTWHky9UEhfaNv1oq6ZT7cdpbSimnv09MeWqypXNVSirwPfMGtHY3ZPjI+mY4AnT3y9h7MVTSwuZnCGCS+rsYn4/5gnQDPRyd3RnSvza0MtdyGgzzQ4sgEKM6moquHDTYcZ3iWQnqFtrB1d65e0HErzYGDrXZHaFJ6uzvx7ah+O5JXyyi/N2Ig9Yhj0uRU2LWhVVSN1cnd0WYngEQB+NlYyN+Zmdbv3K77fncWpM+XcM0KXGjCJ7YvV4HnkaGtHYjFXdW7LrCEdeX/jYXYcKWj4DZca+zwY3OCnJ1vN4KpO7o7ueKLqb7e1zRkCoiB8IHLvFyyKT6dbO29GdQuydlSt38l9cGyLWrRkCwPoFvTUdT0I9fXgia92U1bZxO4Zn/Zw9Z/h4CpI+dE8AZqYY/3vaherKFUFo2ypv/1CMdMQp/ZTcyqJe4ZHIWztD1BrlLAYnN1bzYYc
puTt5swLN8VwKKeE+aubURhs0H1q8/ifnmoVg6sNJnchRAchxBohxAEhxH4hxCPG4wFCiFVCiDTjrb/xuBBCLBBCHBRC7BFC2FBnrnaRk3tA1thWf/uFet1INU7McN/ClH6h1o6m9btwQw5Pxyy4NrJbELfGdWDh+kPsPna6aW82uMD1r0DhUdjwqnkCNKHGtNyrgD9KKXsAQ4C5QoiewFPAr1LKrsCvxscA1wFdjV/3AW+bPGrNNKxd5rcBqSXurK+OYarrZtx0qYGW27MMKopbdWlfU/jrxB4E+7jz+Fe7m741X8RwNVV342uQn26eAE2kweQupTwhpdxpvF8EHADCgCnAEuPLlgA3GO9PAT6SyhbATwgRYvLItZY7vhN8QtQuNDbovfh0fhAjaFN+UvUT25qyM1B4XO37WlNj7WjqJyUkvK+mmdrSmgYraOPuwgs3xZB6qpg3fmtGaYGx/1Ct+J+eavi1VtSkNdxCiAigH7AVaCelPAHqD4AQItj4sjDg2AVvyzQeO3HJue5Dtezp2NHGZmo4iqxEm+2SyS4q49vELGb1nwzJ76vuhE5DrR3WeZkJ8NEU1RIGQKgSDm4+4G68veyx7/nHbUJVqQUng/ljLc6BTa9BdhJMft32Bs+t4OruwdzUP4y31h5iXK/29A7zbfyb24TA6Kfgl79Byk9qvYANanRyF0J4A18Dj0opz9QzuFXXE5fNHZJSLgQWAsTFxbWOuUX2pKwQ8tLU/F0b9PHmI1TW1DB7VG/gekj6Fq57CZxtYCPsnBRYejN4BcK1/1At9/Ii1ZIvL1LbFZafgZIc9dG93Hi8quzi8/hHwKD7od8slfBN7fRRVfhq58fq2r2nQsw001+nlXp6Yk/i03J5/Ks9LJ87DFfnJswvGfwAJH6i6s5EjVY1aGxMo5K7EMIFldiXSin/Zzx8SggRYmy1hwC1lfEzgQtL9oUDWaYKWDORE7vVbZjt9befrajmky1H+F2PdkQGeqmEtPdLVQulu5WLXBVmwsc3qv1eZ3+jpmw2VlWFMfkXQtYu2Pou/PxnWPMvleAH3QdtTVC8LTsZNs431sUX0PdWGPaoXZX1NQU/T1f+eUNv7v94B++sO8TD1zTh+2NwUStXl0yCDfPVNEkb05jZMgJYDByQUv73gqe+A2oLU9wOLL/g+BzjrJkhQGFt941mQ2rL/IbYXnL/amcmBaWV53da6nw1eLaFvV9YN7DSfPj4JpWgZ33dtMQO6lOHV1v1vt43wd0/w71roPv1amHR6wPg01shfW3zFspk7oDPZ8Jbg9Uq1EH3wSO7YMqbOrFfwbhe7ZnUN5TXf0tr+sbakSPVp6ENr0L+YfME2AKN+RwyDJgNjBFC7DJ+TQBeBMYKIdKAscbHAD8C6cBBYBHwoOnD1losayf4dVLJxoZU10je33CYvh38GBjhrw4aXKDXTap/s6yZO9u3VEUJLL1FVVOc/qnaktAUwvrDTQth3j4Y+fj5vvy3h8KOJVfctOQcKeHQGtWCfG8MZGyAUU/Co/tg/AvgG26aOO3Yc5N70cbdhce/3ENVdRMHxq/9p/r5XNkKW+5Syg1SSiGl7COljDV+/SilzJNSXiOl7Gq8zTe+Xkop50opO0spY6SUCeb/Z2hNlpVok1MgVx84xeHcEu4dEXnxoqWYW1S/sTVKr1ZXwhdz1B/EmxdD5AjTX8Onvdq7c95+mPIWCAN8/zD8tyesfk7NyrlQTY3asWrRGPj4BshJVYlm3j5VptbG/mjbsgAvV56f0pu9xwtZGN/E6Y1tQmHUE5D6ExywrbLAescDR1SSqwbbbHAHnvfi0wnz82D8pfujdhikPmns+cKyqytrauDbB1V//6TX1Cbe5uTiDv1mqn/jkY2w5W31sX/TAug5RQ3A5h9Sx3JTwT8SJs5Xr3d2M29sduz6PiGs2NOe//ySSqC3W9N2+hr8ezW+8e3vIbAbBHUzX6BNoMsPOKKsXerWxlruu46dZntG
AXcNj8TZcMmPphCq9X54HRSdskxAUsIvf1V9/WP+BgPusMx1Qf17I4bD9KWq33zwA2pbvPevVUnE4AZTF8NDCRB3p07sJvDKLX0Z2rktT3y1h7fWHkQ2dtzD2VV11Tm7wWe3qrEZG6CTuyPK2gkICIm1diQXeXPNQXzcnbl14BVaTX2mqXIJ+/9X9/OmtuFV2PKWSqwj/mSZa9bFPwLG/QseS1KDozO/hgfiVeVMg/7wbSpebs4svn0gk/uG8tLKFP6x4gA1NY1M8H4d4dalajbVl7errjwr08ndEWUlqtkT5phb3UzxaTmsSjrFA6M6432l/VGDoqF9H9U1Y247P4Jfn4PeN8O4F2xj4Y+bj5oy2fV3thGPHXJ1dmL+rbHcOSyC9zceZt4Xu6ioauQga8fBquvu8HpVGtjKdHJ3RMd32lSXTGV1Dc9+t59ObT0brtneZ5r65GHOTROSf4DvH4HO18ANbztcaVxH5+QkeHpiT54YH83yXVncvWR74zfYjr0Nhj6sqm9uW2TeQBugf2odzZkTUHzSpsoOLNmUwaGcEp6e2BM35waW4/eeCgjztd4zNsJXd6k/ftM+so0VsZrFCSF4cHQXXprah40Hc7lt0Rbyissb9+bfPQvdxqvW+6E15gyzXjq5O5pz2+rZRss9u6iM+avTuDo6iGt6tGv4DW1C1VTEvV+Yfkeck/vgsxng2wFu+xLcvE17fq3VmTawA+/OjiP5ZBG3vLOZzIJG1HF3MsBNi9TMmS9vt9rWfDq5O5qsRDWHun2MtSMB4KWVKZRXVfP3iT0b/6aYaapmS+0qW1MoyIBPbgJXL1VWQM8T14zG9mzHJ/cMJre4nKlvb2rcSlb3NnDb5+p37dNb4WwTa8ebgE7ujub4TgjuAa6e1o6EnUcL+GpHJncPjyIqqAmt5J6T1VRAU5UjKM5R9WKqylVi92vCHGfNIQyMCODLB1RV0lve2cy2w42Y7ugfAbd+ohoOX90J1Y3stzcRndwdiZQ2szK1pkby7Hf7CfZx46ExXZr2Zndf6DYO9n3d8l+YsjOwdKoai5j5JQR3b9n5NLsV3d6Hr38/lCAfN2Yv3sov+082/KaIYTDxv3DoN7VmwoJ0cnckp4/A2XybSO5f7jjGnsxC/jKhx5WnPtanzzRVUvfw2uYHkfoLvDtC9bVP+0itgtW0eoT7e/LVA0PpHtKGBz7ZwbLtRxt+U/85MGQubH0HEj4wf5BGOrk7kto+aivvxFN4tpKXVqYQ18mfKbHN3Bu167Vq84s9Xzb9vaePquqJn94CBleYsxy6Xdu8ODSHE+Dlyqf3DGZ41yCe/Hovb65pxGrWsc9Dl9/Bj3+Cw/EWiVMnd0eSlaiSWXAvq4Yxf3Uq+aUVPDu5F/Vs+lI/ZzfV9568ovE70VdVQPx/4Y1B6mPy756FBzaapxCYZte83Jx5b04cU2JDefnnFJ77Pqn+1awGZ7j5fVXu+YvZFikRrJO7I8lKhHa9rTp3O/VUER9tPsKMQR2btrVZXfpMU9vcpfzY8GvT18E7w9Sq0y7XwNxtMHyenseuNZursxOvTovlrmGRfLgpgznvb+NYfj0NDXdfmPG5uv/ZdLOXr9bJ3VHU1KiCYVbskpFSDaJ6uznz+LXRLT9hp+HgE2rccegKik7CV3fDR5OhukLNX5++VM+I0UzCyUnw94k9+NeNvUk8WsD4+ev5eHPGlVvxbTur8Z28g/D13VBTbb7YzHZmzbbkHYSKIqsOpq7cd5JNh/L407Xd8PcyQYvZyQlipqpyvCV5Fz9XXQWb34LX4+DA9zDqKXhwi+5b10xOCMHMwZ34ed5I+nfy5+/L9zNj0RYyckvqfkPkSLUfcNovsOpps8Wlk7ujyEpUt1YqO3C2opp//nCA7u19mDGoo+lOHDMNaqog6Zvzx45ugYWj1P6kHQfDg5vVHpc2uImxZj/C/T356K5BvDS1D0knzjD+tfW8F59OdV2t+IF3q20QN7+hNto2
A53cHUXWTnDxVEuireCddYc4fvosz03udXmt9pZoHwNB3dWsmZJc+HYuvD9OrQic9jHM/Mo0m05rWiMIIZg2sAOr5o1iWOdA/vnDAW55ZxMHs4svf/G4FyBqNHz/qNpe0cR0cncUWYkQ0tcq9b+P5ZfyzrpDTOobyuAoEy/rr93E49gWeL0/7Pkchj0KD21Ts2l0aVzNCtr7uvPe7XHMvzWW9NwSJiyI5621By/eo9XgDDd/oOolfTFHrZQ2IZ3cHUF1FZzYY7X+9n/9cAAnIfjLBDOt/uwzDZw9VK33BzbC2OdUjRhNsyIhBDf0C+OXeSMZEx3MSytTuPGtTRw4ccEsGc8AuPVjKM0zeYkCndwdQU4yVJ21Sn/7hrRcVu4/yUNjuhDia6Y+b7+O8Hga3P69Lh+g2ZxgH3femT2At2b2J+v0WSa/sYH5q1PPbwIS0hcmvgoZ8fDb8ya7rk7ujsBKZX4rq2t49nu1CcfdwxvYhKOl3Hx0F4xm0ybEhLDqsVFMiAlh/uo0Jr+xgb2ZherJ2Nsg7i7Y+BokLTfJ9XRydwRZiWqpfkCURS+7ZFMGB7OL+fv1PXF3aWATDk1zAAFerrw2vR+L5sSRX1LBDW9t5OWfk1Vf/PgXISwOvn0QclJbfC2d3B3B8Z0QGmvR7eJyisp5bXUao6ODuKZHsMWuq2mtwdie7Vg1bxQ39gvjzTWHuHtJAkVVTsbdv9xh2SwoL2rRNXRyt3dV5XBqv8W7ZF5amUxZVTVPT+zZ/PoxmmbHfD1deOWWvvzfjTFsOJjL1Lc3cazaX9WgyUuD5Q+1aLcxndzt3al9UFNp0bIDu46d5ssdmdw1PLJpm3BomgO6bXBHltw5iBOFZdz41kZ2OvdRRe2SvoXNbzb7vDq527vjlh1MramRPLN8H8E+bvxhTFeLXFPTWrvhXQP55sGheLo6M33hFr73uhl6TFLlCZpZIlgnd3uXtQs8A9Wmzxbw1Y5MdmcW8ucJ3Zu3CYemOaguwT58O3cYfcJ8+cPnu3jH74/Itp3V/PczWU0+n07u9i5rp2q1W6Dfu7C0kn+vTGZghD83xIaZ/XqaZm8CvFxZeu9gbuwXxotrsnixzV+RlWfhi9vVfgRN0GByF0K8L4TIFkLsu+BYgBBilRAizXjrbzwuhBALhBAHhRB7hBDW3fLH0VWUqAVMFupvf3V1KgUt3YRD0xycm7OB/07ry2Nju/Fukgv/9XgYMrfBz39p0nka03L/EBh/ybGngF+llF2BX42PAa4Duhq/7gPeblI0mmmd2AOyxiL97QdOnOGjzRnMGtKJXqEt3IRD0xycEIKHr+nK6zP68W5eHz53ngLbF8Huzxt9jgaTu5RyPZB/yeEpwBLj/SXADRcc/0gqWwA/IURIo6PRTGvnElVzpcNgs15GSskzy/fj5+nKY2OtU3VS0+zRpL6hfH7fEF6Vt7GdnlR/9wic3Nuo9za3z72dlPIEgPG2dpVKGHDsgtdlGo9dRghxnxAiQQiRkJNj2mpoGpB3CPZ8oepGewaY9VLf7c5iW0Y+T4yLxs9Tb1unaabUv6M/X80dySs+T5FT5UHRkulwtqDB95l6QLWujtY6Z+FLKRdKKeOklHHSvQ1llebbbsohxf8XDC4w9GGzXqaorJJ//XCAvuG+TIvTW9dpmjl0CPDkvbkTWNj+WdxKT5D6zkyqq+vPmc1N7qdqu1uMt9nG45nAhb/h4UCDc3hOnSlj7Kvr+Hn/SWQLVmRpRvmHYfdnMOBO8Gln1ku9/ttBsovKeW5Kb5yc9CCqppmLj7sLf7lvDqs6PEK3wo18//pj9b6+ucn9O+B24/3bgeUXHJ9jnDUzBCis7b6pT2SgFx4uBu7/eAezF28j7VTLaio4vPj/gJMzDHvErJc5mF3E+xsOc2tcB2I7+Jn1WpqmgbPBievvfpqDIROZXLCk3tc2ZirkZ8BmIFoIkSmEuBt4ERgrhEgDxhofA/wIpAMHgUXAg40J2NvN
mR8fHsFzk3uxJ/M041+L5/nvkyg8W9mYt2sXKjhibLXfDm3MN5YtpeTZ75LwdDXwxPhos11H07RLCEGXOxdR6l//752whW6QuLg4mZCg9hDML6nglV9S+GzbUQI8XXl8XDS3xHXAoD/yN873j8CuT+HhXeBrvoVEP+09we+X7uS5yb24fWiE2a6jadoVlOYjvNrukFLG1fW0za1QDfBy5f9ujOH7h4YTFeTFU//byw1vbmTHkUtnY2qXOX0MEpdCv9lmTexnK6r5x4okurf3Yebgjma7jqZp9WhgFpzNJfdavcN8+eL+q3hteiw5ReVMfXsz85bt4tSZMmuHZrs2vKpuh88z62XeWnuQrMIynp/SG2eDzf4IaZpDs+nfTCEEU2LD+PWPo3jo6i78sOcEV7+ylrfXHqK8Sk+dvEjhcUj8GPrNAj/zTUnMyC3h3XXp3NgvjEGR5p0/r2la89l0cq/l5ebMn8ZFs+qxkQzrEsi/VyYz7tX1rE46padO1trwqio1YOZW+/MrknAxCP58nd6IWtNsWatI7rU6tfVi0Zw4PrprEAYnwT0fJTD17U2sTjpFTY0DJ/kzWarUQOxt4N/JbJf59cApfkvO5tHfdSO4jbvZrqNpWsu1quRea2S3IFY+OpJ/TOlFdlE593yUwPjX1vNNYiaV1TXWDs/yNr4GNdUw4o9mu0RZZTXPfZ9El2Bv7hgWYbbraJpmGq0yuQO4GJyYfVUEa/40mldv7QvAvGW7Gf3yWpZsyuBshYP0yRedhB0fQt8Z4B9htsssWp/O0fxSnp3UCxc9iKppNq/V/5a6GJy4sV84Kx8ZyeLb42jv684z3+1n+L9/443f0ux/IdTGBVBdCSPN12rPLCjlzbUHmRDTnuFdA812HU3TTMdu9kFzchJc06Md1/Rox7bD+by99iCv/JLKO+vSmTm4I3cPj7S/fuLibEh4H/pMg4Aos13mXz8cAOCv1/c02zU0TTMtu0nuFxoUGcCgyEEkZZ3hnXWHWBSfzgcbM5g6IJz7R0YREehl7RBNY+NrUF0OIx832yXi03L4ad9J/nRtN8L8PMx2HU3TTMvmyg+Yw5G8EhauT+fLHZlUVdcwISaEuVd3oUdIG7Nd0+yKc+C1PmqH9JsWmuUSFVU1jH9tPdU1kp8fHYm7i8Es19E0rXmEEK2n/IA5dGrrxb9ujGHDk1dz38jOrE3JYeLrG3hzzUGqW+sUys2vQ+VZGPEns13ig42HSc8p4ZlJPXVi17RWxiGSe61gH3eeuq47G58cw3W92/PyzynMeX8r2a2tpEFJHmx7D3pPhSDzbGt3srCMBb+mcU33YMZ0N29NeE3TTM+hknstX08XXp/Rj39PjWHHkQKuey2eNSnZDb/RVmx+AypLzdbXXni2kse/2k1lteTpSXoQVdNaI4dM7qDq1tw6sCMr/jCcIB837vxgO/9ckURFlY0vgirNh20LodcNEGz6EgCbD+Vx3fz1bD6Ux9OTetKprZ0MPmuag3HY5F6rS7AP384dxpyrOvHehsNMfXsTGbkl1g7ryra8BRXFMPIJk562oqqGF346wG3vbcHNxcDXvx/KrCHmK2WgaZp5OXxyB3B3MfD8lN68O3sAR/NLuX5BPN8kZlo7rMudLYCt70KPydDOdN0lB7OLuPGtjby7Lp3pAzvyw8PD6au3zdO0Vs02knv+ITi519pRMK5Xe356ZAS9Qn2Zt2w3j32xi5LyKmuHdd6Wt6H8DIx60iSnk1Ly8eYMJr6+gazTZ1k4ewAv3BSDp6tdLn/QNIdiG8m9ohTeGQFf3wsFGVYNJdTPg0/vHcwj13Tl28TjTHx9A/uOF1o1JgDOnoYt70D3idC+d4tPl1NUzt1LEvj78v0MimzLz4+O5Npe7U0QqKZptsA2kntwTxj+KBz4Hl6Pg5+ehJJcq4XjbHBi3thufHrvEM5WVHPjWxtZvOGwdWvHb30XygthVMv72n89cIrx89ez4WAuz07qyYd3
DLS/0gya5uBsa4XqmSxY+yIkfgIuHjD0YbhqLrh5Wy22gpIKHv9qD6sPnGJM92BevrkPbb3dLBtEWSHMj4FOw2DGZ80+zdmKav71YxKfbDlK9/Y+LJjRj27tfEwYqKZpltR6Vqi2CYXJC+DBLdD5alj7f7AgFrYtgqoKq4Tk7+XKojkDeG5yLzak5XLda/F8m3jcslMmty5UCb4FrfZ9xwuZ+Ho8n2w5yr0jIln+0DCd2DXNjtlWy/1Sx7bD6mfhyAZVq3zM36HXTeBknb9JSVlneOyLXSSfLKJdGzfmXBXBbYM64u/lap4LZh+AdS/B/m+g2zi4bVmTT1FdI1m4Pp3/rkqhrZcb/5nWl2FddNleTbMH9bXcbTu5A0gJB1erJH9qH7TvA2Ofg85jLBpjrZoaybq0HN7fcJj4tFzcXZy4qX84dw2LoEuwiVrCp5Jg3b8haTm4esGge2HYo+DR+OmJxeVVbD6Ux3vxSgV1CgAACAhJREFU6Ww9nM+EmPb8340x+Hma6Q+RpmkW17qTe62aGtj7Jfz2Tyg8CpGj4HfPQlh/S4RYp5STRXyw8TD/M3bTjOoWxN3DIxnRNRAhRNNPeGr/BUndBwbfB1c9BJ4BDb61pkayL6uQ9ak5rE/LZeeRAqpqJD7uzjwzqRdT+4c1LyZN02yWfST3WlXlaoOK9S9DaR74dYSQvhASa/zqC95B5g34EnnF5SzdepSPNh8ht7icrsHe3DU8khv7hTWumuLJvSqpH/ge3NrA4PthyIMNJvUThWeJT8tlfWoOGw/mUlCqdp3qHdaGkV2DGNE1iP6d/HBz1hUdNc0e2Vdyr1V2BhI/hsztkLULCg6ff84nVCX5UGOyD+kLPiFg5pZreVU1K3afYPGGwySdOIO/pwszB3dizlWd6p5qeGK36lNPXgFuvjDkARjye/Dwr/P8Zyuq2ZaRz/rUHOLTckg9VQxAsI8bI7oGMbJbIMO7BFp+No+maVZhn8n9UmdPqxbwid3nv3JTAeO/zyvofKIPiYXgHuDdDtx8TJ70pZRsPZzP4g2HWX3gFM5Ogol9QpnaPxwXg8AtZy/hexYQePxXKl18ONR5DqkRsyh18qayRlJZVUNVTQ2V1ZLK6hoqqmrYk1nItox8KqpqcHV2YnBkgGqddwskup2P7nLRNAfkGMm9LuXFahD2woSffQBk9fnXOLuDdzB4Batk7x2kbr2C1PFz99s1a779kbwSPtiYwZcJR4mqPMjDzv9jrGEnhdKTxVUT+LB6HGeov/KiwUnQOciLkV2DGNktiEGRAXrzDE3TLJ/chRDjgdcAA/CelPLF+l5v7m32LlJ5FrKTIDdNbTBdfApKctRtsfG2NI9zLf4LuXiqRG9wgZpq9UeipsZ4W1XHserzt8bzVbr6crLn3eT1vgODux8uzgJnJydcDU44GwQuBidcjLfOBoGLkxNOTrpVrmna5epL7iavECWEMABvAmOBTGC7EOI7KWWSqa/VLC4eEDZAfV1JdRWU5hqTfzaUZJ9P/iXZKpE7OYMwgJMBhJPx1qCOX3bMeNyzLS59bqWDexs6WO5frGmaAzJH+b9BwEEpZTqAEOJzYApgG8m9MQzO4NNefWmaprVC5ljqGQYcu+BxpvHYRYQQ9wkhEoQQCTk5OWYIQ9M0zXGZI7nX1UF8WQe2lHKhlDJOShkXFGTZeemapmn2zhzJPRMu6lIOB7LMcB1N0zTtCsyR3LcDXYUQkUIIV2A68J0ZrqNpmqZdgckHVKWUVUKIh4CfUVMh35dS7jf1dTRN07QrM8tmmVLKH4EfzXFuTdM0rWG2tVmHpmmaZhI6uWuaptkhm6gtI4QoAlKsHYeNCwSst2t466C/R/XT35+GtbbvUScpZZ1zyc3S594MKVeqj6ApQogE/T2qn/4e1U9/fxpmT98j3S2jaZpmh3Ry1zRNs0O2ktwXWjuAVkB/jxqmv0f109+fhtnN98gmBlQ1TdM007KVlrumaZpmQjq5
a5qm2SGrJ3chxHghRIoQ4qAQ4ilrx2NrhBAZQoi9QohdQggL7UVo24QQ7wshsoUQ+y44FiCEWCWESDPe+lszRmu7wvfoWSHEcePP0i4hxARrxmhtQogOQog1QogDQoj9QohHjMft4mfJqsn9gi35rgN6AjOEED2tGZONulpKGWsv829N4ENg/CXHngJ+lVJ2BX41PnZkH3L59wjgVePPUqyxBpQjqwL+KKXsAQwB5hrzj138LFm75X5uSz4pZQVQuyWfpl2RlHI9kH/J4SnAEuP9JcANFg3Kxlzhe6RdQEp5Qkq503i/CDiA2jXOLn6WrJ3cG7Uln4OTwC9CiB1CiPusHYwNayelPAHqlxYItnI8tuohIcQeY7dNq+xuMAchRATQD9iKnfwsWTu5N2pLPgc3TErZH9V1NVcIMdLaAWmt1ttAZyAWOAH8x7rh2AYhhDfwNfColPKMteMxFWsnd70lXwOklFnG22zgG1RXlna5U0KIEADjbbaV47E5UspTUspqKWUNsAj9s4QQwgWV2JdKKf9nPGwXP0vWTu56S756CCG8hBA+tfeBa4F99b/r/9u7e9YoojAMw/eDFmqVwlpEsBJE0EZQCP4DUcuwWvkTBJtUdiI2giAWWijEwq8ugig2NoKQwk4XOyu1sBDUYzFnSSIoG9g4yZn7qs7OLMu7w+zDcGbOu4P1BBjV8Qh43GMtW9IksKrTDPxcShLgNvCulHJtza4mzqXeV6jWx7Gus/qXfFd6LWgLSXKA7modug6e9zw+kOQ+ME/XnvUTsAg8ApaAfcBH4FwpZbA3FP9yjObppmQKMAYuTuaWhyjJCeAVsAL8qpsv0827b/tzqfdwlyTNXt/TMpKkTWC4S1KDDHdJapDhLkkNMtwlqUGGuwYpyf61HROl1hju0owk2dl3DdKE4a4h25HkVu3lvZxkd5IjSV7X5loPJ821krxIcqyO9yYZ1/H5JA+SPAWW+/sq0nqGu4bsIHCjlHII+AKcAe4Cl0oph+lWLi5O8TnHgVEp5dSmVSptkOGuIftQSnlbx2/oOibOlVJe1m13gGm6cD7bjsvT1TbDXUP2fc34JzD3j/f+YPX3suuPfd9mWZQ0C4a7tOor8DnJyfp6AZhcxY+Bo3V89j/XJW2Yd/el9UbAzSR7gPfAhbr9KrCUZAF43ldx0rTsCilJDXJaRpIaZLhLUoMMd0lqkOEuSQ0y3CWpQYa7JDXIcJekBv0GkF4pEJ9sF/4AAAAASUVORK5CYII=\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# combine the two plots\n", "bikes.groupby(['hour', 'workingday']).total.mean().unstack().plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Write about your findings" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Exercise 7.3\n", "\n", "Fit a linear regression model to the entire dataset, using \"total\" as the response and \"hour\" and \"workingday\" as the only features. Then, print the coefficients and interpret them. What are the limitations of linear regression in this instance?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Exercice 7.4\n", "\n", "Create a Decision Tree to forecast \"total\" by manually iterating over the features \"hour\" and \"workingday\". The algorithm must at least have 6 end nodes." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Exercise 7.5\n", "\n", "Train a Decision Tree using scikit-learn. Comment about the performance of the models." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Part 2 - Bagging" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Mashable news stories analysis\n", "\n", "Predicting if a news story is going to be popular" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
urltimedeltan_tokens_titlen_tokens_contentn_unique_tokensn_non_stop_wordsn_non_stop_unique_tokensnum_hrefsnum_self_hrefsnum_imgs...min_positive_polaritymax_positive_polarityavg_negative_polaritymin_negative_polaritymax_negative_polaritytitle_subjectivitytitle_sentiment_polarityabs_title_subjectivityabs_title_sentiment_polarityPopular
0http://mashable.com/2014/12/10/cia-torture-rep...28.09.0188.00.7326201.00.8442625.01.01.0...0.2000000.80-0.487500-0.60-0.2500000.90.80.40.81
1http://mashable.com/2013/10/18/bitlock-kicksta...447.07.0297.00.6531991.00.8157899.04.01.0...0.1600000.50-0.135340-0.40-0.0500000.1-0.10.40.10
2http://mashable.com/2013/07/24/google-glass-po...533.011.0181.00.6603771.00.7757014.03.01.0...0.1363641.000.0000000.000.0000000.31.00.21.00
3http://mashable.com/2013/11/21/these-are-the-m...413.012.0781.00.4974091.00.67735010.03.01.0...0.1000001.00-0.195701-0.40-0.0714290.00.00.50.00
4http://mashable.com/2014/02/11/parking-ticket-...331.08.0177.00.6857141.00.8303573.02.01.0...0.1000000.55-0.175000-0.25-0.1000000.00.00.50.00
\n", "

5 rows × 61 columns

\n", "
" ], "text/plain": [ " url timedelta \\\n", "0 http://mashable.com/2014/12/10/cia-torture-rep... 28.0 \n", "1 http://mashable.com/2013/10/18/bitlock-kicksta... 447.0 \n", "2 http://mashable.com/2013/07/24/google-glass-po... 533.0 \n", "3 http://mashable.com/2013/11/21/these-are-the-m... 413.0 \n", "4 http://mashable.com/2014/02/11/parking-ticket-... 331.0 \n", "\n", " n_tokens_title n_tokens_content n_unique_tokens n_non_stop_words \\\n", "0 9.0 188.0 0.732620 1.0 \n", "1 7.0 297.0 0.653199 1.0 \n", "2 11.0 181.0 0.660377 1.0 \n", "3 12.0 781.0 0.497409 1.0 \n", "4 8.0 177.0 0.685714 1.0 \n", "\n", " n_non_stop_unique_tokens num_hrefs num_self_hrefs num_imgs ... \\\n", "0 0.844262 5.0 1.0 1.0 ... \n", "1 0.815789 9.0 4.0 1.0 ... \n", "2 0.775701 4.0 3.0 1.0 ... \n", "3 0.677350 10.0 3.0 1.0 ... \n", "4 0.830357 3.0 2.0 1.0 ... \n", "\n", " min_positive_polarity max_positive_polarity avg_negative_polarity \\\n", "0 0.200000 0.80 -0.487500 \n", "1 0.160000 0.50 -0.135340 \n", "2 0.136364 1.00 0.000000 \n", "3 0.100000 1.00 -0.195701 \n", "4 0.100000 0.55 -0.175000 \n", "\n", " min_negative_polarity max_negative_polarity title_subjectivity \\\n", "0 -0.60 -0.250000 0.9 \n", "1 -0.40 -0.050000 0.1 \n", "2 0.00 0.000000 0.3 \n", "3 -0.40 -0.071429 0.0 \n", "4 -0.25 -0.100000 0.0 \n", "\n", " title_sentiment_polarity abs_title_subjectivity \\\n", "0 0.8 0.4 \n", "1 -0.1 0.4 \n", "2 1.0 0.2 \n", "3 0.0 0.5 \n", "4 0.0 0.5 \n", "\n", " abs_title_sentiment_polarity Popular \n", "0 0.8 1 \n", "1 0.1 0 \n", "2 1.0 0 \n", "3 0.0 0 \n", "4 0.0 0 \n", "\n", "[5 rows x 61 columns]" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_csv('../datasets/mashable.csv', index_col=0)\n", "df.head()" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(6000, 61)" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.shape" ] }, { 
"cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "X = df.drop(['url', 'Popular'], axis=1)\n", "y = df['Popular']" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.5" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "y.mean()" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "# train/test split\n", "from sklearn.model_selection import train_test_split\n", "X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Exercise 7.6\n", "\n", "Estimate a Decision Tree Classifier and a Logistic Regression\n", "\n", "Evaluate using the following metrics:\n", "* Accuracy\n", "* F1-Score" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Exercise 7.7\n", "\n", "Estimate 300 bagged samples\n", "\n", "Estimate the following set of classifiers:\n", "\n", "* 100 Decision Trees where max_depth=None\n", "* 100 Decision Trees where max_depth=2\n", "* 100 Logistic Regressions" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Exercise 7.8\n", "\n", "Ensemble using majority voting\n", "\n", "Evaluate using the following metrics:\n", "* Accuracy\n", "* F1-Score" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Exercise 7.9\n", "\n", "Estimate the probability as %models that predict positive\n", "\n", "Modify the probability threshold and select the one that maximizes the F1-Score" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": 
"markdown", "metadata": {}, "source": [ "# Exercise 7.10\n", "\n", "Ensemble using weighted voting using the oob_error\n", "\n", "Evaluate using the following metrics:\n", "* Accuracy\n", "* F1-Score" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Exercise 7.11\n", "\n", "Estimate te probability of the weighted voting\n", "\n", "Modify the probability threshold and select the one that maximizes the F1-Score" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Exercise 7.12\n", "\n", "Estimate a logistic regression using as input the estimated classifiers\n", "\n", "Modify the probability threshold such that maximizes the F1-Score" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" } }, "nbformat": 4, "nbformat_minor": 1 }