{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Final Project \n", "\n", "## \"Forecasting the winner of tennis matches in the Men’s ATP World Tour\"\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Problem Statement" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The goal of the project is to predict the probability that the higher-ranked player will win a tennis match. We will call that a `win`. This can be stated more formally as:\n", "\n", "For two players ${\\cal P}_1$ and ${\\cal P}_2$ where ${{\\rm Rank}_1} > {{\\rm Rank}_2}$ we calculate the probability that ${\\cal P}_1$ will win the match, or that a `win` will happen.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Dataset" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The dataset contains results for the men's ATP tour dating back to January 2000. It comes from http://www.tennis-data.co.uk/data.php (obtained from Kaggle). The original dataset is shown below." ] }, { "cell_type": "code", "execution_count": 69, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>ATP</th>\n", " <th>Location</th>\n", " <th>Tournament</th>\n", " <th>Date</th>\n", " <th>Series</th>\n", " <th>Court</th>\n", " <th>Surface</th>\n", " <th>Round</th>\n", " <th>Best of</th>\n", " <th>Winner</th>\n", " <th>...</th>\n", " <th>UBW</th>\n", " <th>UBL</th>\n", " <th>LBW</th>\n", " <th>LBL</th>\n", " <th>SJW</th>\n", " <th>SJL</th>\n", " <th>MaxW</th>\n", " <th>MaxL</th>\n", " <th>AvgW</th>\n", " <th>AvgL</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>0</th>\n", " <td>1</td>\n", " <td>Adelaide</td>\n", " <td>Australian Hardcourt Championships</td>\n", " <td>3/01/2000</td>\n", " <td>International</td>\n", " <td>Outdoor</td>\n", " <td>Hard</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " 
<td>Dosedel S.</td>\n", " <td>...</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>1</td>\n", " <td>Adelaide</td>\n", " <td>Australian Hardcourt Championships</td>\n", " <td>3/01/2000</td>\n", " <td>International</td>\n", " <td>Outdoor</td>\n", " <td>Hard</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>Enqvist T.</td>\n", " <td>...</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " </tr>\n", " <tr>\n", " <th>2</th>\n", " <td>1</td>\n", " <td>Adelaide</td>\n", " <td>Australian Hardcourt Championships</td>\n", " <td>3/01/2000</td>\n", " <td>International</td>\n", " <td>Outdoor</td>\n", " <td>Hard</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>Escude N.</td>\n", " <td>...</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " </tr>\n", " <tr>\n", " <th>3</th>\n", " <td>1</td>\n", " <td>Adelaide</td>\n", " <td>Australian Hardcourt Championships</td>\n", " <td>3/01/2000</td>\n", " <td>International</td>\n", " <td>Outdoor</td>\n", " <td>Hard</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>Federer R.</td>\n", " <td>...</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " </tr>\n", " <tr>\n", " <th>4</th>\n", " <td>1</td>\n", " <td>Adelaide</td>\n", " <td>Australian Hardcourt Championships</td>\n", " <td>3/01/2000</td>\n", " <td>International</td>\n", " <td>Outdoor</td>\n", " <td>Hard</td>\n", " <td>1st Round</td>\n", 
" <td>3</td>\n", " <td>Fromberg R.</td>\n", " <td>...</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " <td>NaN</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "<p>5 rows × 54 columns</p>\n", "</div>" ], "text/plain": [ " ATP Location Tournament Date \\\n", "0 1 Adelaide Australian Hardcourt Championships 3/01/2000 \n", "1 1 Adelaide Australian Hardcourt Championships 3/01/2000 \n", "2 1 Adelaide Australian Hardcourt Championships 3/01/2000 \n", "3 1 Adelaide Australian Hardcourt Championships 3/01/2000 \n", "4 1 Adelaide Australian Hardcourt Championships 3/01/2000 \n", "\n", " Series Court Surface Round Best of Winner ... UBW \\\n", "0 International Outdoor Hard 1st Round 3 Dosedel S. ... NaN \n", "1 International Outdoor Hard 1st Round 3 Enqvist T. ... NaN \n", "2 International Outdoor Hard 1st Round 3 Escude N. ... NaN \n", "3 International Outdoor Hard 1st Round 3 Federer R. ... NaN \n", "4 International Outdoor Hard 1st Round 3 Fromberg R. ... 
NaN \n", "\n", " UBL LBW LBL SJW SJL MaxW MaxL AvgW AvgL \n", "0 NaN NaN NaN NaN NaN NaN NaN NaN NaN \n", "1 NaN NaN NaN NaN NaN NaN NaN NaN NaN \n", "2 NaN NaN NaN NaN NaN NaN NaN NaN NaN \n", "3 NaN NaN NaN NaN NaN NaN NaN NaN NaN \n", "4 NaN NaN NaN NaN NaN NaN NaN NaN NaN \n", "\n", "[5 rows x 54 columns]" ] }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "df_atp = pd.read_csv('Data.csv')\n", "df_atp.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Features " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The features for each match that were used in the project follow:\n", "- `Date`: date of the match \n", "- `Series`: name of ATP tennis series (we kept the four main current categories namely Grand Slams, Masters 1000, ATP250, ATP500)\n", "- `Surface`: type of surface (clay, hard or grass)\n", "- `Round`: round of match (from first round to the final)\n", "- `Best of`: maximum number of sets playable in match (Best of 3 or Best of 5)\n", "- `WRank`: ATP Entry ranking of the match winner as of the start of the tournament\n", "- `LRank`: ATP Entry ranking of the match loser as of the start of the tournament" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The output variable is binary. The better player has higher rank by definition. 
We define the `win` variable by:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "${\\rm{win}} = \\left\\{ {\\begin{array}{*{20}{c}}\n", "1&{{\\rm{higher\\,\\,ranked\\,\\,player\\,\\, wins}}}\\\\\n", "0&{{\\rm{higher\\,\\,ranked\\,\\,player\\,\\, loses}}}\n", "\\end{array}} \\right.$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Importing basic modules" ] }, { "cell_type": "code", "execution_count": 70, "metadata": { "collapsed": false }, "outputs": [], "source": [ "import numpy as np\n", "import statsmodels.api as sm\n", "import matplotlib.pyplot as plt\n", "from sklearn import metrics\n", "import seaborn as sns\n", "sns.set_style(\"darkgrid\")\n", "import pylab as pl\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Pre-Processing of dataset" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After loading the dataset we proceed as follows:\n", "- Keep only completed matches, i.e. eliminate matches with injury withdrawals and walkovers.\n", "- For convenience we rename `Best of` to `Best_of`\n", "- Choose the features listed above\n", "- Drop `NaN` entries\n", "- Consider only the final two years (to avoid comparing different categories of tournaments which existed in the past). 
We note that this choice is somewhat arbitrary and can be changed if needed.\n", "- Keep only matches between higher-ranked players (both ranked in the top 150) for better accuracy (as suggested by Corral and Prieto-Rodriguez (2010) and confirmed here)" ] }, { "cell_type": "code", "execution_count": 71, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>Date</th>\n", " <th>Series</th>\n", " <th>Surface</th>\n", " <th>Round</th>\n", " <th>Best_of</th>\n", " <th>WRank</th>\n", " <th>LRank</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>39530</th>\n", " <td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Clay</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>92</td>\n", " <td>42</td>\n", " </tr>\n", " <tr>\n", " <th>39531</th>\n", " <td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Clay</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>45</td>\n", " <td>78</td>\n", " </tr>\n", " <tr>\n", " <th>39532</th>\n", " <td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Clay</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>53</td>\n", " <td>230</td>\n", " </tr>\n", " <tr>\n", " <th>39533</th>\n", " <td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Clay</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>84</td>\n", " <td>165</td>\n", " </tr>\n", " <tr>\n", " <th>39534</th>\n", " <td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Clay</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>18</td>\n", " <td>111</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Date Series Surface Round Best_of WRank LRank\n", "39530 2014-12-02 ATP250 Clay 1st Round 3 92 42\n", "39531 2014-12-02 ATP250 Clay 1st Round 3 45 78\n", "39532 2014-12-02 ATP250 Clay 1st Round 3 53 230\n", "39533 2014-12-02 ATP250 Clay 1st Round 3 84 165\n", "39534 2014-12-02 ATP250 Clay 1st Round 3 18 111" ] }, 
"execution_count": 71, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_atp['Date'] = pd.to_datetime(df_atp['Date']) \n", "# Restricting dates to the final two years\n", "df_atp = df_atp.loc[(df_atp['Date'] > '2014-11-09') & (df_atp['Date'] <= '2016-11-09')]\n", "# Keeping only completed matches\n", "df_atp = df_atp[df_atp['Comment'] == 'Completed'].drop(\"Comment\",axis = 1)\n", "# Rename Best of to Best_of\n", "df_atp.rename(columns = {'Best of':'Best_of'},inplace=True)\n", "# Choosing features\n", "cols_to_keep = ['Date','Series','Surface', 'Round','Best_of', 'WRank','LRank']\n", "# Dropping NaN\n", "df_atp = df_atp[cols_to_keep].dropna()\n", "# Dropping errors in the dataset and unimportant entries (e.g. there are very few entries for Masters Cup)\n", "df_atp = df_atp[(df_atp['LRank'] != 'NR') & (df_atp['WRank'] != 'NR') & (df_atp['Series'] != 'Masters Cup')]\n", "df_atp.head()" ] }, { "cell_type": "code", "execution_count": 72, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>Date</th>\n", " <th>Series</th>\n", " <th>Surface</th>\n", " <th>Round</th>\n", " <th>Best_of</th>\n", " <th>WRank</th>\n", " <th>LRank</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>39530</th>\n", " <td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Clay</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>92</td>\n", " <td>42</td>\n", " </tr>\n", " <tr>\n", " <th>39531</th>\n", " <td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Clay</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>45</td>\n", " <td>78</td>\n", " </tr>\n", " <tr>\n", " <th>39532</th>\n", " <td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Clay</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>53</td>\n", " <td>230</td>\n", " </tr>\n", " <tr>\n", " <th>39533</th>\n", " <td>2014-12-02</td>\n", " <td>ATP250</td>\n", " 
<td>Clay</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>84</td>\n", " <td>165</td>\n", " </tr>\n", " <tr>\n", " <th>39534</th>\n", " <td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Clay</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>18</td>\n", " <td>111</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Date Series Surface Round Best_of WRank LRank\n", "39530 2014-12-02 ATP250 Clay 1st Round 3 92 42\n", "39531 2014-12-02 ATP250 Clay 1st Round 3 45 78\n", "39532 2014-12-02 ATP250 Clay 1st Round 3 53 230\n", "39533 2014-12-02 ATP250 Clay 1st Round 3 84 165\n", "39534 2014-12-02 ATP250 Clay 1st Round 3 18 111" ] }, "execution_count": 72, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Transform strings into numerical values\n", "df_atp[['Best_of','WRank','LRank']] = df_atp[['Best_of','WRank','LRank']].astype(int)\n", "df_atp.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We now create an extra column for the variable `win` described above, using an auxiliary function `win(x)`." ] }, { "cell_type": "code", "execution_count": 73, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>Date</th>\n", " <th>Series</th>\n", " <th>Surface</th>\n", " <th>Round</th>\n", " <th>Best_of</th>\n", " <th>WRank</th>\n", " <th>LRank</th>\n", " <th>win</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>39530</th>\n", " <td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Clay</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>92</td>\n", " <td>42</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>39531</th>\n", " <td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Clay</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>45</td>\n", " <td>78</td>\n", " <td>1</td>\n", " </tr>\n", " <tr>\n", " <th>39532</th>\n", " 
<td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Clay</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>53</td>\n", " <td>230</td>\n", " <td>1</td>\n", " </tr>\n", " <tr>\n", " <th>39533</th>\n", " <td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Clay</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>84</td>\n", " <td>165</td>\n", " <td>1</td>\n", " </tr>\n", " <tr>\n", " <th>39534</th>\n", " <td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Clay</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>18</td>\n", " <td>111</td>\n", " <td>1</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Date Series Surface Round Best_of WRank LRank win\n", "39530 2014-12-02 ATP250 Clay 1st Round 3 92 42 0\n", "39531 2014-12-02 ATP250 Clay 1st Round 3 45 78 1\n", "39532 2014-12-02 ATP250 Clay 1st Round 3 53 230 1\n", "39533 2014-12-02 ATP250 Clay 1st Round 3 84 165 1\n", "39534 2014-12-02 ATP250 Clay 1st Round 3 18 111 1" ] }, "execution_count": 73, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def win(x):\n", "    if x > 0:\n", "        return 0\n", "    elif x <= 0:\n", "        return 1 \n", "    \n", "df_atp['win'] = (df_atp['WRank'] - df_atp['LRank']).apply(win)\n", "df_atp.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Following Corral and Prieto-Rodriguez (2010), we restrict the analysis to higher-ranked players (in our tests, the analysis including all players indeed had less predictive power)." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>Date</th>\n", " <th>Series</th>\n", " <th>Surface</th>\n", " <th>Round</th>\n", " <th>Best_of</th>\n", " <th>WRank</th>\n", " <th>LRank</th>\n", " <th>win</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>39530</th>\n", " 
<td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Clay</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>92</td>\n", " <td>42</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>39531</th>\n", " <td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Clay</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>45</td>\n", " <td>78</td>\n", " <td>1</td>\n", " </tr>\n", " <tr>\n", " <th>39534</th>\n", " <td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Clay</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>18</td>\n", " <td>111</td>\n", " <td>1</td>\n", " </tr>\n", " <tr>\n", " <th>39536</th>\n", " <td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Clay</td>\n", " <td>2nd Round</td>\n", " <td>3</td>\n", " <td>17</td>\n", " <td>108</td>\n", " <td>1</td>\n", " </tr>\n", " <tr>\n", " <th>39537</th>\n", " <td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Clay</td>\n", " <td>2nd Round</td>\n", " <td>3</td>\n", " <td>14</td>\n", " <td>58</td>\n", " <td>1</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Date Series Surface Round Best_of WRank LRank win\n", "39530 2014-12-02 ATP250 Clay 1st Round 3 92 42 0\n", "39531 2014-12-02 ATP250 Clay 1st Round 3 45 78 1\n", "39534 2014-12-02 ATP250 Clay 1st Round 3 18 111 1\n", "39536 2014-12-02 ATP250 Clay 2nd Round 3 17 108 1\n", "39537 2014-12-02 ATP250 Clay 2nd Round 3 14 58 1" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "newdf = df_atp.copy()\n", "newdf2 = newdf[(newdf['WRank'] <= 150) & (newdf['LRank'] <= 150)]\n", "newdf2.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Best_of = 5\n", "We now restrict our analysis to matches of Best_of = 5. Since only Grand Slams have 5 sets we can drop the new `Series` column. The case of Best_of = 3 will be considered afterwards." 
] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>Date</th>\n", " <th>Surface</th>\n", " <th>Round</th>\n", " <th>WRank</th>\n", " <th>LRank</th>\n", " <th>win</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>41911</th>\n", " <td>2015-01-19</td>\n", " <td>Hard</td>\n", " <td>1st Round</td>\n", " <td>85</td>\n", " <td>84</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>41912</th>\n", " <td>2015-01-19</td>\n", " <td>Hard</td>\n", " <td>1st Round</td>\n", " <td>11</td>\n", " <td>90</td>\n", " <td>1</td>\n", " </tr>\n", " <tr>\n", " <th>41913</th>\n", " <td>2015-01-19</td>\n", " <td>Hard</td>\n", " <td>1st Round</td>\n", " <td>15</td>\n", " <td>59</td>\n", " <td>1</td>\n", " </tr>\n", " <tr>\n", " <th>41914</th>\n", " <td>2015-01-19</td>\n", " <td>Hard</td>\n", " <td>1st Round</td>\n", " <td>31</td>\n", " <td>91</td>\n", " <td>1</td>\n", " </tr>\n", " <tr>\n", " <th>41915</th>\n", " <td>2015-01-19</td>\n", " <td>Hard</td>\n", " <td>1st Round</td>\n", " <td>99</td>\n", " <td>98</td>\n", " <td>0</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Date Surface Round WRank LRank win\n", "41911 2015-01-19 Hard 1st Round 85 84 0\n", "41912 2015-01-19 Hard 1st Round 11 90 1\n", "41913 2015-01-19 Hard 1st Round 15 59 1\n", "41914 2015-01-19 Hard 1st Round 31 91 1\n", "41915 2015-01-19 Hard 1st Round 99 98 0" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df3 = newdf2.copy()\n", "df3 = df3[df3['Best_of'] == 5]\n", "# Drop Best_of and Series columns\n", "df3.drop(\"Series\",axis = 1,inplace=True)\n", "df3.drop(\"Best_of\",axis = 1,inplace=True)\n", "df3.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we notice that our dataset is uneven in terms of 
frequency of `wins`:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "618\n", "816\n" ] }, { "data": { "text/plain": [ "0.7573529411764706" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "print(df3['win'].sum(axis=0))\n", "print(len(df3.index))\n", "df3['win'].sum(axis=0)/float(len(df3.index))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To correct this problem and create a balanced dataset, we use a stratified sampling procedure: we draw the same number of matches at random from each class, which undersamples the majority class (`win` = 1). " ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false }, "outputs": [], "source": [ "y_0 = df3[df3.win == 0] \n", "y_1 = df3[df3.win == 1] \n", "n = min([len(y_0), len(y_1)]) \n", "y_0 = y_0.sample(n = n, random_state = 0) \n", "y_1 = y_1.sample(n = n, random_state = 0)\n", "df_strat = pd.concat([y_0, y_1]) \n", "X_strat = df_strat[['Date', 'Surface', 'Round','WRank', 'LRank']]\n", "y_strat = df_strat.win" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>Date</th>\n", " <th>Surface</th>\n", " <th>Round</th>\n", " <th>WRank</th>\n", " <th>LRank</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>41984</th>\n", " <td>2015-01-21</td>\n", " <td>Hard</td>\n", " <td>2nd Round</td>\n", " <td>46</td>\n", " <td>31</td>\n", " </tr>\n", " <tr>\n", " <th>46044</th>\n", " <td>2016-08-07</td>\n", " <td>Grass</td>\n", " <td>Semifinals</td>\n", " <td>7</td>\n", " <td>3</td>\n", " </tr>\n", " <tr>\n", " <th>43308</th>\n", " <td>2015-06-29</td>\n", " <td>Grass</td>\n", " <td>1st Round</td>\n", " <td>148</td>\n", " <td>93</td>\n", " </tr>\n", " <tr>\n", " <th>46497</th>\n", " <td>2016-08-29</td>\n", " <td>Hard</td>\n", " <td>1st 
Round</td>\n", " <td>120</td>\n", " <td>54</td>\n", " </tr>\n", " <tr>\n", " <th>43345</th>\n", " <td>2015-06-30</td>\n", " <td>Grass</td>\n", " <td>1st Round</td>\n", " <td>37</td>\n", " <td>32</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Date Surface Round WRank LRank\n", "41984 2015-01-21 Hard 2nd Round 46 31\n", "46044 2016-08-07 Grass Semifinals 7 3\n", "43308 2015-06-29 Grass 1st Round 148 93\n", "46497 2016-08-29 Hard 1st Round 120 54\n", "43345 2015-06-30 Grass 1st Round 37 32" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X_strat.head()" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "41984 0\n", "46044 0\n", "43308 0\n", "46497 0\n", "43345 0\n", "Name: win, dtype: int64" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "y_strat.head()" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>Date</th>\n", " <th>Surface</th>\n", " <th>Round</th>\n", " <th>WRank</th>\n", " <th>LRank</th>\n", " <th>win</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>41984</th>\n", " <td>2015-01-21</td>\n", " <td>Hard</td>\n", " <td>2nd Round</td>\n", " <td>46</td>\n", " <td>31</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>46044</th>\n", " <td>2016-08-07</td>\n", " <td>Grass</td>\n", " <td>Semifinals</td>\n", " <td>7</td>\n", " <td>3</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>43308</th>\n", " <td>2015-06-29</td>\n", " <td>Grass</td>\n", " <td>1st Round</td>\n", " <td>148</td>\n", " <td>93</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>46497</th>\n", " <td>2016-08-29</td>\n", " <td>Hard</td>\n", " <td>1st Round</td>\n", " 
<td>120</td>\n", " <td>54</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>43345</th>\n", " <td>2015-06-30</td>\n", " <td>Grass</td>\n", " <td>1st Round</td>\n", " <td>37</td>\n", " <td>32</td>\n", " <td>0</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Date Surface Round WRank LRank win\n", "41984 2015-01-21 Hard 2nd Round 46 31 0\n", "46044 2016-08-07 Grass Semifinals 7 3 0\n", "43308 2015-06-29 Grass 1st Round 148 93 0\n", "46497 2016-08-29 Hard 1st Round 120 54 0\n", "43345 2015-06-30 Grass 1st Round 37 32 0" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X_strat_1=X_strat.copy()\n", "X_strat_1['win']=y_strat\n", "X_strat_1.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We now define the variables ${\\cal P}_1$ and ${\\cal P}_2$ where ${{\\rm Rank}_1} > {{\\rm Rank}_2}$" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>Date</th>\n", " <th>Surface</th>\n", " <th>Round</th>\n", " <th>WRank</th>\n", " <th>LRank</th>\n", " <th>win</th>\n", " <th>P1</th>\n", " <th>P2</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>41984</th>\n", " <td>2015-01-21</td>\n", " <td>Hard</td>\n", " <td>2nd Round</td>\n", " <td>46</td>\n", " <td>31</td>\n", " <td>0</td>\n", " <td>46</td>\n", " <td>31</td>\n", " </tr>\n", " <tr>\n", " <th>46044</th>\n", " <td>2016-08-07</td>\n", " <td>Grass</td>\n", " <td>Semifinals</td>\n", " <td>7</td>\n", " <td>3</td>\n", " <td>0</td>\n", " <td>7</td>\n", " <td>3</td>\n", " </tr>\n", " <tr>\n", " <th>43308</th>\n", " <td>2015-06-29</td>\n", " <td>Grass</td>\n", " <td>1st Round</td>\n", " <td>148</td>\n", " <td>93</td>\n", " <td>0</td>\n", " <td>148</td>\n", " <td>93</td>\n", " </tr>\n", " <tr>\n", " <th>46497</th>\n", " 
<td>2016-08-29</td>\n", " <td>Hard</td>\n", " <td>1st Round</td>\n", " <td>120</td>\n", " <td>54</td>\n", " <td>0</td>\n", " <td>120</td>\n", " <td>54</td>\n", " </tr>\n", " <tr>\n", " <th>43345</th>\n", " <td>2015-06-30</td>\n", " <td>Grass</td>\n", " <td>1st Round</td>\n", " <td>37</td>\n", " <td>32</td>\n", " <td>0</td>\n", " <td>37</td>\n", " <td>32</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Date Surface Round WRank LRank win P1 P2\n", "41984 2015-01-21 Hard 2nd Round 46 31 0 46 31\n", "46044 2016-08-07 Grass Semifinals 7 3 0 7 3\n", "43308 2015-06-29 Grass 1st Round 148 93 0 148 93\n", "46497 2016-08-29 Hard 1st Round 120 54 0 120 54\n", "43345 2015-06-30 Grass 1st Round 37 32 0 37 32" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = X_strat_1.copy()\n", "df[\"P1\"] = df[[\"WRank\", \"LRank\"]].max(axis=1)\n", "df[\"P2\"] = df[[\"WRank\", \"LRank\"]].min(axis=1)\n", "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exploratory Analysis for Best_of = 5" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We first look at the percentage of wins for each surface. We find that when the `Surface` is Clay there is a higher likelihood of upsets (the opposite of wins), i.e. the percentage of wins is lower. The difference is not very large, though." 
] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th>Surface</th>\n", " <th>Clay</th>\n", " <th>Grass</th>\n", " <th>Hard</th>\n", " </tr>\n", " <tr>\n", " <th>win</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>0</th>\n", " <td>0.557692</td>\n", " <td>0.474747</td>\n", " <td>0.481865</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>0.442308</td>\n", " <td>0.525253</td>\n", " <td>0.518135</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ "Surface Clay Grass Hard\n", "win \n", "0 0.557692 0.474747 0.481865\n", "1 0.442308 0.525253 0.518135" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "win_by_Surface = pd.crosstab(df.win, df.Surface).apply( lambda x: x/x.sum(), axis = 0 )\n", "win_by_Surface" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAfAAAAFICAYAAACvNaz+AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAGf1JREFUeJzt3XuU3WV97/H3ZEaQZJI0gVEQFQTxq6JSBQUiBYNQK2KN\nt6PUKxpKsXJqtR5MveDt4GVB2oqyEFIVinpQjqAiwqkIaKNGFBUQ+QLGgC49x+BMcwVDmDl//PaE\nTZqZ2TPZv9l59rxfa2Vlfrdnf/daT+aT53d7ekZGRpAkSWWZ1ekCJEnS5BngkiQVyACXJKlABrgk\nSQUywCVJKpABLklSgfrqbDwieoDzgEOA+4Glmbm6afuzgXMai/8XeG1mbqmzJkmSukHdI/AlwO6Z\nuQhYBizfbvsFwBsz82jgamC/muuRJKkr1B3gR1EFM5m5CjhsdENEPAn4A/D2iLgeWJiZd9ZcjyRJ\nXaHuAJ8HrGta3hoRo5+5F3Ak8AngOOC4iHhezfVIktQVar0GDqwH5jYtz8rM4cbPfwDuysw7ACLi\naqoR+vVjNbZ164MjfX29NZUqSdIuqWdHK+sO8JXAicBlEXEEcEvTttVAf0Qc0Lix7c+AFeM1NjS0\nubZCJUnaFQ0MzN3h+p46JzNpugv9GY1VJwOHAnMyc0XjlPnHGtu+l5l/P157a9ducOYVSdKMMjAw\nd4cj8FoDvN0McEnSTDNWgPsiF0mSCmSAS5JUIANckqQCGeCSJBXIAJckqUAGuCRJBTLAJUmapM9/\n/iLuvntNR2vwOXBJknZhPgcuSdIknXLKG9iyZQu/+tVqTjjh+QD85Cc/5j3vOYPbbruVz3zmAj78\n4TN5+9tP55RT3sDvf///pq02A1ySpDE8+9mH89Of3sSPfvRDBgYGuOuuO/nBD77Hhg0btu3zqEc9\nmuXLz+Xoo5/HDTdcN221GeCSJI1h0aKjuPHGVdx88095zWveyE033cjtt9/GwMDAtn0OPPCJAOy1\n1wBbtvxx2mozwCVJGsPBBz+dzF/wwANbOPLI5/Lv/341j3703sya1RyfO7xEXTsDXJKkMfT09PDo\nR+/Nk5/8VObOncvICDz3uUc/bHvHavMudEmSdl1j3YXeN92FdMKDDz7ImjWrO13GtNp//wPo7e3t\ndBmSpJrMiABfs2Y1y865lDnzBybeuQtsWreWj7zjVRx44EGdLkWSVJMZEeAAc+YPMG/hPp0uQ5Kk\ntvAmNkmSCjRjRuCSJEE990V14r4jA1ySNKO0+76oTt13ZIBLkmac6b4vamRkhHPO+Sh33XUnu+22\nG2ec8R723fexO9Wm18AlSarZd75zPVu2bOH88z/Dqae+lU9+8p92uk0DXJKkmt188085/PBFABx8\n8NO4/fZf7HSbBrgkSTXbvHkT/f3925Z7e3sZHh7eqTYNcEmSajZ79hw2b960bXl4eHi7CVEmz5vY\nJEkzzqZ1a6e1rWc84xBWrvwuixcfx6233rJtCtKdYYBLkmaU/fc/gI+841Vtb3M8Rx+9mBtvXMVp\np70JgGXLztzpzzTAJUkzSm9v77Q/s93T08M//MOytrbpNXBJkgpkgEuSVCBPoUsaVx3vjd7VdeK9\n1tJkGeCSxrVmzWre++UP0r/XvE6XMi023rueD73yfdN+jVSaLANc0oT695rH/L0XdLoMqS2cjUyS\npCno9GWZe+65mwtXXdS2s0ob713PKYe/gcc/fr8x96kj4A1wSdK06vRlmY33rm/7WaULV11E/y93\n/H2aL8v8/Oe3cv7553LuuZ/e6c80wCVJ067bLsu08n2+8IWLueaaq9hjj9lt+UwfI5MkaRrsu+/j\nOOuss9vWngEuSdI0OOaYxW29Dm6AS5JUoFqvgUdED3AecAhwP7A0M1c3bX8bsBT4fWPVqZl5Z501\nSZK08d71bW1rMjfkjYyMtOVz676JbQmwe2YuiojDgeWNdaMOB
V6XmT+puQ5JkgCY+6j5bW2vf695\nk2qzp6enLZ9bd4AfBVwNkJmrIuKw7bYfCiyLiH2Ab2TmR2uuR5I0w82aNatjd8Dvvfc+nH/+Z9rS\nVt0BPg9Y17S8NSJmZeZwY/mLwKeA9cAVEXFCZl41VmMLFsymr2/yNwAMDfVP+pjSLVzYz8DA3E6X\noS7gvx+1m32qPeoO8PVAc8XN4Q3wL5m5HiAivgE8ExgzwIeGNk+piMHBjVM6rmSDgxtZu3ZDp8tQ\nF/Dfj9rNPjU5YwV/3XehrwROAIiII4BbRjdExDzg1oiY3bjZ7VjgxzXXI0lSV6h7BH45cHxErGws\nnxwRJwFzMnNFRCwDrqe6Q/3azLy65nokSeoKtQZ4Zo4Ap223+o6m7Z8HPl9nDZIkdSNf5CJJUoEM\ncEmSCmSAS5JUIKcTlaRdwIMPPsiaNasn3rEL3HPP3Z0uoSsY4JK0C1izZjXLzrmUOfMHOl1K7db+\nJnnMMZ2uonwGuCTtIubMH2Dewn06XUbtNq5bC/yu02UUzwCXpsDTnZI6zQCXpsDTnZI6zQCXpsjT\nnZI6ycfIJEkqkAEuSVKBDHBJkgpkgEuSVCADXJKkAhngkiQVyACXJKlABrgkSQUywCVJKpABLklS\ngQxwSZIKZIBLklQgJzPpQiPDwzNqCsj99z+A3t7eTpchSdPKAO9Cmzb8gQtXfY/+X87rdCm123jv\nej70yvdx4IEHdboUSZpWBniX6t9rHvP3XtDpMiRJNfEauCRJBTLAJUkqkAEuSVKBDHBJkgpkgEuS\nVCADXJKkAhngkiQVyACXJKlABrgkSQUywCVJKpABLklSgQxwSZIKZIBLklQgA1ySpALVOp1oRPQA\n5wGHAPcDSzNz9Q72+zTwh8z8xzrrkSSpW9Q9Al8C7J6Zi4BlwPLtd4iIU4Gn1VyHJEldpe4APwq4\nGiAzVwGHNW+MiCOBZwOfrrkOSZK6St0BPg9Y17S8NSJmAUTE3sCZwFuBnprrkCSpq9R6DRxYD8xt\nWp6VmcONn18J7AlcBewD7BERt2fmxWM1tmDBbPr6eiddxNBQ/6SPUTkWLuxnYGDuxDu2kX2qu9mn\n1G519Km6A3wlcCJwWUQcAdwyuiEzzwXOBYiINwAxXngDDA1tnlIRg4Mbp3ScyjA4uJG1azdM+2eq\ne9mn1G4706fGCv66A/xy4PiIWNlYPjkiTgLmZOaKmj9bkqSuVWuAZ+YIcNp2q+/YwX4X1VmHJEnd\nxhe5SJJUIANckqQCGeCSJBXIAJckqUAGuCRJBTLAJUkqkAEuSVKBDHBJkgpkgEuSVCADXJKkAhng\nkiQVyACXJKlABrgkSQUywCVJKpABLklSgQxwSZIKZIBLklSgvlZ2ioh+YDFwEDAM3AV8KzPvr7E2\nSZI0hnEDPCJmA2cCLwNuBu4GHgAWAf8UEV8BPpSZG+suVJIkPWSiEfglwAXAsswcbt4QEbOAExv7\nLKmnPEmStCMTBfjLM3NkRxsagf61iPh6+8uSJEnjmSjA3xsRY27MzA+OFfCSJKk+EwV4z7RUIUmS\nJmXcAM/MD+xofUT0AE+opSJJkjShVh8jeytwFjCnafWvgCfWUZQkSRpfqy9yeQdwCHApcCDwZmBV\nXUVJkqTxtRrgv8/MX1E9C/70zPwcMPbdbZIkqVatBvimiFhMFeAvjoi9gQX1lSVJksbTaoCfDvwl\ncDWwJ3A7cG5dRUmSpPG1dBMb8JjM/PvGzy8HiIiX1VOSJEmayETvQn8VsDvwwYh433bH/SPwlRpr\nkyRJY5hoBD6PauKSuVSzkY3aCry7rqIkSdL4JnqRy4XAhRHx/My8NiLmAr2Z+Z/TU54kSdqRVm9i\nWxMRPwTWAKsj4icR8aT6ypIkSeNpNcDPBz6emXtm5kLgI1TTjEqSpA5oNcD3yszLRhcy80vAwnpK\nkiRJE2k1wP8YEc8aXYiIQ
4HN9ZQkSZIm0upz4G8D/ndEDFJNMboQeFVtVUmSpHG1GuAJPKnxZ1Zj\neZ+6ipIkSeOb6EUuj6MacV8FvBDY0Nj02Ma6J09wfA9wHtVMZvcDSzNzddP2lwNnAMPAFzLzE1P7\nGpIkzSwTXQP/AHADcBDwncbPNwDXAN9sof0lwO6ZuQhYBiwf3RARs6jmGD+W6mUxb4kIb4yTJKkF\nE73I5U0AEXFGZn5sCu0fRTUBCpm5KiIOa2p7OCKe0vj7UVT/mdgyhc+QJGnGmegU+keAj44V3o0R\n8xmZecYYTcwD1jUtb42IWZk5DNtC/KXAp4ArgU3j1bNgwWz6+nrH22WHhob6J32MyrFwYT8DA3On\n9TPtU93NPqV2q6NPTXQT25eAr0bEb6lOof+G6j3o+1Gd+n4M1R3qY1lP9R71UdvCe1RmXg5cHhEX\nAa8HLhqrsaGhqT25Nji4cUrHqQyDgxtZu3bDxDu2+TPVvexTared6VNjBf9Ep9B/AjwvIhZTzQd+\nItUNZ78EPp2Z357gc1c2jrksIo4Abhnd0Hiv+teBP8/MLVSj7+EdtiJJkh6mpcfIMvM64LoptH85\ncHxErGwsnxwRJwFzMnNFRFwCfCcitgA3A5dM4TMkSZpxWgrwiHgB8GGqF7j0jK7PzAPGOy4zR4DT\ntlt9R9P2FcCKVouVJEmVVl/kci7wduBWYKS+ciRJUitaDfB7M/PKWiuRJEktazXAvxsRy6me6b5/\ndGVmfqeWqiRJ0rhaDfDnNP5+ZtO6EapHySRJ0jRr9S70xXUXIkmSWtfqXehHAe8E+qnuQu8F9svM\n/esrTZIkjWWiyUxGrQCuoAr8TwF3Uj3jLUmSOqDVAL8vMz8LXA8MAacAx9RVlCRJGl+rAX5/Y+KS\nBI5ovKBlTn1lSZKk8bQa4MuBS6neXf76iPg58KPaqpIkSeNqKcAz88tUk45sAA4FXgu8rs7CJEnS\n2FoK8IhYAFwQEd8GHgmcDsyvszBJkjS2Vk+hXwjcCOwJbAB+hzOHSZLUMa0G+BMy8wJgODO3ZOa7\ngcfWWJckSRpHqwG+NSLm05iJLCIOAoZrq0qSJI2r1Xehn0n1DPjjIuIK4EjgTXUVJUmSxtfqCPzH\nVG9e+xXweOArVHejS5KkDmh1BH4VcDPQPCd4T/vLkSRJrWg1wMnMN9dZiCRJal2rAX5FRCwFvg1s\nHV2ZmffUUpUkSRpXqwE+H3gXcG/TuhHggLZXJEmSJtRqgL8ceFRm3ldnMZIkqTWt3oW+GlhQZyGS\nJKl1rY7AR4DbIuJWYMvoysw8tpaqJEnSuFoN8P9ZaxWSJGlSWgrwzLyh7kIkSVLrWr0GLkmSdiEG\nuCRJBTLAJUkqkAEuSVKBDHBJkgpkgEuSVCADXJKkAhngkiQVyACXJKlABrgkSQUywCVJKpABLklS\ngVqdjWxKIqIHOA84BLgfWJqZq5u2nwT8HfAAcEtmvqXOeiRJ6hZ1j8CXALtn5iJgGbB8dENEPBL4\nIHBMZv4Z8CcRcWLN9UiS1BXqDvCjgKsBMnMVcFjTtj8CizLzj43lPqpRuiRJmkDdAT4PWNe0vDUi\nZgFk5khmrgWIiNOBOZn5rZrrkSSpK9R6DRxYD8xtWp6VmcOjC41r5B8HDgJeNlFjCxbMpq+vd9JF\nDA31T/oYlWPhwn4GBuZOvGMb2ae6m31K7VZHn6o7wFcCJwKXRcQRwC3bbb8AuC8zl7TS2NDQ5ikV\nMTi4cUrHqQyDgxtZu3bDtH+mupd9Su22M31qrOCvO8AvB46PiJWN5ZMbd57PAX4MnAx8NyKuA0aA\nf8nMr9ZckyRJxas1wDNzBDhtu9V3TNfnS5LUrXyRiyRJBTLAJUkqkAEuSVKBDHBJkgpkgEuSVCAD\nXJKkAhngkiQVyACXJKlABrgkSQUywCVJKpABLklSgQxwSZIKZIBLklQgA1ySpAIZ4JIkFcg
AlySp\nQAa4JEkFMsAlSSqQAS5JUoEMcEmSCmSAS5JUIANckqQCGeCSJBXIAJckqUAGuCRJBTLAJUkqkAEu\nSVKBDHBJkgpkgEuSVCADXJKkAhngkiQVyACXJKlABrgkSQUywCVJKpABLklSgQxwSZIKZIBLklQg\nA1ySpAIZ4JIkFaivzsYjogc4DzgEuB9Ympmrt9tnNvB/gDdl5h111iNJUreoewS+BNg9MxcBy4Dl\nzRsj4lDgBuCAmuuQJKmr1B3gRwFXA2TmKuCw7bbvRhXyt9dchyRJXaXWU+jAPGBd0/LWiJiVmcMA\nmfl92HaqfUILFsymr6930kUMDfVP+hiVY+HCfgYG5k7rZ9qnupt9Su1WR5+qO8DXA80VbwvvqRga\n2jyl4wYHN071I1WAwcGNrF27Ydo/U93LPqV225k+NVbw130KfSVwAkBEHAHcUvPnSZI0I9Q9Ar8c\nOD4iVjaWT46Ik4A5mbmiab+RmuuQJKmr1BrgmTkCnLbd6v/yqFhmHltnHZIkdRtf5CJJUoEMcEmS\nCmSAS5JUIANckqQCGeCSJBXIAJckqUAGuCRJBTLAJUkqkAEuSVKBDHBJkgpkgEuSVCADXJKkAhng\nkiQVyACXJKlABrgkSQUywCVJKpABLklSgQxwSZIKZIBLklQgA1ySpAIZ4JIkFcgAlySpQAa4JEkF\nMsAlSSqQAS5JUoEMcEmSCmSAS5JUIANckqQCGeCSJBXIAJckqUAGuCRJBTLAJUkqkAEuSVKBDHBJ\nkgpkgEuSVCADXJKkAhngkiQVyACXJKlAfXU2HhE9wHnAIcD9wNLMXN20/cXAe4EHgM9m5oo665Ek\nqVvUPQJfAuyemYuAZcDy0Q0R0ddYPg54HvDXETFQcz2SJHWFugP8KOBqgMxcBRzWtO0pwJ2ZuT4z\nHwD+Azi65nokSeoKtZ5CB+YB65qWt0bErMwc3sG2DcD8ugrZtG5tXU3vcu7bMMgj7l3f6TKmxcYO\nfs+Z0qdmUn8C+9R0sE+1R90Bvh6Y27Q8Gt6j2+Y1bZsL/Od4jQ0MzO2ZShEDA8/iui8/ayqHSjtk\nn1K72ac0WXWfQl8JnAAQEUcAtzRt+wXwxIj4k4jYjer0+fdrrkeSpK7QMzIyUlvjTXehP6Ox6mTg\nUGBOZq6IiBcBZwI9wL9m5vm1FSNJUhepNcAlSVI9fJGLJEkFMsAlSSqQAS5JUoEMcEmSClT3c+Cq\nQUQcDHwMmA3MAb4JXA+cmpkndbA0FSIingB8HNgXuA/YDJyRmbd1tDAVLyKOAf6m+XdRRHwE+EVm\nXjzJtn6Xmfu0u8ZuYYAXJiLmA18ElmTm6sajel8Gfgf4SIEmFBF7AF8D3pyZP2ysOwz4JHBsJ2tT\n12jX7yJ/p43DAC/PS4BrR2d1y8yRiHg98FzgGICI+FvgZVQj9HsbP38OuCQzvxkRTwbOzswTO1C/\nOu/FVH3oh6MrMvNHwLER8VlgT2BhY7+PA48F9gG+npnvjYiXAf8D2AL8NjNfHRHPBc5urNsMvCIz\nN03nl9IuZUdvzeyNiAt5qD99LTPft12f+0uqPvdUYDWw+zTVWySvgZfnMVQde5vM3Ez1i3PUnpn5\n/Mw8EngE1SQyFwBvbGx/E+DUrTPXE4C7Rhci4oqIuC4ibqc6pX5tZh5F9arj72fmC4HDgb9pHPJq\n4OOZeTRwZeOs0EuAS6lmFjwfWDBdX0a7pGMj4tuNP9cBJwEP8vD+dFrT/qN9bjEPn8Fy9nQXXhJH\n4OW5G3jYC5MjYn8ePpPbloj4IrCJ6hfyIzLzhog4NyL2Av6c6h+HZqZf0zQzYGYuAYiI7wO/AbKx\naRB4TkQspppsaLfG+rcDyyLidKpXIl8BnAW8G7i20cYP6v8a2oVdm5l/NboQEWdR/YfwaTvoT/BQ\nn3sS8EOAzPx1RPx6muotkiPw8lwJvCAiDgCIiEdQzau
+trH8dKrr4ycBpwO9PHQ669+ATwDXZOaD\n0124dhlfBZ4fEc8ZXRERT6Q6tbkfMDrh0BuBocx8HVUfGx0N/TVwZmYupvod8lLgtcBnM/NY4LbG\nPtKonsafHfUneKjP3QYcCRARj6HqkxqDI/DCZOaGiHgDcGHjBra5wNeB26lG4XcCGyPiu1T/YH5L\nddod4CLgw8DTpr1w7TIyc1NEvBj4WETsTXWZZSvwNuBFTbteC3whIo6kukRzR0TsQzVC+kZEbKAa\nSV0JHAT8a0RsojpVaoCr2QhVH/uLHfSnbTeqZeZXI+L4xtmge4Dfd6TaQvgu9BkkIvYFPpeZx3e6\nFknSzvEU+gwRES8FrgLe1+laJEk7zxG4JEkFcgQuSVKBDHBJkgpkgEuSVCADXJKkAvkcuNTlIuIV\nwLuo/r33AP+WmWdP4vhTqN6ydmlmnlFPlZImyxG41MUab7M6GzguM/+U6i1Xr4qIyUxk82pgqeEt\n7Vp8jEzqYhHxDKrn/4/IzN801j0V+CPwLeCYzLynMYfz+zNzcWPyiUGqGaG+QDXz2O+A/w70A+8A\nHgnsQRXs/xERf0o1ickejWNfk5m/jYgzgP9GNVi4JjPfNV3fXep2jsClLpaZN1PN/b06IlZFxEeB\nvsz8Jf91ruXm5Z9l5lMy80PAj4A3A9cApwIvysxnAh8D3tnY/xLgA5l5CPC/gL+LiBcAh1JNnPIs\n4LER8VdIagsDXOpymfkWqklKzmv8/f3Gm/nGs2q75Z7MHKGaW/4vIuIDVJOd9EfEnsDemfnNxud9\nunG6/TjgOcCPgZuowvzg9nwrSd7EJnWxiDgB6M/ML1FNZnNRRCylGlGP8NBMdY/Y7tD7dtDWHOBG\n4GLgBuBm4G+BB5raISJ2p5pApxf458z858b6eVQTWkhqA0fgUnfbDJwVEfsBNGaweyrViPheHhoR\nv6SFtp4EPJiZZwHXAS8EejNzPXBPRDy/sd/rgQ9QzWb2+oiYExF9VNOYvqI9X0uSAS51scy8nipM\nr4yIX1DNtzwL+CDwfuATEbEKGGo6bKxr4z8DfhYRSXVafAPVKXmA1wHvj4ibgFcC78zMbwCXUZ2O\nvxm4KTMvbusXlGYw70KXJKlAjsAlSSqQAS5JUoEMcEmSCmSAS5JUIANckqQCGeCSJBXIAJckqUD/\nH34A+e8p9aE6AAAAAElFTkSuQmCC\n", "text/plain": [ "<matplotlib.figure.Figure at 0x11994f610>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "win_by_Surface = pd.DataFrame( win_by_Surface.unstack() ).reset_index()\n", "win_by_Surface.columns = [\"Surface\", \"win\", \"total\" ]\n", "fig2 = sns.barplot(win_by_Surface.Surface, win_by_Surface.total, hue = win_by_Surface.win )\n", "fig2.figure.set_size_inches(8,5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What about the dependence on rounds? The relation is not very clear but we can clearly see that upsets are unlikely to happen on the semifinals." 
, "\n", "\n", "Keep in mind that the later rounds contain far fewer matches than the early ones. The semifinal proportions of 0.111 and 0.889 equal 1/9 and 8/9, which suggests only about nine semifinal matches in this sample, so the estimates for the quarterfinals, semifinals and the final are noisy and should be read with caution."
] }, { "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th>Round</th>\n", " <th>1st Round</th>\n", " <th>2nd Round</th>\n", " <th>3rd Round</th>\n", " <th>4th Round</th>\n", " <th>Quarterfinals</th>\n", " <th>Semifinals</th>\n", " <th>The Final</th>\n", " </tr>\n", " <tr>\n", " <th>win</th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " <th></th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>0</th>\n", " <td>0.521505</td>\n", " <td>0.542056</td>\n", " <td>0.469388</td>\n", " <td>0.354839</td>\n", " <td>0.666667</td>\n", " <td>0.111111</td>\n", " <td>0.4</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>0.478495</td>\n", " <td>0.457944</td>\n", " <td>0.530612</td>\n", " <td>0.645161</td>\n", " <td>0.333333</td>\n", " <td>0.888889</td>\n", " <td>0.6</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ "Round 1st Round 2nd Round 3rd Round 4th Round Quarterfinals Semifinals \\\n", "win \n", "0 0.521505 0.542056 0.469388 0.354839 0.666667 0.111111 \n", "1 0.478495 0.457944 0.530612 0.645161 0.333333 0.888889 \n", "\n", "Round The Final \n", "win \n", "0 0.4 \n", "1 0.6 " ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "win_by_round = pd.crosstab(df.win, df.Round).apply( lambda x: x/x.sum(), axis = 0 )\n", "win_by_round" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAfAAAAFICAYAAACvNaz+AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3XuYXWV59/HvJCnRZJKYmJEAKhGEG4uKAgrGFASLVkVf\n1CKlHhF4Eap9FVSMBVRQKCqxVoucikJFRVFqPaFWQTBqBBUJIneAOOCZwRlzIEBMMu8fa+2wGeaw\nJ5m1mTXz/VxXrsw67LXuZ59+67TX09Hf348kSaqXKY90AZIkafQMcEmSasgAlySphgxwSZJqyACX\nJKmGDHBJkmpoWpULj4gO4FxgL+B+4JjMXNU0/bXA24E/A5dk5sVV1iNJ0kRR9R74YcD0zFwELAGW\nNiZExGOB04EDgOcBr46IJ1ZcjyRJE0LVAb4YuAogM5cD+zZN2wW4MTNXZ2Y/cD2wf8X1SJI0IVQd\n4LOB1U3DGyOisc7bgD0joisiZgDPB2ZWXI8kSRNCpefAgTXArKbhKZm5GSAz/xwRJwJfBP4E/AS4\nZ7iFbdy4qX/atKlV1SpJ0njUMdjIqgN8GXAocEVE7A+saEyIiKnA3pl5QERsB3wLePdwC+vrW19l\nrZIkjTtdXbMGHV91gF8JHBIRy8rhoyLiSGBmZl4UEUTET4H7gHMys7fieiRJo7Bp0ya6u1eNPOM2\nWrhwF6ZO9QjraHTUqTeynp619SlWkiaAO+64jVO/cDqd82dXto5196zhjMNPY9ddd6tsHXXW1TXr\nETmELkmquc75s5mzYO4jXYYG8E5skiTVkAEuSVINGeCSJNWQAS5JUg0Z4JIk1ZABLknSKF122SXc\neWf3I1qDPyOTJGmUXv3q1z/SJbgHLknSUI499vVs2LCBX/1qFS9+8fMB+NnPfsIpp5zMLbfczMUX\nX8D73/8eTjzxLRx77Ou5++4/tq02A1ySpCE861n7ceONP+WGG35MV1cXt99+Gz/60Q9Yu3btlnke\n97jtWbr0YxxwwPP43veublttBrgkSUNYtGgx11+/nJtuupFXv/oN/PSn13PrrbfQ1dW1ZZ5dd30y\nAPPnd7FhwwNtq80AlyRpCHvu+TQyf8lf/rKB5zznuXz721ex/fYLmDKlOT4HvVV55QxwSZKG0NHR\nwfbbL2CPPf6aWbNm0d8Pz33uAQ+Z/ojVZm9kkqSh3HHHbZx99b9V2pnJ6j/0cfJBb7U3siEM1RuZ\ne+CSJNWQAS5JUg0Z4JIk1ZABLklSDXkrVUnSpLJp0ya6u1eN6TIXLtyFqVOnjukyR2KAS5Imle7u\nVSw553JmzukaeeYW3Lu6h7NOOqLtV9Eb4JKkSWfmnC5mz9uhbevr7+/nnHP+ldtvv43tttuOk08+\nhZ12evw2LdNz4JIkVezaa69hw4YNnHfexRx33Jv5+Mc/ss3LNMAlSarYTTfdyH77LQJgzz2fyq23\n/nKbl1npIfSI6ADOBfYC7geOycxVTdNfDZwIbAQ+mZnnVVmPJEmPhPXr76Wzs3PL8NSpU9m8efOA\ne6qPTtV74IcB0zNzEbAEWDpg+oeAg4HFwEkRMafieiRJarsZM2ayfv29W4a3Nbyh+ovYFgNXAWTm\n8ojYd8D0nwNzgcY9zr3XuSSpcveu7mnrsp7+9L1Ytuw6Djrob7n55hVbuiDdFlUH+GxgddPwxoiY\nkpmby+FfAD8B1gFfysw1FdcjSZrkFi7chbNOOmLMlzmcAw44iOuvX87xx78RgCVL3rPN66w6wNcA\ns5qGt4R3RDwNeAmwM3AvcFlEvDIzvzjUwubOncG0ae39obwkTWZ9fZ0jzzQG5s3rpKtr1sgzjpEF\nC/Zu27oazj77zDFdXtUBvgw4FLgiIvYHVjRNWw2sBx7IzP6IuJvicPqQ+vrWV1aoJOnhenvXtW09\nPT1r27Kuuhlqw6bqAL8SOCQilpXDR0XEkcDMzLwoIi4Avh8RD
wB3AJ+quB5JkiaESgM8M/uB4weM\nXtk0/Xzg/CprkCRpIvJGLpIk1ZD3QpckTSr2RiZJUg11d6/i1C+cTuf82WOyvHX3rOGMw0+zNzJJ\nkqrWOX82cxYM+8OnSvziFzdz3nkf42Mf2/bLvwxwSZLa4DOfuZRvfvPrPPrRM8ZkeV7EJklSG+y0\n0xM488wPj9nyDHBJktrgwAMPGtML3QxwSZJqyHPgkqRJZ909Y9d31miX1d8/Nh1vGuCSpEll4cJd\nOOPw08Z8ma3q6OgYk3Ua4JKkSWXq1Klt/812w4IFO3DeeRePybI8By5JUg0Z4JIk1ZABLklSDRng\nkiTVkAEuSVINGeCSJNWQAS5JUg0Z4JIk1ZABLklSDRngkiTVkAEuSVINGeCSJNVQpZ2ZREQHcC6w\nF3A/cExmriqnbQ98DugHOoBnACdn5gVV1iRJ0kRQdW9khwHTM3NRROwHLC3HkZl/BA4CiIj9gfcD\nF1ZcjyRJE0LVh9AXA1cBZOZyYN8h5vsY8KbMHJteziVJmuCqDvDZwOqm4Y0R8ZB1RsRLgZsz8/aK\na5EkacKo+hD6GmBW0/CUzNw8YJ7XAP/WysLmzp3BtGlTx6o2SdII+vo627KeefM66eqaNfKM2qLq\nAF8GHApcUZ7nXjHIPPtm5g9bWVhf3/qxrE2SNILe3nVtW09Pz9q2rKtuhtqwqTrArwQOiYhl5fBR\nEXEkMDMzL4qI+Tz0ELskSWpBpQFeXpR2/IDRK5um3wPsXWUNkiRNRN7IRZKkGjLAJUmqIQNckqQa\nMsAlSaohA1ySpBqq+mdkklSZTZs20d29qvL1LFy4C1OnehMpjS8GuKTa6u5exZJzLmfmnK7K1nHv\n6h7OOukIdt11t8rWIW0NA1xSrc2c08XseTs80mVIbec5cEmSasgAlySphgxwSZJqyACXJKmGvIhN\n0rDa9VMt8Oda0mgY4JKG1d29ilO/cDqd82dXup5196zhjMNP8+daUosMcEkj6pw/mzkL5j7SZUhq\n4jlwSZJqyACXJKmGDHBJkmrIAJckqYYMcEmSasgAlySphgxwSZJqyACXJKmGKr2RS0R0AOcCewH3\nA8dk5qqm6c8CzikH/wC8JjM3VFmTJEkTQdV74IcB0zNzEbAEWDpg+gXAGzLzAOAqYOeK65EkaUKo\nOsAXUwQzmbkc2LcxISJ2B/4EnBgR1wDzMvO2iuuRJGlCqPpe6LOB1U3DGyNiSmZuBuYDzwFOAFYB\nX42IGzLzmoprkiSpJe3qjW9reuKrOsDXALOahhvhDcXe9+2ZuRIgIq6i2EO/ZqiFzZ07g2nT7GpQ\naqe+vs62rWvevE66umaNPGOpXbWNtq6JZLI/xytXrqy8N75196zhP477ILvvvvuoHld1gC8DDgWu\niIj9gRVN01YBnRGxS3lh298AFw23sL6+9ZUVKmlwvb3r2rqunp61o5q/HUZb10Qy2Z/j3t51bemN\nb7j2D7VhU3WAXwkcEhHLyuGjIuJIYGZmXhQRRwOfjQiAH2TmNyquR5KkCaHSAM/MfuD4AaNXNk2/\nBtivyhokSZqIvJGLJEk1ZIBLklRDBrgkSTVkgEuSVEMGuCRJNWSAS5JUQwa4JEk1ZIBLklRDBrgk\nSTVkgEuSVEMGuCRJNWSAS5JUQwa4JEk1VHV3oqrYpk2b6O5e1ZZ1LVy4C1OnTm3LuiRJwzPAa667\nexVLzrmcmXO6Kl3Pvat7OOukI9h1190qXY8kqTUG+AQwc04Xs+ft8EiXIUlqIwNcakG7TlV4mkJS\nq1oK8IjoBA4CdgM2A7cD/5uZ91dYmzRudHev4tQvnE7n/NmVrWPdPWs44/DTPE0hqSXDBnhEzADe\nA7wCuAm4E/gLsAj4SER8CTgjM9dVXaj0SOucP5s5C+Y+0mVIEjDyHvingQuAJZm5uXlCREwBDi3n\nOaya8lrj4U1J0mQzUoC/M
jP7B5tQBvr/RMRXxr6s0WnHldhehS1JGk9GCvBTI2LIiZl5+lAB325e\niS1JmkxGCvCOtlQhSZJGZdgAz8z3DTY+IjqAJ4208HK+c4G9gPuBYzJzVdP0twLHAHeXo47LzNta\nK12SpMmr1Z+RvRk4E5jZNPpXwJNHeOhhwPTMXBQR+wFLeegFb/sAr83Mn7VesiRJarUzk5Mo9qIv\nB3YFjgaWt/C4xcBVAJm5HNh3wPR9gCURcV1EvKvFWiRJmvRaDfC7M/NXFL8Ff1pmfgoY+uq2B80G\nVjcNbyx/ftbwWeBNFDeJWRwRL26xHkmSJrVWb6V6b0QcRBHgh0XE9UArd7RYA8xqGp4y4PfkH83M\nNQAR8TXgmcDXh1rY3LkzmDbt4b/D7uvrbKGUbTdvXiddXbNGnrGN2tV2GJ/tbxffY+0x2vZP5tel\nXSb7czye299qgL+F4mKzkygOn98KvLeFxy2juNnLFRGxP7CiMSEiZgM3R8QewH3AwcB/Drewvr71\ng47v7W3PjeB6e9fR07O2LetqVbva3ljXeGt/u/gea9+6RtP+yfy6tMtkf47HQ/uHCvZWA3zHzHxb\n+fcrASLiFS087krgkIhYVg4fFRFHAjMz86KIWAJcQ3GF+ncy86oW65EkaVIb6V7oRwDTgdMj4rQB\nj3s38KXhHl/e5OX4AaNXNk2/DLhsNAVLkqSR98BnU3RcMoviQrOGjcC/VFWUJEka3kg3crkQuDAi\nnp+Z34mIWcDUzPxze8obP/o3b+auu+5sy7rsNEWSNJJWz4F3R8SPKX4D3hERdwJHZObKER43Ydy7\n9k9cuPwHdN5RXX/QYJ/QkqTWtBrg5wEfzMwrACLiVRTdjD6vorrGJfuDliSNF60G+PxGeANk5ucj\n4pSKapJa1q6+4Nt1+kSSWtVqgD8QEXtn5k8BImIfYPAfZUtt1I6+4AF6fpPseGClq5CkUWk1wN8K\nfDEieim6GJ0HHFFZVdIotKMv+HWre4DfV7oOSRqNVgM8gd3Lf1PK4Wq/MSVJ0pBGupHLEyj2uL8O\nvAho3Oft8eW4PSqtTuNGu35G50/oJKk1I+2Bv4/iBi47Atc2jd8IfLWqojT+tONndP6ETpJaN9KN\nXN4IEBEnZ+bZ7SlJ45U/o5Ok8WPY/sAj4qyImDNUeEfEvIgw2CVJarORDqF/HvhyRPyO4hD6bygO\nn+9M0f3njhRXqEuSpDYa6RD6z4DnRcRBwMso+vbeDNwBnJ+Z362+REmSNFBLPyPLzKuBqyuuRZIk\ntailAI+IFwLvp7iBS0djfGbuUlFdkiRpGK3eyOVjwInAzUB/deVIkqRWtBrg92Smv/uWJGmcaDXA\nr4uIpcBVwP2NkZl57dAPkSRJVWk1wJ9d/v/MpnH9FD8lkyRJbdbqVegHVV2IJElqXatXoS8G3gF0\nUlyFPhXYOTMXVleaJEkayrC3Um1yEfDfFIH/H8BtwJVVFSVJkobX6jnw+zLzkxGxEOgDjgV+MtKD\nIqIDOBfYi+Lit2Myc9Ug850P/Ckz391q4ZIkTWat7oHfHxHzgAT2z8x+YGYLjzsMmJ6Zi4AlwNKB\nM0TEccBTW6xDkiTReoAvBS4HvgK8LiJ+AdzQwuMWU/z0jMxcDuzbPDEingM8Czi/1YIlSVLrV6F/\nISKuyMz+iNgH2B34eQsPnQ2sbhreGBFTMnNzRCwA3kOxl37EaAuXJE1emzZtorv7YWdkx9xdd91Z\n+Tq2VqtXoc8FPhgRuwKHA28BTqI4Hz6cNcCspuEpmbm5/Ptw4LHA14EdgEdHxK2ZeelQC5s7dwbT\npk192Pi+vs5WmlEb8+Z10tU1a+QZmdxtB9vfDu18jsfr6z8eX5d2Ga/P8cqVK1lyzuXMnNNVYVXQ\n85tkxwMrXQWwde+xVi9iuxD4FsUNXdYCvwc+DbxkhMcto+iC9IqI2B9Y0ZiQmR+juMc6EfF
6IIYL\nb4C+vvWDju/tXddSI+qit3cdPT1rW553IhlN2xvzTySjbX87tPM5Hq+v/3h8XdplvD7Hvb3rmDmn\ni9nzdqiwKli3uoci8qo1XPuHCvZWz4E/KTMvADZn5obM/Bfg8S087krggYhYBpwDvC0ijoyIY1pc\nryRJGkSre+AbI2IOZU9kEbEbsHn4h0B5tfrxA0avHGS+S1qsQ5Ik0XqAvwe4BnhCRPw38BzgjVUV\nJUmShtfqIfSfUBwO/xXwROBLwD5VFSVJkobX6h7414GbgOY+wTvGvhxJktSKVgOczDy6ykIkSVLr\nWg3w/y6vHP8usLExMjPvqqQqSZI0rFYDfA7wLuCepnH9wC5jXpEkSRpRqwH+SuBxmXlflcVIkqTW\ntHoV+ipgbpWFSJKk1rW6B94P3BIRNwMbGiMz8+BKqpIkScNqNcA/UGkVkrZKO3pkGs+9MUmTWavd\niX6v6kIkjV5396rKe2RqV29Mkkan5d+BSxqfqu6RqV29MUkanVYvYpMkSeOIAS5JUg0Z4JIk1ZDn\nwCVpGP2bN7ftSvyFC3dh6tSpbVmX6s8Al6Rh3Lv2T1y4/Ad03jG70vWsu2cNZxx+Grvuulul69HE\nYYBL0gg6589mzgJvRqnxxXPgkiTVkAEuSVINGeCSJNWQAS5JUg0Z4JIk1VClV6FHRAdwLrAXcD9w\nTGauapr+SuBkYDPwmcz89yrrkSRpoqh6D/wwYHpmLgKWAEsbEyJiCnAmcDCwCDghIuZVXI8kSRNC\n1QG+GLgKIDOXA/s2JmTmZuApmbkOmF/WsqHieiRJmhCqDvDZwOqm4Y3lnjdQhHhEvBy4EbgGuLfi\neiRJmhCqvhPbGmBW0/CUcs97i8y8ErgyIi4BXgdcMtTC5s6dwbRpD79PcF9f59hUO07Mm9dJV9es\nkWdkcrcdbL/tn9ztb4d2Pce+9qN/7asO8GXAocAVEbE/sKIxISJmAV8BXpCZGyj2vjcPupRSX9/6\nQcf39q4bq3rHhd7edfT0rG153olkNG1vzD+R2H7bP5r2t0O7nmNf+6HbP1SwVx3gVwKHRMSycvio\niDgSmJmZF0XEp4FrI2IDcBPw6YrrkSRpQqg0wDOzHzh+wOiVTdMvAi6qsgZJkiYib+QiSVINGeCS\nJNWQAS5JUg0Z4JIk1ZABLklSDRngkiTVkAEuSVINGeCSJNWQAS5JUg0Z4JIk1ZABLklSDRngkiTV\nkAEuSVINGeCSJNWQAS5JUg0Z4JIk1ZABLklSDRngkiTVkAEuSVINGeCSJNWQAS5JUg0Z4JIk1ZAB\nLklSDU2rcuER0QGcC+wF3A8ck5mrmqYfCfw/4C/Aisw8ocp6JEmaKKreAz8MmJ6Zi4AlwNLGhIh4\nFHA6cGBm/g3wmIg4tOJ6JEmaEKoO8MXAVQCZuRzYt2naA8CizHygHJ5GsZcuSZJGUHWAzwZWNw1v\njIgpAJnZn5k9ABHxFmBmZv5vxfVIkjQhVHoOHFgDzGoanpKZmxsD5TnyDwK7Aa8YaWFz585g2rSp\nDxvf19e57ZWOI/PmddLVNWvkGZncbQfbb/snd/vboV3Psa/96F/7qgN8GXAocEVE7A+sGDD9AuC+\nzDyslYX19a0fdHxv77ptqXHc6e1dR0/P2pbnnUhG0/bG/BOJ7bf9o2l/O7TrOfa1H7r9QwV71QF+\nJXBIRCwrh48qrzyfCfwEOAq4LiKuBvqBj2bmlyuuSZKk2qs0wDOzHzh+wOiV7Vq/JEkTlTdykSSp\nhgxwSZJqyACXJKmGDHBJkmrIAJckqYYMcEmSasgAlySphgxwSZJqyACXJKmGDHBJkmrIAJckqYYM\ncEmSasgAlySphgxwSZJqyACXJKmGDHBJkmrIAJckqYYMcEmSasgAlySphgxwSZJqyACXJKmGDHBJ\nkmpo2iNdgCRp9DZt2kR396rK13PXXXdWvg5tnUoDPCI
6gHOBvYD7gWMyc9WAeWYA3wLemJkrq6xH\nkiaK7u5VLDnncmbO6ap0PT2/SXY8sNJVaCtVvQd+GDA9MxdFxH7A0nIcABGxD3AesFPFdUjShDNz\nThez5+1Q6TrWre4Bfl/pOrR1qj4Hvhi4CiAzlwP7Dpi+HUWg31pxHZIkTShVB/hsYHXT8MaI2LLO\nzPxhZv4W6Ki4DkmSJpSqD6GvAWY1DU/JzM1bu7C5c2cwbdrUh43v6+vc2kWOS/PmddLVNWvkGZnc\nbQfbb/snb/snc9vB9kP1Ab4MOBS4IiL2B1Zsy8L6+tYPOr63d922LHbc6e1dR0/P2pbnnUhG0/bG\n/BOJ7bf9fvZbn38iGa79QwV71QF+JXBIRCwrh4+KiCOBmZl5UdN8/RXXIUnShFJpgGdmP3D8gNEP\n+6lYZh5cZR2SJE003olNkqQaMsAlSaohA1ySpBoywCVJqiEDXJKkGjLAJUmqIQNckqQaMsAlSaoh\nA1ySpBoywCVJqiEDXJKkGjLAJUmqIQNckqQaMsAlSaohA1ySpBoywCVJqiEDXJKkGjLAJUmqIQNc\nkqQaMsAlSaohA1ySpBoywCVJqiEDXJKkGppW5cIjogM4F9gLuB84JjNXNU1/KXAq8Bfgk5l5UZX1\nSJI0UVS9B34YMD0zFwFLgKWNCRExrRz+W+B5wP+NiK6K65EkaUKoOsAXA1cBZOZyYN+maU8BbsvM\nNZn5F+D7wAEV1yNJ0oRQ6SF0YDawuml4Y0RMyczNg0xbC8zZ2hXdu7pnax/akvvW9vJX96ypdB0A\n67ZiHVW3HdrT/q1pO9j+yfzeh8ndft/7k7v9Hf39/WNcyoMi4hzgh5l5RTl8V2Y+sfz7acC/ZuZL\nyuGlwPcz80uVFSRJ0gRR9SH0ZcCLASJif2BF07RfAk+OiMdExHYUh89/WHE9kiRNCFXvgTeuQn96\nOeooYB9gZmZeFBEvAd4DdAD/mZnnVVaMJEkTSKUBLkmSquGNXCRJqiEDXJKkGjLAJUmqIQNckqQa\nqvpGLo+YiNiP4nfmBw0zzxOAvTLzqwPG/wq4E9hM8RzNBI7NzJ9WUOdngU9k5rXbuJxpwMXAQmA7\n4AOZ+ZUWH/tD4IjMvKtp3CeBvYE/UWzozQOWZuantqXOIdZ/HLB9Zp6+jcuZAlwIBMVr96bMvGWY\n+acDt2bmkwaMr93rP2CZjwNuAP42M1dGxFOBx2Tm98u2RWZuGOKx7wH+Efgtxa9D5gGfy8yzxqq+\npnW9EPiHzDxqKx+/D3Am8GiK9+jVwOnlnR23tqZjgYszc1ML804Fvk3xefsCcMfA75IWlvH7zNxh\nq4odfHknU9ye+q+ATcA7tvZ9GxGfAV4HPAH4OvAjoI/ie+A3o1jOgRSfxSO3po7RiogPU/zaaQEw\nA7gD6KH4RdSo6yjr/zzwC4rPRD/wGeDXwBNG24fHSJ/B0ZiQAR4R7wBeC6wbYdaDgT2AgR+6fuCQ\nxhdBRLwAeB/w0jEudSy9BrgnM18XEXOBG4GWAnwYb8/MbwOUy/wF8KltXGaVXgr0Z+bi8kN3JsX9\n+IfS+DAOVMfXH9iyIXcesL5p9CuB31PcrriVn52ck5kXlMvbDrglIi7MzHvGut4W63mYiNgJ+C/g\npZl5RznuVOAjwJu3oZ53A5dQhN9IdgJmZeaztmF9Y/YzoIh4CvCyzHxuOfx0irY8c2uWl5n/WC5n\nMfDVzHzHNpTXtp87ZebbASLi9RRB+e5y+MBtqOM7jedjDIzZczEhAxy4HXg5xQccgIg4gWJrchNw\nPXAi8C7g0RGxbJAt5+bTCzsDveVyDgHOAO6j2Dt9I8UHZMuWXWOrutyLfYBir3gB8IbMvDEi/gk4\nmuJLdaw6cPk8xV5Ao/ZG+FxNEeZPBWYBh2fmryPiA8ALgN8Ajx1imc3PwQ4UbSYidqbY259K8Wb8\n58xc0bw30dizBJ5
EcTOfGcAuwNmZeWn5pfBvFM/rJsbgJj6Z+eWIaGy0LKTYW2g8B3cDc4G/By4F\nHkOxZT6Uur3+DR+meN6XlLXsCLwBeCAifkax0fKJiNiF4rV7eWauHrCMjqa/51N8T9wXEXOAT1Pc\nBnkqcEpmXtO8RxERZ1HcpOlO4GRgA8V74PLMPDMi9qB476yj2Mjo3cp2vha4sBHeAJl5RkSsiogf\nAa8rjz5sOboTEWdS7Jk9Fvh5Zh5dHnFYRHGU5TMUr9PngFeU8y8u27o0M7844L20EdgtIj4B/KH8\nd+sQ7d6TovOmKeVzenxm/qhR+8Dvp8x861Y8J6uBJ0TEG4GrMvOmiHh2eQTm38t5Gu/ZvSneIw8A\njwfOp9iheTrw0cw8v3xd/4Zio+bREXEHcARwHHBk2b7HAU8E3paZ346IVwL/RPGe6af4Ht6i/Ezs\nQnHU5KOZedlWtHNb7B4RX6Oo+6uZ+b7Bnp/MXDvgcR0DhhsbCHtQbDB/FrgLeDLw48w8odzI/AQw\nneL785TM/J/BlrW1JuQ58My8kuLD1ez1wD+VW6e/LMf9K/CZQcK7A/hmRCyPiF8DzwLeXk47Hzis\nPDT/PYruUOGhW1XNf3dn5t8BH6foce1xwD8Dzwb+D8Xht22Wmesz896ImEUR5P/SNHl5Zh4C/C9w\nZHnocXG55/A6imAfzNkRcW1E3AmcQxF+UITERzLzecBbKb6QYegty9mZ+VKK9r6rHHcuxWH7FwC/\nGmVzh5SZmyPiU8BHgeYvh8vKdR0LrChrP3+IxdTu9QeIiDcAd5dHTToAMvN3FEdNlmbm9eWsF5X1\n3wkcMsiiToyIq8sv7M8BR2fmvcApwLcy80DgVcB/jlDSEym+wJ8DvLMc9yGKL7IXAD/YqoYWFgKr\nBhn/R2D7gSPLz0VvZr6Q4vV8TkQ0Dl3fkpmLM/Ncio2qIyLi74CFmXkARbCdUm7AQPGd8QLg+PKx\nx5fjG6/7YO3eEzix/Bx+kOKmVs0e8v1Ung4alfK1fhnwXOCHEXELxVGjC4ETMvNg4BsUGxhQHEF4\nOXACxffFqyk2to9ras/dPPg9eR4PfW/fn5kvpvgOeFs5bnfgxeXz9kvghY2ZI6KTYoPoFcCLaO0o\nx1ibTvG5O4BiQwOGfn6aHRwR3y0/F98tb1IGDz4fu1FsGD0beHH5Od8D+HD5njuuaX1jZkIG+BDe\nCLy53ILemeHb3jiEuh/FIaiZmdkTEfOB1Zn5h3K+64C/HuTxzVtYPyv//zXwKGBX4ObM3JiZGymO\nBoyJ8pxZu5EXAAAGYklEQVT+d4FLMvPyYWrYneIcKeWW5s1DLPKd5QfxTcCOPPiF+RSKtpOZP6fY\ngoeHtrv57xsHrB+KvaLG3tOyVtrXqsx8A0UbL4qIR5ejV5b/7w78uJzvx5RHKgao5etPEQqHlO/x\nZwCXll8kAzXOif6B4sjIQOeUAX84RRjeVo5/CnAtbAmLNYMsv7ntKzKzPzPX8+Ah/d15sM3b8rrf\nRfFcblF+qT6RInQG1nMfsH1EXEaxETaT4jwxQA7ShqcB+0bEdyl6VJxGsdEw2PwDDdbu3wKnlXug\nf9+07kZ9A7+fRr2XFhG7Amsz8+jM3JnitNp5FO/Rc8u2HEXxWYbifbgZ+DPF+ftNFEetGp/RwWoY\n7r0NxXN/SURcTPEcNtpJZq6jCPoLKTYMp4+2jWOg8dm7jwd38p7C4M9Ps+9k5sGZeVD5/8CdldvL\nnajNwO8ono/fA2+KiEsovkP/ijE20QO8+c12LHBc+cW0N8XW8WaKw2ODPa7x2FOBnSLi+PIc4OyI\naGzhH0gRDPdTvujl4eV5Tcsa+ELfBuwZEdPLi2C26vzUQGVN36QI3UsGTB5Ywy0UW4pExEwGD6Et\nMvMbwJcpPniNxx9QPv4ZFEEAMC0iZpTnTfccZv0Av4mIKP/elnOIW0TEayKisYd/P
8UW/uZyuPH/\nLRSHTImIZzL4h6p2rz9AZh5YfsEcRLHR9NrMvJui7c2f9ZbOwWVx8dPZwOVlOP6SB1/3nSgOI99D\nEY47lPM8Y4jFNZ7PX1A+/2zb634pcHRE7BpFfwrfBC6iuJ7lTzz4Jbx3+f+LKC44ejXlIeGmmjY/\nuNgt3wm3At8t98oOpjhFdccg87fq34HTsrhgbwUPD8eB30+LGL2nAx+PiMZ7+naKcL6N4pTCwRR7\nl40jjs3vg605rPuQ91FEzKa4VuQfgGMoPhcdTdO3B/bJzFcAhwIf2pojDdtosPf+rQz+/GytRpvP\noNiZej3FBZZjdui8YaIHePOLtQL4fkR8h+Iw2/Jy3Msi4lVDPa7c0jqG4hDaAooP2pURcR3wfIoX\n6Qbgz1Fczf1eHtxTfdibpQyBsynO+X6NkS+0a9USivO6pzYd5nnUEDX8HLgqIq6nOHfzx0GWN/Bx\nZwBPiYgXURxOfktEfA/4D4q9ByjOaf+I4suue4R63wT8V0R8m2KvaSx8CXhmWdc3gP+XmQ/w0Lac\nB+wSEddSHDp8YJDl1PH1H6wNjS+Mn1Ds3T2PoQ/1DzouMy+mOLf6JuADFIcSv0fxXB9b7nF8iOL5\n/ioPPac92LreTvFcfptyI3JrZHEV9Gso3n9foTivvIBig+xSij2qb/Dgd9xyitf9GuAKitdox4Ht\npbjQ72tZ/ILj3vJ9cgPFxZHrBpl/MIPN82ngivK5240HNzAa8w72/TQq5anDa4Hry/fnNyie72Mp\nPmvXAWcBN7VY83DjBntvr6F4/n5EcXRqPU17s5n5R2BBRCwDvgV8sHz/PNJOYOTnZySDvde/AJxT\nvucO4cFrjcbsIjbvhS5pwigvSFpVHr6WJjQDXJKkGproh9AlSZqQDHBJkmrIAJckqYYMcEmSasgA\nlySphibqvdAlDVDeZGYlD/aqNIXiNrqXZuZ7K1rngcB7c5heASVtHQNcmlx+m5mNu5NR3g/8toj4\nbGaOdIvQreVvVaUKGODS5Na4U9baiHg3RYcWGynulPVOirvkXZNln+lR9N7Vn0XvXr+juKvZYop7\nyr8qM++MovvVpRS3WK1qo0Ca9DwHLk0uO0XETyPilxHRA5xO0SPVXhT3p35m+W83itunwtB70AuA\nb5d79NdR3K51O4rez15R9nZ3X2UtkSY5A1yaXH6bmXtn5lMo7hm+HUVHCwcDn83MDeX9qS+muNf7\nSL5Z/n8zRScuTyvX0ej9bWDHOpLGiAEuTV7vpOgu9O08vKekDopTbP089HviIb23ZeaG8s9G5yn9\nPLSHv41IqoQBLk0uW4K67P/5HRTda/4MODIiHhUR0yj6Rf4uRXeUj4mIx0bEdODvRlj+TUBXRDyt\nHD5yrBsgqWCAS5PLwO5Cv0nRtemBFN2B3kDRtWU38PGyi8gPleO/xUO7uRysS8mNwD8Cn46IGyj6\n3ZZUAXsjkySphtwDlySphgxwSZJqyACXJKmGDHBJkmrIAJckqYYMcEmSasgAlySphv4/gTIl0cDl\nlNQAAAAASUVORK5CYII=\n", "text/plain": [ "<matplotlib.figure.Figure at 0x1199af4d0>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "win_by_round = pd.DataFrame(win_by_round.unstack() ).reset_index()\n", "win_by_round.columns = [\"Round\", \"win\", \"total\" ]\n", "fig2 = sns.barplot(win_by_round.Round, win_by_round.total, hue = win_by_round.win )\n", "fig2.figure.set_size_inches(8,5)" ] }, { 
"cell_type": "markdown", "metadata": {}, "source": [ "Let's check these results before stratification" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAfAAAAFICAYAAACvNaz+AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAHCRJREFUeJzt3X94XVWd7/F32kilTdppJAKiF8QfXwW1KlVK7VCLMI6K\nY/3BVcRfYBlFr3e8Ol7sOIr4A5QrHUeUB6EzCKIOioKKWO+IgE7FiqBSUL6AtaLivQaTadOWUkoz\nf+yTcohNclqyc7LPeb+epw/de+2zzjc+q/m49tlnrY6hoSEkSVK1TGt2AZIkafcZ4JIkVZABLklS\nBRngkiRVkAEuSVIFGeCSJFVQZ5mdR0QHcC4wD9gKLMvMdXXtJwDvArYDF2bmeWXWI0lSqyh7Br4U\nmJGZC4HlwIoR7f8HOApYBLw7IuaUXI8kSS2h7ABfBKwCyMw1wPwR7T8H5gJ7145dVUaSpAaUHeCz\ngQ11x9sjov49bwVuBNYCV2bmxpLrkSSpJZT6GTiwEeiuO56WmTsAIuLpwEuAA4HNwBci4pWZ+dXR\nOtu+/YGhzs7pZdYrSdJU07Grk2UH+GrgWOCyiFhAMdMetgHYAtyXmUMR8UeK2+mjGhjYUlqhkiRN\nRb293bs831HmZiZ1T6E/o3bqROAwYFZmroyItwAnAfcBvwJOzszto/XX1zfoZ+SSpLbS29u9yxl4\nqQE+0QxwSVK7GS3AXchFkqQKMsAlSaogA1ySpAoywCVJqiADXJKkCir7e+CSJD3EAw88wPr168a/\nsIUcdNDBTJ8+sQuRGeAjfOELF7Fo0WIOPPCgZpciSS1p/fp1vP8rH6Jrn9nNLmVSbLpnIx8+7gM8\n4QlPmtB+DfARTjjhjc0uQZpSnC2pDF37zGbOfmMuvqlxtG2An3zyG/nMZy7g97//HW9/+8lcddXV\n/PSnN/LVr36Z17729fzoRz/k7rt/T39/P4ODG/noR8/i0Y/et9llS5PO2ZI0NbVtgD/nOYfzs5/d\nxG9+s57e3l7uvPMOfvSjHzI4OLjzmkc/el/+8R9P5/Ofv5DrrruG4457TRMrlprH2ZI09bTtU+gL\nFy7ihhvWcPPNP+OEE97ETTfdwG23/YLe3t6d1zzhCU8EYJ99etm27b5mlSpJ0p9p2wA/9NCnk/lL\n7r9/G0cc8Tz+/d9Xse+++zFtWv3/JLtcflaSpKZr2wDv6Ohg33334ylPOYTu7m6GhuB5zzvyIe2S\nJE1VbfsZOMD73vfBnX9fufJiABYvXgLAIYc8bWfbi1507KTWJUnSeNp2Bi5JUpUZ4JIkVZABLklS\nBRngkiRVUEs9xFbGko8uqShJmopaKsDXr1/H8rMvZdac3vEvbsDmDX2c+e5Xu6SiJGnKaakAB5g1\np5fZPftP6nsODQ1x9tkf484772Cvvfbi1FP/kQMOeOyk1lCv3Taf8C6JpHbUcgHeDN///rVs27aN\n8877V2699RY+/el/4swzz25aPe20+YQbT0hqVwb4BLj55p9x+OELATj00Kdx222/bHJFbj4hSa3O\np9AnwJYtm+nq6tp5PH36dHbs2NHEiiRJrc4AnwAzZ85iy5bNO4937NgxYlMUSZImVsvdQt+8oW/S\n+3rGM+axevUPWLLkaG65Ze3ObUglSSpLSwX4QQcdzJnvfvWE9zmeI49cwg03rOGUU04CYPny0ya0\nBkmSRio1wCOiAzgXmAdsBZZl5rpa277AvwFDFBtvPxM4NTPP39P3mz5
9elOeRu7o6ODv/375pL+v\nJKl9lT0DXwrMyMyFEXE4sKJ2jsz8/8ASgIhYAHwEuKDkeiRJagllP2m1CFgFkJlrgPmjXHcO8NbM\nHCq5HkmSWkLZAT4b2FB3vD0iHvKeEfFS4JbMvLPkWiRJahll30LfCHTXHU/LzJFfkH4d8MlGOps7\ndyadnS6ZOZ6Bga7xL2ohPT1d9PZ2j3+h9ki7jSdwTJXNMTUxyg7w1cCxwGW1z7nX7uKa+Zl5fSOd\nDQxsGbPd3cgK/f2bml3CpOrv30Rf32Czy2hZ7TaewDFVNsfU7hkt+MsO8MuBYyJide34xIg4HpiV\nmSsjYh8eeov9YZnoNcBdZ1uSNFWVGuC1h9JOGXH69rr2e4BnT+R7NmsN8FtvvYXzzjuHc8757KS/\ntySp/bTUQi7N8sUvXsx3vnMVe+89s9mlSJLahAt2T4ADDngcZ5zxiWaXIUlqIwb4BFi8eEnlHnST\nJFWbAS5JUgW13Gfgm+7Z2LS+hoZcSE6SNDlaKsAPOuhgPnzcBya8z0Z1dHRM6HtLkjSalgrwZu1G\nBrDffvtz3nn/2pT3liS1Hz8DlySpggxwSZIqyACXJKmCDHBJkirIAJckqYIMcEmSKsgAlySpggxw\nSZIqyACXJKmCDHBJkirIAJckqYIMcEmSKsgAlySpggxwSZIqyACXJKmCDHBJkirIAJckqYIMcEmS\nKsgAlySpggxwSZIqqLPMziOiAzgXmAdsBZZl5rq69ucAZ9cO/x/wuszcVmZNkiS1grJn4EuBGZm5\nEFgOrBjRfj7wpsw8ElgFHFhyPZIktYSyA3wRRTCTmWuA+cMNEfFk4E/AuyLiWqAnM+8ouR5JklpC\n2QE+G9hQd7w9Iobfcx/gCOBTwNHA0RHx/JLrkSSpJZT6GTiwEeiuO56WmTtqf/8TcGdm3g4QEaso\nZujXjtbZ3Lkz6eycXlKprWNgoKvZJUyqnp4uenu7x79Qe6TdxhM4psrmmJoYZQf4auBY4LKIWACs\nrWtbB3RFxMG1B9v+Elg5VmcDA1tKK7SV9PdvanYJk6q/fxN9fYPNLqNltdt4AsdU2RxTu2e04C87\nwC8HjomI1bXjEyPieGBWZq6MiDcDX4oIgB9m5rdLrkeSpJZQaoBn5hBwyojTt9e1XwscXmYNkiS1\nIhdykSSpggxwSZIqyACXJKmCDHBJkirIAJckqYIMcEmSKsgAlySpggxwSZIqyACXJKmCDHBJkirI\nAJckqYIMcEmSKsgAlySpggxwSZIqyACXJKmCDHBJkirIAJckqYIMcEmSKsgAlySpggxwSZIqyACX\nJKmCDHBJkirIAJckqYIMcEmSKsgAlySpggxwSZIqyACXJKmCOsvsPCI6gHOBecBWYFlmrqtrfyew\nDPhj7dRbMvOOMmuSJKkVlBrgwFJgRmYujIjDgRW1c8MOA16fmT8tuQ5JklpK2bfQFwGrADJzDTB/\nRPthwPKI+EFEvLfkWiRJahllz8BnAxvqjrdHxLTM3FE7/hLwGWAjcEVEvDgzrxqts7lzZ9LZOb28\nalvEwEBXs0uYVD09XfT2dje7jJbVbuMJHFNlc0xNjLIDfCNQX3F9eAP8c2ZuBIiIbwHPAkYN8IGB\nLaUU2Wr6+zc1u4RJ1d+/ib6+wWaX0bLabTyBY6psjqndM1rwl30LfTXwYoCIWACsHW6IiNnALREx\ns/aw21HAjSXXI0lSSyh7Bn45cExErK4dnxgRxwOzMnNlRCwHrqV4Qv3qzFxVcj2SJLWEUgM8M4eA\nU0acvr2u/QvAF8qsQZKkVuRCLpIkVZABLklSBRngkiRVkAEuSVIFGeCSJFVQQ0+hR0QXsAR4ErAD\nuBP4bmZuLbE2SZI0ijEDPCJmAqcBrwBuBn4D3A8sBP4pIr4GfDgz229ZHUmSmmi8GfglwPnA8hFL\noBIR04Bja9cs3cVrJUlSScYL8Ff
WFmP5M7VA/0ZEfHPiy5IkSWMZL8DfHxGjNmbmh0YLeEmSVJ7x\nArxjUqqQJEm7ZcwAz8zTd3W+tnvY40upSJIkjavRr5H9D+AMYFbd6V8DTyyjKEmSNLZGF3J5NzAP\nuBR4AvBmYE1ZRUmSpLE1GuB/zMxfU3wX/OmZ+Tlg9KfbJElSqRoN8M0RsYQiwF8aEfsBc8srS5Ik\njaXRAH8H8DfAKuBRwG3AOWUVJUmSxtbQQ2zAYzLzf9X+/kqAiHhFOSVJkqTxjLcW+quBGcCHIuID\nI173D8DXSqxNkiSNYrwZ+GyKjUu6KXYjG7YdeF9ZRUmSpLGNt5DLBcAFEfGCzLw6IrqB6Zn5n5NT\nniRJ2pVGH2JbHxE/BtYD6yLipxHx5PLKkiRJY2k0wM8DzsrMR2VmD3AmxTajkiSpCRoN8H0y87Lh\ng8z8MtBTTkmSJGk8jQb4fRHx7OGDiDgM2FJOSZIkaTyNfg/8ncBXI6KfYovRHuDVpVUlSZLG1GiA\nJ/Dk2p9pteP9yypKkiSNbbyFXB5HMeO+CngRMFhremzt3FPGeX0HcC7FTmZbgWWZuW4X130W+FNm\n/sPu/gCSJLWj8Wbgp1Ms4PIY4Pt157cDVzbQ/1JgRmYujIjDgRW1cztFxFuApwHXNVq0JEntbryF\nXE4CiIhTM/Pje9D/IooNUMjMNRExv74xIo4AngN8lnFm85Ik6UHj3UI/E/jYaOEdET3AqZl56ihd\nzAY21B1vj4hpmbmjtiXpaRQz8oYeiJs7dyadndMbubStDQx0NbuESdXT00Vvb3ezy2hZ7TaewDFV\nNsfUxBjvFvqXga9HxN0Ut9B/R3H7/EDgKIpb6+8c4/UbKdZRHzYtM3fU/n4cxdakV1E8ELd3RNyW\nmReP1tnAgN9ca0R//6ZmlzCp+vs30dc3OP6F2iPtNp7AMVU2x9TuGS34x7uF/lPg+RGxhGI/8GOB\nHcCvgM9m5vfGed/VtddcFhELgLV1fZ9DbU/xiHgjEGOFtyRJelBDXyPLzGuAa/ag/8uBYyJide34\nxIg4HpiVmSv3oD9JkkSDAR4RLwQ+QrGAS8fw+cw8eKzXZeYQcMqI07fv4rqLGqlDkiQVGl3I5Rzg\nXcAtwFB55UiSpEY0GuD3ZGYj3/uWJEmToNEA/0FErKD4TvfW4ZOZ+f3RXyJJksrSaIA/t/bfZ9Wd\nG6L4KpkkSZpkjT6FvqTsQiRJUuMafQp9EfAeoIviKfTpwIGZeVB5pUmSpNFMa/C6lcAVFIH/GeAO\niu94S5KkJmg0wO/NzAuBa4EB4GRgcVlFSZKksTUa4FtrG5cksKC2QMus8sqSJEljaTTAVwCXAt8E\n3hARtwI/Ka0qSZI0poYCPDO/AvxVZg4ChwGvA15fZmGSJGl0DQV4RMwFzo+I7wGPBN4BzCmzMEmS\nNLpGb6FfANxAsX/3IPAH4JKyipIkSWNrNMAfn5nnAzsyc1tmvg94bIl1SZKkMTQa4NsjYg61ncgi\n4knAjtKqkiRJY2p0LfTTKL4D/riIuAI4AjiprKIkSdLYGp2B30ix8tqvgf8GfI3iaXRJktQEjc7A\nrwJuBur3BO+Y+HIkSVIjGg1wMvPNZRYiSZIa12iAXxERy4DvAduHT2bmXaVUJUmSxtRogM8B3gvc\nU3duCDh4wiuSJEnjajTAXwk8OjPvLbMYSZLUmEafQl8HzC2zEEmS1LhGZ+BDwC8i4hZg2/DJzDyq\nlKokSdKYGg3wj5ZahSRJ2i0NBXhmXld2IZIkqXGNfgYuSZKmkIYXctkTEdEBnAvMA7YCyzJzXV37\nK4FTKTZG+WJmfqrMeiRJahVlz8CXAjMycyGwHFgx3BAR04AzgKOAhcDbIqKn5HokSWoJZQf4ImAV\nQGauAeYPN2TmDuCpmbkJ2KdWy7ZddSJJkh6q1FvowGxgQ93x9oiYVgtvMnNHRLwc+AzFRimbx+ps\
n7tyZdHZOL63YVjEw0NXsEiZVT08Xvb3dzS6jZbXbeALHVNkcUxOj7ADfCNRXvDO8h2Xm5cDlEXER\n8AbgotE6GxjYUkqRraa/f1OzS5hU/f2b6OsbbHYZLavdxhM4psrmmNo9owV/2bfQVwMvBoiIBcDa\n4YaI6I6IayNir9qpzRQPs0mSpHGUPQO/HDgmIlbXjk+MiOOBWZm5MiIuAb4fEdso9hu/pOR6JElq\nCaUGeGYOAaeMOH17XftKYGWZNUiS1IpcyEWSpAoywCVJqiADXJKkCir7IbYp4YEHHmD9+nXjX9gi\n7rrrN80uQZJUsrYI8PXr17H87EuZNae32aVMir7fJY9Z3OwqJEllaosAB5g1p5fZPfs3u4xJsWlD\nH/CHZpchSSqRn4FLklRBBrgkSRVkgEuSVEEGuCRJFWSAS5JUQQa4JEkVZIBLklRBBrgkSRVkgEuS\nVEEGuCRJFWSAS5JUQW2zFro0kdpphzt3t5OmJgNc2gPttMOdu9tJU5MBLu2hdtnhzt3tJod3dbS7\nDHBJmgK8q6PdZYBL0hThXR3tDp9ClySpggxwSZIqyACXJKmCDHBJkiqo1IfYIqIDOBeYB2wFlmXm\nurr244G/A+4H1mbm28qsR5KkVlH2DHwpMCMzFwLLgRXDDRHxSOBDwOLM/EvgLyLi2JLrkSSpJZQd\n4IuAVQCZuQaYX9d2H7AwM++rHXdSzNIlSdI4yg7w2cCGuuPtETENIDOHMrMPICLeAczKzO+WXI8k\nSS2h7IVcNgLddcfTMnPH8EHtM/KzgCcBrxivs7lzZ9LZOX23ixgY6Nrt16g6enq66O3tHv/CCeSY\nam2OKU20MsZU2QG+GjgWuCwiFgBrR7SfD9ybmUsb6WxgYMseFdHfv2mPXqdq6O/fRF/f4KS/p1qX\nY0oT7eGMqdGCv+wAvxw4JiJW145PrD15Pgu4ETgR+EFEXAMMAf+cmV8vuSZJkiqv1ADPzCHglBGn\nb5+s95ckqVW5kIskSRVkgEuSVEEGuCRJFWSAS5JUQQa4JEkVZIBLklRBBrgkSRVkgEuSVEEGuCRJ\nFWSAS5JUQQa4JEkVZIBLklRBBrgkSRVkgEuSVEEGuCRJFWSAS5JUQQa4JEkVZIBLklRBBrgkSRVk\ngEuSVEEGuCRJFWSAS5JUQQa4JEkVZIBLklRBBrgkSRVkgEuSVEEGuCRJFdRZZucR0QGcC8wDtgLL\nMnPdiGtmAv8XOCkzby+zHkmSWkXZM/ClwIzMXAgsB1bUN0bEYcB1wMEl1yFJUkspO8AXAasAMnMN\nMH9E+14UIX9byXVIktRSyg7w2cCGuuPtEbHzPTPz+sz8PdBRch2SJLWUUj8DBzYC3XXH0zJzx552\nNnfuTDo7p+/26wYGuvb0LVUBPT1d9PZ2j3/hBHJMtTbHlCZaGWOq7ABfDRwLXBYRC4C1D6ezgYEt\ne/S6/v5ND+dtNcX192+ir29w0t9TrcsxpYn2cMbUaMFfdoBfDhwTEatrxydGxPHArMxcWXfdUMl1\nSJLUUkoN8MwcAk4ZcfrPviqWmUeVWYckSa3GhVwkSaogA1ySpAoywCVJqiADXJKkCjLAJUmqIANc\nkqQKMsAlSaogA1ySpAoywCVJqiADXJKkCjLAJUmqIANckqQKMsAlSaogA1ySpAoywCVJqiADXJKk\nCjLAJUmqIANckqQKMsAlSaogA1ySpAoywCVJqiADXJKkCjLAJUmqIANckqQKMsAlSaogA1ySpAoy\nwCVJqqDOMjuPiA7gXGAesBVYlpnr6tpfCrwfuB+4MDNXllmPJEmtouwZ+FJgRmYuBJYDK4YbIqKz\ndnw08HzgbyOit+R6JElqCWUH+CJgFUBmrgHm17U9FbgjMzdm5v3AfwBHllyPJEktodRb6MBsYEPd\n8faImJaZO3bRNgjMKauQzRv6yup6yrl3sJ9H3LOx2WVMik1N/
DnbZUy103gCx9RkcExNjLIDfCPQ\nXXc8HN7DbbPr2rqB/xyrs97e7o49KaK399lc85Vn78lLpV1yTGmiOaa0u8q+hb4aeDFARCwA1ta1\n/RJ4YkT8RUTsRXH7/PqS65EkqSV0DA0NldZ53VPoz6idOhE4DJiVmSsj4iXAaUAH8C+ZeV5pxUiS\n1EJKDXBJklQOF3KRJKmCDHBJkirIAJckqYIMcEmSKqjs74GrBBFxKPBxYCYwC/g2cC3wlsw8voml\nqSIi4vHAWcABwL3AFuDUzPxFUwtT5UXEYuCt9b+LIuJM4JeZefFu9vWHzNx/omtsFQZ4xUTEHOBL\nwNLMXFf7qt5XgD8AfqVA44qIvYFvAG/OzB/Xzs0HPg0c1cza1DIm6neRv9PGYIBXz8uAq4d3dcvM\noYh4A/A8YDFARLwdeAXFDP2e2t8/B1ySmd+OiKcAn8jMY5tQv5rvpRRj6MfDJzLzJ8BREXEh8Cig\np3bdWcBjgf2Bb2bm+yPiFcD/BrYBd2fmayLiecAnaue2AK/KzM2T+UNpStnVqpnTI+ICHhxP38jM\nD4wYc39DMeYOAdYBMyap3kryM/DqeQzFwN4pM7dQ/OIc9qjMfEFmHgE8gmITmfOBN9XaTwLcurV9\nPR64c/ggIq6IiGsi4jaKW+pXZ+YiiqWOr8/MFwGHA2+tveQ1wFmZeSRwZe2u0MuASyl2FjwPmDtZ\nP4ympKMi4nu1P9cAxwMP8NDxdErd9cNjbgkP3cFy5mQXXiXOwKvnN8BDFkyOiIN46E5u2yLiS8Bm\nil/Ij8jM6yLinIjYB/grin8cak+/pW5nwMxcChAR1wO/A7LW1A88NyKWUGw2tFft/LuA5RHxDool\nka8AzgDeB1xd6+NH5f8YmsKuzszXDh9ExBkU/4fwabsYT/DgmHsy8GOAzPxtRPx2kuqtJGfg1XMl\n8MKIOBggIh5Bsa96X+346RSfjx8PvAOYzoO3sz4PfAr4TmY+MNmFa8r4OvCCiHju8ImIeCLFrc0D\ngeENh94EDGTm6ynG2PBs6G+B0zJzCcXvkJcDrwMuzMyjgF/UrpGGddT+7Go8wYNj7hfAEQAR8RiK\nMalROAOvmMwcjIg3AhfUHmDrBr4J3EYxC78D2BQRP6D4B3M3xW13gIuAjwBPm/TCNWVk5uaIeCnw\n8YjYj+Jjlu3AO4GX1F16NfDFiDiC4iOa2yNif4oZ0rciYpBiJnUl8CTgXyJiM8WtUgNc9YYoxthf\n72I87XxQLTO/HhHH1O4G3QX8sSnVVoRrobeRiDgA+FxmHtPsWiRJD4+30NtERLwcuAr4QLNrkSQ9\nfM7AJUmqIGfgkiRVkAEuSVIFGeCSJFWQAS5JUgX5PXCpxUXEq4D3Uvx77wA+n5mf2I3Xn0yxytql\nmXlqOVVK2l3OwKUWVlvN6hPA0Zn5TIpVrl4dEbuzkc1rgGWGtzS1+DUyqYVFxDMovv+/IDN/Vzt3\nCHAf8F1gcWbeVdvD+YOZuaS2+UQ/xY5QX6TYeewPwP8EuoB3A48E9qYI9v+IiGdSbGKyd+21J2Tm\n3RFxKvDfKSYL38nM907Wzy61OmfgUgvLzJsp9v5eFxFrIuJjQGdm/oo/32u5/vjnmfnUzPww8BPg\nzcB3gLcAL8nMZwEfB95Tu/4S4PTMnAf8G/B3EfFC4DCKjVOeDTw2Il6LpAlhgEstLjPfRrFJybm1\n/15fW5lvLGtGHHdk5hDF3vJ/HRGnU2x20hURjwL2y8xv197vs7Xb7UcDzwVuBG6iCPNDJ+ankuRD\nbFILi4gXA12Z+WWKzWwuiohlFDPqIR7cqe4RI1567y76mgXcAFwMXAfcDLwduL+uHyJiBsUGOtOB\nT2bmJ2vnZ1NsaCFpAjgDl1rbFuCMiDgQoLaD3SEUM+J7eHBG/LIG+noy8EBmngFcA7wImJ6ZG4G7\nIuIFteveAJxOsZvZGyJiV
kR0Umxj+qqJ+bEkGeBSC8vMaynC9MqI+CXFfsvTgA8BHwQ+FRFrgIG6\nl4322fjPgZ9HRFLcFh+kuCUP8HrggxFxE3Ac8J7M/BZwGcXt+JuBmzLz4gn9AaU25lPokiRVkDNw\nSZIqyACXJKmCDHBJkirIAJckqYIMcEmSKsgAlySpggxwSZIq6L8AOiDOSd4hjMIAAAAASUVORK5C\nYII=\n", "text/plain": [ "<matplotlib.figure.Figure at 0x11c3f1e90>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "win_by_Surface_before_strat = pd.crosstab(df3.win, df3.Surface).apply( lambda x: x/x.sum(), axis = 0 )\n", "win_by_Surface_before_strat = pd.DataFrame( win_by_Surface_before_strat.unstack() ).reset_index()\n", "win_by_Surface_before_strat.columns = [\"Surface\", \"win\", \"total\" ]\n", "fig2 = sns.barplot(win_by_Surface_before_strat.Surface, win_by_Surface_before_strat.total, hue = win_by_Surface_before_strat.win )\n", "fig2.figure.set_size_inches(8,5)" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAfAAAAFICAYAAACvNaz+AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3Xu8HVV99/HPSVIQcjORSAARBOEnDyL1iiAFoaKVgg9i\nLaVeEShCtbUUtVjUChXqhfhYLHIrCvVSlEqtVkEqIJgKgoLcfwFCwAtKMDEXAoQk5/ljzU42h3PZ\n5+TMzplzPu/XK6+cmb33zFp7z57vrDWzZ/X09vYiSZKaZdKmLoAkSRo+A1ySpAYywCVJaiADXJKk\nBjLAJUlqIANckqQGqj3AI2KviLi6n/mHRsSPI2J+RBxTdzkkSRpPag3wiHg/cD6weZ/5U4B5wGuA\nVwN/ERFz6iyLJEnjSd0t8HuBN/YzfzfgnsxcnplPAj8E9qu5LJIkjRu1BnhmXgas6eehGcCytukV\nwMw6yyJJ0ngyZROtdzklxFumA78b6kVr1qztnTJlcm2FkiRpDOrpb2a3Arzvyu8Cnh8RzwRWUbrP\nPzXUQpYuXVVD0SRJGrvmzJne7/xuBXgvQEQcCUzNzAsi4kTge5RwvyAzH+pSWSRJaryeJo1Gtnjx\niuYUVpKkUTBnzvR+u9C9kYskSQ1kgEuS1EAGuCRJDWSAS5LUQAa4JEkNZIBLktRABrgkScP05S9f\nxAMPLNqkZfB34JIkjWH+DlySpGE69th3sHr1au6/fyEHH/yHANx880845ZQPcuedt3Phhefxj//4\nUU488b0ce+w7ePjh33StbAa4JEkDePnL9+KWW37KTTf9mDlz5nDvvfdw/fX/y4oVK9Y/59nP3pp5\n885iv/1ezQ9+cHXXymaAS5I0gH322Zcbb7yBW2+9hbe85Z389Kc3cvfddzJnzpz1z9l55+cDsNVW\nc1i9+omulc0AlyRpALvvvgeZd/Hkk6vZe+9XceWVl7P11nOZNKk9Pvs9RV27TTUeuCSpAdauXcui\nRQtrX8+OO+7E5MmTa1/PcPX09LD11nPZbrvnMH36dHp74VWv2o
/5869d//gmK5tXoUuSBnLffffw\n4a+fyrStZtS2jpWPLOe0N3+EnXfepbZ1NNlAV6HbApckDWraVjOYOXfWpi6G+vAcuCRJDWSAS5LU\nQAa4JEkN5DlwSdKEUseV9ZviKnoDXJI0oSxatJCTz7yEqTPnDP3kDjy6bDFn/O0RXb+K3gCXJE04\nU2fOYcbsbbq2vt7eXs4885+499572GyzzfjgB09hu+2es1HL9By4JEk1u/baa1i9ejXnnHMhxx33\nHj73uc9s9DINcEmSanbrrbew1177ALD77i/k7rvv2uhlGuCSJNVs1apHmTZt2vrpyZMns27duo1a\npgEuSVLNttxyKqtWPbp+et26dX0GRBk+L2KTJE04jy5b3NVlvehFezJ//nUccMBruP3229YPQbox\nDHBJ0oSy4447ccbfHjHqyxzMfvsdwI033sDxx78LgJNP/uhGr9MAlyRNKJMnT+76b7Z7eno46aST\nR3WZngOXJKmBDHBJkhrIAJckqYEMcEmSGsiL2CRJE4qjkUmS1ECLFi3kw18/lWlbzRiV5a18ZDmn\nvfkjjkYmSVLdpm01g5lzZ3V9vXfccTvnnHMWZ5117kYvywCXJKkLvvKVi7niiu+wxRZbjsryvIhN\nkqQu2G677Tn99E+P2vIMcEmSumD//Q8Y1QvdDHBJkhrIc+CSpAln5SPLN9myent7R2W9BrgkaULZ\nccedOO3NHxn1ZXaqp6dnVNZpgEuSJpRNMRpZy9y523DOOReOyrI8By5JUgMZ4JIkNZABLklSA3kO\nXNKg6hj4YSCbYkAIqakMcEmDGu2BHwayqQaEkJrKAJc0pE018IOkgdUa4BHRA5wN7Ak8DhyTmQvb\nHn8LcCKwBvhCZp5TZ3kkSRov6m6BHwZsnpn7RMRewLxqXsungN2AVcCdEfHVzFxWc5kkqWNeA6Cx\nqu4A3xe4HCAzb4iIl/V5/GfALKB1X7nRub+cJI0SrwHQWFV3gM8A2lvUayJiUmauq6bvAH4CrAS+\nkZmjd3NaaRR1qxVmC2xs8hoAjUV1B/hyYHrb9Prwjog9gD8GdgAeBb4cEW/KzP8YaGGzZm3JlCnu\n3NR9CxYsqL0VtvKR5fzLcZ9k1113rW0dI7F06bSurWv27GnMmTN96Cd2kfXvTv3HYt3HuroDfD5w\nCHBpRLwSuK3tsWWUc99PZGZvRDxM6U4f0NKlq2orqDSYJUtWdqUVtmTJShYvXlHrOoZryZKVXV2X\n9Z+Y9R+LdR8rBjqwqTvALwMOioj51fRREXEkMDUzL4iI84AfRsQTwH3AF2sujyRJ40KtAZ6ZvcDx\nfWYvaHv8XODcOssgSdJ45L3QJUlqIANckqQGMsAlSWogA1ySpAYywCVJaiADXJKkBjLAJUlqIANc\nkqQGMsAlSWqgum+lKklSY43lkQgNcEmSBtCN8eBHOha8Ad6hbh2FgWNCS9JYMlbHgzfAO9SNozAY\n+ZGYJGliMcCHYawehUmSJh6vQpckqYFsgasjY/lKTEmaiAxwdWQsX4kpSRORAa6OeQ2AJI0dngOX\nJKmBDHBJkhrIAJckqYEMcEmSGsgAlySpgQxwSZIayACXJKmBDHBJkhrIAJckqYEMcEmSGsgAlySp\ngQxwSZIayACXJKmBDHBJkhrIAJckqYEMcEmSGsgAlySpgQxwSZIayACXJKmBDHBJkhrIAJckqYEM\ncEmSGsgAlySpgQxwSZIayACXJKmBDHBJkhrIAJckqYEMcEmSGsgAlySpgQxwSZIaaEqdC4+IHuBs\nYE/gceCYzFzY9vjLgTOryV8Db83M1XWWSZKk8aDuFvhhwOaZuQ9wMjCvz+PnAe/MzP2Ay4Edai6P\nJEnjQt0Bvi8lmMnMG4CXtR6IiF2B3wInRsQ1wOzMvKfm8kiSNC7U2oUOzACWtU2viYhJmbkO2ArY\nGzgBWAh8OyJuysxrBlrYrF
lbMmXK5DrLO6ClS6d1bV2zZ09jzpzpXVtfJ7pV/7FYd5jY9Xfbt/7d\nMBbrDmO7/nUH+HKgvUSt8IbS+r43MxcARMTllBb6NQMtbOnSVTUVc2hLlqzs6roWL17RtfV1olv1\nH4t1h4ldf7d969+t9Yy1usPYqP9AwV53F/p84GCAiHglcFvbYwuBaRGxUzX9B8AdNZdHkqRxoaMW\neERMAw4AdgHWAfcC/5OZjw/x0suAgyJifjV9VEQcCUzNzAsi4mjgqxEB8L+Z+d2RVEKSpIlm0ACP\niC2BjwKHA7cCDwBPAvsAn4mIbwCnZWa/fQyZ2Qsc32f2grbHrwH2GmnhJUmaqIZqgX+J8lOvk9vO\nXQMQEZOAQ6rnHFZP8SRJUn+GCvA3Va3op6kC/b8i4lujXyxJkjSYoQL8w9X56X5l5qkDBbwkSarP\nUAHe05VSSJKkYRk0wDPzY/3Nr+5x/rxaSiRJkobU6c/I3gOcDkxtm30/8Pw6CiVJkgbX6Y1c/pYy\notglwM7A0cANdRVKkiQNrtMAfzgz76f8FnyPzPwiMPDVbZIkqVadBvijEXEAJcAPjYi5wKz6iiVJ\nkgbTaYC/F3gDZWjQZwF3A2fVVShJkjS4Tkcj2zYz/6b6+00AEXF4PUWSJElDGepe6EcAmwOnRsRH\n+rzuQ8A3aiybJEkawFAt8BmUgUumU0Yja1kD/H1dhZIkSYMb6kYu5wPnR8QfZub3I2I6MDkzf9ed\n4kmSpP50ehHbooj4MbAIWBgRN0fErvUVS5IkDabTAD8H+GRmPiszZwNnUIYZlSRJm0CnAb5VZl7a\nmsjMrwGz6ymSJEkaSqcB/kREvKQ1EREvBVbVUyRJkjSUTn8H/j7gPyJiCWWI0dnAEbWVSpIkDarT\nAE9g1+rfpGp6m7oKJUmSBjfUjVy2p7S4vwO8HlhRPfScat4Lai2dJEnq11At8I9RbuCyLXBt2/w1\nwLfrKpQkSRrcUDdyeRdARHwwMz/RnSJJkqShDHoVekScEREzBwrviJgdEQa7JEldNlQX+teAb0bE\nryhd6L+gdJ/vABxI6Vp/X60llCRJTzNUF/rNwKsj4gDKeOCHAOuA+4BzM/Oq+osoSZL66uhnZJl5\nNXB1zWWRJEkd6ijAI+J1wD9SbuDS05qfmTvVVC5JkjSITm/kchZwInA70FtfcSRJUic6DfBHMtPf\nfUuSNEZ0GuDXRcQ84HLg8dbMzLx24JdIkqS6dBrgr6j+f3HbvF7KT8kkSVKXdXoV+gF1F0SSJHWu\n06vQ9wXeD0yjXIU+GdghM3esr2iSJGkgg95Ktc0FwH9SAv9fgHuAy+oqlCRJGlynAf5YZn4BuAZY\nChwL7F9XoSRJ0uA6DfDHI2I2kMArM7MXmFpfsSRJ0mA6DfB5wCXAt4C3R8QdwE21lUqSJA2qowDP\nzK8Dr83MFcBLgbcCb6uzYJIkaWAdBXhEzALOi4irgGcA7wVm1lkwSZI0sE670M8HbgSeBawAHgK+\nVFehJEnS4DoN8Odl5nnAusxcnZl/DzynxnJJkqRBdBrgayJiJtVIZBGxC7CutlJJkqRBdXov9I9S\nfgO+fUT8J7A38K66CiVJkgbXaQv8J5Q7r90PPBf4BuVqdEmStAl02gL/DnAr0D4meM/oF0eSJHWi\n0wAnM4+usyCSJKlznQb4f0bEMcBVwJrWzMx8sJZSSZKkQXUa4DOBvwMeaZvXC+w02Isiogc4G9gT\neBw4JjMX9vO8c4HfZuaHOiyPJEkTWqcB/ibg2Zn52DCXfxiweWbuExF7Ue6pflj7EyLiOOCFwA+G\nuWxJkiasTq9CXwjMGsHy9wUuB8jMG4CXtT8YEXsDLwfOHcGyJUmasDptgfcCd0bE7cDq1szMPHCI\n180AlrVNr4mISZm5LiLmUn5ffhhwRCeFmDVrS6ZMmdxhkUfX0qXTurau2bOnMWfO9K6trxPd
qv9Y\nrDtM7Pq77Vv/bhiLdYexXf9OA/zjwy8OAMuB9hJNyszWHdzeTLm3+neAbYAtIuLuzLx4oIUtXbpq\nhMXYeEuWrOzquhYvXtG19XWiW/Ufi3WHiV1/t33r3631jLW6w9io/0DB3lGAZ+ZIz0/PBw4BLo2I\nVwK3tS3zLOAsgIh4BxCDhbckSdqg49+Bj9BlwEERMb+aPioijgSmZuYFNa9bkqRxq9YAz8xe4Pg+\nsxf087yL6iyHJEnjTadXoUuSpDHEAJckqYEMcEmSGsgAlySpgQxwSZIayACXJKmBDHBJkhrIAJck\nqYEMcEmSGsgAlySpgQxwSZIayACXJKmBDHBJkhrIAJckqYEMcEmSGsgAlySpgQxwSZIayACXJKmB\nDHBJkhrIAJckqYEMcEmSGsgAlySpgQxwSZIayACXJKmBDHBJkhrIAJckqYEMcEmSGsgAlySpgQxw\nSZIaaMqmLoA2ztq1a1m0aGHt63nwwQdqX4ckqXMGeMMtWrSQk8+8hKkz59S6nsW/SLbdv9ZVSJKG\nwQAfB6bOnMOM2dvUuo6VyxYDD9W6DklS5zwHLklSA42LFng3zgN7DliSNJaMiwDvxnlgzwFLksaS\ncRHgUP95YM8BS5LGknET4JKkicOf0BrgkqQG8ie0BrgkqaEm+k9o/RmZJEkNZIBLktRABrgkSQ1k\ngEuS1EAGuCRJDWSAS5LUQAa4JEkNZIBLktRAtd7IJSJ6gLOBPYHHgWMyc2Hb40cCfw08CdyWmSfU\nWR5JksaLulvghwGbZ+Y+wMnAvNYDEfEM4FRg/8z8A+CZEXFIzeWRJGlcqDvA9wUuB8jMG4CXtT32\nBLBPZj5RTU+htNIlSdIQ6r4X+gxgWdv0moiYlJnrMrMXWAwQEe8Fpmbm/9RcHo0zjkgkaaKqO8CX\nA9Pbpidl5rrWRHWO/JPALsDhQy1s1qwtmTJl8tPmL106beNLOobMnj2NOXOmD/1EJnbdARYsWDCu\nRiQabv27oZvbmPWfuPUfbt0n+r4P6g/w+cAhwKUR8Urgtj6Pnwc8lpmHdbKwpUtX9Tt/yZKVG1PG\nMWfJkpUsXryi4+eOJ8Ope+v542lEouHWvxu6uY0Nt/7d6IHpZu/LRP78R/LdH08Gq/9AwV53gF8G\nHBQR86vpo6orz6cCPwGOAq6LiKuBXuCzmfnNmsskaZzoxpjQY3k8aE1stQZ4dZ77+D6zF3Rr/ZLG\nv7p7YMbyeNCa2AxQqcHGWxeypM4Z4FKD2YUsTVwGuNRwdiFLE5P3QpckqYEMcEmSGsgAlySpgQxw\nSZIayACXJKmBDHBJkhrIAJckqYEMcEmSGsgAlySpgQxwSZIayACXJKmBDHBJkhrIAJckqYEMcEmS\nGsgAlySpgQxwSZIayACXJKmBDHBJkhrIAJckqYEMcEmSGsgAlySpgQxwSZIayACXJKmBDHBJkhrI\nAJckqYEMcEmSGsgAlySpgQxwSZIayACXJKmBDHBJkhpoyqYugCRp+NauXcuiRQtrX8+DDz5Q+zo0\nMga4JDXQokULOfnMS5g6c06t61n8i2Tb/WtdhUbIAJekhpo6cw4zZm9T6zpWLlsMPFTrOjQyngOX\nJKmBDHBJkhrIAJckqYEMcEmSGsgAlySpgQxwSZIayACXJKmBDHBJkhrIAJckqYEMcEmSGsgAlySp\ngQxwSZIaqNbBTCKiBzgb2BN4HDgmMxe2PX4o8GHgSeALmXlBneWRJGm8qLsFfhiweWbuA5wMzGs9\nEBFTqunXAK8G/iIi6h0XT5KkcaLuAN8XuBwgM28AXtb22G7APZm5PDOfBH4I7FdzeSRJGhfqHg98\nBrCsbXpNREzKzHX9PLYCmDnSFT26bPFIX9qRx1Ys4fceWV7rOgBWjmAdddcdulP/kdQdrP9E3vZh\nYtffbX9i17+nt7d3lIuyQUScCfwoMy+tph/MzOdWf+8B
/FNm/nE1PQ/4YWZ+o7YCSZI0TtTdhT4f\nOBggIl4J3Nb22F3A8yPimRGxGaX7/Ec1l0eSpHGh7hZ46yr0F1WzjgJeCkzNzAsi4o+BjwI9wL9m\n5jm1FUaSpHGk1gCXJEn18EYukiQ1kAEuSVIDGeCSJDWQAS5JUgPVfSOXTSYi9qL8zvyAQZ6zPbBn\nZn67z/z7gQeAdZT3aCpwbGb+tIZyfhX4fGZeu5HLmQJcCOwIbAZ8PDO/1eFrfwQckZkPts37AvAS\n4LeUA73ZwLzM/OLGlHOA9R8HbJ2Zp27kciYB5wNB+ezenZl3DvL8zYG7M/N5feY37vPvs8xnAzcB\nr8nMBRHxQuCZmfnDqm6RmasHeO1HgT8Hfkn5dchs4N8z84zRKl/bul4H/FlmHjXC178UOB3YgrKN\nXg2cWt3ZcaRlOha4MDPXdvDcycCVlO/b14H7+u5LOljGQ5m5zYgK2//yPki5PfXvAWuB9490u42I\nrwBvB7YHvgNcDyyl7Ad+MYzl7E/5Lh45knIMV0R8mvJrp7nAlsB9wGLKL6KGXY6q/F8D7qB8J3qB\nrwA/B7Yf7hgeQ30Hh2NcBnhEvB94G7ByiKceCLwA6Pul6wUOau0IIuK1wMeAQ0e5qKPprcAjmfn2\niJgF3AJ0FOCDOCkzrwSolnkH8MWNXGadDgV6M3Pf6kt3OuV+/ANpfRn7auLnD6w/kDsHWNU2+03A\nQ5TbFXfys5MzM/O8anmbAXdGxPmZ+chol7fD8jxNRGwH/BtwaGbeV837MPAZ4D0bUZ4PARdRwm8o\n2wHTM/PlG7G+UfsZUETsBrwhM19VTb+IUpcXj2R5mfnn1XL2Bb6dme/fiOJ17edOmXkSQES8gxKU\nH6qm99+Icny/9X6MglF7L8ZlgAP3Am+kfMEBiIgTKEeTa4EbgROBvwO2iIj5/Rw5t59e2AFYUi3n\nIOA04DFK6/RdlC/I+iO71lF11Yp9gtIqngu8MzNviYi/BI6m7FRHawCXr1FaAa2yt8LnakqYvxCY\nDrw5M38eER8HXgv8AnjWAMtsfw+2odSZiNiB0tqfTNkY/yozb2tvTbRalsDzKDfz2RLYCfhEZl5c\n7RT+H+V9Xcso3MQnM78ZEa2Dlh0prYXWe/AwMAv4E+Bi4JmUI/OBNO3zb/k05X0/uSrLtsA7gSci\n4mbKQcvnI2Inymf3xsxc1mcZPW1/b0XZTzwWETOBL1FugzwZOCUzr2lvUUTEGZSbND0AfBBYTdkG\nLsnM0yPiBZRtZyXlIGPJCOv5NuD8VngDZOZpEbEwIq4H3l71Pqzv3YmI0ykts2cBP8vMo6seh30o\nvSxfoXxO/w4cXj1/36qu8zLzP/psS2uAXSLi88Cvq393D1Dv3SmDN02q3tPjM/P6Vtn77p8y830j\neE+WAdtHxLuAyzPz1oh4RdUD88/Vc1rb7Eso28gTwHOAcykNmhcBn83Mc6vP9Q8oBzVbRMR9wBHA\nccCRVf2eDTwX+JvMvDIi3gT8JWWb6aXsh9ervhM7UXpNPpuZXx5BPTfGrhHx35RyfzszP9bf+5OZ\nK/q8rqfPdOsA4QWUA+avAg8Czwd+nJknVAeZnwc2p+w/T8nM/+pvWSM1Ls+BZ+ZllC9Xu3cAf1kd\nnd5Vzfsn4Cv9hHcPcEVE3BARPwdeDpxUPXYucFjVNf8DynCo8NSjqva/F2XmHwGfo4y49mzgr4BX\nAP+X0v220TJzVWY+GhHTKUH+920P35CZBwH/AxxZdT3uW7Uc3k4J9v58IiKujYgHgDMp4QclJD6T\nma8G3kfZIcPAR5YzMvNQSn3/rpp3NqXb/rXA/cOs7oAyc11EfBH4LNC+c/hyta5jgduqsp87wGIa\n9/kDRMQ7gYerXpMegMz8FaXXZF5m3lg99YKq/A8AB/WzqBMj4upqh/3vwNGZ+ShwCvC9zNwf+FPg\nX4co0nMpO/C9gQ9U
8z5F2ZG9FvjfEVW02BFY2M/83wBb951ZfS+WZObrKJ/n3hHR6rq+MzP3zcyz\nKQdVR0TEHwE7ZuZ+lGA7pTqAgbLPeC1wfPXa46v5rc+9v3rvDpxYfQ8/SbmpVbun7J+q00HDUn3W\nbwBeBfwoIu6k9BqdD5yQmQcC36UcYEDpQXgjcAJlf/EWysH2cW31eZgN+8lzeOq2/XhmHkzZB/xN\nNW9X4ODqfbsLeF3ryRExjXJAdDjwejrr5Rhtm1O+d/tRDjRg4Pen3YERcVX1vbiqukkZbHg/dqEc\nGL0COLj6nr8A+HS1zR3Xtr5RMy4DfADvAt5THUHvwOB1b3Wh7kXpgpqamYsjYitgWWb+unredcD/\n6ef17UdYN1f//xx4BrAzcHtmrsnMNZTegFFRndO/CrgoMy8ZpAy7Us6RUh1p3j7AIj9QfRHfDWzL\nhh3mbpS6k5k/oxzBw1Pr3f73LX3WD6VV1Go9ze+kfp3KzHdS6nhBRGxRzV5Q/b8r8OPqeT+m6qno\no5GfPyUUDqq28d8HLq52JH21zon+mtIz0teZVcC/mRKG91TzdwOuhfVhsbyf5bfX/bbM7M3MVWzo\n0t+VDXXemM/9Qcp7uV61U30uJXT6lucxYOuI+DLlIGwq5TwxQPZThz2Al0XEVZQRFadQDhr6e35f\n/dX7l8BHqhbon7Stu1W+vvunYbfSImJnYEVmHp2ZO1BOq51D2UbPrupyFOW7DGU7XAf8jnL+fi2l\n16r1He2vDINt21De+4si4kLKe9iqJ5m5khL051MODDcfbh1HQeu79xgbGnm70f/70+77mXlgZh5Q\n/d+3sXJv1YhaB/yK8n48BLw7Ii6i7EN/j1E23gO8fWM7Fjiu2jG9hHJ0vI7SPdbf61qv/TCwXUQc\nX50DnBERrSP8/SnB8DjVh151L89uW1bfD/oeYPeI2Ly6CGZE56f6qsp0BSV0L+rzcN8y3Ek5UiQi\nptJ/CK2Xmd8Fvkn54rVev1/1+t+nBAHAlIjYsjpvuvsg6wf4RURE9ffGnENcLyLeGhGtFv7jlCP8\nddV06/87KV2mRMSL6f9L1bjPHyAz9692MAdQDprelpkPU+re/l3v6BxcloufPgFcUoXjXWz43Lej\ndCM/QgnHbarn/P4Ai2u9n3dQvf9s3Od+MXB0ROwcZTyFK4ALKNez/JYNO+GXVP+/nnLB0VuouoTb\nyrRuw2LX7xPuBq6qWmUHUk5R3dfP8zv1z8BHslywdxtPD8e++6d9GL4XAZ+LiNY2fS8lnO+hnFI4\nkNK6bPU4tm8HI+nWfcp2FBEzKNeK/BlwDOV70dP2+NbASzPzcOAQ4FMj6WnYSP1t+3fT//szUq06\nn0ZpTL2DcoHlqHWdt4z3AG//sG4DfhgR36d0s91QzXtDRPzpQK+rjrSOoXShzaV80S6LiOuAP6R8\nSDcBv4tyNfc/sKGl+rSNpQqBT1DO+f43Q19o16mTKed1P9zWzfOMAcrwM+DyiLiRcu7mN/0sr+/r\nTgN2i4jXU7qT3xsRPwD+hdJ6gHJO+3rKzm7REOV9N/BvEXElpdU0Gr4BvLgq13eBv87MJ3hqXc4B\ndoqIayldh0/0s5wmfv791aG1w/gJpXX3agbu6u93XmZeSDm3+m7g45SuxB9Q3utjqxbHpyjv97d5\n6jnt/tZ1EuW9vJLqIHIkslwF/VbK9vctynnluZQDsospLarvsmEfdwPlc78GuJTyGW3bt76UC/3+\nO8svOB6ttpObKBdHruzn+f3p7zlfAi6t3rtd2HCA0Xpuf/unYalOHV4L3Fhtn9+lvN/HUr5r1wFn\nALd2WObB5vW3bS+nvH/XU3qnVtHWms3M3wBzI2I+8D3gk9X2s6mdwNDvz1D629a/DpxZbXMHseFa\no1G7iM17oUsaN6oLkhZW3dfSuGaAS5LUQOO9C12SpHHJAJckqYEMcEmSGsgAlySpgQ
xwSZIaaLze\nC11SH9VNZhawYVSlSZTb6F6cmf9Q0zr3B/4hBxkVUNLIGODSxPLLzGzdnYzqfuD3RMRXM3OoW4SO\nlL9VlWpggEsTW+tOWSsi4kOUAS3WUO6U9QHKXfKuyWrM9Cijd/VmGd3rV5S7mu1Luaf8n2bmA1GG\nX51HucVqXQcF0oTnOXBpYtkuIn4aEXdFxGLgVMqIVHtS7k/94urfLpTbp8LALei5wJVVi/46yu1a\nN6OMfnaW6AV+AAABC0lEQVR4NdrdY7XVRJrgDHBpYvllZr4kM3ej3DN8M8pACwcCX83M1dX9qS+k\n3Ot9KFdU/99OGcRlj2odrdHf+g6sI2mUGODSxPUBynChJ/H0kZJ6KKfYennqfuIpo7dl5urqz9bg\nKb08dYS/NUiqhQEuTSzrg7oa//n9lOE1bwaOjIhnRMQUyrjIV1GGo3xmRDwrIjYH/miI5d8KzImI\nParpI0e7ApIKA1yaWPoOF3oFZWjT/SnDgd5EGdpyEfC5aojIT1Xzv8dTh7nsb0jJNcCfA1+KiJso\n425LqoGjkUmS1EC2wCVJaiADXJKkBjLAJUlqIANckqQGMsAlSWogA1ySpAYywCVJaqD/D87pPmAr\nELF6AAAAAElFTkSuQmCC\n", "text/plain": [ "<matplotlib.figure.Figure at 0x11c044690>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "win_by_round_before_strat = pd.crosstab(df3.win, df3.Round).apply( lambda x: x/x.sum(), axis = 0 )\n", "win_by_round_before_strat = pd.DataFrame(win_by_round_before_strat.unstack() ).reset_index()\n", "win_by_round_before_strat.columns = [\"Round\", \"win\", \"total\" ]\n", "fig2 = sns.barplot(win_by_round_before_strat.Round, win_by_round_before_strat.total, hue = win_by_round_before_strat.win )\n", "fig2.figure.set_size_inches(8,5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Dummy variables\n", "------" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To keep the dataframe cleaner we transform the `Round` entries into numbers. 
We then encode the numeric rounds as dummy variables, dropping `Round_1` as the reference level." ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>Date</th>\n", " <th>Surface</th>\n", " <th>Round</th>\n", " <th>WRank</th>\n", " <th>LRank</th>\n", " <th>win</th>\n", " <th>P1</th>\n", " <th>P2</th>\n", " <th>Round_2</th>\n", " <th>Round_3</th>\n", " <th>Round_4</th>\n", " <th>Round_5</th>\n", " <th>Round_6</th>\n", " <th>Round_7</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>41984</th>\n", " <td>2015-01-21</td>\n", " <td>Hard</td>\n", " <td>2</td>\n", " <td>46</td>\n", " <td>31</td>\n", " <td>0</td>\n", " <td>46</td>\n", " <td>31</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>46044</th>\n", " <td>2016-08-07</td>\n", " <td>Grass</td>\n", " <td>6</td>\n", " <td>7</td>\n", " <td>3</td>\n", " <td>0</td>\n", " <td>7</td>\n", " <td>3</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>43308</th>\n", " <td>2015-06-29</td>\n", " <td>Grass</td>\n", " <td>1</td>\n", " <td>148</td>\n", " <td>93</td>\n", " <td>0</td>\n", " <td>148</td>\n", " <td>93</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>46497</th>\n", " <td>2016-08-29</td>\n", " <td>Hard</td>\n", " <td>1</td>\n", " <td>120</td>\n", " <td>54</td>\n", " <td>0</td>\n", " <td>120</td>\n", " <td>54</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>43345</th>\n", " <td>2015-06-30</td>\n", " <td>Grass</td>\n", " <td>1</td>\n", " <td>37</td>\n", " <td>32</td>\n", " <td>0</td>\n", " 
<td>37</td>\n", " <td>32</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Date Surface Round WRank LRank win P1 P2 Round_2 Round_3 \\\n", "41984 2015-01-21 Hard 2 46 31 0 46 31 1 0 \n", "46044 2016-08-07 Grass 6 7 3 0 7 3 0 0 \n", "43308 2015-06-29 Grass 1 148 93 0 148 93 0 0 \n", "46497 2016-08-29 Hard 1 120 54 0 120 54 0 0 \n", "43345 2015-06-30 Grass 1 37 32 0 37 32 0 0 \n", "\n", " Round_4 Round_5 Round_6 Round_7 \n", "41984 0 0 0 0 \n", "46044 0 0 1 0 \n", "43308 0 0 0 0 \n", "46497 0 0 0 0 \n", "43345 0 0 0 0 " ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df1 = df.copy()\n", "\n", "# Map each round label to an ordinal number (1 = first round, 7 = final)\n", "def round_number(x):\n", "    if x == '1st Round':\n", "        return 1\n", "    elif x == '2nd Round':\n", "        return 2\n", "    elif x == '3rd Round':\n", "        return 3\n", "    elif x == '4th Round':\n", "        return 4\n", "    elif x == 'Quarterfinals':\n", "        return 5\n", "    elif x == 'Semifinals':\n", "        return 6\n", "    elif x == 'The Final':\n", "        return 7\n", "\n", "df1['Round'] = df1['Round'].apply(round_number)\n", "\n", "dummy_ranks = pd.get_dummies(df1['Round'], prefix='Round')\n", "\n", "# Join Round_2 onwards; Round_1 is left out as the reference level\n", "df1 = df1.join(dummy_ranks.loc[:, 'Round_2':])\n", "round_cols = ['Round_2', 'Round_3', 'Round_4', 'Round_5', 'Round_6', 'Round_7']\n", "df1[round_cols] = df1[round_cols].astype('int_')\n", "df1.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We repeat this for the `Surface` variable." ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>Surface_Clay</th>\n", " <th>Surface_Grass</th>\n", " <th>Surface_Hard</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>41984</th>\n", " 
<td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " </tr>\n", " <tr>\n", " <th>46044</th>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>43308</th>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " <tr>\n", " <th>46497</th>\n", " <td>0.0</td>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " </tr>\n", " <tr>\n", " <th>43345</th>\n", " <td>0.0</td>\n", " <td>1.0</td>\n", " <td>0.0</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Surface_Clay Surface_Grass Surface_Hard\n", "41984 0.0 0.0 1.0\n", "46044 0.0 1.0 0.0\n", "43308 0.0 1.0 0.0\n", "46497 0.0 0.0 1.0\n", "43345 0.0 1.0 0.0" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dummy_ranks = pd.get_dummies(df1['Surface'], prefix='Surface')\n", "dummy_ranks.head()" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>Date</th>\n", " <th>WRank</th>\n", " <th>LRank</th>\n", " <th>win</th>\n", " <th>P1</th>\n", " <th>P2</th>\n", " <th>Round_2</th>\n", " <th>Round_3</th>\n", " <th>Round_4</th>\n", " <th>Round_5</th>\n", " <th>Round_6</th>\n", " <th>Round_7</th>\n", " <th>Surface_Grass</th>\n", " <th>Surface_Hard</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>41984</th>\n", " <td>2015-01-21</td>\n", " <td>46</td>\n", " <td>31</td>\n", " <td>0</td>\n", " <td>46</td>\n", " <td>31</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " </tr>\n", " <tr>\n", " <th>46044</th>\n", " <td>2016-08-07</td>\n", " <td>7</td>\n", " <td>3</td>\n", " <td>0</td>\n", " <td>7</td>\n", " <td>3</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " 
<td>0</td>\n", " <td>1</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>43308</th>\n", " <td>2015-06-29</td>\n", " <td>148</td>\n", " <td>93</td>\n", " <td>0</td>\n", " <td>148</td>\n", " <td>93</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>46497</th>\n", " <td>2016-08-29</td>\n", " <td>120</td>\n", " <td>54</td>\n", " <td>0</td>\n", " <td>120</td>\n", " <td>54</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " </tr>\n", " <tr>\n", " <th>43345</th>\n", " <td>2015-06-30</td>\n", " <td>37</td>\n", " <td>32</td>\n", " <td>0</td>\n", " <td>37</td>\n", " <td>32</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>0</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Date WRank LRank win P1 P2 Round_2 Round_3 Round_4 \\\n", "41984 2015-01-21 46 31 0 46 31 1 0 0 \n", "46044 2016-08-07 7 3 0 7 3 0 0 0 \n", "43308 2015-06-29 148 93 0 148 93 0 0 0 \n", "46497 2016-08-29 120 54 0 120 54 0 0 0 \n", "43345 2015-06-30 37 32 0 37 32 0 0 0 \n", "\n", " Round_5 Round_6 Round_7 Surface_Grass Surface_Hard \n", "41984 0 0 0 0 1 \n", "46044 0 1 0 1 0 \n", "43308 0 0 0 1 0 \n", "46497 0 0 0 0 1 \n", "43345 0 0 0 1 0 " ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Join the surface dummies; Surface_Clay is left out as the reference level\n", "df_2 = df1.join(dummy_ranks.loc[:, 'Surface_Grass':])\n", "df_2[['Surface_Grass', 'Surface_Hard']] = df_2[['Surface_Grass', 'Surface_Hard']].astype('int_')\n", "# The original categorical columns are no longer needed\n", "df_2.drop(['Surface', 'Round'], axis=1, inplace=True)\n", "df_2.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We now take the base-2 logarithms of the ranks of ${\\cal P}_1$ and ${\\cal P}_2$, then create a variable `D`, the absolute difference between the two log-ranks: $D = |\\log_2 {\\rm Rank}_1 - \\log_2 {\\rm Rank}_2|$." ] }, { "cell_type": "code", 
"execution_count": 27, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>Date</th>\n", " <th>WRank</th>\n", " <th>LRank</th>\n", " <th>win</th>\n", " <th>P1</th>\n", " <th>P2</th>\n", " <th>Round_2</th>\n", " <th>Round_3</th>\n", " <th>Round_4</th>\n", " <th>Round_5</th>\n", " <th>Round_6</th>\n", " <th>Round_7</th>\n", " <th>Surface_Grass</th>\n", " <th>Surface_Hard</th>\n", " <th>D</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>41984</th>\n", " <td>2015-01-21</td>\n", " <td>46</td>\n", " <td>31</td>\n", " <td>0</td>\n", " <td>5.523562</td>\n", " <td>4.954196</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>0.569366</td>\n", " </tr>\n", " <tr>\n", " <th>46044</th>\n", " <td>2016-08-07</td>\n", " <td>7</td>\n", " <td>3</td>\n", " <td>0</td>\n", " <td>2.807355</td>\n", " <td>1.584963</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>1.222392</td>\n", " </tr>\n", " <tr>\n", " <th>43308</th>\n", " <td>2015-06-29</td>\n", " <td>148</td>\n", " <td>93</td>\n", " <td>0</td>\n", " <td>7.209453</td>\n", " <td>6.539159</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>0.670295</td>\n", " </tr>\n", " <tr>\n", " <th>46497</th>\n", " <td>2016-08-29</td>\n", " <td>120</td>\n", " <td>54</td>\n", " <td>0</td>\n", " <td>6.906891</td>\n", " <td>5.754888</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>1.152003</td>\n", " </tr>\n", " <tr>\n", " <th>43345</th>\n", " <td>2015-06-30</td>\n", " <td>37</td>\n", " 
<td>32</td>\n", " <td>0</td>\n", " <td>5.209453</td>\n", " <td>5.000000</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>0.209453</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Date WRank LRank win P1 P2 Round_2 Round_3 \\\n", "41984 2015-01-21 46 31 0 5.523562 4.954196 1 0 \n", "46044 2016-08-07 7 3 0 2.807355 1.584963 0 0 \n", "43308 2015-06-29 148 93 0 7.209453 6.539159 0 0 \n", "46497 2016-08-29 120 54 0 6.906891 5.754888 0 0 \n", "43345 2015-06-30 37 32 0 5.209453 5.000000 0 0 \n", "\n", " Round_4 Round_5 Round_6 Round_7 Surface_Grass Surface_Hard \\\n", "41984 0 0 0 0 0 1 \n", "46044 0 0 1 0 1 0 \n", "43308 0 0 0 0 1 0 \n", "46497 0 0 0 0 0 1 \n", "43345 0 0 0 0 1 0 \n", "\n", " D \n", "41984 0.569366 \n", "46044 1.222392 \n", "43308 0.670295 \n", "46497 1.152003 \n", "43345 0.209453 " ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df4 = df_2.copy()\n", "# Replace each player's rank with its base-2 logarithm\n", "df4['P1'] = np.log2(df4['P1'].astype('float64'))\n", "df4['P2'] = np.log2(df4['P2'].astype('float64'))\n", "# D is the absolute difference between the two log-ranks\n", "df4['D'] = np.absolute(df4['P1'] - df4['P2'])\n", "df4.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Model 1: Logistic Regression" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We now use logistic regression to model the probability of a `win`." 
] }, { "cell_type": "code", "execution_count": 28, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYQAAAECCAYAAAD+VKAWAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3Xl4VGWC7/FvbansCYQAYd9fUAQUFQRFVGhFcWmXbpmZ\nXuz2Tk/37e47vT13nDtL33nuvXOfmWl7ZuzNHm3bme7rtIp0q7izCo2CIIKIL5skJGxJCNlTqeXc\nP6qikYakkFROLb/P8+SpOlWVyu9IrF/es7zH4zgOIiIiXrcDiIhIelAhiIgIoEIQEZEEFYKIiAAq\nBBERSVAhiIgIMAiFYIyZZ4xZd5bHbzXGbDXGbDbG3J/qHCIi0reUFoIx5nvAvwHBMx73Aw8CS4DF\nwJ8aYypTmUVERPqW6hHCAeDTZ3l8BrDfWttirQ0Dm4BFKc4iIiJ9SGkhWGtXAZGzPFUKNPdabgXK\nUplFRET65tZO5RbipdCjBDjtUhYREQH8g/RzPGcs7wWmGGPKgQ7im4v+sb83cRzH8XjOfCsREXdF\nozFaO8K0dnTT0h7/au3opr0zTHtXmI6uCO2dYTp67neF6QpF6AxF6eqO0BWKEEvhtHLP/eD2pD44\nB6sQHABjzAqgyFr7iDHm28ArxMviEWvtsf7exOPxUF/fmtqkLqqsLNH6ZbBsXr9sXjc49/p1dUdo\nbO6iqTVEc3s3p9tCNLd109zeTXNb/LGWjjCdobNtGT+3YMBHfp6PYJ6PYaX5BBP38wM+ggEfeQEf\nAb+XvICXgN9Hnt8bf8znJeD34k/cBnwe/P7EY14vfr8Xn9eD3+fF5/Pg98Zvk+XJsNlOnVz8pcwW\nWr/Mla3rFo3FaDjdRUfUYf8HjTS0dNHY3EVj4ra969wf9B6gpDBASVEeJQUBigoCFCe+ivJ7bv0U\n5vspCPb+8uHzDu7W+srKkrQaIYiIuCYSjXG0oZ2aE20cO9XO8cYOjp/q4GRTJ9GzbKvJ83upKMtn\nYlUpFWX5DC0JUlYcpLw4j7KiIKVFeZQWBQb9gz3VVAgiklVC4Si1J9uoOdFK9YlWqo+3UdfQRiT6\n8Q/+gqCfcSNKqKooZPLYIZQEfVSU5VNRlk9JQYBc3F+pQhCRjNbRFeFA3WlszWn2HTnN4eOtH/ur\n3+/zMqaymPEjSxg3vJhRw4oYWVFEaeFHH/rZuknsfKkQRCSjhMJR3jt8iverT2OPNHHkZBs9u0J9\nXg/jR5YwaVQp40eUfDgC8Puya9NOqqgQRCTtNbd3886BBnbub2DP4VOEIzEg/tf/1DHlTBtbjhlX\nzpRRZQTzfC6nzVwqBBFJSydOdfCWPcnOAw0cqmuhZyPQqGFFXDp1GDMnDmXSqFICfhXAQFEhiEja\n6OqOsO39k2zadYz9tfHZbTwemDa2nDlThzFn6jBGDCl0OWX2UiGIiKscx+FAXTOv7zrGtr0nCYWj\nAMwYP4QFM0cye8owigsCLqfMDSoEEXFFKBxl4ztHWbejjuOnOgCoKM3npnnjWDhzJMPKC1xOmHtU\nCCIyqDq6wqzZUcer247Q1hnG7/My/6IRXD2riunjh+DNweP/04UKQUQGRXN7N69uO8LaHbV0dUcp\nDPq5dcEEllw+hpLCPLfjCSoEEUmxptYQq7cc5vVdxwhHYpQW5XHrggksvnQ0BUF9BKUT/WuISEqE\nIzFefesIz20+TCgcZVhZPsvmjWPhJVXkBXSoaDpSIYjIgNt1sIEnXtvPiaZOigsCfPb6KVw9q0pn\nDKc5FYKIDJgTpzp4Ys1+dh1sxOvxsGTuGG6/ZiJF+TpsNBOoEETkgoXCUZ7d/AGvbD1CNOYwfVw5\nf7
R0GmMqi92OJudBhSAiF6TmRCsPP7uHY40dVJQG+ez1U5lrKnNy+uhMp0IQkU8k5ji89lYtT68/\nQCTqcMPcMdy9eDJB7TDOWCoEETlvze3dPLr6Pd49dIqSwgBfvmUGsyYPczuWXCAVgoicl10HG3h0\n9V5aO8LMnDSUL988g7LioNuxZACoEEQkKZFojCfXHuC17bX4fR7uvWEqSy4fo6kmsogKQUT61d4Z\n5p+feof3DjcxalgRf3rrRYwbUeJ2LBlgKgQR6VNjcxf/85fbqD7eypwpw/jKbRfrqmRZSoUgIudU\nfbyVf376HZrburlh7hhW3DAVr1ebiLKVCkFEzmrngQYe/t0eusNR7r99JgtmDHc7kqSYJhYRkT+w\ndkctD63cheM4fO3Tl3D7osluR5JBoBGCiHzIcRyeWn+Ql96sobQwwDfvns2kUaVux5JBokIQESBe\nBk+uO8DLW49QVVHIn98zm0pdxjKnqBBEBIBVr3/wYRn89z+6jNIiXcUs12gfgojw/O8P8/zvDzO8\nvIDv3nupyiBHqRBEctzLW2t4ZuMhKkrz+d6KSxlSomkocpUKQSSHrd1Ry2/WHmBISZDvrZhDRVm+\n25HERSoEkRz1+jtH+dUr+ygtyuO7985h+JBCtyOJy1QIIjnojfeO88sX36e4IMB3751DVUWR25Ek\nDagQRHLM/trTPPr8XvKDfr7z2Tm6zKV8SIUgkkMam7v48TO7cRz4+qdnMn6kZiyVj6gQRHJEqDvK\nQyt30dIRZsWSqcyYMNTtSJJmUnpimjHGA/wEmA10Afdbaw/1ev6PgW8DEeAxa+3PUplHJFc5jsOj\nL+yl5mQb184ZxfWXjXY7kqShVI8Q7gCC1toFwAPAg2c8/4/A9cDVwHeMMWUpziOSk577/WHeev8k\n08aU8cdLp+HRVc7kLFJdCFcDLwFYa98ELj/j+XeAIUDPhClOivOI5Jzttp7fvv4BFaX5fO3OS/D7\ntKVYzi7VvxmlQHOv5YgxpvfP3ANsB3YDz1trW1KcRySnHDnZxiPPv0cw4OObd8+itFBTUsi5pXpy\nuxag92EMXmttDMAYcwlwCzAeaAd+bYy5y1q7sq83rKzM7qMitH6ZLZ3Wr7ktxI9X7SYUjvLAF67g\nsourLuj90mndUiHb1y8ZqS6EzcBy4GljzHziI4EezUAHELLWOsaYk8Q3H/Wpvr41JUHTQWVlidYv\ng6XT+sUchwd/s5OTTZ3ccfVEplZdWLZ0WrdUyIX1S0aqC2EVsNQYszmxfJ8xZgVQZK19xBjzc2CT\nMSYEHAR+meI8Ijnhla1HeO9wE7MmV7B84QS340iGSGkhWGsd4KtnPLyv1/MPAw+nMoNIrqk50crK\nDQcpLcrjSzfPwKsjiiRJOtxAJIuEwlEefnYP0ZjDl2+ZoesayHlRIYhkkSfXHeBYYwdL5o7hkkkV\nbseRDKNCEMkSO/c3sG5HHaMri7jnuslux5EMpEIQyQLNbSF+8cJe/D4vX7n1YgJ+n9uRJAOpEEQy\nXMxxeHT1Xto6w9xz3WTGDNd01vLJqBBEMtya7bW8+8EpZk4aypK5Y9yOIxlMhSCSwWrr23hq3UFK\nCgN8+eYZmrROLogKQSRDxWIOj72wl0g0xn3LZlBWHHQ7kmQ4FYJIhlq7o5YPjrUy/6IRzJk6zO04\nkgVUCCIZ6FRLFys3HqIo38+9N0x1O45kCRWCSAb69av7CHVH+cx1U3Q2sgwYFYJIhtlu63l7fwNm\nbDlXz7qwKa1FelMhiGSQzlCEX79q8fs8fP4mo6OKZECpEEQyyMoNBznd1s3yqyZQVVHkdhzJMioE\nkQxxsK6ZdTvqqKooZNn88W7HkSykQhDJAJFojMdfeh8H+MJN0wn49b+uDDz9VolkgJe31lBb386i\n2VVMG1vudhzJUioEkTR38nQnz24+TGlRHvdcN8XtOJLFVAgiae6p
tQcIR2Lce/0UivIDbseRLKZC\nEElj71c3sX1fPVNGlzHvohFux5Esp0IQSVOxmMMTa/YDsGLJVJ1zICmnQhBJU5t2H+PIyTYWzhzJ\nxKpSt+NIDlAhiKShzlCEZzYcJC/g5c5rdX1kGRwqBJE09PyWw7R0hLll/niGlOg6BzI4VAgiaebk\n6U5e3XaEoaVBbrxynNtxJIeoEETSzFNrDxCJOtyzeAp5AZ/bcSSHqBBE0kjvw0yvnDHc7TiSY1QI\nImkiFnP4Tx1mKi5SIYikiU27j1Fzso0FOsxUXKJCEEkDnaEIz2w8RF7Ay106zFRcokIQSQOvbDtC\nS3s3y+bpMFNxjwpBxGUtHd28tLWGksIAN1451u04ksNUCCIuW/37akLdUW5dMIH8PL/bcSSHqRBE\nXNTQ3Mm6t2sZVpbPtXNGux1HcpwKQcRFz246TCTqcMc1E3VZTHGdfgNFXFLX0M7md48xurKI+ReN\ndDuOCCndYGmM8QA/AWYDXcD91tpDvZ6/AvhBYvE48CfW2u5UZhJJF6s2HsJx4K5Fk/F6dRKauC/V\nI4Q7gKC1dgHwAPDgGc//HPiitXYR8BIwPsV5RNLCwaPN7EhMUTF7SoXbcUSA1BfC1cQ/6LHWvglc\n3vOEMWYa0Ah82xizHhhqrd2f4jwirnMch5XrDwJw9+LJmqJC0kZShWCMecEYc48x5nyv8F0KNPda\njhhjen7mMOAq4F+BJcASY8zi83x/kYyz5/Ap3q85zazJFUwbW+52HJEPJTtC+L/ATcB+Y8yPE9v+\nk9EClPT+edbaWOJ+I3DAWrvPWhshPpK4/Mw3EMkmMcdh5fr4brQ7F01yOY3IxyW1U9lauxHYaIwp\nAO4GVhpjWoBHgJ9aa0Pn+NbNwHLgaWPMfGB3r+cOAcXGmEmJHc3XJN6vT5WVJf29JKNp/TJbf+v3\n+s46qk+0cu2lY5g7c9QgpRoYuf5vlwuSPsoosTnnc8CngBeB3wBLgWeBG8/xbauApcaYzYnl+4wx\nK4Aia+0jxpgvA08YYwB+b619sb8c9fWtyUbOOJWVJVq/DNbf+kVjMR5/fg8+r4dlV47JqP8Wuf5v\nl+mSLbukCsEYU038L/rHgK9bazsTj68Htp3r+6y1DvDVMx7e1+v59cC8pJKKZLgt757gRFMniy8d\nzfAhhW7HEfkDyY4QbrHWvtv7AWPMfGvtG8BlAx9LJLtEojGe3fwBfp+H5Vfp6GpJT30WgjFmIeAD\nejbv9BwfFwB+CkxLbTyR7PD7d4/T0NzFDXPHMLQ03+04ImfV3whhKXAtUAX8Xa/HI8DDqQolkk3C\nkRjPbf6AgN/LLRodSBrrsxCstd8HMMZ8zlr7H4OSSCTLbNp1lMaWEJ+6Yizlxbr4jaSv/jYZfT9R\nCtcbY64783lr7ZdSFUwkG4QjUZ7fUk1ewMuy+RodSHrrb5PR9sTt+hTnEMlKG3Yepak1xLJ54ygr\nynM7jkif+iuEd4wx44B1gxFGJJt0h6Os3lJNMM/HTfPGuR1HpF/9FcIGwOGjo4t6cwCdey9yDuvf\nrqO5vZtbrhpPSaFGB5L++tupPHGwgohkk1B3lBfeqCY/z8eNV2p0IJkhqZ3KxphfnO157VQWObu1\nO2pp6Qhz28IJFBec7yTBIu5IdqfyhlQHEckWnaEIL75ZQ0HQz6euGOt2HJGk9Tn9tbX2ucTt48Qn\ntDsFnACeSzwmImdYs72Wts4wN145lsJ8jQ4kcyR7gZx7gJ3AF4A/BXYaY25KZTCRTNQZivDy1hqK\n8v0svVyjA8ksyU5u91fAXGvtMQBjzHji016/lKpgIpnote21tHdFuHPRJAqCSc8uL5IWkr1iWhg4\n3rNgra0mPp+RiCR0dIV5JTE6uGHuGLfjiJy3/o4y+nzi7gfAc8aYx4kXwQrgnRRnE8koz71+SKMD\nyWj9/db2zF/Ulvi6ObHcztlP
VhPJSR1dEX674aBGB5LR+jsx7b5zPZe4vrKIAGu2H6GtM8xd12p0\nIJkr2Uto3gX8DVBMfGTgAwqA4amLJpIZOroivLLtCCWFAa6/TKMDyVzJ7lT+B+DPgb3AHxO/tvKT\nqQolkknWbD9Ce1eETy+eotGBZLRkC6HJWrsOeAMoS1wj4aqUpRLJED2jg6J8P7cs1NRfktmSLYRO\nY8w04iOExcaYPKAsdbFEMkPP6OCmeeN0VrJkvGQL4a+A/wU8D9xAfPqKVakKJZIJOroivLw1PjrQ\nvgPJBklt8LTWbuCjCe6uMMYMsdY2pS6WSPp7bfsROkIRHVkkWSPZo4zGAP8KLAa6gdeMMd+y1tan\nMJtI2uroivDK1iMUF+jIIskeyW4y+gXwKjAemEZ8WuzHUhVKJN31jA5uvHKsRgeSNZL9Ta601v60\n1/IPjTFfSEUgkXTX0RXmZY0OJAslO0LYaoy5t2fBGLMceCs1kUTS2yvbjtAZirBs3jiNDiSr9De5\nXQxwiJ+d/F+MMY8CUeJnLDcB96c8oUgaaesM66xkyVr9zWWU7AhCJCe8vLWGru4ot189kWCez+04\nIgMq2aOMCoG/JX4Ogh9YC/y1tbY9hdlE0kprRzevvVVLWVEeiy8d7XYckQGX7AjgR0AR8CXil9HM\nA36WqlAi6eilN2sIhaPcfNV4ggGNDiT7JLtHbK61dnav5a8bY95LRSCRdNTc3s2aHbWUF+exeM4o\nt+OIpESyIwSvMaa8ZyFxX5fQlJzx4hvVdIdjLF8wgYBfowPJTsmOEB4kfujpc4nl24C/T00kkfTS\n1Bpi3dt1DC0Ncs0sjQ4keyVbCM8B24BriY8q7rTW7k5ZKpE08sIb1YQjMW5dMIGAXwfeSfZKthBe\nt9bOAN5NZRiRdHOqpYsNO+sYVpbPwkuq3I4jklLJFsI7xpjPA28CnT0PWmtr+vomY4wH+AkwG+gC\n7rfWHjrL6x4GGq21f5lscJHBsHpLNZGow60LJ+D3aXQg2S3ZQpgHXEn8jOUeDjCpn++7AwhaaxcY\nY+YR3xdxR+8XGGO+Aszko+m1RdJCQ3MnG985yvDyAhbMHOl2HJGU62/qilHEz0FoBzYBf2GtPX0e\n73818BKAtfZNY8zlZ7z/VcAVwMPA9PN4X5GUe3bzYaKx+OjA59XoQLJff7/ljwHvA98FgsT/wj8f\npUBzr+WIMcYLYIwZSfzs56/z8ZGHiOuONbazefcxRg0r4qqLNTqQ3NDfJqPR1tobAYwxa4Cd5/n+\nLUBJr2WvtTaWuH8PUAG8AFQBBcaY9621/36eP0NkwD2z8RCOA3cumoTXq79XJDf0VwjdPXestWFj\nTHdfLz6LzcBy4GljzHzgw0NVrbUPAQ8BJK6tYJIpg8rKkv5ektG0fu7bV9PEdluPGTeETy2YiMeT\nfCFkwvp9Utm8bpD965eM853M3TnP168ClhpjNieW7zPGrACKrLWPnOd7AVBf3/pJvi0jVFaWaP3S\nwCO/jf/dcvvCCTQ0tCX9fZmyfp9ENq8b5Mb6JaO/QrjYGNP7MNHRiWUP4Fhr+zzKyFrrAF894+F9\nZ3nd48mEFUm1PYdPsbe6iZkThzJ9/BC344gMqv4KYdqgpBBJA47j8PT6gwDcde1kl9OIDL7+LpBT\nPVhBRNy23dZTfbyVK2cMZ/xIbU+W3KODq0WAaCzGyo2H8Ho8fPqa/s63FMlOKgQRYPPu45w41cGi\n2VWMGFrodhwRV6gQJOd1h6P8btMHBPxebl040e04Iq5RIUjOW7ujjqbWEEsuH8OQkqDbcURco0KQ\nnNbRFWb1lsMUBv3cPH+823FEXKVCkJz2/JZq2rsiLJs/jqL8gNtxRFylQpCcdaKpg1e3HaGiNJ+l\nl491O46I61QIkrOeXHuAaMzhnusmkxfwuR1HxHUqBMlJew+f4u39DUwZU8YV04e7HUckLagQJO
fE\nYg5PrDkAwIobpp7XbKYi2UyFIDnn9V1Hqa1vY+ElI5lYVep2HJG0oUKQnNLRFeGZjYcIBnzcuUgT\n2In0pkKQnLJ6y2FaO8LcfNV4nYQmcgYVguSMk00dvPrWESpKg9x4hQ4zFTmTCkFyxpPrDhKJOtxz\n3RQdZipyFioEyQl7q5vYsa9eh5mK9EGFIFkvFnP4zzX7AR1mKtIXFYJkvTU7ajlyso2FM3WYqUhf\nVAiS1U61dPHMxkMU5fu557opbscRSWsqBMlajuPwq1f2EeqO8pnrplBalOd2JJG0pkKQrLVjXz07\nDzQwfVw5V8+qcjuOSNpTIUhW6uiK8KtX9+H3efjcjUY7kkWSoEKQrLRy40Ga27pZftUEqiqK3I4j\nkhFUCJJ1DtQ1s35HHVUVhSzTZTFFkqZCkKwSicZ4/KX3cYAv3DSdgF+/4iLJ0v8tklVe3lpDXX07\ni2aPYtrYcrfjiGQUFYJkjRNNHTy7+TClRXncc52mthY5XyoEyQoxx+HfX7KEIzH+aMlUivIDbkcS\nyTgqBMkKa7bXsre6iVmTKzR5ncgnpEKQjFd7so2n1h2kpDDAfcum65wDkU9IhSAZrTsc5eHn9hCJ\nxrjv5hmUFesqaCKflApBMtrT6w9SV9/OdZeNZs6UYW7HEcloKgTJWLsONvLa9lqqKgr5jGYyFblg\nKgTJSC3t3fxi9Xv4fR6+ctvFBHVJTJELpkKQjOM4Dr94YS8tHWHuunYy40aUuB1JJCv4U/nmxhgP\n8BNgNtAF3G+tPdTr+RXAfwPCwG5r7ddSmUeyw7q369h1sJGLJwxh6RVj3Y4jkjVSPUK4AwhaaxcA\nDwAP9jxhjMkH/g641lp7DVBujFme4jyS4erq2/jN2gMUFwT40i0X4dUhpiIDJtWFcDXwEoC19k3g\n8l7PhYAF1tpQYtlPfBQhcladoQg/+90ewpEYX1w2nSElOsRUZCCluhBKgeZeyxFjjBfAWutYa+sB\njDHfAIqsta+lOI9kqJjj8Mjz71HX0M4Nc8dw2bRKtyOJZJ2U7kMAWoDee/y81tpYz0JiH8M/AFOB\nO5N5w8rK7N6BqPU7u/94cS9v729g9tRhfOOzl+LzpefxENn875fN6wbZv37JSHUhbAaWA08bY+YD\nu894/udAp7X2jmTfsL6+dQDjpZfKyhKt31ls3XuCJ1/bx/DyAr588wxOnWpPQboLl83/ftm8bpAb\n65eMVBfCKmCpMWZzYvm+xJFFRcB24D7gdWPMOsAB/sVa+7sUZ5IMcvh4C4+u3kt+no9v3D2L4gLN\nYiqSKiktBGutA3z1jIf3DdbPl8zW3BbioZW7iURifO3uWYwepmsji6RSem6IlZwXjkT50TO7aWoN\ncffiyczWPEUiKadCkLTjJC52c/BoC1ddPIKb5o1zO5JITlAhSNp54Y1qNr97nIlVpXxR1zcQGTQq\nBEkra7bXsnLDIYaUBPn6nZcQ8GvSOpHBokKQtLHxnaP8+tV9lBXl8b0Vl+pMZJFBpkKQtLDl3eM8\n/uL7FBcE+O6KSxk5tNDtSCI5R4Ugrtv2/kkeWf0eBUE/3713jg4vFXGJCkFc9fb+en7+7B6CAR/f\nuXeOrm0g4iIVgrjm3UON/PS37+L3efnWZ2YzsarU7UgiOU2FIK5491AjDz2zG4/HwzfvnsXUMeVu\nRxLJeZo6QgbdxneO8u8vWbxeD9+46xJmjB/idiQRQYUggygWc1i54SCrt1RTXBDgm3fNYsqYMrdj\niUiCCkEGRTgS5Qe/3s7GnXUMH1LAtz4zmxFDdGipSDpRIUjKtXWGeWjlLvbXNjNlTBnfuPMSSgrz\n3I4lImdQIUhKnWzq4IdPvsOJpk6umTOaP1kyRdNRiKQpFYKkzJ4PTvHws3to6wxz8/zxfOWu2TQ2\ntrkdS0TOQYUgAy4cibFyw0Fe2XYEn9fD528yLJ4zGq9Xs5
aKpDMVggyoow3t/PzZPdScbGPE0EL+\n7LaLGT9SZx+LZAIVggwIx3HYsPMo/7lmP92RGItmV7HihmkE87S/QCRTqBDkgrV1hnnshb28vb+B\nonw/9y+/iMunD3c7loicJxWCfGKO4/Dm3hP8Zu0Bmtu6mT6unPuXX8TQ0ny3o4nIJ6BCkE+k5kQr\nv351H/trm/H7vNx17SSWzRuvHcciGUyFIOelrTPMqo2HWL+zDseBS6cO47M3TGV4eYHb0UTkAqkQ\nJCnRWIwNO4+yauMh2rsiVFUUsmLJVGZOrHA7mogMEBWC9Ckai7Ft70lWb6mmrqGdgqCPe6+fwvVz\nx+D3afZ0kWyiQpCzCkeibN59nBffrKb+dBdej4erZ1Vx17WTKSvSPEQi2UiFIB/TGYqwYedRXt5W\nQ3NbN36fl+suHc1N88ZRqf0EIllNhSAAnGjqYNOuY6x/u472rgjBPB/L5o1j6RVjKS8Ouh1PRAaB\nCiGHhbqjvGVP8vquY+w7chqA4oIAd1wzkRvmjqEoP+ByQhEZTCqEHOM4DgePtrBp11He3HuSUHcU\ngBnjh3DNrCoum1ZJXkDTTYjkIhVCDohEY9gjp9m5v4Gd+xtobOkCoKI0yI1XjGXhJVXaPyAiKoRs\n1dEVZtehRnbub2D3oVN0hiIAFAT9zL9oBAsvqWLGhCF4PTqzWETiVAhZoqMrwoG609gjp9lXc5rD\nx1uJxhwAKkrzWThzJHOmDmPa2HKdPyAiZ6VCyECO49DUGuKDY63srz2NrTlNzclWnPjnP16PhwlV\nJcyeXMGcqZWMqSzCo5GAiPRDhZDmYo5DfVMn1SdaqT7RSs3xVqpPtNHWGf7wNX6fh6mjy5g2rhwz\ndgiTR5eSn6d/WhE5P/rUSBOhcJRDdc3sPVjPscYOjp/q4HjiNhSOfuy1w8ryMePKGTeihKmjy5g0\nqlRHBonIBUtpIRhjPMBPgNlAF3C/tfZQr+dvBf4aCAOPWWsfSWUetziOQ2coSlNrF40tXTQ2d9HY\nEup1v4um1tAffF/A72XEkELGDC9i/IgSxo0oYdyIYp0fICIpkeoRwh1A0Fq7wBgzD3gw8RjGGH9i\neS7QCWw2xvzOWluf4kwDIhyJ0tYZoa0zTFtnmPbEbWtHN83t3TS3xW9Pt4Voae+mOxI76/t4PR6G\nlgaZMX4IE0eXUVYYoGpoISOHFjK0LF9HAYnIoEl1IVwNvARgrX3TGHN5r+dmAPuttS0AxphNwCJg\n5UCHcByHcCRGdyRGdzj6sfuhcJRQd5SuxO2Hy91ROkIROnt9dfS67Q6f/QO+N6/HQ2lRgKphRZQV\n5VFeHKSiLJ9hpfnx27J8youDH15UprKyhPr61oFefRGRpKS6EEqB5l7LEWOM11obO8tzrUBZX2/2\nwyd20NbESnCLAAAEHklEQVQeIhJ1iEZjRGIf3UYiMSLRGOGe26jz0f1z/HV+PnxeDwVBPwVBH1VF\nRRQX+CkqCFCc+Oq5X1IQoKw4SFlxHsUFAf2FLyIZI9WF0AKU9FruKYOe50p7PVcCnO7rzda+deSs\nj/u8Hnw+DwGfF7/fS8DnpTDoI1Dkxe/zEvB7yfN7yQv4yPP3LPsIBLzkB3wE8xJfAd9HywEfBUE/\nhfl+CoJ+8vxeHbopIlkt1YWwGVgOPG2MmQ/s7vXcXmCKMaYc6CC+uegf+3qz535we9Z/IldWlvT/\nogym9ctc2bxukP3rlwyP03M2Uwr0OspoVuKh+4jvRC6y1j5ijLkF+FvAAzxqrf1ZysKIiEifUloI\nIiKSOTSpjYiIACoEERFJUCGIiAigQhARkYSMmtzOGOPlo+kugsD3rbUvuJtq4BljpgNvAMOttd1u\n5xkoxphS4FfEzz8JAN+x1r7hbqoL0998XZkuMcXML4AJQB7wv621z7kaKgWMMcOBt4Al1tp9bucZ\nSMaYvwBuI/7/3E+stY
+d67WZNkL4HOC31l5DfE6kKS7nGXDGmBLgn4h/uGSbbwOvWWsXEz8E+cfu\nxhkQH87XBTxA/A+WbPInQIO1dhGwDPiRy3kGXKL0fkb8fKisYoy5Frgq8fu5GBjb1+szrRBuBI4a\nY54Hfg5k3V8qxNfrAbLwl5P4h+XDifsB4pMaZrqPzdcFXN73yzPOk8RnJIb450W4j9dmqn8Cfgoc\ndTtICtwIvGuM+S3wLPB8Xy9O201GxpgvAd8Cep8oUQ90WmuXG2MWAb8ErnUh3gU7x/rVAE9Ya3cn\nNkVkrDPWz5O4vc9au90YMxL4D+CbLkYcKH3N15XxrLUd8OHI9Sngf7ibaGAZY74InLTWvmqM+Uu3\n86TAMGAc8RkjJhEvhennenFGnZhmjHkCeNJauyqxfMxaW+VyrAFjjNkH1BL/AJ0PvJnYvJI1jDGX\nAP+P+P6DV9zOc6GMMT8Atlhrn04s11hrx7kca0AZY8YCzwA/stY+7naegWSM2QD0lPccwAK3WWtP\nupdq4Bhj/p544f0wsbyT+H6ShrO9Pm1HCOewCbgZWGWMmQ1Uu5xnQFlrp/XcN8Z8ACx1Mc6AM8Zc\nRHwTxGestbv7e32G6Gu+roxnjBkBvAz8V2vtOrfzDDRr7YdbGIwx64CvZEsZJGwiPhL/oTFmFFAI\nNJ7rxZlWCP8G/NQYsyWx/Gduhkmxnk0t2eT/ED867F8Sm8ROW2s/7XKmC7UKWGqM2ZxYvs/NMCnw\nAFAO/LUx5m+I/14us9b+4SX+Ml/mbC5JkrV2tTHmGmPMVuKfJ1+z1p5zPTNqk5GIiKROph1lJCIi\nKaJCEBERQIUgIiIJKgQREQFUCCIikqBCEBERQIUgIiIJKgQREQHg/wNcY8Mhrq7uiAAAAABJRU5E\nrkJggg==\n", "text/plain": [ "<matplotlib.figure.Figure at 0x11cca9210>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "# Logit Function\n", "def logit(x):\n", " return np.exp(x) / (1 + np.exp(x)) \n", " \n", "x = np.linspace(-6,6,50, dtype=float)\n", "\n", "y = logit(x)\n", "\n", "plt.plot(x, y)\n", "plt.ylabel(\"Probability\")\n", "plt.show()\n" ] }, { "cell_type": "code", "execution_count": 251, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "['Date',\n", " 'WRank',\n", " 'LRank',\n", " 'win',\n", " 'P1',\n", " 'P2',\n", " 'Round_2',\n", " 'Round_3',\n", " 'Round_4',\n", " 'Round_5',\n", " 'Round_6',\n", " 'Round_7',\n", " 'Surface_Grass',\n", " 'Surface_Hard',\n", " 'D']" ] }, "execution_count": 251, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df4.columns.tolist()" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "collapsed": true }, "outputs": [], "source": [ "feature_cols = ['Round_2',\n", " 'Round_3',\n", " 
'Round_4',\n", " 'Round_5',\n", " 'Round_6',\n", " 'Round_7',\n", " 'Surface_Grass',\n", " 'Surface_Hard',\n", " 'D']" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>Round_2</th>\n", " <th>Round_3</th>\n", " <th>Round_4</th>\n", " <th>Round_5</th>\n", " <th>Round_6</th>\n", " <th>Round_7</th>\n", " <th>Surface_Grass</th>\n", " <th>Surface_Hard</th>\n", " <th>D</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>41984</th>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>0.569366</td>\n", " </tr>\n", " <tr>\n", " <th>46044</th>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>1.222392</td>\n", " </tr>\n", " <tr>\n", " <th>43308</th>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>0.670295</td>\n", " </tr>\n", " <tr>\n", " <th>46497</th>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>1.152003</td>\n", " </tr>\n", " <tr>\n", " <th>43345</th>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>0.209453</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Round_2 Round_3 Round_4 Round_5 Round_6 Round_7 Surface_Grass \\\n", "41984 1 0 0 0 0 0 0 \n", "46044 0 0 0 0 1 0 1 \n", "43308 0 0 0 0 0 0 1 \n", "46497 0 0 0 0 0 0 0 \n", "43345 0 0 0 0 0 0 1 \n", "\n", " Surface_Hard D \n", "41984 1 0.569366 \n", "46044 0 1.222392 \n", "43308 0 0.670295 
\n", "46497 1 1.152003 \n", "43345 0 0.209453 " ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dfnew = df4.copy()\n", "dfnew[feature_cols].head()" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# Feature matrix and binary target (the `win` indicator)\n", "X = dfnew[feature_cols]\n", "y = dfnew.win\n", "#pd.value_counts(dfnew['Best_of_5'].values, sort=False)" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,\n", " intercept_scaling=1, max_iter=100, multi_class='ovr', n_jobs=1,\n", " penalty='l2', random_state=None, solver='liblinear', tol=0.0001,\n", " verbose=0, warm_start=False)" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn.linear_model import LogisticRegression\n", "# sklearn.cross_validation was removed in scikit-learn 0.20; model_selection replaces it\n", "from sklearn.model_selection import train_test_split\n", "\n", "# Hold out a test set (default 75/25 split) with a fixed seed for reproducibility\n", "X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)\n", "logreg = LogisticRegression()\n", "logreg.fit(X_train, y_train)" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.707070707071\n" ] } ], "source": [ "from sklearn import metrics\n", "\n", "# Classification accuracy on the held-out test set\n", "y_pred_class = logreg.predict(X_test)\n", "print(metrics.accuracy_score(y_test, y_pred_class))" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "True: [1 1 1 0 0 1 1 0 1 1 1 0 0 0 1 0 0 1 1 0 0 0 0 0 1 0 1 0 0 0 0 0 1 1 1 1 0\n", " 1 1 0]\n", "Pred: [0 0 1 0 0 1 1 0 1 1 0 1 1 0 0 0 0 1 1 1 0 0 1 0 1 0 1 0 0 0 0 0 1 0 1 0 0\n", " 0 0 0]\n" ] } ], "source": [ "from __future__ import print_function\n", "# Compare the first 40 true labels with the predicted labels\n", "print('True:', y_test.values[0:40])\n", "print('Pred:', y_pred_class[0:40])" ] }, { "cell_type": "code",
"execution_count": 35, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "0.75469387755102035" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "y_pred_prob = logreg.predict_proba(X_test)[:, 1]\n", "auc_score = metrics.roc_auc_score(y_test, y_pred_prob)\n", "auc_score" ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYgAAAEZCAYAAACNebLAAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3XmcTfX/wPHXGDW2YQZDolDp7ZtQQkgllAqVIu0LSkm0\nfn/fNlkSRWrqS2WJ0r5+K220WiqVLZV5t6iUIjsR4zK/P86547ru3Dkz5t4zc+/7+Xh4mHvOuee8\n7zHO+372lLy8PIwxxphw5fwOwBhjTOlkCcIYY0xEliCMMcZEZAnCGGNMRJYgjDHGRGQJwhhjTETl\n/Q7AlA0ishtYCuwG8oBKwCZggKoucI+pBAwDugM73OPeBEaq6vaQc10O9AcqAAcCc4H/U9VNcftA\nRSAijwKnAc+q6l0xusZxOPfg/BI411RgqaqO2//I8s/ZHeikqjeISHPgFWAjMA04QlVvKKlrmdLD\nEoTxKg/ooKobghtE5GbgEaCdiKQC7wOfAseo6nYRqQCMBt4TkVNUdbeI3A50Ac5S1bXu+7KBN4CT\n4/yZvLoaOERV/4jVBdwku9/JIVZU9U2cZA9wFvChql7tY0gmDixBGK9S3D8AuA/2Q4F17qbzgRRV\nvSV4jFtquEFEFgE9ROQd4DaguaqudY/ZJSK3uPvLq2og9KIi0g0Y4V57K3AtTsnlG1VNd4+pH3zt\nlk764pRwNgNpwAOq+qp77Cj3ureJSF/3fCnu57heVTXs+rPdH98RkQHABuC/QA2c0tQ4VZ0uIifj\nJLqt7rVbq+pO9xyDgFaqeqmIlHevNVhVp4lIO+BB4N/Af1W1qVsC2Aw0BQ4BcoDeqrotLLbKOAn6\nBGAn8D9VvTPsmD44Ce4AoDpwn6o+JiK1gafczwHwtqoOibD9LVW9272vPYHngAFAORGpiPOloKeq\ndheRqu49ONq93gfAre4Xg+3A60Az4GJVXYgp9awNwhTFRyKyWERWAt/jlCqudPe1BWYX8L4PgPZA\nY2Crqi4P3amq21X1uQjJoRYwHbhMVY8BxgKj3N3hUwCEvj4KOFlVOwKTgjGKSDngEmCSiJwEXAa0\nV9XjgDHAq+GBq+pJOAmkA/A5TkknW1WbA2cC94rI8e7hTXAe5McGk4Prf0Bn9+cTgL9DXp8NvBTh\nM7TAqdb6F3Aw0Cs8NmA4kKaqAhwLnOB+LtzPWxknWZ7hfsYLgPvd3VcBP6lqS+Ak4AgRSY+wvZG7\nHSBPVZ8FHgNeUNVLw+J+EPhKVVu58WcBN7n7DgReV9V/WXIoOyxBmKLo4D6ouwIVgU+DJQHXAQW8\nLw3nIbKbov3OnYBTl74UQFVfU9WuHt73tapudX9+EWjjJpvTgR/cBNUVOBz41C3h3A9kiEhGAedM\nAY7EeSC/7sbzJ05d/OnuMb+p6u/hb1TVFcDvItLSPXYUTsIBJ0G8EuF676pqwE2aS3G+/YfrDExx\nr7FTVU9R1fwk7d6D7kA3ERkO3AFUDp4fOE9E3sJpD/qPqm6Jst2
LbkB/934uAFrhlCaC5no8jykl\nLEGYokgBUNXFON8Mp4jIoe6+eTjfOPciIinu9nnAd8ABInJY2DFpIvKWiBwU9vYAYSUFEWnqbgv9\n3T0w7H1/B39wq2VeAi4GrsApUQCkAtNVtYX7jf9YnGqhjRE+dzCGSP9fyrEnMf4dYX/QqzgljlPd\neFaISG9gm6r+HOH4f8KunxLhmL3uj4jUE5HqIa/rAotxqgLnAPnVT6r6FdAQeByoD3wpIm0K2h7l\nc4VKBXqF3M82wPUh+6PdH1MKWYIwxaKqz+M0SGe7m14GtorIQ27jNG4d9SPAFpz68VzgPuAJ9xs9\nIpIGPARUUtVVYZeZD/xLRP7lHnsOTpXTRpxE09g97txCwp2Mkxzasufb+kzgwmBSctsX3i/g/cGH\nswK5bhyIyMHAecCsQq4PTjXTRUCqqq5233M/e6qXiuN94HIRSXHv48vsnaRbAn+p6khVnYVTmsA9\nfhQwRFXfcHsgfQscWdB2j/G8h1ul5MbzBjBwPz6f8ZklCONVpGl/rwdOF5FTVXUXTp35VmCBiHwN\nfIWTHIL7UdXROA/p90RkIbDIPffZ4SdX1b9wvvk/5R57A04d/2acRt13RWQ+sCta4G6d907gZTdJ\noaozcZLVLBFZjFM/3yPaZ3ere87BaXhfgpNkhqrqJ9Gu7753mXueYBJ6D6hHhHaPgq4fwTD3cy3B\nqdKZoar/C9n/HrBSRFREFrjXWwMcgZOUjxGRr0XkS2A5TgN06PavQrZ7MQioJCJLcUouS9jT5mHT\nRpdBKTbdtzHGmEhiXoIQkeNF5KMI27uLyBciMk9E+sU6DmOMMUUT0wQhIrfiNAqmhW0vD4zD6YXR\nAbhaRLJiGYsxxpiiiXUJ4kci1+v+C6e74Wa3v/hcIvSAMcYY45+YJghVfQ2nK164qjijYYO2ANVi\nGYsxxpii8Wuqjc04SSIoHafrYlR5eXl5KSmRuoMbY0zx9b1nJms3badmtQp+h1IiNqz+hTmvPcDa\nld9TKb0GWzevLdaDM14JIjy4ZThD+zOAbTjVS2MKPUlKCmvWeB3UmdiystLtXrjsXuxh92KPotyL\nXbvyyKySxuj+bWMcVWwFAgHGj8/m6YmjyM3NpWfP3owceV+xzxevBJEHICIXApVVdbKI3ITTjzwF\nmOxOW2CMMaaYxoy5lwcfHEutWrUZOzab008/c7/OF/MEoaq/Au3cn58L2f4W8Fasr2+MSS4vfvgj\nX+b8RWpqCrt2eRvntWHLDjLT0wo/sJS76qoBbNy4kf/8504yMyNN31U0Nt23MSahfJnzFxu27KBm\nhvf2hMz0NFo1rhXDqOKjZs2a3Hdfia0TZQnCGJN4MtPTmHLnaQnbHhMIBFi/fj21asU2qdlcTMYY\nU4bk5Cyja9fOXHZZbwKBSKMISo6VIIwxZU6wnSGSRGlPCBfsoTRmzJ4eStu3b6dKlSoxu6YlCGNM\nmRNsZ4iUCBKlPSFUTs4yBg++lkWLFpZYDyUvLEEYY8qkzPQ0xgxo53cYcfHpp3NZtGhh/riGkuih\n5IUlCGOMKeWuuKIv//rXUbRte0Jcr2uN1MYYU8qVK1cu7skBrARhjCmlkrEhOidnGb/++gtdupzh\ndyiAlSCMMaVUsCE6kkRriA4EAmRnP0Dnzidy3XVXs3HjBr9DAqwEYYwpxZKhITpSD6WMjEy/wwKs\nBGGMMb55/vln6Nz5xPweSnPmzI9L91WvrARhjPFNMrYzhGrSpClZWbUYNWpsqUoMQZYgjDG+SbYB\nb+GaNm3GF18s4YADDvA7lIgsQRhjfJUM7QzRlNbkANYGYYwxMRXsoXTLLTf4HUqRWYIwxpgYCc68\nOnLkMN57723Wr1/nd0hFYgnCGGNKWOi4hmAPpdmzP6d69Rp+h1Yk1gZhjDElbPz4bEaOHBbXmVdj\nwRKEMcaUsL59r2bt2jXcdNO
/4zbzaixYgjDGmBJWpUo6I0aM9juM/WZtEMYYU0yBQICVK3/3O4yY\nsQRhjDHFEOyhdP7557B9+3a/w4kJSxDGGFME4T2Umjc/lp07c/0OKyasDcIYYzzya21ov1gJwhhj\nPPruu29K7cyrsWAlCGMSVLSZUkuLsjZja48ePalfvwHHHdfK71DiwkoQxiSoaCuylRZlbcbWlJSU\npEkOYCUIYxJass+UWlw5OctYtuxbevTo6XcovrIShDHGuEJ7KA0ePIDVq1f7HZKvrARhjDFE7qFU\nu3Ztv8PylZUgjDFJ79VXXyrVa0P7xUoQxpikd8wxx1KnzsGMGDHaEkMITwlCRJoCjYDdwI+q+k1M\nozLGmDg67LAj+PzzRaSmpvodSqlSYIIQkRTgGuAGYAuwAtgJNBSRqkA28Liq7o5HoMYYUxLy8vJI\nSUnZZ7slh31FK0G8DMwC2qjqhtAdIlINuBx4DTg7duEZY0zJCAQCjB+fzTffLGXixKkRk4TZW7QE\ncZmqbo20Q1U3AQ+LyJRoJ3dLIROA5sB2oJ+qLg/ZfzFwExAApqrqY0WM3xhjChXeQ2nVqj+pU+dg\nv8Mq9QrsxRRMDiLyjYjcKiIHFXRMFOcAaaraDrgNGBe2fwzQEWgP3OyWTIwxpkREWht6zpz5lhw8\n8tLNtStQAfhIRN4SkZ4icoDH87cH3gVQ1flAy7D9S4BMoKL7Os/jeY0xplDTpk1m5MhhZGRk8tRT\nzzNhwqQyvQRovBXai0lVfwVGACNEpAfwMPCYiDwNjFDVdVHeXhXYFPI6ICLlQhq2vwUWAH8Dr6rq\n5uJ8CGOgbExOF2upqSns2uV8zyprE+HFwqWXXsmqVau47rpBlhiKodAEISJVgJ7ApUBd4FHgBaAL\n8B77lgpCbQbSQ17nJwe362xXoD6wFXhGRM5T1VeixZOVlR5td1Kxe7FHVlY6C39Yw4a/d1CzWgW/\nw/FVaqrT+FozowInNK+b1L8n9erVJDv7Ab/DKLO8jIP4GZgBDFPV2cGNIvIocGoh750HdANeFpE2\nwNKQfZuAbcAOVc0Tkb9wqpuiWrNmi4eQE19WVrrdC1fwXuzalUdmlTRG92/rd0i+ifR7kQy/J4FA\ngN9+W0HDhoflb7P/I3sU90uClwTRV1XfCN0gIueq6qtAj0Le+xpwqojMc19fKSIXApVVdbKITATm\nisgO4CdgWtHCN8Yku5ycZQwadA1r1qxh9uzPSU+v6ndICSPaQLneQBowXEQyQnYdgNMj6dXCTq6q\necC1YZu/D9n/OPB4UQI2xhjYM65hzJhR5Obm0rNnb3bvtnG7JSlaCaIq0A6nDeGUkO0B4I5YBmUS\nX0k2KAcbZq1RNnkESw2LFy9KirWh/VJgglDVScAkEemkqh/EMSaTBIKrnZXkA72srU5mim/lyt9Y\nvHgRPXv2ZuTI+6yHUoxEq2KaqKpXA3eKyD4lBlXtGNPITMIrqdXOrDEy+XTqdBoffDCXpk2b+R1K\nQotWxRRsGxgahziMMaZILDnEXrQqpgXujzcB04E3VDU3LlGZhFFQW4O1FxgvcnKW8dVXX3DJJZf7\nHUpS8jLVxkScOZV+EpHJItIhtiGZRBJsawhn7QUmmtA5lG699QZ+/fUXv0NKSl6m2ngLeEtEKuKM\nfH5ARGqqav2YR2cSQkm1NZjkEKmHUv36DfwOKyl5XVHuKOACoBfwG/BQLIMyxiSnN998nWuv7Zs/\nrsF6KPnLy1xMS3HGPjwNdFTVP2MelTEmKbVq1ZoGDRpy553DbFxDKeClBHGRqi4t/DCTyIo7sM0a\no01RHHRQHWbPnk+5cl6aR02seRkH8bCI7LNOg42DSC7FHdhmjdGmIAWtDW3JofSwcRDGM2tsNiUh\nOIfSvHlzeP75Vy0hlGJexkH0VNXrQ/eJyJPAJ7EMzBiTeMJ7KK1Y8SsNGjT0OyxTgGhVTJOBw
4CW\nItIk7D0Zkd9ljDH7ijTzqvVQKv2iVTHdAzQAsoFhIdsDwLIYxmSMSTAvvfQ8I0cOs5lXy5hoCWK7\nqn4sIt0j7KsCrI9RTMaYBHP++Rfy+++/0a9ffys1lCHREsRknOVCPwHygNDuBnk41U/GGFOo1NRU\nbr31Nr/DMEUUrZG6m/u3tSAZYzwJBAIsX/4TRx4pfodiSoCXkdStgfbAf4EZwLHANar6SoxjM3EW\nbTCcDXgzhVHNYdCga1ix4ldmz/6CrKwsv0My+8lLB+SHgQVAT+Af4DjgP7EMyvijoJlXwQa8mYIF\nAgEefngcnTq1Z9GihZxySmcOOMDTNG+mlPPyr1hOVT8RkWeAl1V1hYjYv36CssFwpii+/165/vr+\nLFq00HooJSAvJYhtInIz0AmYISKDAVvf0RjDxo0bWbJkMT179mbOnPmWHBKMl5LAxUBf4FxV3SAi\nBwMXxjYsY0xZ0Lr18cyePd8apRNUoSUIVV0JvAKkishJwFvA4bEOzBhTNlhySFxeejGNB7oDy3HG\nP+D+bbO5GpMkVHP4+OMP6N//Or9DMXHkpYrpNEBU9Z9YB2OMKV0CgQATJjzM/fffS25uLh06dEKk\nsd9hmTjxkiCWs/coalOG2VgH41VwXENoDyVLDsnFS4JYD3wnIp8C24MbVbVPzKIyMRNt4R8b62CC\nZs58hz59LrWZV5OclwTxrvvHJAgb62AK06rV8TRufBS33PIf67qaxApNEKr6pIg0AJoA7wGHqOrP\nsQ7MGOOfzMzqzJr1ScQlQU3yKLSbq4j0Bt7EWReiOvCZiFwS68CMMfGxa9euiNstORgvI6n/D2gH\nbFHVv3Am67N5e40p44JzKHXv3oWdO3f6HY4phbwkiF2qmj+1hqr+CeyOXUjGmFhTzaFr187cc89Q\nVqz4lV9+sVpjsy8vCeJbERkIHCAix4jIRGBxjOMyxsRA+MyrwTmUGjU60u/QTCnkJUFcB9TFmer7\nCWAzMCCWQRljYuOdd2Zwzz1DycjI5KmnnmfChEnWfdUUyEsvpq04bQ63iUgNYL2q5hXyNuMjGwxn\nCtKt29kMGTKCiy++1BKDKVSBCUJEsoBHcVaS+wRnwr7TgNUi0l1Vv4tPiKaobDCcKUhKSgoDBw72\nOwxTRkQrQTwCfOX+OR9oARwMHIHT5fXUwk4uIinABKA5zijsfqq6PGR/K+AB9+Uq4BJVzS36xzDh\nbDBccgsEAixb9h1NmzbzOxRThkVrgzhKVUer6t/AGcCLqrpZVRfiJAovzgHSVLUdTjXVuLD9E4Er\nVPUknNHa9YsWvjEmXLCH0llnnc5vv63wOxxThkVLEKHtDB2B90NeV/J4/va403So6nygZXCHiBwJ\nrANuEpGPgeqq+oPH8xpjwgQCAUaPHp3fQ+mMM7pSpUoVv8MyZVi0KqZf3VHUldw/HwO4o6i/9Xj+\nqsCmkNcBESmnqruBmkBbnB5Ry3GWM/1KVT8u0idIYgU1RltDdPL54YfvGTjwalsb2pSoaAniOuBx\noDZwkarmisg4nMWDvP7mbQbSQ14HkwM4pYcfVfV7ABF5F6eE8XG0E2ZlpUfbnVQW/rCGDX/voGa1\nCnttr5lRgROa102qe5VMnzWSVasO5JtvlnLJJZeQnZ1N9erWQwns92J/FZggVPU39k0EI4BbQh7y\nhZkHdANeFpE2wNKQfcuBKiJymNtwfSIwubATrlmzpbBDkkJWVjq7duWRWSWN0f3bRjwmWe5VVlZ6\n0nzWghx0UAPmzv2S1q2bs2bNlqS/H2C/F6GKmygLbIMQkSdEpFHoNlXdEEwOItJERKYWcv7XgB0i\nMg+nt9KNInKhiPRT1Z1AX+A5EZkPrFDVd4r1KYwxNGx4mN8hmAQTrYrpLuAhEakDzAV+BwI4PY1O\ncV/fFO3k7oC6a8M2fx+y/2Pg+CJHnUQKamdITU2xtoYkp
JrDjBmvc/PN/+d3KCYJRKtiWgn0EpHD\ncaqJGuNM0vcTcLGq/hSfEJObDXozsO/a0B07dubYY4/zOyyT4LxMtfETzsA445NIg96sfjV5RFob\n2pKDiQcvS44aY3zyyScfcfHFvWxtaOMLSxDGlGItW7bm2GOP47rrBtu4BhN3nhKEiFQGDsfpplrJ\nneHVGBNjlStX5o033rXlP40vvKxJ3QlYArwOHAT8IiKnxTowY5JNIBCIuN2Sg/GLlwWD7sWZU2mj\nu9zoycCYmEZlTBIJXeVt27ZtfodjTD4vCaKcqq4KvrB1IIwpOaFrQ69bt46ff15e2FuMiRsvbRC/\ni0g3IE9EMnDmaLI5hI3ZD+HjGqyHkimNvCSI/jjjIA7BGST3IXBVLIMyJtHNnv0x99wz1GZeNaWa\nlwTRXFUvDN0gIucCr8YmJGMSX8eOnbnvvnGcc865VmowpVa0Nal7A2nAcBEZEvae27EEYcx+ufLK\nfn6HYExU0UoQVYF2OOs5nBKyPQDcEcugjEkUgUCAxYsX0rJla79DMabIok3WNwmYJCKdVPWDOMaU\ndAqasRVsdbiyLDiH0rfffsMHH8xFpLHfIRlTJF7aIHaIyOtAFSAFSAXqq2qDWAaWTGzG1sQS3kOp\nV68LqFXL/g1N2eMlQUwG7gOuAB4GzgAWxjCmpBRpxlZT9vz00w8MGHBV/syrDzzwMF26nOF3WMYU\ni5eBcv+o6lSctaI34HRxPTmWQRlTVpUvfwCqSq9eFzBnznxLDqZM81KC2C4i1QEF2qjqh+7kfaYI\nrJ0hOdSv34C5c7+gXr1D/A7FmP3mpQQxDngBeBO4TES+BRbENKoEFGxniMTaGRKLJQeTKLysKPeS\niLysqnkichxwJPBj7ENLPNbOkDhUc3j++WcYMmS4zbZqEla0gXJZwE3AeuBBnPEP/+CMjXgXqB2P\nAI0pTcJ7KJ16ahfatWvvd1jGxES0EsQzwBagJnCgiLwNTAcqATfGITZjSpVIa0NbcjCJLFqCOFxV\nDxeRdOAzYADwCDBOVXPjEp0xpcTnn39Gz57dbeZVk1SiJYjNAKq6xe3FdJ6qfhafsIwpXVq0OI72\n7U/iiiv62cyrJmlESxB5IT+vtuRgktmBBx7I88/b/JQmuURLEOkiciJOV9jK7s/53TVUdXasgzPG\nDzt27CAtzcalGBMtQfwODHd/XhnyMzili46xCqqsssFwZVuwh9L06dN4//3ZVKuW4XdIxvgq2myu\npxS0z0Rmk+6VXeE9lH7+eTnHHNPC77CM8ZWXqTZMEdhguLLF1oY2pmCWIExSW7hwga0NbUwBLEGY\npNa69fE8/PCjdOlyhpUajAlTaIIQkUzgfuBwoBcwBrhZVTfEODZj4uKCCy72OwRjSiUvs7lOAr4E\nauBMvfEn8HQsgzKmpAUCAebNm+N3GMaUKV4SRENVnQjsVtVcVb0DqBfjuIwpMao5dO3amfPO687C\nhV/5HY4xZYaXBBEQkWq4I6tFpBGwO6ZRGVMCAoEADz88jk6d2rNo0ULOPbcXDRse5ndYxpQZXhqp\n78ZZbvRQEfkf0BboE8ugjNlfP/+8nGuu6bPXzKvWQ8mYovGSIGYBXwHHA6lAf1VdHdOojNlPlSpV\n5pdffrZxDcbsBy8JYgXwGvC0qn5elJOLSAowAWgObAf6qeryCMc9DqxT1duLcn5jClK7dm1mz/6C\n2rVtXStjistLG8TRwGJgpIjkiMhQETnC4/nPAdJUtR1wG8761nsRkf7uNYwpUZYcjNk/Xtak3gBM\nBiaLSEvgceBOL+8F2uMsT4qqznffn09E2gKt3HM2LlrosRVt4r2C2IR88aeaw913T2XIkHtJTU31\nOxxjEoqXgXJZOAPkLgCqA88CPTyevyqwKeR1QETKqepuETkIpwH8HKC314CzstK9HrpfFv6whg1/\n76BmtQqe31MzowInN
K8btxjjdZ3SKBAIMHbsWO6++25yc3Pp1q0bZ55pjdCQ3L8X4exe7B8vpYDF\nwIvAjaq6oIjn3wyE/guVU9VgF9leOIPv3gbqABVFJEdVn4p2wjVrthQxhOLZtSuPzCppjO7ftsjv\njUeMWVnpcbsXpU34zKuTJk2kVasTk/Z+hErm34twdi/2KG6i9JIgDgl5qBfVPKAb8LKItAGWBneo\n6iM4a1wjIpcDUlhyMGbx4oV063baXjOvHnlkfXsQGBMDBSYIEVmoqi1wqoVClx9NAfJU1UuF72vA\nqSIyz319pYhcCFRW1cnFjtokrWbNjuH007vSs2dvG9dgTIyl5OXlFX5UGBFJU9UdMYinMHkl+U3R\nywpwpXVtBys+72H3Yg+7F3vYvdgjKys9pfCj9lVoN1cR+SzsdTmcgXNlXnAFuEhsBTj//fPPP36H\nYExSi1bF9CHQwf05tA0iALwR27DipzSXEpJVcJW3SZMe4/3359h4BmN8Em1N6o4AIpKtqoPjF5JJ\nZuE9lFas+MUShDE+iVaC6KaqM4CFInJZ+H7rcWRKkq0NbUzpE62baytgBm41U5g8wBKEKTE5Ocu4\n997h1KyZZTOvGlNKRKtiutv9+8rgNhGpijMu4ts4xGaSyNFHN2XixKmceOLJVmowppTwMtVGX+AE\n4P+ARcAWEXlFVe+MdXAmuZx1ltcZXIwx8eBlNtcBwC3AhcDrQFPg9FgGZRJXIBDggw9m+h2GMcYD\nLwkCVV0PnAm8paoBoGJMozIJKbg29IUX9uSTTz7yOxxjTCG8JIhvRWQGcBjwvoi8CHwZ27BMIglf\nG7pnz940a9bc77CMMYXwMllfH6AdsFRVc0VkOvBObMMyiWLFil+56qrLbW1oY8ogLyWIA3FmZJ0l\nIouBjoCtimM8ycjIYPXq1fTs2Zs5c+ZbcjCmDPFSgvgvsA2nJJECXAU8Blwaw7hMgqhatRoffjiX\n6tVr+B2KMaaIvCSI41Q1tMJ4oIh8F6uATOKx5GBM2eSliqmciGQEX7g/B2IXkimLVHMYOLA/ubm5\nfodijCkhXkoQ44AvRSQ4g+tZwKjYhWTKkvA5lLp0OYPu3c/xOyxjTAkoNEGo6lQR+RI4GafEca6q\nLi3kbSYJhM+8aj2UjEks0WZzLQdcBxwJzFXV8XGLypR6OTnL6Nz5RJt51ZgEFq0EMQE4CvgUuF1E\nRFWHxycsU9qJNKZXrwvo0uVMKzUYk6CiJYiTgaNUNU9ExgAfApYgDAApKSk8+OB//Q7DGBND0Xox\nbVfVPABVXYezBoRJQn//bQu/G5OMoiWI8ISwO+JRJmEFAgGysx+gRYsm/Pzzcr/DMcbEWbQqpvoi\n8kRBr1W1T+zCMn7LyVnG4MHX5vdQWr16FQ0bHuZ3WMaYOIqWIG4Ke/1JLAMxpUMgEGD8+GzGjBll\nPZSMSXLRlhx9Mp6BmNJhxYpfGDt2NBkZmTauwZgk52UktUkihx12BFOnPs1xx7WyUoMxSc4ShNlH\n585d/A7BGFMKeFpyVEQqi0gzEUkRkcqxDsrEXiAQYMaMNwo/0BiTtApNECLSCVgCvA4cBPwiIqfF\nOjATOzk5y+jatTN9+lzCm2++7nc4xphSyksJ4l6gPbBRVf/EGWE9JqZRmZgIjmvo3PnE/LWh27c/\n0e+wjDGllJc2iHKqukpEAFDV74I/lwUvfvgjX+b8FXHfhi07yExPjtVTV678nT59LrGZV40xnnlJ\nEL+LSDcPZ8oMAAAX1klEQVQgz10s6DpgRWzDKjlf5vxVYCLITE+jVeNaPkQVf9Wr12Dz5s02rsEY\n45mXBNEfyAYOAZYDHwBXxzKokpaZnsaYAe38DsNXFStW5L33PqJq1Wp+h2KMKSO8LBj0F3BhHGIx\nMWbJwRhTFIUmCBH5mQgzuaqqTcxTCuXkLGPMmFFkZ0+gSpUqfodjjCnDvFQxdQj5+QC
gB1CqWnat\nIXrfOZROO+10eve+yO+wjDFlmJcqpl/DNo0Rka+Aewp7r4ik4KxM1xzYDvRT1eUh+y8EBgM7gaWq\nOqAIsedL9obo8JlXrYeSMaYkeKliOinkZQrQBKjo8fznAGmq2k5EjgfGudsQkQo4K9Qdrao7RORZ\nEemmqjOK9AlcydoQ/csvP9va0MaYmPBSxTQs5Oc8YC1wucfztwfeBVDV+SLSMmTfDqCdqu4IiWW7\nx/MaV4MGDenT52ratWtvpQZjTInykiBeVNVHi3n+qsCmkNcBESmnqrvd5UzXAIjI9UBlVX2/mNdJ\nasOH3+t3CMaYBOQlQVwHFDdBbAbSQ16XU9X8pUvdNor7gUbAuV5OmJWVvs+21NSUAvclkg0bNpCZ\nmZn/OtE/b1HYvdjD7sUedi/2j5cE8ZuIfAjMB/4JblTV4R7eOw/oBrwsIm2ApWH7JwL/qOo5HuNl\nzZot+2zbtSuvwH2JINhD6aGHHmDGjJk0aXI0WVnpCft5i8ruxR52L/awe7FHcROllwTxecjPKUU8\n/2vAqSIyz319pdtzqTKwALgSmCMiH+G0b2Srqk0vGiK8h9KGDev9DskYkyQKTBAicrmqPqmqwwo6\npjBuO8O1YZu/93L9ZGdrQxtj/BbtAT0YKFXrUve9Z2Z+dVKoRBwMt2bNX2Rnj7O1oY0xvilT3+DX\nbtpOZpXkGAxXp87BPPXUczRpcrSVGowxvoiWIJqIyPII21OAPD/mYqpZrQKj+7eN92V90779SYUf\nZIwxMRItQfwIWL1GjAUCAV577WV69uxNSkpR+wAYY0zsREsQuRHmYTIlKLSH0s6dO7nookv9DskY\nY/JFW5N6XpR9Zj9EWhv6jDO6+h2WMcbspcAShKoOjGcgyWL16tVcdllvm3nVGFPqRStBmBioUaMG\nu3fn0bNnb+bMmW/JwRhTapWpbq6JoHz58rz22lu22psxptSzEoQPLDkYY8oCSxAxkpOzjEsuOZ/1\n69f5HYoxxhSLJYgSFtpDaebMd3njjf/5HZIxxhSLtUGUIFsb2hiTSCxBlJBVq/7ktNNOZvv27Tbz\nqjEmIViCKCEHHVSHQYNu4uijm1mpwRiTECxBlKBbbvmP3yEYY0yJsUbqYli7dq3fIRhjTMxZgiiC\nYA+lFi2O4vPPP/M7HGOMiSmrYvIovIfS9u3/+B2SMcbElJUgChFp5tU5c+bToUNHv0MzxpiYshJE\nITZv3sTjj4+3taGNMUnHEkQhqlevwVNPPc/hhx9h4xqMMUnFEoQHLVu29jsEY4yJO0sQrkAgwAsv\nPEvv3hdRvrzdFpN8Fi1awJAht9Gw4WEAbN26lbp16zFkyAjKly/Pxo0bGT/+IVavXsXu3bupVas2\nAwfeQPXqNQBYsmQR06ZNJhAIsH37ds48szs9evT08yPlVxHfeuvtvsaxY8cORoy4iw0bNlC5cmXu\nuGMo1apl5O//4YfvefjhB0hJSSEvL49vv/2G0aMfoHXrNvTocSaHHHIoAE2aNKV//+uYMuVxOnU6\njQYNGsY0bnsSsncPpS1bNnPNNbaYnklOxx3XiqFDR+a/HjbsTubNm83JJ3fkjjtu5aKLLuOEE04E\n4KuvvuDf/76RSZOe5I8/VpKdPZZx48aTkZHBjh07GDz4WurWrUfr1m38+jhMnPgo5513vm/XD/rf\n/17m8MMbceWVV/HBBzOZNm0KgwffnL+/UaMjeeSRxwH46KP3qVWrFq1bt2Hlyt8Raczo0eP2Ol/v\n3hczbNgdjBmTHdO4kzpBBAIBxo/PZsyYUeTm5tKzZ296977I77CM4cUPf+TLnL+K/L7U1BR27cqL\nuK9V41qc3/GIqO/Py9vz3p07d7Ju3VrS06uSk7OMKlWq5CcHcKpe69atx6JFC1iyZBGnn96NjAzn\nW3FaWhrjxj1CxYqV9jr/77//xujRIwgEAlSoUIG
hQ+9lwoRsOnfuQuvWbZg//zM++GAmt99+N+ed\n140GDQ6jQYMGzJs3hyeffI60tAo899zTpKam0qFDR+6/fyS5ubmkpaXx73/fQVZWrfxr/f3336h+\nx2GHOZ/5lVdeZPbsj9i+fTvVqmVw771jmDXrXd566w3y8vLo27c/mzZt5IUXniU1NZVmzY6hf//r\nWLPmL8aOHZV/P6666lratz85/zorV/7O6NEjSElJyd926qmn0737Ofmvv/56MRdffDkAbdq0Y9q0\nyRHv//bt25kyZSITJjj7c3KW8ddffzFo0DVUqFCBgQNv5NBD61OlShXS0iqwfPmP+Z8vFpI2Qaxb\nt46LLjrPZl41JsTChV8xaNA1rF+/nnLlUjj77HNp0aIlH374PnXr1tvn+IMPrsvq1atYu3YNjRrJ\nXvsqVaq8z/Hjxz/E5Zf3oVWrNsybN4cffsgpMJY1a/5i2rTnSE9P54ADDuTjjz+kS5czmTXrXR56\naAIPPDCKXr0u5Pjj27JgwZc8+ugjDBkyIv/9S5Ys4dBD6wNO4tuyZTPZ2Y8CcNNN15OT8x0A6elV\nGTVqLJs3b2bAgH5MmTKdtLQ0RowYwldffQHAhRdeyjHHtOCbb75mypTH90oQdevWy//2X5CtW7fm\nLxRWqVJltm7dGvG4GTP+R8eOnalatSoANWtmcdllV9KhQye+/noxI0bcxaRJTwFw+OFHsGjRAksQ\nsZCZmUmVKuk286oplc7veESh3/YjycpKZ82aLcW+brCKafPmTdx440Dq1KnrnjeLP//8Y5/jf/tt\nBa1aHc/atWtZvXrVXvt+/PEH8vJ275U4Vqz4lSZNmgLkl0ZmzXovf39oCSYjI5P09HQAunU7m7Fj\nR3HoofWpX78BVatW5aeffmL69Kk888yT5OXl7dN2uGHDBjIznfaRlJQUUlPLc/fdt1OxYkXWrv2L\nQCAAkJ9EVq78jY0bN3DrrYPJy8vjn3/+YeXK32nW7BiefHIKM2a8DsCuXbv2uk5oCSIvL4+UlJR9\nShCVK1dm27ZtAGzbtjX/c4WbOfNdRo68P/9148aNSU11PlezZsewbt2eBchq1KjJ2rVrIp6npCRt\ngihXrhzPPPMSFSpU8DsUY0qdqlWrcdddwxk06BqmTXuWpk2bs379ej79dC7t2rUH4PPPP+WPP37n\n2GOP4+CD63L77bfQqdNpZGRksG3bNsaMuZcrr7yKRo32nLdBg4Z89923tGzZmpkz32XLlk0ceGBa\n/oPu++/3lChCamyoV+8Q8vLg2Wen5zd8N2jQgAsuuJSjj27KihW/sHjxor0+Q40aNfj7bydZ/vTT\nj8yZ8zETJ05jx47t9O17aX4yKlfOGS9cp05datc+iAcfHE9qairvvDODRo2EyZMf5ayzzuX449vy\n9ttv8s47M/a6jpcSRNOmzfnss3k0bnwUn302j2bNjt3nmK1b/yYQ2LlXNdkTT0yiWrVqXHTRZfzw\nw/fUqlU7f9+WLZtj/sU2aRMEYMnBmCgaNGhIr14X8NBDYxk+fBT33fcg2dljmT79CQBq1arN/fdn\nk5KSwkEH1eHaawdxxx23kpqayrZt2+je/RzatGm31zkHDBjM/fffy1NPPUGFChW4664RrFz5O6NG\nDWfWrHfze+s4UvZ6b7duZzFlykRatGiZf66xY0eTm7uD3NxcBg++Za/jmzdvzqhR9wFQr149Klas\nxIAB/cjLy6NGjax9vn1nZGTQu/fFDBx4Fbt27aZOnYPp2PFUTjmlM//974NMnz6VWrVqs2nTxiLf\nyx49enLPPUMZMKAfBxxwIEOH3gPACy88Q716h3LCCSfy228rOOigg/d63yWXXMGIEXfx6adzKV++\nPLfffnf+vu+++4b+/WPboSYltEhX2vW9Z2be6P5ti/SenJxl3Hnnf3j44QkcfHDdGEUWf/tblZBI\n7F7sYfdij6y
sdP7v/+7g7LN77NM+UtZt3ryZe+8duk/vpoJkZaWnFH7UvhJ2LqbQOZRmz/6IN954\nze+QjDFx1rdvf1577WW/wyhxL774LFdffV3Mr5OQVUy2NrQxBpzOKP/+9x1+h1Hi+vW7Ji7XSbgE\nsWHDek4/vSPbtm21HkrGGLMfEi5BZGZW5/bb7+LQQxtYqcEYY/ZDwiUIgKuvHuB3CMYYU+aVqUbq\nE5rv3Qtp1ao/fYrEGGMSX0wThIikiMijIvKpiHwoIoeF7e8uIl+IyDwR6VfY+fp0bwLs6aHUsmVT\nZs16N0bRG2NMcot1CeIcIE1V2wG3AfmddkWkvPu6M9ABuFpEsgo7YU7OMrp27czIkcPIyMjMH4Zu\njDGmZMU6QbQH3gVQ1flAy5B9/wJ+UNXNqroTmAucFO1ko0ePzl8bulevC5gzZz4dO3aOVezGGJPU\nYp0gqgKbQl4HRKRcAfu2ANWineyxxx4jIyOT6dNfYPz4idZ91RhjYijW9TObgdBpC8up6u6QfVVD\n9qUDUSc5+eWXX4o1XDxRZWVFnhEyGdm92MPuxR52L/ZPrEsQ84AzAUSkDbA0ZN8y4AgRyRCRA3Gq\nlz6LcTzGGGM8iulkfSKSAkwAmrmbrgSOAyqr6mQR6QrcjTNt4xRVfSxmwRhjjCmSMjWbqzHGmPgp\nUwPljDHGxI8lCGOMMRFZgjDGGBNRqRyGHNK43RzYDvRT1eUh+7sDdwE7gamqOtmXQOPAw724EBiM\ncy+WqmrCzlRY2L0IOe5xYJ2q3h7nEOPGw+9FK+AB9+Uq4BJVzY17oDHm4T5cDNwEBHCeFQnfEUZE\njgdGq+opYduL/NwsrSWIEp+iowyLdi8qAMOBk1X1RCBDRLr5E2ZcFHgvgkSkP3B0vAPzQWH3YiJw\nhaqehDObQf04xxcvhd2HMUBHnFkdbhaRqINxyzoRuRWYBKSFbS/Wc7O0JogSnaKjjIt2L3YA7VR1\nh/u6PM63qEQV7V4gIm2BVsDj8Q8t7gq8FyJyJLAOuElEPgaqq+oPfgQZB1F/J4AlQCZQ0X2d6N02\nfwR6RNherOdmaU0QJTpFRxlX4L1Q1TxVXQMgItfjjC9534cY46XAeyEiB+GMqRmIM64m0UX7P1IT\naAs8jPONsbOIdIhveHET7T4AfAsswBmkO0NVN8czuHhT1ddwqtPCFeu5WVoTRIlO0VHGRbsXwSnV\nxwCdgHPjHVycRbsXvYAawNvAf4CLROSyOMcXT9HuxTrgR1X9XlUDON+ww79ZJ4oC74OINAW64lSv\nNQBqi8h5cY+wdCjWc7O0JgibomOPaPcCnLrmNFU9J6SqKVEVeC9U9RFVbaWqHYHRwLOq+pQ/YcZF\ntN+L5UCVkPVXTsT5Jp2Iot2HTcA2YIeq5gF/4VQ3JYPwUnSxnpulciS1TdGxR7R7gVN0/hKY4+7L\nA7JV9fV4xxkPhf1ehBx3OSBJ0oupoP8jHYD73H2fquqN8Y8y9jzch/5AH5z2up+Aq9xSVcISkfrA\nc6razu3lWOznZqlMEMYYY/xXWquYjDHG+MwShDHGmIgsQRhjjInIEoQxxpiILEEYY4yJyBKEMcaY\niErlbK4mttx+0t+zZ/BUCs4Yiu6qurKA99wN5Knq8P247uU4E4b96l6zAvAJMCB0dLjHcw0DvlTV\nGSLyoTtADhFZqKotihuje46PgHo40xGk4IxA/Qm4ODi1SQHvuwrYrKovFOFadYERqtonZNtwIFDU\ne+2OHH4IZ0R5Ks5AqBtUdVtRzlPINWYA/XAGnb0DHAxMBRqr6tUFvOc4oL+qXl3YPRKRysBTQE93\ncJvxkSWI5LVyfx+kxfR68GHoDnL6BLgOeKQoJ1HVu0NedgjZXlKfqY+qBgcgIiKv4EwbfVuU97QD\nPiridR4C7nCvURUngV4A3F/E8wC8gDOD6xfu+SbgzPZ7SzHOFZGqdnPPfSjQR
FXreXjPAiCYPKLe\nI1XdKiKzgGuAR/c/YrM/LEGYvYhIE5yHdWWgFvCAqv43ZH954AmgibvpUXeUZi2cWVTrAbuB21X1\ng2jXUtU8EfkUONI995U4D+HdOKPEBwK5YdeboKpTRGQq8DHQwn3vZ6raVkR24/xe/wYco6prRCQT\n+AY4FDgVGOYe8zPOyNoNEcLLr34VkXScCfA+d1/3cuOsgDNLaD+c6ZXPAk4RkT9xZhGNej9E5HCg\njqp+7246G6dk9wDFUxvn3y1oKM4cRLj3azfQFKdEdI+qPu1+Yx+Pc39TgftU9QURSXO3t8f5Nxih\nqi+JyM/AycCbQE0R+QK4FRiqqqeIyDHAY+59WQ9cAhzhxnJPyD3aCEwBGqrq326p9i1VPRon0X2O\nJQjfWRtE8qorIgtFZJH7983u9n44D4PjcebRvzfsfe1wpo8+Dudh287dno0zfL8VzoPucffhUyAR\nqQGcAcwVkaOB24ETVbU5zhw6QyNc74SQU+Sp6mAAVW0bsm038CLOBH4A5wGv4czDMwo4zT3fTAr+\npj7JvTd/4FTVzAQedEs9VwNdVfVYnOksbnUf/m8AQ1R1lsf70Q1n2mXczzBdVe/HeZAXx43AmyKi\n7qJJLYOlCVddoA3OxI5j3aR+J/CVG+fJwJ0i0gAIzg7cGOe+DxGRA0LOdRbwh6q2dl8Hq4OeBoa5\n/4bPA4OC+8Pu0RvADKCnu/8y4En3PmwAtrhVZsZHVoJIXgVVMd0MnC4i/8GZ3yb8ofYNcKSIvIsz\nc+r/uds7AyIiI9zXqcDhwNdh7z9bRBbifDlJAV5xv7FeB7yhqsEZJifilBxGFXC9wjwNPIgzT8+F\nONU4x+OUIj5yH/TlcGY+jaSvqs5x15h4GXg7OIePiJwLdBcRwaneijS3j5f70QjI8fh5CqWqT7lV\nYZ3dP1NF5BlVvck9ZKqbPFeKyFycSfw6AxVFpK97TEWc0sTJuOtqqOpqnJIHzkeOzE34B6nqO+77\nHne3n1zAW6bizA00DbgICF0BbQXO/QmfnNLEkSUIE+4lnIfmmzjfAHuH7lTV9e63/c44Uykvcqul\nygEdgw94EamDs9RluPw2iDDhpdkUoLyqbohwvaMK+xCqukBEqotIS6Cuqn4uImcBc1T1HDfGA9l7\nqujw66Oqn4nII8B0EWmG8wD9Eqch9ROcB/51BXyewu7HbiInl4jcc7yN8239j2B7gLvvCOACVb0H\neB14XUSygUU41WGEXSsVZ+nJcjjLkS52z1MLp2qob8ixweqwFYWEuJOQWUTdaqqDCzpYVWeLSF0R\n6QEsV9XQ+7OT4pekTAmxKqbkVdCiOp1wqgDexG38db9t4/7cHXhaVd/GWQt7C049+4e4D0r3Af41\nUKkI8XwMnCUiGe7rq3C+6Ue63iFh7w1dJCb0cz2L8y34eff1fKCtiDRyX9+NsyRlYca5n+VanPaS\nXap6L05j6xk4D1twHsDBL11e7sdPFGEpUFX9U1WPVdUWocnBtQYYJHsvDHQ0ToIIOt+Npz7QGmcW\n4I+AAe72Om6chwCzQ46vhfPvs9cyloT9DrmL8awQkU7upstw2ntCBYDQqqqncBY2mhp2XEOc1dGM\njyxBJK+CuhAOBeaJyFc4dc8/4/xnDXob+EdEvsVpSHxFVb/FqWtuIyJLgOdwuoRu9RqMqi7FqU6a\nLSLf4ax2dSdOV8ptEa4XGv8bwBL3G2vo9qdxFrN/2r3Gapypn1904zwGp0ot3F73RlVz3ViG4Dy0\nloiI4jSkb2HPQ/594Ha3Cup6D/djBntXqxSbqm7CKWENFZEf3Xt4OU71WlAl99/1TfY0zg/DqWJa\n6sZ/i6r+jFM1t82NfyYwUFX/Zu97E+l36FI3hoU4bUC3hu1/H7jNvUfgNEhXxCn1ACDOutFVVfWb\nIt8IU6Jsum9jfCQiLwN3u0kvlteZCnxUm
hZRckum1wJHquoNIdsHATtV1Xox+czaIIzx10043+Kv\njPF1SuM3wVdxqrO6BDe4Pb06AT38CsrsYSUIY4wxEVkbhDHGmIgsQRhjjInIEoQxxpiILEEYY4yJ\nyBKEMcaYiCxBGGOMiej/ATHhpv9vrztLAAAAAElFTkSuQmCC\n", "text/plain": [ "<matplotlib.figure.Figure at 0x11d06f750>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "y_pred_prob = logreg.predict_proba(X_test)[:, 1]\n", "auc_score = metrics.roc_auc_score(y_test, y_pred_prob)\n", "auc_score\n", "fpr, tpr, thresholds = metrics.roc_curve(y_test, y_pred_prob)\n", "fig = plt.plot(fpr, tpr,label='ROC curve (area = %0.2f)' % auc_score )\n", "plt.plot([0, 1], [0, 1], 'k--')\n", "plt.xlim([0.0, 1.0])\n", "plt.ylim([0.0, 1.0])\n", "plt.title('ROC curve for win classifier')\n", "plt.xlabel('False Positive Rate (1 - Specificity)')\n", "plt.ylabel('True Positive Rate (Sensitivity)')\n", "plt.legend(loc=\"lower right\")\n", "plt.grid(True)" ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Optimization terminated successfully.\n", " Current function value: 0.584223\n", " Iterations 6\n" ] }, { "data": { "text/html": [ "<table class=\"simpletable\">\n", "<caption>Logit Regression Results</caption>\n", "<tr>\n", " <th>Dep. Variable:</th> <td>win</td> <th> No. 
Observations: </th> <td> 396</td> \n", "</tr>\n", "<tr>\n", " <th>Model:</th> <td>Logit</td> <th> Df Residuals: </th> <td> 386</td> \n", "</tr>\n", "<tr>\n", " <th>Method:</th> <td>MLE</td> <th> Df Model: </th> <td> 9</td> \n", "</tr>\n", "<tr>\n", " <th>Date:</th> <td>Thu, 10 Nov 2016</td> <th> Pseudo R-squ.: </th> <td>0.1571</td> \n", "</tr>\n", "<tr>\n", " <th>Time:</th> <td>11:45:45</td> <th> Log-Likelihood: </th> <td> -231.35</td> \n", "</tr>\n", "<tr>\n", " <th>converged:</th> <td>True</td> <th> LL-Null: </th> <td> -274.49</td> \n", "</tr>\n", "<tr>\n", " <th> </th> <td> </td> <th> LLR p-value: </th> <td>9.105e-15</td>\n", "</tr>\n", "</table>\n", "<table class=\"simpletable\">\n", "<tr>\n", " <td></td> <th>coef</th> <th>std err</th> <th>z</th> <th>P>|z|</th> <th>[95.0% Conf. Int.]</th> \n", "</tr>\n", "<tr>\n", " <th>const</th> <td> -1.4661</td> <td> 0.292</td> <td> -5.014</td> <td> 0.000</td> <td> -2.039 -0.893</td>\n", "</tr>\n", "<tr>\n", " <th>Round_2</th> <td> -0.3527</td> <td> 0.270</td> <td> -1.306</td> <td> 0.192</td> <td> -0.882 0.177</td>\n", "</tr>\n", "<tr>\n", " <th>Round_3</th> <td> -0.2240</td> <td> 0.364</td> <td> -0.615</td> <td> 0.538</td> <td> -0.937 0.489</td>\n", "</tr>\n", "<tr>\n", " <th>Round_4</th> <td> 0.1382</td> <td> 0.449</td> <td> 0.308</td> <td> 0.758</td> <td> -0.742 1.018</td>\n", "</tr>\n", "<tr>\n", " <th>Round_5</th> <td> -1.2500</td> <td> 0.841</td> <td> -1.486</td> <td> 0.137</td> <td> -2.899 0.399</td>\n", "</tr>\n", "<tr>\n", " <th>Round_6</th> <td> 2.0958</td> <td> 1.103</td> <td> 1.900</td> <td> 0.057</td> <td> -0.066 4.257</td>\n", "</tr>\n", "<tr>\n", " <th>Round_7</th> <td> 0.4501</td> <td> 0.959</td> <td> 0.470</td> <td> 0.639</td> <td> -1.429 2.329</td>\n", "</tr>\n", "<tr>\n", " <th>Surface_Grass</th> <td> 0.5040</td> <td> 0.316</td> <td> 1.597</td> <td> 0.110</td> <td> -0.115 1.123</td>\n", "</tr>\n", "<tr>\n", " <th>Surface_Hard</th> <td> 0.3538</td> <td> 0.276</td> <td> 1.280</td> <td> 0.200</td> <td> -0.188 
0.895</td>\n", "</tr>\n", "<tr>\n", " <th>D</th> <td> 0.7876</td> <td> 0.108</td> <td> 7.313</td> <td> 0.000</td> <td> 0.576 0.999</td>\n", "</tr>\n", "</table>" ], "text/plain": [ "<class 'statsmodels.iolib.summary.Summary'>\n", "\"\"\"\n", " Logit Regression Results \n", "==============================================================================\n", "Dep. Variable: win No. Observations: 396\n", "Model: Logit Df Residuals: 386\n", "Method: MLE Df Model: 9\n", "Date: Thu, 10 Nov 2016 Pseudo R-squ.: 0.1571\n", "Time: 11:45:45 Log-Likelihood: -231.35\n", "converged: True LL-Null: -274.49\n", " LLR p-value: 9.105e-15\n", "=================================================================================\n", " coef std err z P>|z| [95.0% Conf. Int.]\n", "---------------------------------------------------------------------------------\n", "const -1.4661 0.292 -5.014 0.000 -2.039 -0.893\n", "Round_2 -0.3527 0.270 -1.306 0.192 -0.882 0.177\n", "Round_3 -0.2240 0.364 -0.615 0.538 -0.937 0.489\n", "Round_4 0.1382 0.449 0.308 0.758 -0.742 1.018\n", "Round_5 -1.2500 0.841 -1.486 0.137 -2.899 0.399\n", "Round_6 2.0958 1.103 1.900 0.057 -0.066 4.257\n", "Round_7 0.4501 0.959 0.470 0.639 -1.429 2.329\n", "Surface_Grass 0.5040 0.316 1.597 0.110 -0.115 1.123\n", "Surface_Hard 0.3538 0.276 1.280 0.200 -0.188 0.895\n", "D 0.7876 0.108 7.313 0.000 0.576 0.999\n", "=================================================================================\n", "\"\"\"" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import statsmodels.api as sm\n", "\n", "X = dfnew[feature_cols]\n", "X = sm.add_constant(X)\n", "y = dfnew['win']\n", "\n", "lm = sm.Logit(y, X)\n", "result = lm.fit()\n", "\n", "result.summary()" ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "0.72848476454293631" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": 
[ "from sklearn.cross_validation import cross_val_score\n", "cross_val_score(logreg, X, y, cv=10, scoring='roc_auc').mean()" ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYQAAAEZCAYAAACXRVJOAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3Xl8VPXVx/FPAhgQQUCDFhfEhaNWq0/VqlRZH7WIApZa\n0bq3uIC22uVBK2p3NW2trdYNLVpr0UqLgBbEQsSFKpYWxe2A0cYqxYUhggoRzDx//O6YSUgmF8jk\nTpLv+/XiRWa7c2YC99zfdn5F6XQaERGR4qQDEBGRwqCEICIigBKCiIhElBBERARQQhARkYgSgoiI\nANAx6QAkv8zsr8Acd/9NdHsfwIFr3P2K6L5S4E1gR+A+4Dvu/krM4/cFKoDno7uKor9/4+5TNjPW\ncuBGd//LZrzmamAHd/9mA489BHwX2Am4yd0PNLMfAsvd/Q9mdiWwxN1nbcb73QIcC/zR3a+M+7p6\nxzgA+DuwPOvuU9x9eSMv2Zxj3w7c4u7/2tpjbcZ7Hg8c7u5X53jOWcBX3P3ElopLNp8SQts3GxgC\n/Ca6fSIwExgJXBHdNxR40t3XAiO24D0+cvfPZ26YWR/gBTN71t1f2OLIt5K7nxDFsxOQju7LPmkN\nBV7czMOeB+zm7iu2IrQBwL3ufsFWHKMxxwC35uG4uRwG9IzxPC16KnBKCG3fbOAHWbdPBC4H7jOz\nPdz938Aw4GEAM3sdGAN0A34KvAYcAGwDTHD3BU29obuvMLPlQH8zOwT4OtAVqHL3YdGV+VhgA7AM\nuMjd34le/mUzuxzoQrgK/1kU1/eBUUBJdKzvuvuM6DX7m9kCwknpX8B4d/8w67N8ysymAC8A64BD\ngZ+bWWfgJuAL7v5q9Ly5hNbKrKzXPp75Ts1sPLA6et0OQA1wvbvfY2aDgF8DHwLbRsfdkBXGAKCf\nmT1DOEle5+7T63+PZnYhcD5QDawHznf3V6KEexOwG9AJuM/drzWznwB9gHvN7Mzo8SuAT6I/33P3\nJ+u9x1nRd9QF2AN4A/gtcBGwD/Ard7/ezLYFbonu6wWsBU6LvvMLgGIze9/dr4x+f2cSfr/LgXOi\nt+sTtdp2jx47zd3dzLpH39cB0eeZF8VaE7XoRgEfA6uAs9397frflTQPjSG0cdEJbpWZfc7MegD9\n3f1pQqIYFT1tGPBQAy//AvDz6Or/d9RNLI0ysyOBvYBnorv2BwZGyeAc4DjgEHc/mHCFfnfWy7tF\n73skcLqZHWdmuxOu5gdGr5kE/CjrNXsBJ7n75wj/pic1EWLa3W8G/kFILH8E7gLGRfHvBfSn3nfi\n7gMJXWKDgacJLa1fu/tBwPHAz8zs8OjpnyV0A/1PvWQA8AEh2R0OnA3cYmb/k/0EMysGfgUcFz3v\nduCo6OF7gDvd/TDgcOAYM/uKu08CVhBOtM8CZcCF7v4F4Moo7oYcBZzl7vsQutdOcfehhNbiT6Ln\nDAdWu/sAd983+u4ucvdFhBbJ/VEyGElIBodHv4/XgQnRMfoBF0f3P0HoziP6nP+IPs/ngVLg22a2\nK/At4LDoM8yNPq/kiRJC+zCbcDIYDjwa3fcQcGw0BpB292UNvK7S3ZdGP/+TcGXYkG3N7J9m9i8z\nW0poWZzm7m9Fjz/v7h9GP38JmOLu66PbvwaGmlmmtXqHu6ej7qtpwDHu/gbhxHm6mV1DuCLdLuv9\n/+LuqejnKYRuk7gyYx63AGeYWQdCYrjD3
Rvr4igiJIySTCvF3f8L/Dn6fAD/cfc3G3qxu1/k7rdF\nP78C/InQhZf9nJro/r+b2Y3AGuDO6Ep9EPBjM/sXITHtBhzUwGeaCjxoZpMJv7uyRj7Ps1ldYK8T\nTrwQxoZKzGxbd/8zcLeZXWRmNxD+PW236aEYBjzg7muiz/Fdd78memyRu78e/bwE6B39fAJwfvR5\nFhO6oA6Ivr8lwL/M7OfAc+4+s5HPIM1AXUbtwxzgG4Ruh0zXxHxgMvC/RN1FDViX9XOa2hNNfXXG\nEBrwQdbP9S9COhD+HWaO/UnWY0XAhujqeQZwPfAIsAC4Oet5m7wmRywNcvflZvY8MJrQFfKFRp6a\nSRINXUwVE7o8oO5n/lR05X8ZoWWRSZINxuzuZ5rZ/oTf0UTgXMLVN8CR7l4dHXMH6v6uMq+/0szu\nJAyCnx29b0O/p+p6tzeJJeq+GgfcCNwLpAhdTPVtJGuswMy2B3o0cNzsf08dgJPd3aPXdKd2zGdw\n1O34v8CvzKzc3S9p4H2lGaiF0D6UAwcDAwknVNx9HeGq/yIaTwhxNZYoGvIIcE50pQvwTWBBVrfK\nmQBm1hM4hdC6GUi4ir0BeBw4iXASyRhpZttHV/fnAX+NGctGak/gEJLMz4Gn3X1lI6/JfFYHPjaz\n0VG8fQh98Y828rrwonDlPzKKMzNL68uE1sWnzGwHM3sDWBXNEJsEHBS1nJ4m6m6JugGforb7byPQ\nycw6RGMo27n77cB4YF8zy/68cWQ+77GElt0UwrjAidT+DrK/x78RxoEyrYcfAJc28R6PAN+OPk8J\nMAu4KOrmfAF42d2vI3QtfW4z45fNoITQDkTdM8uAV6ITSsbDwN7AY1n3bclMkM15zZ2Ek8YiM3uR\nkKhOzzrO+2a2GHiScBX9OKHrozR6/j8I3Se9zKxr9LqXos/yHGGg97qYcc0CfmFmZ0S3HyJ0g+Sa\npZO5ct1IaE1cYmbPEbpZfhBn0J3QAjk+apE8DHwrc3Wc4e6rgB8D883sH8A1hMF5gK8BR0Sv/zth\nxtLU6LEHgfsJM8u+Bfwx+j7/BJzTwHhGg5+vgdu/AC4ws38Skt5iwr8dCIPAI83s1+4+mzAeszD6\nXnaidjZbY75J6HZcSugieg4oc/fno8+y2MyeJQxON5VcZCsUqfy1SGBmA4Db3P3ApGMRSYLGEEQA\nM7uLMFh7RhNPFWmz1EIQERFAYwgiIhJRQhAREaCVjSFs3PhJevXqj5IOo0k9e26L4mw+irP5tIYY\nQXE2t9LSbrGmhreqFkLHjh2aflIBUJzNS3E2n9YQIyjOpLSqhCAiIvmjhCAiIoASgoiIRJQQREQE\nUEIQEZGIEoKIiABKCCIiElFCEBERQAlBREQiSggiIgIoIYiISEQJQUREACUEERGJKCGIiAighCAi\nIhElBBERAVogIZjZ4WZW3sD9J5rZIjN7ysy+ke84REQkt7wmBDP7HjAZKKl3f0fgeuB/gcHAeWZW\nms9YREQkt3y3EF4FTmrg/v2A5e6+xt03AE8CA/Mci4iI5JDXhODu04GNDTzUHXg/6/ZaYPt8xiIi\nIrl1TOh91xCSQkY3oCrOC0tLu+UloOamOJuX4mw+rSFGUJxJaKmEUFTv9svA3mbWA/iI0F308zgH\nevfdtc0cWvMrLe2mOJuR4mw+rSFGUJzNLW7SaqmEkAYws1OBru5+h5l9G5hLSBZ3uPt/WygWERFp\nQN4TgrtXAgOin6dm3f8w8HC+319EROJJagxBREQakUpVMXFiOZWV3enb933KyobSs2ePvL+vEoKI\nSAJynfQnTixnxowzgCKWLEkD9zB5ckMz+JuXEoKIyBZataqKceNmbtGVfK6TfmVld2rn4hRFt/Mv\nVkIws23c/WMz2xswYLa71+Q3NBGRwjZ+/OwtvpLPddLv2/f96HhFQJq+fdc0b+CNaDIhmNlVhCmi\nk4DHg
ZeA0cC4PMcmItLiNqf//vXXt2NLr+RznfTLyoYC90QxrKGsbMiWfpzNEqeFMBL4InAp8Ad3\n/z8z+0d+wxIRaTnZSeCdd15kxYrxQM8mr/r79VvLs89u2ZV8rpN+z549WmTMoL44CaGDu1eb2QnA\nJDMrBrrmOS4RkWbT1FV/dn8+jALuA06lqav+W245nurqLbuST+qkn0uchDDPzF4grCh+HFgAzMxr\nVCIizSCTCBYs2EhVVQkwmCVLtqf+VX/9/vzaa97cV/29ehXeSX1rNJkQ3P27ZvYb4E13rzGzi919\nSQvEJiLSpMxJv6JiW1Ipp1evPdhrr42UlQ2td+WfJnPlX/+qv35/fp8+L9C7d02L9t8XgjiDyj2B\nK4G9zOxk4Jtm9h13X5336EREsjTU9VP/pL9ixX288MKZZPrn6175b0dDV/2b9uef0SILwQpNnC6j\nyYSaQ18glKn+L/AHYEQe4xIR2URDc/cbPukXfZo0sq/8e/R4hUGDVm9y1V+I/flJiJMQ+rn77WZ2\nobt/DFxhZs/lOzARaZ9SqSouuughli3rsskAcENz9+uf9OEDMq2ATa/8x7bLK/+44iSEjWa2PbUV\nS/cBtChNRPIi1wrehubuZ076r722LatWLaNXr77stdc9lJUN0ZX/ZoqTEK4CHgN2N7MHgSOBc/MZ\nlIi0XU1NAc21grehuft1T/rHtdTHaJPizDJ6xMwWA4cDHYDz3f3tvEcmIm1SU4Xbcq3g1RV/fsWd\nZXQysCPhN3SwmeHuP8p3cCLS9jRVuK2sbCglJfdFYwjta9pn0uJ0GT0IvAO8SDSOICLSmMWLl3LS\nSbOpru5HSclrzJx5PAcffOCnjzdVuK1nzx7cf/+prWJryrYmTkLo5e6D8h6JiLRa2eMCzz+/iJqa\nnwJFrF+fZuTIq3jjjdqEkFThNmlanISw1MwOcffFeY9GRFqV2tIQb1NV9V3CVf8asruEqqv71XmN\nxgEKV6MJwcxeJ3QRbQucYmZvARuJ2nnuvmfLhCgihaK2TEQHUqlKPvigN2vWdAF2pTYJvEU4dYQu\noZKS15MKVzZTrhbC4JYKQkQKXypVxdCh97BixQGEIcXLqV0Mdi21SeBUiouvIJ3em5KS15k5c3hy\nQctmaTQhuHslgJkdAExy97Fmth9wG9ocR6TNS6WqmDDhYRYsKKKm5j06dfqQ6uowNhBk/70vMJUe\nPdYzaFBHysou1IrgVijOGMIdwA8B3P1lM/sxcCdwVD4DE5Fk1C0Z3Y1Qtmx7qqvvpjYJhPIQtTWC\nXmXQoJ0oKztGiaAVi5MQurr77MwNd3/UzMryGJOIJKRut9AHwHBgNmGzmHepTQLD6dz5KvbZ5yD2\n3PMj1QhqI+IkhHfM7AJChVOAsYBWKou0IXVnC2WPDdxHpmQ09KBz56vYd99Doumi5ykJtDFxEsI5\nwM3Az4GPCbumfT2fQYlIfjU+W2hv6u8cVlS0hKKipZSWrmPmzFPo169vcoFLXsVJCObuJ9S5w+zL\nwF/yE5KI5MuqVVWMGzcza93AfTQ+WyjsHFZefrZaAu1ErnUIpwAlwI/M7Kp6r/k+Sggirc748bOj\nwnIPkb2ZTFAE7EHd2ULtc+ew9ipXC6E7MADoBmSvLd8IXJHPoESkedQvNV1ZuQ3hxL+W0BLI/J2Z\nLfSmZgu1Y7nWIUwGJpvZMHef14IxichWyhSYW79+V0IrYDBLlmzPrrtmuoSOB6bSvXuK7ba7hh12\n6K/ZQhJrDKHazGZQ27bsAPR19z3yGZiIbL5Mi2DmzErS6R9Rd7bQqZSW7sshh2QKy22krGyUEoB8\nKu7CtOuAs4HfECYm/zOPMYnIZqqoqGTMmJmsXAk1NZdTO0YAtWMFafbZZz033aTCctKw4hjPWefu\nUwjbaK4mlK1QOWyRApFKVTFkyP2sWHEANTWdgPepHRsASFNUtIRRo+7
hlltUV0gaF6eFsN7MegEO\nHOHu882sa57jEpGYJk4sZ/367O6hqWTGCGAtnTu/ycyZwzn44APp1aubNp6RRsVJCNcD9wNfBp41\ns68B2htBJCGpVBWXXPIITz9dDLxHTU0v6nYPfUzHjvczYkQXysqO1xiBxNZkQnD3B8xsmrunzewQ\noD+wJP+hiUhDJk4sZ86cr9PYYrLOnStYsEArimXzNZkQzMyA88ysZ72Hzs1PSCKSrXYK6c4UF79F\np079qe0W6gHsQY8ev2CPPfZRjSHZKnG6jKYT5qw9v7kHN7MiQh2kg4D1wDfc/bWsx78GfJuw2G2K\nu9+6ue8h0palUlWMGPEQNTVHAC9SU/M9qqt7UjuVdCyQZtCgnZg8eViisUrrFychVLn7j7bw+KOB\nEncfYGaHE8YjRmc9/nNgP+Aj4CUzm+ru72/he4m0ORMnln+6YT2cSGY9ARRRXLyO7t1/wZFHbkdZ\n2Qk5jyMSR5yEcJeZ/RSYR7iSB8DdH4/x2qOAOdHznzGzQ+s9/hyQudwh62+Rdit70HjNmg3Urz4a\npDnxxI5MnnxBMkFKmxQnIQwGDiPUNcpIA0NjvLY7YVJ0xkYzK3b3muj2i4QZSx8Af3H3NTGOKdIm\npVJVjBv3B5544n3gGkICuJfsAWNYSP/+H7DffhspKxuS42gimy9OQjjU3ffZwuOvIRTHy/g0GZjZ\ngYS9+foCHwL3mtkYd/9zrgOWlnbL9XDBUJzNqy3HuWpVFePHz+bRR6tZvXot4dor0yoYAfySbbft\nzQkndOKWWy6jV6+tGzBuy99lElpLnHHESQhLzexz7r7Zg8rAU8AJwDQzOwJYmvXY+4Sxg+poSus7\nhO6jnFrDoprS0tax+EdxNq8tibN2y8rMngS/p+5+xdsDfTjmmI3cdNMJfPLJ1v0faMvfZRJaU5xx\nxEkIewL/MrP/EnZMKwLS7r5njNdOB44xs6ei2+eY2amEfZrvMLPbgSfNrBqoAO6KFbVIK5dKVTF+\n/IOUl1eRTu9C7TTSt4ALCIPHXYGFDB/eV4PG0iLiJITRTT+lYe6eBi6sd/eyrMdvA27b0uOLtFaX\nXvow8+d/QJiRXX8z++uA/pSUvMisWSM5+OADE4xU2pM4K5UrWyIQkfYgU576kUdS1N268j5gA336\n/JHy8glaWCaJiFPtVESaycSJ5cyYcQY1NQdSfzpp587LKS/XlpWSnDhdRiKyFbK3sfz3vzdSdwvL\n0EIoKVnEggXarUySFaeWUS/g8+7+NzO7HPg8cLW7v5T36ERasVSqiq9+dTLPP18D9AOeBrYhewvL\n2s3sz1cykMTFaSFMBWaFGnecDPwKuBUYmMe4RFq12umkXYDLqB0ruCKrEN1GbWYvBSVOQujp7jeZ\n2Y3AXe5+j5l9K9+BibRGtVtZ7kxNzQ5AF+qOFezNoEEdVYhOClKcQeXiaB+E0cBDZnYwGnsQadCY\nMTNZseJyamrOJexj/BZ1S3UtU8kJKVhxEsL/EaqS/iIqXX0rcGleoxJpZSoqKtltt2tZsWJnQi9r\nFaHsRAq4ErgTuIwbb9xfXURSsOJc6e/m7p8WsnP3I8xsAlCev7BEWo/MJvd19zUOexV06tSF44/f\nm7KyIfTs+dVkAxVpQqMJwcwuIVQrvcDMsvfi6wh8DfhtnmMTaRXCJvf7UHesYB19+lzD9Olf1VaW\n0mrkaiG8ChxC+NddlHV/NXB2HmMSKXh11xasJPy3qF1X0KfPSpYsuTjZIEU2U6MJwd0fIgwi/8nd\nXzaznu6+ugVjEylItVNKDyDUIRpHGDe4DuhD587LmT79lERjFNkSccYQSszsFWBbMzsSWAB81d3/\nmd/QRArTxInlWeWqw3hBjx6d2Wef3ejTZ7U2uZdWK84so98AJwGr3P0tQvXSW/MalUgBq6zsTv06\nRIMGdWTRohOZPPkkJQNpteIkhG3
d/eXMDXd/FCjJX0gihWXx4qXsvnsZO+30ALvvfh3du79B9tqC\nPn1e0NoCaRPidBmlzOwgov8BZvY1wuRqkXbhpJNmfzqldP36NE8/PZFRo+6hsrI7ffuuoaxMFUql\nbYiTEC4E7gY+a2bvEza4OT2vUYkkLHsW0fr1JYQdX3sARWzYsC+TJ5+UcIQizS/OBjkVwFFm1hXo\nEN23Jt+BiSRl032ORxFmEZ1GKFX9eqLxieRLk2MIZnaCmV1H+J/xDPBatFJZpM2ZP38h++7723ol\nKIooKnqfoqLf0bnzVcycOTzhKEXyI06X0dXAGcBYYBEwAXgMrVSWNiTTRTRjxkvANdQvQTFyZFd1\nE0mbF2sLTXd/hVCpa6a7f0DY5UOkTaioqOSgg25hxoz1hAql70eP1Jag0CwiaQ/itBDejvZCOBQ4\n3cx+CbyR37BEWkZFRSUDBvyedDrTK5ome7wAlmnTe2k34rQQTgWeBYa4+4fAa9F9Iq3a/PkLOfLI\nP5BO70vdhWYfA7+juPgKpk0bomQg7UacWUZrgd9n3dbYgbR6ixcvZezYcsJ4wVSyC9PBNgwf/hF3\n3315kiGKtDjtfCbt0kknzQYGEJLA8YTB44+A1XTq9DY33PDNJMMTSUSsQWWRtqa6uh+hUmmasOBs\nLPA6vXt/yJNPnqVuImmXcm2Qc5i7Pxv9PIxwGbUBmO7uz7RQfCLNIpWqYsKEh1mwoIiamvdIp98D\nriK0DLoCC5k2bQgDBw5INlCRBOVqIdwGEC1CuwH4D/A2cJuZXdQCsYk0m4kTy5k373w2bjyfmprv\nA58FrqOoaB2dOz/L3LkjlQyk3YszhjAOGOzuqwDM7A7CrKOb8hmYSHPILDibOxfqziTaiS5dtqGy\nUovNRDJytRA6mVkx8A7wYdb9HwM1eY1KpJmE1cdnsG5dR7JLVsNaevZ8M8HIRApPrhbCu4RuojRh\nQ5yzzWwoUAY80AKxiWyxui2DqcAXCbOn11NcvIrS0nXa5lKknlx7Kg8FMDMDekZ3VwNXu/vDLRCb\nyBbZtFpppibRmYwadQ+TJ1+cbIAiBSrOwjTP+vmp/IYjsnUWL17KiBEPUVOzL6FlcDzQgy5dNnDs\nsfeoJpFIDlqYJm1GKlUVJYOfUr9a6bHHomqlIk2Isx9Ch5YIRGRrZLqJQssgezbRBlUrFYkpTgvh\nWeDz+Q5EZEtUVFQyYsRdpFKdgX7AS2TXJSoufoXy8gu18lgkhrjlr48GFrl79eYc3MyKgJuBg4D1\nwDfc/bWsxw8DfhndXAmc7u4fb857SPuVSlUxcODv2bBhB+AyQhJYDVwB7Etx8SvMmXOCkoFITHFq\nGR0KLADWmVlN9OeTmMcfDZS4+wDgcuD6eo/fDpzt7gOBOUDfmMeVdi5sanMbGzbsBexCbTdRT2Bv\n+vR5k5dfvpCDDz4wuSBFWpk4s4xKt+L4RxFO9Lj7M2Z2aOYBM+sPrAK+bWYHAA+5+/KteC9pR8aM\nmUl19ReApYTrmuxuolfVTSSyBZpMCGa2LWFf5WHR8+cDV0ab5TSlO7X7EQJsNLNid68BdgSOBMYT\nNt15yMz+4e6Pbd5HkPZo9epdCdVKqwn/zK4kjCEsY86cUUoGIlsgzhjCTYRC8ecSLsHGEVYunxHj\ntWuAblm3M8kAQuvgVXdfBmBmcwjdU4/lOmBpabdcDxcMxdm8Sku7MXfukwwf/ldqavYGHDiRsJtr\nFSEZLGfevBMZOjS5InWt4ftsDTGC4kxCnIRwiLsflHX7IjN7KebxnwJOAKaZ2RGE9n3Ga8B2ZrZn\nNNB8NHBHUwd89921Md86OaWl3RRnM8rEGZJB9hqDyyku7keHDp8wcOCH3Hxz2Ps4qc/UGr7P1hAj\nKM7mFjdpxUkIxWbWw92rAMysB7AxZhzTgWPMLLPC+RwzOxXo6u53mNnXgamhOgYL3X12zONKO7Jq\
nVRXjxs2MWgbZawz6s3LlyQlGJtK2xEkI1wOLzGxWdHskYSPaJrl7Griw3t3Lsh5/DDg8zrGk/Ro/\nfjYzZpwBXEv9wWMRaT65dkw7xd3vB2YRFqcNIkzn+LK7L23sdSLNZcaMuYwb9yzQn5AMjgauA3pT\nXPwqf/rToETjE2lrcrUQfmhmfwbmuvvngRdaKCYRKioqo2QwgDCb6ALCJn4To4qllycan0hblCsh\nLCTM6SuqtxCtCEi7u2ocSbNLpaq49NKHmTPnDULPZHaRut6MGqWKpSL5kms/hHOBc81shruPasGY\npJ0KpSju5J13diIsWs8eQO4KPMfkyVckFp9IWxdnpbKSgeRdKlXFgAF3kkpdTUgA95I9gAwLmTLl\nsCRDFGnztB+CJK6iopLBg6dSXW3UbmozAvgxsCtduvybxx47nX79VOpKJJ+UECRRFRWVDBjwR9Lp\nTTe1gQ6UlLzGm29+l08+0ZCVSL7F2SDnr2Z2spl1aomApP1IpaoYPPg+0ukjqL+pTZhm+h6PPz6W\nXr1Ul0ikJcQpf30t8CVguZn9NtrDQGSrjR//INXVexKmlaaje9NABfAO06Z9Sd1EIi0ozqDy48Dj\nZtYF+ArwZzNbQ6g7dMvmbpojAqF1UF7+NqFF8G1CN1FXYBGdOlXz5JNfVzIQaWGxxhDMbDChuumx\nwGzgfuAYYCZwXL6Ck7anoqKSMWNmsnLlDqTT+wEHArcAOwEvsuOOH/PUU99U+WqRBMTZD6GSUJl0\nCnCRu6+L7n+MUNJCJJYwgHw36fTewHvAW4Qq6N8HZlJSUsFTT12sZCCSkDgthBHuXqdshZkd4e5P\nA5/PT1jS1qRSVQwZcj/pdBm1s4mmEvY1SNO587MsWHCakoFIgnIVt/si0AHIlKnOTAPpRGjj989/\neNIWpFJVDB16D+vX70Pd2UTdgN2jchTnKRmIJCxXC+EYQoXTzwA/yrp/I6HKmEiTKioqGTLkftav\nPwx4kbqrj9fSu/cKJk++JNEYRSTIVcvoBwBmdoa739NiEUmbkekmWr/+R4QkkClfXQqspLT0I2bN\nOjXRGEWkVq4uox9ESWGomW1SXjIqfifSqIkTy+t1E/UE9qdPnxcoLz9TXUQiBSZXl9Hi6O/HWiAO\naUPmz1/IaactiLa8XA6sJiSDMHhcXq7xApFClCshPGdmuwPlLRWMtH4VFZWMHVsO7E+YVno+oZto\nfzp3Xs6CBacoGYgUqFwJYQG1I4D1pYE98xKRtFqZQnV1N7a5DujHqFFoJpFIgcs1qNyvJQOR1q12\nncG+1J1a2geo1EwikVagyUFlM/tdQ49rUFkyFi9eyvHHzySd7g8sI3vMAJYzYEDXROMTkXjiDCov\naIlApHW6997pXHrpS8AAQtXS7wDXA/2A19hxx4+ZMuW8JEMUkZhydRnNiv6+28x6A4cTCtUvcvdU\nC8UnBS4kg+wxg/sIyeBVhg3bgZtv/qrGDURaiTgb5JwMLAHOAs4DlpjZl/IdmBSuVKqKceOmc+yx\n8wgVTLKZnFxNAAAQqElEQVTHDLoCbzN8+G5MnapBZJHWJE5xu0nAIe7+XwAz60soez0nn4FJ4brk\nkkeYMydT3moSdctRLGTgwN7ccMMJSYYoIlsgTkLYAKzM3HD3SjPbmL+QpNA9/XQxta2CrwJXAHsD\ny7jxxv055ZSTEotNRLZcrllGZ0Y/vg7MMrO7CYXtTgWea4HYpIAsXryU0aP/SnV1GB+onUl0ID16\nPMKyZScnG6CIbLVcLYRM/aIPoj/HR7c/pOHFatKGjR79MNXVP6G2a+gq4AvAWo48crtEYxOR5pFr\nltE5jT0W7a8s7UAqVcX48Q9SXb0bdQeP+9Gly2qOPRbKyjReINIWxNlCcwzhcnA7wpmgA9AF6J3f\n0CRpqVQVRx01mffe+wxhq8vbCWMG2wNvc+yxuzB5ssYLRNqKO
IPKZcA3CCuOfgocB+yYz6Akebff\nfi+TJj1P2LugJLp3HXAHsI4OHd6lrOz0xOITkebX5DoEYLW7lwNPA9tHeyQcmdeoJFGpVBWTJv0b\n2A34AXAu8H0gBfSnuHgdCxdqjYFIWxMnIawzs/7Ay8BgM9uG0GcgbdSECQ8TylfvQt1xg77AQv7+\n99Po169vUuGJSJ7ESQiTgJ8ADwHDgLeB6fkMSpIxf/5Cdt75GubN24awl8FbhBlFRH87c+eOVDIQ\naaOaHENw9wXUFrg7zMx6uvvq/IYlSRg7di5wCPAxYd7ACsJ8gr7AMubOHc3BBx+YYIQikk9xZhnt\nCvwGGEw4U/zNzC5193fzHJu0kMWLlzJq1INAL8K6w8xag58QJpW9yNy5JysZiLRxcWYZ/Y7QRXQW\n4UzxdWAK0OTkczMrAm4GDgLWA99w99caeN5twCp3/3780KU5zJgxl3Hj5hGSwWeAqYQ1iD2AHejT\np4rp07+ubiKRdiBOQih191uybv/KzM6KefzRQIm7DzCzwwmF8kdnP8HMzgcOQPsutLjlyysZN+5Z\nwmyiy6hbwnosvXq9yZIllycZooi0oDiDyovMbGzmhpmdAPwj5vGPIqqK6u7PAIdmP2hmRwKHAbfF\nPJ40o8GD7yWUr64/m2gdnTtfxezZpyUWm4i0vFzF7WqorWs8zszuBD4hrFheTVis1pTuwPtZtzea\nWbG715jZzsDVhBbDKVsYv2yB73znp9xzD4Rk4IRfa3YJ62U899wErTMQaWdy1TKK03poyhqgW9bt\nYneviX4+GdgB+Cuh87qLmb3i7r/PdcDS0m65Hi4YhRrn3LlPRskgs+XlZcDPgCuBPSgqWsbf/jaS\n/v13SzDKTRXq91lfa4izNcQIijMJcWYZbUu4kh8WPX8+cKW7fxjj+E8RBp+nmdkRwNLMA+5+I3Bj\n9B5nAdZUMgB49921Md42WaWl3QoyzoqKSo477iE23fJyP+Bthg17n6lTrwQK63su1O+zvtYQZ2uI\nERRnc4ubtOK0Am4i7It4LmGm0TbArTHjmA5Um9lTwC+BS83sVDOL090kzSSVquKss+5lwID7CC2D\n+lteLmPUqF24+eavJhajiCQvziyjQ9z9oKzbF5nZS3EO7u5p4MJ6dy9r4Hl3xzmebL5UqoovfvFm\nVq3qThgzeBE4mrC5Tdjy8txzN3LttapaKtLexUkIxWbWw92rAMysB2HnNClwFRWVDBr0ez7+eBvq\nTiu9jlCraCHTpg1h4MABSYYpIgUiTkK4njD1dFZ0eyShE1oKWCpVxdFHT2bjxm7AntRdcNaHzp2f\n5YUXLqR7916JxikihSNOQpgFPAsMIow5fNndl+Z+iSRp8eKljBjxEDU1O9PQgrOiopd47rkJ7LXX\nbq1iQExEWkachPCEu+8HvJDvYGTrzZ+/kLFjywnjBR2pv+CsqOgKHnlkpNYYiMgm4iSE58zsDGAR\nYcssANz9jbxFJVukoqIySgbXELqIKslecFZS8hrPP3+hkoGINChOQjg8+pMtTeiYlgKxePFShg+f\nSWgZTAW+CLxHZsHZzjuvZMaMsUoGItKoOPsh9GuJQGTLpFJVTJjwMPPmvcOmC84uBq5i0KCPeOCB\nbyUZpoi0ArlqGfUhLErbB3gSuDwz9VQKQypVxeDBd7Fy5STChnZ1xwvgCrbffi23335eYjGKSOuR\na6XyFOAV4HtAZ+BXLRKRxLJ48VL23fd6Vq4sJiSDF6i73eVr9O5dwqJF/6duIhGJJVeX0S7ufhyA\nmc0DlrRMSNKUiorKaLzgMGp3ODsKuJaw0U0lRx/dnTvu+JqSgYjElishfJz5wd03mNnHOZ4rLaSi\nopIBA/5AGDx+GXiDsOdxT+AzbLPNSzzxxJna4UxENlucWUYZ6aafIvkUZhI9SCg9kRk8vha4PPr5\nFZYuvVitAhHZIrkSwmfNL
Hv/412i20VA2t017bQF1U4r3Ze6g8elwC1ABdOmHaNkICJbLFdC6N9i\nUUijwnjBXVRVdaR2wVn27maVFBV9zNtvX51kmCLSBuTaMa2yJQORTVVUVHLkkbcSdi3N7GNwPGGN\nwTrgbWAVDzwwKrkgRaTNaI5tMiUPUqkqBg26F/gsYafRFKFF0AMYC6xgm20+YO7c01W+WkSaxeYM\nKksLqKioZMyYmaxcCTU1P6O2a+gOQsugK7CQuXNHcvDBByYZqoi0MbESgpl9ljDBPTOaibs/nq+g\n2qtUqoqBA+9mw4b/IexBlL2HQScyyWDatCFKBiLS7JpMCGb2W+BE4DXqLoUdmse42p3QRTSZDRt6\nULvYrHYPg1Co7hW1DEQkb+K0EI4FzN3XNflM2SK1exj0I5z43ye0CoqADykuvobS0nXMnHm6FpyJ\nSN7ESQiZtQeSB7fffi+TJjmwC1AS/f0AMA5I07v3f3jhhe8mGaKItBNxEkIKeMnMFgLrM3e6+7l5\ni6qdmDFjLpMm/Zu6NYnSwE+AmZSULGLWrLEJRigi7UmchDAn+iPNJHQRPQp0p3Z9Qd3Vx8OHv8MN\nN5yvlcci0mJy7Yews7uvBMpbMJ42b8KEq3jggY7AAUA18AGhVVC7+niHHVZw993fSzBKEWmPcrUQ\n7gBOABZQt1ZC5m/VMtoMFRWVjB79J95+uyvwY8LXeC8wHPgz8EtgW7p3f4O//vX0BCMVkfYqV0KY\nANpCszlkSlan058jlKHIdA+NIOTdnsB7TJlyGCNGnJFUmCLSzuVKCAvN7APgUWAuUO7ua1smrLZj\nxoy5jBv3LLVjBWupbWhtD/QBljBt2jCVoBCRROUqbreLme0FHA2MBq4zs/eIEoS7P91CMbZa8+cv\njJJBf2AR4eR/KrUlKBbRufNaFiwYp/UFIpK4nLOM3L0CqADuMrMewCjgO8AVhEnz0oC5c5/kS1+a\nRTrdiVCyOjPscgVwJ9CN3r2X8cQTmkUkIoUj1yyjjoSNer8EHAd0Af4GXAXMb5HoWqGKikqOO+4h\n4IuEVcfZ00n3Bt5k2LBtuPnmc5UMRKSg5GohrAb+Tlg2e5K7/7tFImrlxoyZSW2r4BrqTsxaFg0c\nH5tghCIiDcuVEG4DhgHnArua2Vzg7+5e0yKRtVKrV+9KbavgNOAHQF9gGdOmDdHAsYgUrEY3yHH3\n77r7/wBjgNeBi4BlZjbdzC5oqQBbm549/0NtUdjdgU8YMOAd3CcoGYhIQWuydIW7rzCzPwLLCR3j\nZwKHA7fmObZWafr0UYwZcx2rVvWhZ883mT59rGYQiUirkGtQeRRh8vxRhFXJTwPzgFPc/cWWCa/1\n6devL//5z2W8+66WbIhI69LUSuV5wCXAYo0diIi0bbkWpmkqjIhIOxJrT+UtZWZFwM3AQYS9FL7h\n7q9lPX4q8C1gA7DU3cfnMx4REWlco7OMmslooMTdBwCXA9dnHjCzzsCPgEHufjTQw8xOyHM8IiLS\niHwnhKOINtdx92eAQ7MeqwYGuHt1dLsjWTuyiYhIy8prlxFhS7D3s25vNLNid69x9zTwLoCZXQx0\ndfe/NXXA0tJu+Ym0mSnO5qU4m09riBEUZxLynRDWANnfVnH2bKVojKEM2Af4cpwDtobpnKWl3RRn\nM1Kczac1xAiKs7nFTVr5TghPEXZdm2ZmRwBL6z1+O7DO3UfnOQ4REWlCvhPCdOAYM3squn1ONLOo\nK7AYOAd4wszKCfUefu3uM/Ick4iINCCvCSEaJ7iw3t3LWur9RUQkvnzPMhIRkVZCCUFERAAlBBER\niSghiIgIoIQgIiIRJQQREQGUEEREJKKEICIigBKCiIhElBBERARQQhARkYgSgoiIAEoIIiISUUIQ\nERFACUFERCJKCCIiAighiIhIRAlBREQAKEqn00nHICIiBUAtBBERAZQQREQkooQgIiKAEoK
IiESU\nEEREBFBCEBGRSMekA9gSZnYS8BV3/1rSsWQzsyLgZuAgYD3wDXd/LdmoGmZmhwPXuvuQpGNpiJl1\nBH4H7AFsA/zU3WclGlQDzKwYmAwYUANc4O4vJRtV48ysN/AP4H/dfVnS8TTEzBYD70c3X3f3rycZ\nT2PM7DJgJNAJuNndpyQc0ibM7CzgbCANdCGcm3Z29zUNPb/VtRDM7Abgp0BR0rE0YDRQ4u4DgMuB\n6xOOp0Fm9j3CSawk6VhyOB14z90HAsOBmxKOpzEnAml3Pwq4EvhZwvE0KkqytwIfJR1LY8ysBMDd\nh0Z/CjUZDAKOjP6vDwZ2Szaihrn73e4+xN2HAouBixtLBtAKEwLwFHBh0kE04ihgDoC7PwMcmmw4\njXoVOCnpIJrwJ8IJFsK/0w0JxtIod58BnBfd3ANYnVw0TfoFcAuwIulAcjgI6Gpmj5jZ36KWbCE6\nDnjBzB4EZgIPJRxPTmZ2KLC/u9+Z63kFmxDM7FwzW2pmz2f9fYi7P5B0bDl0p7apC7Ax6lIoKO4+\nHdiYdBy5uPtH7v6hmXUDHgCuSDqmxrh7jZndBfwauDfhcBpkZmcD77j7oxRm6zrjI+Dn7n4c4cLv\n3kL8PwTsCBwCfIUQ5x+TDadJlwM/bOpJBTuG4O6/I/QhtyZrgG5Zt4vdvSapYFo7M9sN+Atwk7vf\nn3Q8ubj72VH//CIz28/d1yUdUz3nADVmdgxwMPB7Mxvp7u8kHFd9ywgtWNx9uZmtAj4DvJVoVJta\nBbzs7huBZWa23sx2dPf3kg6sPjPbHujv7guaem4hZt7W7CngeAAzOwJYmmw4TSrYK0Uz2wl4BPg/\nd7876XgaY2anR4OLECYSfEIYXC4o7j4o6kseAiwBzizAZABwLvBLADPrQ7jA+m+iETXsSeBL8Gmc\n2xKSRCEaCMyL88SCbSG0UtOBY8zsqej2OUkGE0MhVza8HOgBXGlmVxFiHe7u1cmGtYm/AFPMbAHh\n/9O3CjDG+gr5934n4ft8gpBYzy3EVra7P2xmR5vZIsKF1Xh3L9Tv1YBYsx1V7VRERAB1GYmISEQJ\nQUREACUEERGJKCGIiAighCAiIhElBBERAbQOQWSrmFlfwuraFwnz0TsDzxOKiBXiwi+RRikhiGy9\nt9z985kbZvYzYBphhahIq6EuI5HmdzVwgJkdkHQgIptDCUGkmbn7BmA5sG/SsYhsDiUEkfxIA4VW\n8VQkJyUEkWZmZtsQCooV7FaaIg1RQhDZep+WEY/21f4h8Hd3fz25kEQ2n2YZiWy9z5jZPwmJoRj4\nF3BasiGJbD6VvxYREUBdRiIiElFCEBERQAlBREQiSggiIgIoIYiISEQJQUREACUEERGJKCGIiAgA\n/w8/nHNwVbkULgAAAABJRU5ErkJggg==\n", "text/plain": [ "<matplotlib.figure.Figure at 0x11cc4a210>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "logreg.fit(dfnew[[\"D\"]],dfnew[\"win\"])\n", "pred_probs = logreg.predict_proba(dfnew[[\"D\"]])\n", "plt.scatter(dfnew[\"D\"], pred_probs[:,1])\n", "plt.title('Win Probability for 5 sets matches')\n", "plt.xlabel('D')\n", "plt.ylabel('Win Probability for 5 sets matches')\n", "plt.legend(loc=\"lower right\")\n", "plt.grid(True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Decision Trees and Random 
Forests\n", "-------" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We now build a decision tree model to predict the likelihood of an upset in a given match." ] }, { "cell_type": "code", "execution_count": 42, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=None,\n", " max_features=None, max_leaf_nodes=None, min_samples_leaf=1,\n", " min_samples_split=2, min_weight_fraction_leaf=0.0,\n", " presort=False, random_state=None, splitter='best')" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn.tree import DecisionTreeClassifier\n", "\n", "model = DecisionTreeClassifier()\n", "\n", "X = dfnew[feature_cols].dropna()\n", "y = dfnew['win']\n", "\n", "\n", "model.fit(X, y)" ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "AUC [ 0.5875 0.5875 0.5625 0.6025641 0.50756082], Average AUC 0.569524983563\n" ] } ], "source": [ "from sklearn.cross_validation import cross_val_score\n", "\n", "scores = cross_val_score(model, X, y, scoring='roc_auc', cv=5)\n", "print('AUC {}, Average AUC {}'.format(scores, scores.mean()))" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CV AUC [ 0.66625 0.720625 0.6090625 0.74621959 0.55555556], Average AUC 0.659542529586\n" ] } ], "source": [ "model = DecisionTreeClassifier(\n", " max_depth = 4,\n", " min_samples_leaf = 6)\n", "\n", "model.fit(X, y)\n", "scores = cross_val_score(model, X, y, scoring='roc_auc', cv=5)\n", "print('CV AUC {}, Average AUC {}'.format(scores, scores.mean()))" ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "RandomForestClassifier(bootstrap=True, class_weight=None, 
criterion='gini',\n", " max_depth=None, max_features='auto', max_leaf_nodes=None,\n", " min_samples_leaf=1, min_samples_split=2,\n", " min_weight_fraction_leaf=0.0, n_estimators=200, n_jobs=1,\n", " oob_score=False, random_state=None, verbose=0,\n", " warm_start=False)" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn.ensemble import RandomForestClassifier\n", "from sklearn.cross_validation import cross_val_score\n", "\n", "X = dfnew[feature_cols].dropna()\n", "y = dfnew['win']\n", "\n", "model = RandomForestClassifier(n_estimators = 200)\n", " \n", "model.fit(X, y)\n", "\n" ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>Features</th>\n", " <th>Importance Score</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>8</th>\n", " <td>D</td>\n", " <td>0.850724</td>\n", " </tr>\n", " <tr>\n", " <th>7</th>\n", " <td>Surface_Hard</td>\n", " <td>0.025179</td>\n", " </tr>\n", " <tr>\n", " <th>0</th>\n", " <td>Round_2</td>\n", " <td>0.025132</td>\n", " </tr>\n", " <tr>\n", " <th>6</th>\n", " <td>Surface_Grass</td>\n", " <td>0.022049</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>Round_3</td>\n", " <td>0.017896</td>\n", " </tr>\n", " <tr>\n", " <th>2</th>\n", " <td>Round_4</td>\n", " <td>0.016429</td>\n", " </tr>\n", " <tr>\n", " <th>4</th>\n", " <td>Round_6</td>\n", " <td>0.016345</td>\n", " </tr>\n", " <tr>\n", " <th>5</th>\n", " <td>Round_7</td>\n", " <td>0.013936</td>\n", " </tr>\n", " <tr>\n", " <th>3</th>\n", " <td>Round_5</td>\n", " <td>0.012309</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Features Importance Score\n", "8 D 0.850724\n", "7 Surface_Hard 0.025179\n", "0 Round_2 0.025132\n", "6 Surface_Grass 0.022049\n", "1 Round_3 0.017896\n", "2 
Round_4 0.016429\n", "4 Round_6 0.016345\n", "5 Round_7 0.013936\n", "3 Round_5 0.012309" ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "features = X.columns\n", "feature_importances = model.feature_importances_\n", "\n", "features_df = pd.DataFrame({'Features': features, 'Importance Score': feature_importances})\n", "features_df.sort_values('Importance Score', inplace=True, ascending=False)\n", "\n", "features_df" ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "collapsed": false }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/marcotavora/anaconda/lib/python2.7/site-packages/ipykernel/__main__.py:2: FutureWarning: sort is deprecated, use sort_values(inplace=True) for INPLACE sorting\n", " from ipykernel import kernelapp as app\n" ] }, { "data": { "text/plain": [ "<matplotlib.axes._subplots.AxesSubplot at 0x1198f9d50>" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAeYAAAFtCAYAAADS5MnUAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3X2UXXV97/H3dHhqTAhEJ2JYtpOkzBcsFUHRiCAQ7Y3a\nGlHaBR2atlIRrrarVy8sm5YW7YPprQ810roaQwXkEirtlbbYlqYlCHihwUsXFSN8B6G2KtqMHIwM\nqUhg7h97j4xh5sxDzsz57eT9WouVffbj5xyGfM5v/84ZekZHR5EkSWX4oW4HkCRJz7CYJUkqiMUs\nSVJBLGZJkgpiMUuSVBCLWZKkghzU7QD7gz17nhp99NHd3Y4xbUceuYCm5G1SVmhWXrPOnSblbVJW\naFbevr5FPbM5zhFzBxx0UG+3I8xIk/I2KSs0K69Z506T8jYpKzQv72xYzJIkFcRiliSpIBazJEkF\nsZglSSqIxSxJUkEsZkmSCuL3mDtgaGiIVmuk2zGm7dFHFzYmb5OyQrPymnXuNClvk7LC3OXt719B\nb28ZX8WymDtg3fotLFi8tNsxJEmzsHvXTjZespaVK4/pdhTAYu6IBYuXsvDIo7sdQ5K0H3COWZKk\ngljMkiQVxGKWJKkgzjHvJSJOB64HdlC9cTkI2JiZf9HVYJKkA4Ij5ondnJmrM/MMYA3wnoh4cZcz\nSZIOABbzFDLzcWAT8DPdziJJ2v9ZzNPzn8Dzuh1CkrT/s5in50eBr3U7hCRp/+eHvybWM7YQEYcD\nFwBndy+OJGkuLVmykL6+Rd2OAVjMkzkzIrYBTwO9wG9l5gNdziRJmiOt1gjDw4919JyzLXqLeS+Z\neStwVLdzSJIOTM4xS5JUEItZkqSCWMySJBXEYpYkqSAWsyRJBfFT2R2we9fObkeQJM1SaX+H94yO\njnY7Q+MNDQ2Ntloj3Y4xbUuWLKQpeZuUFZqV16xzp0l5m5QV5i5vf/8Kent7O3rOvr5FPVPv9WwW\nc2eMdvqL6XOpr29Rx79IP1ealBWaldesc6dJeZuUFZqVd7bF7ByzJEkFsZglSSqIxSxJUkEsZkmS\nCmIxS5JUEItZkqSCWMySJBXEYpYkqSAWsyRJBbGYJUkqiMUsSVJBLGZJkgpiMUuSVBCLWZKkgljM\nkiQVxGKWJKkgB3U7wP5gaGiIVmtkWvv296+gt7d3jhNJkprKYu6Adeu3sGDx0in3271rJxsvWcvK\nlcfMQypJUhNZzB2wYPFSFh55dLdjSJL2A84xS5JUEItZkqSCzLqYI+I9EfGPEfHZiLg5Ik6a5nEv\nj4gHIuL3Z3vtaVzjG3s9XhMRV87yXNdFxKs7k0ySpPZmNcccEccBazPzVfXjFwNXAydO4/A1wEcy\n809mc+1pGp3mOkmSijLbD3/tAl4YEecDN2XmFyLiFRFxC3BhZg5FxIXA86kK+zPAMPD3wPnAExHx\ntfr676z/HAXenJmtiLgceDlwMHBZZt4YEe8HTgV6gT/KzL9sk69nsg0R8U7gLcAC4FvAm4Hz6lw9\nwGXAccDbgG8AfTN/eSRJmp1Z3crOzIeBtcCrgDsj4kvATzP5qHQp8JOZ+QHgKuDDmfnXwDHAGzLz\n1cB9wJqIOAt4bma+AjgTeFlEvA5YXu+3GvjNiDi8TcQlEbGt/ucW4A8BIqKnPvdrMvOVVMV/cn1M\nqz7/DuDXqN4YvAk4ZMYvkCRJszTbW9krgccy85frxycBNwEPj9tt/Kj13zLzqQlONQxcHRGPAwHc\nAfwIcCdAZu4CLouIS4CXRsS2+rwHAf3AFyaJ+Ehmrh6Xdw1wTmaORsT3IuI64HHgaKpyBsj6z5XA\nFzNzT33s56d6PSRJ6pTZ3sp+MfD2iFibmU8CXwa+DTwCLAOGgJOAr9X7P2skXY943we8kKps/7H+\n8z7gZ+t9FgOfAv4Y2JaZF9Wj3kuBB9vkm/BWdkT8BHBWZq6Ki
B8G7h6379P1nw8APx4RhwJ7qObN\nr2n7aszAkiUL6etb1KnTzVoJGaarSVmhWXnNOnealLdJWaF5eWdqVsWcmTdExLHA5yPiMapb4hcD\n3wM+FhH/Dnx93CHPKubM/E5EfA74Z6oCbAHLMvPqiHhtRNxONZ/83szcGhFnRsRtwHOAGzLz8TYR\nJ7ul/gAwUp+7h2qEv2yvXN+KiD+gGrXvBKb3uzanqdUaYXj4sU6ecsb6+hZ1PcN0NSkrNCuvWedO\nk/I2KSs0K+9s30D0jI76YeV9deb5Hxudzm/+Gnn062x4+6qu/0rOpv1gNyUrNCuvWedOk/I2KSs0\nK29f36JJP4jcTmN/JWdEXAAM8szouKdeXp+Z27sWTJKkfdDYYs7MzcDmbueQJKmT/JWckiQVxGKW\nJKkgFrMkSQVp7BxzSXbv2tnR/SRJBy6LuQOu2TBIqzW9rzv396+Y4zSSpCazmDtgYGCgMd+rkySV\nzTlmSZIKYjFLklQQi1mSpIJYzJIkFcRiliSpIBazJEkFsZglSSqIxSxJUkEsZkmSCmIxS5JUEItZ\nkqSCWMySJBXEYpYkqSAWsyRJBbGYJUkqiMUsSVJBDup2gP3B0NAQrdbItPbt719Bb2/vHCeSJDWV\nxdwB69ZvYcHipVPut3vXTjZespaVK4+Zh1SSpCaymDtgweKlLDzy6G7HkCTtB5xjliSpIBazJEkF\nsZglSSpIMXPMEXE6cD2wo151OPAgcF5m7ungdQ4F7s/M5W32eRdwDjAK/F1m/m6nri9JUjuljZhv\nzszV9T8vA/YAazt8jR6qwp1QRCwHfi4zV2XmK4E1EXF8hzNIkjShYkbMtZ6xhYg4BDgKeDQiPgic\nSlWoWzLz8oi4ErguM7dGxBrg3Mx8a0Q8ANwOHAt8EzgbWABcCxxBNQpv5z+A1417fDDw3Y48O0mS\nplBaMa+OiG3A84GngU1Updqfmasi4iDg9oi4ZYJjx0bBy4HTM/PhiLgdOBk4Dbg3M38rIl4OnDlZ\ngMx8CmgBRMQHgH/JzC936PlJktRWacV8c2YORsQSYCvwFeA4qhEwmbknIrYDL9rruJ5xy8OZ+XC9\n/FXgMGAA+Ex9jrsi4sl2Iep56E8Au4B37NMz2suSJQvp61vUyVPOSgkZpqtJWaFZec06d5qUt0lZ\noXl5Z6q0YgYgM1sRsQ64BbgYOAvYGBEHA6cAV1GNel9QH3LSJKcaK+wd9XE3RsSJVLen2/kb4J8y\n8wOzfhKTaLVGGB5+rNOnnZG+vkVdzzBdTcoKzcpr1rnTpLxNygrNyjvbNxClffjr+zLzPmAj8Ebg\noYi4A7gDuD4z7wGuAN4dEVuBZeMOHZ1geROwIiJuoxoBPzHZdSPiLKpb36+PiFsiYltEvKJTz0uS\npHaKGTFn5q3ArXut29Bm/7uBEyZYv2zc8uC4TedMM8dfUc1rS5I074op5vkWERcAgzwzqh77GtX6\nzNzetWCSpAPaAVvMmbkZ2NztHJIkjVfsHLMkSQcii1mSpIIcsLeyO2n3rp0d3U+SdOCymDvgmg2D\ntFoj09q3v3/FHKeRJDWZxdwBAwMDjfnCuySpbM4xS5JUEItZkqSCWMySJBXEYpYkqSAWsyRJBbGY\nJUkqiMUsSVJBLGZJkgpiMUuSVBCLWZKkgljMkiQVxGKWJKkgFrMkSQWxmCVJKojFLElSQSxmSZIK\nclC3A+wPhoaGaLVGptyvv38Fvb2985BIktRUFnMHrFu/hQWLl7bdZ/eunWy8ZC0rVx4zT6kkSU1k\nMXfAgsVLWXjk0d2OIUnaDzjHLElSQSxmSZIKYjFLklSQac0xR8R7gNcCBwNPAZdk5r9M47iXA9cC\n12fmb+5L0DbXWAD8PvBK4L+Ap4HLM/Ov5uJ6kiTNpSmLOSKOA9Zm5qvqxy8GrgZOnMb51wAfycw/\n2aeU7X0C+FxmvqvO91zgH
yLis5n57Tm8riRJHTedEfMu4IURcT5wU2Z+ISJeERG3ABdm5lBEXAg8\nn6qwPwMMA38PnA88ERFfq6/1zvrPUeDNmdmKiMuBl1ONxi/LzBsj4v3AqUAv8EeZ+ZcTBYuI5wMD\nmXnu2LrMfAR4Wb39F+sMPcBlwIuAtwALgG8BbwaWA1cCT1Ld2h8EngA+VR93GHBRZn5hGq+VJEn7\nZMo55sx8GFgLvAq4MyK+BPw0VblOZCnwk5n5AeAq4MOZ+dfAMcAbMvPVwH3Amog4C3huZr4COBN4\nWUS8Dlhe77ca+M2IOHySa/UDD409iIj3RsQtEXFPRLylXt2qz/VZYElmviYzX0n1RuBk4CeB7VS3\n6t8LLKZ6o/At4PXArwDPmep1kiSpE6ZzK3sl8Fhm/nL9+CTgJuDhcbv1jFv+t8x8aoJTDQNXR8Tj\nQAB3AD8C3AmQmbuAyyLiEuClEbGtPu9BVAU80Yj1a1QjXupzvLfOuAFYOLa63jYaEU9GxHXA48DR\nVOX8Z8B7gH8Avg38BtVo/xjgb4DvAb/X7jWariVLFtLXt6gTp9pnpeSYjiZlhWblNevcaVLeJmWF\n5uWdqencyn4x8PaIWJuZTwJfpiqwR4BlwBBwElVJwgQj6XrE+z7ghVRl+4/1n/cBP1vvs5jq9vEf\nA9sy86KI6AEuBR6cKFhmfj0iHoqIizLzT8ed50TgS/U1nq7X/wRwVmauiogfBu6ut78JuD0zfyci\nzqUq6WuAb2TmmohYBbwfeM00Xqu2Wq0Rhocf29fT7LO+vkVF5JiOJmWFZuU169xpUt4mZYVm5Z3t\nG4gpizkzb4iIY4HPR8RjVLe/L6YaSX4sIv4d+Pq4Q55VzJn5nYj4HPDPwB6gBSzLzKsj4rURcTvV\nfPJ7M3NrRJwZEbdR3UK+ITMfbxPxF4D31ed4imr++Hrgz6nmi8d8GRip9+uhGvEvo7qNfXVEfK9+\nbu8C/gP484j473Wu9031OkmS1Ak9o6OTTRVrus48/2OjU/1KzpFHv86Gt68q4ndlN+0dZ1OyQrPy\nmnXuNClvk7JCs/L29S3qmXqvZ2vE78qOiAuoRr9j7yJ66uX1mbm9a8EkSeqwRhRzZm4GNnc7hyRJ\nc81fySlJUkEsZkmSCmIxS5JUkEbMMZdu966dHdlHkiSLuQOu2TBIqzUy5X79/SvmIY0kqcks5g4Y\nGBhozPfqJEllc45ZkqSCWMySJBXEYpYkqSAWsyRJBbGYJUkqiMUsSVJBLGZJkgpiMUuSVBCLWZKk\ngljMkiQVxGKWJKkgFrMkSQWxmCVJKojFLElSQSxmSZIK4v+PuQOGhoZotUYm3d7fv4Le3t55TCRJ\naiqLuQPWrd/CgsVLJ9y2e9dONl6ylpUrj5nnVJKkJrKYO2DB4qUsPPLobseQJO0HnGOWJKkgFrMk\nSQWxmCVJKkgxc8wRcTpwPbCjXnU48CBwXmbu6eB1DgXuz8zlbfZ5J/CLwNPAhzLzLzp1fUmS2imm\nmGs3Z+bg2IOIuBZYC3y6g9foAUYn2xgRzwUuBF4CLAC+BFjMkqR5UVox94wtRMQhwFHAoxHxQeBU\nqkLdkpmXR8SVwHWZuTUi1gDnZuZbI+IB4HbgWOCbwNlUBXstcATVKHxSmflIRLwkM5+OiBcA/9X5\npylJ0sRKm2NeHRHbImIHcDdwA1Wp9mfmKuA0YDAijp/g2LFR8HLg0sw8BegDTgYuAu7NzDOATVOF\nqEv5ncAdwP/ex+ckSdK0lTZivjkzByNiCbAV+ApwHNUImMzcExHbgRftdVzPuOXhzHy4Xv4qcBgw\nAHymPsddEfHkVEEy808iYhNwU0Tclpm3zvZJLVmykL6+RbM9fE6UlqedJmWFZuU169xpUt4mZYXm\n5Z2p0ooZgMxsRcQ64BbgYuAsYGNEHAycAlwFnAm8oD7kpElONVbYO+rjboyIE4GDJ7t2RAw
AGzLz\nbOAp4AmqD4HNWqs1wvDwY/tyio7q61tUVJ52mpQVmpXXrHOnSXmblBWalXe2byBKu5X9fZl5H7AR\neCPwUETcQXVr+frMvAe4Anh3RGwFlo07dHSC5U3Aioi4DXgHVdlOdt0h4J6IuBP4HHBnZt7eoacl\nSVJbxYyY61vFt+61bkOb/e8GTphg/bJxy4PjNp0zgyy/C/zudPeXJKlTiinm+RYRFwCDPDOqHvsa\n1frM3N61YJKkA9oBW8yZuRnY3O0ckiSNV+wcsyRJByKLWZKkgljMkiQV5ICdY+6k3bt2zmqbJEl7\ns5g74JoNg7RaI5Nu7+9fMY9pJElNZjF3wMDAQGN+E40kqWzOMUuSVBCLWZKkgljMkiQVxGKWJKkg\nFrMkSQWxmCVJKojFLElSQSxmSZIKYjFLklQQi1mSpIJYzJIkFcRiliSpIBazJEkFsZglSSqIxSxJ\nUkH8/zF3wNDQEK3WyKTb+/tX0NvbO4+JJElNZTF3wLr1W1iweOmE23bv2snGS9aycuUx85xKktRE\nFnMHLFi8lIVHHt3tGJKk/YBzzJIkFcRiliSpIBazJEkFKWaOOSJOB64HdtSrDgceBM7LzD0dvM6h\nwP2ZuXyK/XqAvwX+KjM/3qnrS5LUTjHFXLs5MwfHHkTEtcBa4NMdvEYPMDqN/X4POKKD15UkaUql\nFXPP2EJEHAIcBTwaER8ETqUq1C2ZeXlEXAlcl5lbI2INcG5mvjUiHgBuB44FvgmcDSwArqUq2gen\nChERZwNPATd19NlJkjSF0uaYV0fEtojYAdwN3EBVqv2ZuQo4DRiMiOMnOHZsFLwcuDQzTwH6gJOB\ni4B7M/MMYFO7ABHx48AgcBnj3ihIkjQfSivmmzNzNVUBPwF8BTiOagRMPde8HXjRXseNL9DhzHy4\nXv4qcBgwANxVn+Mu4Mk2GX4BWAZsA34JeHdE/LdZPyNJkmagtFvZAGRmKyLWAbcAFwNnARsj4mDg\nFOAq4EzgBfUhJ01yqrHC3lEfd2NEnAgc3Oba7xlbjojLgG9k5tbZPxtYsmQhfX2L9uUUHVdannaa\nlBWaldesc6dJeZuUFZqXd6aKLGaAzLwvIjYCbwQeiog7qAr1U5l5T0RcAXwiIs4DhsYdOjrB8ibg\nkxFxG5BUo/F502qNMDz82Hxesq2+vkVF5WmnSVmhWXnNOnealLdJWaFZeWf7BqKYYs7MW4Fb91q3\noc3+dwMnTLB+2bjlwXGbzplFpvfN9BhJkvZFMcU83yLiAqoPeY2Nqse+RrU+M7d3LZgk6YB2wBZz\nZm4GNnc7hyRJ45X2qWxJkg5oFrMkSQWxmCVJKsgBO8fcSbt37ZzVNkmS9mYxd8A1GwZptUYm3d7f\nv2Ie00iSmsxi7oCBgYHGfOFdklQ255glSSqIxSxJUkEsZkmSCmIxS5JUEItZkqSCWMySJBXEYpYk\nqSAWsyRJBbGYJUkqiMUsSVJBLGZJkgpiMUuSVBCLWZKkgljMkiQVxGKWJKkgFrMkSQU5qNsB9gdD\nQ0O0WiOTbu/vX0Fvb+88JpIkNZXF3AHr1m9hweKlE27bvWsnGy9Zy8qVx8xzKklSE1nMHbBg8VIW\nHnl0t2NIkvYDzjFLklQQi1mSpIIUcys7Ik4Hrgd21KsOBx4EzsvMPR28zqHA/Zm5vM0+rwd+u354\nd2b+SqeuL0lSO6WNmG/OzNX1Py8D9gBrO3yNHmB0so0RsRD4Q+CnMvOVwFci4rkdziBJ0oSKGTHX\nesYWIuIQ4Cjg0Yj4IHAqVaFuyczLI+JK4LrM3BoRa4BzM/OtEfEAcDtwLPBN4GxgAXAtcATVKLyd\nU4B7gQ9HxApgc2Y+0tFnKUnSJEobMa+OiG0RsQO4G7iBqlT7M3MVcBowGBHHT3Ds2Ch4OXBpZp4C\n9AEnAxcB92bmGcCmKTI8DzgDuAR4PfCuiPixfXpWkiR
NU2nFfHNmrqYq4CeArwDHUY2AqeeatwMv\n2uu4nnHLw5n5cL38VeAwYAC4qz7HXcCTbTI8Anw+M4cz83HgNuAl+/CcJEmattJuZQOQma2IWAfc\nAlwMnAVsjIiDqW41XwWcCbygPuSkSU41Vtg76uNujIgTgYPbXP5fgOMjYgnwHWAV8PHZPxtYsmQh\nfX2L9uUUHVdannaalBWaldesc6dJeZuUFZqXd6aKLGaAzLwvIjYCbwQeiog7qAr1U5l5T0RcAXwi\nIs4DhsYdOjrB8ibgkxFxG5BUo/HJrjscEeuBrfXxn8rML+3Lc2m1RhgefmxfTtFRfX2LisrTTpOy\nQrPymnXuNClvk7JCs/LO9g1EMcWcmbcCt+61bkOb/e8GTphg/bJxy4PjNp0zgyzXU311S5KkeVVM\nMc+3iLgAGOSZUfXY16jWZ+b2rgWTJB3QDthizszNwOZu55AkabzSPpUtSdIBzWKWJKkgFrMkSQU5\nYOeYO2n3rp2z2iZJ0t4s5g64ZsMgrdbIpNv7+1fMYxpJUpNZzB0wMDDQmC+8S5LK5hyzJEkFsZgl\nSSqIxSxJUkEsZkmSCmIxS5JUEItZkqSCWMySJBXEYpYkqSAWsyRJBbGYJUkqiMUsSVJBLGZJkgpi\nMUuSVBCLWZKkgljMkiQVxGKWJKkgB3U7wP5gaGiIVmvkWev7+1fQ29vbhUSSpKaymDtg3fotLFi8\n9AfW7d61k42XrGXlymO6lEqS1EQWcwcsWLyUhUce3e0YkqT9gHPMkiQVxGKWJKkgFrMkSQUpZo45\nIk4Hrgd21KsOBx4EzsvMPR28zqHA/Zm5fJLtJwAfAUaBHmAV8KbM3NqpDJIkTaaYYq7dnJmDYw8i\n4lpgLfDpDl6jh6p0J5SZ/wqcWV//Z4CvWcqSpPlSWjH3jC1ExCHAUcCjEfFB4FSqQt2SmZdHxJXA\ndZm5NSLWAOdm5lsj4gHgduBY4JvA2cAC4FrgCKpR+JQiYgHwPuC0jj07SZKmUNoc8+qI2BYRO4C7\ngRuoSrU/M1dRleRgRBw/wbFjo+DlwKWZeQrQB5wMXATcm5lnAJummeWXgeszszXrZyNJ0gyVNmK+\nOTMHI2IJsBX4CnAc1QiYzNwTEduBF+11XM+45eHMfLhe/ipwGDAAfKY+x10R8eQ0spxHNdqetSVL\nFtLXt2hfTjFnSs01kSZlhWblNevcaVLeJmWF5uWdqdKKGYDMbEXEOuAW4GLgLGBjRBwMnAJcRTUP\n/IL6kJMmOdVYYe+oj7sxIk4EDm53/Yg4HDgkM7++L8+j1RphePixfTnFnOjrW1Rkrok0KSs0K69Z\n506T8jYpKzQr72zfQJR2K/v7MvM+YCPwRuChiLgDuIPq9vI9wBXAuyNiK7Bs3KGjEyxvAlZExG3A\nO4Anprj8ANVoXZKkeVXMiDkzbwVu3Wvdhjb73w2cMMH6ZeOWB8dtOmcGWf4f8Jbp7i9JUqcUU8zz\nLSIuAAZ5ZlQ99jWq9Zm5vWvBJEkHtAO2mDNzM7C52zkkSRqv2DlmSZIORBazJEkFOWBvZXfS7l07\np7VOkqSpWMwdcM2GQVqtkWet7+9f0YU0kqQms5g7YGBgoDFfeJcklc05ZkmSCmIxS5JUEItZkqSC\nWMySJBXEYpYkqSAWsyRJBbGYJUkqiMUsSVJBLGZJkgpiMUuSVBCLWZKkgljMkiQVxGKWJKkgFrMk\nSQWxmCVJKojFLElSQQ7qdoD9wdDQEK3WyLPW9/evoLe3twuJJElNZTF3wLr1W1iweOkPrNu9aycb\nL1nLypXHdCmVJKmJLOYOWLB4KQuPPLrbMSRJ+wHnmCVJKojFLElSQSxmSZIKUswcc0ScDlwP7KhX\nHQ48CJyXmXs6eJ1Dgfszc3mbfT4CvAp4rF71psx8bLL9JUnqlGKKuXZzZg6OPYiIa4G1wKc7eI0e\nYHSKfV4KrMnMVge
vK0nSlEor5p6xhYg4BDgKeDQiPgicSlWoWzLz8oi4ErguM7dGxBrg3Mx8a0Q8\nANwOHAt8EzgbWABcCxxBNQqfVET0AMcAH4+Io4A/y8wrO/1EJUmaSGlzzKsjYltE7ADuBm6gKtX+\nzFwFnAYMRsTxExw7NgpeDlyamacAfcDJwEXAvZl5BrBpigzPAT4K/DzwOuAdk1xPkqSOK23EfHNm\nDkbEEmAr8BXgOKoRMJm5JyK2Ay/a67ieccvDmflwvfxV4DBgAPhMfY67IuLJNhl2Ax/NzO8CRMQ2\n4ATgizN9MkuWLKSvb9FMD5sXpeaaSJOyQrPymnXuNClvk7JC8/LOVGnFDEBmtiJiHXALcDFwFrAx\nIg4GTgGuAs4EXlAfctIkpxor7B31cTdGxInAwW0uPwB8KiJeQvX6nFpfb8ZarRGGh8v7zFhf36Ii\nc02kSVmhWXnNOnealLdJWaFZeWf7BqK0W9nfl5n3ARuBNwIPRcQdwB3A9Zl5D3AF8O6I2AosG3fo\n6ATLm4AVEXEb8A7giTbXvR/4JLCd6o3B1XUWSZLmXDEj5sy8Fbh1r3Ub2ux/N9Ut5r3XLxu3PDhu\n0zkzyPIh4EPT3V+SpE4pppjnW0RcAAzyzKh67GtU6zNze9eCSZIOaAdsMWfmZmBzt3NIkjResXPM\nkiQdiCxmSZIKYjFLklSQA3aOuZN279o5rXWSJE3FYu6AazYM0mqNPGt9f/+KLqSRJDWZxdwBAwMD\njflNNJKksjnHLElSQSxmSZIKYjFLklQQi1mSpIJYzJIkFcRiliSpIBazJEkF6RkdHZ16L0mSNC8c\nMUuSVBCLWZKkgljMkiQVxGKWJKkgFrMkSQWxmCVJKoj/28cZiIge4GPACcB3gbdl5kPjtr8R+C3g\nSeDKzLyiK0GZOmu9zwJgK3B+Zg7Nf8ofyDLVa/tzwK9Rvbb3ZuY7uhKUaWU9G3gP8DSwJTM/2pWg\nTO/noN5vE/BIZv7GPEfcO8dUr+3/AN4G7KxXXZiZD8x7UKaV9WTgQ/XDbwI/n5nfm/egz+SZNG9E\nPB/4c2AU6AFeArwnMz9eWtZ6+3nAu4E9VH/X/mk3ctZZpsq6DrgY+DZwdWZ+YqpzOmKembOAQzPz\nFGA98OGxDRFxUP34tcAZwNsjoq8bIWuTZgWIiJcCtwIrupBtIu1e28OA3wFOz8zTgCMi4qe7ExNo\nn/WHgPcDq4FTgHdExJKupKy0/TkAiIgLgePnO9gkpsr7UmBdZq6u/+lKKdemyvpx4Jcy89XATcCP\nznO+vU10iRKiAAADo0lEQVSaNzP/MzPPzMzV9ba7gc3diQlM/dp+gOq/sVOB/xkRi+c533jt/j54\nLtXfXa+m6oXzIuJHpjqhxTwzp1L9B0ZmbgdeNm7bccADmfmdzHwS+BzVv4xuaZcV4BCqH6j75znX\nZNrlfQI4JTOfqB8fRPXOtFsmzZqZTwPHZeYI8Dyq/8a6Nkpiip+DiHglcDKwaf6jTWiqn9uXAusj\n4vaI+PX5DreXSbNGxADwCPDuiPgssKTLbyJg6td2zOXARZnZzd8+NVXWfwWOBH64flxq1hXAPZm5\nq349Pw+smuqEFvPMHA7sGvd4Tz1CmmjbY0A338W1y0pm3pmZX6e6bVWCSfNm5mhmDgNExK8Cz8nM\nf+pCxjFTvbZPR8SbgXuAzwKPz2+8HzBp1og4CrgM+BUa8HNQuw64CDgTODUi3jCf4fbSLuvzgFcC\nH6W6i/baiDhjfuM9y1Sv7dh03Bcz88vzmuzZpsq6g2pUfy/wmcz8znyG20u7rA8APx4RffXU4WuA\n50x1Qot5Zr4DLBr3+IfqEdLYtsPHbVtENafQLe2ylqht3ojoiYgPUP1gv2W+w+1lytc2M2/IzGXA\nocAvzGe4vbTL+rPAc4G/A34dGIyIbmaFqV/bjZnZysw9wN8CJ85ruh/ULusjwJczc
6jOehOTj1Dn\ny3T+Tvh5qlvw3TZp1oj4CeCnqKYG+oHn15/r6JZJs2bmt6nmwv8PcC3Vm4lvTXVCi3lm/i/wBoCI\nWEX1bm3MfcCPRcQREXEI1W3sO+c/4ve1y1qiqfJ+nGoe56xxt7S7ZdKsEbEoIj5b/wxANVru5hui\nSbNm5uWZeXI9r/gHVB9U+2R3Yn5fu9f2cOCLEbGg/sDNaqq/6Lql3c/sQ8DCiBj7DMdpVKO8bprO\n3wkvy8xu/r01pl3WXcBu4In69vBOqtva3dLuZ7YXOKn+nME5wLH1/m35P7GYgXGfvntxveqtVHNe\nz8nMKyLip6huDfYAf1bIJwUnzDpuv21U80mlfCr7WXmp/vL9PHB7vW2UauT01/OdE6b1c/A2qk8O\nfw/4AvCr3Zqvm8HPwS8CUdCnsid7bc+j+nT+d4GbM/N93Uk6raxnAP+r3nZHZr5r/lM+Yxp5nwds\nzcyTupVxzDSyXgicT/X5kweBC+o7EyVm/W2qz/P8F/ChzPz0VOe0mCVJKoi3siVJKojFLElSQSxm\nSZIKYjFLklQQi1mSpIJYzJIkFcRiliSpIBazJEkF+f/fNIwV37IwbgAAAABJRU5ErkJggg==\n", "text/plain": [ "<matplotlib.figure.Figure at 0x11c00add0>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "feature_importances = pd.Series(model.feature_importances_, index=X.columns)\n", "feature_importances.sort()\n", "feature_importances.plot(kind=\"barh\", figsize=(7,6))" ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "AUC [ 0.72956841 0.66758494 0.64152893], Average AUC 0.679560759106\n", "n trees: 1, CV AUC [ 0.65151515 0.66666667 0.5632461 ], Average AUC 0.627142638506\n", "n trees: 11, CV AUC [ 0.74437557 0.64543159 0.62775482], Average AUC 0.672520661157\n", "n trees: 21, CV AUC [ 0.71613866 0.65381084 0.63636364], Average AUC 0.668771043771\n", "n trees: 31, CV AUC [ 0.72623967 0.67229109 0.64577594], Average AUC 0.681435567799\n", "n trees: 41, CV AUC [ 0.73783287 0.65048209 0.64015152], Average AUC 0.676155494337\n", "n trees: 51, CV AUC [ 0.72463269 0.65438476 0.64290634], Average AUC 0.673974594429\n", "n trees: 61, CV AUC [ 0.71705693 0.66069789 0.65587695], Average AUC 0.677877257423\n", "n trees: 71, CV AUC [ 0.73519284 0.66609275 0.64623508], Average AUC 0.682506887052\n", "n trees: 81, CV AUC [ 0.72945363 0.65943526 0.62775482], Average AUC 0.672214569942\n", "n trees: 91, CV AUC [ 0.74368687 
0.67171717 0.63429752], Average AUC 0.683233853688\n" ] } ], "source": [ "from sklearn.model_selection import cross_val_score\n", "\n", "# cv=3 matches the three folds shown in the recorded output\n", "scores = cross_val_score(model, X, y, scoring='roc_auc', cv=3)\n", "print('AUC {}, Average AUC {}'.format(scores, scores.mean()))\n", "\n", "# Compare cross-validated AUC as the number of trees grows\n", "for n_trees in range(1, 100, 10):\n", "    model = RandomForestClassifier(n_estimators = n_trees)\n", "    scores = cross_val_score(model, X, y, scoring='roc_auc', cv=3)\n", "    print('n trees: {}, CV AUC {}, Average AUC {}'.format(n_trees, scores, scores.mean()))" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "## Best_of = 3\n", "We now restrict the analysis to best-of-three matches (`Best_of == 3`)." ] }, { "cell_type": "code", "execution_count": 189, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>Date</th>\n", " <th>Series</th>\n", " <th>Surface</th>\n", " <th>Round</th>\n", " <th>Best_of</th>\n", " <th>WRank</th>\n", " <th>LRank</th>\n", " <th>win</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>39530</th>\n", " <td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Clay</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>92</td>\n", " <td>42</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>39531</th>\n", " <td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Clay</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>45</td>\n", " <td>78</td>\n", " <td>1</td>\n", " </tr>\n", " <tr>\n", " <th>39532</th>\n", " <td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Clay</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>53</td>\n", " <td>230</td>\n", " <td>1</td>\n", " </tr>\n", " <tr>\n", " <th>39533</th>\n", " <td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Clay</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>84</td>\n", " <td>165</td>\n", " <td>1</td>\n", " </tr>\n", " <tr>\n", " 
<th>39534</th>\n", " <td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Clay</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>18</td>\n", " <td>111</td>\n", " <td>1</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Date Series Surface Round Best_of WRank LRank win\n", "39530 2014-12-02 ATP250 Clay 1st Round 3 92 42 0\n", "39531 2014-12-02 ATP250 Clay 1st Round 3 45 78 1\n", "39532 2014-12-02 ATP250 Clay 1st Round 3 53 230 1\n", "39533 2014-12-02 ATP250 Clay 1st Round 3 84 165 1\n", "39534 2014-12-02 ATP250 Clay 1st Round 3 18 111 1" ] }, "execution_count": 189, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "df_atp = pd.read_csv('Data.csv')\n", "# Note: the raw dates look day-first (e.g. 3/01/2000); dayfirst=True may be needed to parse them correctly\n", "df_atp['Date'] = pd.to_datetime(df_atp['Date']) \n", "# Restricting dates\n", "df_atp = df_atp.loc[(df_atp['Date'] > '2014-11-09') & (df_atp['Date'] <= '2016-11-09')]\n", "# Keeping only completed matches\n", "df_atp = df_atp[df_atp['Comment'] == 'Completed'].drop(\"Comment\",axis = 1)\n", "# Rename Best of to Best_of\n", "df_atp.rename(columns = {'Best of':'Best_of'},inplace=True)\n", "# Choosing features\n", "cols_to_keep = ['Date','Series','Surface', 'Round','Best_of', 'WRank','LRank']\n", "# Dropping NaN\n", "df_atp = df_atp[cols_to_keep].dropna()\n", "# Dropping errors in the dataset and unimportant entries (e.g. 
there are very few entries for Masters Cup)\n", "df_atp = df_atp[(df_atp['LRank'] != 'NR') & (df_atp['WRank'] != 'NR') & (df_atp['Series'] != 'Masters Cup')]\n", "df_atp[['Best_of','WRank','LRank']] = df_atp[['Best_of','WRank','LRank']].astype(int)\n", "def win(x):\n", " if x > 0:\n", " return 0\n", " elif x <= 0:\n", " return 1 \n", " \n", "df_atp['win'] = (df_atp['WRank'] - df_atp['LRank']).apply(win)\n", "df_atp.head()" ] }, { "cell_type": "code", "execution_count": 190, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>Date</th>\n", " <th>Series</th>\n", " <th>Surface</th>\n", " <th>Round</th>\n", " <th>Best_of</th>\n", " <th>WRank</th>\n", " <th>LRank</th>\n", " <th>win</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>39530</th>\n", " <td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Clay</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>92</td>\n", " <td>42</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>39531</th>\n", " <td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Clay</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>45</td>\n", " <td>78</td>\n", " <td>1</td>\n", " </tr>\n", " <tr>\n", " <th>39537</th>\n", " <td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Clay</td>\n", " <td>2nd Round</td>\n", " <td>3</td>\n", " <td>14</td>\n", " <td>58</td>\n", " <td>1</td>\n", " </tr>\n", " <tr>\n", " <th>39560</th>\n", " <td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Hard</td>\n", " <td>1st Round</td>\n", " <td>3</td>\n", " <td>96</td>\n", " <td>56</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>39563</th>\n", " <td>2014-12-02</td>\n", " <td>ATP250</td>\n", " <td>Hard</td>\n", " <td>2nd Round</td>\n", " <td>3</td>\n", " <td>85</td>\n", " <td>81</td>\n", " <td>0</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Date Series 
Surface Round Best_of WRank LRank win\n", "39530 2014-12-02 ATP250 Clay 1st Round 3 92 42 0\n", "39531 2014-12-02 ATP250 Clay 1st Round 3 45 78 1\n", "39537 2014-12-02 ATP250 Clay 2nd Round 3 14 58 1\n", "39560 2014-12-02 ATP250 Hard 1st Round 3 96 56 0\n", "39563 2014-12-02 ATP250 Hard 2nd Round 3 85 81 0" ] }, "execution_count": 190, "metadata": {}, "output_type": "execute_result" } ], "source": [ "newdf = df_atp.copy()\n", "newdf2 = newdf[(newdf['WRank'] <= 100) & (newdf['LRank'] <= 100)]\n", "newdf2.head()" ] }, { "cell_type": "code", "execution_count": 191, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>Date</th>\n", " <th>Surface</th>\n", " <th>Round</th>\n", " <th>WRank</th>\n", " <th>LRank</th>\n", " <th>win</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>39530</th>\n", " <td>2014-12-02</td>\n", " <td>Clay</td>\n", " <td>1st Round</td>\n", " <td>92</td>\n", " <td>42</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>39531</th>\n", " <td>2014-12-02</td>\n", " <td>Clay</td>\n", " <td>1st Round</td>\n", " <td>45</td>\n", " <td>78</td>\n", " <td>1</td>\n", " </tr>\n", " <tr>\n", " <th>39537</th>\n", " <td>2014-12-02</td>\n", " <td>Clay</td>\n", " <td>2nd Round</td>\n", " <td>14</td>\n", " <td>58</td>\n", " <td>1</td>\n", " </tr>\n", " <tr>\n", " <th>39560</th>\n", " <td>2014-12-02</td>\n", " <td>Hard</td>\n", " <td>1st Round</td>\n", " <td>96</td>\n", " <td>56</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>39563</th>\n", " <td>2014-12-02</td>\n", " <td>Hard</td>\n", " <td>2nd Round</td>\n", " <td>85</td>\n", " <td>81</td>\n", " <td>0</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Date Surface Round WRank LRank win\n", "39530 2014-12-02 Clay 1st Round 92 42 0\n", "39531 2014-12-02 Clay 1st Round 45 78 1\n", "39537 2014-12-02 Clay 2nd Round 
14 58 1\n", "39560 2014-12-02 Hard 1st Round 96 56 0\n", "39563 2014-12-02 Hard 2nd Round 85 81 0" ] }, "execution_count": 191, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df3 = newdf2.copy()\n", "df3 = df3[df3['Best_of'] == 3]\n", "# Drop Best_of and Series columns\n", "df3.drop(\"Series\",axis = 1,inplace=True)\n", "df3.drop(\"Best_of\",axis = 1,inplace=True)\n", "df3.head()" ] }, { "cell_type": "code", "execution_count": 192, "metadata": { "collapsed": true }, "outputs": [], "source": [ "y_0 = df3[df3.win == 0] \n", "y_1 = df3[df3.win == 1] \n", "n = min([len(y_0), len(y_1)]) \n", "y_0 = y_0.sample(n = n, random_state = 0) \n", "y_1 = y_1.sample(n = n, random_state = 0)\n", "df_strat = pd.concat([y_0, y_1]) \n", "X_strat = df_strat[['Date', 'Surface', 'Round','WRank', 'LRank']]\n", "y_strat = df_strat.win" ] }, { "cell_type": "code", "execution_count": 193, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>Date</th>\n", " <th>Surface</th>\n", " <th>Round</th>\n", " <th>WRank</th>\n", " <th>LRank</th>\n", " <th>win</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>43714</th>\n", " <td>2015-12-08</td>\n", " <td>Hard</td>\n", " <td>2nd Round</td>\n", " <td>49</td>\n", " <td>34</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>42431</th>\n", " <td>2015-03-15</td>\n", " <td>Hard</td>\n", " <td>2nd Round</td>\n", " <td>41</td>\n", " <td>32</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>45333</th>\n", " <td>2016-04-20</td>\n", " <td>Clay</td>\n", " <td>2nd Round</td>\n", " <td>51</td>\n", " <td>35</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>42802</th>\n", " <td>2015-01-05</td>\n", " <td>Clay</td>\n", " <td>Quarterfinals</td>\n", " <td>63</td>\n", " <td>50</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>45438</th>\n", " <td>2016-01-05</td>\n", " 
<td>Clay</td>\n", " <td>The Final</td>\n", " <td>87</td>\n", " <td>29</td>\n", " <td>0</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Date Surface Round WRank LRank win\n", "43714 2015-12-08 Hard 2nd Round 49 34 0\n", "42431 2015-03-15 Hard 2nd Round 41 32 0\n", "45333 2016-04-20 Clay 2nd Round 51 35 0\n", "42802 2015-01-05 Clay Quarterfinals 63 50 0\n", "45438 2016-01-05 Clay The Final 87 29 0" ] }, "execution_count": 193, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X_strat_1=X_strat.copy()\n", "X_strat_1['win']=y_strat\n", "X_strat_1.head()" ] }, { "cell_type": "code", "execution_count": 194, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>Date</th>\n", " <th>Surface</th>\n", " <th>Round</th>\n", " <th>WRank</th>\n", " <th>LRank</th>\n", " <th>win</th>\n", " <th>P1</th>\n", " <th>P2</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>43714</th>\n", " <td>2015-12-08</td>\n", " <td>Hard</td>\n", " <td>2nd Round</td>\n", " <td>49</td>\n", " <td>34</td>\n", " <td>0</td>\n", " <td>49</td>\n", " <td>34</td>\n", " </tr>\n", " <tr>\n", " <th>42431</th>\n", " <td>2015-03-15</td>\n", " <td>Hard</td>\n", " <td>2nd Round</td>\n", " <td>41</td>\n", " <td>32</td>\n", " <td>0</td>\n", " <td>41</td>\n", " <td>32</td>\n", " </tr>\n", " <tr>\n", " <th>45333</th>\n", " <td>2016-04-20</td>\n", " <td>Clay</td>\n", " <td>2nd Round</td>\n", " <td>51</td>\n", " <td>35</td>\n", " <td>0</td>\n", " <td>51</td>\n", " <td>35</td>\n", " </tr>\n", " <tr>\n", " <th>42802</th>\n", " <td>2015-01-05</td>\n", " <td>Clay</td>\n", " <td>Quarterfinals</td>\n", " <td>63</td>\n", " <td>50</td>\n", " <td>0</td>\n", " <td>63</td>\n", " <td>50</td>\n", " </tr>\n", " <tr>\n", " <th>45438</th>\n", " <td>2016-01-05</td>\n", " <td>Clay</td>\n", " <td>The Final</td>\n", " 
<td>87</td>\n", " <td>29</td>\n", " <td>0</td>\n", " <td>87</td>\n", " <td>29</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Date Surface Round WRank LRank win P1 P2\n", "43714 2015-12-08 Hard 2nd Round 49 34 0 49 34\n", "42431 2015-03-15 Hard 2nd Round 41 32 0 41 32\n", "45333 2016-04-20 Clay 2nd Round 51 35 0 51 35\n", "42802 2015-01-05 Clay Quarterfinals 63 50 0 63 50\n", "45438 2016-01-05 Clay The Final 87 29 0 87 29" ] }, "execution_count": 194, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = X_strat_1.copy()\n", "df[\"P1\"] = df[[\"WRank\", \"LRank\"]].max(axis=1)\n", "df[\"P2\"] = df[[\"WRank\", \"LRank\"]].min(axis=1)\n", "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exploratory Analysis for Best_of = 3" ] }, { "cell_type": "code", "execution_count": 195, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAfAAAAFICAYAAACvNaz+AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAG1VJREFUeJzt3XuUnWV59/HvJCNoMpM0gVFOfeWgXAooKigQUzEIWhVq\nPFDFMxiLaH1rtYqpioKKlSXUirIQYhVELYriARG7RA42YkRRAZULaBoQwdfBmeZAgJBk3j+ePWET\nMzM7yTyz597z/ayVlTznayf37F+e0313DQ0NIUmSyjKt3QVIkqStZ4BLklQgA1ySpAIZ4JIkFcgA\nlySpQAa4JEkF6q5z5xHRBZwDHAg8ACzKzOWNZY8D/gMYArqApwEnZ+Z5ddYkSVInqDXAgYXAjpk5\nLyIOAc5qzCMz/x+wACAiDgU+Apxfcz2SJHWEui+hzweuAMjMZcDBI6x3NvCWzLRXGUmSWlB3gM8C\nVjZNr4+IRxwzIo4Bbs7M22uuRZKkjlH3JfRVQG/T9LTM3LjZOq8FPtnKztav3zDU3T19vGqTJKkE\nXVuaWXeALwWOBi5p3Oe+aQvrHJyZ17Wys8HBteNZmyRJk15fX+8W59cd4JcCR0XE0sb08RFxHDAz\nM5dExM488hK7JElqQVdJo5H1968up1hJksZBX1/vFi+h25GLJEkFMsAlSSqQAS5JUoEMcEmSCmSA\nS5JUIANckqQCGeCb+dKXLuCOO1a0uwxJkkble+AdaMOGDaxYsbzdZUyYPffcm+nT7WJXUmca6T3w\nuntim7Te/OY38JnPnM/vf38Xb3vbm7n88iv5xS9+zte//lVe/erX8ZOf/Ji77/49AwMDrF69io9+\n9Awe+9jHtbvslqxYsZwPfO00enae1e5Sarfm3lV8+NhT2GefJ7a7FEmaUFM2wJ/5zEP45S9v4I47\nVtDX18ftt9/GT37yY1avXr1pncc+9nG8//2n8sUvfp5rrrmKY499VRsr3jo9O89i9i5z2l2GJKkm\nU/Ye+Lx587n++mXceOMvec1r3sgNN1zPLbf8hr6+vk3r7LPPEwD
Yeec+1q17sF2lSpL0Z6bsGfj+\n+z+F8847hxkzZnDYYc/mne98G3vttc9ma23xtoMkaTtMted0oJ5ndaZsgHd1dfG4x+3C7rvvQW9v\nL0ND8OxnP4elS6/dtFySNP6m0nM6UN+zOlM2wAHe974PbfrzkiUXAnD44QsA2G+/AzYte+ELj57Q\nuiSp0/mczvabsvfAJUkqmQEuSVKBDHBJkgpkgEuSVKCOeoitjlcT7KZTkjQZdVSAr1ixnMVnXszM\n2X1jr9yC+1b287F3vdJuOiVJk05HBTjAzNl9zJq764Qec2hoiDPP/Bduv/02dthhB04++f3svvse\nE1qDJGlq8R74OLj22qtZt24d557775x44t/z6U//a7tLkiR1OAN8HNx44y855JB5AOy//wHccstv\n21yRJKnTGeDjYO3a++jp6dk0PX36dDZu3NjGiiRJnc4AHwczZsxk7dr7Nk1v3LiRadP8q5Uk1afj\nHmK7b2X/hO/rqU89kKVLf8SCBUdy8803bRqGVJKkunRUgO+559587F2vHPd9juU5z1nA9dcv46ST\nTgBg8eIPjmsNkiRtrqMCfPr06W15Z7urq4t/+qfFE35cSdLU5Y1aSZIKZIBLklQgA1ySpAIZ4JIk\nFajWh9giogs4BzgQeABYlJnLm5Y/EzizMfkH4LWZuW5bj+doZJKkqaLup9AXAjtm5ryIOAQ4qzFv\n2HnAyzNzeUScADweuG1bD7ZixXI+8LXT6Nl51nYVPWzNvav48LGnOBqZJGnSqTvA5wNXAGTmsog4\neHhBROwL/Al4Z0QcAFyWmdsc3sN6dp7F7F3mbO9uttqvf30z5557Nmef/dkJP7Ykaeqp+x74LGBl\n0/T6iBg+5s7AYcCngCOBIyPiuTXXU4svf/lCzjjjIzz00EPtLkWSNEXUfQa+Cuhtmp6WmcOjfPwJ\nuD0zbwWIiCuAg4GrR9rZnDkz6O4e+X704GDPiMu21dy5PfT19Y66zn777ctLX3oM73nPe8ZcdyLU\n8fcwmbXybyRp8phq31FQz/dU3QG+FDgauCQiDgVualq2HOiJiL0bD7b9FbBktJ0NDq4d9WADA2u2\nr9oR9tnfv3rUdZ72tEP5wx/u4aGHNoy57kSo4+9hMmvl30jbro6HQyc7H16t11T7joLt+54aKfjr\nDvBLgaMiYmlj+viIOA6YmZlLIuJNwFciAuDHmfm9muuRtJXG++HQyc6HV1WKWgM8M4eAkzabfWvT\n8quBQ8bzmGvuXdW2fQ0NDY3bsaXJpF0Ph0oaWUcNZrLnnnvz4WNPGfd9tqqrq2tcjy1J0kg6KsDb\nNRoZwC677Mq55/57W44tSZp67EpVkqQCGeCSJBXIAJckqUAGuCRJBTLAJUkqkAEuSVKBDHBJkgpk\ngEuSVCADXJKkAhngkiQVyACXJKlABrgkSQUywCVJKpABLklSgQxwSZIKZIBLklQgA1ySpAIZ4JIk\nFcgAlySpQAa4JEkFMsAlSSqQAS5JUoEMcEmSCmSAS5JUIANckqQCGeCSJBXIAJckqUAGuCRJBTLA\nJUkqkAEuSVKBuuvceUR0AecABwIPAIsyc3nT8ncAi4A/NmadmJm31VmTJEmdoNYABxYCO2bmvIg4\nBDirMW/YQcDrMvMXNdchSVJHqfsS+nzgCoDMXAYcvNnyg4DFEfGjiHhvzbVIktQx6g7wWcDKpun1\nEdF8zK8AbwEWAPMj4kU11yNJUkeo+xL6KqC3aXpaZm5smv63zFwFEBHfBZ4OXD7SzubMmUF39/Ra\nCu0kg4M97S5hQs2d20NfX+/YK2qbTLX2BLaputmmxkfdAb4UOBq4JCIOBW4aXhARs4CbI+JJwP3A\nEcDnRtvZ4ODaGkvtHAMDa9pdwoQaGFhDf//qdpfRsaZaewLbVN1sU1tnpOCvO8AvBY6KiKWN6eMj\n4jhgZmYuiYjFwNVUT6hfmZl
X1FyPJEkdodYAz8wh4KTNZt/atPxLwJfqrEGSpE5kRy6SJBXIAJck\nqUAGuCRJBTLAJUkqkAEuSVKBDHBJkgpkgEuSVCADXJKkAhngkiQVyACXJKlABrgkSQUywCVJKpAB\nLklSgQxwSZIKZIBLklQgA1ySpAIZ4JIkFcgAlySpQAa4JEkFMsAlSSqQAS5JUoEMcEmSCmSAS5JU\nIANckqQCGeCSJBXIAJckqUAGuCRJBTLAJUkqkAEuSVKBDHBJkgrU3cpKEdEDLACeCGwEbgd+kJkP\n1FibJEkawagBHhEzgA8CLwNuBO4AHgLmAf8aEd8APpyZa+ouVJIkPWysM/CLgPOAxZm5sXlBREwD\njm6ss3BLG0dEF3AOcCDwALAoM5dvYb3PAn/KzH/e6k8gtcGGDRtYseLPmnJHuvPOO9pdgqQtGCvA\nX56ZQ1ta0Aj0b0fEd0bZfiGwY2bOi4hDgLPYLOwj4kTgAOCa1suW2mvFiuUsPvNiZs7ua3cpteu/\nK9nt8HZXIWlzYwX4ByJixIWZedpIAd8wH7iise6yiDi4eWFEHAY8E/gs8KSWKpYmiZmz+5g1d9d2\nl1G7NSv7gXvaXYakzYwV4F3buf9ZwMqm6fURMS0zN0bELlT31xcCr2xlZ3PmzKC7e/p2ltT5Bgd7\n2l3ChJo7t4e+vt4JPeZU+zueatrRpqaSqfjzU0ebGjXAM/PULc1v3Nveq4X9rwKaK57WdC/9WGAn\n4HJgV+AxEXFLZl440s4GB9e2cEgNDEytZwoHBtbQ3796wo+pztWONjWVTMWfn+1pUyMFf6uvkf09\ncDows2n2/wBPGGPTpVQPul0SEYcCNw0vyMyzgbMb+38DEKOFtyRJelirHbm8i+pJ8ouBfYA3Acta\n2O5S4MGIWAqcCfxjRBwXEYu2pVhJklRp6Qwc+GNm/k9E3Ag8JTO/0DgrH1XjAbeTNpt96xbWu6DF\nOiRJEq2fgd8XEQuoOnM5pvEA2pz6ypIkSaNpNcDfDvwN1SthOwG30Lh/LUmSJl6rl9B3y8x/bPz5\n5QAR8bJ6SpIkSWMZqy/0VwI7AqdFxCmbbffPwDdqrE2SJI1grDPwWVQDl/RSjUY2bD3wvrqKkiRJ\noxurI5fzgfMj4nmZeWVE9ALTM/N/J6Y8SZK0Ja0+xLYiIn4KrACWR8QvImLf+sqSJEmjaTXAzwXO\nyMydMnMu8DGqYUYlSVIbtBrgO2fmJcMTmflVYG49JUmSpLG0GuAPRsQzhici4iDAkUUkSWqTVt8D\nfwfw9YgYoBpidC4tDgEqSZLGX6sBnsC+jV/TGtO71lWUJEka3Vgdufwl1Rn35cALgeHBTPdozHtS\nrdVJkqQtGusM/FSqDlx2A65tmr8euKyuoiRJ0ujG6sjlBICIODkzPz4xJUmSpLGM+hR6RHwsImaP\nFN4RMTciDHZJkibYWJfQvwp8KyLuprqEfhfV5fPHA0dQXVp/R60VSpKkPzPWJfRfAM+NiAVU44Ef\nDWwE/hv4bGb+sP4SJUnS5lp6jSwzrwKuqrkWSZqyNmzYwIoVy9tdxoS488472l1CR2gpwCPiBcBH\nqDpw6Rqen5l711SXJE0pK1YsZ/GZFzNzdl+7S6ld/13Jboe3u4rytdqRy9nAO4GbgaH6ypGkqWvm\n7D5mze38PrLWrOwH7ml3GcVrNcDvzUzf+5YkaZJoNcB/FBFnAVcADwzPzMxrR95k8phK95bA+0uS\nNBW0GuDPavz+9KZ5Q1Svkk16U+neEnh/SZKmglafQl9QdyF1myr3lsD7S5I0FbT6FPp84N1AD9VT\n6NOBx2fmnvWVJkmSRjJqV6pNlgDfpAr8zwC3AZfWVZQkSRpdqwF+f2Z+HrgaGATeDHiXVZKkNmk1\nwB+IiLlAAodm5hAws76yJEnSaFoN8LOAi4HvAK+PiF8DP6utKkmSNKqWAjwzvwY8PzNXAwcBr
wVe\nV2dhkiRpZC0FeETMAc6LiB8CjwbeDsyuszBJkjSyVjtyOR/4T6oOXVZTvWR8EfDi0TaKiC7gHOBA\nqh7cFmXm8qblLwdOphqi9MuZ+amt/QCSJE1Frd4D3yszzwM2Zua6zHwfsEcL2y0EdszMecBiqnvp\nAETENOB0qt7c5gFvbTwoJ0mSxtBqgK+PiNk0RiKLiCdSnTWPZT5V/+lk5jLg4OEFmbkReHJmrgF2\nbtSyrvXSJUmaulq9hP5BqnfA/zIivgkcBpzQwnazgJVN0+sjYlojvMnMjRHxUqrOYS4D7httZ3Pm\nzKC7e3qLJT9scLBnq7dROebO7aGvr3dCj2mb6my2KY23OtpUqwH+c6qe144B/g/wDaqn0b87xnar\ngOaKN4X3sMy8FLg0Ii4AXg9cMNLOBgfXtljuIw0MrNmm7VSGgYE19PevnvBjqnPZpjTetqdNjRT8\nrQb45cCNVGfJw7pa2G4pcDRwSUQcCtw0vCAieqneK39+Zq6jOvtu5bK8JElTXqsBTma+aRv2fylw\nVEQsbUwfHxHHATMzc0lEXARcGxHrqP6DcNE2HEOSpCmn1QD/ZkQsAn4IrB+emZl3jrZRo8vVkzab\nfWvT8iVUA6VIkqSt0GqAzwbeC9zbNG8I2HvcK5IkSWNqNcBfDjw2M++vsxhJktSaVt8DXw7MqbMQ\nSZLUulbPwIeA30TEzTR1tpKZR9RSlSRJGlWrAf7RWquQJElbpaUAz8xr6i5EkiS1rtV74JIkaRIx\nwCVJKpABLklSgQxwSZIKZIBLklQgA1ySpAIZ4JIkFcgAlySpQAa4JEkFMsAlSSqQAS5JUoEMcEmS\nCmSAS5JUIANckqQCGeCSJBXIAJckqUAGuCRJBTLAJUkqkAEuSVKBDHBJkgpkgEuSVCADXJKkAhng\nkiQVyACXJKlABrgkSQXqrnPnEdEFnAMcCDwALMrM5U3LjwP+AXgIuCkz31pnPZIkdYq6z8AXAjtm\n5jxgMXDW8IKIeDRwGnB4Zv4V8BcRcXTN9UiS1BHqDvD5wBUAmbkMOLhp2YPAvMx8sDHdTXWWLkmS\nxlB3gM8CVjZNr4+IaQCZOZSZ/QAR8XZgZmb+oOZ6JEnqCLXeAwdWAb1N09Myc+PwROMe+RnAE4GX\njbWzOXNm0N09fauLGBzs2eptVI65c3vo6+sde8VxZJvqbLYpjbc62lTdAb4UOBq4JCIOBW7abPl5\nwP2ZubCVnQ0Ort2mIgYG1mzTdirDwMAa+vtXT/gx1blsUxpv29OmRgr+ugP8UuCoiFjamD6+8eT5\nTODnwPHAjyLiKmAI+LfM/FbNNUmSVLxaAzwzh4CTNpt960QdX5KkTmVHLpIkFcgAlySpQAa4JEkF\nMsAlSSqQAS5JUoEMcEmSCmSAS5JUIANckqQCGeCSJBXIAJckqUAGuCRJBTLAJUkqkAEuSVKBDHBJ\nkgpkgEuSVCADXJKkAhngkiQVyACXJKlABrgkSQUywCVJKpABLklSgQxwSZIKZIBLklQgA1ySpAIZ\n4JIkFcgAlySpQAa4JEkFMsAlSSqQAS5JUoEMcEmSCmSAS5JUIANckqQCdde584joAs4BDgQeABZl\n5vLN1pkB/CdwQmbeWmc9kiR1irrPwBcCO2bmPGAxcFbzwog4CLgG2LvmOiRJ6ih1B/h84AqAzFwG\nHLzZ8h2oQv6WmuuQJKmj1B3gs4CVTdPrI2LTMTPzusz8PdBVcx2SJHWUWu+BA6uA3qbpaZm5cVt3\nNmfODLq7p2/1doODPdt6SBVg7twe+vp6x15xHNmmOpttSuOtjjZVd4AvBY4GLomIQ4Gbtmdng4Nr\nt2m7gYE123NYTXIDA2vo71894cdU57JNabxtT5saKfjrDvBLgaMiYmlj+viIOA6YmZlLmtYbqrkO\nSZI6Sq0BnplDwEmbzf6zV8Uy84g665AkqdPYkYskSQUyw
CVJKpABLklSgQxwSZIKZIBLklQgA1yS\npAIZ4JIkFcgAlySpQAa4JEkFMsAlSSqQAS5JUoEMcEmSCmSAS5JUIANckqQCGeCSJBXIAJckqUAG\nuCRJBTLAJUkqkAEuSVKBDHBJkgpkgEuSVCADXJKkAhngkiQVyACXJKlABrgkSQUywCVJKpABLklS\ngQxwSZIKZIBLklQgA1ySpAIZ4JIkFcgAlySpQN117jwiuoBzgAOBB4BFmbm8afkxwAeAh4DPZ+aS\nOuuRJKlT1H0GvhDYMTPnAYuBs4YXRER3Y/pI4LnA30VEX831SJLUEeoO8PnAFQCZuQw4uGnZk4Hb\nMnNVZj4E/BfwnJrrkSSpI9R6CR2YBaxsml4fEdMyc+MWlq0GZtdVyH0r++va9aRz/+oBHnXvqnaX\nMSHWtPFzTpU2NZXaE9imJoJtanzUHeCrgN6m6eHwHl42q2lZL/C/o+2sr6+3a1uK6Ot7Bld97Rnb\nsqm0RbYpjTfblLZW3ZfQlwIvAoiIQ4Gbmpb9FnhCRPxFROxAdfn8uprrkSSpI3QNDQ3VtvOmp9Cf\n2ph1PHAQMDMzl0TEi4EPAl3A5zLz3NqKkSSpg9Qa4JIkqR525CJJUoEMcEmSCmSAS5JUIANckqQC\n1f0euGoQEfsDHwdmADOB7wFXAydm5nFtLE2FiIi9gDOA3YH7gbXAyZn5m7YWpuJFxOHAW5q/iyLi\nY8BvM/PCrdzXPZm563jX2CkM8MJExGzgK8DCzFzeeFXva8A9gK8UaEwR8Rjg28CbMvOnjXkHA58G\njmhnbeoY4/Vd5HfaKAzw8rwEuHJ4VLfMHIqI1wPPBg4HiIi3AS+jOkO/t/HnLwAXZeb3IuJJwCcy\n8+g21K/2O4aqDf10eEZm/gw4IiI+D+wEzG2sdwawB7Ar8J3M/EBEvAx4D7AOuDszXxURzwY+0Zi3\nFnhFZt43kR9Kk8qWes2cHhHn83B7+nZmnrJZm/sbqja3H7Ac2HGC6i2S98DLsxtVw94kM9dSfXEO\n2ykzn5eZhwGPohpE5jzgjY3lJwAO3Tp17QXcPjwREd+MiKsi4haqS+pXZuZ8qq6Or8vMFwKHAG9p\nbPIq4IzMfA5wWeOq0EuAi6lGFjwXmDNRH0aT0hER8cPGr6uA44ANPLI9ndS0/nCbW8AjR7CcMdGF\nl8Qz8PLcATyiw+SI2JNHjuS2LiK+AtxH9YX8qMy8JiLOjoidgedT/XBoavodTSMDZuZCgIi4DrgL\nyMaiAeBZEbGAarChHRrz3wksjoi3U3WJ/E3gdOB9wJWNffyk/o+hSezKzHz18EREnE71H8IDttCe\n4OE2ty/wU4DM/F1E/G6C6i2SZ+DluQx4QUTsDRARj6IaV72/Mf0UqvvjxwFvB6bz8OWsLwKfAr6f\nmRsmunBNGt8CnhcRzxqeERFPoLq0+XhgeMChNwKDmfk6qjY2fDb0d8AHM3MB1XfIS4HXAp/PzCOA\n3zTWkYZ1NX5tqT3Bw23uN8BhABGxG1Wb1Ag8Ay9MZq6OiDcA5zceYOsFvgPcQnUWfhuwJiJ+RPUD\nczfVZXeAC4CPAAdMeOGaNDLzvog4Bvh4ROxCdZtlPfAO4MVNq14JfDkiDqO6RXNrROxKdYb03YhY\nTXUmdRnwROBzEXEf1aVSA1zNhqja2F9voT1telAtM78VEUc1rgbdCfyxLdUWwr7Qp5CI2B34QmYe\n1e5aJEnbx0voU0REvBS4HDil3bVIkrafZ+CSJBXIM3BJkgpkgEuSVCADXJKkAhngkiQVyPfApQ4X\nEa8A3kv1894FfDEzP7EV27+Zqpe1izPz5HqqlLS1PAOXOlijN6tPAEdm5tOoerl6ZURszUA2rwIW\nGd7S5OJrZFIHi4inUr3/f2hm3tWYtx/wIPAD4PDMvLMxhvOHMnNBY/CJAaoRob5MNfLYPcD/BXqA\ndwGPBh5DFez/FRFPo
xrE5DGNbV+TmXdHxMnA31KdLHw/M987UZ9d6nSegUsdLDNvpBr7e3lELIuI\nfwG6M/O/+fOxlpunf5WZT87MDwM/A94EfB84EXhxZj4d+Djw7sb6FwGnZuaBwH8A/xARLwAOoho4\n5RnAHhHxaiSNCwNc6nCZ+VaqQUrOafx+XaNnvtEs22y6KzOHqMaW/+uIOJVqsJOeiNgJ2CUzv9c4\n3mcbl9uPBJ4F/By4gSrM9x+fTyXJh9ikDhYRLwJ6MvOrVIPZXBARi6jOqId4eKS6R2226f1b2NdM\n4HrgQuAa4EbgbcBDTfshInakGkBnOvDJzPxkY/4sqgEtJI0Dz8ClzrYWOD0iHg/QGMFuP6oz4nt5\n+Iz4JS3sa19gQ2aeDlwFvBCYnpmrgDsj4nmN9V4PnEo1mtnrI2JmRHRTDWP6ivH5WJIMcKmDZebV\nVGF6WUT8lmq85WnAacCHgE9FxDJgsGmzke6N/wr4VUQk1WXx1VSX5AFeB3woIm4AjgXenZnfBS6h\nuhx/I3BDZl44rh9QmsJ8Cl2SpAJ5Bi5JUoEMcEmSCmSAS5JUIANckqQCGeCSJBXIAJckqUAGuCRJ\nBfr/j8dilj/qA6cAAAAASUVORK5CYII=\n", "text/plain": [ "<matplotlib.figure.Figure at 0x11d0cded0>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "#Without stratification\n", "win_by_Surface = pd.crosstab(df3.win, df3.Surface).apply( lambda x: x/x.sum(), axis = 0 )\n", "win_by_Surface\n", "win_by_Surface = pd.DataFrame( win_by_Surface.unstack() ).reset_index()\n", "win_by_Surface.columns = [\"Surface\", \"win\", \"total\" ]\n", "fig2 = sns.barplot(win_by_Surface.Surface, win_by_Surface.total, hue = win_by_Surface.win )\n", "fig2.figure.set_size_inches(8,5)" ] }, { "cell_type": "code", "execution_count": 196, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAfAAAAFICAYAAACvNaz+AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3XucHXV9//HXJilRsklIYAVERUH4QFGpgoqYglDRStEf\naq2lXlH4IVZbtSqN9VKhoqjEKorcSoV6KUJF6w2kctMoERQLiHy4xIB3gom5cIvJbv/4zkkOy17O\nbnaSnbOv5+ORR/bMzJn5fufMmffczvfbMzAwgCRJapZpW7sAkiRp7AxwSZIayACXJKmBDHBJkhrI\nAJckqYEMcEmSGmhGnTOPiB7gdGBf4AHgmMxc2jb+6cCp1cvfAK/MzHV1lkmSpG5Q9xn4kcDMzDwQ\nWAgsGjT+LOC1mXkQcAmwa83lkSSpK9Qd4AsowUxmLgH2b42IiD2B3wFvi4grgfmZeVvN5ZEkqSvU\nHeBzgFVtr9dHRGuZOwDPAj4BPBd4bkQ8p+bySJLUFWq9Bw6sBma3vZ6Wmf3V378Dbs/MWwEi4hLK\nGfqVw81s/foNAzNmTK+pqJIkTUo9Qw2sO8AXA0cAF0XEAcCNbeOWAr0RsVv1YNufAueMNLOVK++r\nraCSJE1GfX2zhxzeU2dnJm1PoT+lGnQ0sB8wKzPPqS6Zn1KN+15mvnWk+S1fvsaeVyRJU0pf3+wh\nz8BrDfCJZoBLkqaa4QLchlwkSWogA1ySpAYywCVJaqC6n0KX1HAbNmxg2bKlo084AR7/+N2YPt2f\nikqdMMAljWjZsqW858IT6d1hTq3LWXvPak562XvZffc9al2O1C0McEmj6t1hDnN3mre1iyGpjffA\nB/nc587jzjuXbe1iSJI0Is/AB3nFK16ztYsgSdKopuwZ+LHHvoZ169bxs58t5fDD/wyA66//Ie9+\n9wncfPNNnHvuWfzLv7yPt73tzRx77Gu4++7fbuUSS5K0yZQN8Kc//Zn8+Mc/4rrrfkBfXx+3334b\n11zzPdasWbNxmkc9akcWLTqNgw56DldddcVWLK0kSQ81ZQP8wAMXcO21S7jhhh/zile8lh/96Fpu\nueVm+vr6Nk6z++5PBGCHHfpYt+7BrVVUSZIeZsoG+D77PJnMn/KHP6zjWc96Npdddgk77rgT06a1\nr5Ihm5+VJGmrm7IB3tPTw4477sRee/0xs2fPZmAAnv3sgx4yXpKkycreyCSN6I47buOUK/619t+B\nr/rNSk445C025CINMlxvZP6MTB3ZUs1p2pSmJHXGAFdHtkRzmjalKUmdM8DVMZvTlKTJY8o+xCZJ\nUpN11Rl4HfdpvScrSZqMuiLAW8F911138ukLr2HW3L7R39SBe1ct5/iXHcDjHrcrYJhLkiaPrgjw\nZcuWsvDUCxgAeuf2MWf+zhM279MvvIbeuXdw76rlfPAfXj7kA1YDAwOceuqHuP3229hmm2044YR3\ns8suj5mwMkiSNFjX3AOfNbeP3gk6827XOiAY6az+6quvZN26dZxxxrkcd9yb+OQnPzbh5ZAkqV3X\nBPjWdMMNP+aZzzwQgH32eRK33PLTrVwiSVK3M8AnwH333Utvb+/G19OnT6e/v38rlkiS1O0M8Amw\n7bazuO++eze+7u/vH9QpiiRJE6srHmJrd++q5RM6r06eaH/KU/Zl8eLvcMghz+Wmm27c2A2pJEl1\n6aoA791uxwmd36y5fR3N86CDDuHaa5dw/PGvA2DhwvdNaDkkSRqsqwJ82rRpE/oTsk719PTw9rcv\n3OLLlSRNXd6olSSpgQxwSZIayACXJKmBDHBJkhqoqx5i6+/vZ+3vfzuh8+zdbkd/0y1NUnX0QDgU\nOzLSZNRVAb7297/lwbnfpXeHORMzv3tWw+8XbJUn2yWNrtWR0UT1QDiUkToykramrgpwgN4d5jB3\np3kTNr8/rOpsup/85CbOOOM0TjvtzAlbtqTRzZrgHgilpui6A
N8aPv/587n00m/wyEduu7WLIkma\nIgzwCbDLLo/l5JM/ykknvXdrF0WSJpTPGUxeBvgEOPjgQ/jNb369tYshSRNu2bKlvOfCEyfs2aKh\nrL1nNSe97L0+ZzBGBrgkaUQT/WyRJkatAR4RPcDpwL7AA8Axmbm0bfxbgGOAu6tBx2XmbZuzzLX3\nrN6ctz9sXjPHMP3AwMCELVuSpJHUfQZ+JDAzMw+MiGcCi6phLfsBr8rM6ydiYb3b7Qi/X9Dxk+Oj\nmcnYejjr6emZmAVr0vE+oKTJpu4AXwBcApCZSyJi/0Hj9wMWRsTOwNcz80Obs7Ct1RsZwE477cwZ\nZ5y7VZat+nkfUJpcttRB9YYNG4Aepk+vt0Gv8Ry81x3gc4D28+H1ETEtM/ur118APgWsBr4cEYdn\n5jeGm9m8edsyY8bDK7hyZe8EFnl48+f30tc3e4ssa7KZ6ut45creLXIfcDLWf0t99jD2+k/17XJL\nmKzr+NZbb629ER+A5b9I5uyzvPaD908d92H23HPPMb2v7gBfDbR/Iu3hDfDxzFwNEBFfB54KDBvg\nK1feN+TwFSvWbn5JO7BixVqWL1+zRZY12UzWdbyljsLvuuvO2pcBk3Mb21KffWtZY6n/ZN0uu8lk\nXccrVqzdIo34rF21nN4dHqz94H2k+g93YFN3gC8GjgAuiogDgBtbIyJiDnBTROwF3A8cCvxbzeVR\nl9kSTWlCOQp/9MG1LkKSxqTuAL8YOCwiFlevj46Io4BZmXlORCwErqQ8of7tzLyk5vKoC22po3Dw\nt/6SJo9aAzwzB4DjBw2+tW3854DP1VkGSepG3Xb7SGNnQy6S1EDePpIBLkkN5e2jqa3eH7ZJkqRa\nGOCSJDWQl9AlaQQD/f1b7EEum9LVWBjgkjSCe9f8jrOXfI/eO+priQtsSldjZ4BL0ijsTlOTkffA\nJUlqIANckqQGMsAlSWogA1ySpAYywCVJaiADXJKkBjLAJUlqIANckqQGMsAlSWogA1ySpAayKdUO\n2aGBJGkyMcA7ZIcGkqTJxAAfAzs0kCRNFt4DlySpgQxwSZIayEvoDbdhwwaWLVta+3K21AN8kqTO\nGOANt2zZUhaeegGz5vbVupzlv0gefXCti5AkjYEB3gVmze1jzvyda13G2lXLgV/XugxJUue8By5J\nUgMZ4JIkNZABLklSAxngkiQ1kAEuSVIDGeCSJDWQAS5JUgMZ4JIkNZANuUgNtiWa0rUZXWlyMsCl\nBtsSTenajK40ORngUsPV3ZSuzehKk5P3wCVJaiADXJKkBqr1EnpE9ACnA/sCDwDHZObDnriJiDOB\n32Xmu+osjyRJ3aLuM/AjgZmZeSCwEFg0eIKIOA54Us3lkCSpq9Qd4AuASwAycwmwf/vIiHgW8HTg\nzJrLIUlSV6k7wOcAq9per4+IaQARsRPwPuBNQE/N5ZAkqavU/TOy1cDsttfTMrO/+vtlwPbAN4Cd\ngUdGxC2Zef5wM5s3b1tmzJj+sOErV/ZOXIkngfnze+nrmz36hEztuoP1t/5Tt/5Tue5g/aH+AF8M\nHAFcFBEHADe2RmTmacBpABHxGiBGCm+AlSvvG3L4ihVrJ6q8k8KKFWtZvnxNx9N2k7HUvTV9N7H+\n1t/vfufTd5OR6j9csNcd4BcDh0XE4ur10RFxFDArM8+pedmSJHWtWgM8MweA4wcNvnWI6c6rsxyS\nJHUbG3KRJKmBDHBJkhrIAJckqYEMcEmSGsgAlySpgQxwSZIayACXJKmBDHBJkhrIAJckqYEMcEmS\nGsgAlySpgQxwSZIayACXJKmBDHBJkhrIAJckqYEMcEmSGsgAlySpgWZ0MlFE9AKHAHsA/cDtwP9k\n5gM1lk2SJA1jxACPiG2B9wEvAW4A7gT+ABwIfCwivgSclJlr6y6oJEnaZLQz8M8CZwELM7O/fURE\nTAOOqKY5sp7iSZKkoYwW4
C/NzIGhRlSB/t8R8dWJL5YkSRrJaAH+nogYdmRmnjhcwEuSpPqMFuA9\nW6QUkiRpTEYM8Mx8/1DDI6IHeEItJZIkSaPq9GdkbwJOBma1Df4Z8MQ6CiVJkkbWaUMu/wDsC1wA\n7A68HlhSV6EkSdLIOg3wuzPzZ5Tfgj85Mz8DDP90myRJqlWnAX5vRBxCCfAXRsROwLz6iiVJkkbS\naYC/GXgRcAmwPXALcFpdhZIkSSPr6CE24NGZ+dbq75cCRMRL6imSJEkazWhtob8cmAmcGBHvHfS+\ndwFfqrFskiRpGKOdgc+hdFwym9IbWct64J/qKpQkSRrZaA25nA2cHRF/lpnfjojZwPTM/P2WKZ4k\nSRpKpw+xLYuIHwDLgKURcX1E7FlfsSRJ0kg6DfAzgA9n5vaZOR/4IKWbUUmStBV0GuA7ZOZFrReZ\n+UVgfj1FkiRJo+k0wB+MiKe1XkTEfsB99RRJkiSNptPfgb8F+K+IWEHpYnQ+8PLaSiVJkkbUaYAn\nsGf1b1r1eue6CiVJkkY2WkMuj6WccX8DeAGwphr1mGrYXqO8vwc4ndKT2QPAMZm5tG38S4ETgH7g\n85n5ifFVQ5KkqWW0e+DvB64C9gCurv6+CrgU+GYH8z8SmJmZBwILgUWtERExjdLH+KGUxmLeGBE+\nGCdJUgdGa8jldQARcUJmnjKO+S+gdIBCZi6JiP3b5t0fEXtX/z+KcjCxbhzLkCRpyhntEvoHgQ8N\nF97VGfMJmXnCMLOYA6xqe70+IqZlZj9sDPEXA58CvgbcO1J55s3blhkzpj9s+MqVvSO9rXHmz++l\nr292R9NO5bqD9bf+U7f+U7nuYP1h9IfYvgh8JSJ+RbmE/gtKO+i7Ui59P5ryhPpwVlPaUW/ZGN4t\nmXkxcHFEnAe8GjhvuJmtXDn0L9dWrFg7SjWaZcWKtSxfvmb0CZnadW9N302sv/X3u9/59N1kpPoP\nF+yjXUK/HnhORBxC6Q/8CMoDZ3cAZ2bm5aOUaXH1nosi4gDgxtaIql31rwLPy8x1lLPv/iHnIkmS\nHqKjn5Fl5hXAFeOY/8XAYRGxuHp9dEQcBczKzHMi4rPA1RGxDrgB+Ow4liFJ0pTTUYBHxPOBf6E0\n4NLTGp6Zu430vswcAI4fNPjWtvHnAOd0WlhJklR02pDLacDbgJuAgfqKI0mSOtFpgN+TmV+rtSSS\nJKljnQb4dyJiEeU33Q+0Bmbm1bWUSpIkjajTAH9G9f9T24YNUH5KJkmStrBOn0I/pO6CSJKkznX6\nFPoC4B1AL+Up9OnArpn5+PqKJkmShjNaZyYt5wBfpgT+p4DbKL/xliRJW0GnAX5/Zv47cCWwEjgW\nOLiuQkmSpJF1GuAPVB2XJHBA1UDLrPqKJUmSRtJpgC8CLqC0Xf7qiPgJcF1tpZIkSSPqKMAz80JK\npyNrgP2AVwKvqrNgkiRpeB0FeETMA86KiMuBRwBvBubWWTBJkjS8Ti+hnw1cC2wPrAF+jT2HSZK0\n1XQa4E/IzLOA/sxcl5n/BDymxnJJkqQRdBrg6yNiLlVPZBGxB9BfW6kkSdKIOm0L/X2U34A/NiK+\nDDwLeF1dhZIkSSPr9Az8h5SW134GPA74EuVpdEmStBV0egb+DeAGoL1P8J6JL44kSepEpwFOZr6+\nzoJIkqTOdRrgX46IY4DLgfWtgZl5Vy2lkiRJI+o0wOcC/wjc0zZsANhtwkskSZJG1WmAvxR4VGbe\nX2dhJElSZzp9Cn0pMK/OgkiSpM51egY+ANwcETcB61oDM/PQWkolSZJG1GmAf6DWUkiSpDHpKMAz\n86q6CyJJkjrX6T1wSZI0iRjgkiQ1kAEuSVIDGeCSJDWQAS5JUgMZ4JIkNZABLklSAxngkiQ1kAEu\nSVIDGeCSJDWQAS5JUgMZ4JIkNVCnvZGNS0T0AKcD+wIPAMdk5tK28UcBfw/8AbgxM99YZ3k
kSeoW\ndZ+BHwnMzMwDgYXAotaIiHgEcCJwcGb+KbBdRBxRc3kkSeoKdQf4AuASgMxcAuzfNu5B4MDMfLB6\nPYNyli5JkkZRd4DPAVa1vV4fEdMAMnMgM5cDRMSbgVmZ+T81l0eSpK5Q6z1wYDUwu+31tMzsb72o\n7pF/GNgDeMloM5s3b1tmzJj+sOErV/Zufkknkfnze+nrmz36hEztuoP1t/5Tt/5Tue5g/aH+AF8M\nHAFcFBEHADcOGn8WcH9mHtnJzFauvG/I4StWrN2cMk46K1asZfnyNR1P203GUvfW9N3E+lt/v/ud\nT99NRqr/cMFed4BfDBwWEYur10dXT57PAn4IHA18JyKuAAaAj2fmV2oukyRJjVdrgGfmAHD8oMG3\nbqnlS5LUrWzIRZKkBjLAJUlqIANckqQGMsAlSWogA1ySpAYywCVJaiADXJKkBjLAJUlqIANckqQG\nMsAlSWogA1ySpAYywCVJaiADXJKkBjLAJUlqIANckqQGMsAlSWogA1ySpAYywCVJaiADXJKkBjLA\nJUlqIANckqQGMsAlSWogA1ySpAYywCVJaiADXJKkBjLAJUlqIANckqQGMsAlSWogA1ySpAYywCVJ\naiADXJKkBjLAJUlqIANckqQGMsAlSWogA1ySpAYywCVJaiADXJKkBjLAJUlqIANckqQGmlHnzCOi\nBzgd2Bd4ADgmM5cOmmZb4FvA6zLz1jrLI0lSt6j7DPxIYGZmHggsBBa1j4yI/YCrgN1qLockSV2l\n7gBfAFwCkJlLgP0Hjd+GEvK31FwOSZK6Sq2X0IE5wKq21+sjYlpm9gNk5vdh46X2Uc2bty0zZkx/\n2PCVK3snoKiTx/z5vfT1ze5o2qlcd7D+1n/q1n8q1x2sP9Qf4KuB9hJtDO/xWLnyviGHr1ixdryz\nnJRWrFjL8uVrOp62m4yl7q3pu4n1t/5+9zufvpuMVP/hgr3uS+iLgcMBIuIA4MaalydJ0pRQ9xn4\nxcBhEbG4en10RBwFzMrMc9qmG6i5HJIkdZVaAzwzB4DjBw1+2E/FMvPQOsshSVK3sSEXSZIayACX\nJKmBDHBJkhrIAJckqYEMcEmSGsgAlySpgQxwSZIayACXJKmBDHBJkhrIAJckqYEMcEmSGsgAlySp\ngQxwSZIayACXJKmBDHBJkhrIAJckqYEMcEmSGsgAlySpgQxwSZIayACXJKmBDHBJkhrIAJckqYEM\ncEmSGsgAlySpgQxwSZIayACXJKmBDHBJkhrIAJckqYEMcEmSGsgAlySpgQxwSZIayACXJKmBDHBJ\nkhrIAJckqYEMcEmSGsgAlySpgQxwSZIayACXJKmBZtQ584joAU4H9gUeAI7JzKVt418IvAf4A/Dv\nmXlOneWRJKlb1H0GfiQwMzMPBBYCi1ojImJG9fq5wHOA/x8RfTWXR5KkrlB3gC8ALgHIzCXA/m3j\n9gZuy8zVmfkH4LvAQTWXR5KkrlDrJXRgDrCq7fX6iJiWmf1DjFsDzB3vgu5dtXy8b+3I/WtW8Ef3\nrK51GQBrx7GMuusOW6b+46k7WP+pvO3D1K6/2/7Urn/PwMDABBdlk4g4Ffh+Zl5Uvb4rMx9X/f1k\n4EOZ+RfV60XAdzPzS7UVSJKkLlH3JfTFwOEAEXEAcGPbuJ8CT4yI7SJiG8rl8+/XXB5JkrpC3Wfg\nrafQn1INOhrYD5iVmedExF8A7wN6gH/LzDNqK4wkSV2k1gCXJEn1sCEXSZIayACXJKmBDHBJkhrI\nAJckqYHqbshlq4mIZ1J+Z37ICNM8Ftg3M782aPjPgDuBfso6mgUcm5k/qqGcXwA+nZlXb+Z8ZgDn\nAo8HtgE+kJlf7fC93wdenpl3tQ37d+BpwO8oB3rzgUWZ+ZnNKecwyz8O2DEzT9zM+UwDzgaC8tm9\nITNvHmH6mcAtmfmEQcMb9/kPmuejgOuA52bmrRHxJGC
7zPxuVbfIzHXDvPd9wN8Av6T8OmQ+8J+Z\n+cGJKl/bsp4P/HVmHj3O9+8HnAw8krKNXgGcWLXsON4yHQucm5kbOph2OnAZ5ft2IXDH4H1JB/P4\ndWbuPK7CDj2/EyjNU/8RsAF4x3i324j4PPBq4LHAN4BrgJWU/cAvxjCfgynfxaPGU46xioiPUn7t\ntBOwLXAHsJzyi6gxl6Mq/xeBn1C+EwPA54GfA48dax8eo30Hx6IrAzwi3gG8Clg7yqSHAnsBg790\nA8BhrR1BRDwPeD/wwgku6kR6JXBPZr46IuYBPwY6CvARvD0zLwOo5vkT4DObOc86vRAYyMwF1Zfu\nZEp7/MNpfRkHa+LnD2w8kDsDuK9t8EuBX1OaK+7kZyenZuZZ1fy2AW6OiLMz856JLm+H5XmYiNgF\n+A/ghZl5RzXsPcDHgDdtRnneBZxHCb/R7ALMzsynb8byJuxnQBGxN/CizHx29foplLo8dTzzy8y/\nqeazAPhaZr5jM4q3xX7ulJlvB4iI11CC8l3V64M3oxzfbq2PCTBh66IrAxy4HXgx5QsOQES8kXI0\nuQG4Fngb8I/AIyNi8RBHzu23F3YFVlTzOQw4Cbifcnb6OsoXZOORXeuoujqLfZByVrwT8NrM/HFE\n/C3wespOdaI6cPki5SygVfZW+FxBCfMnAbOBl2XmzyPiA8DzgF8A2w8zz/Z1sDOlzkTErpSz/emU\njfHvMvPG9rOJ1pkl8ARKYz7bArsBp2Tm+dVO4V8p63UDE9CIT2Z+JSJaBy2Pp5wttNbB3cA84C+B\n84HtKEfmw2na59/yUcp6X1iV5dHAa4EHI+J6ykHLpyNiN8pn9+LMXDVoHj1tf+9A2U/cHxFzgc9S\nmkGeDrw7M69sP6OIiA9SGmm6EzgBWEfZBi7IzJMjYi/KtrOWcpCxYpz1fBVwdiu8ATLzpIhYGhHX\nAK+urj5svLoTESdTzsy2B/43M19fXXE4kHKV5fOUz+k/gZdU0y+o6rooM/9r0La0HtgjIj4N/Kb6\nd8sw9d6H0nnTtGqdHp+Z17TKPnj/lJlvGcc6WQU8NiJeB1ySmTdExDOqKzCfqKZpbbNPo2wjDwKP\nAc6knNA8Bfh4Zp5Zfa5/SjmoeWRE3AG8HDgOOKqq36OAxwFvzczLIuKlwN9StpkByn54o+o7sRvl\nqsnHM/Nz46jn5tgzIr5OKffXMvP9Q62fzFwz6H09g163DhD2ohwwfwG4C3gi8IPMfGN1kPlpYCZl\n//nuzPzvoeY1Xl15DzwzL6Z8udq9Bvjb6uj0p9WwDwGfHyK8e4BLI2JJRPwceDrw9mrcmcCR1aX5\nqyjdocJDj6ra/16WmX8OfJLS49qjgL8DngH8P8rlt82Wmfdl5r0RMZsS5P/UNnpJZh4G/A9wVHXp\ncUF15vBqSrAP5ZSIuDoi7gROpYQflJD4WGY+B3gLZYcMwx9ZzsnMF1Lq+4/VsNMpl+2fB/xsjNUd\nVmb2R8RngI8D7TuHz1XLOha4sSr7mcPMpnGfP0BEvBa4u7pq0gOQmb+iXDVZlJnXVpOeU5X/TuCw\nIWb1toi4otph/yfw+sy8F3g38K3MPBj4K+DfRinS4yg78GcB76yGfYSyI3se8L1xVbR4PLB0iOG/\nBXYcPLD6XqzIzOdTPs9nRUTr0vXNmbkgM0+nHFS9PCL+HHh8Zh5ECbZ3VwcwUPYZzwOOr957fDW8\n9bkPVe99gLdV38MPUxq1aveQ/VN1O2hMqs/6RcCzge9HxM2Uq0ZnA2/MzEOBb1IOMKBcQXgx8EbK\n/uIVlIPt49rqczeb9pNn8NBt+4HMPJyyD3hrNWxP4PBqvf0UeH5r4ojopRwQvQR4AZ1d5ZhoMynf\nu4MoBxow/Pppd2hEXF59Ly6vGimDTetjD8qB0TOAw6vv+V7AR6tt7ri25U2YrgzwYbwOeFN1BL0r\nI9e9dQn1mZRLULM
yc3lE7ACsyszfVNN9B/jjId7ffoR1ffX/z4FHALsDN2Xm+sxcT7kaMCGqe/qX\nA+dl5gUjlGFPyj1SqiPNm4aZ5TurL+IbgEezaYe5N6XuZOb/Uo7g4aH1bv/7x4OWD+WsqHX2tLiT\n+nUqM19LqeM5EfHIavCt1f97Aj+opvsB1ZWKQRr5+VNC4bBqG/8T4PxqRzJY657obyhXRgY7tQr4\nl1HC8LZq+N7A1bAxLFYPMf/2ut+YmQOZeR+bLunvyaY6b87nfhdlXW5U7VQfRwmdweW5H9gxIj5H\nOQibRblPDJBD1OHJwP4RcTmlR8UZlIOGoaYfbKh6/xJ4b3UG+pdty26Vb/D+acxnaRGxO7AmM1+f\nmbtSbqudQdlGT6/qcjTluwxlO+wHfk+5f7+BctWq9R0dqgwjbdtQ1v15EXEuZR226klmrqUE/dmU\nA8OZY63jBGh99+5n00ne3gy9ftp9OzMPzcxDqv8Hn6zcXp1E9QO/oqyPXwNviIjzKPvQP2KCdXuA\nt29sxwLHVTump1GOjvspl8eGel/rve8BdomI46t7gHMionWEfzAlGB6g+tCry8vz2+Y1+IO+Ddgn\nImZWD8GM6/7UYFWZLqWE7nmDRg8uw82UI0UiYhZDh9BGmflN4CuUL17r/QdV7/8TShAAzIiIbav7\npvuMsHyAX0REVH9vzj3EjSLilRHROsN/gHKE31+9bv1/M+WSKRHxVIb+UjXu8wfIzIOrHcwhlIOm\nV2Xm3ZS6t3/XO7oHl+Xhp1OAC6pw/CmbPvddKJeR76GE487VNH8yzOxa6/MnVOufzfvczwdeHxG7\nR+lP4VLgHMrzLL9j0074adX/L6A8cPQKqkvCbWXq3zTbjfuEW4DLq7OyQym3qO4YYvpOfQJ4b5YH\n9m7k4eE4eP90IGP3FOCTEdHapm+nhPNtlFsKh1LOLltXHNu3g/Fc1n3IdhQRcyjPivw1cAzle9HT\nNn5HYL/MfAlwBPCR8Vxp2ExDbfu3MPT6Ga9WnU+inEy9hvKA5YRdOm/p9gBv/7BuBL4bEd+mXGZb\nUg17UUT81XDvq460jqFcQtuJ8kW7OCK+A/wZ5UO6Dvh9lKe5/5lNZ6oP21iqEDiFcs/364z+oF2n\nFlLu676n7TLPI4Ypw/8Cl0TEtZR7N78dYn6D33cSsHdEvIByOfnNEXEV8CnK2QOUe9rXUHZ2y0Yp\n7xuA/4iIyyhnTRPhS8BTq3J9E/j7zHyQh9blDGC3iLiacunwwSHm08TPf6g6tHYYP6Sc3T2H4S/1\nDzksM89RbwADAAACtUlEQVSl3Ft9A/AByqXEqyjr+tjqjOMjlPX9NR56T3uoZb2dsi4vozqIHI8s\nT0G/krL9fZVyX3knygHZ+ZQzqm+yaR+3hPK5XwlcRPmMHj24vpQH/b6e5Rcc91bbyXWUhyPXDjH9\nUIaa5rPARdW624NNBxitaYfaP41JdevwauDaavv8JmV9H0v5rn0H+CBwQ4dlHmnYUNv2asr6u4Zy\ndeo+2s5mM/O3wE4RsRj4FvDhavvZ2t7I6OtnNENt6xcCp1bb3GFsetZowh5isy10SV2jeiBpaXX5\nWupqBrgkSQ3U7ZfQJUnqSga4JEkNZIBLktRABrgkSQ1kgEuS1EDd2ha6pEGqRmZuZVOvStMozeie\nn5n/XNMyDwb+OUfoFVDS+Bjg0tTyy8xstU5G1R74bRHxhcwcrYnQ8fK3qlINDHBpamu1lLUmIt5F\n6dBiPaWlrHdSWsm7Mqs+06P03jWQpXevX1FaNVtAaVP+rzLzzijdry6iNLFa10GBNOV5D1yaWnaJ\niB9FxE8jYjlwIqVHqn0p7VM/tfq3B6X5VBj+DHon4LLqjP47lOZat6H0fvaSqre7+2uriTTFGeDS\n1PLLzHxaZu5NaTN8G0pHC4cCX8jMdVX71OdS2nofzaXV/zdROnF5crWMVu9vgzvWk
TRBDHBp6non\npbvQt/PwnpJ6KLfYBnjofuIhvbdl5rrqz1bnKQM8tIe/9UiqhQEuTS0bg7rq//kdlO41rweOiohH\nRMQMSr/Il1O6o9wuIraPiJnAn48y/xuAvoh4cvX6qImugKTCAJemlsHdhV5K6dr0YEp3oNdRurZc\nBnyy6iLyI9Xwb/HQbi6H6lJyPfA3wGcj4jpKv9uSamBvZJIkNZBn4JIkNZABLklSAxngkiQ1kAEu\nSVIDGeCSJDWQAS5JUgMZ4JIkNdD/AdE1/+4HoSSmAAAAAElFTkSuQmCC\n", "text/plain": [ "<matplotlib.figure.Figure at 0x11c52ed10>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "#without stratification\n", "win_by_round = pd.crosstab(df.win, df.Round).apply( lambda x: x/x.sum(), axis = 0 )\n", "win_by_round\n", "win_by_round = pd.DataFrame(win_by_round.unstack() ).reset_index()\n", "win_by_round.columns = [\"Round\", \"win\", \"total\" ]\n", "fig2 = sns.barplot(win_by_round.Round, win_by_round.total, hue = win_by_round.win )\n", "fig2.figure.set_size_inches(8,5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Dummy variables\n", "------" ] }, { "cell_type": "code", "execution_count": 197, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>Date</th>\n", " <th>Surface</th>\n", " <th>Round</th>\n", " <th>WRank</th>\n", " <th>LRank</th>\n", " <th>win</th>\n", " <th>P1</th>\n", " <th>P2</th>\n", " <th>Round_2</th>\n", " <th>Round_3</th>\n", " <th>Round_4</th>\n", " <th>Round_5</th>\n", " <th>Round_6</th>\n", " <th>Round_7</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>43714</th>\n", " <td>2015-12-08</td>\n", " <td>Hard</td>\n", " <td>2</td>\n", " <td>49</td>\n", " <td>34</td>\n", " <td>0</td>\n", " <td>49</td>\n", " <td>34</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>42431</th>\n", " <td>2015-03-15</td>\n", " <td>Hard</td>\n", " <td>2</td>\n", " <td>41</td>\n", " <td>32</td>\n", " <td>0</td>\n", " <td>41</td>\n", " <td>32</td>\n", " 
<td>1</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>45333</th>\n", " <td>2016-04-20</td>\n", " <td>Clay</td>\n", " <td>2</td>\n", " <td>51</td>\n", " <td>35</td>\n", " <td>0</td>\n", " <td>51</td>\n", " <td>35</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>42802</th>\n", " <td>2015-01-05</td>\n", " <td>Clay</td>\n", " <td>5</td>\n", " <td>63</td>\n", " <td>50</td>\n", " <td>0</td>\n", " <td>63</td>\n", " <td>50</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>45438</th>\n", " <td>2016-01-05</td>\n", " <td>Clay</td>\n", " <td>7</td>\n", " <td>87</td>\n", " <td>29</td>\n", " <td>0</td>\n", " <td>87</td>\n", " <td>29</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Date Surface Round WRank LRank win P1 P2 Round_2 Round_3 \\\n", "43714 2015-12-08 Hard 2 49 34 0 49 34 1 0 \n", "42431 2015-03-15 Hard 2 41 32 0 41 32 1 0 \n", "45333 2016-04-20 Clay 2 51 35 0 51 35 1 0 \n", "42802 2015-01-05 Clay 5 63 50 0 63 50 0 0 \n", "45438 2016-01-05 Clay 7 87 29 0 87 29 0 0 \n", "\n", " Round_4 Round_5 Round_6 Round_7 \n", "43714 0 0 0 0 \n", "42431 0 0 0 0 \n", "45333 0 0 0 0 \n", "42802 0 1 0 0 \n", "45438 0 0 0 1 " ] }, "execution_count": 197, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df1 = df.copy()\n", "def round_number(x):\n", " if x == '1st Round':\n", " return 1\n", " elif x == '2nd Round':\n", " return 2\n", " elif x == '3rd Round':\n", " return 3\n", " elif x == '4th Round':\n", " return 4\n", " elif x == 'Quarterfinals':\n", " return 5\n", " elif x == 'Semifinals':\n", " return 6\n", " elif x == 'The Final':\n", " return 7\n", " \n", "df1['Round'] = 
df1['Round'].apply(round_number)\n", "\n", "dummy_ranks = pd.get_dummies(df1['Round'], prefix='Round')\n", "\n", "# Keep Round_2..Round_7 and drop Round_1 as the baseline category (.loc replaces the removed .ix indexer)\n", "df1 = df1.join(dummy_ranks.loc[:, 'Round_2':])\n", "df1[['Round_2', 'Round_3',\n", " 'Round_4', 'Round_5', 'Round_6', 'Round_7']] = df1[['Round_2', 'Round_3','Round_4', 'Round_5', 'Round_6', 'Round_7']].astype('int_')\n", "df1.head()" ] }, { "cell_type": "code", "execution_count": 198, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>Date</th>\n", " <th>WRank</th>\n", " <th>LRank</th>\n", " <th>win</th>\n", " <th>P1</th>\n", " <th>P2</th>\n", " <th>Round_2</th>\n", " <th>Round_3</th>\n", " <th>Round_4</th>\n", " <th>Round_5</th>\n", " <th>Round_6</th>\n", " <th>Round_7</th>\n", " <th>Surface_Grass</th>\n", " <th>Surface_Hard</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>43714</th>\n", " <td>2015-12-08</td>\n", " <td>49</td>\n", " <td>34</td>\n", " <td>0</td>\n", " <td>49</td>\n", " <td>34</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " </tr>\n", " <tr>\n", " <th>42431</th>\n", " <td>2015-03-15</td>\n", " <td>41</td>\n", " <td>32</td>\n", " <td>0</td>\n", " <td>41</td>\n", " <td>32</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " </tr>\n", " <tr>\n", " <th>45333</th>\n", " <td>2016-04-20</td>\n", " <td>51</td>\n", " <td>35</td>\n", " <td>0</td>\n", " <td>51</td>\n", " <td>35</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>42802</th>\n", " <td>2015-01-05</td>\n", " <td>63</td>\n", " <td>50</td>\n", " <td>0</td>\n", " <td>63</td>\n", " <td>50</td>\n", " 
<td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " </tr>\n", " <tr>\n", " <th>45438</th>\n", " <td>2016-01-05</td>\n", " <td>87</td>\n", " <td>29</td>\n", " <td>0</td>\n", " <td>87</td>\n", " <td>29</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>0</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Date WRank LRank win P1 P2 Round_2 Round_3 Round_4 \\\n", "43714 2015-12-08 49 34 0 49 34 1 0 0 \n", "42431 2015-03-15 41 32 0 41 32 1 0 0 \n", "45333 2016-04-20 51 35 0 51 35 1 0 0 \n", "42802 2015-01-05 63 50 0 63 50 0 0 0 \n", "45438 2016-01-05 87 29 0 87 29 0 0 0 \n", "\n", " Round_5 Round_6 Round_7 Surface_Grass Surface_Hard \n", "43714 0 0 0 0 1 \n", "42431 0 0 0 0 1 \n", "45333 0 0 0 0 0 \n", "42802 1 0 0 0 0 \n", "45438 0 0 1 0 0 " ] }, "execution_count": 198, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dummy_ranks = pd.get_dummies(df1['Surface'], prefix='Surface')\n", "# Keep the dummy columns from Surface_Grass onward (.loc replaces the removed .ix indexer)\n", "df_2 = df1.join(dummy_ranks.loc[:, 'Surface_Grass':])\n", "df_2.drop(\"Surface\",axis = 1,inplace=True)\n", "df_2[['Surface_Grass','Surface_Hard']] = df_2[['Surface_Grass','Surface_Hard']].astype('int_')\n", "df_2.drop(\"Round\",axis = 1,inplace=True)\n", "df_2.head()" ] }, { "cell_type": "code", "execution_count": 199, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>Date</th>\n", " <th>WRank</th>\n", " <th>LRank</th>\n", " <th>win</th>\n", " <th>P1</th>\n", " <th>P2</th>\n", " <th>Round_2</th>\n", " <th>Round_3</th>\n", " <th>Round_4</th>\n", " <th>Round_5</th>\n", " <th>Round_6</th>\n", " <th>Round_7</th>\n", " <th>Surface_Grass</th>\n", " <th>Surface_Hard</th>\n", " <th>D</th>\n", " </tr>\n", " </thead>\n", " 
<tbody>\n", " <tr>\n", " <th>43714</th>\n", " <td>2015-12-08</td>\n", " <td>49</td>\n", " <td>34</td>\n", " <td>0</td>\n", " <td>5.614710</td>\n", " <td>5.087463</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>0.527247</td>\n", " </tr>\n", " <tr>\n", " <th>42431</th>\n", " <td>2015-03-15</td>\n", " <td>41</td>\n", " <td>32</td>\n", " <td>0</td>\n", " <td>5.357552</td>\n", " <td>5.000000</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>0.357552</td>\n", " </tr>\n", " <tr>\n", " <th>45333</th>\n", " <td>2016-04-20</td>\n", " <td>51</td>\n", " <td>35</td>\n", " <td>0</td>\n", " <td>5.672425</td>\n", " <td>5.129283</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0.543142</td>\n", " </tr>\n", " <tr>\n", " <th>42802</th>\n", " <td>2015-01-05</td>\n", " <td>63</td>\n", " <td>50</td>\n", " <td>0</td>\n", " <td>5.977280</td>\n", " <td>5.643856</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0.333424</td>\n", " </tr>\n", " <tr>\n", " <th>45438</th>\n", " <td>2016-01-05</td>\n", " <td>87</td>\n", " <td>29</td>\n", " <td>0</td>\n", " <td>6.442943</td>\n", " <td>4.857981</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>1</td>\n", " <td>0</td>\n", " <td>0</td>\n", " <td>1.584963</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Date WRank LRank win P1 P2 Round_2 Round_3 \\\n", "43714 2015-12-08 49 34 0 5.614710 5.087463 1 0 \n", "42431 2015-03-15 41 32 0 5.357552 5.000000 1 0 \n", "45333 2016-04-20 51 35 0 5.672425 5.129283 1 0 \n", "42802 2015-01-05 63 50 0 5.977280 5.643856 0 0 \n", "45438 
2016-01-05 87 29 0 6.442943 4.857981 0 0 \n", "\n", " Round_4 Round_5 Round_6 Round_7 Surface_Grass Surface_Hard \\\n", "43714 0 0 0 0 0 1 \n", "42431 0 0 0 0 0 1 \n", "45333 0 0 0 0 0 0 \n", "42802 0 1 0 0 0 0 \n", "45438 0 0 0 1 0 0 \n", "\n", " D \n", "43714 0.527247 \n", "42431 0.357552 \n", "45333 0.543142 \n", "42802 0.333424 \n", "45438 1.584963 " ] }, "execution_count": 199, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df4 = df_2.copy()\n", "# Replace each rank with its base-2 logarithm; D is the absolute gap between the two log-ranks\n", "df4['P1'] = np.log2(df4['P1'].astype('float64')) \n", "df4['P2'] = np.log2(df4['P2'].astype('float64')) \n", "df4['D'] = df4['P1'] - df4['P2']\n", "df4['D'] = np.absolute(df4['D'])\n", "df4.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Model 1: Logistic Regression" ] }, { "cell_type": "code", "execution_count": 61, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "['Date',\n", " 'WRank',\n", " 'LRank',\n", " 'win',\n", " 'P1',\n", " 'P2',\n", " 'Round_2',\n", " 'Round_3',\n", " 'Round_4',\n", " 'Round_5',\n", " 'Round_6',\n", " 'Round_7',\n", " 'Surface_Grass',\n", " 'Surface_Hard',\n", " 'D']" ] }, "execution_count": 61, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df4.columns.tolist()" ] }, { "cell_type": "code", "execution_count": 266, "metadata": { "collapsed": true }, "outputs": [], "source": [ "#feature_cols = ['P1','P2','Round_2','Round_3','Round_4','Round_5','Round_6','Round_7','Surface_Grass','Surface_Hard','D']\n", "feature_cols = ['D','Surface_Hard','Surface_Grass','Round_6','Round_5','Round_3']" ] }, { "cell_type": "code", "execution_count": 267, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.591885441527\n", "0.620670798396\n" ] } ], "source": [ "dfnew = df4.copy()\n", "X = dfnew[feature_cols]\n", "y = dfnew.win\n", "# train_test_split lives in sklearn.model_selection (sklearn.cross_validation was removed in scikit-learn 0.20)\n", "from sklearn.model_selection import train_test_split\n", "X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)\n", 
"from sklearn.linear_model import LogisticRegression\n", "logreg = LogisticRegression()\n", "logreg.fit(X_train, y_train)\n", "y_pred_class = logreg.predict(X_test)\n", "from sklearn import metrics\n", "print(metrics.accuracy_score(y_test, y_pred_class))\n", "y_pred_prob = logreg.predict_proba(X_test)[:, 1]\n", "auc_score = metrics.roc_auc_score(y_test, y_pred_prob)\n", "print(auc_score)" ] }, { "cell_type": "code", "execution_count": 268, "metadata": { "collapsed": false }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYgAAAEZCAYAAACNebLAAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzs3Xd4VGX2wPHvJIQEQiCU0KXDAalSFVEUsIMVFhXLWlF0\n5WfbXXsXV7BgXdtasHcRK1YEAQUBaTn0Ir0HAimTzO+PexOGkEwmkGnJ+TwPD7n9zCXcM+973+Lx\n+XwYY4wxRcVFOgBjjDHRyRKEMcaYYlmCMMYYUyxLEMYYY4plCcIYY0yxLEEYY4wpVpVIB2Big4jk\nA/OBfMAHVAd2AaNUdba7T3XgPmAIkO3u9znwkKpm+Z3rUmAkkARUBaYC/1LVXWH7QGUgIs8DJwNv\nq+pdIbpGD5x78LdyONerwHxVffzwIys85xBgoKr+n4h0BT4CdgKvAW1U9f/K61omeliCMMHyASeo\n6o6CFSJyM/A00FdE4oHvgF+BbqqaJSJJwCPANyJyoqrmi8jtwCnAmaq61T1uPDAR6B/mzxSsq4Ej\nVHV9qC7gJtnDTg6hoqqf4yR7gDOBH1T16giGZMLAEoQJlsf9A4D7YG8GbHNX/Q3wqOotBfu4pYb/\nE5E5wDki8hVwG9BVVbe6++SJyC3u9iqq6vW/qIgMBh5wr50JXItTclmgqinuPs0Llt3SyRU4JZwM\nIBF4TFU/dvcd4173NhG5wj2fx/0c/1BVLXL9Ke6PX4nIKGAH8AxQF6c09biqThCR/jiJLtO9dm9V\nzXXPcQPQS1UvFpEq7rVGq+prItIXeAL4J/CMqnZ2SwAZQGfgCCAdGK6qe4vEloyToI8FcoFPVfXO\nIvtcjpPgEoA6wH9U9b8i0gB4w/0cAF+q6t3FrP9CVe9x7+tQ4B1gFBAnItVwvhQMVdUhIlLTvQed\n3Ot9D9zqfjHIAj4DugAjVPUPTNSzdxCmLH4Ukbkisg5YglOquMzddgwwpYTjvgf6Ae2BTFVd4b9R\nVbNU9Z1ikkN9YAJwiap2A8YBY9zNRYcA8F8+EuivqgOAlwpiFJE44CLgJRE5HrgE6KeqPYCxwMdF\nA1fV43ESyAnADJySznhV7QqcDjwsIn3c3TviPMiPKkgOrk+BQe7PxwJ7/JbPAj4o5jN0x6nW6gA0\nBoYVjQ24H0hUVQGOAo51Pxfu503GSZanuZ/xfOBRd/NVwHJV7QkcD7QRkZRi1rd11wP4VPVt4L/A\ne6p6cZG4nwBmqWovN/404CZ3W1XgM1XtYMkhdliCMGVxgvugPgOoBvxaUBJwJZRwXCLOQySfsv3O\nHYtTlz4fQFU/UdUzgjjuT1XNdH9+HzjaTTanAkvdBHUG0Br41S3hPAqkikhqCef0AO1wHsifufFs\nwKmLP9XdZ62q/lX0QFVdA/wlIj3dfcfgJBxwEsRHxVzva1X1uklzPs63/6IGAa+418hV1RNVtTBJ\
nu/dgCDBYRO4H7gCSC84PnCciX+C8D/q3qu4OsD4Yg4GR7v2cDfTCKU0UmBrkeUyUsARhysIDoKpz\ncb4ZviIizdxt03C+cR5ARDzu+mnAIiBBRFoV2SdRRL4QkYZFDvdSpKQgIp3ddf6/u1WLHLen4Ae3\nWuYDYATwd5wSBUA8MEFVu7vf+I/CqRbaWcznLoihuP8vcexPjHuK2V7gY5wSx0luPGtEZDiwV1VX\nFrP/viLX9xSzzwH3R0Saikgdv+UmwFycqsBfgMLqJ1WdBbQEXgCaA7+LyNElrQ/wufzFA8P87ufR\nwD/8tge6PyYKWYIwh0RV38V5IT3eXfUhkCkiT7ovp3HrqJ8GduPUj+cA/wH+536jR0QSgSeB6qq6\nschlZgIdRKSDu+/ZOFVOO3ESTXt3v3NLCfdlnORwDPu/rX8LXFCQlNz3C9+VcHzBw1mBHDcORKQx\ncB4wuZTrg1PNdCEQr6qb3GMeZX/10qH4DrhURDzuffyQA5N0T2Czqj6kqpNxShO4+48B7lbViW4L\npIVAu5LWBxnPN7hVSm48E4HrD+PzmQizBGGCVdywv/8AThWRk1Q1D6fOPBOYLSJ/ArNwkkPBdlT1\nEZyH9Dci8gcwxz33WUVPrqqbcb75v+Hu+384dfwZOC91vxaRmUBeoMDdOu9c4EM3SaGq3+Ikq8ki\nMhenfv6cQJ/dre45G+fF+zycJHOvqv4c6PrusYvd8xQkoW+AphTz3qOk6xfjPvdzzcOp0pmkqp/6\nbf8GWCciKiKz3ettAdrgJOVuIvKniPwOrMB5Ae2/fpbf+mDcAFQXkfk4JZd57H/nYcNGxyCPDfdt\njDGmOCEvQYhIHxH5sZj1Q0TkNxGZJiJXhjoOY4wxZRPSBCEit+K8FEwssr4K8DhOK4wTgKtFJC2U\nsRhjjCmbUJcgllF8vW4HnOaGGW578akU0wLGGGNM5IQ0QajqJzhN8YqqidMbtsBuoFYoYzHGGFM2\nkRpqIwMnSRRIwWm6GJDP5/N5PMU1BzfGGFNg4cKFnPe3EeiieSQm1yFrz7ZDenCGK0EUDW4xTtf+\nVGAvTvXS2FJP4vGwZUuwnTortrS0FLsXLrsX+9m92K8y3guv18uzz45n7Ngx5OTk0KRDfy4Zedsh\nny9cCcIHICIXAMmq+rKI3ITTjtwDvOwOW2CMMeYQjR37ME88MY6kGrXpedotNGzdmxN7B9vP8WAh\nTxCquhro6/78jt/6L4AvQn19Y4ypLK66ahTbtu9gY/UTaNigHn06NKB9s5KGFyud9aQ2xpgKol69\nejz40FiqVkuhaVoNhp3YhqSqh14OsPkgjDEmxni9XrZv3079+vUPWD951lo++HF5uV3HEoQxxsSQ\nRYsWMvziy8jNy2fQpeOIi4sv3LZnnzMNSVpqEv06Nzrsa1mCMMaYKLc9I4vxH8zlt+/fYc6PE8jP\n89KkQ3+Sq0JC4v7R7msmV6VhneqMOqcTceXQJcAShDHGRKEla3fy/o/LyMvzsWDhQuZ+8zS7Ni0l\nqUZtjhl8AzdcdSHHlkMpIRBLEMYYEyVmLtrEz3PXAZC+Zn/f4V0bFrFr01JOOf0cnnriCWrXLm6C\nwfJnCcIYY6LEj3PWsWTt/sSQUj2BB67oQ41qJzBz5lkcc8yxYY3HEoQxxkQBn89HTq4z99XL/zoR\ncHoRFwwvFO7kAJYgjDEmrPJ9Pr6asZqde3IOWL928x7mL1hI1fwdxHkGRCi6A1mCMMaYEJqVvpk1\nm/cULq/bsoc5S7cesE9+fh4rZn3C0hnvUb1aNXbe/XdSU2uHO9SDWIIwxpgQ2J6RxczFm0rsuNa7\nQ33OOKYFy5cp9915I+kL5lK/fgPGjRsfFckBLEEYY0yZ5Of7+GPJFvZk5Qbc761vl5CX7wMgtUZV\nrjmrU+G2+DgPLRql8MH773DLLaPJyclh6NDhPPTQf8LWQikYl
iCMMSZIm3bs5dvf1vLjnHVBH3PF\nGR3o1rYeyUkJB23r2LEzaWn1GTNmHKeeenp5hlouLEEYY0wpvHn5LP1rF2PfmVO4rk2TWgzo3iTg\ncXVrJdG2acmjqXbu3IXffptHQsLBySMaWIIwxpgAtu3K4v0fl/F7+ubCdZef3oGe7dMOa6TUAtGa\nHMAShDHGlGhvlpfbXpyON895l3BU23qccUwLWjWuWcqR+xXM8rZ27VrGjXsyVKGGhCUIY4wpwb5s\nL948H03qJdOvSyNO6nkEcXHBD4KXnr6Y0aOvZc6cP2jQoCG3334XderUDWHE5csmDDLGVEo+n499\n2d6Af7Lcns3NGqRwSu9mQScHr9fL+PGPMWjQccyZ8wdDhw5nypQZMZUcwEoQxphKasI3yk9z1we1\nb1lHzn722fE89NB9hf0aorGFUjAsQRhjKqW1W5zezV1bB/5WHxfn4fiujct07iuuuJqtW7dw003/\njKp+DWVlCcIYUyG9MnEB38xYTUlf/vfleImP8zB6WNdyv3aNGik88MAj5X7ecLMEYYyJedMXbOST\nX1bg8+1fty0jC4CmaTVKPE6OKLmPQjC8Xi+bNm2kSZOmh3WeaGUJwhgTkyZOXcksdfom/LUlE4Aq\n8XHUSnam4KxfuxpHtqjDJadISK5f0EJpz549fP/9VJKSkkJynUiyBGGMiUlT529g264sqidVoXpi\nFRrUqc5tF3WnSrzTODMtLYUtW3aX+3UL+jWMHTumcAyl3NwcSxDGGBNN6tRMZOyo8E2k49+vIdZb\nKAXD+kEYY0yQFi1aUNiv4ZdfZlbo5ABWgjDGmKCdc85QmjdvQY8evSIdSlhYCcIYY4Lk8XgqTXIA\nK0EYY2LIrPTNbNqxF3AG0quWGB+S66SnL2bx4oWcc87QkJw/VliCMMZEvW27spixaCMf/bzigPX1\napVvyyH/FkpxcXH07XscDRo0KNdrxBJLEMaYqJKZlcvCldsLp+sEeOnzRYU/162ZxKWnOn0bmtYv\nuRNcWRXXQqkyJwewBGGMiSJzlm7h9a+VjMycYrdfNfhIOreuS41q5TvJzscff8ANN1wbtXNDR4ol\nCGNMVNi9N4fnP10AeDjjmOYHVR81qptMu8McGqMk3bodRaNGjXnggUcqfNPVsggqQYhIZ6AtkA8s\nU9UFIY3KGFPpFEzO069zQ87r3zqs127Vqg0zZswhPj40L71jVYkJQkQ8wDXA/wG7gTVALtBSRGoC\n44EXVDU/HIEaY2KDz+djx+5s8v3eIQRjx+5s54cyzr1QVj6fD08xEzxYcjhYoBLEh8Bk4GhV3eG/\nQURqAZcCnwBnhS48Y0wsyfXm8eFPK5g8a+0hnyOurLPzBKmghdKCBfN58cVXi00S5kCBEsQlqppZ\n3AZV3QU8JSKvBDq5Wwp5DugKZAFXquoKv+0jgJsAL/Cqqv63jPEbY6LE9ows7nhpJtnuNJ3tm6VS\nt2bZmqF64jwM6N6k3GMr2kJp48YNNGpUtkmAKqMSE0RBchCRBcDrwARV3VjcPgGcDSSqal8R6QM8\n7q4rMBboAOwFFonIO27yMcbEmK27ssjOzaNBnep0bV2X4QPaRPxbenEjr1oLpeAF85L6DOAS4EcR\nWQG8CnymqrlBHNsP+BpAVWeKSM8i2+cBtYGCysqyVVoaYyJK1+zg2U8WkOPNK5ysp1f7NM49Prwv\nmUvy2msvV4i5oSOl1AShqquBB4AHROQc4CngvyLyJvCAqm4LcHhNwL9E4BWROL8X2wuB2cAe4GNV\nzTiUD2GMCZ8p89bz2dSV+Hw+du7Z31+hecMUqsR76Nq6XgSjO9DFF1/Gxo0bue66G6zUcAhKTRAi\nUgMYClwMNAGeB94DTgG+AYqWCvxlACl+y4XJwW06ewbQHMgE3hKR81T1o0DxpKWlBNpcqdi92M/u\nxX6hvhdL1u1ix+5sGtSpT
qN6CdSolsAdl/Wmbq1qIb3uoWjatB7jxz8W6TBiVjBVTCuBScB9qjql\nYKWIPA+cVMqx04DBwIcicjQw32/bLpx3D9mq6hORzTjVTQGFYoaoWBSq2bJikd2L/cJxL7KzvQDc\nflF3Uqo703vm53gj+m/g9XpZu3YNLVu2Klxnvxf7HeqXhmASxBWqOtF/hYicq6ofA+eUcuwnwEki\nMs1dvkxELgCSVfVlEXkRmCoi2cBy4LWyhW+MqezS0xdzww3XsGXLFqZMmUFKSs1Ih1RhBOooNxxI\nBO4XEf/+7QnAbcDHpZ1cVX3AtUVWL/Hb/gLwQlkCNsZERlaOl/d/WMaK9dHxqrC4Fkr5+dZvtzwF\nKkHUBPrivEM40W+9F7gjlEEZY6LPsnW7+GnuegBq1ahKUtXIDeVWUGqYO3eOtVAKoUD9IF4CXhKR\ngar6fRhjMsaEyeqNu/lt8SbyfaW3MN+6MwuAwX2bM6RvSxKqRG5CynXr1jJ37hzr1xBigaqYXlTV\nq4E7ReSgEoOqDghpZMaYkJo4dSUTp60KKjn4q59aPaLJAWDgwJP5/vupdO7cJaJxVHSByogF7wbu\nDUMcxpgQy/XmMWPhJrJy89i2K4tvf19L7ZRELhzUjjo1E4M6R0KVOJrUSw5xpMGx5BB6gaqYZrs/\n3gRMACaqavGzeBhjot6cpVt59av0A9Yd07EhPSQtQhGVLj19MbNm/cZFF10a6VAqpWDeMr0IXAA8\nISLfAG+q6k8hjcoYUy6ycrykr95JXr6PJWt3AjCwe1PaN08lLs5Dh+aldj2KCP8WSnl5eRx3XH+a\nN28R6bAqnWCG2vgC+EJEquH0fH5MROqpavOQR2eMOWTrtuzhlS8Ws2rjgZ3F2jVLpYfUj1BUpSuu\nhZIlh8gIdka5I4HzgWHAWuDJUAZljDk8m3fu465XfitcPqnnEdSrlURS1Xi6takbwcgC+/zzz7j2\n2its5NUoEcxYTPNx+j68CQxQ1Q0hj8oYc8i8efms3eSUGto0qcXxXRvTr0ujCEcVnF69etOiRUvu\nvPM+69cQBYIpQVyoqvNL380YEw2e+2QBc5dtBaDtEbViJjkANGzYiClTZhIXF9lmtMYRTD+Ip0Tk\noIbS1g/CmOiR7/ORl+cj15vHll37iI/zcFzXxhzfJXpnTStpbmhLDtHD+kEYE+PyfT7ueeU31m3d\nP8FjclIVLjlFIhhVyQpaKE2b9gvvvvuxJYQoFkw/iKGq+g//bSLyOvBzKAMzxgTH681n3dZMalRL\noO0RqeTk5nFki+hsvlq0hdKaNatp0aJlpMMyJQhUxfQy0AroKSIdixyTWvxRxphIadEohftH9o3K\nORBsbujYFKiK6UGgBTAeuM9vvRdYHMKYjDEBfDxlOb8v3ly4XMahlCLigw/etbmhY1CgBJGlqj+J\nyJBittUAtocoJmNMADMXbWLrzixq1qhauC61RtWomgu6qL/97QL++mstV1450koNMSRQgngZZ7rQ\nnwEf4N/cwIdT/WSMiYDUlEQeu+7YSIcRtPj4eG699bZIh2HKKNBL6sHu3/YGyZgosGnHXj79ZSW7\nMnNITkqIdDjF8nq9rFixnHbtorMFlSmbYHpS9wb6Ac8Ak4CjgGtU9aMQx2aMcX05YzUf/rS8cLld\n0+gYctufajo33HANa9asZsqU30hLi95RYk1wgulJ/RTwL2AosA/oAXzk/jHGhNDqjbuZPGstvy7Y\nWLju3yO606ZprQhGdSCv18tzzz3Fo48+XNhCKSEhctORmvITzL9inKr+LCJvAR+q6hoRsX99Y0Jk\n4artLHWH5p44bVXh+m5t6nH5GR2oUS16qpeWLFH+8Y+RzJnzh7VQqoCCedDvFZGbgYHA9SIyGoi+\nhtbGxLg9+3KZuWgTb01ectC2uy7tSfMGKcTFHTw0RSTt3LmTefPmWr+GCiqYBDECuAI4V1V
3iEhj\nnAmEjDHlxJuXz5MfzGPF+gzAaTL4zwuPAqBerWrUrZUUwehK1rt3H6ZMmWkvpSuoYCYMWiciHwF1\nROR44AugNbAu1MEZE6u8efksWrWd7Nz8oPb/c9lWVqzPoGf7+vTt2JAWjVJIrRHcPNGRZsmh4gqm\nFdOzwBBgBU7/B9y/bTRXY4qxZec+Pv91FVP/LNvUKY3rJXPF6R1IrBofosgOnWo6P/30PSNHXhfp\nUEwYBVPFdDIgqrov1MEYEwt27slmW0ZW8Rt98NCE2YWLHZrXpnu70pt7xsd56C5pUZccirZQOuGE\ngYi0j3RYJkyCSRArOLAXtTGVVq43n9tfnEFWTl6p+444qR3HdWlE1YToeugHq6Bfg38LJUsOlUsw\nCWI7sEhEfgUKvzap6uUhi8qYKJGXn09GZm7hclaOl6ycPOqnVqOHlFwy6NWhPi0a1gxHiCHx7bdf\ncfnlF9vIq5VcMAnia/ePMZXO4+/NY/HqHQetb5KWzLAT20QgovDo1asP7dsfyS23/Nv6NVRiwbRi\nel1EWgAdgW+AI1R1ZagDMyYabNy+l8SEeLq13T9Sqgfo3y16p/IsD7Vr12Hy5J+LnRLUVB7BtGIa\nDtwJVAP6AtNF5BZVfTPUwRkTSas2ZrA320vN5ARGntmx9ANiVF5eHvHxB78nseRggpkM9l84iWG3\nqm7GGazPxu01FVr66h08+PpssnPyOL5rxSwteL1ennrqcYYMOYXc3NzSDzCVTjDvIPJUdbeI0xlG\nVTeISHC9f4yJIT/OWcdXM1YDsHWX0x7j76e1r5AJomgLpVWrVtK2bbtIh2WiTDAliIUicj2QICLd\nRORFYG6I4zIm7OYs3cLWXVnk5fuoUzOR5g1SDnj3UBEUlBoGDuzHnDl/MHTocH75ZaYlB1OsYEoQ\n1+G8g9gH/A/4Abg5lEEZE0ljrj46ZvsulOarrybx4IP32sirJijBtGLKxHnncJuI1AW2q2oMTJNu\njClq8OCzuPvuBxgx4mLr12BKVWKCEJE04HmcmeR+xpkg6GRgk4gMUdVF4QnRGFNePB4P118/OtJh\nmBgRqATxNDDL/fM3oDvQGGgDjAdOKu3kIuIBngO64vTCvlJVV/ht7wU85i5uBC5S1ZyyfwxjjD+v\n18vixYvo3LlLpEMxMSzQS+ojVfURVd0DnAa8r6oZqvoHTqIIxtlAoqr2xammerzI9heBv6vq8Ti9\ntZuXLXxjTFGq6ZxxxiDOPPNU1q5dE+lwTAwLlCD83zMMAL7zW64e5Pn74Q7ToaozgZ4FG0SkHbAN\nuElEfgLqqOrSIM9rTLnJ9ebz45x1bNlZwgitMcLr9fLII48UtlA67bQzqFGjRqTDMjEsUBXTarcX\ndXX3z08AInIRsDDI89cEdvkte0UkTlXzgXrAMcAonBFjJ4nILFX9qUyfwJhDkJmVy2zdQl6+jz+X\nbWXe8m0AJFaNj7ppPYOxdOkSrr/+apsb2pSrQAniOuAFoAFwoarmiMjjOJMHBfublwGk+C0XJAdw\nSg/LVHUJgIh8jVPC+CnQCdPSUgJtrlTsXuxXlnuxcMU2nn5/Duu2ZB6w/vhuTRhxansapcXet+6N\nG6uyYMF8LrroIsaPH0+dOtZCCez/yOEqMUGo6loOTgQPALf4PeRLMw0YDHwoIkcD8/22rQBqiEgr\n98X1ccDLpZ1wy5bdQV66YktLS7F74SrLvdixO5t/PzutcPnMY1vQqG4yiQnxdGpVhyr4YvK+NmzY\ngqlTf6d3765s2bI7Jj9DebP/I/sdaqIM1Mz1f8AY//cCqrrDb3tHnGRxWYDzfwKcJCIF/yMvE5EL\ngGRVfVlErgDecYfx+FVVvzqkT2FMkLJyvAC0bVqLU3s3o1vbehVmULqWLVtFOgRTwQSqYroLeFJE\nGgFTgb8AL05LoxPd5ZsCndztUHdtkdVL/Lb/BPQpc9T
GHKbG9ZI5KoipQKONajqTJn3GzTf/K9Kh\nmEogUBXTOmCYiLTGqSZqD+QDy4ERqro8PCEaY4rODT1gwCCOOqpHpMMyFVwwQ20sx+kYZ0xM8/l8\nQc0lHW2KmxvakoMJh2AG6zMmZvl8+7vzTPhG+WnuesCZFS4W/Pzzj4wYMczmhjYRYQnCVFgLV21n\nwjfK5h37Dljfo10a/brExhwPPXv25qijenDddaOtX4MJu6AShIgkA61xmqlWd0d4NSbqZGbl8uDr\ns9iZmUN2Th5xHg/tmtYq7PzWukktzuvfOsJRBi85OZmJE7+uMC2tTGwJZk7qgTgd5uJxph79U0RG\nqOq3oQ7OmLL49JcVTJy2qnC5U8s6nNe/Nc0bxkZnKa/XS5UqB/+XtORgIiWYGeUexhlTaaeqbgD6\nA2NDGpUxh2DO0q14gIZ1qnPT8K7cNLxbTCQH/1ne9u7dG+lwjCkUTBVTnKpu9JuTelHBz8ZEyp/L\nt/HF9FX4fJCQEE9ubh6bduwlKbEKD199dKTDC1rRFkorV66gY8dOkQ7LGCC4BPGXiAwGfCKSijNG\nk40hbCJqxqKNLP1rF3EeD/41MEc2T41cUGVQtF+DtVAy0SiYBDESpx/EETid5H4ArgplUMaUJNeb\nx3s/LGPp2p0APHrtMUjrtJgbc2fKlJ9sbmgT9YJJEF1V9QL/FSJyLvBxaEIypmQr1mfwwx/rAKhR\nLYHkagkRjujQDBgwiP/853HOPvtcKzWYqBVosL7hQCJwv4jcXeSY27EEYSKgoN/bKb2P4JzjWlE1\nIT6yAR2Gyy67MtIhGBNQoBJETZxmrSk4g/MV8AJ3hDIoY/zt2ZfL1D83kJuXz5adTqe3xIT4mEgO\nXq+XuXP/oGfP3pEOxZgyCzRY30vASyIyUFW/D2NMxrBtVxZzl23F5/Mxedbag6YDrREDVUsFLZQW\nLlzA999PRaR9pEMypkyCeQeRLSKfATVwhrCJB5qraotQBmYqn3VbM1m1IQOAV75YfND2s49rSevG\ntagS76FN01rhDi9oRVsoDRt2PvXr1490WMaUWTAJ4mXgP8DfgaeA04A/QhiTqaSe/ujPg8ZNunJw\nB6pWiSe5WgLtm6VGfa/i5cuXMmrUVYX9Gh577ClOOeW0SIdlzCEJJkHsU9VXRaQFsAOnievskEZl\nKhVvXj6fTV3JFjc5XHaaUxXToE512h0RG/0aClSpkoCqMmzY+Tz44CPWQsnEtGASRJaI1AEUOFpV\nf3AH7zOmXEyZt54vpq+mbs0kRp7ZMaqrj0rTvHkLpk79jaZNj4h0KMYctmASxOPAe8C5wO8iMgIr\nQZgy8Pl8bMvIIj/fV+z2dVudwYGvHNwhppNDAUsOpqIIZka5D0TkQ1X1iUgPoB2wLPShmYog15vH\nxGmr+GL66lL3jY8PZuzI6KCazrvvvsXdd98f9e9FjDlUgTrKpQE3AduBJ3D6P+zD6RvxNdAgHAGa\n2JCXn4+vSAEhc18u/35hBtm5zjSfLRul0CStRrHHp1RPoEWMjLzq30LppJNOoW/ffpEOy5iQCFSC\neAvYDdQDqorIl8AEoDpwYxhiMzHi+9l/8fbkJRRfgeQY2KMp5w9sQ3xc7JQSiipubmhLDqYiC5Qg\nWqtqaxFJAaYDo4CngcdVNScs0Zmoo2t28MzH88nx5hfO65zjzQegXq0k6teudsD+cXEezjq2Ja2b\nxPa7hRkzpjN06BAbedVUKoESRAaAqu52WzGdp6rTwxOWiTbL1u3ine+WstLtyAZOlVGBurWqcc2Z\nHQun9qxiVCgFAAAgAElEQVRounfvQb9+x/P3v19pI6+aSiNQgvCvMdhkyaHymrFoIy9NXIQPaFC7\nGsnVErjunM7UTkmMdGhhU7VqVd5918anNJVLoASRIiLH4UxLmuz+XPj1UFWnhDo4EzlfTF/FH0u2\nALBygzPXwo1/60r
nVnUjGFV4ZGdnk5hYeZKfMSUJlCD+Au53f17n9zM4pYsBoQrKRM5s3cyUeRuY\nv2IbAFWrxJFQJY5GdarToXntCEcXWgUtlCZMeI3vvptCrVqx1YvbmPIWaDTXE0vaZiqm/325mKl/\nbihc7tm+PqPOrhzzIxc3N3S3bt0jHZYxERVMT2pTCezKzGHqnxtoUKc6/zi3M/VrVyO+gr5w9mdz\nQxtTMksQhlUbM/h57noAmjeoQeN6lWeorT/+mG1zQxtTAksQldTCldtZ+tdOACZOW1W4PrVG5Xo5\n27t3H5566nlOOeU0KzUYU0SpCUJEagOPAq2BYcBY4GZV3RHi2Ew5m5W+me0Zzsxs7/5w8HBad13a\nk2YNih8KoyI7//wRkQ7BmKgUTAniJeBboDfO0BsbgDeBM0IYlymDdVv2sGJ9RsB9tu/O5rOpKw9Y\n5/HArecfBUC91CTq1apW3KEVgtfrZebM6Rx77HGRDsWYmBFMgmipqi+KyLXuEBt3iMi8UAdmSpeX\nn88nU1by9cw15BcdKa8E9VOrMXxAGwBaNq5ZKaqUCloozZs3ly+//I7u3XtGOiRjYkIwCcIrIrVw\ne1aLSFsgP6RRmVJt3JbJhG+UKfM2UK9WEqcd3ZyqVQIPhBcf56Fz67okJyWEKcrIKq6FUsuWrSId\nljExI5gEcQ/wE9BMRD4FjgEuD2VQJrD8fB+jn/yJvVleAC4Y1Jaj2qZFOKrosnLlCq655vIDRl61\nFkrGlE0wCWIyMAvoA8QDI1V1U0ijMiXy+Xxs2L6XvVle6qdW46ReR9CppbW+Kap69WRWrVpp/RqM\nOQzBJIg1wCfAm6o6oywnFxEP8BzQFcgCrlTVFcXs9wKwTVVvL8v5K6NPflnBpF+d2dka1KnOwB5N\nIxxRdGrQoAFTpvxGgwY2r5UxhyqY2Vs6AXOBh0QkXUTuFZE2QZ7/bCBRVfsCt+HMb30AERnpXsOU\nIi8/ny07nWaqJ3RvyuC+zSMcUXSz5GDM4Sk1QajqDlV9WVUHAhcBQ4D0IM/fD2d6UlR1JnBA8xER\nOQboBbxQlqAro+0ZWdww/hdmLnJq9y45/UjaNrXB5FTTGTVqFHl5eZEOxZgKJ5iOcmk4HeTOB+oA\nbwPnBHn+msAuv2WviMSpar6INMR5AX42MDzYgNPSon/e4lDYlJHNvuw8GtatTo/2DaiXmoTHU/HH\nSiqJ1+tl3Lhx3HPPPeTk5DB48GBOP91eQkPl/T9SHLsXhyeYdxBzgfeBG1V1dhnPnwH4/wvFqWpB\nE9lhQF3gS6ARUE1E0lX1jUAn3LJldxlDqBh27twLQC+pzznHtcTj8VTae1F05NWXXnqRXr2Oq7T3\nw19aWordB5fdi/0ONVEGkyCO8Huol9U0YDDwoYgcDcwv2KCqT+PMcY2IXApIacnBmLlz/2Dw4JMP\nGHm1Xbvm9iAwJgRKTBAi8oeqdsepFvLvpusBfKoaH8T5PwFOEpFp7vJlInIBkKyqLx9y1KbS6tKl\nG6eeegZDhw63fg3GhFigCYO6u38f9CJbRIIan0FVfcC1RVYvKWa/14M5X2W2KzMn0iFEhbi4OF5+\n2X5djAmHUlsxicj0IstxOB3nTJj8NGcdL05cBDjjJ1UW+/bti3QIxlRqJSYIEflBRPKBPiKSX/AH\np8Obhi1Cw7e/ryU+3sPN53ejW5t6kQ4n5LxeL0899Ti9e3dl0ybrtG9MpASqYhoAICLjVXV0+EIy\nRfl8PqonVaFji4o/XETRFkpr1qyyDm/GREigl9SDVXUS8IeIXFJ0u7U4MuXJ5oY2JvoEaubaC5gE\nnFDMNh9gCcKUm/T0xTz88P3Uq5dmI68aEyUCVTHd4/59WcE6EamJ0y9iYRhiq1Ty833MXrKFzH25\nB23bm+0lLq5i95ru1KkzL774Kscd199KDcZEiWCG2rgCOBb4FzAH2C0iH6nqnaEOr
jJZsnYnz3+6\noMTtjepWD2M0kXHmmcGO4GKMCYdgelKPAk7CGajvM2A0MAOwBFGOsnOdweb6HNmArm3qHrS9RcOK\n0bzV6/Xy888/MHDgyZEOxRhTimASBKq6XUROB55SVa+IVNzZ7SOsWYMaHH1kw0iHERL+LZQ++OAz\n+vc/MdIhGWMCCGY+iIUiMgloBXwnIu8Dv4c2LFORFPRrGDiwH3Pm/MHQocPp0qVrpMMyxpQimBLE\n5UBfYL6q5ojIBOCr0IZVeezKzCEvL5/dew9+OV0RrFmzmquuutTmhjYmBgWTIKrijMj6uIhUAX4E\nfgC8oQysIsr15uHN2z/u4dT5G3jnu6UH7OOhYrVWSk1NZdOmTdavwZgYFEyCeAbYi1OS8ABXAf8F\nLg5hXBXO6o27eWjCbLx5B4+cXq9WEm2b1iKhSjy92tePQHShU7NmLX74YSp16hz84t0YE92CSRA9\nVNW/wvh6EVkUqoAqosfem8vCldsLl/3HU6pRPYGLTmpH1YRgRk+PTZYcjIlNwSSIOBFJVdWdACKS\nilUvBWVW+mbe+EbZ43Z+69yqLiNObkf91IrXCEw1naeffoLHH3+aqlWrRjocY0w5CCZBPA78LiIT\n3eUzgTGhC6niWPrXLvbsy6VBneoM6N6Ek3oeEemQyl3RMZROOeU0hgw5O9JhGWPKQakJQlVfFZHf\ngf44zWLPVdX5pRxW6WzcvpdXJi0iy+3wBrBzdzYA15zZkeYNK97k6UVHXrUWSsZULIFGc40DrgPa\nAVNV9dmwRRWDlqzdyfL1GSQmxFMlfn9LpCb1kqlfu+JVKaWnL2bQoONs5FVjKrBAJYjngCOBX4Hb\nRURU9f7whBVbJk5bye/pmwG4+JR29O3UKMIRhZ5Ie4YNO59TTjndSg3GVFCBEkR/4EhV9YnIWJy+\nD5Yg/KzamMHk39cyfaEz61lClTga1kmOcFTh4fF4eOKJZyIdhjEmhAINtZGlqj4AVd2GMweE8fPz\n3PWFyaHPkQ149sbjaVUB54zes2d3pEMwxkRAoARRNCEc3MOrkvP5nFt020XduXrIkVSJD2Zoq9jh\n9XoZP/4xunfvyMqVKyIdjjEmzAJVMTUXkf+VtKyql4curNiQ76bQmtWr4vFUrCEy0tMXM3r0tYUt\nlDZt2kjLlq0iHZYxJowCJYibiiz/HMpAYs0X01cxfcFGqsR7qJ4U1KjpMcHr9fLss+MZO3aMtVAy\nppILNOXo6+EMJJas35rJRz+voFaNqlx2WntSqlecnsNr1qxi3LhHSE2tbf0ajKnkKs5X3zDZuSeb\nP5dvA6Bvx4Z0aV2vlCNiS6tWbXj11Tfp0aOXlRqMqeQsQQQp3+dj9cbdPPjGLNx30yRUqVgvpQsM\nGnRKpEMwxkSBoBKEiCQDrYH5QHVVzQxpVFFo5qJNvPT5/kFsLxzUlj5HNohgRIfH6/Xy9ddfMnjw\nmZEOxRgTpUr9CiwiA4F5wGdAQ2CViFSqGeezc/LYsG0vAL3a1+em4V0Z1POImH33kJ6+mDPOGMTl\nl1/E559/FulwjDFRKpg6koeBfsBOVd2A08N6bEijiiJ5+fn8+4XpTPp1FQDHdGxIp5axOb9BQb+G\nQYOOK5wbul+/4yIdljEmSgU1H4SqbhQRAFR1UcHPFYXP5yt8r1BUdk4+uzJzqJ2SSE+pjzRLDW9w\n5WTdur+4/PKLbORVY0zQgkkQf4nIYMDnThZ0HbAmtGGFjzcvnztfmsnmnfsC7tesfg0uGNQ2TFGV\nvzp16pKRkWH9GowxQQsmQYwExgNHACuA74GrQxlUKD3/6YLCZqoA2X7zN3RoXrvE447v1jikcYVa\ntWrV+OabH6lZs1akQzHGxIhgJgzaDFwQhljCYvHqHeTl+2iStn/U1TgPDD6mBUe1S4tgZKFnycEY\nUxalJggRWUkxI7mqaswOzNOgdjXu+XuvSIcRE
unpixk7dgzjxz9HjRo1Ih2OMSaGBVPFdILfzwnA\nOUBiSKIxh6zoGEonn3wqw4dfGOmwjDExLJgqptVFVo0VkVnAg6UdKyIenJnpugJZwJWqusJv+wXA\naCAXmK+qo8oQu3EVHXnVWigZY8pDMFVMx/steoCOQLCTLJ8NJKpqXxHpAzzurkNEknBmqOukqtki\n8raIDFbVSWX6BJXcqlUrbW5oY0xIBFPFdJ/fzz5gK3BpkOfvB3wNoKozRaSn37ZsoK+qZvvFkhXk\neY2rRYuWXH751fTt289KDcaYchVMgnhfVZ8/xPPXBHb5LXtFJE5V893pTLcAiMg/gGRV/e4Qr1Op\n3X//w5EOwRhTAQWTIK4DDjVBZAApfstxqlo4dan7juJRoC1wbjAnTEtLKX2nADweD/FV4g77PJGw\nY8cOatfe31cjFj9DqNi92M/uxX52Lw5PMAlirYj8AMwECrsbq+r9QRw7DRgMfCgiR+OMBuvvRWCf\nqp4dZLxs2bI72F2LlZ+fT543/7DPE04FLZSefPIxJk36lo4dO5GWlhJTnyGU7F7sZ/diP7sX+x1q\nogwmQczw+7msEy9/ApwkItPc5cvclkvJwGzgMuAXEfkR5/3GeFUNyfCi+7K9vP51OplZXo6oHzv9\nA4q2UNqxY3ukQzLGVBIlJggRuVRVX1fV+0rapzTue4Zri6xeEsz1y9uPc9bx2+LNtG5ck0tPbR+u\nyx4ymxvaGBNpgYb7Hh22KMIgMysXgPMHtqVBneoRjqZ0W7ZsZvz4x0lNrc0bb7zLc8+9ZMnBGBNW\nNuVolGrUqDFvvPEOHTt2ssRgjImIQAmio4isKGa9B/DF8lhMsaJfv+NL38kYY0IkUIJYBlSInlc5\nuXnk5OSXvmMEeL1ePvnkQ4YOHY7HU9Y2AMYYEzqBEkROMeMwxZyMzBz++d9fycl1EkQ0PYT9Wyjl\n5uZy4YUXRzokY4wpFChBTAuwLSa8+/1Svv19beHywO5NadYg8k1ci2uhdNppZ0Q6LGOMOUCJCUJV\nrw9nIOVpzabdjHt3Lnv2OS2XurSuy+lHN6fdEZGfT3rTpk1ccslwG3nVGBP1KmQrpnVbMtmzL5d6\ntZI4qm1aVM0lXbduXfLzfdavwRgT9SpkgigwuG8Lju8aXXNJV6lShU8++cJmezPGRL1AHeVMiFhy\nMMbEggpTglj6104mTl1JXr6PXZk5kQ6H9PTFPPjgPTz11PPUqVM30uEYY0yZVZgSxMxFm1i4agfp\na3ayYdteqibE0aRectjj8Hq9jB//GIMGHce3337NxImfhj0GY4wpDxWmBFHgvst70yTNSQxxYe7z\nYHNDG2MqkgqXIDye8CcGgI0bN3Dyyf3JysqyFkrGmAqhwiWISGnYsBE33HATnTp1sVKDMaZCsARR\njm655d+RDsEYY8pNhXlJHU5bt26NdAjGGBNyliDKoKCFUvfuRzJjxvRIh2OMMSFVIaqY1mzazfaM\n7JBeo2gLpaysfSG9njHGRFrMJ4itO/dx76u/Fy4nVCnfQpHNDW2MqaxiNkFs3bWP7Nx8NmzNBKBN\nk1qccFRjGtQu3/mmMzJ28cILz5KaWtv6NRhjKpWYTBBzl27lqY/+PGBdmya16NupUblfq06durzx\nxru0bt3GSg3GmEolJhPEjt1ZABzZojYNalcnPs5D/26hG7W1Z8/eITu3McZEq5hMEAWO79qY3h0a\nlMu5vF4v7733NsOHX0iVKjF9W4w5JHPmzObuu2+jZctWAGRmZtKkSVPuvvsBqlSpws6dO3n22SfZ\ntGkj+fn51K/fgOuv/7/CwSjnzZvDa6+9jNfrJSsri9NPH8I55wyN5EcqrCK+9dbbIxpHdnY2Dzxw\nFzt27CA5OZk77riXWrUOnMBs+vRpvPbaywCItOemm/5FZuYe7r//LjIzM8nL83L99TfRsWMnXnnl\nBQYOPJkWL
VqGNG57EnJgC6XduzO45pqYnUzPmMPSo0cv7r33ocLl++67k2nTptC//wDuuONWLrzw\nEo499jgAZs36jX/+80Zeeul11q9fx/jx43j88WdJTU0lOzub0aOvpUmTpvTufXSkPg4vvvg85533\nt4hdv8Cnn35I69Ztueyyq/j++2957bVXGD365sLte/fu5fnnn+KZZ16kZs1avP32BHbt2smHH75H\nz559GDbsfNasWc29997B//73JsOHj+C+++5g7NjxIY27UieI4looDR9+YaTDMob3f1jG7+mby3xc\nfLyHvDxfsdt6ta/P3wa0CXi8z7f/2NzcXLZt20pKSk3S0xdTo0aNwuQATtVrkyZNmTNnNvPmzeHU\nUweTmup8K05MTOTxx5+mWrUDG4389ddaHnnkAbxeL0lJSdx778M899x4Bg06hd69j2bmzOl8//23\n3H77PZx33mBatGhFixYtmDbtF15//R0SE5N45503iY+P54QTBvDoow+Rk5NDYmIi//znHaSl1S+8\n1p49e1BdRKtWzmf+6KP3mTLlR7KysqhVK5WHHx7L5Mlf88UXE/H5fFxxxUh27drJe++9TXx8PF26\ndGPkyOvYsmUz48aNKbwfV111Lf369S+8zrp1f/HIIw/g8RsD7qSTTmXIkLMLl//8cy4jRlwKwNFH\n9y0sKRRYsOBPWrVqw9NPP8H69esYMuRsatVK5fzzR5CQUBVwnleJiYmAM6dMYmISK1YsK/x8oVBp\nE8S2bdu48MLzbORVY/z88ccsbrjhGrZv305cnIezzjqX7t178sMP39GkSdOD9m/cuAmbNm1k69Yt\ntG0rB2yrXv3g4fafffZJLr30cnr1Oppp035h6dL0EmPZsmUzr732DikpKSQkVOWnn37glFNOZ/Lk\nr3nyyed47LExDBt2AX36HMPs2b/z/PNPc/fdDxQeP2/ePJo1aw44iW/37gzGj38egJtu+gfp6YsA\nSEmpyZgx48jIyGDUqCt55ZUJJCYm8sADdzNr1m8AXHDBxXTr1p0FC/7klVdeOCBBNGnSlKeffiHg\nfc3MzCycKKx69WQyMzMP2L5z507mzJnNa6+9Q1JSEtdddyWdOnWhadMjANi2bSsPPng3o0ffWnhM\n69ZtmDNntiWIUKhduzY1aqRYvwYTlf42oE2p3/aLk5aWwpYtuw/5ugVVTBkZu7jxxutp1KiJe940\nNmxYf9D+a9euoVevPmzdupVNmzYesG3ZsqX4fPkHJI41a1bTsWNngMLSyOTJ3xRu9y/BpKbWJiUl\nBYDBg89i3LgxNGvWnObNW1CzZk2WL1/OhAmv8tZbr+Pz+Q56d7hjxw5q13bej3g8HuLjq3DPPbdT\nrVo1tm7djNfrBShMIuvWrWXnzh3ceutofD4f+/btY926v+jSpRuvv/4KkyZ9BkBeXt4B1/EvQfh8\nPjwez0EliOTkZPbu3QvA3r2ZhZ+rQK1atejQ4Uhq164NQNeu3Vm6VGna9AiWL1/GfffdwfXX30jX\nrt0Kj6lbtx5bt2456N+kPFXaBBEXF8dbb31AUlJSpEMxJurUrFmLu+66nxtuuIbXXnubzp27sn37\ndn79dSp9+/YDYMaMX1m//i+OOqoHjRs34fbbb2HgwJNJTU1l7969jB37MJdddhVt2+4/b4sWLVm0\naCE9e/bm22+/ZvfuXVStmlj4oFuyZH+Jwn/U/qZNj8Dng7ffnlD44rtFixacf/7FdOrUmTVrVjF3\n7pwDPkPdunXZs8dJlsuXL+OXX37ixRdfIzs7iyuuuLgwGcXFOZ1rGzVqQoMGDXniiWeJj4/nq68m\n0bat8PLLz3PmmefSp88xfPnl53z11aQDrhNMCaJz565Mnz6N9u2PZPr0aXTpctQB29u1a8+KFcvJ\nyNhF9erJLFw4nzPPPIeVK1dw993/5v77H6F16wO/MOzenRHyL7aVNkEAlhy
MCaBFi5YMG3Y+Tz45\njvvvH8N//vME48ePY8KE/wFQv34DHn10PB6Ph4YNG3HttTdwxx23Eh8fz969exky5GyOPrrvAecc\nNWo0jz76MG+88T+SkpK4664HWLfuL8aMuZ/Jk7/miCOa+e194LwugwefySuvvEj37j0LzzVu3CPk\n5GSTk5PD6NG3HLB/165dGTPmPwA0bdqUatWqM2rUlfh8PurWTTvo23dqairDh4/g+uuvIi8vn0aN\nGjNgwEmceOIgnnnmCSZMeJX69Ruwa9fOMt/Lc84ZyoMP3suoUVeSkFCVe+99EID33nuLpk2bceyx\nxzFy5PXceOP1eDweBgw4iZYtW3HbbTeTk5PL+PHj8Pl81KiRwpgx4wBYtGgBI0eGtkGNx79IFwN8\nb32xkJmLN7Fyw26uOatjqc1c09MXc+ed/+app56jceMmYQoz9A63KqEisXuxn92L/dLSUvjXv+7g\nrLPOOej9SKzLyMjg4Yfv5ZFHHg9q/7S0lEOaRS2mRnPNy8vn3R+WsXLDbjweqFur5BKA/9zQU6b8\nyMSJn4QxUmNMNLjiipF88smHkQ6j3L3//ttcffV1Ib9OTFUxFZR12jatxQ1Du5CclFDsfjY3tDEG\nnMYo//znHZEOo9xdeeU1YblOTCWIAglV4kpMDjt2bOfUUwewd2+mtVAyxpjDEJMJIpDatetw++13\n0axZCys1GGPMYahwCQLg6qtHRToEY4yJeTH1krqojRs3RDoEY4ypsEKaIETEIyLPi8ivIvKDiLQq\nsn2IiPwmItNE5Mpgz5uf57RQ6tmzM5Mnf13+gRtjjAl5FdPZQKKq9hWRPsDj7jpEpIq73APYB0wT\nkc9UNWDf8d1b1/Dmh8+xcU069es3ID6+QtaSGWNMxIW6iqkf8DWAqs4Eevpt6wAsVdUMVc0FpgLH\nBzrZPfc9yC9v3cTGNekMG3Y+v/wykwEDBoUqdmOMqdRCnSBqArv8lr0iElfCtt1ArUAne2L8MyQk\npXDe1WN49tkXrfmqMcaEUKjrZzIA/2EL41Q1329bTb9tKUDAQU727tp8SN3FK6q0tJTSd6ok7F7s\nZ/diP7sXhyfUJYhpwOkAInI0MN9v22KgjYikikhVnOql6SGOxxhjTJBCOlifiHiA54Au7qrLcF5K\nJ6vqyyJyBnAPzrCNr6jqf0MWjDHGmDKJtdFcjTHGhElMd5QzxhgTOpYgjDHGFMsShDHGmGJFZTdk\nv5fbXYEs4EpVXeG3fQhwF5ALvKqqL0ck0DAI4l5cAIzGuRfzVbXCjlRY2r3w2+8FYJuq3h7mEMMm\niN+LXsBj7uJG4CJVzQl7oCEWxH0YAdwEeHGeFRW+IYw7asUjqnpikfVlfm5GawmicIgO4DacITmA\nA4boGAScAFwtImmRCDJMAt2LJOB+oL+qHgekisjgyIQZFiXeiwIiMhLoFO7AIqC0e/Ei8HdVPR5n\nNIPmYY4vXEq7D2OBATijOtwsIgE748Y6EbkVeAlILLL+kJ6b0ZogynWIjhgX6F5kA31VNdtdroLz\nLaqiCnQvEJFjgF7AC+EPLexKvBci0g7YBtwkIj8BdVR1aSSCDIOAvxPAPKA2UM1drujNNpcB5xSz\n/pCem9GaIMp1iI4YV+K9UFVfweCGIvIPnP4l30UgxnAp8V6ISEOcPjXX4/SrqegC/R+pBxwDPIXz\njXGQiJwQ3vDCJtB9AFgIzMbppDtJVTPCGVy4qeonONVpRR3SczNaE0S5DtER4wLdi4Ih1ccCA4Fz\nwx1cmAW6F8OAusCXwL+BC0XkkjDHF06B7sU2YJmqLlFVL8437KLfrCuKEu+DiHQGzsCpXmsBNBCR\n88IeYXQ4pOdmtCYIG6Jjv0D3Apy65kRVPduvqqmiKvFeqOrTqtpLVQcAjwBvq+obkQkzLAL9XqwA\navjNv3IczjfpiijQfdgF7AWyVdUHbMa
pbqoMipaiD+m5GZU9qW2Ijv0C3QucovPvwC/uNh8wXlU/\nC3ec4VDa74XffpcCUklaMZX0f+QE4D/utl9V9cbwRxl6QdyHkcDlOO/rlgNXuaWqCktEmgPvuPPw\nXMBhPDejMkEYY4yJvGitYjLGGBNhliCMMcYUyxKEMcaYYlmCMMYYUyxLEMYYY4plCcIYY0yxonI0\nVxNabjvpJezvPOXB6UMxRFXXlXDMPYBPVe8/jOteijNg2Gr3mknAz8Ao/97hQZ7rPuB3VZ0kIj+4\nHeQQkT9Utfuhxuie40egKc5wBB6cHqjLgREFQ5uUcNxVQIaqvleGazUBHlDVy/3W3Q94y3qv3Z7D\nT+L0KI/H6Qj1f6q6tyznKeUak4ArcTqdfQU0Bl4F2qvq1SUc0wMYqapXl3aPRCQZeAMY6nZuMxFk\nCaLyWne4D9JD9FnBw9Dt5PQzcB3wdFlOoqr3+C2e4Le+vD7T5apa0AEREfkIZ9jo2wIc0xf4sYzX\neRK4w71GTZwEej7waBnPA/Aezgiuv7nnew5ntN9bDuFcxVLVwe65mwEdVbVpEMfMBgqSR8B7pKqZ\nIjIZuAZ4/vAjNofDEoQ5gIh0xHlYJwP1gcdU9Rm/7VWA/wEd3VXPu7006+OMotoUyAduV9XvA11L\nVX0i8ivQzj33ZTgP4XycXuLXAzlFrvecqr4iIq8CPwHd3WOnq+oxIpKP83u9FuimqltEpDawAGgG\nnATc5+6zEqdn7Y5iwiusfhWRFJwB8Ga4y8PcOJNwRgm9Emd45TOBE0VkA84oogHvh4i0Bhqp6hJ3\n1Vk4JbvHODQNcP7dCtyLMwYR7v3KBzrjlIgeVNU33W/sz+Lc33jgP6r6nogkuuv74fwbPKCqH4jI\nSqA/8DlQT0R+A24F7lXVE0WkG/Bf975sBy4C2rixPOh3j3YCrwAtVXWPW6r9QlU74SS6GViCiDh7\nB1F5NRGRP0Rkjvv3ze76K3EeBn1wxtF/uMhxfXGGj+6B87Dt664fj9N9vxfOg+4F9+FTIhGpC5wG\nTNvlfGQAAATrSURBVBWRTsDtwHGq2hVnDJ17i7nesX6n8KnqaABVPcZvXT7wPs4AfgDnAZ/gjMMz\nBjjZPd+3lPxN/SX33qzHqar5FnjCLfVcDZyhqkfhDGdxq/vwnwjcraqTg7wfg3GGXcb9DBNU9VGc\nB/mhuBH4XETUnTSpZ0FpwtUEOBpnYMdxblK/E5jlxtkfuFNEWgAFowO3x7nvd4tIgt+5zgTWq2pv\nd7mgOuhN4D733/Bd4IaC7UXu0URgEjDU3X4J8Lp7H3YAu90qMxNBVoKovEqqYroZOFVE/o0zvk3R\nh9oCoJ2IfI0zcuq/3PWDABGRB9zleKA18GeR488SkT9wvpx4gI/cb6zXARNVtWCEyRdxSg5jSrhe\nad4EnsAZp+cCnGqcPjiliB/dB30czsinxblCVX9x55j4EPiyYAwfETkXGCIiglO9VdzYPsHcj7ZA\nepCfp1Sq+oZbFTbI/fOqiLylqje5u7zqJs91IjIVZxC/QUA1EbnC3acaTmmiP+68Gqq6CafkgfOR\ni+cm/Iaq+pV73Avu+v4lHPIqzthArwEXAv4zoK3BuT9FB6c0YWQJwhT1Ac5D83Ocb4DD/Teq6nb3\n2/4gnKGU57jVUnHAgIIHvIg0wpnqsqjCdxBFFC3NeoAqqrqjmOsdWdqHUNXZIlJHRHoCTVR1hoic\nCfyiqme7MVblwKGii14fVZ0uIk8DE/6/vbNnjSoKwvCDVqYQq4BKEEGdRkQtROsUYqGQRmxURCzi\nRxDRQhETQSwFf0FAgoqoECMpJBgNWIiixhhhQA2kUMRKohGMisXMkrvLDfkgYQN5n2rv2bv3zJ3d\nPXPOO4c7ZraFGEBfEInUp8SAf2KK+5nOH/8oDy6l5DV6idn650o+IN/bABxw9ytAN9BtZteB14Qc\nRk1
fy4nSk8uIcqRv8jqNhDR0tHBuRQ4bncbECQpPEU2Zas1UJ7v7gJmtNbMW4JO7F/0zwdxXUmKe\nkMS0dJmqqE4zIQH0kMnfnG2Tr/cCXe7eS9TCHiN09sfkQJkD+FugYRb2PAH2mdmqPD5GzPTL+muq\n+WyxSEzxvm4Ss+Dbefwc2GVmG/O4nShJOR3X8l5aiXzJX3e/SiRb9xCDLcQAXJl0zcQfH5lFKVB3\n/+Lu29x9ezE4JN+ANqsuDLSZCBAV9qc964AdxFOA+4Hj2b467WwCBgrnNxLfT1UZS2p+Q1mMZ9TM\nmrPpEJHvKfIHKEpVN4jCRp01560nqqOJOqIAsXSZagthB/DMzF4S2vMI8Wet0Av8MrNhIpF4z92H\nCa15p5kNAreILaE/Z2qMuw8RctKAmb0nql1dJLZSjpf0V7T/ATCYM9ZiexdRzL4r+/hKPPr5Ttq5\nlZDUaqnyjbv/TlsuEYPWoJk5kUgfY3KQ7wMupAR1agb+eEi1rDJn3P07scLqMLMP6cPDhLxWoSG/\n1x4mk/OXCYlpKO0/6+4jhDQ3nvY/Ak66+w+qfVP2GzqYNrwickDnat7vA86njyAS0iuIVQ8AFnWj\nV7r7u1k7Qswrety3EHXEzO4C7Rn0FrKfTqB/MRVRypVpK7DJ3U8X2tuACXfXLqY6oxyEEPXlDDGL\nP7LA/SzGmeB9Qs7aXWnInV7NQEu9jBKTaAUhhBCiFOUghBBClKIAIYQQohQFCCGEEKUoQAghhChF\nAUIIIUQpChBCCCFK+Q+Lt8ikHd6LvAAAAABJRU5ErkJggg==\n", "text/plain": [ "<matplotlib.figure.Figure at 0x11d737d10>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "y_pred_prob = logreg.predict_proba(X_test)[:, 1]\n", "auc_score = metrics.roc_auc_score(y_test, y_pred_prob)\n", "auc_score\n", "fpr, tpr, thresholds = metrics.roc_curve(y_test, y_pred_prob)\n", "fig = plt.plot(fpr, tpr,label='ROC curve (area = %0.2f)' % auc_score )\n", "plt.plot([0, 1], [0, 1], 'k--')\n", "plt.xlim([0.0, 1.0])\n", "plt.ylim([0.0, 1.0])\n", "plt.title('ROC curve for win classifier')\n", "plt.xlabel('False Positive Rate (1 - Specificity)')\n", "plt.ylabel('True Positive Rate (Sensitivity)')\n", "plt.legend(loc=\"lower right\")\n", "plt.grid(True)" ] }, { "cell_type": "code", "execution_count": 269, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "0.64683339949421492" ] }, "execution_count": 269, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn.cross_validation import cross_val_score\n", "cross_val_score(logreg, X, y, cv=10, scoring='roc_auc').mean()" ] }, { "cell_type": "code", "execution_count": 271, "metadata": { "collapsed": false }, "outputs": [ { 
"data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYQAAAEZCAYAAACXRVJOAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3Xl8VPXVx/FPUBZFENCAUi0qlkOtVupSARUUd0UQaStS\nV5BaxS6UPo+l4tKFomgVxWqLKLVK0YpFUCtSBfERFBVUNj1AsLiggISAstPk+eN3x5nEZHIDmcwk\n+b5fL17JnTtzc2YS5sxvO7+8kpISREREGmQ7ABERyQ1KCCIiAighiIhIRAlBREQAJQQREYkoIYiI\nCAB7ZjsAySwz+xcwzd3viY6/ATgw0t1viG7LBz4C9gceA4a6+3sxr98OKAAWRDflRV/vcffxVYx1\nJjDG3f9ZhcfcDOzn7j8t59wzwC+BNsC97n6Umf0GWObuj5rZjcDb7v50FX7e/cCZwN/d/ca4jytz\njW8D9wFNgWLgBneftivXKufaY4H73f2t6rhezJ95LnCCu9+c5j6XA99z9/NrKi6pOiWEuu854FTg\nnuj4fGAq0Au4IbqtB/CKu38OnLcLP2Ozux+TODCztsAiM3vD3RftcuS7yd17RvG0AUqi21LftHoA\ni6t42R8BB7v7qt0I7RFguLs/bWbfAl41s1buvnM3rplwBvDnarhOVRwPtIxxPy16ynFKCHXfc8At\nKcfnA8OAx8zsEHf/D3Aa8CyAmb0P9AWaASOAFcCRQCNgsLvPquwHuvsqM1sGdDCzY4GBhE/DRe5+\nWvTJvB+wA1gKXOfua6KHX2hmw4C9CJ/C/xDF9WugN9A4utYv3X1K9JgjzGwW4U3pLeBad9+U8ly+\nZGbjgUXAFuA44HYzawLcC3zX3ZdH95tOaK08nfLYlxOvqZldC6yPHrcf4ZP+ne7+iJl1B+4GNgF7\nR9fdkRLGd9y9OPr+8Og6/y37OprZNcDVwDZgK3C1u78XJdx7gYOBhsBj7n6rmf0eaAtMMLPLovM3\nRNf+L/A/7v5KmZ9xefQa7QUcAnwA/Am4DvgGcJe732lmewP3R7e1Aj4H+kev+Y+BBma2wd1vjH5/\nlxF+v8uAK6Mf1zZqtX09Otff3d3Mmkev15HR83kxirU4atH1BrYD64Ar3H112ddKqofGEOq46A1u\nnZl928xaAB3c/TVCougd3e004JlyHv5d4Pbo0/9DlE4sFTKzLkB7YG500xFAtygZXAmcBRzr7p0I\nn9AfTnl4s+jndgEuMbOzzOzrhE/z3aLHDAd+m/KY9kAfd/824W96eCUhlrj7fcCbhMTyd+CvwKAo\n/vZAB8q8Ju7ejdAldgrwGqGldbe7Hw2cC/zBzE6I7v4t4CJ3/06ZZEAiGZjZcmAScJu7l/r0bGYN\ngLuAs9z9BGAscFJ0+hHgQXc/HjgBOMPMvufuw4FVhDfaN4BRwDXu/l3gxiju8pwEXO7u3yB0r13k\n7j0IrcXfR/c5B1jv7l3dvWP02l3n7q8TWiSPR8mgFyEZnBD9Pt4HBkfXOBT4SXT7/xG684ie55vR\n8zkGyAd+YWYHAT8Djo+ew/To+UqGKCHUD88R3gzOAf4d3fYMcGY0BlDi7kvLedxKd18YfT+f8Mmw\nPHub2Xwze8vMFhJaFv3d/ePo/AJ33xR9fzYw3t23Rsd3Az3MLNFaHefuJVH31STgDHf/ALiCkCBG\nEj6R7pPy8//p7oXR9+MJ3SZxJcY87gcuNbM9CIlhXNk36TKP6QA0TrRS3P0T4Mno+QF86O4fpfvB\n7n44oYUwzMxOKXOuGPgHoTtpDLAReDD6pN4d+J2ZvUVITAcDR5fznCYCT5nZA4Tf3agKQnkjpQvs\nfcIbL4SxocZmtre7Pwk8bGbXmdlowt/TPl+9FKcBT7j7xuh5/NLdR0bnXnf396Pv3wZaR9/3BK6O\nns88QhfUkdHr9zbwlpndDrzj7l
MreA5SDZQQ6odphDeRniQ/9c4AvgOcTtRdVI4tKd+XkHyjKWuz\nux8TfRo+yt17uPv0lPNfpHxf9m9uD0LXZeLaqV0necAOM/sOMIfQengeuK1MLF95TAVxVsjdlxEG\nxi8gdIWMq+CuiSRR3v+dBoQuDyj9nL9kZg3N7KKUn7sSeIHwuygb02WE39ky4Hrgn4TXC6BL9Hp/\nh9CaGlnO428EugJvEBLqaxU8p21ljr/y+kXdVw8SusEmEJJNeX8PO0kZKzCzfaMPHWWvm/r3tAfw\n/ZTncwLwk+g5nAJcDnwG3BUlI8kQJYT6YSbQCehGeEPF3bcQPvVfR8UJIa6KEkV5ngeujD7pAvwU\nmJXSrXIZgJm1BC4itG66ET7FjgZeBvqQfGME6BW98exBGPT9V8xYdpJ8A4cw8+d24DV3/7SCxySe\nqwPbzeyCKN62hL74f1fwuPCg8Dx/b2b9Uh53ClBqbMbM9jOzD4B10Qyx4cDRUcvpNaLulqgbcDbJ\n7r+dQEMz2yMaQ9nH3ccC1wIdzSz1+caReL5nElp24wkJ6nySv4PU1/EFwjhQovVwCzCkkp/xPPCL\n6Pk0Bp4Grou6ORcB77r7bYSupW9XMX6pAiWEeiDqnlkKvBe9oSQ8S+iyeCnltl2ZCVKVxzxIeNN4\n3cwWExLVJSnX2WBm84BXCP3zLxM+jeZH93+T0H3SysyaRo9bEj2XdwgDtLfFjOtp4A4zuzQ6fobQ\nDZJulk5ittJOQmvi52b2DqGb5ZY4g+7R466JukimEsYx5qfewd3XAb8DZpjZm4QWwMDo9A+Bzma2\nAHgVmODuE6NzTwGPE2aW/Qz4e/R6/gO4sux4RkXPr5zjO4Afm9l8QtKbR/jbgTAI3MvM7nb35wjj\nMXOi16UNydlsFfkpodtxIaGL6B1glLsviJ7LPDN7gzA4XVlykd2Qp/LXIoGZdQX+4u5HZTsWkWzI\n+LTTaNbFre5+apnbzyfMfNhBaIpW1GcrknFm9lfCOMulldxVpM7KaAvBzP6H8B/sC3fvmnL7nsC7\nwLGEgcvZwHnuvjZjwYiISFqZHkNYThgALOubhPIBG6M+zVcIA4ciIpIlGU0I7j6ZMAOhrObAhpTj\nz4F9MxmLiIikl63SFRsJSSGhGVBU2YNKSkpK8vKqMsNRRESIOTW8phJC2WDeBQ6P5lBvJnQX3V7p\nRfLyWLv288rulnX5+c0UZzVSnNWnNsQIirO65ec3i3W/mkoIJQBmdjHQ1N3HmdkvCHO38whlAj6p\noVhERKQcGU8I0dL8rtH3E1Nuf5bdXyErIiLVRCuVRUQEUEIQEZGIEoKIiABKCCIiElFCEBERQAlB\nREQiSggiIgIoIYiISEQJQUREACUEERGJKCGIiAiQvfLXIiL1WkHBSvr2ncq6da3Iy/uIQw89ig4d\ntjJqVA9atmyRlZiUEEREsqBv36msWjUMeAy4mnffzePdd0uAR3jggfI2msw8dRmJiGTB+vUHEar/\n70Nyy5g8Vq5sXvGDMkwtBBGRXVBYWMR11z3D0qV7sd9+K1i8eDMbNnydli0/ZPLk3hx6aLu0j2/Z\n8kO2bCkh7CBcQkgKJbRrt7EGoi+fEoKIyC64/vqZTJlyKeGNfCRwA5DHli0l9Okzkrff/knax0+e\n3Js+fUaybl1L8vJu4tBDj6RDh22MGnVqDURfPiUEEZE0CguLuP76maxc2Zx27TZ8OegbunYSXT1f\nI7XbJ3QHpXfooe0qTRo1TQlBRCRF2QSwffsOnntuIJDH228nB33btdsQHecBH5Pa7dOy5UdZfAa7\nTglBROq91CSwZs1iVq26FmjJ22+X0KLFeMob9B01qgeNGz8WjSE0YvHiEdEYwkdMntwrW09ltygh\niEi9k1gDsH79QbRs+SFHHLEfL7xwNeGNvzdhKujF0fFnlDfo27JlCx5//GLWrv0cOC0rz6O6ZTQh
\nmFkecB9wNLAVuMrdV6ScvxT4JVAEPOzuD2UyHhERSF0DEAaB16wZS2orAJpG35fQpUszGjV6JOpC\n2pjVQd9My3QL4QKgsbt3NbMTgDuj2zCz/YDfAp2AjcALZvaCu3+Q4ZhEpI5LdAEVFOxBYeFK9tuv\nA4cdtunLAeHkGgCAPIqLS7cC2rZdROvWxVEC6Jm1lcM1LdMJ4SRgGoC7zzWz41LOHQa87e4bAMzs\nDaAzoIQgIrts3ryFnHfeMxQXHw68C1zNqlWHsHBhckA4uQYgJID8/K107pzaCri03iSBVJlOCM2B\nDSnHO82sgbsXA8uAb5lZPrCJ0AnnlV0wP79ZRgKtboqzeinO6lMbYoRdj/PCC6dRXDyCxJs93Ab8\nCshj1aqW5Oc34+WXL+GUU26jsLAtrVqt4qWXrqJ9+/QLyao7zlyU6YSwEUh9tRLJAHcvMrNfAE8C\n64B5hNGbtMIATm7Lz2+mOKuR4qw+tSFGSB9nsjtobwoLnVatDqF9+51fdgdt3XoIpccD2kbfl9C2\n7XrWrv2c5s1bMX/+4FLX3ZXXpTa9nnHESghm1sjdt5vZ4YABzyXe2CsxG+gJTDKzzsDClGvuARzj\n7t3MrBEwHfh1rKhFpN5JJIJZs1ZTVPRLEi2AVaseY9Giy0h0BzVuvIKtW5PdQbCEo46axGGHba7T\nA8LVodKEYGY3AYeb2XDgZWAJYWB4UIzrTwbOMLPZ0fGVZnYx0NTdx5kZZjYf2AL80d0Ld+lZiEid\nVlhYRI8ej7Bq1ZFAMaVbAKE4XGJ9wNSp59Kr101s23YojRu/z9SpvejU6ajsBF7LxGkh9AJOBIYA\nj7r7/5rZm3Eu7u4lwDVlbl6acv63hJlGIiJfWreuiEGDpqasFt785TRRmEDqjCD4gtT1AZ06HcUH\nHygB7Io4CWEPd99mZj2B4WbWgOQkXRGRapG6WAyWsmXLVcAh0WrhO0i2Cs4D/kjz5i3YZ581tGrV\njvbtH1F3UDWIkxBeNLNFwGZCl9EsYGpGoxKReiMxNvDss1vYubMd4Q1/X1JnB8F+JFsF+9K27XZm\nzjyrXk4NzaRKE4K7/9LM7gE+cvdiM/uJu79dA7GJSB1UWFjEkCHPMnv2RjZv3hf4kJ07hwItCW/6\nibIRydlBXboUl1ktXD/XCWRanEHllsCNQHsz+z7wUzMb6u7rMx6diNQphYVFdO/+AKtX70Xy039q\nEkgMEpfQpMkyOnZ8KkoAag3UhDhdRg8QpoR+l7C1zyfAo4R2nYhIWoWFRfz858/z2msN2LTpP+zY\n0RzoQEW1gxo0WEjbtot58smLKt11TKpXnIRwqLuPNbNr3H07cIOZvZPpwESkbrj++plMmxb2EwjD\nj4UkZgYlWwiLadHiM7p335NRoy6lQ4eDa8WCr7omTkLYaWb7En5rmNk3CBOBRUQqVXpnsS+ir+cQ\nuomasueeb3DGGW0ZPfoMdQtlWZyEcBPwEvB1M3sK6AIMyGRQIlK7pHYLwWd06bIPo0efT8uWLcrs\nLHYOcBcNG95H06YH0qXLJkaP/pESQY6IM8voeTObB5wA7AFc7e6rMx6ZiOS8RCJ44YX/snPnPiSm\njD733EQaNZrJAw/0YdSoHmzf/iCvvtoAWEeXLgcyenT9KSldm8SdZfR9YH9Ciu9kZolVxiJSDyUS\nwYsvrmbHjl/z1dlCzVi58r9A2Fns4Ycvyl6wElucLqOngDXAYqJxBBGpn5IF5nZSVLQXcARfrStU\nAnxOu3Y7sxWm7KI4CaGVu3fPeCQikrMKC4sYPPhZZsxYS0nJcJItglspPVtoAQ0bLuD001sxalTP\nLEYsuyJOQlhoZse6+7yMRyMiOSWxqviFF1axY8dxhKVIG4AWhCRwCDAR2E7Dhis4/XSND9RmFSYE\nM3ufkPL3Bi4ys4+BnUQfBdz9sJoJUURqUqJbaOXK5qxZs4RV
qw4BribZCpgI9AdKaNhwBU2bNopm\nFQ1SIqjl0rUQTqmpIEQk+xLVRj/99ACKi1cD3YEGhESQOk6wHfgrTZoU8M47mjJalzSo6IS7r3T3\nlYQtMG+Lvt8beARoUkPxiUgN6dPnKVatGkZx8QBCnaGJhIVkG0nOJykBCmjTpoBZsy5SMqhj4owh\njAN+A+Du75rZ74AHgZMyGZiIZFZioHjWrDyKiz+juPgAvroX8Tm0aXMn27bdAexHly7FWkhWh8VJ\nCE3d/bnEgbv/28xGZTAmEcmwGTPm0K/fTKAroRXwA+B2Ss8YWkbv3k8zapQSQH0RJyGsMbMfEyqc\nAvQDtFJZpJYqLCyKksFISi8oa0WYRvo14GNOO60FDzzQJ3uBSo2LkxCuBO4jfHzYTtg1bWAmgxKR\n6pUYMC4qOoji4qWEavZly0/n06TJcjp2bE67dmhLynooTkIwdy+1wsTMLgT+WekDzfIIyeRoYCtw\nlbuvSDn/Q+AXhOms4939z1WIXUQqUVCwkl69HmPt2sbAQYRCxUOBP1O6e2gObdo0ZOpU7UFQn6Vb\nh3AR0Bj4rZndVOYxvyZGQgAuABq7e1czOwG4M7ot4Xbgm4T9mpeY2UR331DF5yAi5SgsLOLUUx9n\n69bDgYYk9yp+DGhHovw0zGHSpFPp1q1r9oKVnJCuhdCcMOLUDEhtO+4Eboh5/ZOAaQDuPtfMjitz\n/h2SG6mCaiWJ7LZkvaHVbN36W75aeK4pTZq8QceOx9KuXRGjRg3WoLEAaRKCuz8APGBmp7n7i7t4\n/eaEde4JO82sgbsnNthZDMwjTHP4p7tvrOyC+fnNdjGUmqU4q5fiTO/11xdw8smPsn17c8JnrI1A\na8orPNegwWssWnQ17dvndteQfuc1L84YwjYzm0L4a8oj7InQzt0PifHYjYQWRsKXycDMjiK0YdsB\nm4AJZtbX3Z9Md8HasK1efn4zxVmNFGd6hYVFdO78d0pKOpHcqL68wnNvc8AB85kypT/Nm7fK6ddU\nv/PqFTdpVbhSOcU4QgnsPYE/AcuAyTHjmA2cC2BmnYGFKec2EMYOtrl7CaHEdsuY1xURYN68hRxx\nxP2UlHQgTAJMbREcAvyRFi0eonfvR3C/gk8+uVmDxlKhOC2ELe4+3swOAdYDgwjdPHFMBs4ws9nR\n8ZVmdjFhsds4MxsLvGJm24AC4K9Vil6kniooWEnPnhNZt64EGEFIABMo3SIooW3b7cycqb2KJZ44\nCWGrmbUCHOjs7jPMrGmci0ef/K8pc/PSlPN/Af4SN1iR+i5MI53I2rUlwJEkCxBD6IH9A2GB2aec\ndtp+3HffpUoGEluchHAn8DhwIfBGtHZAeyOI1LDkNNKWhAHjZsAikq2CfYFNnHNOK0aPvkqJQKqs\n0oTg7k+Y2SR3LzGzY4EOwNuZD01EINQd6t9/FsXFhxPe/BsBHxPmd1wM3EYoRLeE6dN70anTUdkL\nVmq1ShOCmRnwIzMrO+A7IDMhiQiE7qE+fZ7i00/3A75FcmHZDcCPgPHASqAFjRot5plneisZyG6J\n02U0mbCiZUGGYxGRSGFhEd27T2D79j+QHCT+G3A5cDgwBziA1q3X8PTT3TRzSKpFnIRQ5O6/zXgk\nIsKUKdMZNOgNQs/swZSeRloS/VtO794dGTVKexdL9YqTEP5qZiOAFwlTGgBw95czFpVIPZMoNzFl\nylKSZanLTiP9GBimukOSMXESwinA8YS6RgklQI9MBCRS3xQWFtGt24OsWXMgYXfaDUALwpjBSOAo\n4HMaNixk0aKfq1UgGRMnIRzn7t/IeCQi9VBhYRE9ejzCmjU3k2wJTAT6EwaQN9KgwVpat17FlClX\nKhlIRsVJCAvN7NvurkFlkWoyb95Cevd+iu3b9wXaUHqs4HPgIWAp48cfz3nnnZmtMKWeiZMQDgPe\nMrNPSBZLKXH3wzIamUgd
VVCwknPOmQocQdinYAelxwqaccABK1iwIG6VeZHqESchXFD5XUSkMomB\n43/960NK72f8IGFm92bgU9q02cqUKRdlMVKpr+KsVF5ZE4GI1FWFhUUMHvwsM2aspaRkOPAMpbuI\n2gA9adt2JDNnXqZxAsmaOC0EEdlFM2bMoV+/mYRJek1IjhGkdhG9Rffu7zN2rArRSXYpIYhkyLx5\nC1OSwReEIbgSwhYhE2nQ4HMOOGAdkydfqJXGkhPi1DJqBRzj7i+Y2TDgGOBmd1+S8ehEaqGCgpX0\n7TuVVataEdYQdCNMIf0b8Edgf5o0KeCdd36kFoHklDg7pk0EOprZ6cD3ganAnzMalUgtNWXKdLp0\neZRVq4YBPyasJ3iO0D3UEiiiTZsCZs26SMlAck6chNDS3e8FegN/dfdHgL0zG5ZI7TJjxhzy8gYz\naNA8Qh2iiUARqZvbwxwmTTqZhQv/V11EkpPiJIQG0T4IFwDPmFknNPYgUkq/fk8QNqzpCKwmjBs8\nR2Jz+yZNbmL69F6qQSQ5Lc4b+/8CtwN3uPsKM3sNGJLZsERqh8LCInr1+g2wP6XXFtxG2NVsGHPn\nXqIWgdQKcRLCwe7+ZSE7d+9sZoOBmZU90MzygPuAo4GtwFXuviI614awGicx/64TcL27j63ysxDJ\ngoKCldGWlseQnFJK9DXsYDZ+/PFKBlJrVJgQzOznQHPgx2aW+he9J/BD4E8xrn8B0Njdu5rZCYT9\nmS8AcPfVwKnRz+oM/B54YFeehEhNKiwsYsiQZ5k27QNKSjoTppS+R+m1BdrOUmqfdC2E5cCxhL/w\nvJTbtwFXxLz+ScA0AHefa2bHVXC/McDF7l4S87oiNa6gYCW9ej3G2rV7AcXACEqXn7gFaA8sYdKk\nU5UMpNapMCG4+zOEQeR/uPu7ZtbS3ddX8frNCcXdE3aaWQN3L07cYGbnA4vcfXkVry1SYwoLizj5\n5AfYubM58Gu+Wn6iNWEweQnDh+dr8FhqpThjCI3N7D1gbzPrAswCfuDu82M8diNh6kVCqWQQuQQY\nHStaID+/WeV3ygGKs3plM86RI+/j17/+gDCDqPzyE3vt9SYLFw6kffvcHy/Q77x61ZY444iTEO4B\n+gB/d/ePzewawsK078Z47GygJzApGidYWM59jnP3V+MGvHbt53HvmjX5+c0UZzXKVpwFBSvp02cy\nn366ha9uaxnKT+TlbeDssxsyevQg2rc/OOdfT/3Oq1dtijOOOOsQ9nb3dxMH7v5voHHMOCYD28xs\nNmHN/hAzu9jMrgIws/0p3aUkkhPmzVtIly7j+PTTBoTN7scRFpqdR/hTfpm8vAW89trJPPywVh1L\n3RCnhVBoZkcTPhZhZj8ECuNcPBokvqbMzUtTzn9GqI0kkjNChdIZhEbwxSQHjh8D+gFbadz4dV5+\nWesLpG6JkxCuAR4GvmVmGwhv6JdkNCqRLCgsLOIHP3iABQu2AydSeoJdHrAFGIZZEVOn3qBWgdQ5\ncTbIKQBOMrOmwB7RbRszHZhITZo3byHnnPMQYcVxB2AxcCCpA8d5eUt5773BSgRSZ1U6hmBmPc3s\nNsL/irnAimilskidEPY4fgI4gORGNj8G3ieMF9wH/A/PP99LyUDqtDhdRjcDlxI6T18HBgMvEW+l\nskjO69t3KnA8Xx0v+DqwipYtP2PatCs0XiB1XpxZRrj7e4TpFVPd/QugUUajEqkBEyZMpnXrEaxa\ndQChNZCY8JYHNAWWMWZMO9xvUDKQeiFOC2G1mY0BjgMuMbM/Ah9kNiyRzCkoWEnv3k+yZs0OSlco\nnUjY0CbsXTB+/PGcd96ZWYxUpGbFaSFcDLwBnOrum4AV0W0itU5BwUq6dv0ba9bcRBg8Tp1F9Dlw\nP61b38LcuZcoGUi9E2eW0eeEzWATxxo7kFopzCR6glCCIg/4mNIVSt9nzZobshihSHZp5z
Op8woK\nVnL66feyaVMLIJ/klpb9gVsJhemWMWbMEVmMUiT7lBCkTpsyZTqDBr0CfA04CPiUUJX9MUJiaAgs\nYu7cgRo4lnovbUIws28Am9x9VVR/6NvAK+7+jxqJTmQ3TJgwmSFDlvDVKaXJweO8vNk8//z3lQxE\nSDOobGZDgOeBV83sIcI6hPeAgWZ2Yw3FJ7JLQjKYR6i+vpPSg8fbgfvo3n0M7703WBvZiETStRAG\nAEcAbQjr+Pd3961mNo4w6+h3NRCfSJXcddc4Ro5cTegOOoHQMphI6cHj9xg5sh0DB16ZvUBFclC6\nhLAHsM3dV5rZHe6+NebjRLImJIORhB3NICSBcwljBpuAxUyadLZ2NBMpR7p1CJOAl81sD3e/BSAq\ng/0KoDEEySkzZsyhdethwOEk1xRsJLQIWgD92HPP5bj/QslApALp9lS+ycy6uft/U27eCtzs7s9l\nPjSReMaOncDw4SuAbwGfAGOBs4DphJ7N1rRo8SHPP3+JitOJpJG268fdXy5z7IBnNCKRmAoLi7jw\nwrtYsqQhkCjIW0JYRzkB2EyHDh/zyitDshmmSK0Rq7idSK4pLCyie/cHWLKkCaFlkDqLqAQ4kgYN\ninn66d9mLUaR2ibOfgh71EQgInFNmTKdjh3vZfXqtoRksI1oh9fo62fAbKZN66kuIpEqiDNb6A20\n77HkiLDy+FXgdpKtgQcJs4h2EOoTfcqkSedqfYFIFcUtf30y8Lq7b6vKxc0sj7Dd1NGEAemr3H1F\nyvnjCVtSQagpcIm7b6/Kz5D6I7nyODGTiOjrnoR1kzdy5JF5PPnkL9UyENkFcRLCccAsADNL3Fbi\n7nG6ki4AGrt7VzM7Abgzui1hLNDX3VeY2QCgHbAsbvBSfySTQQeggNILzT4GfoX7dUoEIrshTvnr\n/N24/knAtOg6c83suMQJM+sArAN+YWZHAs+4u5KBlFJQsJLDDruPL75oTtjv+AtgEKFK6ZGE9Qar\nGTPmOCUDkd1UaUIws70J+yqfFt1/BnBjtFlOZZqT3JcQYKeZNXD3YmB/oAtwLWHTnWfM7E13f6lq\nT0HqqoKClXTp8hDQirCeIHW/45bAamAZ48d312Y2ItUgTpfRvcBmQm2jPMLHsz8Dl8Z47EZCdbGE\nRDKA0DpY7u5LAcxsGqF76qV0F8zPb5budM5QnLtn2bKVdOlyH6EVUEj4XNGC5H7H79C8+Qbmz/8V\n7dvnTqXSXH09U9WGGEFxZkOchHCsux+dcnydmS2Jef3ZQE9gkpl1BhamnFsB7GNmh0UDzScD4yq7\n4Nq1n8csPKjfAAAWj0lEQVT80dmTn99Mce6GsLPZU8CfqGi/4zFjjuCii/oAufM3kauvZ6raECMo\nzuoWN2nFWZjWwMy+7JyNvt8ZM47JwDYzm02YTTTEzC42s6vcfQcwEJhoZnOBD1QSQ+66axznnDOV\n5DaXkCxZ/RAwjJEj232ZDESk+sRpIdwJvG5mT0fHvQjlJCvl7iXANWVuXppy/iVCjWIRhg+/g7Fj\ntxIGjMuWrG4ELIjKVl+SxShF6q4KE4KZXeTujwNPExandSe0KC5094UVPU5kVwwdOoJHHgEwSpes\n3kIoWPcZJ574HwYOvCF7QYrUcelaCL8xsyeB6e5+DLCohmKSeiSsL3gV2A84hNCAXE+YRRQWm8Em\n7WEgUgPSJYQ5hCIxeWaWWgI7j/gL00QqNG/eQoYMeYswA/m3JLuHbiCMIXwMrGP58l/RvHmr7AUq\nUk+k2w9hADDAzKa4e+8ajEnqgRkz5tCv3z8IO7QeSukB5PbAF+y996fMnDmY9u3b1YqZHCK1XaWz\njJQMpLpNmDCZfv2mEZLBCKAJpauVLmPAgBX85z+/59BDc2eNgUhdp72RpUaF3c3+AxwFFPPVAWRn\n6NBmXH/9z7IXpEg9pYQgNeauu8YxcuRqkjWJ3id1z2
MYxvjx31UZCpEsiVPL6F/AeOCpaDGZSJWE\nlcf3Al8nLGFJDB6PIcwiOgxYypgxRygZiGRRnJXKtwJnA8vM7E/RHgYiscyYMScqQ3EYoWWQOnh8\nCFDMHnssYfr0Xlp9LJJlccpfvwy8bGZ7Ad8DnjSzjYS6Q/dXddMcqT/mzVtIv34vkNzm8gtKrz6e\no5XHIjkk1hiCmZ1CqG56JvAc8DhwBjAVOCtTwUntFcYLVhGSwVrCvgVDCIPHTYE5DB+er2QgkkPi\njCGsJFQmHQ9c5+5bottfIpS0ECnlsMNO4YsvzgBGkWwN/I1Q3/Aw4B3699/ET3+qMhQiuSROC+E8\ndy9VtsLMOrv7a8AxmQlLaqvhw++IkkHZ8YJWhEHldxk58jC1DERyULridicCewDjzGwgyf/dDYH7\nCZvbigCJmUQPAAcS/jReB84n2UL4HFjO+PGdNZNIJEelayGcQahweiCh0EzCTuAvmQxKapcpU6Yz\naNDrwEEkaxKtJ9Qk6gB8AKxl6NA2SgYiOSxdLaNbAMzsUnd/pMYiklol7GGwDTgR+IxkQ7IlcDjw\nHvA4a9YsyFKEIhJXui6jW6Kk0MPMTi17Pip+J/XYjBlzomRwBLCY0HhMnVa6lK9/fQX//vfLWYxS\nROJK12U0L/r6Ug3EIbXMV8tQ/Bi4DbgFOBhYxplnfsSjjz6YvSBFpErSJYR3zOzrwMyaCkZqhylT\npkfJILUMxWOElsJ7wHxGjjyKgQOHZzFKEamqdAlhFsn2f1klhAnlUs+MGHEvd9+9gTBYnDqttClh\nWco6Jk26ULubidRC6QaVD93di5tZHnAfcDSwFbjK3VeknP85cBWwJrrpandftrs/VzIjmQxGAhMp\nW4YCNjB9+uV06nRUFqMUkV1V6aCymT1U3vmYg8oXAI3dvauZnQDcGd2WcCxwqbu/VZWgpWYVFhbR\nseNFQDdCy+BW4DySexisADYwdOjXlAxEarE4g8qzduP6JwHTANx9rpkdV+b8scAwMzsQeNbdb92N\nnyUZEDa0WUgoeHsjyRbBbcD1hLUGmxk//nStMRCp5Sosf+3uT0dfHyYUtCsEVgNPR7fF0RzYkHK8\n08xSf+ZEwvSUU4GTzOzcKsQuGRaSwXLgXuA7lB4zaA0MAz5h7tyrlAxE6oA4xe2+D9xN6CRuAIw1\nsx+5+7QY198INEs5buDuxSnHd7v7xujnPEt41/lXugvm5zdLdzpn1PY4hw//IyNGvEsoV5VHKD1R\neo3BXnutYuHC39G+feb3Pa7tr2cuqQ0xguLMhjjF7YYDx7r7JwBm1o5Q9jpOQpgN9AQmmVlnYGHi\nhJk1BxaZWUdCR3QPoNJJ62vXfh7jx2ZXfn6zWh1ncvC4NbAvIQGcS2jQbQfeY599JrNixZtA5n8n\ntf31zCW1IUZQnNUtbtKKkxB2AJ8mDtx9pZntjBnHZOAMM5sdHV9pZhcDTd19nJkNIyx82wq8GLPV\nIRl01VXXM3XqPoQ1BcUkB4/3IeyB/Bl9+27m/vvfzGKUIpIJ6WYZXRZ9+z7wtJk9TKhNcDHwTpyL\nu3sJcE2Zm5emnJ8ATKhKwJI5F174I155pR0hGXxMGDbal/ArLwFmM3JkB5WuFqmj0rUQEvWLvoj+\nJQZ8N1H+YjWppQoLi+jW7Q7WrDmZkADOi77eTJhFdDjgTJrUQwvOROqwdAvTrqzoXLS/stQBBQUr\n6dLlT4T1g6llKC4mJII9gXmMH99VyUCkjoszy6gvcBOhEzmPsGnOXoQRR6nF7rnnr/zsZ4sIy0FS\np5TuQ0gM7wHrGTOmi6aVitQDcQaVRxHKSwwFRgBnAftnMijJvGS10tv5ahmKBcBsmjefyxtv/I2W\nLVtkMVIRqSlxEsJ6d58Zbam5b1TOYl6lj5KcdfjhPdi48TSS+x6fS+gm2kH4k/iMM89cw6OPTs1i\nlCJS0ypcqZxii5
l1AN4FTjGzRoQRR6mFJkyYnJIMviC0CFoA/YAPgQUceeRCHn30z1mMUkSyIe7C\ntN8DlwK/Aq4GxmUyKMmMwYNv4oknGhMK1H0BnENoGTQFXgM20bPnRh56aHIWoxSRbKk0Ibj7LJIF\n7o43s5buvj6zYUl1KihYyYknjqC4uB3J0tXnEBabJ5JBIf37N2T06HuyGKmIZFOcWUYHAfcApxDq\nFrxgZkPcfW2GY5NqMGPGHPr1exRoR+kxg+cIFUMWA6sZOvRQrr/+Z9kLVESyLk6X0UOEEhSXE95N\nBgLjCTWKJIdNmTKdQYP+DnyD0E30OnAy0JIwZjAM2MyaNaOzF6SI5Iw4CSHf3e9POb7LzC7PVEBS\nPWbMmMOgQS8TkkHq3sc3A8cRitd+xKRJl2YvSBHJKXFmGb1uZv0SB2bWE1Blsxw2YsS99Os3AziK\nsMgssSVFHnAIMIe8vOUsXz5Cq49F5EvpitsVk1ytNMjMHgT+S3iHWU9YrCY55uyzL2X+/I7AiYSZ\nRIMJW0z0J7GPQf/+mxg9enytKd0rIjUjXS2jOK0HySEXXvijKBmkdhE9BhQRhoKWcuaZHzJ6tNYY\niMhXxZlltDeh4/m06P4zgBvdfVOGY5OYCguLOPnka1i79kiSM4mIvjYlzCTaybXX7sEttygZiEj5\n4gwq3wtsBgYQdR8BfyYsVJMsC7ubfQx8FziI8OafmElUQhg8Xsv06QPp1Omo7AUqIjkvTkI41t2P\nTjm+zsyWZCogiS/sbtYY2A+4kWQ30W2ETW7mAAVMnz5EyUBEKhVnnKCBmX1Z7jL6Pu4WmpIh/foN\nZurU5sDxfLV8dWtCMljBmDHnKxmISCxxWgh3EqaePh0d9yKMWkqW9Os3mBkz2pIcL/ic0uWrl9K3\n73ruv//BLEYpIrVNnITwNPAG0J3QorjQ3RdmNCqp0ODBNzFjxgEkC9SVkCxf3RSYTd++Rdx//x1Z\njFJEaqM4CeH/3P2bwKKqXtzM8gh7Mx4NbAWucvcV5dzvL8A6d/91VX9GfXLssefz4YcnAAYsBRoB\nBwAPAnsDb9C//w5Gj1YyEJGqi5MQ3jGzSwmFcLYkbnT3D2I89gKgsbt3NbMTCN1PF6TewcyuBo4k\nWVFVygi7m70DdKb0GoObgHXANmAJAwa04tZb/5C9QEWkVouTEE6I/qUqAQ6L8diTCDWWcfe5ZnZc\n6kkz60IYFf0L0DHG9eqdKVOmM3LkUkJLoAOlB48Pjb5fyMiRxzNw4CXZCFFE6og4+yEcWtl90mhO\nspAOwE4za+DuxWZ2AGHB2wXARbvxM+qsE0/sy7JlxxBy5lLAKT14/D6wg/HjT+K8887MXqAiUiek\nq2XUlrAo7RvAK8Awdy+q4vU3As1Sjhu4e3H0/fcJE+j/BRwI7GVm77n739JdMD+/WbrTOWN34zz9\n9EujZJDaRTSU0E10KLAcWMMllzThiiv6Zi3OmqI4q09tiBEUZzakayGMB+YBYwmf4O8Crqzi9WcT\n9k2YZGadgS9nJ7n7GGAMQFRO2ypLBkCtKMa2u0XjwrTSrxG6iCYSZhG1IAy17AksoWfP1Tz0UNjd\nbFd/Vm0pbqc4q09tiBEUZ3WLm7TSJYSvuftZAGb2IvD2LsQxGTjDzGZHx1ea2cVAU3fXvsxlhN3N\nniCUoBhB6QJ1/QjdRiXRtFJtdSki1StdQtie+Mbdd5jZ9jT3LZe7lwDXlLl5aTn3e7iq165rLrts\nCNOmtSRsdXk0pQePtxB2N1vO+PEXa7xARDKiKiWuSzIWRT0XkkEjoBXwNcIM38TLHVYew/tce60p\nGYhIxqRrIXzLzFIXkX0tOs4DStw9zrRTqURIBhCSwa8IL+96Ure63Hff13j99Udo2bJFxRcSEdlN\n6RJChxqLop7q3v0HvPvu0YSXuphkN1FLwlaXs9lnn6dYtkw7lopI5qXbMW1lTQZS
n8ybt5BzzrmD\nMFaQmFY6gtJrDJwjj5zHjBlKBiJSM+KsVJZqFLqIthFaBR2AWwn7HV9C2NOgA7CEa6/dk1tumZy9\nQEWk3lFCqEFDh45g2rR9gfbAvsB50dfbgOsJ3UaLGTPmW1x0UZ/sBSoi9VKshGBm3yKMeiY6uXH3\nlzMVVF00dOgIHnmkGLiD0usLLiZsaDMMWMikST+lW7eu2QtUROqtShOCmf0JOB9YQem5kD0yGFed\nYnY669efSqjfl7ryeB8S00oPPngG8+a9mMUoRaS+i9NCOJNQVmJLpfeUUsaOncDw4UuBUyldkyix\n8tgJG9qs5/77lQxEJLviJITE2gOpgrPPvpT58w8GuhD2BkpdebyZkCBWM2BAQ269VRvaiEj2xUkI\nhcASM5tDeGcDwN0HZCyqWu6KK/6X+fM7EgrFrgQ+ItQFTLQQCoB1TJ9+OZ06HZW9QEVEUsRJCNOi\nfxJDqFTaltJdRLcANwCHA0vZf/9nWbJkTvaCFBEpR7r9EA5w90+BmTUYT60XksERlO4iakdoIfyS\n/fefrmQgIjkpXQthHGEvg1mUXkKb+KpaRuXqAHxM6ZdqBTCM737XeeYZJQMRyU3pEsJg2O0tNOsh\nJ1T8vg1oCywDPmH48CP46U8nZjUyEZF00iWEOWb2BfBvYDow091zf2ugLDvzzI+YPv0+wmrk/wCf\nMnLkcQwceEl2AxMRqUS64nZfM7P2wMnABcBtZvYZUYJw99dqKMZa5dFH/1xrttUTEUmVdpaRuxcQ\n5kj+1cxaAL0JO73fADTOfHgiIlJT0s0y2hM4CTgbOAvYC3gBuAmYUSPRiYhIjUnXQlgPvAo8AfRx\n9//USEQiIpIV6RLCX4DTgAHAQWY2HXjV3YvjXtzM8oD7CDvBbAWucvcVKef7kqz7/Hd3v6fqT0FE\nRKpDg4pOuPsv3f07QF/gfeA6YKmZTTazH8e8/gVAY3fvSqjvfGfihJk1AP5AqJraFbjWzFrt2tMQ\nEZHdVWFCSHD3VcDfgXuBsYQCPTfFvP5JRGUv3H0uYdf4xHWLgW+6+xfA/lEs26sSvIiIVJ90g8q9\nCZ/cTyKsSn4NeBG4yN0Xx7x+c2BDyvFOM2uQ6HZy92Iz6wP8CXgG2FT1pyAiItWhspXKLwI/B+ZV\nZewgxUagWcpxg7LXcffJwGQzexi4DHg43QXz85ulO50zFGf1UpzVpzbECIozG9ItTDuzGq4/m1AP\naZKZdQYWJk6YWTPgaeBMd99OaB1UmnRqw4Kv2rIwTXFWr9oQZ22IERRndYubtGLtqbwbJgNnmNns\n6PhKM7sYaOru48zsUeBlM9sOLAAezXA8IiJSgYwmBHcvIVR6S7U05fw4QlVVERHJskpnGYmISP2g\nhCAiIoASgoiIRJQQREQEUEIQEZGIEoKIiABKCCIiElFCEBERQAlBREQiSggiIgIoIYiISEQJQURE\nACUEERGJKCGIiAighCAiIhElBBERAZQQREQkooQgIiKAEoKIiESUEEREBIA9M3lxM8sD7gOOBrYC\nV7n7ipTzFwM/A3YAC9392kzGIyIiFct0C+ECoLG7dwWGAXcmTphZE+C3QHd3PxloYWY9MxyPiIhU\nINMJ4SRgGoC7zwWOSzm3Dejq7tui4z0JrQgREcmCTCeE5sCGlOOdZtYAwN1L3H0tgJn9BGjq7i9k\nOB4REalARscQgI1As5TjBu5enDiIxhhGAd8ALoxxvbz8/GaV3ysHKM7qpTirT22IERRnNmQ6IcwG\negKTzKwzsLDM+bHAFne/IMNxiIhIJfJKSkoydvGUWUbfjm66EjgWaArMA94A/i86VwLc7e5TMhaQ\niIhUKKMJQUREag8tTBMREUAJQUREIkoIIiICZH6WUUaYWR/ge+7+w2zHkqqyUh25xMxOAG5191Oz\nHUt5zGxP4CHgEKARMMLdn85qUOWI1tU8ABhQ
DPzY3ZdkN6qKmVlr4E3gdHdfmu14ymNm80iuX3rf\n3QdmM56KmNmvgF5AQ+A+dx+f5ZC+wswuB64gTNrZi/DedIC7byzv/rWuhWBmo4ERQF62YylHhaU6\ncomZ/Q/hTaxxtmNJ4xLgM3fvBpwD3JvleCpyPlDi7icBNwJ/yHI8FYqS7J+BzdmOpSJm1hjA3XtE\n/3I1GXQHukT/108BDs5uROVz94fd/VR370GY2fmTipIB1MKEQFjbcE22g6hAulIduWQ50CfbQVTi\nH4Q3WAh/pzuyGEuFomnSP4oODwHWZy+aSt0B3A+synYgaRwNNDWz583shaglm4vOAhaZ2VPAVOCZ\nLMeTlpkdBxzh7g+mu1/OJgQzG2BmC81sQcrXY939iWzHlkaFpTpyibtPBnZmO4503H2zu28ys2bA\nE8AN2Y6pIu5ebGZ/Be4GJmQ5nHKZ2RXAGnf/N7nZuk7YDNzu7mcRPvhNyMX/Q8D+hDVV3yPE+ffs\nhlOpYcBvKrtTzo4huPtDhD7k2iRtqQ6pGjM7GPgncK+7P57teNJx9yui/vnXzeyb7r4l2zGVcSVQ\nbGZnAJ2Av5lZL3dfk+W4ylpKaMHi7svMbB1wIPBxVqP6qnXAu+6+E1hqZlvNbH93/yzbgZVlZvsC\nHdx9VmX3zcXMW5vNBs4FqKBUR67J2U+KZtYGeB74X3d/ONvxVMTMLokGFyFMJPgvYXA5p7h796gv\n+VTgbeCyHEwGAAOAPwKYWVvCB6xPshpR+V4BzoYv49ybkCRyUTfgxTh3zNkWQi01GTjDzGZHx1dm\nM5gYcnmZ+jCgBXCjmd1EiPWclHLpueKfwHgzm0X4//SzHIyxrFz+vT9IeD3/j5BYB+RiK9vdnzWz\nk83sdcIHq2vdPVdfVwNizXZU6QoREQHUZSQiIhElBBERAZQQREQkooQgIiKAEoKIiESUEEREBNA6\nBJHdYmbtCKtrFxPmozcBFhCKiOXiwi+RCikhiOy+j939mMSBmf0BmERYISpSa6jLSKT63QwcaWZH\nZjsQkapQQhCpZu6+A1gGdMx2LCJVoYQgkhklQK5VPBVJSwlBpJqZWSNCQbGc3UpTpDxKCCK778sy\n4tG+2r8BXnX397MXkkjVaZaRyO470MzmExJDA+AtoH92QxKpOpW/FhERQF1GIiISUUIQERFACUFE\nRCJKCCIiAighiIhIRAlBREQAJQQREYkoIYiICAD/DxisGNLc82WVAAAAAElFTkSuQmCC\n", "text/plain": [ "<matplotlib.figure.Figure at 0x11d5f40d0>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "logreg.fit(dfnew[[\"D\"]],dfnew[\"win\"])\n", "pred_probs = logreg.predict_proba(dfnew[[\"D\"]])\n", "plt.scatter(dfnew[\"D\"], pred_probs[:,1])\n", "plt.title('Win Probability for 3 sets matches')\n", "plt.xlabel('D')\n", "plt.ylabel('Win Probability for 3 sets matches')\n", "plt.legend(loc=\"lower right\")\n", "plt.grid(True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Decision Trees and Random Forests" ] }, { "cell_type": "code", "execution_count": 272, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ 
"DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=None,\n", " max_features=None, max_leaf_nodes=None, min_samples_leaf=1,\n", " min_samples_split=2, min_weight_fraction_leaf=0.0,\n", " presort=False, random_state=None, splitter='best')" ] }, "execution_count": 272, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn.tree import DecisionTreeClassifier\n", "\n", "# Baseline: an unconstrained decision tree\n", "model = DecisionTreeClassifier()\n", "\n", "X = dfnew[feature_cols].dropna()\n", "# Index y by the rows that survive dropna() so X and y stay aligned\n", "y = dfnew.loc[X.index, 'win']\n", "\n", "model.fit(X, y)" ] }, { "cell_type": "code", "execution_count": 273, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "AUC [ 0.55257937 0.57486536 0.53993056 0.5170318 0.49804582], Average AUC 0.536490582524\n" ] } ], "source": [ "# sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection\n", "from sklearn.model_selection import cross_val_score\n", "\n", "# 5-fold cross-validated AUC for the unconstrained tree\n", "scores = cross_val_score(model, X, y, scoring='roc_auc', cv=5)\n", "print('AUC {}, Average AUC {}'.format(scores, scores.mean()))" ] }, { "cell_type": "code", "execution_count": 274, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CV AUC [ 0.61034935 0.67795139 0.61050879 0.6052207 0.55577468], Average AUC 0.611960979666\n" ] } ], "source": [ "# Constrain depth and leaf size to curb overfitting\n", "model = DecisionTreeClassifier(\n", "    max_depth=4,\n", "    min_samples_leaf=6)\n", "\n", "model.fit(X, y)\n", "scores = cross_val_score(model, X, y, scoring='roc_auc', cv=5)\n", "print('CV AUC {}, Average AUC {}'.format(scores, scores.mean()))" ] }, { "cell_type": "code", "execution_count": 275, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',\n", " max_depth=None, max_features='auto', max_leaf_nodes=None,\n", " min_samples_leaf=1, min_samples_split=2,\n", " min_weight_fraction_leaf=0.0, n_estimators=200, n_jobs=1,\n", " oob_score=False, random_state=None, verbose=0,\n", " warm_start=False)" ] }, 
"execution_count": 275, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn.ensemble import RandomForestClassifier\n", "# sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection\n", "from sklearn.model_selection import cross_val_score\n", "\n", "X = dfnew[feature_cols].dropna()\n", "# Index y by the rows that survive dropna() so X and y stay aligned\n", "y = dfnew.loc[X.index, 'win']\n", "\n", "model = RandomForestClassifier(n_estimators=200)\n", "\n", "model.fit(X, y)" ] }, { "cell_type": "code", "execution_count": 276, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "<div>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>Features</th>\n", " <th>Importance Score</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>0</th>\n", " <td>D</td>\n", " <td>0.958857</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>Surface_Hard</td>\n", " <td>0.010787</td>\n", " </tr>\n", " <tr>\n", " <th>2</th>\n", " <td>Surface_Grass</td>\n", " <td>0.008321</td>\n", " </tr>\n", " <tr>\n", " <th>5</th>\n", " <td>Round_3</td>\n", " <td>0.007937</td>\n", " </tr>\n", " <tr>\n", " <th>4</th>\n", " <td>Round_5</td>\n", " <td>0.007525</td>\n", " </tr>\n", " <tr>\n", " <th>3</th>\n", " <td>Round_6</td>\n", " <td>0.006573</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " Features Importance Score\n", "0 D 0.958857\n", "1 Surface_Hard 0.010787\n", "2 Surface_Grass 0.008321\n", "5 Round_3 0.007937\n", "4 Round_5 0.007525\n", "3 Round_6 0.006573" ] }, "execution_count": 276, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Rank the features by the fitted forest's importance scores\n", "features = X.columns\n", "feature_importances = model.feature_importances_\n", "\n", "features_df = pd.DataFrame({'Features': features, 'Importance Score': feature_importances})\n", "features_df.sort_values('Importance Score', inplace=True, ascending=False)\n", "\n", "features_df" ] }, { "cell_type": "code", "execution_count": 277, "metadata": { "collapsed": false }, "outputs": [ { "name": "stderr", "output_type": 
"stream", "text": [ "/Users/marcotavora/anaconda/lib/python2.7/site-packages/ipykernel/__main__.py:2: FutureWarning: sort is deprecated, use sort_values(inplace=True) for INPLACE sorting\n", " from ipykernel import kernelapp as app\n" ] }, { "data": { "text/plain": [ "<matplotlib.axes._subplots.AxesSubplot at 0x1211caf10>" ] }, "execution_count": 277, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAeYAAAFtCAYAAADS5MnUAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAGnRJREFUeJzt3X+U3XV95/HnOOGHMUNM8EaEYx0yzbzFVSooEn5KUE/8\nGUHqWTeWutLyY8E9PbLkuKl0wZ6tsdVaI5VTDC5QFqjUU+rK2dK4gAUXGj30uNKI7+FH7SroZuTG\nkCEtELj7x/0OjGF+3Hy5zP1M5vk4J2e+33u/P973fWby+n6+33u/t6/VaiFJksrwkl4XIEmSnmMw\nS5JUEINZkqSCGMySJBXEYJYkqSAGsyRJBVnQ6wL2Bbt3P93avn1Xr8uYk5YsWYi923v2rR77Vo99\nq6fRGOirs54j5i5YsKC/1yXMWfauHvtWj32rx77NLoNZkqSCGMySJBXEYJYkqSAGsyRJBTGYJUkq\niMEsSVJBDGZJkgpiMEuSVBCDWZKkghjMkiQVxGCWJKkgfolFF4yMjNBsjvW6jDlp+/ZF9q4G+1aP\nfatnvvZtcHA5/f2zf59wg7kLzlx/PQsXL+t1GZKkLtm1Yxsb161haGjFrO/bYO6ChYuXsWjJYb0u\nQ5K0D/AasyRJBTGYJUkqiMEsSVJBDGZJkgpiMEuSVBCDWZKkghjMkiQVxGCWJKkg3mBkDxHxVuBG\nYCvtA5cFwMbM/MueFiZJmhccMU/u1sw8NTNPAVYDn4iII3tckyRpHjCYZ5CZjwNXAL/e61okSfs+\ng7kz/w94Ra+LkCTt+7zG3JnXAD/pdRGSpNmzdOkiGo2BWd+vwTy5vvGJiDgIOBs4o3flSJJmW7M5\nxujoztrr1w11g3lyqyLiNuAZoB/4vcy8v8c1SZLmAYN5D5n5d8Ahva5DkjQ/+eYvSZIKYjBLklQQ\ng1mSpIIYzJIkFcRgliSpIAazJEkFMZglSSqIwSxJUkEMZkmSCuKdv7pg145tvS5BktRFvfx/va/V\navVs5/uKkZGRVrM51usy5qSlSxdh7/aefavHvtUzX/s2OLic/v7+2us3GgN9My/1fAZzd7ReyDeQ\nzGeNxsAL+vaW+cq+1WPf6rFv9dQNZq8xS5JUEINZkqSCGMySJBXEYJYkqSAGsyRJBTGYJUkqiMEs\nSVJBDGZJkgpiMEuSVBCDWZKkghjMkiQVxGCWJKkgBrMkSQUxmCVJKojBLElSQQxmSZIKYjBLklQQ\ng1mSpIIYzJIkFcRgliSpIAazJEkFMZglSSqIwSxJUkEMZkmSCmIwS5JUkAW9LmBfMDIyQrM59kuP\nDQ4up7+/v0cVSZLmKoO5C85cfz0LFy97dn7Xjm1sXLeGoaEVPaxKkjQXGcxdsHDxMhYtOazXZUiS\n9gFeY5YkqSAGsyRJBTGYJUkqiMEsSVJBDGZJkgpiMEuSVBCDWZKkgtQO5oj4RER8MyK+FRG3RsTR\nHa73loi4PyL+oO6+O9jHT/eYXx0RV9Xc1g0RcXJ3KpMkaXq1bjASEUcAazLzhGr+SOAa4KgOVl8N\nfCEzv1Rn3
x1qdfiYJElFqXvnrx3AqyPiLOCWzPx+RBwbEbcD52bmSEScC7ySdmDfDIwCfwOcBTwR\nET+p9n9B9bMFnJ6ZzYi4DHgLsB9wSWZ+IyI+DZwI9AN/kplfm6a+vqmeiIgLgA8AC4GfA6cDH67q\n6gMuAY4Afhv4KdDY+/ZIklRPrVPZmfkIsAY4Abg7In4AvJepR6XLgHdk5meBq4HPZ+bXgRXAuzPz\nZOA+YHVEnAYcnJnHAquAN0fEO4HDq+VOBT4ZEQdNU+LSiLit+nc78EcAEdFXbfttmXkc7eA/plqn\nWW1/K/A7tA8M3g/sv9cNkiSpprqnsoeAnZn5W9X80cAtwCMTFps4av2nzHx6kk2NAtdExONAAHcB\nvwLcDZCZO4BLImId8KaIuK3a7gJgEPj+FCU+mpmnTqh3NfBvM7MVEU9GxA3A48BhtMMZIKufQ8A/\nZubuat3vztQPSZK6pe6p7COBcyJiTWY+BTwA/AJ4FDgUGAGOBn5SLf+8kXQ14v0U8GraYfvN6ud9\nwAerZRYDXwX+FLgtM8+rRr0XAw9OU9+kp7Ij4g3AaZm5MiJeCtwzYdlnqp/3A/8mIg4AdtO+bn7t\ntN2YxNKli2g0BvZ2tXnJPtVj3+qxb/XYt9lTK5gz86aIeC3w3YjYSfuU+EXAk8DlEfHPwMMTVnle\nMGfmYxHxbeDvaQdgEzg0M6+JiLdHxJ20rydfmpmbI2JVRNwBvAy4KTMfn6bEqU6p3w+MVdvuoz3C\nP3SPun4eEZ+hPWrfBow9bysdaDbHGB3dWWfVeaXRGLBPNdi3euxbPfatnroHM32tlm9WfqFWnXV5\na+LXPo5tf5gN56z0+5g74B98PfatHvtWj32rp9EYmPKNyNOZs9/HHBFnA2t5bnTcV02vz8wtPStM\nkqQXYM4Gc2ZuAjb1ug5JkrrJW3JKklQQg1mSpIIYzJIkFcRgliSpIAazJEkFMZglSSqIwSxJUkEM\nZkmSCjJnbzBSkl07tk07L0lSpwzmLrh2w1qazV/+rovBweU9qkaSNJcZzF0wPDzsDd4lSV3hNWZJ\nkgpiMEuSVBCDWZKkghjMkiQVxGCWJKkgBrMkSQUxmCVJKojBLElSQQxmSZIKYjBLklQQg1mSpIIY\nzJIkFcRgliSpIAazJEkFMZglSSqIwSxJUkEMZkmSCmIwS5JUEINZkqSCGMySJBXEYJYkqSAGsyRJ\nBTGYJUkqiMEsSVJBDGZJkgqyoNcF7AtGRkZoNseenR8cXE5/f38PK5IkzVUGcxecuf56Fi5eBsCu\nHdvYuG4NQ0MrelyVJGkuMpi7YOHiZSxaclivy5Ak7QO8xixJUkEMZkmSCmIwS5JUEINZkqSCGMyS\nJBXEYJYkqSAGsyRJBenoc8wR8Qng7cB+wNPAusz8hw7WewtwHXBjZn7yhRQ6zT4WAn8AHAf8C/AM\ncFlm/vWLsT9Jkl5MMwZzRBwBrMnME6r5I4FrgKM62P5q4AuZ+aUXVOX0/hvw7cz8eFXfwcDfRsS3\nMvMXL+J+JUnquk5GzDuAV0fEWcAtmfn9iDg2Im4Hzs3MkYg4F3gl7cC+GRgF/gY4C3giIn5S7euC\n6mcLOD0zmxFxGfAW2qPxSzLzGxHxaeBEoB/4k8z82mSFRcQrgeHM/ND4Y5n5KPDm6vmPVDX0AZcA\nrwM+ACwEfg6cDhwOXAU8RfvU/lrgCeCr1XoHAudl5vc76JUkSS/IjNeYM/MRYA1wAnB3RPwAeC/t\ncJ3MMuAdmflZ4Grg85n5dWAF8O7MPBm4D1gdEacBB2fmscAq4M0R8U7g8Gq5U4FPRsRBU+xrEHho\nfCYiLo2I2yPiexHxgerhZrWtbwFLM/NtmXkc7QOBY4B3AFton6q/FFhM+0Dh58C7gI8BL5upT5Ik\ndUMnp7KHgJ2Z+VvV/NHALcAjExbrmzD9T5n59CSbGgWuiYjHgQDuAn4FuBs
gM3cAl0TEOuBNEXFb\ntd0FtAN4shHrT2iPeKm2cWlV4wZg0fjD1XOtiHgqIm4AHgcOox3OXwE+Afwt8Avgd2mP9lcA/wN4\nEviv0/VIkqRu6eRU9pHAORGxJjOfAh6gHWCPAocCI8DRtEMSJhlJVyPeTwGvph2236x+3gd8sFpm\nMe3Tx38K3JaZ50VEH3Ax8OBkhWXmwxHxUEScl5l/NmE7RwE/qPbxTPX4G4DTMnNlRLwUuKd6/v3A\nnZn5+xHxIdohfS3w08xcHRErgU8Db+ugVwAsXbqIRmOg08XnPXtVj32rx77VY99mz4zBnJk3RcRr\nge9GxE7ap78voj2SvDwi/hl4eMIqzwvmzHwsIr4N/D2wG2gCh2bmNRHx9oi4k/b15Eszc3NErIqI\nO2ifQr4pMx+fpsTfBD5VbeNp2tePbwT+gvb14nEPAGPVcn20R/yH0j6NfU1EPFm9to8D/xf4i4j4\nD1Vdn5qpTxM1m2OMju7cm1XmrUZjwF7VYN/qsW/12Ld66h7M9LVaU10qVqdWnXV5a/xrH8e2P8yG\nc1b6fcwd8g++HvtWj32rx77V02gM9M281PPNie9jjoizaY9+x48i+qrp9Zm5pWeFSZLUZXMimDNz\nE7Cp13VIkvRi85ackiQVxGCWJKkgBrMkSQUxmCVJKojBLElSQQxmSZIKYjBLklSQOfE55tLt2rFt\n0mlJkvaWwdwF125YS7M59uz84ODyHlYjSZrLDOYuGB4e9j6ykqSu8BqzJEkFMZglSSqIwSxJUkEM\nZkmSCmIwS5JUEINZkqSCGMySJBXEYJYkqSAGsyRJBTGYJUkqiMEsSVJBDGZJkgpiMEuSVBCDWZKk\nghjMkiQVxGCWJKkgBrMkSQUxmCVJKojBLElSQQxmSZIKYjBLklQQg1mSpIIYzJIkFcRgliSpIAaz\nJEkFWdDrAvYFIyMjNJtjz84PDi6nv7+/hxVJkuYqg7kLzlx/PQsXLwNg145tbFy3hqGhFT2uSpI0\nFxnMXbBw8TIWLTms12VIkvYBXmOWJKkgBrMkSQUxmCVJKojBLElSQQxmSZIKYjBLklQQg1mSpIIY\nzJIkFaSYG4xExFuBG4Gt1UMHAQ8CH87M3V3czwHADzPz8GmWuQD4CPAM8MeZ+Zfd2r8kSdMpJpgr\nt2bm2vGZiLgOWAP8VRf30Qe0pnoyIg4GzgXeCCwEfgAYzJKkWVFaMPeNT0TE/sAhwPaI+BxwIu1A\nvT4zL4uIq4AbMnNzRKwGPpSZH42I+4E7gdcCPwPOoB2w1wEvpz0Kn1JmPhoRb8zMZyLiVcC/dP9l\nSpI0udKuMZ8aEbdFxFbgHuAm2qE6mJkrgZOAtRHx+knWHR8FHw5cnJnHAw3gGOA84N7MPAW4YqYi\nqlC+ALgL+O8v8DVJktSx0kbMt2bm2ohYCmwGfgQcQXsETGbujogtwOv2WK9vwvRoZj5STf8YOBAY\nBm6utvGdiHhqpkIy80sRcQVwS0TckZl/1+mLWLp0EY3GQKeLz3v2qh77Vo99q8e+zZ7SghmAzGxG\nxJnA7cBFwGnAxojYDzgeuBpYBbyqWuXoKTY1Hthbq/W+ERFHAftNte+IGAY2ZOYZwNPAE7TfBNax\nZnOM0dGde7PKvNVoDNirGuxbPfatHvtWT92DmdJOZT8rM+8DNgLvAx6KiLton1q+MTO/B1wJXBgR\nm4FDJ6zammT6CmB5RNwBnE87bKfa7wjwvYi4G/g2cHdm3tmllyVJ0rT6Wq0p36CsDq066/LW+Pcx\nj21/mA3nrGRoaEWPq5obPBKvx77VY9/qsW/1NBoDfTMv9XxFnsqeDRFxNrCW50bV4x+jWp+ZW3pW\nmCRpXpu3wZyZm4BNva5DkqSJir3GLEnSfGQwS5JUEINZkqSCGMySJBXEYJYkqSAGsyRJBTGYJUkq\nyLz9HHM37dqxbdJpSZL2lsHcBdduWEu
zOfbs/ODg8h5WI0maywzmLhgeHvY+spKkrvAasyRJBTGY\nJUkqiMEsSVJBDGZJkgpiMEuSVBCDWZKkghjMkiQVxGCWJKkgBrMkSQUxmCVJKojBLElSQQxmSZIK\nYjBLklQQg1mSpIIYzJIkFcRgliSpIAazJEkFMZglSSqIwSxJUkEMZkmSCmIwS5JUEINZkqSCGMyS\nJBXEYJYkqSAGsyRJBVnQ6wL2BSMjIzSbYwAMDi6nv7+/xxVJkuYqg7kLzlx/PQsXL2PXjm1sXLeG\noaEVvS5JkjRHGcxdsHDxMhYtOazXZUiS9gFeY5YkqSAGsyRJBTGYJUkqiMEsSVJBDGZJkgpiMEuS\nVBCDWZKkghjMkiQVpJgbjETEW4Ebga3VQwcBDwIfzszdXdzPAcAPM/PwaZb5AnACsLN66P2ZuXOq\n5SVJ6pZigrlya2auHZ+JiOuANcBfdXEffUBrhmXeBKzOzGYX9ytJ0oxKC+a+8YmI2B84BNgeEZ8D\nTqQdqNdn5mURcRVwQ2ZujojVwIcy86MRcT9wJ/Ba4GfAGcBC4Drg5bRH4VOKiD5gBfDliDgE+Epm\nXtXtFypJ0mRKu8Z8akTcFhFbgXuAm2iH6mBmrgROAtZGxOsnWXd8FHw4cHFmHg80gGOA84B7M/MU\n4IoZangZ8EXgN4B3AudPsT9JkrqutBHzrZm5NiKWApuBHwFH0B4Bk5m7I2IL8Lo91uubMD2amY9U\n0z8GDgSGgZurbXwnIp6apoZdwBcz818BIuI24NeAf+zkBSxduohGY6CTRVWxX/XYt3rsWz32bfaU\nFswAZGYzIs4EbgcuAk4DNkbEfsDxwNXAKuBV1SpHT7Gp8cDeWq33jYg4Cthvmt0PA1+NiDfS7s+J\n1f460myOMTrq+8Q61WgM2K8a7Fs99q0e+1ZP3YOZ0k5lPysz7wM2Au8DHoqIu4C7gBsz83vAlcCF\nEbEZOHTCqq1Jpq8AlkfEHcD5wBPT7PeHwJ8DW2gfGFxT1SJJ0ouur9Wa6Q3Kmsmqsy5vLVpyGGPb\nH2bDOSsZGlrR65LmDI/E67Fv9di3euxbPY3GQN/MSz1fkaeyZ0NEnA2s5blR9fjHqNZn5paeFSZJ\nmtfmbTBn5iZgU6/rkCRpomKvMUuSNB8ZzJIkFcRgliSpIAazJEkFMZglSSqIwSxJUkEMZkmSCjJv\nP8fcTbt2bPuln5Ik1WUwd8G1G9bSbI4BMDi4vMfVSJLmMoO5C4aHh72PrCSpK7zGLElSQQxmSZIK\nYjBLklQQg1mSpIIYzJIkFcRgliSpIAazJEkFMZglSSqIwSxJUkEMZkmSCmIwS5JUEINZkqSCGMyS\nJBXEYJYkqSAGsyRJBTGYJUkqiMEsSVJBDGZJkgpiMEuSVBCDWZKkghjMkiQVxGCWJKkgBrMkSQUx\nmCVJKojBLElSQRb0uoB9wcjICM3mGACDg8vp7+/vcUWSpLnKYO6CM9dfz8LFy9i1Yxsb161haGhF\nr0uSJM1RBnMXLFy8jEVLDut1GZKkfYDXmCVJKojBLElSQQxmSZIKYjBLklQQg1mSpIIYzJIkFcRg\nliSpIAazJEkFKeYGIxHxVuBGYGv10EHAg8CHM3N3F/dzAPDDzDx8mmXeBfyXavaezPxYt/YvSdJ0\nShsx35qZp1b/3gzsBtZ0eR99QGuqJyNiEfBHwHsy8zjgRxFxcJdrkCRpUsWMmCt94xMRsT9wCLA9\nIj4HnEg7UK/PzMsi4irghszcHBGrgQ9l5kcj4n7gTuC1wM+AM4CFwHXAy2mPwqdzPHAv8PmIWA5s\nysxHu/oqJUmaQmkj5lMj4raI2ArcA9xEO1QHM3MlcBKwNiJeP8m646Pgw4GLM/N4oAEcA5wH3JuZ\npwBXzFDDK4BTgHXAu4CPR8SvvqBXJUlSh0obMd+amWsjYimwGfgRcATtETCZuTsitgCv22O9vgnT\no5n
5SDX9Y+BAYBi4udrGdyLiqWlqeBT4bmaOAkTEHcAbgQc6eQFLly6i0RjoZFFV7Fc99q0e+1aP\nfZs9pQUzAJnZjIgzgduBi4DTgI0RsR/tU81XA6uAV1WrHD3FpsYDe2u13jci4ihgv2l2/w/A66uD\ng8eAlcCXO6292RxjdHRnp4vPe43GgP2qwb7VY9/qsW/11D2YKe1U9rMy8z5gI/A+4KGIuAu4C7gx\nM78HXAlcGBGbgUMnrNqaZPoKYHk1+j0feGKa/Y4C62mP2O8GvpaZP+jOq5IkaXp9rdaUb1BWh1ad\ndXlr0ZLDGNv+MBvOWcnQ0IpelzRneCRej32rx77VY9/qaTQG+mZe6vmKPJU9GyLibGAtz42qxz9G\ntT4zt/SsMEnSvDZvgzkzNwGbel2HJEkTFXuNWZKk+chgliSpIAazJEkFMZglSSqIwSxJUkEMZkmS\nCmIwS5JUkHn7OeZu2rVj2y/9lCSpLoO5C67dsJZmcwyAwcHlPa5GkjSXGcxdMDw87H1kJUld4TVm\nSZIKYjBLklQQg1mSpIIYzJIkFcRgliSpIAazJEkFMZglSSqIwSxJUkEMZkmSCmIwS5JUEINZkqSC\nGMySJBWkr9Vq9boGSZJUccQsSVJBDGZJkgpiMEuSVBCDWZKkghjMkiQVxGCWJKkgC3pdwFwSEX3A\n5cCvAf8K/HZmPjTh+fcBvwc8BVyVmVf2pNDCdNC3fwf8Du2+3ZuZ5/ek0MLM1LcJy10BPJqZvzvL\nJRapg9+3Y4A/rmZ/BvxGZj4564UWpoO+fRi4ENhN+/+3P+tJoYWKiGOBz2Tmqj0e3+tccMS8d04D\nDsjM44H1wOfHn4iIBdX824FTgHMiotGLIgs0Xd8OBH4feGtmngS8PCLe25syizNl38ZFxLnA62e7\nsMLN1LcvA/8+M08GbgFeM8v1lWqmvn0WOBU4EfhPEbF4lusrVkSsAzYBB+zxeK1cMJj3zom0/5DJ\nzC3Amyc8dwRwf2Y+lplPAd8GTp79Eos0Xd+eAI7PzCeq+QW0j9Y1fd+IiOOAY4ArZr+0ok3Zt4gY\nBh4FLoyIbwFLM/P+XhRZoGl/34D/AywBXlrNe3eq5zwAnD7J47VywWDeOwcBOybM746Il0zx3E7A\nI8q2KfuWma3MHAWIiP8IvCwz/1cPaizRlH2LiEOAS4CPAX09qK1k0/2dvgI4Dvgi7VHM2yPilNkt\nr1jT9Q1gK3APcC9wc2Y+NpvFlSwzb6J9in9PtXLBYN47jwEDE+ZfkpnPTHjuoAnPDQC/mK3CCjdd\n34iIvoj4LPA24AOzXVzBpuvbB4GDgf8J/GdgbUT85izXV6rp+vYo8EBmjmTmbtojxD1HhvPVlH2L\niDcA76F92n8QeGVEnDHrFc49tXLBYN47/xt4N0BErKR95DjuPuBXI+LlEbE/7dMVd89+iUWarm/Q\nvuZ3QGaeNuGUtqbpW2ZelpnHZOapwGeA6zPzz3tTZnGm+317CFgUEcur+ZNojwQ1fd92ALuAJzKz\nBWyjfVpbv2zPs1e1csEvsdgLE961eGT10EeBN9E+/XplRLyH9unFPuArvmuxbbq+0T419l3gzuq5\nFrAxM78+23WWZqbftwnLfQQI35Xd1sHf6SnAH1bP3ZWZH5/9KsvTQd/OBc6i/b6QB4Gzq7MOAiLi\nNcANmXl89UmT2rlgMEuSVBBPZUuSVBCDWZKkghjMkiQVxGCWJKkgBrMkSQUxmCVJKojBLElSQQxm\nSZIK8v8BAYVGWmtzxwMAAAAASUVORK5CYII=\n", "text/plain": [ "<matplotlib.figure.Figure at 0x11ebd11d0>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "feature_importances = 
pd.Series(model.feature_importances_, index=X.columns)\n", "# Series.sort() is deprecated; sort_values(inplace=True) is the in-place replacement\n", "feature_importances.sort_values(inplace=True)\n", "feature_importances.plot(kind=\"barh\", figsize=(7,6))" ] }, { "cell_type": "code", "execution_count": 278, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "AUC [ 0.56410714 0.55746329 0.54182243], Average AUC 0.554464288773\n", "n trees: 1, CV AUC [ 0.54352679 0.52182012 0.52169808], Average AUC 0.529014994355\n", "n trees: 11, CV AUC [ 0.57148597 0.5538084 0.55104636], Average AUC 0.558780245814\n", "n trees: 21, CV AUC [ 0.55047194 0.54987731 0.53842448], Average AUC 0.546257911153\n", "n trees: 31, CV AUC [ 0.57475765 0.55339731 0.54241338], Average AUC 0.556856114695\n", "n trees: 41, CV AUC [ 0.55816327 0.5527357 0.54013952], Average AUC 0.550346161769\n", "n trees: 51, CV AUC [ 0.5441199 0.55787438 0.53792988], Average AUC 0.546641388544\n", "n trees: 61, CV AUC [ 0.56126913 0.55277424 0.54369805], Average AUC 0.552580476248\n", "n trees: 71, CV AUC [ 0.56765306 0.55975643 0.53744813], Average AUC 0.554952539745\n", "n trees: 81, CV AUC [ 0.56794643 0.55797716 0.54158477], Average AUC 0.555836118697\n", "n trees: 91, CV AUC [ 0.5585523 0.55412315 0.53579733], Average AUC 0.549490924948\n" ] } ], "source": [ "from sklearn.cross_validation import cross_val_score\n", "\n", "# Baseline: cross-validated AUC for the 200-tree forest defined above\n", "scores = cross_val_score(model, X, y, scoring='roc_auc')\n", "print('AUC {}, Average AUC {}'.format(scores, scores.mean()))\n", "\n", "# Sweep the number of trees and report the cross-validated AUC for each setting\n", "for n_trees in range(1, 100, 10):\n", "    model = RandomForestClassifier(n_estimators=n_trees)\n", "    scores = cross_val_score(model, X, y, scoring='roc_auc')\n", "    print('n trees: {}, CV AUC {}, Average AUC {}'.format(n_trees, scores, scores.mean()))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python [Root]", "language": "python", "name": "Python [Root]" }, 
"language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.12" } }, "nbformat": 4, "nbformat_minor": 0 }