{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "Created on Fri Oct 4 14:32:45 2019\n", "\n", "@author: dezsoribli\n", "\n", "\n", "1. Load hurricane data from the article \"Hurricane-induced selection on the morphology of an island lizard\".\n", "https://www.nature.com/articles/s41586-018-0352-3\n", " - A, Drop the lizard with the most missing values\n", " - B, Drop the ID column\n", " - C, Encode, the Sex, Origin ans Hurricane values into binary columns,\n", " and drop the original text columns.\n", " - D, Make sure all your columns are encoded as floating point values, \n", " not unsigned integers!\n", " \n", " \n", "2. Use logistic regression from the statsmodels package to predict whether\n", " the lizard was measured after of before the hurricane, \n", " using the whole dataset\n", " - A, Investigate the Toe and Finger area coefficients, whats going on? \n", " Fix this problem by only keeping the mean measurements.\n", " - B, Which measured quality had the most significant positive effect on survival?\n", " - C, Which measured quality had the most significant negative effect on survival?\n", " - D, Try explain the results in your words. Check the abstract of the paper.\n", " - E, Repeat the fit after scaling each input column to 0 mean and 1 variance. \n", " Have the coefficients changed? Have the predictions changed?\n", "\n", "\n", "3. Repeat the fit with scikit-learn on the unnormalized dataset.\n", " - A, Compare the coefficients with the ones you got from statsmodels. \n", " Are they the same? If not try to answer why?\n", " - B, Try to tweak the parameters of the scikit-learn method to reproduce the\n", " the coefficients produced by statsmodels.\n", " - C, Plot the ROC curve for the full dataset, and calculate the AUC.\n", " - D, Repeat the fit after scaling each input column to 0 mean and 1 variance. \n", " Have the coefficients changed? Have the predictions changed?\n", " \n", " \n", "4. Split the dataset into 5 folds and predict each fold by training on the other 4.\n", " - A, Make sure to fix the seed of the splitting to 0 to make it reproducible.\n", " - B, Plot the ROC for the 5 folds separately as curves on the same plot.\n", " - C, Calculate the AUC values for the 5 folds separately.\n", "\n", "\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Populating the interactive namespace from numpy and matplotlib\n" ] } ], "source": [ "%pylab inline\n", "\n", "import pandas as pd\n", "from sklearn.preprocessing import StandardScaler\n", "from sklearn.linear_model import LogisticRegression\n", "from sklearn.model_selection import KFold\n", "from sklearn.metrics import roc_auc_score,roc_curve\n", "import statsmodels.api as sm\n", "figsize(8,8)\n", "mpl.rcParams['font.size']=16" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/opt/conda/lib/python3.6/site-packages/numpy/core/fromnumeric.py:52: FutureWarning: Method .ptp is deprecated and will be removed in a future version. Use numpy.ptp instead.\n", " return getattr(obj, method)(*args, **kwds)\n" ] } ], "source": [ "data = pd.read_csv('hurricane.csv') # renamed it\n", "\n", "\n", "data = data.drop([39])\n", "\n", "data['Female'] = pd.get_dummies(data.Sex)['Female'].astype('float')\n", "data['Pine Cay'] = pd.get_dummies(data.Origin)['Pine Cay'].astype('float')\n", "data['After hurricane'] = pd.get_dummies(data.Hurricane)['After'].astype('float')\n", "\n", "# only keep numercal data\n", "data_num = data.drop(columns=['ID','Sex','Origin','Hurricane'])\n", "\n", "# drop which was not measured\n", "data_num = data_num.drop(columns=['SumFingers','SumToes','MaxFingerForce'])\n", "\n", "data_num = data_num.drop(columns=['FingerArea1','FingerArea2','FingerArea3',\n", " 'ToeArea1','ToeArea2','ToeArea3'])\n", "\n", "#data_num = data_num.drop(columns=['FingerCount','ToeCount'])\n", "#\n", "#data_num = data_num.drop(columns=['MeanFingerArea','MeanToeArea'])\n", "\n", "#data_num = data_num.drop(columns=['Female','Metatarsal','SVL','Tibia'])\n", "\n", "# %%\n", "data_num = sm.add_constant(data_num)\n", "\n" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Optimization terminated successfully.\n", " Current function value: 0.256168\n", " Iterations 8\n", " Logit Regression Results \n", "==============================================================================\n", "Dep. Variable: After hurricane No. Observations: 163\n", "Model: Logit Df Residuals: 147\n", "Method: MLE Df Model: 15\n", "Date: Fri, 04 Oct 2019 Pseudo R-squ.: 0.6259\n", "Time: 15:41:55 Log-Likelihood: -41.755\n", "converged: True LL-Null: -111.63\n", " LLR p-value: 2.588e-22\n", "==================================================================================\n", " coef std err z P>|z| [0.025 0.975]\n", "----------------------------------------------------------------------------------\n", "const 46.4732 11.204 4.148 0.000 24.514 68.433\n", "SVL -0.4982 0.263 -1.894 0.058 -1.014 0.017\n", "Femur -2.5733 0.723 -3.557 0.000 -3.991 -1.156\n", "Tibia -1.1555 1.114 -1.037 0.300 -3.339 1.028\n", "Metatarsal 0.8029 1.514 0.530 0.596 -2.165 3.771\n", "LongestToe -3.3138 0.917 -3.614 0.000 -5.111 -1.517\n", "Humerus 2.2870 0.846 2.704 0.007 0.629 3.945\n", "Radius -1.1332 1.718 -0.660 0.509 -4.500 2.234\n", "Metacarpal 1.0017 1.242 0.807 0.420 -1.432 3.435\n", "LongestFinger -0.2999 1.280 -0.234 0.815 -2.810 2.210\n", "FingerCount 1.5508 0.536 2.893 0.004 0.500 2.602\n", "ToeCount -0.9490 0.468 -2.028 0.043 -1.866 -0.032\n", "MeanFingerArea 5.4238 2.153 2.519 0.012 1.204 9.643\n", "MeanToeArea 5.3069 1.827 2.905 0.004 1.726 8.888\n", "Female -4.1135 1.528 -2.693 0.007 -7.108 -1.119\n", "Pine Cay 1.6539 0.715 2.312 0.021 0.252 3.056\n", "==================================================================================\n" ] } ], "source": [ "# %%\n", "logit = sm.Logit(data_num['After hurricane'], \n", " data_num.drop(columns=['After hurricane']))\n", "result = logit.fit()\n", "print(result.summary())\n", "pred1=result.predict(data_num.drop(columns=['After hurricane']))" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Optimization terminated successfully.\n", " Current function value: 0.256168\n", " Iterations 8\n", " Logit Regression Results \n", "==============================================================================\n", "Dep. Variable: y No. Observations: 163\n", "Model: Logit Df Residuals: 147\n", "Method: MLE Df Model: 15\n", "Date: Fri, 04 Oct 2019 Pseudo R-squ.: 0.6259\n", "Time: 15:41:58 Log-Likelihood: -41.755\n", "converged: True LL-Null: -111.63\n", " LLR p-value: 2.588e-22\n", "==============================================================================\n", " coef std err z P>|z| [0.025 0.975]\n", "------------------------------------------------------------------------------\n", "const 0.7351 0.324 2.267 0.023 0.100 1.371\n", "x1 -3.1191 1.647 -1.894 0.058 -6.347 0.109\n", "x2 -4.4528 1.252 -3.557 0.000 -6.906 -1.999\n", "x3 -1.9250 1.856 -1.037 0.300 -5.562 1.712\n", "x4 0.7944 1.498 0.530 0.596 -2.142 3.731\n", "x5 -3.4991 0.968 -3.614 0.000 -5.397 -1.602\n", "x6 3.1678 1.171 2.704 0.007 0.872 5.464\n", "x7 -1.2024 1.823 -0.660 0.509 -4.775 2.370\n", "x8 0.3932 0.487 0.807 0.420 -0.562 1.348\n", "x9 -0.1784 0.762 -0.234 0.815 -1.672 1.315\n", "x10 1.6952 0.586 2.893 0.004 0.547 2.844\n", "x11 -1.2302 0.606 -2.028 0.043 -2.419 -0.041\n", "x12 2.9580 1.174 2.519 0.012 0.657 5.259\n", "x13 4.7231 1.626 2.905 0.004 1.536 7.910\n", "x14 -2.0427 0.759 -2.693 0.007 -3.530 -0.556\n", "x15 0.8262 0.357 2.312 0.021 0.126 1.526\n", "==============================================================================\n" ] } ], "source": [ "# %% ???\n", "X = StandardScaler().fit_transform(data_num.drop(columns=['After hurricane','const']))\n", "logit = sm.Logit(data_num['After hurricane'].values.astype('float'), \n", " sm.add_constant(X))\n", "result = logit.fit()\n", "print (result.summary())\n", "pred2=result.predict(sm.add_constant(X))" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 7.771561e-15\n", "1 1.110223e-15\n", "2 0.000000e+00\n", "3 0.000000e+00\n", "4 2.886580e-15\n", "5 5.551115e-15\n", "6 2.609024e-15\n", "7 0.000000e+00\n", "8 3.441691e-15\n", "9 0.000000e+00\n", "10 1.110223e-15\n", "11 2.220446e-16\n", "12 1.110223e-15\n", "13 -3.663736e-15\n", "14 7.771561e-16\n", "15 3.330669e-15\n", "16 6.661338e-16\n", "17 4.440892e-16\n", "18 1.554312e-15\n", "19 9.992007e-16\n", "20 2.220446e-16\n", "21 2.220446e-16\n", "22 9.992007e-16\n", "23 2.220446e-16\n", "24 6.661338e-15\n", "25 2.220446e-16\n", "26 -2.220446e-15\n", "27 2.997602e-15\n", "28 2.220446e-16\n", "29 4.440892e-16\n", " ... \n", "134 -4.274359e-15\n", "135 -4.874573e-16\n", "136 -8.586881e-17\n", "137 -2.484124e-15\n", "138 -1.491862e-16\n", "139 -8.361367e-16\n", "140 4.662937e-15\n", "141 5.995204e-15\n", "142 1.276756e-15\n", "143 -1.110223e-16\n", "144 -1.762479e-15\n", "145 -2.144118e-15\n", "146 1.110223e-15\n", "147 -8.153200e-16\n", "148 -1.706968e-15\n", "149 5.384582e-15\n", "150 1.942890e-16\n", "151 -1.179612e-15\n", "152 3.885781e-15\n", "153 -4.510281e-16\n", "154 -3.642919e-16\n", "155 9.020562e-17\n", "156 -8.337515e-17\n", "157 1.942890e-16\n", "158 -1.179612e-15\n", "159 -2.275957e-15\n", "160 1.387779e-15\n", "161 8.187895e-16\n", "162 -1.110223e-16\n", "163 -8.812395e-16\n", "Length: 163, dtype: float64" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# %%\n", "pred1-pred2" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 0.04554114 -1.7823243 -0.30337529 -0.06108083 -1.70506902 1.39230212\n", " 0.2203313 0.39376842 0.07844526 1.09892878 -0.02088438 1.18629206\n", " 1.33274758 -0.34572881 0.7630456 ]]\n", "[[ 0.22234396 -0.94557182 -2.42393522 -0.7036805 -0.12256229 -1.89094196\n", " 1.41110982 -0.32962143 0.24677151 -0.07331174 0.9290967 -0.37626081\n", " 1.6842705 1.73926583 -1.01235636 0.45617172]]\n" ] }, { "data": { "text/plain": [ "array([-2.85954451e-01, -7.09270087e-02, 2.52931727e-02, -3.75112573e-02,\n", " -1.82868902e-01, 1.22361822e-01, -1.50453050e-01, -1.73057026e-02,\n", " 1.63652417e-02, -4.94673199e-03, -4.76133539e-02, -7.45871174e-02,\n", " 3.71719032e-02, 7.37748272e-02, 1.42450982e-02, -4.13348974e-02,\n", " -7.55508641e-03, -7.41726008e-03, -1.49086939e-02, -1.84649655e-01,\n", " 5.52245766e-03, -3.44212123e-02, 1.62892696e-02, -6.35571092e-02,\n", " -1.45604870e-01, 5.98540199e-03, -1.81378497e-02, -5.48721913e-02,\n", " 5.13643126e-02, 2.92396130e-03, 7.37866577e-04, -8.07552701e-02,\n", " -1.08030644e-02, -1.08872743e-01, 4.59388407e-02, -2.43881395e-02,\n", " 6.87644881e-03, -2.11623133e-02, -7.75186961e-02, -3.38727951e-01,\n", " -2.86183583e-02, -2.22651867e-01, 2.24155232e-02, 4.58415270e-03,\n", " -1.91523133e-01, -6.99155595e-02, 1.28553775e-02, -1.05525038e-01,\n", " -3.51705014e-02, -3.87082513e-03, 5.69454174e-02, 5.72313675e-01,\n", " 1.05548107e-01, -1.20334502e-01, -3.69877277e-02, 5.06248915e-02,\n", " -3.84380256e-02, -9.88710820e-02, -1.33746066e-01, -5.90592508e-02,\n", " 1.02610137e-01, 1.25463655e-01, 1.34944972e-01, 3.14467176e-02,\n", " -4.36721034e-03, -8.86148267e-03, 6.81541575e-02, 4.15408665e-02,\n", " -5.50043401e-04, 3.29535918e-02, -5.44593458e-03, -8.34609281e-02,\n", " 1.38843867e-03, -2.55038949e-01, -3.47943015e-02, -2.01995303e-03,\n", " -1.31039949e-02, 1.62094362e-02, -3.60741982e-02, 4.37611807e-02,\n", " 1.38785864e-03, -2.95167509e-02, 7.62205743e-02, 2.30056790e-02,\n", " 6.32798575e-02, -1.23848837e-01, 2.23321476e-02, 9.54246185e-03,\n", " -8.82850129e-03, -3.15829305e-02, -4.82295231e-03, 1.21588387e-03,\n", " 2.53952544e-02, 9.10338709e-02, 7.65128525e-02, 3.61167833e-02,\n", " 1.44186831e-01, -1.18203561e-02, 1.24079508e-01, 2.44486443e-01,\n", " -3.52003302e-02, -7.52588020e-02, -2.38736482e-02, 1.92849359e-01,\n", " 3.61035508e-02, 4.67047087e-02, -7.70786751e-03, 1.94409045e-01,\n", " 1.09520993e-02, 1.60127413e-01, 3.91685079e-02, 1.81068551e-01,\n", " -4.48520019e-02, -4.06360250e-02, 6.30416931e-02, 1.21917300e-01,\n", " 8.71965110e-03, -1.83710758e-02, -4.01640593e-02, -2.40315956e-02,\n", " 7.38867704e-02, 4.34943102e-02, 1.26695194e-01, -6.88656728e-02,\n", " 1.80926449e-02, -2.52644581e-02, -4.01952768e-01, 1.02597272e-01,\n", " -2.54281687e-02, 2.80492870e-02, -5.91302178e-02, 9.37796339e-03,\n", " 2.24533202e-01, 1.80117157e-01, -1.67296287e-02, -2.50218577e-02,\n", " -7.97443395e-02, 4.75757576e-02, 1.75027906e-02, -8.57089345e-02,\n", " 6.00440838e-02, -8.39154470e-02, -2.33856384e-02, -3.51504632e-02,\n", " 3.12975733e-01, 1.20293119e-01, -1.52068053e-02, -5.90593632e-02,\n", " -1.56078165e-01, -8.71987743e-02, -6.48835278e-02, -5.15719769e-02,\n", " -4.64757255e-02, 1.03821659e-01, 1.22298260e-01, 8.60908120e-02,\n", " -1.00220379e-01, -3.68931373e-02, -9.03496047e-02, -6.44586586e-02,\n", " 1.36674815e-01, -8.18798700e-02, -1.99549748e-02])" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\n", "# %%\n", "cls = LogisticRegression()\n", "cls.fit(data_num.drop(columns=['After hurricane','const']),\n", " data_num['After hurricane'])\n", "print (cls.coef_)\n", "pred1 = cls.predict_proba(data_num.drop(columns=['After hurricane','const']))[:,1]\n", "\n", "# %%\n", "cls = LogisticRegression()\n", "cls.fit(sm.add_constant(X),\n", " data_num['After hurricane'])\n", "print (cls.coef_)\n", "pred2 = cls.predict_proba(sm.add_constant(X))[:,1]\n", "\n", "# %%\n", "pred1-pred2\n", "\n", "\n" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[-0.48940382 -2.56746169 -1.14277845 0.79773787 -3.30123013 2.27390672\n", " -1.11591419 0.9976535 -0.28057054 1.54624263 -0.93732781 5.36681844\n", " 5.25424454 -4.05273037 1.64582975]]\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXcAAAD8CAYAAACMwORRAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAFGVJREFUeJzt3X2Mlned7/H3tzwUq4ClQw1lqLCBNh3IWj0jYtacumo32Ljg0YaHRLEWrW5P12jXtRzxYQ9r4onHs6vbsK60NdZtCu02tmBDJTl9iKaRllFplwdp51AKA+SUTtm2pqGUnu/54x7IMAzMNcM9czO/eb+SSe7run5z3d/ffQ+f/vq7niIzkSSV5bxGFyBJqj/DXZIKZLhLUoEMd0kqkOEuSQUy3CWpQIa7JBXIcJekAhnuklSg0Y1646amppw+fXqj3l6ShqXf/va3L2bm5L7aNSzcp0+fTltbW6PeXpKGpYh4vko7p2UkqUCGuyQVyHCXpAIZ7pJUIMNdkgrUZ7hHxE8i4oWI2Haa7RER/xQR7RHxdES8p/5lSpL6o8rI/afA/DNs/ygwq+vnBuBHZ1+WJOls9Hmee2b+KiKmn6HJQuBnWXte3+aIeHtETMnMg3WqUQ1w9xN7Wb91f6PLkIrUcskEvv2Xswf1Peox5z4V2NdtuaNr3Ski4oaIaIuItkOHDtXhrTVY1m/dz46DrzS6DEkDVI8rVKOXdb0+dTsz1wBrAFpbW30y9zmuZcoE7vnC+xtdhqQBqMfIvQOY1m25GThQh/1KkgaoHuG+AVjWddbMPOBl59slqbH6nJaJiLXAB4GmiOgAvg2MAcjMfwE2AtcA7cBrwGcHq1gNnp4HUHccfIWWKRMaWJGks1HlbJmlfWxP4L/WrSI1xPEDqMcDvWXKBBZe2etxcUnDQMNu+avGOt1I3QOoUhm8/cAI1fNUR0fqUlkcuY8AvV2Q5EhdKpsj9xGgtwuSHKlLZXPk3mBDcZm/o3Rp5HHk3mBDcZm/o3Rp5HHkXtFgjbAdVUsaDI7cKxqsEbajakmDwZH7aXgeuKThzJH7aXgeuKThzJH7GThSlzRcOXKXpAIZ7pJUIMNdkgpkuEtSgQx3SSqQ4S5JBTLcJalAhrskFciLmLr4gGhJJXHk3sXbDUgqiSP3brzdgKRSOHKXpAIZ7pJUIMNdkgpkuEtSgQx3SSqQ4S5JBRoRp0L2vECpN160JKkkI2Lk3vMCpd540ZKkkoyIkTt4gZKkkaXSyD0i5kfErohoj4gVvWy/NCIejYjfR8TTEXFN/UuVJFXVZ7hHxChgNfBRoAVYGhEtPZp9A7g3M98NLAH+ud6FSpKqqzJynwu0Z+buzDwKrAMW9miTwPGjkROBA/UrUZLUX1XCfSqwr9tyR9e67v4O+FREdAAbgb/ubUcRcUNEtEVE26FDhwZQriSpiirhHr2syx7LS4GfZmYzcA3wrxFxyr4zc01mtmZm6+TJk/tfrSSpkirh3gFM67bczKnTLsuBewEy8zfAOKCpHgVKkvqvSrhvAWZFxIyIGEvtgOmGHm32Ah8GiIgrqIW78y6S1CB9hntmHgNuAjYBO6mdFbM9IlZFxIKuZn8DfD4ingLWAtdlZs+pG0nSEKl0EVNmbqR2oLT7um91e70D+LP6liZJGqgRcfsBSRppDHdJKpDhLkkFKvLGYT1v8evtfCWNNEWO3Hve4tfb+UoaaYocuYO3+JU0shU5cpekkc5wl6QCGe6SVCDDXZIKZLhLUoEMd0kqkOEuSQUy3CWpQIa7JBXIcJekAhnuklQgw12SCmS4S1KBDHdJKpDhLkkFMtwlqUCGuyQVyHCXpAIZ7pJUoCKeoXr3E3tZv3X/ieUdB1+hZcqEBlYkSY1VxMh9/db97Dj4yonllikTWHjl1AZWJEmNVcTIHWqBfs8X3t/oMiTpnFDEyF2SdDLDXZIKZLhLUoEqhXtEzI+IXRHRHhErTtNmUUTsiIjtEXF3fcuUJPVHnwdUI2IUsBq4GugAtkTEhszc0a3NLOC/AX+WmYcj4uLBKliS1LcqI/e5QHtm7s7Mo8A6YGGPNp8HVmfmYYDMfKG+ZUqS+qNKuE8F9nVb7uha191lwGUR8XhEbI6I+b3tKCJuiIi2iGg7dOjQwCqWJPWpSrhHL+uyx/JoYBbwQWApcHtEvP2UX8pck5mtmdk6efLk/tYqSaqoSrh3ANO6LTcDB3ppsz4z38jM54Bd1MJektQAVcJ9CzArImZExFhgCbChR5sHgD8HiIgmatM0u+tZqCSpuj7DPTOPATcBm4CdwL2ZuT0iVkXEgq5mm4DOiNgBPAr8bWZ2DlbRkqQzq3RvmczcCGzsse5b3V4ncHPXjySpwbxCVZIKZLhLUoEMd0kqkOEuSQUy3CWpQIa7JBXIcJekAhnuklQgw12SCmS4S1KBDHdJKpDhLkkFMtwlqUCGuyQVyHCXpAIZ7pJUIMNdkgpkuEtSgQx3SSqQ4S5JBTLcJalAhrskFchwl6QCGe6SVCDDXZIKZLhLUoEMd0kqkOEuSQUy3CWpQIa7JBWoUrhHxPyI2BUR7RGx4gztro2IjIjW+pUoSeqvPsM9IkYBq4GPAi3A0oho6aXdeOBLwBP1LlKS1D9VRu5zgfbM3J2ZR4F1wMJe2v098D3gSB3rkyQNQJVwnwrs67bc0bXuhIh4NzAtMx+sY22SpAGqEu7Ry7o8sTHiPOAfgb/pc0cRN0REW0S0HTp0qHqVkqR+qRLuHcC0bsvNwIFuy+OBOcBjEbEHmAds6O2gamauyczWzGydPHnywKuWJJ1RlXDfAsyKiBkRMRZYAmw4vjEzX87MpsycnpnTgc3AgsxsG5SKJUl96jPcM/MYcBOwCdgJ3JuZ2yNiVUQsGOwCJUn9N7pKo8zcCGzsse5bp2n7wbMvS5J0NrxCVZIKZLhLUoEMd0kqkOEuSQUy3CWpQIa7JBXIcJekAhnuklQgw12SCmS4S1KBDHdJKpDhLkkFMtwlqUCGuyQVyHCXpAIZ7pJUIMNdkgpkuEtSgQx3SSqQ4S5JBTLcJalAhrskFchwl6QCGe6SVCDDXZIKZLhLUoEMd0kqkOEuSQUy3CWpQIa7JBXIcJekAlUK94iYHxG7IqI9Ilb0sv3miNgREU9HxMMR8c76lypJqqrPcI+IUcBq4KNAC7A0Ilp6NPs90JqZfwrcB3yv3oVKkqqrMnKfC7Rn5u7MPAqsAxZ2b5CZj2bma12Lm4Hm+pYpSeqPKuE+FdjXbbmja93pLAceOpuiJElnZ3SFNtHLuuy1YcSngFbgqtNsvwG4AeDSSy+tWKIkqb+qjNw7gGndlpuBAz0bRcRHgJXAgsx8vbcdZeaazGzNzNbJkycPpF5JUgVVwn0LMCsiZkTEWGAJsKF7g4h4N/BjasH+Qv3LlCT1R5/hnpnHgJuATcBO4N7M3B4RqyJiQVez/wm8Dfi3iNgaERtOsztJ0hCoMudOZm4ENvZY961urz9S57okSWfBK1QlqUCGuyQVyHCXpAIZ7pJUIMNdkgpkuEtSgQx3SSqQ4S5JBTLcJalAhrskFchwl6QCGe6SVCDDXZIKZLhLUoEMd0kqUKX7uZ9L7n5iL+u37j9p3Y6Dr9AyZUKDKpKkc8+wG7mv37qfHQdfOWldy5QJLLxyaoMqkqRzz7AbuUMtzO/5wvsbXYYknbOG3chdktQ3w12SCjQsp2UkDa033niDjo4Ojhw50uhSRoxx48bR3NzMmDFjBvT7hrukPnV0dDB+/HimT59ORDS6nOJlJp2dnXR0dDBjxowB7cNpGUl9OnLkCBdddJHBPkQigosuuuis/k/JcJdUicE+tM728zbcJQ0b999/PxHBH/7wBwAee+wxPvaxj53U5rrrruO+++4DascKVqxYwaxZs5gzZw5z587loYceqvRer7/+OosXL2bmzJm8733vY8+ePb22++EPf8icOXOYPXs2P/jBD07aduutt3L55Zcze/Zsvva1r51Y/93vfpeZM2dy+eWXs2nTpqrd7xfDXdKwsXbtWj7wgQ+wbt26Su2/+c1vcvDgQbZt28a2bdv4xS9+wauvvlrpd++44w4uvPBC2tvb+cpXvsItt9xySptt27Zx22238eSTT/LUU0/x4IMP8uyzzwLw6KOPsn79ep5++mm2b9/OV7/6VQB27NjBunXr2L59O7/85S+58cYbefPNNyt+AtUZ7pKGhT/+8Y88/vjj3HHHHZXC/bXXXuO2227j1ltv5fzzzwfgHe94B4sWLar0fuvXr+czn/kMANdeey0PP/wwmXlSm507dzJv3jwuuOACRo8ezVVXXcX9998PwI9+9CNWrFhx4r0vvvjiE/tdsmQJ559/PjNmzGDmzJk8+eST1T6EfvBsGUn98t9/sZ0dB17pu2E/tFwygW//5ewztnnggQeYP38+l112GZMmTeJ3v/vdGdu3t7dz6aWXMmFC7/edWrx4Mbt27Tpl/c0338yyZcvYv38/06ZNA2D06NFMnDiRzs5OmpqaTrSdM2cOK1eupLOzk7e85S1s3LiR1tZWAJ555hl+/etfs3LlSsaNG8f3v/993vve97J//37mzZt3Yh/Nzc3s33/y/bLqwXCXNCysXbuWL3/5ywAsWbKEtWvXnjLfflyVg5H33HPPGbf3HKX3tt8rrriCW265hauvvpq3ve1tvOtd72L06FqsHjt2jMOHD7N582a2bNnCokWL2L17d6X91oPhLqlf+hphD4bOzk4eeeQRtm3bRkTw5ptvEhEsW7aMw4cPn9T2pZdeoqmpiZkzZ7J3715effVVxo8ff8o++xq5Nzc3s2/fPpqbmzl27Bgvv/wykyZNOqX98uXLWb58OQBf//rXaW5uBmoj8k984hNEBHPnzuW8887jxRdfPLHf4zo6OrjkkkvO6vPpjXPuks559913H8uWLeP5559nz5497Nu3jxkzZvDSSy9x4MABdu7cCcDzzz/PU089xZVXXskFF1zA8uXL+dKXvsTRo0cBOHjwIHfddRdQG7lv3br1lJ9ly5YBsGDBAu68884T7/+hD32o1xH2Cy+8AMDevXv5+c9/ztKlSwH4+Mc/ziOPPALUpmiOHj1KU1MTCxYsYN26dbz++us899xzPPvss8ydO7fun1mlkXtEzAd+CIwCbs/M/9Fj+/nAz4D/BHQCizNzT31LlTRSrV27lhUrVpy07pOf/CTr1q3jrrvu4rOf/SxHjhxhzJgx3H777UycOBGA73znO3zjG9+gpaWFcePG8da3vpVVq1ZVes/ly5fz6U9/mpkzZzJp0qQTB3EPHDjA5z73OTZu3Hiijs7OTsaMGcPq1au58MILAbj++uu5/vrrmTNnDmPHjuXOO+8kIpg9ezaLFi2ipaWF0aNHs3r1akaNGlWvj+qE6G3+56QGEaOAZ4CrgQ5gC7A0M3d0a3Mj8KeZ+cWIWAL8l8xcfKb9tra2ZltbW78LXvzj3wB4y19pCO3cuZMrrrii0WWMOL197hHx28xs7et3q0zLzAXaM3N3Zh4F1gELe7RZCNzZ9fo+4MMxSJeztVwygZZLfOqSJJ1JlWmZqcC+bssdwPtO1yYzj0XEy8BFwIv1KLK7RhzMkaThpsrIvbcReM+5nCptiIgbIqItItoOHTpUpT5J0gBUCfcOYFq35WbgwOnaRMRoYCLwUs8dZeaazGzNzNbJkycPrGJJDdHX8TnV19l+3lXCfQswKyJmRMRYYAmwoUebDcBnul5fCzyS/iVIxRg3bhydnZ0G/BA5fj/3cePGDXgffc65d82h3wRsonYq5E8yc3tErALaMnMDcAfwrxHRTm3EvmTAFUk65zQ3N9PR0YHTqUPn+JOYBqrPUyEHy0BPhZSkkayep0JKkoYZw12SCmS4S1KBGjbnHhGHgOcH+OtNDMIFUuc4+zwy2OeR4Wz6/M7M7PNc8oaF+9mIiLYqBxRKYp9HBvs8MgxFn52WkaQCGe6SVKDhGu5rGl1AA9jnkcE+jwyD3udhOecuSTqz4TpylySdwTkd7hExPyJ2RUR7RKzoZfv5EXFP1/YnImL60FdZXxX6fHNE7IiIpyPi4Yh4ZyPqrKe++tyt3bURkREx7M+sqNLniFjU9V1vj4i7h7rGeqvwt31pRDwaEb/v+vu+phF11ktE/CQiXoiIbafZHhHxT12fx9MR8Z66FpCZ5+QPtZuU/R/gT4CxwFNAS482NwL/0vV6CXBPo+segj7/OXBB1+u/Ggl97mo3HvgVsBlobXTdQ/A9zwJ+D1zYtXxxo+segj6vAf6q63ULsKfRdZ9ln/8z8B5g22m2XwM8RO15GPOAJ+r5/ufyyP2cerzfEOmzz5n5aGa+1rW4mdr99YezKt8zwN8D3wOODGVxg6RKnz8PrM7MwwCZ+cIQ11hvVfqcwPFnaE7k1OdGDCuZ+St6ea5FNwuBn2XNZuDtETGlXu9/Lod7b4/3m3q6Npl5DDj+eL/hqkqfu1tO7b/8w1mffY6IdwPTMvPBoSxsEFX5ni8DLouIxyNic0TMH7LqBkeVPv8d8KmI6AA2An89NKU1TH//vfdLlWeoNkrdHu83jFTuT0R8CmgFrhrUigbfGfscEecB/whcN1QFDYEq3/NoalMzH6T2f2e/jog5mfkfg1zbYKnS56XATzPzf0XE+6k9I2JOZv6/wS+vIQY1v87lkXvdHu83jFTpMxHxEWAlsCAzXx+i2gZLX30eD8wBHouIPdTmJjcM84OqVf+212fmG5n5HLCLWtgPV1X6vBy4FyAzfwOMo3YPllJV+vc+UOdyuI/Ex/v12eeuKYofUwv24T4PC330OTNfzsymzJyemdOpHWdYkJnD+UkvVf62H6B28JyIaKI2TbN7SKusryp93gt8GCAirqAW7iU/+mkDsKzrrJl5wMuZebBue2/0EeU+jjZfAzxD7Sj7yq51q6j944bal/9vQDvwJPAnja55CPr8v4H/C2zt+tnQ6JoHu8892j7GMD9bpuL3HMA/ADuAfweWNLrmIehzC/A4tTNptgJ/0eiaz7K/a4GDwBvURunLgS8CX+z2Ha/u+jz+vd5/116hKkkFOpenZSRJA2S4S1KBDHdJKpDhLkkFMtwlqUCGuyQVyHCXpAIZ7pJUoP8P+/6FnNvlw3sAAAAASUVORK5CYII=\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "\n", "\n", "\n", "# %%\n", "cls = LogisticRegression(C=1e26)\n", "cls.fit(data_num.drop(columns=['After hurricane','const']),\n", " data_num['After hurricane'])\n", "print (cls.coef_)\n", "pred1 = cls.predict_proba(data_num.drop(columns=['After hurricane','const']))[:,1]\n", "\n", "# %%\n", "xroc,yroc,_ = roc_curve(data_num['After hurricane'],pred1 )\n", "plot(xroc,yroc, label='AUC=%.3f'%roc_auc_score(data_num['After hurricane'],pred1))\n", "legend()\n", "\n" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Optimization terminated successfully.\n", " Current function value: 0.187765\n", " Iterations 9\n", "Optimization terminated successfully.\n", " Current function value: 0.264155\n", " Iterations 8\n", "Optimization terminated successfully.\n", " Current function value: 0.203891\n", " Iterations 9\n", "Optimization terminated successfully.\n", " Current function value: 0.230124\n", " Iterations 9\n", "Optimization terminated successfully.\n", " Current function value: 0.289693\n", " Iterations 8\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXcAAAD8CAYAAACMwORRAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAIABJREFUeJzt3Xl0lfW97/H3V0ISQFAhYIEIYQiEEDTQGGVRq0fFRkZBZDgLGYpSemtdDseSK9Vjrfc4YOsEt0W0VMsxAUHKYBxBlKMihhYljOUiQwgyBMqUMgR/948dYkhC9k6yk733k89rrSz2s/cvv+f77B2++eYZvo855xAREW+5KNQBiIhI8Cm5i4h4kJK7iIgHKbmLiHiQkruIiAcpuYuIeJCSu4iIBym5i4h4kJK7iIgHRYVqxXFxcS4hISFUqxcRiUhr16496Jxr7W9cyJJ7QkICubm5oVq9iEhEMrOdgYzTbhkREQ9SchcR8SAldxERD1JyFxHxICV3EREP8pvczexPZrbfzPIu8LqZ2Ytmts3MvjazPsEPU0REqiOQyv3PQEYVr98KJJZ8TQb+UPuwRESkNvye5+6c+8TMEqoYMhR43fnu17fazC41s7bOub1BijEkNqzaw9Y1+0IdxoUd+xZOHAh1FCF1gLMc4myowwipZscdTYt0q8xIYxft46dzHq3TdQRjn3t7YHeZ5fyS5yows8lmlmtmuQcOhHdi2rpmHwfzj4c6jAs7cQBOnwh1FCF1iLMU0bATW9MiR/RpD78HHt60uhaMK1Stkucq/Uiccy8DLwOkpaWF/ccWF38xwx4M00MIcx7x/Tvx7dDGEUIT350IwJyMOSGOJHR23jkOoqDja6+HOpSgGzXrcwDm/axviCOJTMGo3POBK8osxwMFQZhXRERqKBjJfQkwruSsmWuBI5G+v11EJNL53S1jZlnADUCcmeUD/wk0BnDO/RHIAQYA24AiYGJdBSsiIoEJ5GyZMX5ed8AvghaRiIjUmq5QFRHxICV3EREPUnIXEfEgJXcREQ8K2W32JPy8ufVNcrbnhDqMgG05tIXuLbuHOgyRsKTKXUrlbM9hy6EtoQ4jYN1bdmdA5wGhDkMkLKlyl/N0b9m9QV/OH4g3vtjF4nV7Qh0GABP2HgXgVyWX6nvJxr1HSW7bItRhRCxV7iLVtHjdHjaWJFWpO8ltWzA0tdIehBIAVe4iNZDctkVYNLTa+T++yjYcYpHwospdRMSDlNxFRDxIyV1ExIOU3EVEPEjJXUTEg5TcRUQ8SKdCnpM7B9Yv+H7525G+f8/dq7SG3uQ4OVYHN7I+cwKim8G7wbs3ii7nF/EOVe7nrF8A364P+rQ5doItnA76vEQ3g2atgzqlLucX8Q5V7mX9oBdMfNv3+Hd/8/07cXzt5nx3It1Bl/SLSL1S5S4i4kGq3CUihFOzLjW0kkigyl0iQjg161JDK4kEqtwlYoRLsy6RSKDKXUTEg5TcRUQ8SMldRMSDlNxFRDxIB1RLlLYJKLmcv/uh/gBMfPelWs2rS/pFJBRUuZeoqzYBuqRfREJBlXsZ3YkubROwaIOv/UBmxr+HMqR6dXjefI4uWxbqMCo1oeQc93P3DBWfk5s3E5uUFOowJAypcpdSR5ct4+TmzaEOQ6ohNimJFoMGhToMCUOq3OU8sUlJdPzL66EOo4JfzfocQBcxiQRIlbuIiAcFVLmbWQbwAtAIeMU591S51zsArwGXlozJdM7lBDlWqUfh1KgL1KxLpLr8Vu5m1giYCdwKJANjzCy53LBfA/Odc72B0cD/DXagUr/CqVEXqFmXSHUFUrmnA9ucc9sBzCwbGApsLDPGAefKqkuAgmAGKaGhRl0ikSuQ5N4e2F1mOR+4ptyYx4D3zeyXQDPg5qBEJyIiNRLIAVWr5DlXbnkM8GfnXDwwAPiLmVWY28wmm1mumeUeOHCg+tGKiEhAAknu+cAVZZbjqbjbZRIwH8A59zkQC8SVn8g597JzLs05l9a6dXBv7iwiIt8LJLl/CSSaWSczi8Z3wHRJuTG7gJsAzKwHvuSu0lxEJET8JnfnXDFwD/AesAnfWTEbzOxxMxtSMuxB4G4z+wrIAiY458rvuhERkXoS0HnuJees55R77tEyjzcC/YIbmoiI1JSuUBUR8SAldxERD1JyFxHxICV3EREPUsvfMBEOjbrO3RDjV7M+V6MukQinyj1MqFGXiASTKvcwEupGXeduYadmYSKRT5W7iIgHKbmLiHiQkruIiAdpn3sdOzxvPkeXLfM77tyZKuf2e4fCyc2biU1KCtn6RSR4VLnXsaPLlnFy8+ZQhxGQ2KQkWgwaFOowRCQIVLnXg9ikJDr+5fUqx/xq1ueAzlQRkeBQ5S4i4kFK7iIiHhRxu2U2rNrD1jX7gj5v929/BsCi3/0NgIP5x4mLvzjo6xERqQ8RV7lvXbOPg/nH63w9cfEX0y398jpfj4hIXYi4yh18iXfYg32COufEP08GIHNCblDnFREJhchL7se+hRMHYM4jwZ33zAmIbhbcOUVEQiTidstw4gCcPhH8eaObQbPWwZ9XRCQEIq9yB18invh2cOd8d2Jw5xMRCaHIq9xFRMSvyKzc68BVn+2jx9qD7PzvcUGdV/1aRCQUVLmX6LH2IG32FAV9XvVrEZFQUOVexv72TentpweMiEgkUOUuIuJBSu4iIh6k5C4i4kFK7iIiHqTkLiLiQUruIiIepOQuIuJBSu4iIh4UUHI3swwz22Jm28ws8wJjRprZRjPbYGZvBDdMERGpDr9XqJpZI2Am0B/IB740syXOuY1lxiQC/xvo55w7bGZt6ipgERHxL5DKPR3Y5pzb7pw7DWQDQ8uNuRuY6Zw7DOCc2x/cMEVEpDoCSe7tgd1llvNLniurG9DNzD41s9VmllHZRGY22cxyzSz3wIEDNYtYRET8CiS5WyXPuXLLUUAicAMwBnjFzC6t8E3OveycS3POpbVurbseiYjUlUCSez5wRZnleKCgkjGLnXNnnHPfAFvwJXsREQmBQJL7l0CimXUys2hgNLCk3Ji/Av8GYGZx+HbTbA9moCIiEji/yd05VwzcA7wHbALmO+c2mNnjZjakZNh7QKGZbQQ+Ah5yzhXWVdAiIlK1gG7W4ZzLAXLKPfdomccOeKDkS0REQkxXqIqIeJBus1dLb3yxi8Xr9tR6no17j5LctkUQIhIRUeVea4vX7WHj3qO1nie5bQuGppa/fEBEpGZUuQdBctsWzPtZ31CHISJSSpW7iIgHKbmLiHiQkruIiAcpuYuIeJCSu4iIBym5i4h4kJK7iIgHKbmLiHiQkruIiAcpuYuIeJCSu4iIBym5i4h4kJK7iIgHKbmLiHiQkruIiAcpuYuIeJCSu4iIBym5i4h4kJK7iIgHKbmLiHiQkruIiAcpuYuIeJCSu4iIBym5i4h4kJK7iIgHKbmLiHiQkruIiAcpuYuIeFBAyd3MMsxsi5ltM7PMKsaNMDNnZmnBC1FERKrLb3I3s0bATOBWIBkYY2bJlYxrDtwLfBHsIEVEpHoCqdzTgW3Oue3OudNANjC0knG/BZ4BTgYxPhERqYFAknt7YHeZ5fyS50qZWW/gCufcsiDGJiIiNRQVwBir5DlX+qLZRcBzwAS/E5lNBiYDdOjQIbAIRSTkzpw5Q35+PidP6g/z+hIbG0t8fDyNGzeu0fcHktzzgSvKLMcDBWWWmwMpwEozA/gBsMTMhjjncstO5Jx7GXgZIC0tzSEiESE/P5/mzZuTkJBAyf9zqUPOOQoLC8nPz6dTp041miOQ3TJfAolm1snMooHRwJIyQRxxzsU55xKccwnAaqBCYheRyHXy5ElatWqlxF5PzIxWrVrV6i8lv8ndOVcM3AO8B2wC5jvnNpjZ42Y2pMZrFpGIosRev2r7fgd0nrtzLsc5180518U5939KnnvUObekkrE3qGoXkbqwaNEizIzNmzcDsHLlSgYNGnTemAkTJrBgwQLAd6wgMzOTxMREUlJSSE9P55133gloXadOnWLUqFF07dqVa665hh07dlQYs2XLFlJTU0u/WrRowfPPPw/AV199Rd++fenVqxeDBw/m6NGjpTGNHz+eXr160aNHD5588smavh1V0hWqIhIxsrKy+NGPfkR2dnZA4x955BH27t1LXl4eeXl5LF26lGPHjgX0va+++iqXXXYZ27Zt4/7772fq1KkVxnTv3p1169axbt061q5dS9OmTRk2bBgAd911F0899RTr169n2LBhTJ8+HYA333yTU6dOsX79etauXcusWbMq/cVRW0ruIhIRjh8/zqeffsqrr74aUHIvKipi9uzZvPTSS8TExABw+eWXM3LkyIDWt3jxYsaPHw/AiBEjWL58Oc5d+DyQ5cuX06VLFzp27Aj4qvof//jHAPTv35+FCxcCvt0tJ06coLi4mH/9619ER0fTokWLgGKqjkDOlhERKfWbpRvYWHA0qHMmt2vBfw7uWeWYv/71r2RkZNCtWzdatmzJ3/72tyrHb9u2jQ4dOlwwcY4aNYotW7ZUeP6BBx5g3Lhx7Nmzhyuu8J0oGBUVxSWXXEJhYSFxcXGVzpednc2YMWNKl1NSUliyZAlDhw7lzTffZPdu3+VCI0aMYPHixbRt25aioiKee+45WrZsWeW21ISSu4hEhKysLO677z4ARo8eTVZWVoX97ecEcjBy3rx5Vb5eWZV+oXlPnz7NkiVLztt//qc//Yl7772Xxx9/nCFDhhAdHQ3AmjVraNSoEQUFBRw+fJjrrruOm2++mc6dO/uNuTqU3EWkWvxV2HWhsLCQFStWkJeXh5lx9uxZzIxx48Zx+PDh88YeOnSIuLg4unbtyq5duzh27BjNmzevMKe/yj0+Pp7du3cTHx9PcXExR44cuWCF/c4779CnTx8uv/zy0ueSkpJ4//33Adi6dStvv/02AG+88QYZGRk0btyYNm3a0K9fP3Jzc4Oe3LXPXUTC3oIFCxg3bhw7d+5kx44d7N69m06dOnHo0CEKCgrYtGkTADt37uSrr74iNTWVpk2bMmnSJO69915Onz4NwN69e5k7dy7gq9zPHQwt+zVu3DgAhgwZwmuvvVa6/htvvPGClXtWVtZ5u2QA9u/fD8B3333HE088wZQpUwDf1fkrVqzAOceJEydYvXo1SUlJQX7HlNxFJAJkZWWVnoVyzu233052djZz585l4sSJpKamMmLECF555RUuueQSAJ544glat25NcnIyKSkp3HbbbbRu3TqgdU6aNInCwkK6du3K73//e5566ikACgoKGDBgQOm4oqIiPvjgA4YPH14h5m7dupGUlES7du2YOHEiAL/4xS84fvw4KSkpXH311UycOJErr7yyxu/NhVhVR3/rUlpamsvNrf7p8Ise8v0mHTZ9fFDjeW9gOgA/eXtNtb5v1KzPAZj3s75BjUcknGzatIkePXqEOowGp7L33czWOuf83jNDlbuIiAcpuYuIeJCSu4iIBym5i4h4UIM9z/2NL3axeN2e0uXRp88C3x8gDdTGvUdJbhv8S4dFRGqjwVbui9ftYePe2l9Cndy2BUNT2/sfKCJSjxps5Q6+xHzuFMb3ljQCdEqjSDhbtGgRw4cPZ9OmTSQlJbFy5UqeffZZli37/vbNEyZMYNCgQYwYMYIzZ87wyCOPsHDhQmJiYmjatCm/+c1vuPXWW/2u69SpU4wbN461a9fSqlUr5s2bR0JCQoVxL7zwArNnz8Y5x913313aIgHgpZdeYsaMGURFRTFw4ECeeeYZPvjgAzIzMzl9+jTR0dFMnz6dG2+8MSjvT1kNOrmLSGQp2/L3scce8zu+bMvfmJgY9u3bx8cffxzQusq2/M3Ozmbq1KkV+tHk5eUxe/Zs1qxZQ3R0NBkZGQwcOJDExEQ++ugjFi9ezNdff01MTEzpFatxcXEsXbqUdu3akZeXx09+8hP27NlTWQi10mB3y4hIZAnHlr+bNm3i2muvpWnTpkRFRXH99dezaNEiAP7whz+QmZlZuu42bdoA0Lt3b9q1awdAz549OXnyJKdOnQoopupQ5S4i1fNOJny7Prhz/qAX3PpUlUPCseVvSkoK06ZNo7CwkCZNmpCTk0Namu/i0a1bt7Jq1SqmTZtGbGwszz77LFdfffV561q4cCG9e/cu/QUQTEruIhIRwrHlb48ePZg6dSr9+/fn4osv5qqrriIqypdWi4uLOXz4MKtXr+bLL79k5MiRbN++vXSODRs2MHXq1NLOkcGm5C4i1eOnwq4L4dzyd9KkSUyaNAmAhx9+mPj4eADi4+MZPnw4ZkZ6ejoXXXQRBw8epHXr1uTn5zNs2DBef/11unTpEoy3qALtcxeRsBfOLX/PHSjdtWsXb731Vmnr39tuu40VK1YAvl00p0+fJi4ujn/+858MHDiQJ598kn79+tXZe6bkLiJhL5xb/t5+++0kJyczePBgZs6cyWWXXQbAT3/6U7Zv305KSgqjR4/mtddew8yYMWMG27Zt47e//S2pqamkpqaW/oIIpgbb8rd8q96atvwVaQjU8jc01PJXRETOo+QuIuJBSu4iIh6k5C4i4kFK7iIiHqTkLiLiQUruIhIxFi1ahJmxefNmAFauXFmhBcGECRNYsGABAGfOnCEzM5PExERSUlJIT0/nnXfeCWhdp06dYtSoUXTt2pVrrrmGHTt2VDruueeeo2fPnqSkpDBmzBhOnjwJwIoVK+jTpw8pKSmMHz+e4uJiAI4cOcLgwYO56qqr6NmzJ3PmzKnJW+GXkruIRIyyLX8DUbblb15eHkuXLuXYsWMBfW/Zlr/3338/U6dOrTBmz549vPjii+Tm5pKXl8fZs2fJzs7mu+++Y/z48WRnZ5OXl0fHjh1Lr3adOXMmycnJfPXVV6xcuZIHH3yw9AraYFJyF5GIEI4tf8HXIOxf//oXxcXFFBUV0a5dOwoLC4mJiaFbt24A9O/fn4ULFwK+5mPHjh3DOcfx48dp2bJlabOxYFLjMBGplqfXPM3mQ5uDOmdSyySmplesjMsKx5a/7du35z/+4z/o0KEDTZo04ZZbbuGWW27BOceZM2fIzc0lLS2NBQsWsHv3bgDuuecehgwZQrt27Th27Bjz5s3joouCX2cHNKOZZZjZFjPbZmaZlbz+gJltNLOvzWy5mXUMeqQi0qBlZWUxevRo4PuWvxdq7Rtoy9+qGocF0vL38OHDLF68mG+++YaCggJOnDjB3LlzMTOys7O5//77SU9Pp3nz5qXV+XvvvUdqaioFBQWsW7eOe+65h6NHa38/5/L8Vu5m1giYCfQH8oEvzWyJc25jmWF/B9Kcc0Vm9nPgGWBU0KMVkZDzV2HXhXBt+fvhhx/SqVOn0mZkw4cP57PPPmPs2LH07duXVatWAfD++++zdetWAObMmUNmZiZmRteuXenUqRObN28mPT09KO/VOYFU7unANufcdufcaSAbGFp2gHPuI+dcUcniaiA+qFGKSIMWri1/O3TowOrVqykqKsI5x/Lly0sbfZ3r9Hjq1CmefvpppkyZUvo9y5cvB2Dfvn1s2bKFzp07B/09CyS5twd2l1nOL3nuQiYBgZ1rJCISgHBt+XvNNdcwYsQI+vTpQ69evfjuu++YPHkyANOnT6dHjx5ceeWVDB48mBtvvBHwncHz2Wef0atXL2666Saefvrp8/bjB4vflr9mdgfwE+fcXSXLdwLpzrlfVjJ2LHAPcL1zrsIdX81sMjAZoEOHDj/cuXNntQNWy1+R+qeWv6FR1y1/84EryizHAwXlB5nZzcA0YEhliR3AOfeycy7NOZcW6G9PERGpvkCS+5dAopl1MrNoYDSwpOwAM+sNzMKX2IN/SxEREakWv8ndOVeMb1fLe8AmYL5zboOZPW5mQ0qGTQcuBt40s3VmtuQC04mISD0I6CIm51wOkFPuuUfLPL45yHGJiEgtqP2AiIgHRVz7geLjxRQXnWXnneNqNc+Evb4rwnb+j+/S5DZ7itjfvmmt4xMRCQcRV7kXF53lu9NVn75ZE/vbN2XTD4N/rqmIBE8ktfydMWMGXbt2xcw4ePBg6fjp06eTmppKamoqKSkpNGrUiEOHDlX3rfAr4ip3gIuijY6vvV6rOX5V7jz3x96dWOu4RKRulW35+9hjj/kdX7blb0xMDPv27ePjjz8OaF1lW/5mZ2czdepU5s2bd96Ycy1/N27cSJMmTRg5ciTZ2dlMmDCBfv36MWjQIG644Ybzvuehhx7ioYceAmDp0qU899xzFdoaBEPEVe4i0jBFUstfgN69e5OQkFDlOrKyshgzZkxA8VRXRFbuIhI63/7Xf3FqU3Bb/sb0SOIHDz9c5ZhIavkbiKKiIt59911mzJgR0PjqUnIXkYiQlZXFfffdB3zf8rf8/vZzAm35W5Xqtvy99NJLueOOO5g7dy5jx471u/6lS5fSr1+/OtklA0ruIlJN/irsuhCJLX/9yc7OrrNdMqB97iISASKx5W9Vjhw5wscff8zQoUP9jq2piKvcnfP9uXSuq2NNbdx7lOS2le+LE5HwkpWVRWbm+TeBK9/y9+TJkzRu3LhCy99f//rXJCcnExsbS7NmzXj88ccDWuekSZO488476dq1Ky1btiw9iFtQUMBdd91FTk7OeS1/o6Ki6N27d2nL3xdffJFnnnmGb7/9liuvvJIBAwbwyiuvAL5TOm+55RaaNWsWrLeoAr8tf+tKWlqay83Nrfb3/feds3DA0h9dWesYhqa259+v6QDAxJJTIedkzKn1vCJeo5a/oVGblr8RV7kDGN+fny4iIhVpn7uIiAcpuYuIeJCSu4iIBym5i4h4kJK7iIgHKbmLSMQIx5a/L7zwAikpKfTs2ZPnn3++9PlRo0aVtvZNSEggNTUVgB07dtCkSZPS16ZMmVLdtyEgEXkqpIg0TOHW8jcvL4/Zs2ezZs0aoqOjycjIYODAgSQmJp439sEHHyy9sAqgS5curFu3LrCNriFV7iISEcKx5e+mTZu49tpradq0KVFRUVx//fUsWrTovDHOOebPn1+nfWQqo8pdRKpl1fytHNx9PKhzxl1xMdeN7FblmHBs+ZuSksK0adMoLCykSZMm5OTkkJZ2/sWjq1at4vLLLycxMbH0uW+++YbevXvTokULnnjiCa677roqt6UmlNxFJCKEY8vfHj16MHXqVPr378/FF1/MVVddRVTU+Wm1/A052rZty65du2jVqhVr167ltttuY8OGDRf8JVRTSu4iUi3+Kuy6EK4tf8HXYGzSpEkAPPzww8THx5e+VlxczFtvvcXatWtLn4uJiSndTfTDH/6QLl26sHXr1goVf21FXnIP4DeyiHjLuZa/s2bNKn3u+uuvP6/lb48ePS7Y8nfWrFlER0ezd+9eli9fztixY/1W7uda/vbt2/eCLX8B9u/fT5s2bdi1axdvvfUWn3/+fcfaDz/8kKSkpPMS/oEDB2jZsiWNGjVi+/bt/OMf/6Bz585BeJfOF3HJ/Yw5zvJ9F8dg2XJoC91bdg/qnCISHOHa8vdcHIWFhTRu3JiZM2dy2WWXlc5R2Q05PvnkEx599FGioqJo1KgRf/zjH+vkbkwR1/L3lcmvUOwcX9z+WdBjGtB5AHd0uyPo84pEOrX8DY0G1fK3+80/BmBKxt0hjkREJHxFXHIPxcEcEZFIo4uYREQ8SMldRAISquNzDVVt328ldxHxKzY2lsLCQiX4euKco7CwkNjY2BrPEXH73EWk/sXHx5Ofn8+BAwdCHUqDERsbe9758dWl5C4ifjVu3JhOnTqFOgypBu2WERHxICV3EREPUnIXEfGgkLUfMLMDwM4afnsccDCI4UQCbXPDoG1uGGqzzR2dc639DQpZcq8NM8sNpLeCl2ibGwZtc8NQH9us3TIiIh6k5C4i4kGRmtxfDnUAIaBtbhi0zQ1DnW9zRO5zFxGRqkVq5S4iIlUI6+RuZhlmtsXMtplZZiWvx5jZvJLXvzCzhPqPMrgC2OYHzGyjmX1tZsvNrGMo4gwmf9tcZtwIM3NmFvFnVgSyzWY2suSz3mBmb9R3jMEWwM92BzP7yMz+XvLzPSAUcQaLmf3JzPabWd4FXjcze7Hk/fjazPoENQDnXFh+AY2A/wd0BqKBr4DkcmP+F/DHksejgXmhjrsetvnfgKYlj3/eELa5ZFxz4BNgNZAW6rjr4XNOBP4OXFay3CbUcdfDNr8M/LzkcTKwI9Rx13Kbfwz0AfIu8PoA4B3AgGuBL4K5/nCu3NOBbc657c6500A2MLTcmKHAayWPFwA3WWW3J48cfrfZOfeRc66oZHE1UPO2ceEhkM8Z4LfAM8DJ+gyujgSyzXcDM51zhwGcc/vrOcZgC2SbHdCi5PElQEE9xhd0zrlPgENVDBkKvO58VgOXmlnbYK0/nJN7e2B3meX8kucqHeOcKwaOAK3qJbq6Ecg2lzUJ32/+SOZ3m82sN3CFc25ZfQZWhwL5nLsB3czsUzNbbWYZ9RZd3Qhkmx8DxppZPpAD/LJ+QguZ6v5/r5ZwbvlbWQVe/tSeQMZEkoC3x8zGAmnA9XUaUd2rcpvN7CLgOWBCfQVUDwL5nKPw7Zq5Ad9fZ6vMLMU59886jq2uBLLNY4A/O+d+Z2Z9gb+UbPN3dR9eSNRp/grnyj0fuKLMcjwV/0wrHWNmUfj+lKvqz6BwF8g2Y2Y3A9OAIc65U/UUW13xt83NgRRgpZntwLdvckmEH1QN9Gd7sXPujHPuG2ALvmQfqQLZ5knAfADn3OdALL4eLF4V0P/3mgrn5P4lkGhmncwsGt8B0yXlxiwBxpc8HgGscCVHKiKU320u2UUxC19ij/T9sOBnm51zR5xzcc65BOdcAr7jDEOcc7mhCTcoAvnZ/iu+g+eYWRy+3TTb6zXK4Apkm3cBNwGYWQ98yd3Lt35aAowrOWvmWuCIc25v0GYP9RFlP0ebBwBb8R1ln1by3OP4/nOD78N/E9gGrAE6hzrmetjmD4F9wLqSryWhjrmut7nc2JVE+NkyAX7OBvwe2AisB0aHOuZ62OZk4FN8Z9KsA24Jdcy13N4sYC9wBl+VPgmYAkwp8xnPLHk/1gf751pXqIqIeFA475YyDozDAAAAMUlEQVQREZEaUnIXEfEgJXcREQ9SchcR8SAldxERD1JyFxHxICV3EREPUnIXEfGg/w+v56GPULjUNwAAAABJRU5ErkJggg==\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# %%\n", "X = data_num.drop(columns=['After hurricane']).values\n", "y=data_num['After hurricane'].values\n", "np.random.seed(0)\n", "cv = KFold(5, shuffle=True)\n", "for train_idx, test_idx in cv.split(X,y):\n", " logit = sm.Logit( y[train_idx], X[train_idx])\n", " result = logit.fit()\n", " p = result.predict(X[test_idx])\n", " xroc,yroc,_ = roc_curve(y[test_idx],p)\n", " plot(xroc,yroc, label='AUC=%.3f'%(roc_auc_score(y[test_idx],p)))\n", "legend()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.8" } }, "nbformat": 4, "nbformat_minor": 2 }