{ "cells": [ { "cell_type": "markdown", "id": "minute-lender", "metadata": {}, "source": [ "
\n", " | x0 | \n", "x1 | \n", "x2 | \n", "x3 | \n", "x4 | \n", "x5 | \n", "x6 | \n", "x7 | \n", "x8 | \n", "x9 | \n", "... | \n", "x41 | \n", "x42 | \n", "x43 | \n", "x44 | \n", "x45 | \n", "x46 | \n", "x47 | \n", "x48 | \n", "x49 | \n", "y | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | \n", "-0.166563 | \n", "-3.961588 | \n", "4.621113 | \n", "2.481908 | \n", "-1.800135 | \n", "0.804684 | \n", "6.718751 | \n", "-14.789997 | \n", "-1.040673 | \n", "-4.204950 | \n", "... | \n", "-1.497117 | \n", "5.414063 | \n", "-2.325655 | \n", "1.674827 | \n", "-0.264332 | \n", "60.781427 | \n", "-7.689696 | \n", "0.151589 | \n", "-8.040166 | \n", "0 | \n", "
1 | \n", "-0.149894 | \n", "-0.585676 | \n", "27.839856 | \n", "4.152333 | \n", "6.426802 | \n", "-2.426943 | \n", "40.477058 | \n", "-6.725709 | \n", "0.896421 | \n", "0.330165 | \n", "... | \n", "36.292790 | \n", "4.490915 | \n", "0.762561 | \n", "6.526662 | \n", "1.007927 | \n", "15.805696 | \n", "-4.896678 | \n", "-0.320283 | \n", "16.719974 | \n", "0 | \n", "
2 | \n", "-0.321707 | \n", "-1.429819 | \n", "12.251561 | \n", "6.586874 | \n", "-5.304647 | \n", "-11.311090 | \n", "17.812850 | \n", "11.060572 | \n", "5.325880 | \n", "-2.632984 | \n", "... | \n", "-0.368491 | \n", "9.088864 | \n", "-0.689886 | \n", "-2.731118 | \n", "0.754200 | \n", "30.856417 | \n", "-7.428573 | \n", "-2.090804 | \n", "-7.869421 | \n", "0 | \n", "
3 | \n", "-0.245594 | \n", "5.076677 | \n", "-24.149632 | \n", "3.637307 | \n", "6.505811 | \n", "2.290224 | \n", "-35.111751 | \n", "-18.913592 | \n", "-0.337041 | \n", "-5.568076 | \n", "... | \n", "15.691546 | \n", "-7.467775 | \n", "2.940789 | \n", "-6.424112 | \n", "0.419776 | \n", "-72.424569 | \n", "5.361375 | \n", "1.806070 | \n", "-7.670847 | \n", "0 | \n", "
4 | \n", "-0.273366 | \n", "0.306326 | \n", "-11.352593 | \n", "1.676758 | \n", "2.928441 | \n", "-0.616824 | \n", "-16.505817 | \n", "27.532281 | \n", "1.199715 | \n", "-4.309105 | \n", "... | \n", "-13.911297 | \n", "-5.229937 | \n", "1.783928 | \n", "3.957801 | \n", "-0.096988 | \n", "-14.085435 | \n", "-0.208351 | \n", "-0.894942 | \n", "15.724742 | \n", "1 | \n", "
... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "
159995 | \n", "-0.487024 | \n", "-4.270269 | \n", "0.417395 | \n", "-1.992423 | \n", "1.757552 | \n", "-1.167819 | \n", "0.606860 | \n", "41.084463 | \n", "-1.923188 | \n", "-2.374213 | \n", "... | \n", "-9.390451 | \n", "8.096802 | \n", "-0.875131 | \n", "-1.413787 | \n", "-0.363968 | \n", "15.339392 | \n", "4.364205 | \n", "-3.831489 | \n", "28.389858 | \n", "1 | \n", "
159996 | \n", "0.825477 | \n", "4.804368 | \n", "22.161535 | \n", "11.371303 | \n", "1.715901 | \n", "6.990759 | \n", "32.221207 | \n", "-12.278038 | \n", "-3.861086 | \n", "6.715126 | \n", "... | \n", "12.803189 | \n", "0.841446 | \n", "-0.682177 | \n", "-5.047677 | \n", "-0.017898 | \n", "0.780130 | \n", "6.387266 | \n", "-1.374742 | \n", "-1.623952 | \n", "0 | \n", "
159997 | \n", "-0.802489 | \n", "5.362696 | \n", "7.243419 | \n", "-7.496074 | \n", "2.295250 | \n", "-2.756067 | \n", "10.531388 | \n", "42.515821 | \n", "1.420984 | \n", "6.788916 | \n", "... | \n", "-0.346570 | \n", "-0.144098 | \n", "0.738298 | \n", "7.241041 | \n", "0.215347 | \n", "-12.155249 | \n", "3.265263 | \n", "1.230963 | \n", "3.335471 | \n", "1 | \n", "
159998 | \n", "0.339237 | \n", "7.609895 | \n", "5.368414 | \n", "-2.825481 | \n", "4.046102 | \n", "15.322603 | \n", "7.805271 | \n", "-10.233054 | \n", "2.609986 | \n", "4.251127 | \n", "... | \n", "-0.307656 | \n", "-0.601145 | \n", "-3.443112 | \n", "0.549931 | \n", "0.206728 | \n", "5.081980 | \n", "1.701462 | \n", "-0.279619 | \n", "-1.986424 | \n", "0 | \n", "
159999 | \n", "-0.296748 | \n", "-0.412773 | \n", "-10.911407 | \n", "-5.633629 | \n", "-4.028154 | \n", "15.939428 | \n", "-15.864365 | \n", "-46.388192 | \n", "18.339472 | \n", "-4.575499 | \n", "... | \n", "27.837473 | \n", "1.392395 | \n", "0.893555 | \n", "-1.848590 | \n", "-0.423982 | \n", "-17.379380 | \n", "5.916490 | \n", "-2.767444 | \n", "15.547557 | \n", "1 | \n", "
160000 rows × 51 columns
\n", "\n", " | x0 | \n", "x1 | \n", "x2 | \n", "x3 | \n", "x4 | \n", "x5 | \n", "x6 | \n", "x7 | \n", "x8 | \n", "x9 | \n", "... | \n", "x41 | \n", "x42 | \n", "x43 | \n", "x44 | \n", "x45 | \n", "x46 | \n", "x47 | \n", "x48 | \n", "x49 | \n", "y | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
count | \n", "159974.000000 | \n", "159975.000000 | \n", "159962.000000 | \n", "159963.000000 | \n", "159974.000000 | \n", "159963.000000 | \n", "159974.000000 | \n", "159973.000000 | \n", "159979.000000 | \n", "159970.000000 | \n", "... | \n", "159960.000000 | \n", "159974.000000 | \n", "159963.000000 | \n", "159960.000000 | \n", "159971.000000 | \n", "159969.000000 | \n", "159963.000000 | \n", "159968.000000 | \n", "159968.000000 | \n", "160000.000000 | \n", "
mean | \n", "-0.001028 | \n", "0.001358 | \n", "-1.150145 | \n", "-0.024637 | \n", "-0.000549 | \n", "0.013582 | \n", "-1.670670 | \n", "-7.692795 | \n", "-0.030540 | \n", "0.005462 | \n", "... | \n", "6.701076 | \n", "-1.833820 | \n", "-0.002091 | \n", "-0.006250 | \n", "0.000885 | \n", "-12.755395 | \n", "0.028622 | \n", "-0.000224 | \n", "-0.674224 | \n", "0.401231 | \n", "
std | \n", "0.371137 | \n", "6.340632 | \n", "13.273480 | \n", "8.065032 | \n", "6.382293 | \n", "7.670076 | \n", "19.298665 | \n", "30.542264 | \n", "8.901185 | \n", "6.355040 | \n", "... | \n", "18.680196 | \n", "5.110705 | \n", "1.534952 | \n", "4.164595 | \n", "0.396621 | \n", "36.608641 | \n", "4.788157 | \n", "1.935501 | \n", "15.036738 | \n", "0.490149 | \n", "
min | \n", "-1.592635 | \n", "-26.278302 | \n", "-59.394048 | \n", "-35.476594 | \n", "-28.467536 | \n", "-33.822988 | \n", "-86.354483 | \n", "-181.506976 | \n", "-37.691045 | \n", "-27.980659 | \n", "... | \n", "-82.167224 | \n", "-27.933750 | \n", "-6.876234 | \n", "-17.983487 | \n", "-1.753221 | \n", "-201.826828 | \n", "-21.086333 | \n", "-8.490155 | \n", "-65.791191 | \n", "0.000000 | \n", "
25% | \n", "-0.251641 | \n", "-4.260973 | \n", "-10.166536 | \n", "-5.454438 | \n", "-4.313118 | \n", "-5.148130 | \n", "-14.780146 | \n", "-27.324771 | \n", "-6.031058 | \n", "-4.260619 | \n", "... | \n", "-5.804080 | \n", "-5.162869 | \n", "-1.039677 | \n", "-2.812055 | \n", "-0.266518 | \n", "-36.428329 | \n", "-3.216016 | \n", "-1.320800 | \n", "-10.931753 | \n", "0.000000 | \n", "
50% | \n", "-0.002047 | \n", "0.004813 | \n", "-1.340932 | \n", "-0.031408 | \n", "0.000857 | \n", "0.014118 | \n", "-1.948594 | \n", "-6.956789 | \n", "-0.016840 | \n", "0.006045 | \n", "... | \n", "6.840110 | \n", "-1.923754 | \n", "-0.004385 | \n", "-0.010484 | \n", "0.001645 | \n", "-12.982497 | \n", "0.035865 | \n", "-0.011993 | \n", "-0.574410 | \n", "0.000000 | \n", "
75% | \n", "0.248532 | \n", "4.284220 | \n", "7.871676 | \n", "5.445179 | \n", "4.306660 | \n", "5.190749 | \n", "11.446931 | \n", "12.217071 | \n", "5.972349 | \n", "4.305734 | \n", "... | \n", "19.266367 | \n", "1.453507 | \n", "1.033275 | \n", "2.783274 | \n", "0.269049 | \n", "11.445443 | \n", "3.268028 | \n", "1.317703 | \n", "9.651072 | \n", "1.000000 | \n", "
max | \n", "1.600849 | \n", "27.988178 | \n", "63.545653 | \n", "38.906025 | \n", "26.247812 | \n", "35.550110 | \n", "92.390605 | \n", "149.150634 | \n", "39.049831 | \n", "27.377842 | \n", "... | \n", "100.050432 | \n", "22.668041 | \n", "6.680922 | \n", "19.069759 | \n", "1.669205 | \n", "150.859415 | \n", "20.836854 | \n", "8.226552 | \n", "66.877604 | \n", "1.000000 | \n", "
8 rows × 46 columns
\n", "\n", " | x24 | \n", "x29 | \n", "x30 | \n", "x32 | \n", "x37 | \n", "
---|---|---|---|---|---|
0 | \n", "euorpe | \n", "July | \n", "tuesday | \n", "0.0% | \n", "$1313.96 | \n", "
1 | \n", "asia | \n", "Aug | \n", "wednesday | \n", "-0.02% | \n", "$1962.78 | \n", "
2 | \n", "asia | \n", "July | \n", "wednesday | \n", "-0.01% | \n", "$430.47 | \n", "
3 | \n", "asia | \n", "July | \n", "wednesday | \n", "0.01% | \n", "$-2366.29 | \n", "
4 | \n", "asia | \n", "July | \n", "tuesday | \n", "0.01% | \n", "$-620.66 | \n", "
... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "
159995 | \n", "asia | \n", "Aug | \n", "wednesday | \n", "0.0% | \n", "$-891.96 | \n", "
159996 | \n", "asia | \n", "May | \n", "wednesday | \n", "-0.01% | \n", "$1588.65 | \n", "
159997 | \n", "asia | \n", "Jun | \n", "wednesday | \n", "-0.0% | \n", "$687.46 | \n", "
159998 | \n", "asia | \n", "May | \n", "wednesday | \n", "-0.02% | \n", "$439.21 | \n", "
159999 | \n", "asia | \n", "Aug | \n", "tuesday | \n", "0.02% | \n", "$-1229.34 | \n", "
160000 rows × 5 columns
\n", "\n", " | x24 | \n", "x29 | \n", "x30 | \n", "x32 | \n", "x37 | \n", "
---|---|---|---|---|---|
0 | \n", "euorpe | \n", "July | \n", "tuesday | \n", "0.00 | \n", "1313.96 | \n", "
1 | \n", "asia | \n", "Aug | \n", "wednesday | \n", "-0.02 | \n", "1962.78 | \n", "
2 | \n", "asia | \n", "July | \n", "wednesday | \n", "-0.01 | \n", "430.47 | \n", "
3 | \n", "asia | \n", "July | \n", "wednesday | \n", "0.01 | \n", "-2366.29 | \n", "
4 | \n", "asia | \n", "July | \n", "tuesday | \n", "0.01 | \n", "-620.66 | \n", "
... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "... | \n", "
159995 | \n", "asia | \n", "Aug | \n", "wednesday | \n", "0.00 | \n", "-891.96 | \n", "
159996 | \n", "asia | \n", "May | \n", "wednesday | \n", "-0.01 | \n", "1588.65 | \n", "
159997 | \n", "asia | \n", "Jun | \n", "wednesday | \n", "-0.00 | \n", "687.46 | \n", "
159998 | \n", "asia | \n", "May | \n", "wednesday | \n", "-0.02 | \n", "439.21 | \n", "
159999 | \n", "asia | \n", "Aug | \n", "tuesday | \n", "0.02 | \n", "-1229.34 | \n", "
160000 rows × 5 columns
\n", "\n", " | %Missing | \n", "
---|---|
x0 | \n", "0.016250 | \n", "
x1 | \n", "0.015625 | \n", "
x2 | \n", "0.023750 | \n", "
x3 | \n", "0.023125 | \n", "
x4 | \n", "0.016250 | \n", "
x5 | \n", "0.023125 | \n", "
x6 | \n", "0.016250 | \n", "
x7 | \n", "0.016875 | \n", "
x8 | \n", "0.013125 | \n", "
x9 | \n", "0.018750 | \n", "
x10 | \n", "0.026875 | \n", "
x11 | \n", "0.018750 | \n", "
x12 | \n", "0.022500 | \n", "
x13 | \n", "0.019375 | \n", "
x14 | \n", "0.021250 | \n", "
x15 | \n", "0.021875 | \n", "
x16 | \n", "0.016250 | \n", "
x17 | \n", "0.016875 | \n", "
x18 | \n", "0.025000 | \n", "
x19 | \n", "0.021875 | \n", "
x20 | \n", "0.023750 | \n", "
x21 | \n", "0.018125 | \n", "
x22 | \n", "0.016875 | \n", "
x23 | \n", "0.029375 | \n", "
x24 | \n", "0.017500 | \n", "
x25 | \n", "0.013750 | \n", "
x26 | \n", "0.022500 | \n", "
x27 | \n", "0.018750 | \n", "
x28 | \n", "0.021875 | \n", "
x29 | \n", "0.018750 | \n", "
x30 | \n", "0.018750 | \n", "
x31 | \n", "0.024375 | \n", "
x32 | \n", "0.019375 | \n", "
x33 | \n", "0.025625 | \n", "
x34 | \n", "0.025625 | \n", "
x35 | \n", "0.018750 | \n", "
x36 | \n", "0.016875 | \n", "
x37 | \n", "0.014375 | \n", "
x38 | \n", "0.019375 | \n", "
x39 | \n", "0.014375 | \n", "
x40 | \n", "0.022500 | \n", "
x41 | \n", "0.025000 | \n", "
x42 | \n", "0.016250 | \n", "
x43 | \n", "0.023125 | \n", "
x44 | \n", "0.025000 | \n", "
x45 | \n", "0.018125 | \n", "
x46 | \n", "0.019375 | \n", "
x47 | \n", "0.023125 | \n", "
x48 | \n", "0.020000 | \n", "
x49 | \n", "0.020000 | \n", "
Class | \n", "Distribution | \n", "
---|---|
1 | \n", "40% | \n", "
0 | \n", "60% | \n", "
\n", " | mean_fit_time | \n", "std_fit_time | \n", "mean_score_time | \n", "std_score_time | \n", "param_C | \n", "param_max_iter | \n", "param_penalty | \n", "param_random_state | \n", "param_solver | \n", "params | \n", "... | \n", "split23_test_precision | \n", "split24_test_precision | \n", "split25_test_precision | \n", "split26_test_precision | \n", "split27_test_precision | \n", "split28_test_precision | \n", "split29_test_precision | \n", "mean_test_precision | \n", "std_test_precision | \n", "rank_test_precision | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | \n", "13.432851 | \n", "1.474184 | \n", "0.040597 | \n", "0.005634 | \n", "0.1 | \n", "500 | \n", "l2 | \n", "1999 | \n", "sag | \n", "{'C': 0.1, 'max_iter': 500, 'penalty': 'l2', '... | \n", "... | \n", "0.668673 | \n", "0.670941 | \n", "0.657928 | \n", "0.670738 | \n", "0.659365 | \n", "0.656282 | \n", "0.665728 | \n", "0.665584 | \n", "0.004726 | \n", "8 | \n", "
1 | \n", "0.744260 | \n", "0.051803 | \n", "0.042062 | \n", "0.006093 | \n", "0.1 | \n", "500 | \n", "l2 | \n", "1999 | \n", "lbfgs | \n", "{'C': 0.1, 'max_iter': 500, 'penalty': 'l2', '... | \n", "... | \n", "0.668673 | \n", "0.670857 | \n", "0.658100 | \n", "0.670738 | \n", "0.659365 | \n", "0.656282 | \n", "0.665899 | \n", "0.665599 | \n", "0.004716 | \n", "7 | \n", "
2 | \n", "10.501819 | \n", "1.227208 | \n", "0.043512 | \n", "0.009269 | \n", "0.01 | \n", "500 | \n", "l2 | \n", "1999 | \n", "sag | \n", "{'C': 0.01, 'max_iter': 500, 'penalty': 'l2', ... | \n", "... | \n", "0.669011 | \n", "0.671036 | \n", "0.657742 | \n", "0.670484 | \n", "0.660010 | \n", "0.656353 | \n", "0.665642 | \n", "0.665791 | \n", "0.004806 | \n", "5 | \n", "
3 | \n", "0.789555 | \n", "0.114426 | \n", "0.044956 | \n", "0.009950 | \n", "0.01 | \n", "500 | \n", "l2 | \n", "1999 | \n", "lbfgs | \n", "{'C': 0.01, 'max_iter': 500, 'penalty': 'l2', ... | \n", "... | \n", "0.669011 | \n", "0.670951 | \n", "0.657742 | \n", "0.670398 | \n", "0.659926 | \n", "0.656353 | \n", "0.665812 | \n", "0.665777 | \n", "0.004801 | \n", "6 | \n", "
4 | \n", "4.774041 | \n", "1.120333 | \n", "0.041310 | \n", "0.010212 | \n", "0.001 | \n", "500 | \n", "l2 | \n", "1999 | \n", "sag | \n", "{'C': 0.001, 'max_iter': 500, 'penalty': 'l2',... | \n", "... | \n", "0.671425 | \n", "0.672234 | \n", "0.659706 | \n", "0.673286 | \n", "0.663155 | \n", "0.659015 | \n", "0.668317 | \n", "0.667550 | \n", "0.005075 | \n", "4 | \n", "
5 rows × 175 columns
\n", "\n", " | mean_fit_time | \n", "std_fit_time | \n", "mean_score_time | \n", "std_score_time | \n", "param_n_neighbors | \n", "params | \n", "split0_test_roc_auc | \n", "split1_test_roc_auc | \n", "split2_test_roc_auc | \n", "split3_test_roc_auc | \n", "... | \n", "split23_test_precision | \n", "split24_test_precision | \n", "split25_test_precision | \n", "split26_test_precision | \n", "split27_test_precision | \n", "split28_test_precision | \n", "split29_test_precision | \n", "mean_test_precision | \n", "std_test_precision | \n", "rank_test_precision | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | \n", "0.305139 | \n", "0.084233 | \n", "296.612833 | \n", "13.084617 | \n", "10 | \n", "{'n_neighbors': 10} | \n", "0.872573 | \n", "0.873158 | \n", "0.870447 | \n", "0.865345 | \n", "... | \n", "0.834268 | \n", "0.828769 | \n", "0.821903 | \n", "0.852197 | \n", "0.834430 | \n", "0.834135 | \n", "0.844618 | \n", "0.835537 | \n", "0.005922 | \n", "2 | \n", "
1 | \n", "0.310254 | \n", "0.188162 | \n", "286.185080 | \n", "55.575974 | \n", "12 | \n", "{'n_neighbors': 12} | \n", "0.879290 | \n", "0.879120 | \n", "0.878307 | \n", "0.870748 | \n", "... | \n", "0.849788 | \n", "0.837100 | \n", "0.823304 | \n", "0.854698 | \n", "0.837368 | \n", "0.842690 | \n", "0.851820 | \n", "0.841920 | \n", "0.006669 | \n", "1 | \n", "
2 rows × 171 columns
\n", "\n", " | mean_fit_time | \n", "std_fit_time | \n", "mean_score_time | \n", "std_score_time | \n", "param_max_features | \n", "param_n_estimators | \n", "param_n_jobs | \n", "param_random_state | \n", "params | \n", "split0_test_roc_auc | \n", "... | \n", "split23_test_precision | \n", "split24_test_precision | \n", "split25_test_precision | \n", "split26_test_precision | \n", "split27_test_precision | \n", "split28_test_precision | \n", "split29_test_precision | \n", "mean_test_precision | \n", "std_test_precision | \n", "rank_test_precision | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | \n", "4.678497 | \n", "0.269283 | \n", "0.247049 | \n", "0.010967 | \n", "1 | \n", "10 | \n", "-1 | \n", "1999 | \n", "{'max_features': 1, 'n_estimators': 10, 'n_job... | \n", "0.809679 | \n", "... | \n", "0.758632 | \n", "0.759348 | \n", "0.749256 | \n", "0.769534 | \n", "0.760996 | \n", "0.754995 | \n", "0.776630 | \n", "0.764207 | \n", "0.012129 | \n", "20 | \n", "
1 | \n", "9.528533 | \n", "0.469264 | \n", "0.259123 | \n", "0.021618 | \n", "1 | \n", "20 | \n", "-1 | \n", "1999 | \n", "{'max_features': 1, 'n_estimators': 20, 'n_job... | \n", "0.850978 | \n", "... | \n", "0.818266 | \n", "0.807516 | \n", "0.801144 | \n", "0.830425 | \n", "0.809205 | \n", "0.808978 | \n", "0.833385 | \n", "0.817380 | \n", "0.009593 | \n", "19 | \n", "
2 | \n", "14.978017 | \n", "0.658647 | \n", "0.282651 | \n", "0.034041 | \n", "1 | \n", "30 | \n", "-1 | \n", "1999 | \n", "{'max_features': 1, 'n_estimators': 30, 'n_job... | \n", "0.876086 | \n", "... | \n", "0.838897 | \n", "0.834249 | \n", "0.840244 | \n", "0.852968 | \n", "0.843693 | \n", "0.843120 | \n", "0.854378 | \n", "0.847270 | \n", "0.008565 | \n", "17 | \n", "
3 | \n", "19.508908 | \n", "0.737629 | \n", "0.367102 | \n", "0.081046 | \n", "1 | \n", "40 | \n", "-1 | \n", "1999 | \n", "{'max_features': 1, 'n_estimators': 40, 'n_job... | \n", "0.886774 | \n", "... | \n", "0.857915 | \n", "0.857403 | \n", "0.857315 | \n", "0.866707 | \n", "0.853622 | \n", "0.848503 | \n", "0.875188 | \n", "0.862306 | \n", "0.007948 | \n", "16 | \n", "
4 | \n", "7.073599 | \n", "0.840057 | \n", "0.252776 | \n", "0.011082 | \n", "2 | \n", "10 | \n", "-1 | \n", "1999 | \n", "{'max_features': 2, 'n_estimators': 10, 'n_job... | \n", "0.866640 | \n", "... | \n", "0.825772 | \n", "0.827779 | \n", "0.834115 | \n", "0.842350 | \n", "0.829114 | \n", "0.843328 | \n", "0.841470 | \n", "0.835455 | \n", "0.006917 | \n", "18 | \n", "
5 rows × 174 columns
\n", "\n", " Table 7.4.1 - Performance Vs Dollar Cost \n", "
Model | \n", "Accuracy | \n", "Precision | \n", "Recall | \n", "F1 Score | \n", "Dollar Cost | \n", "
---|---|---|---|---|---|
Logistic Regression | \n", "0.70 | \n", "0.67 | \n", "0.51 | \n", "0.58 | \n", "3,062,500 | \n", "
KNN | \n", "0.79 | \n", "0.85 | \n", "0.60 | \n", "0.70 | \n", "2,531,000 | \n", "
Random Forest | \n", "0.90 | \n", "0.92 | \n", "0.83 | \n", "0.87 | \n", "1,049,500 | \n", "
\n", " Table 7.4.2 - Threshold selection \n", "
Model | \n", "Classification threshold | \n", "FNR | \n", "Accuracy | \n", "Dollar Cost | \n", "
---|---|---|---|---|
Logistic Regression | \n", "0.006807 | \n", "0.0 | \n", "0.4012 | \n", "189,690 | \n", "
KNN | \n", "0.083333 | \n", "0.0020 | \n", "0.4641 | \n", "182,010 | \n", "
Random Forest | \n", "0.175 | \n", "0.0076 | \n", "0.6862 | \n", "146,450 | \n", "
\n", "\n" ] }, { "cell_type": "markdown", "id": "conservative-aberdeen", "metadata": {}, "source": [ "**Observations**\n", "\n", "For each model, the \\\\$ cost drops sharply when a much lower classification threshold is used; compare Table 7.4.2 with the default-threshold results in Table 7.4.1.\n", "\n", "A very low threshold converts true negatives (TN) into false positives (FP) and false negatives (FN) into true positives (TP). Since an FP costs 50x less than an FN, this trade lowers the actual \\\\$ cost to the company.\n", "This also implies that one could skip building a sophisticated model, classify every new record as TRUE, and still land close to the minimum \\\\$ cost. This happens because the \\\\$ cost of an FN is 50x the \\\\$ cost of an FP, i.e. the costs are highly skewed.\n", "\n", "Experiments with the \\\\$ cost of FN set to 500 and the \\\\$ cost of FP set to 100 (i.e. much less skew) gave a much higher accuracy and a higher threshold. In such a scenario, the time and effort spent building a model will likely provide a higher return on investment.\n", "\n", "
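The threshold selection discussed above can be sketched as a simple grid search over cost. This is a minimal illustration, not the notebook's actual code: the helper names `dollar_cost` and `best_threshold`, the synthetic labels and scores, and the per-error costs (500 per FN with 10 or 100 per FP) are all assumptions chosen only to reproduce the 50x and 5x cost ratios mentioned in the observations.

```python
import random


def dollar_cost(y_true, scores, threshold, cost_fn, cost_fp):
    # A record is predicted positive when its score reaches the threshold.
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    return fn * cost_fn + fp * cost_fp


def best_threshold(y_true, scores, cost_fn, cost_fp, steps=100):
    # Sweep an evenly spaced grid on [0, 1]; min() keeps the lowest
    # threshold among ties, so the result is deterministic.
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=lambda t: dollar_cost(y_true, scores, t, cost_fn, cost_fp))


# Hypothetical data: roughly 40% positives and noisy scores clipped to [0, 1].
rng = random.Random(1999)
y_true = [1 if rng.random() < 0.4 else 0 for _ in range(5000)]
scores = [min(1.0, max(0.0, 0.35 + 0.3 * y + rng.gauss(0, 0.2))) for y in y_true]

# Assumed costs: FN = 500 in both cases; FP = 10 (50x skew) or 100 (5x skew).
t_skewed = best_threshold(y_true, scores, cost_fn=500, cost_fp=10)
t_mild = best_threshold(y_true, scores, cost_fn=500, cost_fp=100)
print('threshold with 50x skew:', t_skewed)
print('threshold with 5x skew:', t_mild)
```

With a threshold of 0 every record is classified as positive, so the cost reduces to the number of negatives times the FP cost. That is the cheap no-model baseline the observations describe, and it becomes less attractive as the FP cost approaches the FN cost.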
" ] }, { "cell_type": "markdown", "id": "choice-kinase", "metadata": {}, "source": [ "# Model Interpretability & Explainability" ] }, { "cell_type": "markdown", "id": "acquired-wrestling", "metadata": {}, "source": [ "## Logistic Regression" ] }, { "cell_type": "code", "execution_count": 66, "id": "agricultural-looking", "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "