{ "metadata": { "name": "", "signature": "sha256:b2f9a236477bb40dfd4ba3009d61d6ce5e32fa8ddd86ed796f5116ff90e873dc" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "To reiterate, the Qi code provided with the 2006 paper does not seem to include anything for the classification step. I guess they didn't see the point, since classification can be done with plenty of different toolsets, and they probably used one they couldn't share.\n", "In any case, I want to replicate their results, so I need to get the feature vectors they used, run some Python classification algorithms on them and see whether I get results similar to the paper's.\n", "\n", "## Running Perl scripts\n", "\n", "The code is basically a couple of Perl scripts. I think I should be able to run these without much hassle, and hopefully that will give me the feature vectors.\n", "Unfortunately, if anything goes wrong it's going to take a while to work out, because I don't know Perl." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "ls" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "0yeast_gene_list/  12homology-PPI/  2tf-binding/  5essentiality/  8nature-compare-sequence/  Batch_feature_summary_ExtractWrapper.pl  README\r\n", "10mips-phenotype/  13domain-interaction/  3gene-ontology/  6HighExp-PPI/  9mips-pclass/  Investigating qi_evaluation_2006.ipynb  Replicating Qi 2006.ipynb\r\n", "11sequence-similarity/  1gene-expression/  4protein-expression/  7genetic-interaction/  Batch_feature_ExtractWrapper.pl  Investigating qi_evaluation_2006.md  train-set/\r\n" ] } ], "prompt_number": 1 }, { "cell_type": "code", "collapsed": false, "input": [ "%%bash\n", "head -n 30 Batch_feature_ExtractWrapper.pl" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "######################################################################3\r\n", "#\r\n", "# copyright @ Yanjun Qi , qyj@cs.cmu.edu\r\n", "# Please cite: \r\n", "# Y. Qi, Z. Bar-Joseph, J. 
Klein-Seetharaman, \"Evaluation of different biological data and computational classification methods for use in protein interaction prediction\", PROTEINS: Structure, Function, and Bioinformatics. 63(3):490-500. 2006\r\n", "# Y. Qi, J. Klein-Seetharaman, Z. Bar-Joseph, \"A mixture of feature experts approach for protein-protein interaction prediction\", BMC Bioinformatics 8 (S10):S6, 2007 \r\n", "# Y. Qi, J. Klein-Seetharaman, Z. Bar-Joseph, \"Random Forest Similarity for Protein-Protein Interaction Prediction from Multiple source\", Pacific Symposium on Biocomputing 10: (PSB 2005) Jan. 2005. \r\n", "# \r\n", "######################################################################3\r\n", "\r\n", "\r\n", "# This program is a yeast PPI feature extraction wrapper \r\n", "# perl command inputPairlist\r\n", "\r\n", "\r\n", "use strict; \r\n", "die \"Usage: command inputPairFile \\n\" if scalar(@ARGV) < 1;\r\n", "my ($inputPair ) = @ARGV;\r\n", "\r\n", "\r\n", "print \"\\n--------------------------- 1gene-expression ----------------------------------------\\n\"; \r\n", "\r\n", "# ------------------- 1gene-expression ------------------------------\r\n", "\r\n", "my $cmdPre = \"perl ./1gene-expression/get_gene_expression.pl \"; \r\n", "my $cmdPro = \"./1gene-expression/YeastGeneListOrfGeneName-106_pval_v9.0.txt ./1gene-expression/all_expression_fixed_s4_csv.txt ./1gene-expression/expressionYanjunSplit.txt 0.6 \"; \r\n", "\r\n", "my $cmd = $cmdPre.\" \".$inputPair.\" \".$cmdPro.\" \".$inputPair.\".genexp\" ; \r\n", "print \"$cmd\\n\"; \r\n", "system($cmd); \r\n" ] } ], "prompt_number": 5 }, { "cell_type": "markdown", "metadata": {}, "source": [ "OK, so the documentation isn't very extensive. The important part is the usage line `perl command inputPairlist`, where I guess `command` is the name of the wrapper script?\n", "\n", "Anyway, which file is the `inputPairlist`? I think it's in the `0yeast_gene_list/` directory." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "cd 0yeast_gene_list/" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "/home/gavin/Documents/MRes/YeastPPI-shared-08/0yeast_gene_list\n" ] } ], "prompt_number": 6 }, { "cell_type": "code", "collapsed": false, "input": [ "ls" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "make_fullpair_4protein.pl  YeastGeneListOrfGeneName-106_pval_v9.0.txt\r\n" ] } ], "prompt_number": 7 }, { "cell_type": "code", "collapsed": false, "input": [ "%%bash\n", "head YeastGeneListOrfGeneName-106_pval_v9.0.txt" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "YAL001C\tTFC3\r\n", "YAL002W\tVPS8\r\n", "YAL003W\tEFB1\r\n", "YAL004W\tYAL004W\r\n", "YAL005C\tSSA1\r\n", "YAL007C\tERP2\r\n", "YAL008W\tFUN14\r\n", "YAL009W\tSPO7\r\n", "YAL010C\tMDM10\r\n", "YAL011W\tFUN36\r\n" ] } ], "prompt_number": 9 }, { "cell_type": "markdown", "metadata": {}, "source": [ "I guess this maps each gene's systematic ORF name to its standard gene name?" ] }, { "cell_type": "code", "collapsed": false, "input": [ "%%bash\n", "head -n 30 make_fullpair_4protein.pl" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "######################################################################3\r\n", "#\r\n", "# copyright @ Yanjun Qi , qyj@cs.cmu.edu\r\n", "# Please cite: \r\n", "# Y. Qi, Z. Bar-Joseph, J. Klein-Seetharaman, \"Evaluation of different biological data and computational classification methods for use in protein interaction prediction\", PROTEINS: Structure, Function, and Bioinformatics. 63(3):490-500. 2006\r\n", "# Y. Qi, J. Klein-Seetharaman, Z. 
Bar-Joseph, \"A mixture of feature experts approach for protein-protein interaction prediction\", BMC Bioinformatics 8 (S10):S6, 2007 \r\n", "# Y. Qi, J. Klein-Seetharaman, Z. Bar-Joseph, \"Random Forest Similarity for Protein-Protein Interaction Prediction from Multiple source\", Pacific Symposium on Biocomputing 10: (PSB 2005) Jan. 2005. \r\n", "# \r\n", "######################################################################3\r\n", "\r\n", "\r\n", "# \r\n", "# This program is to make the pair list from the protein list \r\n", "#\r\n", "# perl make_fullpair_4protein.pl protein_list.txt pos_list output_pos output_other \r\n", "# \r\n", "# \r\n", "# $ perl make_fullpair_4protein.pl YeastGeneListOrfGeneName-106_pval_v9.0.txt Science03-pos_MIPS_complexes.txt mipsPosPair.txt mipsRandpair.txt\r\n", "#==> There are 6270 unique proteins in original list.\r\n", "#==> There should have 19653315 pairs possibly generated totally !\r\n", "# There are 8617 POS pairs originally .\r\n", "# fullpairs has: 7390 POS pairs.\r\n", "# fullpairs has: 19645925 RAND pairs.\r\n", "# ==> There are 19653315 pairs generated !\r\n", "# \r\n", "\r\n", "\r\n", "use strict; \r\n", "die \"Usage: command gene_name_file pos_pairFile outPosPairFile outRandPairFile \\n\" if scalar(@ARGV) < 4; \r\n", "\r\n" ] } ], "prompt_number": 11 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Slightly confused about this. I'm guessing that this script produces the input pair list for the main wrapper, but I don't see where to get `pos_pairFile`, `outPosPairFile` and `outRandPairFile`. Trying to find `Science03-pos_MIPS_complexes.txt`, `mipsPosPair.txt` and `mipsRandpair.txt`:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "cd .." ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "%%bash\n", "find . -name \"Science*\"\n",
"find . -name \"mips*\"" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 31 }, { "cell_type": "markdown", "metadata": {}, "source": [ "__No sign of them.__\n", "\n", "I guess I'm supposed to get them from elsewhere? That doesn't make much sense though: why tarball up all the code but not make it runnable?\n", "\n", "OK, maybe I'm just being stupid and two of those are output files? So we'd just need the list of positive pairs? Is that in here somewhere? I don't see it.\n", "\n", "Looks like I can get it online at [this page on the Qi site][qifeatureset].\n", "\n", "Downloaded the feature sets and put them in a directory called `featuresets`. Can have a look at them now:\n", "\n", "[qifeatureset]: http://www.cs.cmu.edu/~qyj/papers_sulp/proteins05_pages/feature-download.html" ] }, { "cell_type": "code", "collapsed": false, "input": [ "ls -lh featuresets/phyInteract/" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "total 188M\r\n", "-rw-r--r-- 1 gavin users 51K May 28 16:19 dipsPosPair\r\n", "-rw-r--r-- 1 gavin users 2.1M May 28 16:19 dipsPosPair.feature\r\n", "-rw-r--r-- 1 gavin users 4.2M May 28 16:19 dipsRandpairSub23w\r\n", "-rw-r--r-- 1 gavin users 181M May 28 16:19 dipsRandpairSub23w.feature\r\n", "-rw-r--r-- 1 gavin users 624 May 28 16:19 readme.txt\r\n" ] } ], "prompt_number": 1 }, { "cell_type": "code", "collapsed": false, "input": [ "%%bash\n", "head -n 1 featuresets/phyInteract/dipsPosPair.feature" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ 
"0.793388763224369,0.00420804252006518,0.425688981443356,0.050368298677632,-100,-0.431032656998849,-100,0.219336773277164,-0.260415665650041,0.197204579148454,0.561672012832894,-0.0228811714495604,0.226695830426883,-0.498251467771468,0.191892679025485,-0.0442896786511169,-0.0410836070811429,0.31939401640312,0.201702280568662,-0.206576006538866,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5.42227951866507,1,-100,0,-100,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,1\n" ] } ], "prompt_number": 16 }, { "cell_type": "code", "collapsed": false, "input": [ "%%bash\n", "head -n 1 featuresets/phyInteract/dipsRandpairSub23w.feature" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "-100,-0.451693610174967,-0.0287451930023522,0.199960245640947,-100,-0.132785992866974,-100,0.0113596052584383,-0.0184840432669403,-0.332990558018139,0.231135300459446,0.568965723612462,-0.00592435004576543,0.212805215910624,-0.934391682945072,-0.0356978663758904,0.635341056917029,0.507997427846247,0.449270105867988,0.376562615235622,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3.47217114669236,0,-100,0,-100,-100,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,-100,0\n" ] } ], "prompt_number": 11 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Opening up these files:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import csv,glob\n", "nipfiles = 
sorted(glob.glob(\"featuresets/phyInteract/dips*.feature\"))\n", "#sorted so that the positive file is always nipfiles[0] (glob order is not guaranteed)\n", "print(nipfiles)\n", "#TODO: write a generator function for each file so rows can be read lazily" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "['featuresets/phyInteract/dipsPosPair.feature', 'featuresets/phyInteract/dipsRandpairSub23w.feature']\n" ] } ], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "One option would be to write a generator function that opens these files and yields rows only as they are required, which would save RAM. If I get round to writing this I'll put it here:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "#placeholder: generator for reading the feature files not written yet" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another option would be to use `pandas`, as it's pretty much designed to solve this problem." ] }, { "cell_type": "code", "collapsed": false, "input": [ "import pandas as pd\n", "#test importing one of these csvs (no header row in the files)\n", "pd.read_csv(nipfiles[0],header=None).head()" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
            0           1           2           3           4           5           6           7           8           9          10          11          12          13          14          15          16          17          18          19 ...
0    0.793389    0.004208    0.425689    0.050368 -100.000000   -0.431033 -100.000000    0.219337   -0.260416    0.197205    0.561672   -0.022881    0.226696   -0.498251    0.191893   -0.044290   -0.041084    0.319394    0.201702   -0.206576 ...
1   -0.060023   -0.210066   -0.091081    0.057263    0.615934    0.000000   -0.270493    0.459723    0.283593   -0.820271    0.768494    0.476355    0.100546   -0.458742    0.179573   -0.001993    0.670944    0.523979    0.031387    0.144759 ...
2   -0.139235    0.233024   -0.246856   -0.197301   -0.735368   -0.848902   -0.242361   -0.274482   -0.260667    0.968193    0.512238    0.116022    0.087863   -0.000231   -0.999968    0.018011   -0.331740    0.006124   -0.144496   -0.287701 ...
3 -100.000000    0.516909    0.489173    0.382931 -100.000000    0.365115    0.055116    0.382403   -0.396365    0.106891    0.658290    0.533645    0.096992    0.496594    0.976505   -0.014651   -0.257374    0.015310    0.194071   -0.033708 ...
4    0.654579    0.657707    0.114509    0.293567    0.096530    0.000000   -0.557642   -0.108332    0.079067    0.801406    0.234897    0.414738    0.370008   -0.273466   -0.949009    0.043061    0.343765    0.329410    0.320448   -0.165973 ...
\n", "

5 rows \u00d7 163 columns

\n", "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 2, "text": [ " 0 1 2 3 4 5 6 \\\n", "0 0.793389 0.004208 0.425689 0.050368 -100.000000 -0.431033 -100.000000 \n", "1 -0.060023 -0.210066 -0.091081 0.057263 0.615934 0.000000 -0.270493 \n", "2 -0.139235 0.233024 -0.246856 -0.197301 -0.735368 -0.848902 -0.242361 \n", "3 -100.000000 0.516909 0.489173 0.382931 -100.000000 0.365115 0.055116 \n", "4 0.654579 0.657707 0.114509 0.293567 0.096530 0.000000 -0.557642 \n", "\n", " 7 8 9 10 11 12 13 \\\n", "0 0.219337 -0.260416 0.197205 0.561672 -0.022881 0.226696 -0.498251 \n", "1 0.459723 0.283593 -0.820271 0.768494 0.476355 0.100546 -0.458742 \n", "2 -0.274482 -0.260667 0.968193 0.512238 0.116022 0.087863 -0.000231 \n", "3 0.382403 -0.396365 0.106891 0.658290 0.533645 0.096992 0.496594 \n", "4 -0.108332 0.079067 0.801406 0.234897 0.414738 0.370008 -0.273466 \n", "\n", " 14 15 16 17 18 19 \n", "0 0.191893 -0.044290 -0.041084 0.319394 0.201702 -0.206576 ... \n", "1 0.179573 -0.001993 0.670944 0.523979 0.031387 0.144759 ... \n", "2 -0.999968 0.018011 -0.331740 0.006124 -0.144496 -0.287701 ... \n", "3 0.976505 -0.014651 -0.257374 0.015310 0.194071 -0.033708 ... \n", "4 -0.949009 0.043061 0.343765 0.329410 0.320448 -0.165973 ... \n", "\n", "[5 rows x 163 columns]" ] } ], "prompt_number": 2 }, { "cell_type": "code", "collapsed": false, "input": [ "#initialise dictionary\n", "nipd = {}\n", "#load into dictionary\n", "for fname in nipfiles:\n", " nipd[fname] = pd.read_csv(fname,header=None)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 3 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now just need to see if we can extract the array that we need to train the logistic regression model. Looks like I should use the `iloc` indexing method." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "#concatenate the first 10 positive and first 10 negative rows; columns 0-161 are the features\n", "X = pd.concat((nipd[nipfiles[0]].iloc[:10,0:162],nipd[nipfiles[1]].iloc[:10,0:162]))\n", "#get the class label vector y the same way; column 162 is the label\n", "y = pd.concat((nipd[nipfiles[0]].iloc[:10,162],nipd[nipfiles[1]].iloc[:10,162]))" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 5 }, { "cell_type": "code", "collapsed": false, "input": [ "#checking that our arrays will be the right size\n", "print(\"length of X array is %i\" % len(X.values))\n", "print(\"length of y array is %i\" % len(y.values))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "length of X array is 20\n", "length of y array is 20\n" ] } ], "prompt_number": 6 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The classification algorithms I need to test to replicate Qi's results are:\n", "\n", "1. Random Forest\n", "    1. [Scikit-learn solution](http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html)\n", "2. RF similarity-based k-Nearest-Neighbour\n", "    1. __Ignoring this for now__\n", "3. Naive Bayes\n", "    1. [Scikit-learn solution](http://scikit-learn.org/stable/modules/naive_bayes.html) - but with problems\n", "4. Decision Tree\n", "    1. [Scikit-learn solution](http://scikit-learn.org/stable/modules/tree.html)\n", "5. Logistic Regression\n", "    1. [Theano solution](http://deeplearning.net/tutorial/logreg.html)\n", "    2. [Scikit-learn solution](http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html)\n", "6. Support Vector Machine\n", "    1. 
[Scikit-learn solution](http://scikit-learn.org/stable/modules/svm.html)\n", " \n", "Starting with good old Naive Bayes as I pretty well understand how it works.\n", "\n", "### Naive Bayes\n", "\n", "Simple classifier assuming that the features are independent given the class label, which is apparently a pretty good assumption most of the time. Naive Bayes is usually studied with binary features but those used in this example are clearly _not all binary_. So, how do we get Scikit-learn to deal with that?\n", "\n", "Specifically, the features are encoded as described [here][qifeaturepage].\n", "\n", "Looks like the ways round are both pretty much just [hacks][stackscikit] so it's probably better to use something else. If I have to write my own then [this page][nb50l] might come in useful. Don't particularly want to though.\n", "\n", "[qifeaturepage]: http://www.cs.cmu.edu/~qyj/papers_sulp/proteins05_pages/feature-download.html\n", "[stackscikit]: http://stackoverflow.com/questions/14254203/mixing-categorial-and-continuous-data-in-naive-bayes-classifier-using-scikit-lea\n", "[nb50l]: http://ebiquity.umbc.edu/blogger/2010/12/07/naive-bayes-classifier-in-50-lines/\n", "\n", "### Logistic Regression\n", "\n", "This one's pretty simple as well, so shouldn't be too difficult to run. On the logistic regression page it appears that this can be used with defaults. As far as I understand logistic regression it doesn't make any assumptions about the distributions of the features." ] }, { "cell_type": "code", "collapsed": false, "input": [ "from sklearn import linear_model as sklm\n", "#initialise a logistic regression model\n", "logmodel = sklm.LogisticRegression()" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 7 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now the question is what the `fit` command expects of the data. 
What format should it be in?\n", "\n", "```\n", "X : {array-like, sparse matrix}, shape = [n_samples, n_features]\n", "\n", " Training vector, where n_samples is the number of samples and n_features is the number of features.\n", "\n", "y : array-like, shape = [n_samples]\n", "\n", " Target vector relative to X\n", "```\n", "\n", "Now trying to fit the model. It appears the following cell will take **a very long time to run** with the full number of samples (240000), so for now it is fit on the 20-row subset built above. Unclear what I can do about this." ] }, { "cell_type": "code", "collapsed": false, "input": [ "logmodel.fit(X.values,y.values)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 8, "text": [ "LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,\n", " intercept_scaling=1, penalty='l2', random_state=None, tol=0.0001)" ] } ], "prompt_number": 8 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now all we need is a test set." ] }, { "cell_type": "code", "collapsed": false, "input": [ "#concatenate the next 10 positive and next 10 negative rows to form a small test set\n", "testX = pd.concat((nipd[nipfiles[0]].iloc[10:20,0:162],nipd[nipfiles[1]].iloc[10:20,0:162]))\n", "#get the class label vector y the same way\n", "testy = pd.concat((nipd[nipfiles[0]].iloc[10:20,162],nipd[nipfiles[1]].iloc[10:20,162]))" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 9 }, { "cell_type": "code", "collapsed": false, "input": [ "estimates = logmodel.predict_proba(testX.values)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 14 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then compare these estimates to the true labels. Is there an automated way to do this? 
Of course there is: [Scikit-learn's ROC curve][sciroc].\n", "\n", "[sciroc]: http://scikit-learn.org/0.11/auto_examples/plot_roc.html" ] }, { "cell_type": "code", "collapsed": false, "input": [ "from sklearn.metrics import roc_curve,auc\n", "#column 1 of predict_proba holds the estimated probability of the positive class\n", "fpr, tpr, thresholds = roc_curve(testy.values,estimates[:,1])\n", "roc_auc = auc(fpr,tpr)\n", "print(\"Area under ROC curve: %.2f\" % roc_auc)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Area under ROC curve: 0.52\n" ] } ], "prompt_number": 21 }, { "cell_type": "code", "collapsed": false, "input": [ "#explicit imports so that this cell does not depend on pylab mode\n", "from matplotlib.pyplot import clf, plot, xlim, ylim, xlabel, ylabel, title, show\n", "#define a function for quickly plotting ROC curves with nice annotations\n", "def plotroc(fpr,tpr):\n", "    clf()\n", "    plot(fpr, tpr)\n", "    #dashed diagonal: the ROC curve of a random classifier\n", "    plot([0, 1], [0, 1], 'k--')\n", "    xlim([0.0, 1.0])\n", "    ylim([0.0, 1.0])\n", "    xlabel('False Positive Rate')\n", "    ylabel('True Positive Rate')\n", "    title('Receiver operating characteristic')\n", "    show()\n", "    return None\n", "plotroc(fpr,tpr)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "display_data", "png": 
"iVBORw0KGgoAAAANSUhEUgAAAYYAAAEZCAYAAACTsIJzAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3XdYFNf6B/AvCNYgoahIC0aQIj2ILeRi/AlEDTeSGNHE\niAVLrqZY0tQr8apRgyY3QXM1UUlsFzEFK8ZGNDaUIipBhVgAJSqiIAoLy/n94WVkBWQpW1i+n+fZ\n52HY2Zl3DjAvZ847Z/SEEAJERET/o6/pAIiISLswMRARkQImBiIiUsDEQERECpgYiIhIARMDEREp\nYGIgpbm6uuLQoUOaDkPjpkyZggULFqh1n2FhYZg7d65a96kqGzduRGBgYIM+y99B9dDjfQzNk52d\nHW7cuIFWrVqhQ4cOGDRoEFasWIGOHTtqOjSdEh0djTVr1uDw4cMajWPs2LGwsbHB/PnzNRpHREQE\nsrKysH79epXvKywsDDY2NvjXv/6l8n2RIvYYmik9PT3s2LEDRUVFOH36NM6cOaP2/2KbQnl5eYvc\ntybJ5fIWuW9SHhODDujSpQsCAgJw7tw56XvHjx9Hv379YGJiAk9PT/z222/Se7dv38bYsWNhZWUF\nU1NTDBs2THpvx44d8PT0hImJCfr3748zZ85I79nZ2eHAgQO4du0a2rdvj4KCAum9lJQUdOrUSfrD\nX7t2LVxcXGBqaoqgoCBcvXpVWldfXx8rV66Eg4MDHB0dazymbdu2oWfPnjAxMcGAAQOQkZGhEMfi\nxYvRs2dPmJqaYty4cSgtLVX6GJYuXQp3d3cYGRlBLpdj8eLFsLe3R8eOHdGzZ0/88ssvAIA//vgD\nU6ZMwbFjx2BkZARTU1MAipd1EhISYG1tjeXLl6NLly6wtLREdHS0tL/8/Hy8/PLLMDY2hq+vL+bM\nmQM/P79af5a///679HOztbXFDz/8oPBzGzp0KDp27Ig+ffrgzz//lN579913YWtrC2NjY/j4+OD3\n33+X3ouIiMBrr72G0aNHw9jYGN9//z1OnjyJvn37wsTEBJaWlpg2bRrKysqkz5w7dw6DBg2CmZkZ\nLCws8Nlnn2HPnj347LPPEBMTAyMjI3h5eQEA7t69i/Hjx8PS0hLW1taYO3cuKioqADzscfXv3x/T\np0+Hubk5IiIiEB0dLbWBEALvv/8+unTpAmNjY7i7u+PcuXNYvXo1Nm3ahKVLl8LIyAh///vfpZ/f\n/v37ATxMMosWLZJ+dj4+PsjJyam1bakeBDVLdnZ2Yt++fUIIIbKzs4Wbm5v49NNPhRBC5OTkCDMz\nM7F7924hhBB79+4VZmZm4tatW0IIIQYPHixCQ0PFnTt3RFlZmTh06JAQQojk5GTRuXNnkZiYKCoq\nKsT3338v7OzshEwmk/a5f/9+IYQQL774ovj222+leGbOnCmmTJkihBDil19+Efb29iIjI0PI5XKx\nYMEC0a9fP2ldPT09ERAQIAoKCkRJSUm1Yzt//rzo0KGD2LdvnygvLxdLly4V9vb2oqysTAghxDPP\nPCPc3NxETk6OuH37tujfv7+YM2eOUsfwzDPPCC8vL5GTkyPtOzY2Vly/fl0IIURMTIzo0KGDyMvL\nE0IIER0dLZ5//nmF+MLCwsTcuXOFEEIcPHhQGBgYiHnz5ony8nKxa9cu0b59e3Hnzh0hhBAjRowQ\nI0eOFA8ePBDp6enCxsZG+Pn51fgzvXz5sjAyMhL//e9/RXl5ucjPzxepqalCCCHGjBkjzMzMxMmT\nJ0V5ebl44403RGhoqPTZDRs2iNu3bwu5XC6WLVsmLCwsRGlpqRBCiHnz5glDQ0MRFxcnhBDiwYMH\nIikpSZw4cULI5XJx+fJl4ezsLL788kshhBCFhYXCwsJCLF++XJSWloqioiJx4sQJIYQQERERYvTo\n0Qpxv/LKK2Ly5Mni/v374saNG8LX11esWrVKCCHEunXrhIGBg
YiKihJyuVw8ePBArFu3TmrT+Ph4\n8dxzz4m7d+8KIYTIyMiQfhZV27lS1d/BpUuXCjc3N3HhwgUhhBBpaWkiPz+/xral+mFiaKaeeeYZ\n8dRTTwkjIyOhp6cnXnnlFSGXy4UQQixevLjaH29gYKD4/vvvxbVr14S+vr504qpq8uTJ1f4QHR0d\npcRR9Y/yu+++Ey+++KIQQoiKigphY2MjDh8+LIQQIigoSKxZs0bahlwuF+3btxdXr14VQjxMDAcP\nHqz12ObPny9GjBghLVdUVAgrKyvx22+/SXFUnniEEGLXrl2ie/fuSh/DunXrat23EEJ4enpKJ9Gq\nJ7FKYWFhUiI6ePCgaNeundT2QgjRuXNnceLECVFeXi4MDQ2lE5cQQsyZM6fa9iotWrRIhISE1Phe\nWFiYCA8PVzhmJyenWo/BxMREpKWlCSEeJoa//e1vTzhiIb744gsxbNgwIYQQmzZtEt7e3jWuN2/e\nPPHmm29Ky3l5eaJNmzbiwYMH0vc2bdokBgwYIIR42H62trYK26japvv37xc9evQQx48fV2jDymOu\nbOdKVX8He/ToIbZt2/bE46KG4aWkZkpPTw9xcXEoLCxEQkICDhw4gFOnTgEArly5gtjYWJiYmEiv\nI0eOIC8vD9nZ2TA1NYWxsXG1bV65cgXLli1T+FxOTg6uXbtWbd2QkBAcO3YMeXl5OHToEPT19fH8\n889L23n33XelbZiZmQEAcnNzpc/b2NjUemzXr1+Hra2twrHa2NjU+nlbW1spRmWO4fF9//DDD/Dy\n8pLWP3v2LPLz82uN73FmZmbQ13/0p9S+fXvcu3cPN2/eRHl5ucL+rK2ta91OTk4Onn322Vrf79Kl\ni/R1u3btcO/ePWk5MjISLi4uePrpp2FiYoK7d+/i1q1bte73woULGDp0KLp27QpjY2PMnj1bOubs\n7OwnxlHVlStXUFZWhq5du0rtN3nyZNy8eVNa50k/6xdffBFTp07FP/7xD3Tp0gWTJk1CUVGRUvvO\nyclB9+7dlVqX6oeJQQe88MILmDZtGj788EMAD0+Uo0ePRkFBgfQqKirCBx98ABsbG9y+fRt3796t\nth1bW1vMnj1b4XP37t3DiBEjqq1rYmKCgIAAxMTEYNOmTRg5cqTCdlavXq2wneLiYvTp00daR09P\nr9bjsbS0xJUrV6RlIQSys7NhZWUlfa/qmMXVq1el95Q5hqr7vnLlCiZOnIgVK1bg9u3bKCgogKur\nK8T/ivVqi/NJ8Vfq1KkTDAwMkJ2dLX2v6tePs7GxQVZWVp3bfdzhw4fx+eefIzY2Fnfu3EFBQQGM\njY2lY6gp3ilTpsDFxQWZmZm4e/cuFi5cKI0L2NraKoxfVFU1AVbG3KZNG+Tn50vtfffuXYVxnbra\natq0aTh16hTS09Nx4cIFfP7550p9zsbGBpmZmU9chxqGiUFHvPfee0hMTMSJEyfw5ptvYvv27fj1\n118hl8tRUlKChIQE5ObmomvXrnjppZfw9ttv486dOygrK5PqwsPDw/Gf//wHiYmJEEKguLgYO3fu\nVPjPtKpRo0bh+++/x48//ohRo0ZJ3588eTIWLVqE9PR0AA8HJ2NjY5U+ltdffx07d+7EgQMHUFZW\nhmXLlqFt27bo168fgIeJYuXKlcjNzcXt27excOFC6cRf32MoLi6Gnp4ezM3NUVFRgXXr1uHs2bPS\n+126dEFOTo7CwKx4eAm2zuNo1aoVQkJCEBERgQcPHiAjIwPr16+v9YT3xhtvYN++fYiNjUV5eTny\n8/Nx+vRpaZ+1KSoqgoGBAczNzSGTyTB//nwUFhY+MbZ79+7ByMgI7du3R0ZGBr755hvpvSFDhuD6\n9ev497//jdLSUhQVFSExMVFqj8uXL0vxdO3aFQEBAZg+fTqKiopQUVGBrKwspe81OHXqFE6cOIGy\nsjK0b98ebdu2RatWraR91
ZagAGDChAmYO3cuMjMzIYRAWloabt++rdR+6cmYGHSEubk5xowZgyVL\nlsDa2hpxcXFYtGgROnfuDFtbWyxbtkz6j3D9+vUwNDSEk5MTunTpgq+++goA8Nxzz+Hbb7/F1KlT\nYWpqCgcHB/zwww+1nsiCg4ORmZmJrl27ws3NTfr+K6+8gg8//BChoaEwNjaGm5sb9uzZI71f13+C\nPXr0wIYNGzBt2jR06tQJO3fuxPbt22FgYCB9ftSoUQgICED37t3h4OCAOXPmNOgYXFxcMGPGDPTt\n2xcWFhY4e/asdEkMAAYOHIiePXvCwsICnTt3lvZfdXtPOp6oqCjcvXsXFhYWGDNmDEaOHInWrVvX\nuK6NjQ127dqFZcuWwczMDF5eXkhLS6txn1X3GxQUhKCgIPTo0QN2dnZo165dtUtxj382MjISmzZt\nQseOHTFx4kSEhoZK6xgZGWHv3r3Yvn07unbtih49eiAhIQEAMHz4cAAPL5/5+PgAeHgpTiaTSVVo\nw4cPR15e3hPjrvxeYWEhJk6cCFNTU9jZ2cHc3ByzZs0CAIwfPx7p6ekwMTFBSEhItfaaPn06Xn/9\ndQQEBMDY2Bjh4eEoKSmp9WdBylPpDW7jxo3Dzp070blzZ4WuZVXvvPMOdu/ejfbt2yM6OloqgSOq\nTbdu3bBmzRq8+OKLmg6l3j788EPcuHED69at03QoRLVSaY9h7NixiI+Pr/X9Xbt2ITMzExcvXsTq\n1asxZcoUVYZDpHbnz59HWloahBBITEzE2rVrFe4bIdJGBqrcuJ+fHy5fvlzr+9u2bcOYMWMAAL17\n98adO3fw119/KVRfEDVnRUVFGDlyJK5du4YuXbpg5syZCA4O1nRYRE+k0sRQl9zc3GqlfDk5OUwM\n9ESXLl3SdAhK8/HxwcWLFzUdBlG9aHzw+fEhDmXKAImISHU02mOwsrJSqOvOyclRqFWvZG9v36D6\nbiKilqx79+4NutdDoz2G4OBgaZKw48eP4+mnn67xMlJWVpZUO97SX/PmzdN4DNryYluwLdgWj17J\nyclwd3fHkCFDkJubCyFEg/+hVmmPYeTIkfjtt99w69Yt2NjY4NNPP5VuFJo0aRIGDx6MXbt2wd7e\nHh06dGAJHxFRA3zxxRf47LPPEBkZidGjRzf6krxKE8PmzZvrXCcqKkqVIRAR6bxevXohNTUVlpaW\nTbI9jY4xUP35+/trOgStwbZ4hG3xSEtsi6p36zeFZvFoTz09PTSDMImItEpDz50aL1clIqK6yWQy\nzJs3D1988YXK98XEQESk5VJSUtCrVy8kJSXVOA1+U2NiICLSUpW9hMDAQMyYMQPbt29vsgHmJ+Hg\nMxGRlnrvvfdw9erVJq04UgYHn4mItFRRURGeeuqpBt+X0NBzJxMDEZGOYlUSEVEzJZPJkJ+fr+kw\nJEwMREQaVFlxtHLlSk2HImFiICLSgMcrjiqfW64NWJVERKRmKSkpCAsLg42NjdorjpTBwWciIjVb\nvnw5zM3Nm2Qm1CdhVRIRESlgVRIRETUJJgYiIhVJSUnBwYMHNR1GvTExEBE1saoVR9p0f4KyWJVE\nRNSEtL3iSBnsMRARNZGoqCi1z4SqCqxKIiJqIklJSejatavWJASWqxIRkQKWqxIRUZPg4DMRUT3I\nZDIsXLgQ+vr6mDdvnqbDUQn2GIiIlFT12cvh4eGaDkdlmBiIiOqgqWcvawovJRER1WH27Nn4448/\nmu19CfXFqiQiojo8ePAAbdu2VelMqKrAclUiIlLAclUiokaSyWTIy8vTdBgax8RARIRHFUdfffWV\npkPROCYGImrRHq84WrhwoaZD0jhWJRFRi6ULM6GqAgefiajF+u6779C6dWuVP3tZU1iVRERECliV\nRERETYKJgYh0XkpKCnbs2KHpMJoNlSaG+Ph4ODk5wcHBAUuWLKn2/q1btxAUFARPT0+4uro
iOjpa\nleEQUQtTteKouLhY0+E0GyobY5DL5XB0dMS+fftgZWWFXr16YfPmzXB2dpbWiYiIQGlpKT777DPc\nunULjo6O+Ouvv2BgoFgsxTEGIqqvqhVHq1evbpEVR1o3xpCYmAh7e3vY2dnB0NAQoaGhiIuLU1in\na9euKCwsBAAUFhbCzMysWlIgIqqv1atXt5iZUFVBZWfh3Nxc2NjYSMvW1tY4ceKEwjrh4eF48cUX\nYWlpiaKiImzZskVV4RBRC/L888/zvoRGUFliUKYmeNGiRfD09ERCQgKysrIwaNAgnD59GkZGRtXW\njYiIkL729/eHv79/E0ZLRLrExcVF0yFoREJCAhISEhq9HZUlBisrK2RnZ0vL2dnZsLa2Vljn6NGj\nmD17NgCge/fu6NatG86fPw8fH59q26uaGIiIKgkhdPLmtIZ4/J/mTz/9tEHbUdkYg4+PDy5evIjL\nly9DJpMhJiYGwcHBCus4OTlh3759AIC//voL58+fx7PPPquqkIhIh1RWHM2YMUPToegclfUYDAwM\nEBUVhcDAQMjlcowfPx7Ozs5YtWoVAGDSpEn45JNPMHbsWHh4eKCiogJLly6FqampqkIiIh3xeMUR\nNS1OiUFEzYZMJsPChQvxzTffIDIyUmfnOGoqDT13sjaUiJqNRYsWISkpiRVHKsYeAxE1GzKZDIaG\nhuwlKImzqxIRkQKtu/OZiKihZDIZrl69qukwWiwmBiLSKpXPXv7yyy81HUqLxcRARFrh8WcvL1u2\nTNMhtVisSiIijeOzl7ULB5+JSOO2bNmCkpIS3pfQxFiVRERECliVRERETYKJgYjUJiUlBf/97381\nHQbVgYmBiFSuasVRRUWFpsOhOrAqiYhUihVHzQ97DESkMtHR0Xz2cjPEqiQiUpk///wTbdu2ZULQ\nEJarEhGRAparEpFG8Z833cHEQESNUllxFB4erulQqImwKomIGozPXtZNSvcY7t+/r8o4iKgZeXwm\nVFYc6ZY6E8PRo0fh4uICR0dHAEBqairefvttlQdGRNrr66+/lp69/NZbb3HiOx1TZ1WSr68vtm7d\nir///e9ISUkBAPTs2RPnzp1TS4AAq5KItE15eTlatWrFhKDlGnruVGqMwdbWVvFDBhyaIGrJeA7Q\nbXVeSrK1tcWRI0cAPLyuGBkZCWdnZ5UHRkSaJ5PJcPHiRU2HQWpWZ2L45ptvsGLFCuTm5sLKygop\nKSlYsWKFOmIjIg2qfPbyF198oelQSM3qHGM4cuQI+vfvX+f3VIljDETqI5PJsGDBAvznP//BsmXL\n8Oabb3IsoZlS2ZQYXl5e0qDzk76nSkwMROqRkpKCMWPG4JlnnsGqVatYgtrMNfng87Fjx3D06FHc\nvHkTy5cvlzZeVFTE+dSJdFReXh5mzZrFXkILV2tikMlkKCoqglwuR1FRkfT9jh07YuvWrWoJjojU\n66WXXtJ0CKQF6ryUdPnyZdjZ2akpnJrxUhIRUf2p7D6G9u3bY+bMmUhPT8eDBw+knR04cKD+URKR\nVkhOTkZycjImTJig6VBIC9VZrvrGG2/AyckJf/75JyIiImBnZwcfHx91xEZETUwmk+Gf//wngoKC\n0K5dO02HQ1qqzktJ3t7eSE5Ohru7O9LS0gAAPj4+OHXqlFoCBHgpiagpJCcnIywsjBVHLYjKHtTT\nunVrAICFhQV27NiB5ORkFBQU1D9CItKYjRs3IigoCLNmzcK2bduYFOiJ6uwxbN++HX5+fsjOzsa0\nadNQWFiIiIgIBAcHqytG9hiIGunatWsAwITQwqj1mc+JiYnw9fWtc734+Hi89957kMvlmDBhAj78\n8MNq6yQkJOD9999HWVkZzM3NkZCQUD1IJgYionpr8sRQUVGBn3/+GVlZWXB1dcXgwYNx6tQpfPLJ\nJ7hx4wZSU1OfuGG5XA5HR0fs27cPVlZW6NWrFzZv3qw
wAd+dO3fQv39/7NmzB9bW1rh16xbMzc2b\n7OCIWqKKigro6/OpvaSCMYaJEydi5cqVKCgowIIFC/Dqq69izJgxePvtt5WaDiMxMRH29vaws7OD\noaEhQkNDERcXp7DOpk2b8Oqrr8La2hoAakwKRKScyoqjUaNGaToUauZqvY/h+PHjSEtLg76+PkpK\nSmBhYYGsrCyYmZkpteHc3FzY2NhIy9bW1jhx4oTCOhcvXkRZWRkGDBiAoqIivPvuuxg9enQDD4Wo\n5aqsOLK1teWzl6nRak0MhoaGUne0bdu26Natm9JJAYBS86yUlZUhOTkZ+/fvx/3799G3b1/06dMH\nDg4OSu+HqCWTyWRYuHAhvvnmG0RGRmL06NGc44gardbEkJGRATc3N2k5KytLWtbT05PuaaiNlZUV\nsrOzpeXs7GzpklElGxsbmJubo127dmjXrh1eeOEFnD59usbEEBERIX3t7+8Pf3//J+6fqCVYu3Yt\nkpKSIJOlYswYS4wZo+mISLMS/vdqnFoHny9fvvzED9Y1f1J5eTkcHR2xf/9+WFpawtfXt9rgc0ZG\nBqZOnYo9e/agtLQUvXv3RkxMDFxcXBSD5OAzUY0qKiqgp6cHfX098E+EHtfkcyU1duI8AwMDREVF\nITAwEHK5HOPHj4ezszNWrVoFAJg0aRKcnJwQFBQEd3d36OvrIzw8vFpSIKLasfqIVKFB9zGoG3sM\n1NJVPnu5Z8+eNb6vpwf2GKgalU2JQUSalZqaCl9fXyxfvlzToVALoVRiuH//Ps6fP6/qWIioCplM\nhnnz5iEgIADTp0/Hd999p+mQqIWoMzFs27YNXl5eCAwMBPDwmbDqnCeJqCVKS0uDr68vkpKSkJqa\nirfeeotlqKQ2Sk27feDAAQwYMEC649nV1RVnz55VS4AAxxio5Tl8+DAuXbqk9H0JHGOgmqjsCW6G\nhoZ4+umnFb7HSggi1fLz84Ofn5+mw6AWqs4zfM+ePbFx40aUl5fj4sWLmDZtGvr166eO2IiISAPq\nTAxff/01zp07hzZt2mDkyJHo2LEjvvzyS3XERqTzUlNT+fdEWqfOMYbk5GR4e3urK54acYyBdE3V\nOY6WLVvW6MkjOcZANVHZGMP06dORl5eH4cOHY8SIEXB1dW1QgET0UGpqKsLCwmBtbY3U1FQ+VY20\nTp2XkhISEnDw4EGYm5tj0qRJcHNzw7/+9S91xEakc3788UfpvoTt27czKZBWqteUGGfOnMGSJUsQ\nExODsrIyVcalgJeSSFfk5+ejtLS0yRMCLyVRTVT2zOf09HRs2bIFW7duhZmZGUaMGIHXXnsNnTt3\nbnCw9cXEQPRkTAxUE5Ulhj59+iA0NBTDhw+HlZVVgwNsDCYGao7kcjlatWqlln0xMVBNVJYYtAET\nAzUnlRVHp06dws6dO9WyTyYGqkmTVyUNHz4csbGxCk9xq7qzup7gRtQSVa04+vbbbzUdDlGD1Npj\nuHbtGiwtLXHlypVqGUdPTw/PPPOMWgKs3B97DKTNNP3sZfYYqCZN/jyGyqqJlStXws7OTuG1cuXK\nhkdKpINiY2M5EyrpjDrHGLy8vKRZVSu5ubnhzJkzKg2sKvYYSNtV/n5qKiGwx0A1afIxhm+++QYr\nV65EVlaWwjhDUVER+vfv37AoqdFMTYGCAk1HQdVptodgYqLR3ZOOqbXHcPfuXRQUFOCjjz7CkiVL\npKxjZGQEMzMz9QbJHoOE/xlqlkwmw9mzZzU+fxiRMpq8XLWwsBAdO3ZEfn5+jd1jU1PT+kfZQEwM\njzAxaE5lxZGrqys2bNig6XCI6tTkiWHIkCHYuXMn7OzsakwMly5dqn+UDcTE8AgTg/ppuuKIqKF4\ng1sLwcSgXmfOnMHo0aNhbW2N1atXc9I7alaavFy10pEjR3Dv3j0AwPr16zF9+nRcuXKl/hESNUNy\nuZwzoVKLU2ePwc3
NDadPn8aZM2cQFhaG8ePHIzY2Fr/99pu6YmSPoQr2GIhIWSrrMRgYGEBfXx+/\n/PIL/vGPf2Dq1KkoKipqUJBERKT96kwMRkZGWLRoETZs2IChQ4dCLper9VkMROqQmprKB1AR/U+d\niSEmJgZt2rTB2rVrYWFhgdzcXMyaNUsdsRGpnEwmw7x58xAQEKDW+b+ItJlSVUl5eXk4efIk9PT0\n4Ovrq9aH9AAcY6iKYwxNp+pMqKw4Il2ksjGGLVu2oHfv3oiNjcWWLVvg6+uL2NjYBgVJpC127tzJ\nZy8T1aLOHoO7uzv27dsn9RJu3ryJgQMHqvV5DOwxPMIeQ9MoKipCUVEREwLptCafRK+SEAKdOnWS\nls3MzHiSpmbPyMgIRkZGmg6DSCvVmRiCgoIQGBiIUaNGQQiBmJgYvPTSS+qIjahJlJWVwdDQUNNh\nEDUbSg0+//TTT/j9998BAH5+fhg2bJjKA6uKl5Ie4aUk5VXOcZSQkICEhATOb0QtTpNfSrpw4QJm\nzZqFzMxMuLu74/PPP4e1tXWjgiRSl6oVR5s3b2ZSIKqHWquSxo0bh6FDh+LHH3+Et7c33nnnHXXG\nRdQgVe9LYMURUcPUmhju3buH8PBwODk5YdasWQ2aZjs+Ph5OTk5wcHDAkiVLal3v5MmTMDAwwE8/\n/VTvfRBVtWfPHj57maiRar2UVFJSguTkZAAPK5MePHiA5ORkCCGgp6dX5xOs5HI5pk6din379sHK\nygq9evVCcHAwnJ2dq6334YcfIigoiOMI1GhDhw7F0KFDmRCIGqHWxGBhYYEZM2bUunzw4MEnbjgx\nMRH29vaws7MDAISGhiIuLq5aYvj666/x2muv4eTJkw2Jn0gBEwJR49WaGBISEhq14dzcXNjY2EjL\n1tbWOHHiRLV14uLicODAAWnKDSJlyGQynDp1Cv369dN0KEQ6p84pMRpKmZP8e++9h8WLF0slVbyU\nRMpITU2Fr68vvvjiC/7OEKlAnTe4NZSVlRWys7Ol5ezs7GrlrklJSQgNDQUA3Lp1C7t374ahoSGC\ng4OrbU9PL6LKkv//Xi2PiYmmI9AcPnuZ6Mkq79lpLJU987m8vByOjo7Yv38/LC0t4evri82bN1cb\nY6g0duxYvPzyywgJCakeJG9wa/HS09MxatQozoRKVA8qm121oqIC69evx/z58wEAV69eRWJiYp0b\nNjAwQFRUFAIDA+Hi4oIRI0bA2dkZq1atwqpVq+odKLVsrVu35n0JRGpSZ49h8uTJ0NfXx4EDB5CR\nkYHbt2/DcJAhAAATBElEQVQjICAAp06dUleM7DEQETWAymZXPXHiBFJSUuDl5QUAMDU15aM9iYh0\nWJ2Xklq3bg25XC4t37x5E/r6KitmohYuNTUVs2bNYg+RSIPqPMNPmzYNw4YNw40bN/DJJ5+gf//+\n+Pjjj9URG7UgVec4cnNz03Q4RC2aUlVJf/zxB/bv3w8AGDhwYK2VRarCMQbdxmcvE6lGQ8+ddSaG\nq1evAoC08cq6cVtb23rvrKGYGHTX/v37MXLkSN6XQKQCKksMrq6u0h9rSUkJLl26BEdHR5w7d65h\nkTYAE4PuKi0tRX5+PnsJRCqgsqqks2fPKiwnJydjxYoV9d4RUU3atGnDpECkZRp057Orq2u1hKFK\n7DHohpKSErRt21bTYRC1GCrrMSxbtkz6uqKiAsnJybCysqr3jqjlqpzjaOfOnZxFl6gZqDMx3Lt3\n79HKBgYYOnQoXn31VZUGRbqjasXRtm3bmBSImoEnJga5XI7CwkKFXgORMjgTKlHzVWtiKC8vh4GB\nAY4cOSI9zpNIWceOHUNycjJSU1M5uEzUzNQ6+Ozt7Y3k5GRMnjwZ165dw/Dhw9G+ffuHH9LTq3F6\nbJUFycFnIqJ6a/LB58qNlZSUwMzMDAcOHFB4X52JgYiI1KfWxHDz5k0sX76c89bQE
8lkMhw+fBgD\nBw7UdChE1ERqTQxyuRxFRUXqjIWamcqKo27dumHAgAGcdZdIR9Q6xuDl5YWUlBR1x1MjjjFoF1Yc\nETUPKrvBjaiqjIwMhIaGwtramhVHRDqq1h5Dfn4+zMzM1B1Pjdhj0B7Xrl3D/v378eabb7KXQKTl\nVDa7qjZgYiAiqr+Gnjs5WkhERAqYGKhGqampmDx5MioqKjQdChGpGRMDKaj67OV+/fpxHIGoBWJV\nEkmqzoTKiiOilos9BgIAHD16FAEBAZg+fTq2b9/OpEDUgrEqiQA8vNP95s2bsLCw0HQoRNREWK5K\nREQKWK5KSisuLtZ0CESkxZgYWpDKiiNfX1/I5XJNh0NEWoqJoYVITU2Fr68vkpKSsHfvXrRq1UrT\nIRGRlmJi0HFV70tgxRERKYP3Mei4M2fOIDU1lfclEJHSWJVERKSjWJVERERNgolBR8hkMuzYsUPT\nYRCRDmBi0AGVFUerV69GeXm5psMhomZO5YkhPj4eTk5OcHBwwJIlS6q9v3HjRnh4eMDd3R39+/dH\nWlqaqkPSGY9XHMXFxcHAgPUERNQ4Kj2LyOVyTJ06Ffv27YOVlRV69eqF4OBgODs7S+s8++yzOHTo\nEIyNjREfH4+JEyfi+PHjqgxLJ2RmZuK1117jTKhE1ORU2mNITEyEvb097OzsYGhoiNDQUMTFxSms\n07dvXxgbGwMAevfujZycHFWGpDPMzMzwwQcf8L4EImpyKk0Mubm5sLGxkZatra2Rm5tb6/pr1qzB\n4MGDVRmSzjAxMcGoUaP4IB0ianIqvZRUn5PWwYMHsXbtWhw5cqTG9yMiIqSv/f394e/v38joiIh0\nS0JCAhISEhq9HZUmBisrK2RnZ0vL2dnZsLa2rrZeWloawsPDER8fDxMTkxq3VTUxtCSpqamIjIzE\nunXrYGhoqOlwiEiLPf5P86efftqg7aj0UpKPjw8uXryIy5cvQyaTISYmBsHBwQrrXL16FSEhIdiw\nYQPs7e1VGU6zUrXiKCAggNVGRKQ2Kj3bGBgYICoqCoGBgZDL5Rg/fjycnZ2xatUqAMCkSZMwf/58\nFBQUYMqUKQAAQ0NDJCYmqjIsrcdnLxORJnGuJC2TkpKCwMBAREZGYvTo0RxcJqIG46M9dYQQArdu\n3UKnTp00HQoRNXNMDEREpICzqzZDd+/e1XQIRETVMDFoQGXFkbe3N2QymabDISJSwMSgZikpKejV\nqxeSkpJw+PBhtG7dWtMhEREpYGJQk8peQmBgIGbOnMk5johIa/GuKTXJysrC2bNneV8CEWk9ViUR\nEekoViUREVGTYGJoYjKZDLGxsZoOg4iowZgYmlBlxdEPP/yA0tJSTYdDRNQgTAxN4PGKo23btqFN\nmzaaDouIqEFYldRIly5dwiuvvAJbW1tWHBGRTmBVUiMVFxdjx44deP311zkTKhFpFU6iR0RECliu\nSkRETYKJQUkpKSkICQlBSUmJpkMhIlIpJoY6VK04GjZsGKuNiEjnsSrpCVJSUhAWFsaKIyJqUTj4\nXIvz58/Dz88Py5Ytw5tvvsmKIyJqdliVpAIFBQUwMTFR+36JiJoCEwMRESlguWoj5OfnazoEIiKt\n0aITQ2XFkZeXF+7fv6/pcIiItEKLTQyVM6EmJyfj+PHjaN++vaZDIiLSCi0uMdQ0EyrLUImIHmlx\n9zFcv34dGRkZvC+BiKgWrEoiItJRrEoiIqImobOJQSaT4fvvv2dPg4ionnQyMVRWHG3dupVlqERE\n9aRTiaGmiqMOHTpoOiwiomZFZ6qScnJyMGTIEM6ESkTUSDpTlSSTybBjxw4MGzaMM6ESEYGT6BER\n0WO0slw1Pj4eTk5OcHBwwJIlS2pc55133oGDgwM8PDyQkpKiynCIiEgJKksMcrkcU6dORXx8PNLT\n07F582b88ccfCuvs2rULmZmZuHjxIlavXo0pU
6bUud2UlBS89NJLKCwsVFXoWi0hIUHTIWgNtsUj\nbItH2BaNp7LEkJiYCHt7e9jZ2cHQ0BChoaGIi4tTWGfbtm0YM2YMAKB37964c+cO/vrrrxq3V7Xi\naNSoUTAyMlJV6FqNv/SPsC0eYVs8wrZoPJVVJeXm5sLGxkZatra2xokTJ+pcJycnB126dKm2vV69\nerHiiIhIDVSWGJStDHp8YKS2z82YMQOjR49mxRERkaoJFTl27JgIDAyUlhctWiQWL16ssM6kSZPE\n5s2bpWVHR0eRl5dXbVvdu3cXAPjiiy+++KrHq3v37g06f6usx+Dj44OLFy/i8uXLsLS0RExMDDZv\n3qywTnBwMKKiohAaGorjx4/j6aefrvEyUmZmpqrCJCKix6gsMRgYGCAqKgqBgYGQy+UYP348nJ2d\nsWrVKgDApEmTMHjwYOzatQv29vbo0KED1q1bp6pwiIhISc3iBjciIlIfrZpEjzfEPVJXW2zcuBEe\nHh5wd3dH//79kZaWpoEo1UOZ3wsAOHnyJAwMDPDTTz+pMTr1UaYdEhIS4OXlBVdXV/j7+6s3QDWq\nqy1u3bqFoKAgeHp6wtXVFdHR0eoPUk3GjRuHLl26wM3NrdZ16n3ebNDIhAqUl5eL7t27i0uXLgmZ\nTCY8PDxEenq6wjo7d+4UL730khBCiOPHj4vevXtrIlSVU6Ytjh49Ku7cuSOEEGL37t0tui0q1xsw\nYIAYMmSI2Lp1qwYiVS1l2qGgoEC4uLiI7OxsIYQQN2/e1ESoKqdMW8ybN0989NFHQoiH7WBqairK\nyso0Ea7KHTp0SCQnJwtXV9ca32/IeVNregxNfUNcc6ZMW/Tt2xfGxsYAHrZFTk6OJkJVOWXaAgC+\n/vprvPbaa+jUqZMGolQ9Zdph06ZNePXVV2FtbQ0AMDc310SoKqdMW3Tt2lWaHaGwsBBmZmYwMNCZ\nyaQV+Pn5wcTEpNb3G3Le1JrEUNPNbrm5uXWuo4snRGXaoqo1a9Zg8ODB6ghN7ZT9vYiLi5OmVNHF\ne12UaYeLFy/i9u3bGDBgAHx8fLB+/Xp1h6kWyrRFeHg4zp07B0tLS3h4eODf//63usPUGg05b2pN\nCm3qG+Kas/oc08GDB7F27VocOXJEhRFpjjJt8d5772Hx4sXSTJKP/47oAmXaoaysDMnJydi/fz/u\n37+Pvn37ok+fPnBwcFBDhOqjTFssWrQInp6eSEhIQFZWFgYNGoTTp0+32Kl06nve1JrEYGVlhezs\nbGk5Oztb6hLXtk5OTg6srKzUFqO6KNMWAJCWlobw8HDEx8c/sSvZnCnTFklJSQgNDQXwcNBx9+7d\nMDQ0RHBwsFpjVSVl2sHGxgbm5uZo164d2rVrhxdeeAGnT5/WucSgTFscPXoUs2fPBgB0794d3bp1\nw/nz5+Hj46PWWLVBg86bTTYC0khlZWXi2WefFZcuXRKlpaV1Dj4fO3ZMZwdclWmLK1euiO7du4tj\nx45pKEr1UKYtqgoLCxM//vijGiNUD2Xa4Y8//hADBw4U5eXlori4WLi6uopz585pKGLVUaYt3n//\nfRERESGEECIvL09YWVmJ/Px8TYSrFpcuXVJq8FnZ86bW9Bh4Q9wjyrTF/PnzUVBQIF1XNzQ0RGJi\noibDVgll2qIlUKYdnJycEBQUBHd3d+jr6yM8PBwuLi4ajrzpKdMWn3zyCcaOHQsPDw9UVFRg6dKl\nMDU11XDkqjFy5Ej89ttvuHXrFmxsbPDpp5+irKwMQMPPm7zBjYiIFGhNVRIREWkHJgYiIlLAxEBE\nRAqYGIiISAETAxERKWBiICIiBUwMpDVatWoFLy8v6XX16tVa133qqacavb+wsDA8++yz8PLywnPP\nPYfjx4/Xexvh4eHIyMgA8HAahqr69+/f6BiBR+3i7u6OkJAQ3Lt374nrnz59Grt3726SfVPLxPsY\nSGsYGRmhq
KioydetzdixY/Hyyy8jJCQEe/fuxcyZM3H69OkGb68pYqpru2FhYXBzc8OMGTNqXT86\nOhpJSUn4+uuvmzwWahnYYyCtVVxcjP/7v//Dc889B3d3d2zbtq3aOtevX8cLL7wALy8vuLm54fff\nfwcA/Prrr+jXrx+ee+45vP766yguLq5xH5X/F/n5+UnPFl++fDnc3Nzg5uYmzcpZXFyMIUOGwNPT\nE25uboiNjQUA+Pv7IykpCR999BEePHgALy8vjB49GsCjXk1oaCh27dol7TMsLAw//fQTKioqMGvW\nLPj6+sLDwwOrV6+us0369u2LrKwsAA+nn+7Xrx+8vb3Rv39/XLhwATKZDP/85z8RExMDLy8vxMbG\nori4GOPGjUPv3r3h7e1dYzsSKWiquTqIGqtVq1bC09NTeHp6ipCQEFFeXi4KCwuFEA8ftmJvby+t\n+9RTTwkhhIiMjBQLFy4UQgghl8tFUVGRuHnzpnjhhRfE/fv3hRBCLF68WMyfP7/a/sLCwqSH+mzZ\nskX06dNHJCUlCTc3N3H//n1x79490bNnT5GSkiK2bt0qwsPDpc/evXtXCCGEv7+/SEpKUojp8Rh/\n/vlnMWbMGCGEEKWlpcLGxkaUlJSIVatWiQULFgghhCgpKRE+Pj7i0qVL1eKs3E55ebkICQkRK1as\nEEIIUVhYKMrLy4UQQuzdu1e8+uqrQgghoqOjxbRp06TPf/zxx2LDhg1CiIcP8+nRo4coLi6u8WdA\nJIQWzZVE1K5dO4XHDpaVleHjjz/G4cOHoa+vj2vXruHGjRvo3LmztI6vry/GjRuHsrIyvPLKK/Dw\n8EBCQgLS09PRr18/AIBMJpO+rkoIgVmzZmHBggXo3Lkz1qxZg7179yIkJATt2rUDAISEhODw4cMI\nCgrCzJkz8dFHH2Ho0KF4/vnnlT6uoKAgvPvuu5DJZNi9ezf+9re/oU2bNvj1119x5swZbN26FcDD\nB8pkZmbCzs5O4fOVPZHc3FzY2dlh8uTJAIA7d+7grbfeQmZmJvT09FBeXi4dl6hyhfjXX3/F9u3b\nERkZCQAoLS1FdnY2HB0dlT4GalmYGEhrbdy4Ebdu3UJycjJatWqFbt26oaSkRGEdPz8/HD58GDt2\n7EBYWBimT58OExMTDBo0CJs2bXri9vX09BAZGYmQkBDpe/v27VM4qQohoKenBwcHB6SkpGDnzp2Y\nM2cOBg4ciLlz5yp1HG3btoW/vz/27NmDLVu2YOTIkdJ7UVFRGDRo0BM/X5kwHzx4gMDAQMTFxWHY\nsGGYO3cuBg4ciJ9//hlXrlx54jOef/rpJ52bfptUh2MMpLUKCwvRuXNntGrVCgcPHsSVK1eqrXP1\n6lV06tQJEyZMwIQJE5CSkoI+ffrgyJEj0rX44uJiXLx4scZ9iMdqL/z8/PDLL7/gwYMHKC4uxi+/\n/AI/Pz9cv34dbdu2xRtvvIGZM2fW+EB1Q0ND6b/2x40YMQJr166Veh8AEBgYiJUrV0qfuXDhAu7f\nv19re7Rr1w5fffUVZs+eDSEECgsLYWlpCQAKM2Z27NhRYRA8MDAQX331lbSs1MPgqUVjYiCt8fhT\npd544w2cOnUK7u7uWL9+PZydnaute/DgQXh6esLb2xtbtmzBu+++C3Nzc0RHR2PkyJHw8PBAv379\ncP78eaX26eXlhbCwMPj6+qJPnz4IDw+Hh4cHzpw5g969e8PLywvz58/HnDlzqm1r4sSJcHd3lwaf\nq247ICAAhw4dwqBBg6RnD0+YMAEuLi7w9vaGm5sbpkyZUmNiqbodT09P2NvbY8uWLfjggw/w8ccf\nw9vbG3K5XFpvwIABSE9Plwaf586di7KyMri7u8PV1RXz5s2r/YdABJarEhHRY9hjICIiBUwMRESk\ngImBiIgUMDEQEZECJgYiIlLAxEBERAqYGIiISAETAxERKfh/tBJrKHG1kPU
AAAAASUVORK5CYII=\n", "text": [ "" ] } ], "prompt_number": 24 }, { "cell_type": "markdown", "metadata": {}, "source": [ "OK, so that's terrible, probably because we're only using 20 training and test examples.\n", "We can now increase that to 1000 samples per class:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "from sklearn.utils import shuffle as skshuffle\n", "#how many samples to take from each class?\n", "N = 1000\n", "#concatenate the positive and negative training sets together (feature columns 0-161)\n", "X = pd.concat((nipd[nipfiles[0]].iloc[:N,0:162],nipd[nipfiles[1]].iloc[:N,0:162]))\n", "#get the class label vector y the same way (column 162)\n", "y = pd.concat((nipd[nipfiles[0]].iloc[:N,162],nipd[nipfiles[1]].iloc[:N,162]))\n", "#then shuffle X and y together so rows and labels stay aligned\n", "X,y = skshuffle(X.values,y.values)\n", "#find the midpoint (integer division, so it is a valid slice index)\n", "half = len(X)//2\n", "#split it into train and test halves\n", "Xtrain, Xtest = X[:half], X[half:]\n", "ytrain, ytest = y[:half], y[half:]" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 44 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then we fit the model again and test it again:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "#reinitialise a logistic regression model\n", "logmodel = sklm.LogisticRegression()\n", "#fit it again\n", "logmodel.fit(Xtrain,ytrain)\n", "#test it again\n", "estimates = logmodel.predict_proba(Xtest)\n", "#recalculate the roc curve using the predicted probability of the positive class\n", "fpr, tpr, thresholds = roc_curve(ytest,estimates[:,1])\n", "#print area under roc curve\n", "roc_auc = auc(fpr,tpr)\n", "print(\"Area under ROC curve: %.2f\" % roc_auc)\n", "#replot the roc curve\n", "plotroc(fpr,tpr)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Area under ROC curve: 0.90\n" ] }, { "metadata": {}, "output_type": "display_data", "png": 
"iVBORw0KGgoAAAANSUhEUgAAAYYAAAEZCAYAAACTsIJzAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3XdYVFf6B/AvCCoaJBQbLRhRQQHBIBZCFuNPIGrYSGJE\noxELlqymWNLUFV01atBkEzWriSWxrWIKdmMjGhtKEUtsxAIoURAFERgYzu8PlysjjAzlzgwz38/z\n8Mgwd+5954L3nXPPe84xEUIIEBER/Y+prgMgIiL9wsRAREQqmBiIiEgFEwMREalgYiAiIhVMDERE\npIKJgTTm4eGBQ4cO6ToMnRs/fjzmzJmj1WNGRERgxowZWj2mXNavX4/g4OAavZZ/g9phwnEM9ZOL\niwtu376NBg0aoGnTpujTpw+WLl2KZs2a6To0g7JmzRqsXLkShw8f1mkcI0aMgJOTE2bPnq3TOKKi\nopCamoq1a9fKfqyIiAg4OTnhX//6l+zHIlVsMdRTJiYm2L59O/Ly8nD69GmcOXNG659i60JJSYlR\nHluXlEqlUR6bNMfEYABatmyJoKAgnDt3TvrZ8ePH0bNnT1hbW8Pb2xu//fab9Nzdu3cxYsQIODg4\nwMbGBgMGDJCe2759O7y9vWFtbQ1/f3+cOXNGes7FxQUHDhzAzZs30aRJE+Tk5EjPJSUloXnz5tJ/\n/FWrVqFjx46wsbFBSEgIbty4IW1ramqKZcuWoV27dujQoUOl72nr1q3o1KkTrK2t0atXL1y4cEEl\njvnz56NTp06wsbHByJEjUVRUpPF7WLhwIby8vGBpaQmlUon58+fD1dUVzZo1Q6dOnfDLL78AAP74\n4w+MHz8ex44dg6WlJWxsbACo3taJi4uDo6MjFi9ejJYtW8Le3h5r1qyRjpednY1XX30VVlZW8PPz\nw/Tp0xEQEKD2d/n7779LvzdnZ2f88MMPKr+3/v37o1mzZujevTv+/PNP6bn33nsPzs7OsLKygq+v\nL37//XfpuaioKLzxxhsYNmwYrKys8P333+PkyZPo0aMHrK2tYW9vj4kTJ6K4uFh6zblz59CnTx/Y\n2tqiVatW+Oyzz7Bnzx589tln2LRpEywtLeHj4wMAuH//PkaNGgV7e3s4OjpixowZKC0tBfCoxeXv\n749JkybBzs4OUVFRWLNmjXQOhBD44IMP0LJlS1hZWcHLywvnzp3DihUrsGHDBixcuBCWlpb4+9//\nLv3+9u/fD+BRkpk3b570u/P19UV6errac0vVIKhecnFxEfv27RNCCJGWliY8PT3FrFmzhBBCpKen\nC1tbW7Fr1y4hhBB79+4Vtra2IisrSwghRN++fUV4eLi4d++eKC4uFocOHRJCCJGYmChatGgh4uPj\nRWlpqfj++++Fi4uLUCgU0jH3798vhBDi5ZdfFt9++60Uz5QpU8T48eOFEEL88ssvwtXVVVy4cEEo\nlUoxZ84c0bNnT2lbExMTERQUJHJyckRhYWGF93bx4kXRtGlTsW/fPlFSUiIWLlwoXF1dRXFxsRBC\niOeee054enqK9PR0cffuXeHv7y+mT5+u0Xt47rnnhI+Pj0hPT5eOHRMTI27duiWEEGLTpk2iadOm\nIjMzUwghxJo1a8SLL76oEl9ERISYMWOGEEKIgwcPCjMzMzFz5kxRUlIidu7cKZo0aSLu3bsnhBBi\n0KBBYvDgwaKgoECcP39eODk5iYCAgEp/p9euXROWlpbiv//9rygpKRHZ2dkiOTlZCCHE8OHDha2t\nrTh58qQoKSkRb731lggPD5deu27dOnH37l2hVCrFokWLRKtWrURRUZEQQoiZM2cKc3NzERsbK4QQ\noqCgQCQkJIgTJ04IpVIprl27Jtzd3cWXX34phBAiNzdXtGrVSixevFgUFRWJvLw8ceLECSGEEFFR\nUWLYsGEqcb/22mti3Lhx4uHDh+L27dvCz89PLF++XAghxOrVq
4WZmZlYsmSJUCqVoqCgQKxevVo6\np7t37xYvvPCCuH//vhBCiAsXLki/i/LnuUz5v8GFCxcKT09PcenSJSGEECkpKSI7O7vSc0vVw8RQ\nTz333HPimWeeEZaWlsLExES89tprQqlUCiGEmD9/foX/vMHBweL7778XN2/eFKamptKFq7xx48ZV\n+I/YoUMHKXGU/0/53XffiZdfflkIIURpaalwcnIShw8fFkIIERISIlauXCntQ6lUiiZNmogbN24I\nIR4lhoMHD6p9b7NnzxaDBg2SHpeWlgoHBwfx22+/SXGUXXiEEGLnzp2ibdu2Gr+H1atXqz22EEJ4\ne3tLF9HyF7EyERERUiI6ePCgsLCwkM69EEK0aNFCnDhxQpSUlAhzc3PpwiWEENOnT6+wvzLz5s0T\nYWFhlT4XEREhIiMjVd6zm5ub2vdgbW0tUlJShBCPEsPf/va3p7xjIb744gsxYMAAIYQQGzZsEF26\ndKl0u5kzZ4qhQ4dKjzMzM0WjRo1EQUGB9LMNGzaIXr16CSEenT9nZ2eVfZQ/p/v37xft27cXx48f\nVzmHZe+57DyXKf832L59e7F169anvi+qGd5KqqdMTEwQGxuL3NxcxMXF4cCBAzh16hQA4Pr164iJ\niYG1tbX0deTIEWRmZiItLQ02NjawsrKqsM/r169j0aJFKq9LT0/HzZs3K2wbFhaGY8eOITMzE4cO\nHYKpqSlefPFFaT/vvfeetA9bW1sAQEZGhvR6Jycnte/t1q1bcHZ2VnmvTk5Oal/v7OwsxajJe3jy\n2D/88AN8fHyk7c+ePYvs7Gy18T3J1tYWpqaP/ys1adIEDx48wJ07d1BSUqJyPEdHR7X7SU9Px/PP\nP6/2+ZYtW0rfW1hY4MGDB9Lj6OhodOzYEc8++yysra1x//59ZGVlqT3upUuX0L9/f7Ru3RpWVlaY\nNm2a9J7T0tKeGkd5169fR3FxMVq3bi2dv3HjxuHOnTvSNk/7Xb/88suYMGEC/vGPf6Bly5YYO3Ys\n8vLyNDp2eno62rZtq9G2VD1MDAbgpZdewsSJE/HRRx8BeHShHDZsGHJycqSvvLw8fPjhh3BycsLd\nu3dx//79CvtxdnbGtGnTVF734MEDDBo0qMK21tbWCAoKwqZNm7BhwwYMHjxYZT8rVqxQ2U9+fj66\nd+8ubWNiYqL2/djb2+P69evSYyEE0tLS4ODgIP2sfJ/FjRs3pOc0eQ/lj339+nWMGTMGS5cuxd27\nd5GTkwMPDw+I/xXrqYvzafGXad68OczMzJCWlib9rPz3T3JyckJqamqV+33S4cOH8fnnnyMmJgb3\n7t1DTk4OrKyspPdQWbzjx49Hx44dceXKFdy/fx9z586V+gWcnZ1V+i/KK58Ay2Ju1KgRsrOzpfN9\n//59lX6dqs7VxIkTcerUKZw/fx6XLl3C559/rtHrnJyccOXKladuQzXDxGAg3n//fcTHx+PEiRMY\nOnQotm3bhl9//RVKpRKFhYWIi4tDRkYGWrdujVdeeQXvvPMO7t27h+LiYqkuPDIyEv/5z38QHx8P\nIQTy8/OxY8cOlU+m5Q0ZMgTff/89fvzxRwwZMkT6+bhx4zBv3jycP38ewKPOyZiYGI3fy5tvvokd\nO3bgwIEDKC4uxqJFi9C4cWP07NkTwKNEsWzZMmRkZODu3buYO3eudOGv7nvIz8+HiYkJ7OzsUFpa\nitWrV+Ps2bPS8y1btkR6erpKx6x4dAu2yvfRoEEDhIWFISoqCgUFBbhw4QLWrl2r9oL31ltvYd++\nfYiJiUFJSQmys7Nx+vRp6Zjq5OXlwczMDHZ2dlAoFJg9ezZyc3OfGtuDBw9gaWmJJk2a4MKFC/jm\nm2+k5/r164dbt27h3//+N4qKipCXl4f4+HjpfFy7dk2Kp3Xr1ggKCsKkSZOQl5eH0tJSpKamajzW\n4NSpUzhx4gSKi4vRpEkTN
G7cGA0aNJCOpS5BAcDo0aMxY8YMXLlyBUIIpKSk4O7duxodl56OicFA\n2NnZYfjw4ViwYAEcHR0RGxuLefPmoUWLFnB2dsaiRYukT4Rr166Fubk53Nzc0LJlS3z11VcAgBde\neAHffvstJkyYABsbG7Rr1w4//PCD2gtZaGgorly5gtatW8PT01P6+WuvvYaPPvoI4eHhsLKygqen\nJ/bs2SM9X9Unwfbt22PdunWYOHEimjdvjh07dmDbtm0wMzOTXj9kyBAEBQWhbdu2aNeuHaZPn16j\n99CxY0dMnjwZPXr0QKtWrXD27FnplhgA9O7dG506dUKrVq3QokUL6fjl9/e097NkyRLcv38frVq1\nwvDhwzF48GA0bNiw0m2dnJywc+dOLFq0CLa2tvDx8UFKSkqlxyx/3JCQEISEhKB9+/ZwcXGBhYVF\nhVtxT742OjoaGzZsQLNmzTBmzBiEh4dL21haWmLv3r3Ytm0bWrdujfbt2yMuLg4AMHDgQACPbp/5\n+voCeHQrTqFQSFVoAwcORGZm5lPjLvtZbm4uxowZAxsbG7i4uMDOzg5Tp04FAIwaNQrnz5+HtbU1\nwsLCKpyvSZMm4c0330RQUBCsrKwQGRmJwsJCtb8L0pysA9xGjhyJHTt2oEWLFipNy/Leffdd7Nq1\nC02aNMGaNWukEjgiddq0aYOVK1fi5Zdf1nUo1fbRRx/h9u3bWL16ta5DIVJL1hbDiBEjsHv3brXP\n79y5E1euXMHly5exYsUKjB8/Xs5wiLTu4sWLSElJgRAC8fHxWLVqlcq4ESJ9ZCbnzgMCAnDt2jW1\nz2/duhXDhw8HAHTr1g337t3DX3/9pVJ9QVSf5eXlYfDgwbh58yZatmyJKVOmIDQ0VNdhET2VrImh\nKhkZGRVK+dLT05kY6KmuXr2q6xA05uvri8uXL+s6DKJq0Xnn85NdHJqUARIRkXx02mJwcHBQqetO\nT09XqVUv4+rqWqP6biIiY9a2bdsajfXQaYshNDRUmiTs+PHjePbZZyu9jZSamirVjhv718yZM3Ue\ng7588VzwXGj7XFhbCwCPv6ytdf9+y74SExPh5eWFfv36ISMjA0KIGn+glrXFMHjwYPz222/IysqC\nk5MTZs2aJQ0UGjt2LPr27YudO3fC1dUVTZs2ZQkfEcnKxgYoNylwtVlbA0IPV7D54osv8NlnnyE6\nOhrDhg2r9S15WRPDxo0bq9xmyZIlcoZAREbgyQv+rFmVb6evF/ba6tq1K5KTk2Fvb18n+9N55zNV\nT2BgoK5D0Bs8F4/V93NhYwOYmNT8C3h0wRcCOHgwUPr+yS9DnTHjxRdfrLOkANSTpT1NTExQD8Ik\nMlp1cYvGUC/aulTTaydbDESkMXWf7IHKP6Fr+sWkUDWFQoGZM2fiiy++kP1YTAxEpJH/rWzKC7sO\nJCUloWvXrkhISKh0Gvy6xsRARJKn3esHmAC0rayVEBwcjMmTJ2Pbtm112pegjk4HuBGRvKp7799Q\nq3bqq/fffx83btyo04ojTbDzmciAPJkI2Klbv+Xl5eGZZ56p8biEml472WIg0gO1reopw0/8hsXS\n0lInx2UfA5EWVFWnD9SuqoedwPWbQqFAdna2rsOQMDEQ1ZImg7MAXtCpcmUVR8uWLdN1KBImBqJq\nqCwJAPwkT9X3ZMVR2brl+oB9DEQaKl/HT1QbSUlJiIiIgJOTk9YrjjTBqiQiDZQlBX7yp7qwePFi\n2NnZ1clMqE9T02snEwMZPU0qglj2SfUR50oieoKmM3YC7CMgKo+Jgeq1qqZwYIkn6VJSUhIOHjyo\n6zCqjYmB9FJdfNrnBZ90pXzFkT6NT9AUq5JIFoa6hCJRVfS94kgTbDFQnXjyEz7AEbxkfJYsWaL1\nmVDlwKokqhFO1kZUUUJCAlq3bq03CYHlqqRVJia81UOk71iuSnVC005fa2tdR0pEcmFiMHI
17Rvg\nbSMyVmUVR7NmzdJ1KLJhYjBila3hyws+kXrl116OjIzUdTiyYWIwQmWtBICJgEgTulp7WVc4jsGI\nlFUScYwAUfVMmzYNf/zxR70dl1BdrEoyEpwdlKjmCgoK0LhxY1lnQpUDy1VJLSYFIuPEclVSKyeH\nSYFIEwqFApmZmboOQ+eYGIiI8Lji6KuvvtJ1KDrHxGBgKhugxsFoROo9WXE0d+5cXYekc6xKqqfU\nzV7KiiMizRnCTKhyYOdzPcW5iohq77vvvkPDhg1lX3tZV1iVZERYZUREmmBVkhHgiGUi0gYmBj2m\nboI7JgWi6klKSsL27dt1HUa9IWti2L17N9zc3NCuXTssWLCgwvNZWVkICQmBt7c3PDw8sGbNGjnD\n0XtVzXTKhEBUPeUrjvLz83UdTr0hWx+DUqlEhw4dsG/fPjg4OKBr167YuHEj3N3dpW2ioqJQVFSE\nzz77DFlZWejQoQP++usvmJmpFksZSh9DVesgcxU0orpTvuJoxYoVRllxpHd9DPHx8XB1dYWLiwvM\nzc0RHh6O2NhYlW1at26N3NxcAEBubi5sbW0rJAVDkpPDNQ6ItGHFihVGMxOqHGS7CmdkZMDJyUl6\n7OjoiBMnTqhsExkZiZdffhn29vbIy8vD5s2b5QqHiIzIiy++yHEJtSBbYtCkJnjevHnw9vZGXFwc\nUlNT0adPH5w+fRqWlpYVto2KipK+DwwMRGBgYB1GK6/y010Tkfw6duyo6xB0Ii4uDnFxcbXej2yJ\nwcHBAWlpadLjtLQ0ODo6qmxz9OhRTJs2DQDQtm1btGnTBhcvXoSvr2+F/ZVPDPVN2S0kIqp7QgiD\nHJxWE09+aK7p8qOy9TH4+vri8uXLuHbtGhQKBTZt2oTQ0FCVbdzc3LBv3z4AwF9//YWLFy/i+eef\nlyskrSurMmJLgajulVUcTZ48WdehGBzZWgxmZmZYsmQJgoODoVQqMWrUKLi7u2P58uUAgLFjx+LT\nTz/FiBEj0LlzZ5SWlmLhwoWwKRvWW89UVnHEeYuI5PFkxRHVLU6JUQc4RQWRdigUCsydOxfffPMN\noqOjDXaOo7pS02un4daGyqx8C4HjD4i0Y968eUhISGDFkczYYqgBthCIdEOhUMDc3JytBA1xdlUt\n4pTXRFQf6N3IZyKimlIoFLhx44auwzBaTAzVZGPD8lMiOZWtvfzll1/qOhSjxVtJ1cTbSETyYMVR\n3WNVEhHVW1x7Wb+wxaCh8vMdsRqJqG5t3rwZhYWFbCXUMVYlyR4DbyERUf3CqqQ69uRqauxwJiJj\nwcSgxpOL6vD2EVHtJSUl4b///a+uw6AqMDEQkezKr71cWlqq63CoCqxKIiJZseKo/mGLoZzy/Qrs\nUyCqvTVr1nDt5XqIVUkqx2HlEVFd+vPPP9G4cWMmBB1huWqdHIeJgYgMB8tVa4lzIBHVTj34jEka\nYmL4n5wclqQS1URZxVFkZKSuQ6E6wqokIqoxrr1smDRuMTx8+FDOOIioHik/LoEVR4anysRw9OhR\ndOzYER06dAAAJCcn45133pE9MCLSX19//bW09vLbb7/Nie8MTJVVSX5+ftiyZQv+/ve/IykpCQDQ\nqVMnnDt3TisBAvJVJZXNmApw1lSi6igpKUGDBg2YEPScrOsxODs7q77IrP53TdjYPPqXhRRE1WcI\n1wBSr8pbSc7Ozjhy5AiAR/cVo6Oj4e7uLntgcikb3QywhUBUFYVCgcuXL+s6DNKyKhPDN998g6VL\nlyIjIwMODg5ISkrC0qVLtRFbnSqfEDhbKlHVytZe/uKLL3QdCmlZle3BS5cuYcOGDSo/O3LkCPz9\n/WULqq7xthGR5hQKBebMmYP//Oc/WLRoEYYOHarrkEjLqux89vHxkTqdn/YzOdWm87ksKbCFQFS1\npKQkDB8+HM899xyWL1/OEtR6rs47n48dO4ajR4/izp0
7WLx4sbTzvLy8ejWfetmCO0RUtczMTEyd\nOhVDhw5lxZERU5sYFAoF8vLyoFQqkZeXJ/28WbNm2LJli1aCIyLteuWVV3QdAumBKm8lXbt2DS4u\nLloKp3K1uZXEGVOJyFjJNo6hSZMmmDJlCs6fP4+CggLpYAcOHKh+lFpUNniNM6YSVZSYmIjExESM\nHj1a16GQHqqyXPWtt96Cm5sb/vzzT0RFRcHFxQW+vr7aiK1WyvoW2OlM9JhCocA///lPhISEwMLC\nQtfhkJ6q8lZSly5dkJiYCC8vL6SkpAAAfH19cerUKa0ECFS/OcRKJKKKEhMTERERwYojIyLbQj0N\nGzYEALRq1Qrbt29HYmIicsomGNIzHNVMVLn169cjJCQEU6dOxdatW5kU6KmqbDFs27YNAQEBSEtL\nw8SJE5Gbm4uoqCiEhoZqK0aNsh5bCUTq3bx5EwCYEIyMVtd8jo+Ph5+fX5Xb7d69G++//z6USiVG\njx6Njz76qMI2cXFx+OCDD1BcXAw7OzvExcVVDFKDN8fqIyIiVXWeGEpLS/Hzzz8jNTUVHh4e6Nu3\nL06dOoVPP/0Ut2/fRnJy8lN3rFQq0aFDB+zbtw8ODg7o2rUrNm7cqDIB37179+Dv7489e/bA0dER\nWVlZsLOzq9GbY2IgeqS0tBSmply1l2ToYxgzZgyWLVuGnJwczJkzB6+//jqGDx+Od955R6PpMOLj\n4+Hq6goXFxeYm5sjPDwcsbGxKtts2LABr7/+OhwdHQGg0qRARJopqzgaMmSIrkOhek7tOIbjx48j\nJSUFpqamKCwsRKtWrZCamgpbW1uNdpyRkQEnJyfpsaOjI06cOKGyzeXLl1FcXIxevXohLy8P7733\nHoYNG1btN2Fjw/EKZNzKKo6cnZ259jLVmtrEYG5uLjVHGzdujDZt2micFABoNM9KcXExEhMTsX//\nfjx8+BA9evRA9+7d0a5dO42Pw05nMmYKhQJz587FN998g+joaAwbNoxzHFGtqU0MFy5cgKenp/Q4\nNTVVemxiYiKNaVDHwcEBaWlp0uO0tDTpllEZJycn2NnZwcLCAhYWFnjppZdw+vTpShNDVFSU9H1g\nYCACAwMBcJI8Mm6rVq2S1l5mxRHFxcVVWsBTXWo7n69du/bUF1Y1f1JJSQk6dOiA/fv3w97eHn5+\nfhU6ny9cuIAJEyZgz549KCoqQrdu3bBp0yZ07NhRNcindKCw05mMWWlpKUxMTNhKoErV+VxJtZ04\nz8zMDEuWLEFwcDCUSiVGjRoFd3d3LF++HAAwduxYuLm5ISQkBF5eXjA1NUVkZGSFpEBE6rH6iORQ\no3EM2sYWAxm7srWXO3XqpOtQqB6RbUoMItKt5ORk+Pn5YfHixboOhYyERonh4cOHuHjxotyxEFE5\nCoUCM2fORFBQECZNmoTvvvtO1yGRkagyMWzduhU+Pj4IDg4G8GhNWG3Ok0RkjFJSUuDn5ydVHL39\n9tvsYCat0Wja7QMHDqBXr17SiGcPDw+cPXtWKwEC7GMg43P48GFcvXqV4xKoVmRbwc3c3BzPPvus\nys/0pRKCI57JUAUEBCAgIEDXYZCRqjIxdOrUCevXr0dJSQkuX76Mr776Cj179tRGbFXi4DYiorpX\n5Uf/r7/+GufOnUOjRo0wePBgNGvWDF9++aU2YiMyeMnJyfz/RHqnyj6GxMREdOnSRVvxVKqy+2Sc\nI4nqs/JzHC1atKhGk0cSVUW2hXoCAwORmZmJgQMHYtCgQfDw8KhxkDVV2ZtjpzPVV8nJyYiIiICj\noyNWrFjBOY5INrINcIuLi8PBgwdhZ2eHsWPHwtPTE//6179qFCSRsfvxxx+lcQnbtm1jUiC9VK0p\nMc6cOYMFCxZg06ZNKC4uljMuFWwxkKHIzs5GUVEREwJphWy3ks6fP4/Nmzdjy5YtsLW1xaBBg/DG\nG2+gRYsWNQ62upg
YiIiqT7bE0L17d4SHh2PgwIFwcHCocYC1wcRA9ZFSqUSDBg10HQYZMdkSgz5g\nYqD6pKzi6NSpU9ixY4euwyEjVucjnwcOHIiYmBiVVdzKH6yqFdyIjFH5iqNvv/1W1+EQ1YjaFsPN\nmzdhb2+P69evV/Jp3QTPPfecVgIsOx5bDKTPuPYy6aM6L1ctq5pYtmwZXFxcVL6WLVtW80iJDFBM\nTAxnQiWDUWUfg4+PjzSrahlPT0+cOXNG1sDKY4uB9F3Z3ycTAumTOu9j+Oabb7Bs2TKkpqaq9DPk\n5eXB39+/ZlESGSgmBDIkalsM9+/fR05ODj7++GMsWLBAyjqWlpawtbXVbpDlsp6NzaNZVa2tOU8S\naZ9CocDZs2d1Pn8YkSbqvFw1NzcXzZo1Q3Z2dqWfhmzKZrHTgvJvjreQSFfKKo48PDywbt06XYdD\nVKU6Twz9+vXDjh074OLiUmliuHr1avWjrCEmBtIlVhxRfWU0A9yYGEibzpw5g2HDhnEmVKqXZJtd\n9ciRI3jw4AEAYO3atZg0aRKuX79e/QiJ6iGlUsmZUMnoVNli8PT0xOnTp3HmzBlERERg1KhRiImJ\nwW+//aatGNliICKqAdlaDGZmZjA1NcUvv/yCf/zjH5gwYQLy8vJqFCQREem/KhODpaUl5s2bh3Xr\n1qF///5QKpVaXYuBSBuSk5O5ABXR/1SZGDZt2oRGjRph1apVaNWqFTIyMjB16lRtxEYkO4VCgZkz\nZyIoKEir838R6TONqpIyMzNx8uRJmJiYwM/PT6uL9ADsYyB5cO1lMnSy9TFs3rwZ3bp1Q0xMDDZv\n3gw/Pz/ExMTUKEgifbFjxw6uvUykRpUtBi8vL+zbt09qJdy5cwe9e/fW6noMbDFQXcvLy0NeXh4T\nAhm0Op9Er4wQAs2bN5ce29ra1uhARPrE0tISlpaWug6DSC9VmRhCQkIQHByMIUOGQAiBTZs24ZVX\nXtFGbER1ori4GObm5roOg6je0Kjz+aeffsLvv/8OAAgICMCAAQNkD6w83kqimiib4yguLg5xcXGc\n34iMTp3fSrp06RKmTp2KK1euwMvLC59//jkcHR1rFSSRtpSvONq4cSOTAlE1qK1KGjlyJPr3748f\nf/wRXbp0wbvvvqvNuIhqpPy4BFYcEdWM2sTw4MEDREZGws3NDVOnTq3RNNu7d++Gm5sb2rVrhwUL\nFqjd7uTJkzAzM8NPP/301P3Z2DxaoIdInT179nDtZaJaUnsrqbCwEImJiQAeVSYVFBQgMTERQgiY\nmJhUuYIEYcBJAAAU10lEQVSVUqnEhAkTsG/fPjg4OKBr164IDQ2Fu7t7he0++ugjhISEVHkvLCeH\n/Qv0dP3790f//v2ZEIhqQW1iaNWqFSZPnqz28cGDB5+64/j4eLi6usLFxQUAEB4ejtjY2AqJ4euv\nv8Ybb7yBkydP1iR+IhVMCES1pzYxxMXF1WrHGRkZcHJykh47OjrixIkTFbaJjY3FgQMHpCk3iDSh\nUChw6tQp9OzZU9ehEBmcKqfEqClNLvLvv/8+5s+fL5VUceAcaSI5ORl+fn744osv+DdDJIMqB7jV\nlIODA9LS0qTHaWlpFcpdExISEB4eDgDIysrCrl27YG5ujtDQ0Ar7i4qK+t+/QGBgIAIDA+UKnfQU\n114merqyMTu1JduazyUlJejQoQP2798Pe3t7+Pn5YePGjRX6GMqMGDECr776KsLCwioG+b8WBQe3\nGa/z589jyJAhnAmVqBpkm121tLQUa9euxezZswEAN27cQHx8fJU7NjMzw5IlSxAcHIyOHTti0KBB\ncHd3x/Lly7F8+fJqB0rGrWHDhhyXQKQlVbYYxo0bB1NTUxw4cAAXLlzA3bt3ERQUhFOnTmkrRrYY\niIhqQLbZVU+cOIGkpCT4+PgAAGxsbLi0JxGRAavyVlLDhg2hVCqlx3fu3IGpqWzFT
GTkkpOTMXXq\nVFYbEelQlVf4iRMnYsCAAbh9+zY+/fRT+Pv745NPPtFGbGREys9x5OnpqetwiIyaRlVJf/zxB/bv\n3w8A6N27t9rKIrmwj8Gwce1lInnUtI+hysRw48YNACi3HsKjunFnZ+dqH6ymmBgM1/79+zF48GCO\nSyCSgWyJwcPDQ/rPWlhYiKtXr6JDhw44d+5czSKtASYGw1VUVITs7Gy2EohkIFtV0tmzZ1UeJyYm\nYunSpdU+EFFlGjVqxKRApGdqNPLZw8OjQsKQE1sMhqGwsBCNGzfWdRhERkO2FsOiRYuk70tLS5GY\nmAgHB4dqH4iMV9kcRzt27OAsukT1QJWJ4cGDB483NjND//798frrr8saFBmO8hVHW7duZVIgqgee\nmhiUSiVyc3NVWg1EmuBMqET1l9rEUFJSAjMzMxw5ckRazpNIU8eOHUNiYiKSk5PZuUxUz6jtfO7S\npQsSExMxbtw43Lx5EwMHDkSTJk0evcjEpNLpsWULkp3PRETVVuedz2U7KywshK2tLQ4cOKDyvDYT\nAxERaY/axHDnzh0sXryY89bQUykUChw+fBi9e/fWdShEVEfUJgalUom8vDxtxkL1TFnFUZs2bdCr\nVy/OuktkINT2Mfj4+CApKUnb8VSKfQz6hRVHRPWDbAPciMq7cOECwsPD4ejoyIojIgOltsWQnZ0N\nW1tbbcdTKRMTE1hbPwrz7l0dB2Pkbt68if3792Po0KFsJRDpOdlmV9UHjy5AgreRiIiqoaaJgb2F\nRESkgomBKpWcnIxx48ahtLRU16EQkZYxMZCK8msv9+zZk/0IREaIVUkkKT8TKiuOiIwXWwwEADh6\n9CiCgoIwadIkbNu2jUmByIixKokAPBrpfufOHbRq1UrXoRBRHTH4clVra8ExDERE1WDw5apMCnUn\nPz9f1yEQkR6rN4mBaq+s4sjPzw9KpVLX4RCRnmJiMBLJycnw8/NDQkIC9u7diwYNGug6JCLSU0wM\nBq78uARWHBGRJjiOwcCdOXMGycnJHJdARBqrN1VJ9SBMIiK9YvBVSUREpB1MDAZCoVBg+/btug6D\niAwAE4MBKKs4WrFiBUpKSnQdDhHVc7Inht27d8PNzQ3t2rXDggULKjy/fv16dO7cGV5eXvD390dK\nSorcIRmMJyuOYmNjYWbGegIiqh1ZryJKpRITJkzAvn374ODggK5duyI0NBTu7u7SNs8//zwOHToE\nKysr7N69G2PGjMHx48flDMsgXLlyBW+88QZnQiWiOidriyE+Ph6urq5wcXGBubk5wsPDERsbq7JN\njx49YGVlBQDo1q0b0tPT5QzJYNja2uLDDz/kuAQiqnOyJoaMjAw4OTlJjx0dHZGRkaF2+5UrV6Jv\n375yhmQwrK2tMWTIEC6kQ0R1TtZbSdW5aB08eBCrVq3CkSNHKn0+KipK+j4wMBCBgYG1jI6IyLDE\nxcUhLi6u1vuRNTE4ODggLS1NepyWlgZHR8cK26WkpCAyMhK7d++GtbV1pfsqnxiMSXJyMqKjo7F6\n9WqYm5vrOhwi0mNPfmieNWtWjfYj660kX19fXL58GdeuXYNCocCmTZsQGhqqss2NGzcQFhaGdevW\nwdXVVc5w6pXyFUdBQUGsNiIirZH1amNmZoYlS5YgODgYSqUSo0aNgru7O5YvXw4AGDt2LGbPno2c\nnByMHz8eAGBubo74+Hg5w9J7XHuZiHSJcyXpmaSkJAQHByM6OhrDhg1j5zIR1ZjBL+1ZD8KsE0II\nZGVloXnz5roOhYjqOSYGIiJSwdlV66H79+/rOgQiogqYGHSgrOKoS5cuUCgUug6HiEgFE4OWJSUl\noWvXrkhISMDhw4fRsGFDXYdERKSCiUFLyloJwcHBmDJlCuc4IiK9xVFTWpKamoqzZ89yXAIR6T1W\nJRERGShWJRERUZ1gYqhjCoUCMTExug6DiKjGm
BjqUFnF0Q8//ICioiJdh0NEVCNMDHXgyYqjrVu3\nolGjRroOi4ioRliVVEtXr17Fa6+9BmdnZ1YcEZFBYFVSLeXn52P79u148803ORMqEekVTqJHREQq\nWK5KRER1golBQ0lJSQgLC0NhYaGuQyEikhUTQxXKVxwNGDCA1UZEZPBYlfQUSUlJiIiIYMURERkV\ndj6rcfHiRQQEBGDRokUYOnQoK46IqN5hVZIMcnJyYG1trfXjEhHVBSYGIiJSwXLVWsjOztZ1CERE\nesOoE0NZxZGPjw8ePnyo63CIiPSC0SaGsplQExMTcfz4cTRp0kTXIRER6QWjSwyVzYTKMlQioseM\nbhzDrVu3cOHCBY5LICJSg1VJREQGilVJRERUJww2MSgUCnz//fdsaRARVZNBJoayiqMtW7awDJWI\nqJoMKjFUVnHUtGlTXYdFRFSvGExVUnp6Ovr168eZUImIaslgqpIUCgW2b9+OAQMGcCZUIiJwEj0i\nInqCXpar7t69G25ubmjXrh0WLFhQ6Tbvvvsu2rVrh86dOyMpKUnOcIiISAOyJQalUokJEyZg9+7d\nOH/+PDZu3Ig//vhDZZudO3fiypUruHz5MlasWIHx48dXud+kpCS88soryM3NlSt0vRYXF6frEPQG\nz8VjPBeP8VzUnmyJIT4+Hq6urnBxcYG5uTnCw8MRGxurss3WrVsxfPhwAEC3bt1w7949/PXXX5Xu\nr3zF0ZAhQ2BpaSlX6HqNf/SP8Vw8xnPxGM9F7clWlZSRkQEnJyfpsaOjI06cOFHlNunp6WjZsmWF\n/XXt2pUVR0REWiBbYtC0MujJjhF1r5s8eTKGDRvGiiMiIrkJmRw7dkwEBwdLj+fNmyfmz5+vss3Y\nsWPFxo0bpccdOnQQmZmZFfbVtm1bAYBf/OIXv/hVja+2bdvW6PotW4vB19cXly9fxrVr12Bvb49N\nmzZh48aNKtuEhoZiyZIlCA8Px/Hjx/Hss89WehvpypUrcoVJRERPkC0xmJmZYcmSJQgODoZSqcSo\nUaPg7u6O5cuXAwDGjh2Lvn37YufOnXB1dUXTpk2xevVqucIhIiIN1YsBbkREpD16NYkeB8Q9VtW5\nWL9+PTp37gwvLy/4+/sjJSVFB1FqhyZ/FwBw8uRJmJmZ4aefftJidNqjyXmIi4uDj48PPDw8EBgY\nqN0Ataiqc5GVlYWQkBB4e3vDw8MDa9as0X6QWjJy5Ei0bNkSnp6earep9nWzRj0TMigpKRFt27YV\nV69eFQqFQnTu3FmcP39eZZsdO3aIV155RQghxPHjx0W3bt10EarsNDkXR48eFffu3RNCCLFr1y6j\nPhdl2/Xq1Uv069dPbNmyRQeRykuT85CTkyM6duwo0tLShBBC3LlzRxehyk6TczFz5kzx8ccfCyEe\nnQcbGxtRXFysi3Bld+jQIZGYmCg8PDwqfb4m1029aTHU9YC4+kyTc9GjRw9YWVkBeHQu0tPTdRGq\n7DQ5FwDw9ddf44033kDz5s11EKX8NDkPGzZswOuvvw5HR0cAgJ2dnS5ClZ0m56J169bS7Ai5ubmw\ntbWFmZnBTCatIiAgANbW1mqfr8l1U28SQ2WD3TIyMqrcxhAviJqci/JWrlyJvn37aiM0rdP07yI2\nNlaaUsUQx7poch4uX76Mu3fvolevXvD19cXatWu1HaZWaHIuIiMjce7cOdjb26Nz587497//re0w\n9UZNrpt6k0LrekBcfVad93Tw4EGsWrUKR44ckTEi3dHkXLz//vuYP3++NJPkk38jhkCT81BcXIzE\nxETs378fDx8+RI8ePdC9e3e0a9dOCxFqjybnYt68efD29kZcXBxSU1PRp08fnD592min0qnudVNv\nEoODgwPS0tKkx2lpaVKTWN026enpcHBw0FqM2qLJuQCAlJQUREZGYvfu3U9tStZnmpyLhIQEhIeH\nA3jU6bhr1
y6Ym5sjNDRUq7HKSZPz4OTkBDs7O1hYWMDCwgIvvfQSTp8+bXCJQZNzcfToUUybNg0A\n0LZtW7Rp0wYXL16Er6+vVmPVBzW6btZZD0gtFRcXi+eff15cvXpVFBUVVdn5fOzYMYPtcNXkXFy/\nfl20bdtWHDt2TEdRaocm56K8iIgI8eOPP2oxQu3Q5Dz88ccfonfv3qKkpETk5+cLDw8Pce7cOR1F\nLB9NzsUHH3wgoqKihBBCZGZmCgcHB5Gdna2LcLXi6tWrGnU+a3rd1JsWAwfEPabJuZg9ezZycnKk\n++rm5uaIj4/XZdiy0ORcGANNzoObmxtCQkLg5eUFU1NTREZGomPHjjqOvO5pci4+/fRTjBgxAp07\nd0ZpaSkWLlwIGxsbHUcuj8GDB+O3335DVlYWnJycMGvWLBQXFwOo+XWTA9yIiEiF3lQlERGRfmBi\nICIiFUwMRESkgomBiIhUMDEQEZEKJgYiIlLBxEB6o0GDBvDx8ZG+bty4oXbbZ555ptbHi4iIwPPP\nPw8fHx+88MILOH78eLX3ERkZiQsXLgB4NA1Def7+/rWOEXh8Xry8vBAWFoYHDx48dfvTp09j165d\ndXJsMk4cx0B6w9LSEnl5eXW+rTojRozAq6++irCwMOzduxdTpkzB6dOna7y/uoipqv1GRETA09MT\nkydPVrv9mjVrkJCQgK+//rrOYyHjwBYD6a38/Hz83//9H1544QV4eXlh69atFba5desWXnrpJfj4\n+MDT0xO///47AODXX39Fz5498cILL+DNN99Efn5+pcco+1wUEBAgrS2+ePFieHp6wtPTU5qVMz8/\nH/369YO3tzc8PT0RExMDAAgMDERCQgI+/vhjFBQUwMfHB8OGDQPwuFUTHh6OnTt3SseMiIjATz/9\nhNLSUkydOhV+fn7o3LkzVqxYUeU56dGjB1JTUwE8mn66Z8+e6NKlC/z9/XHp0iUoFAr885//xKZN\nm+Dj44OYmBjk5+dj5MiR6NatG7p06VLpeSRSUVdzdRDVVoMGDYS3t7fw9vYWYWFhoqSkROTm5goh\nHi224urqKm37zDPPCCGEiI6OFnPnzhVCCKFUKkVeXp64c+eOeOmll8TDhw+FEELMnz9fzJ49u8Lx\nIiIipEV9Nm/eLLp37y4SEhKEp6enePjwoXjw4IHo1KmTSEpKElu2bBGRkZHSa+/fvy+EECIwMFAk\nJCSoxPRkjD///LMYPny4EEKIoqIi4eTkJAoLC8Xy5cvFnDlzhBBCFBYWCl9fX3H16tUKcZbtp6Sk\nRISFhYmlS5cKIYTIzc0VJSUlQggh9u7dK15//XUhhBBr1qwREydOlF7/ySefiHXr1gkhHi3m0759\ne5Gfn1/p74BICD2aK4nIwsJCZdnB4uJifPLJJzh8+DBMTU1x8+ZN3L59Gy1atJC28fPzw8iRI1Fc\nXIzXXnsNnTt3RlxcHM6fP4+ePXsCABQKhfR9eUIITJ06FXPmzEGLFi2wcuVK7N27F2FhYbCwsAAA\nhIWF4fDhwwgJCcGUKVPw8ccfo3///njxxRc1fl8hISF47733oFAosGvXLvztb39Do0aN8Ouvv+LM\nmTPYsmULgEcLyly5cgUuLi4qry9riWRkZMDFxQXjxo0DANy7dw9vv/02rly5AhMTE5SUlEjvS5S7\nQ/zrr79i27ZtiI6OBgAUFRUhLS0NHTp00Pg9kHFhYiC9tX79emRlZSExMRENGjRAmzZtUFhYqLJN\nQEAADh8+jO3btyMiIgKTJk2CtbU1+vTpgw0bNjx1/yYmJoiOjkZYWJj0s3379qlcVIUQMDExQbt2\n7ZCUlIQdO3Zg+vTp6N27N2bMmKHR+2jcuDECAwOxZ88ebN68GYMHD5aeW7JkCfr06fPU15clzIKC\nAgQHByM2NhYDBgzAjBkz0Lt3b/z888+4fv36U9d4/umnnwxu+m2SD/sYSG/
l5uaiRYsWaNCgAQ4e\nPIjr169X2ObGjRto3rw5Ro8ejdGjRyMpKQndu3fHkSNHpHvx+fn5uHz5cqXHEE/UXgQEBOCXX35B\nQUEB8vPz8csvvyAgIAC3bt1C48aN8dZbb2HKlCmVLqhubm4ufWp/0qBBg7Bq1Sqp9QEAwcHBWLZs\nmfSaS5cu4eHDh2rPh4WFBb766itMmzYNQgjk5ubC3t4eAFRmzGzWrJlKJ3hwcDC++uor6bFGi8GT\nUWNiIL3x5KpSb731Fk6dOgUvLy+sXbsW7u7uFbY9ePAgvL290aVLF2zevBnvvfce7OzssGbNGgwe\nPBidO3dGz549cfHiRY2O6ePjg4iICPj5+aF79+6IjIxE586dcebMGXTr1g0+Pj6YPXs2pk+fXmFf\nY8aMgZeXl9T5XH7fQUFBOHToEPr06SOtPTx69Gh07NgRXbp0gaenJ8aPH19pYim/H29vb7i6umLz\n5s348MMP8cknn6BLly5QKpXSdr169cL58+elzucZM2aguLgYXl5e8PDwwMyZM9X/EojAclUiInoC\nWwxERKSCiYGIiFQwMRARkQomBiIiUsHEQEREKpgYiIhIBRMDERGpYGIgIiIV/w/K3D2jAtNEmwAA\nAABJRU5ErkJggg==\n", "text": [ "" ] } ], "prompt_number": 45 }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, this is strange, because we're not even using the full data and we're getting *better results than the paper*. Should be getting an AUC score of about __0.15__ - what we are getting is __0.90__.\n", "\n", "Although, the number they're quoting in the paper is something called an __R50 value__, which is:\n", "\n", "> ...R50 is a partial AUC score that measures the area under the ROC curve until reaching 50 negative predictions.\n", "\n", "So we need to calculate this R50 value if we're going to replicate the results.\n", "\n", "In this case we have 500 training points so the R50 value corresponds to (I guess?) the area under the curve up until the false positive rate is _0.1_." 
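, "\n", "\n", "A minimal sketch of how the R50 value might be computed from the ROC curve, under some assumptions of mine rather than anything confirmed by the paper: positives are labelled 1, the cut-off of 50 negatives is converted to a false positive rate using the number of negatives in the test set, and the partial area is divided by that cut-off so a perfect classifier would score 1. The test set here holds roughly 500 negatives (half of the 2000 shuffled samples), which is where a cut-off of about 0.1 comes from. `partial_auc` and `r50` are helper names made up for this notebook, not functions from scikit-learn." ] }, { "cell_type": "code", "collapsed": false, "language": "python", "metadata": {}, "outputs": [], "input": [ "import numpy as np\n", "from sklearn.metrics import roc_curve\n", "\n", "def partial_auc(fpr, tpr, max_fpr):\n", "    #area under the ROC curve restricted to 0 <= FPR <= max_fpr\n", "    fpr = np.asarray(fpr, dtype=float)\n", "    tpr = np.asarray(tpr, dtype=float)\n", "    if max_fpr >= fpr[-1]:\n", "        x, y = fpr, tpr\n", "    else:\n", "        #index of the first ROC point beyond the cut-off (assumes fpr starts at 0)\n", "        stop = np.searchsorted(fpr, max_fpr, side='right')\n", "        #interpolate the TPR exactly at the cut-off\n", "        tpr_cut = np.interp(max_fpr, fpr[stop-1:stop+1], tpr[stop-1:stop+1])\n", "        x = np.append(fpr[:stop], max_fpr)\n", "        y = np.append(tpr[:stop], tpr_cut)\n", "    #trapezoidal rule\n", "    return np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0)\n", "\n", "def r50(y_true, scores, n_neg=50):\n", "    #assumption: positives are labelled 1, everything else counts as a negative\n", "    n_negatives = np.sum(np.asarray(y_true) != 1)\n", "    #false positive rate at which n_neg negatives have been predicted positive\n", "    max_fpr = min(1.0, n_neg / float(n_negatives))\n", "    fpr, tpr, _ = roc_curve(y_true, scores)\n", "    #assumption: normalise so that a perfect classifier scores 1\n", "    return partial_auc(fpr, tpr, max_fpr) / max_fpr\n", "\n", "#ytest and estimates come from the cells above\n", "print('R50 (normalised): %.3f' % r50(ytest, estimates[:,1]))"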
] }, { "cell_type": "code", "collapsed": false, "input": [ "roccrv = pd.DataFrame(array([fpr,tpr]).T)\n", "#don't ask about this line\n", "plotroc(*zip(*roccrv[roccrv[0]<0.1].values))" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "display_data", "png": "iVBORw0KGgoAAAANSUhEUgAAAYYAAAEZCAYAAACTsIJzAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3XlYVdX6B/AvCCoaEoOKTGGCDAICIQ5EF/MnkBo3KRNN\nEwccutrg0KReyaumhtYttKulUk4XsQFHzIk0J5RBVMKBHAA1FVEQhQOH9fuDy44jIIfhDBy+n+fh\n0cPZZ+93L3S/rL3etbaeEEKAiIjof/Q1HQAREWkXJgYiIlLAxEBERAqYGIiISAETAxERKWBiICIi\nBUwMpDQ3NzccOnRI02Fo3JQpU7BgwQK1HjM8PBxz585V6zFVZePGjQgKCmrQZ/lvUD30OI+hebK3\nt8etW7fQqlUrtG/fHgMHDsSKFSvQoUMHTYemU2JiYrBmzRocPnxYo3GMHTsWtra2mD9/vkbjiIyM\nRFZWFtavX6/yY4WHh8PW1hb/+te/VH4sUsQeQzOlp6eHHTt2oLCwEKdPn8aZM2fU/ltsUygrK2uR\nx9YkuVzeIo9NymNi0AGdO3dGYGAgzp07J33v+PHj6NevH0xNTeHp6Ylff/1Veu/u3bsYO3YsrK2t\nYWZmhqFDh0rv7dixA56enjA1NYWfnx/OnDkjvWdvb48DBw7g+vXraNeuHfLz86X3UlNT0bFjR+k/\n/tq1a+Hq6gozMzMEBwfj2rVr0rb6+vpYuXIlHB0d4eTkVOM5bdu2DT169ICpqSn69++PzMxMhTgW\nL16MHj16wMzMDOPGjUNJSYnS57B06VJ4eHjA2NgYcrkcixcvhoODAzp06IAePXrg559/BgD8/vvv\nmDJlCo4dOwZjY2OYmZkBULytk5iYCBsbGyxfvhydO3eGlZUVYmJipOPl5eXh5ZdfhomJCXx9fTFn\nzhz4+/vX+rP87bffpJ+bnZ0dvv/+e4Wf25AhQ9ChQwf06dMHf/zxh/TeO++8Azs7O5iYmMDHxwe/\n/fab9F5kZCRee+01jB49GiYmJvjuu+9w8uRJ9O3bF6amprCyssK0adNQWloqfebcuXMYOHAgzM3N\nYWlpiU8//RR79uzBp59+itjYWBgbG8PLywsAcP/+fYwfPx5WVlawsbHB3LlzUV5eDqCix+Xn54fp\n06fDwsICkZGRiImJkdpACIH33nsPnTt3homJCTw8PHDu3DmsXr0amzZtwtKlS2FsbIy///3v0s9v\n//79ACqSzKJFi6SfnY+PD3JycmptW6oHQc2Svb292LdvnxBCiOzsbOHu7i4++eQTIYQQOTk5wtzc\nXOzevVsIIcTevXuFubm5uHPnjhBCiEGDBomwsDBx7949UVpaKg4dOiSEECIlJUV06tRJJCUlifLy\ncvHdd98Je3t7IZPJpGPu379fCCHEiy++KL755hspnpkzZ4opU6YIIYT4+eefhYODg8jMzBRyuVws\nWLBA9OvXT9pWT09PBAYGivz8fFFcXFzt3M6fPy/at28v9u3bJ8rKysTSpUuFg4ODKC0tFUII8cwz\nzwh3d3eRk5Mj7t69K/z8/MScOXOUOodnnnlGeHl5iZycHOnYcXFx4saNG0IIIWJjY0X79u3FzZs3\nhRBCxMTEiOeff14hvvDwcDF37lwhhBAHDx4UBgYGYt68eaKsrEzs2rVLtGvXTty7d0
8IIcTw4cPF\niBEjxKNHj0RGRoawtbUV/v7+Nf5Mr1y5IoyNjcV///tfUVZWJvLy8kRaWpoQQogxY8YIc3NzcfLk\nSVFWVibeeOMNERYWJn12w4YN4u7du0Iul4tly5YJS0tLUVJSIoQQYt68ecLQ0FDEx8cLIYR49OiR\nSE5OFidOnBByuVxcuXJFuLi4iC+++EIIIURBQYGwtLQUy5cvFyUlJaKwsFCcOHFCCCFEZGSkGD16\ntELcr7zyipg8ebJ4+PChuHXrlvD19RWrVq0SQgixbt06YWBgIKKjo4VcLhePHj0S69atk9o0ISFB\nPPfcc+L+/ftCCCEyMzOln0XVdq5U9d/g0qVLhbu7u7hw4YIQQoj09HSRl5dXY9tS/TAxNFPPPPOM\neOqpp4SxsbHQ09MTr7zyipDL5UIIIRYvXlztP29QUJD47rvvxPXr14W+vr504apq8uTJ1f4jOjk5\nSYmj6n/Kb7/9Vrz44otCCCHKy8uFra2tOHz4sBBCiODgYLFmzRppH3K5XLRr105cu3ZNCFGRGA4e\nPFjruc2fP18MHz5cel1eXi6sra3Fr7/+KsVReeERQohdu3aJbt26KX0O69atq/XYQgjh6ekpXUSr\nXsQqhYeHS4no4MGDwsjISGp7IYTo1KmTOHHihCgrKxOGhobShUsIIebMmVNtf5UWLVokQkNDa3wv\nPDxcREREKJyzs7Nzredgamoq0tPThRAVieFvf/vbE85YiM8//1wMHTpUCCHEpk2bhLe3d43bzZs3\nT4waNUp6ffPmTdGmTRvx6NEj6XubNm0S/fv3F0JUtJ+dnZ3CPqq26f79+0X37t3F8ePHFdqw8pwr\n27lS1X+D3bt3F9u2bXvieVHD8FZSM6Wnp4f4+HgUFBQgMTERBw4cwKlTpwAAV69eRVxcHExNTaWv\nI0eO4ObNm8jOzoaZmRlMTEyq7fPq1atYtmyZwudycnJw/fr1atuGhobi2LFjuHnzJg4dOgR9fX08\n//zz0n7eeecdaR/m5uYAgNzcXOnztra2tZ7bjRs3YGdnp3Cutra2tX7ezs5OilGZc3j82N9//z28\nvLyk7c+ePYu8vLxa43ucubk59PX/+q/Url07PHjwALdv30ZZWZnC8WxsbGrdT05ODp599tla3+/c\nubP0dyMjIzx48EB6HRUVBVdXVzz99NMwNTXF/fv3cefOnVqPe+HCBQwZMgRdunSBiYkJZs+eLZ1z\ndnb2E+Oo6urVqygtLUWXLl2k9ps8eTJu374tbfOkn/WLL76IqVOn4h//+Ac6d+6MSZMmobCwUKlj\n5+TkoFu3bkptS/XDxKADXnjhBUybNg0ffPABgIoL5ejRo5Gfny99FRYW4v3334etrS3u3r2L+/fv\nV9uPnZ0dZs+erfC5Bw8eYPjw4dW2NTU1RWBgIGJjY7Fp0yaMGDFCYT+rV69W2E9RURH69OkjbaOn\np1fr+VhZWeHq1avSayEEsrOzYW1tLX2v6pjFtWvXpPeUOYeqx7569SomTpyIFStW4O7du8jPz4eb\nmxvE/4r1aovzSfFX6tixIwwMDJCdnS19r+rfH2dra4usrKw69/u4w4cP47PPPkNcXBzu3buH/Px8\nmJiYSOdQU7xTpkyBq6srLl26hPv372PhwoXSuICdnZ3C+EVVVRNgZcxt2rRBXl6e1N73799XGNep\nq62mTZuGU6dOISMjAxcuXMBnn32m1OdsbW1x6dKlJ25DDcPEoCPeffddJCUl4cSJExg1ahS2b9+O\nX375BXK5HMXFxUhMTERubi66dOmCl156CW+99Rbu3buH0tJSqS48IiIC//nPf5CUlAQhBIqKirBz\n506F30yrGjlyJL777jv88MMPGDlypPT9yZMnY9GiRcjIyABQMTgZFxen9Lm8/vrr2LlzJw4cOIDS\n0lIsW7YMbdu2Rb9+/QBUJIqVK1ciNzcXd+/exc
KFC6ULf33PoaioCHp6erCwsEB5eTnWrVuHs2fP\nSu937twZOTk5CgOzouIWbJ3n0apVK4SGhiIyMhKPHj1CZmYm1q9fX+sF74033sC+ffsQFxeHsrIy\n5OXl4fTp09Ixa1NYWAgDAwNYWFhAJpNh/vz5KCgoeGJsDx48gLGxMdq1a4fMzEx8/fXX0nuDBw/G\njRs38O9//xslJSUoLCxEUlKS1B5XrlyR4unSpQsCAwMxffp0FBYWory8HFlZWUrPNTh16hROnDiB\n0tJStGvXDm3btkWrVq2kY9WWoABgwoQJmDt3Li5dugQhBNLT03H37l2ljktPxsSgIywsLDBmzBgs\nWbIENjY2iI+Px6JFi9CpUyfY2dlh2bJl0m+E69evh6GhIZydndG5c2d8+eWXAIDnnnsO33zzDaZO\nnQozMzM4Ojri+++/r/VCFhISgkuXLqFLly5wd3eXvv/KK6/ggw8+QFhYGExMTODu7o49e/ZI79f1\nm2D37t2xYcMGTJs2DR07dsTOnTuxfft2GBgYSJ8fOXIkAgMD0a1bNzg6OmLOnDkNOgdXV1fMmDED\nffv2haWlJc6ePSvdEgOAAQMGoEePHrC0tESnTp2k41fd35POJzo6Gvfv34elpSXGjBmDESNGoHXr\n1jVua2tri127dmHZsmUwNzeHl5cX0tPTazxm1eMGBwcjODgY3bt3h729PYyMjKrdinv8s1FRUdi0\naRM6dOiAiRMnIiwsTNrG2NgYe/fuxfbt29GlSxd0794diYmJAIBhw4YBqLh95uPjA6DiVpxMJpOq\n0IYNG4abN28+Me7K7xUUFGDixIkwMzODvb09LCwsMGvWLADA+PHjkZGRAVNTU4SGhlZrr+nTp+P1\n119HYGAgTExMEBERgeLi4lp/FqQ8lU5wGzduHHbu3IlOnTopdC2revvtt7F79260a9cOMTExUgkc\nUW26du2KNWvW4MUXX9R0KPX2wQcf4NatW1i3bp2mQyGqlUp7DGPHjkVCQkKt7+/atQuXLl3CxYsX\nsXr1akyZMkWV4RCp3fnz55Geng4hBJKSkrB27VqFeSNE2shAlTv39/fHlStXan1/27ZtGDNmDACg\nd+/euHfvHv7880+F6gui5qywsBAjRozA9evX0blzZ8ycORMhISGaDovoiVSaGOqSm5tbrZQvJyeH\niYGe6PLly5oOQWk+Pj64ePGipsMgqheNDz4/PsShTBkgERGpjkZ7DNbW1gp13Tk5OQq16pUcHBwa\nVN9NRNSSdevWrUFzPTTaYwgJCZEWCTt+/DiefvrpGm8jZWVlSbXjLf1r3rx5Go9BW77YFmwLtsVf\nXykpKfDw8MDgwYORm5sLIUSDf6FWaY9hxIgR+PXXX3Hnzh3Y2trik08+kSYKTZo0CYMGDcKuXbvg\n4OCA9u3bs4SPiKgBPv/8c3z66aeIiorC6NGjG31LXqWJYfPmzXVuEx0drcoQiIh0Xq9evZCWlgYr\nK6sm2Z9Gxxio/gICAjQdgtZgW/yFbfGXltgWVWfrN4Vm8WhPPT09NIMwiYi0SkOvnRovVyUiorrJ\nZDLMmzcPn3/+ucqPxcRARKTlUlNT0atXLyQnJ9e4DH5TY2IgItJSlb2EoKAgzJgxA9u3b2+yAeYn\n4eAzEZGWevfdd3Ht2rUmrThSBgefiYi0VGFhIZ566qkGz0to6LWTiYGISEexKomIqJmSyWTIy8vT\ndBgSJgYiIg2qrDhauXKlpkORMDEQEWnA4xVHlc8t1wasSiIiUrPU1FSEh4fD1tZW7RVHyuDgMxGR\nmi1fvhwWFhZNshLqk7AqiYiIFLAqiYiImgQTAxGRiqSmpuLgwYOaDqPemBiIiJpY1YojbZqfoCxW\nJRERNSFtrzhSBnsMRERNJDo6Wu0roaoCq5KIiJpIcnIyunTpojUJgeWqRESkgOWqRETUJDj4TERU\nDzKZDAsXLo
S+vj7mzZun6XBUgj0GIiIlVX32ckREhKbDURkmBiKiOmjq2cuawltJRER1mD17Nn7/\n/fdmOy+hvliVRERUh0ePHqFt27YqXQlVFViuSkRECliuSkTUSDKZDDdv3tR0GBrHxEBEhL8qjr78\n8ktNh6JxTAxE1KI9XnG0cOFCTYekcaxKIqIWSxdWQlUFDj4TUYv17bffonXr1ip/9rKmsCqJiIgU\nsCqJiIiaBBMDEem81NRU7NixQ9NhNBsqTQwJCQlwdnaGo6MjlixZUu39O3fuIDg4GJ6ennBzc0NM\nTIwqwyGiFqZqxVFRUZGmw2k2VDbGIJfL4eTkhH379sHa2hq9evXC5s2b4eLiIm0TGRmJkpISfPrp\np7hz5w6cnJzw559/wsBAsViKYwxEVF9VK45Wr17dIiuOtG6MISkpCQ4ODrC3t4ehoSHCwsIQHx+v\nsE2XLl1QUFAAACgoKIC5uXm1pEBEVF+rV69uMSuhqoLKrsK5ubmwtbWVXtvY2ODEiRMK20RERODF\nF1+ElZUVCgsLsWXLFlWFQ0QtyPPPP895CY2gssSgTE3wokWL4OnpicTERGRlZWHgwIE4ffo0jI2N\nq20bGRkp/T0gIAABAQFNGC0R6RJXV1dNh6ARiYmJSExMbPR+VJYYrK2tkZ2dLb3Ozs6GjY2NwjZH\njx7F7NmzAQDdunVD165dcf78efj4+FTbX9XEQERUSQihk5PTGuLxX5o/+eSTBu1HZWMMPj4+uHjx\nIq5cuQKZTIbY2FiEhIQobOPs7Ix9+/YBAP7880+cP38ezz77rKpCIiIdUllxNGPGDE2HonNU1mMw\nMDBAdHQ0goKCIJfLMX78eLi4uGDVqlUAgEmTJuHjjz/G2LFj0bNnT5SXl2Pp0qUwMzNTVUhEpCMe\nrziipsUlMYio2ZDJZFi4cCG+/vprREVF6ewaR02loddO1oYSUbOxaNEiJCcns+JIxdhjIKJmQyaT\nwdDQkL0EJXF1VSIiUqB1M5+JiBpKJpPh2rVrmg6jxWJiICKtUvns5S+++ELTobRYTAxEpBUef/by\nsmXLNB1Si8WqJCLSOD57Wbtw8FnpGADtbymi5mnLli0oLi7mvIQmxqoklcfAxEBEzQurkpqYmVlF\nMqj8MjXVdEREROrBxFCL/PyKHkLl1927mo6IqPlLTU3Ff//7X02HQXVgYiAilatacVReXq7pcKgO\nrEoiIpVixVHzwx5DFVXHFTimQNR4MTExfPZyM8SqJIXjsPKIqCn98ccfaNu2LROChrBctUmOw8RA\nRLqD5aqNZGbG20dEjdEMfsckJTEx/E9+PktSiRqisuIoIiJC06FQE2FVEhE1GJ+9rJuU7jE8fPhQ\nlXEQUTPy+EqorDjSLXUmhqNHj8LV1RVOTk4AgLS0NLz11lsqD4yItNdXX30lPXv5zTff5MJ3OqbO\nqiRfX19s3boVf//735GamgoA6NGjB86dO6eWAAHVVSWZmVWMLQAVA88cYyBSTllZGVq1asWEoOUa\neu1UaozBzs5O8UMGzX9owsys4k8WUhDVny5cA6h2dd5KsrOzw5EjRwBU3FeMioqCi4uLygNTlcrZ\nzQB7CER1kclkuHjxoqbDIDWrMzF8/fXXWLFiBXJzc2FtbY3U1FSsWLFCHbE1qaoJgaulEtWt8tnL\nn3/+uaZDITWrsz944cIFbNq0SeF7R44cgZ+fn8qCamq8bUSkPJlMhgULFuA///kPli1bhlGjRmk6\nJFKzOgefvby8pEHnJ31PlRoz+FyZFNhDIKpbamoqxowZg2eeeQarVq1iCWoz1+SDz8eOHcPRo0dx\n+/ZtLF++XNp5YWFhs1pPvfKBO0RUt5s3b2LWrFkYNWoUK45asFoTg0wmQ2FhIeRyOQoLC6Xvd+jQ\nAVu3blVLcESkXi+99JKmQyAtUOetpCtXrsDe3l5N4dSsMbeSuGIqEbVUKpvH
0K5dO8ycORMZGRl4\n9OiRdLADBw7UP0o1qpy8xhVTiapLSUlBSkoKJkyYoOlQSAvVWa76xhtvwNnZGX/88QciIyNhb28P\nHx8fdcTWKJVjCxx0JvqLTCbDP//5TwQHB8PIyEjT4ZCWqvNWkre3N1JSUuDh4YH09HQAgI+PD06d\nOqWWAIH6d4dYiURUXUpKCsLDw1lx1IKo7EE9rVu3BgBYWlpix44dSElJQX7lAkNahrOaiWq2ceNG\nBAcHY9asWdi2bRuTAj1RnT2G7du3w9/fH9nZ2Zg2bRoKCgoQGRmJkJAQdcWoVNZjL4GodtevXwcA\nJoQWRq3PfE5KSoKvr2+d2yUkJODdd9+FXC7HhAkT8MEHH1TbJjExEe+99x5KS0thYWGBxMTE6kEq\ncXKsPiIiUtTkiaG8vBw//fQTsrKy4ObmhkGDBuHUqVP4+OOPcevWLaSlpT1xx3K5HE5OTti3bx+s\nra3Rq1cvbN68WWEBvnv37sHPzw979uyBjY0N7ty5AwsLiwadHBMDUYXy8nLo6/OpvaSCMYaJEydi\n5cqVyM/Px4IFC/Dqq69izJgxeOutt5RaDiMpKQkODg6wt7eHoaEhwsLCEB8fr7DNpk2b8Oqrr8LG\nxgYAakwKRKScyoqjkSNHajoUauZqncdw/PhxpKenQ19fH8XFxbC0tERWVhbMzc2V2nFubi5sbW2l\n1zY2Njhx4oTCNhcvXkRpaSn69++PwsJCvPPOOxg9enS9T8LMjPMVqGWrrDiys7Pjs5ep0WpNDIaG\nhlJ3tG3btujatavSSQGAUuuslJaWIiUlBfv378fDhw/Rt29f9OnTB46Ojkofh4PO1JLJZDIsXLgQ\nX3/9NaKiojB69GiucUSNVmtiyMzMhLu7u/Q6KytLeq2npyfNaaiNtbU1srOzpdfZ2dnSLaNKtra2\nsLCwgJGREYyMjPDCCy/g9OnTNSaGyMhI6e8BAQEICAgAwEXyqGVbu3at9OxlVhxRYmJijQU89VXr\n4POVK1ee+MG61k8qKyuDk5MT9u/fDysrK/j6+lYbfM7MzMTUqVOxZ88elJSUoHfv3oiNjYWrq6ti\nkE8YQOGgM7Vk5eXl0NPTYy+BatTkayU1duE8AwMDREdHIygoCHK5HOPHj4eLiwtWrVoFAJg0aRKc\nnZ0RHBwMDw8P6OvrIyIiolpSIKLasfqIVKFB8xjUjT0Gaukqn73co0cPTYdCzYjKlsQgIs1KS0uD\nr68vli9frulQqIVQKjE8fPgQ58+fV3UsRFSFTCbDvHnzEBgYiOnTp+Pbb7/VdEjUQtSZGLZt2wYv\nLy8EBQUBqHgmrDrXSSJqidLT0+Hr6ytVHL355pscYCa1UWrZ7QMHDqB///7SjGc3NzecPXtWLQEC\nHGOglufw4cO4fPky5yVQo6jsCW6GhoZ4+umnFb6nLZUQnPFMusrf3x/+/v6aDoNaqDoTQ48ePbBx\n40aUlZXh4sWL+PLLL9GvXz91xFYnTm4jImp6df7q/9VXX+HcuXNo06YNRowYgQ4dOuCLL75QR2xE\nOi8tLY3/n0jr1DnGkJKSAm9vb3XFU6Oa7pNxjSRqzqqucbRs2bIGLR5JVBeVPagnICAAN2/exLBh\nwzB8+HC4ubk1OMiGqunkOOhMzVVaWhrCw8NhY2OD1atXc40jUhmVTXBLTEzEwYMHYWFhgUmTJsHd\n3R3/+te/GhQkUUv3ww8/SPMStm/fzqRAWqleS2KcOXMGS5YsQWxsLEpLS1UZlwL2GEhX5OXloaSk\nhAmB1EJlt5IyMjKwZcsWbN26Febm5hg+fDhee+01dOrUqcHB1hcTAxFR/aksMfTp0wdhYWEYNmwY\nrK2tGxxgYzAxUHMkl8vRqlUrTYdBLZjKEoM2YGKg5qSy4ujUqVPYuXOnpsOhFqzJZz4PGzYMcXFx\nCk9xq3qwup7gRtQSVa04+uabbzQdDlGD
1NpjuH79OqysrHD16tUaflvXwzPPPKOWACuPxx4DaTM+\ne5m0UZOXq1ZWTaxcuRL29vYKXytXrmx4pEQ6KC4ujiuhks6oc4zBy8tLWlW1kru7O86cOaPSwKpi\nj4G0XeW/TyYE0iZNPsbw9ddfY+XKlcjKylIYZygsLISfn1/DoiTSUUwIpEtq7THcv38f+fn5+PDD\nD7FkyRIp6xgbG8Pc3Fy9QVbJemZmFauqmppynSRSP5lMhrNnz2p8/TAiZTR5uWpBQQE6dOiAvLy8\nGn8bMqtcxU4Nqp4cbyGRplRWHLm5uWHDhg2aDoeoTk2eGAYPHoydO3fC3t6+xsRw+fLl+kfZQEwM\npEmsOKLmqsVMcGNiIHU6c+YMRo8ezZVQqVlS2eqqR44cwYMHDwAA69evx/Tp03H16tX6R0jUDMnl\ncq6ESi1OnT0Gd3d3nD59GmfOnEF4eDjGjx+PuLg4/Prrr+qKkT0GIqIGUFmPwcDAAPr6+vj555/x\nj3/8A1OnTkVhYWGDgiQiIu1XZ2IwNjbGokWLsGHDBgwZMgRyuVytz2IgUoe0tDQ+gIrof+pMDLGx\nsWjTpg3Wrl0LS0tL5ObmYtasWeqIjUjlZDIZ5s2bh8DAQLWu/0WkzZSqSrp58yZOnjwJPT09+Pr6\nqvUhPQDHGEg1+Oxl0nUqG2PYsmULevfujbi4OGzZsgW+vr6Ii4trUJBE2mLnzp189jJRLersMXh4\neGDfvn1SL+H27dsYMGCAWp/HwB4DNbXCwkIUFhYyIZBOa/JF9CoJIdCxY0fptbm5eYMORKRNjI2N\nYWxsrOkwiLRSnYkhODgYQUFBGDlyJIQQiI2NxUsvvaSO2IiaRGlpKQwNDTUdBlGzodTg848//ojf\nfvsNAODv74+hQ4eqPLCqeCuJGqJyjaPExEQkJiZyfSNqcZr8VtKFCxcwa9YsXLp0CR4eHvjss89g\nY2PTqCCJ1KVqxdHmzZuZFIjqodaqpHHjxmHIkCH44Ycf4O3tjbfffludcRE1SNV5Caw4ImqYWhPD\ngwcPEBERAWdnZ8yaNatBy2wnJCTA2dkZjo6OWLJkSa3bnTx5EgYGBvjxxx+fuD8zs4oH9BDVZs+e\nPXz2MlEj1Xorqbi4GCkpKQAqKpMePXqElJQUCCGgp6dX5xOs5HI5pk6din379sHa2hq9evVCSEgI\nXFxcqm33wQcfIDg4uM57Yfn5HF+gJxsyZAiGDBnChEDUCLUmBktLS8yYMaPW1wcPHnzijpOSkuDg\n4AB7e3sAQFhYGOLj46slhq+++gqvvfYaTp482ZD4iRQwIRA1Xq2JITExsVE7zs3Nha2trfTaxsYG\nJ06cqLZNfHw8Dhw4IC25QaQMmUyGU6dOoV+/fpoOhUjn1LkkRkMpc5F/9913sXjxYqmkihPnSBlp\naWnw9fXF559/zn8zRCpQ5wS3hrK2tkZ2drb0Ojs7u1q5a3JyMsLCwgAAd+7cwe7du2FoaIiQkJBq\n+4uMjPzfn0BAQAACAgJUFTppKT57mejJKufsNJbKnvlcVlYGJycn7N+/H1ZWVvD19cXmzZurjTFU\nGjt2LF4o2bCVAAATnUlEQVR++WWEhoZWD/J/PQpObmu5MjIyMHLkSK6ESlQPKltdtby8HOvXr8f8\n+fMBANeuXUNSUlKdOzYwMEB0dDSCgoLg6uqK4cOHw8XFBatWrcKqVavqHSi1bK1bt+a8BCI1qbPH\nMHnyZOjr6+PAgQPIzMzE3bt3ERgYiFOnTqkrRvYYiIgaQGWrq544cQKpqanw8vICAJiZmfHRnkRE\nOqzOW0mtW7eGXC6XXt++fRv6+iorZqIWLi0tDbNmzWK1EZEG1XmFnzZtGoYOHYpbt27h448/hp+f\nHz766CN1xEYtSNU1jtzd3TUdDlGLplRV0u+//479+/cDAAYMGFBrZZGqcIxBt/HZy0Sq0dAxhjoT\nw7Vr
1wCgyvMQKurG7ezs6n2whmJi0F379+/HiBEjOC+BSAVUlhjc3Nyk/6zFxcW4fPkynJyccO7c\nuYZF2gBMDLqrpKQEeXl57CUQqYDKqpLOnj2r8DolJQUrVqyo94GIatKmTRsmBSIt06CZz25ubtUS\nhiqxx6AbiouL0bZtW02HQdRiqKzHsGzZMunv5eXlSElJgbW1db0PRC1X5RpHO3fu5Cq6RM1AnYnh\nwYMHf21sYIAhQ4bg1VdfVWlQpDuqVhxt27aNSYGoGXhiYpDL5SgoKFDoNRApgyuhEjVftSaGsrIy\nGBgY4MiRI9LjPImUdezYMaSkpCAtLY2Dy0TNTK2Dz97e3khJScHkyZNx/fp1DBs2DO3atav4kJ5e\njctjqyxIDj4TEdVbkw8+V+6suLgY5ubmOHDggML76kwMRESkPrUmhtu3b2P58uVct4aeSCaT4fDh\nwxgwYICmQyGiJlJrYpDL5SgsLFRnLNTMVFYcde3aFf379+equ0Q6otYxBi8vL6Smpqo7nhpxjEG7\nsOKIqHlQ2QQ3oqoyMzMRFhYGGxsbVhwR6ahaewx5eXkwNzdXdzw10tPTg6lpRZh372o4mBbu+vXr\n2L9/P0aNGsVeApGWU9nqqtqg4gIkeBuJiKgeGpoYOFpIREQKmBioRmlpaZg8eTLKy8s1HQoRqRkT\nAymo+uzlfv36cRyBqAViVRJJqq6EyoojopaLPQYCABw9ehSBgYGYPn06tm/fzqRA1IKxKokAVMx0\nv337NiwtLTUdChE1EZ0vVzU1FZzDQERUDzpfrsqk0HSKioo0HQIRabFmkxio8Sorjnx9fSGXyzUd\nDhFpKSaGFiItLQ2+vr5ITk7G3r170apVK02HRERaiolBx1Wdl8CKIyJSBucx6LgzZ84gLS2N8xKI\nSGnNpiqpGYRJRKRVdL4qiYiI1IOJQUfIZDLs2LFD02EQkQ5gYtABlRVHq1evRllZmabDIaJmTuWJ\nISEhAc7OznB0dMSSJUuqvb9x40b07NkTHh4e8PPzQ3p6uqpD0hmPVxzFx8fDwID1BETUOCq9isjl\nckydOhX79u2DtbU1evXqhZCQELi4uEjbPPvsszh06BBMTEyQkJCAiRMn4vjx46oMSydcunQJr732\nGldCJaImp9IeQ1JSEhwcHGBvbw9DQ0OEhYUhPj5eYZu+ffvCxMQEANC7d2/k5OSoMiSdYW5ujvff\nf5/zEoioyak0MeTm5sLW1lZ6bWNjg9zc3Fq3X7NmDQYNGqTKkHSGqakpRo4cyQfpEFGTU+mtpPpc\ntA4ePIi1a9fiyJEjNb4fGRkp/T0gIAABAQGNjI6ISLckJiYiMTGx0ftRaWKwtrZGdna29Do7Oxs2\nNjbVtktPT0dERAQSEhJgampa476qJoaWJC0tDVFRUVi3bh0MDQ01HQ4RabHHf2n+5JNPGrQfld5K\n8vHxwcWLF3HlyhXIZDLExsYiJCREYZtr164hNDQUGzZsgIODgyrDaVaqVhwFBgay2oiI1EalVxsD\nAwNER0cjKCgIcrkc48ePh4uLC1atWgUAmDRpEubPn4/8/HxMmTIFAGBoaIikpCRVhqX1+OxlItIk\nrpWkZVJTUxEUFISoqCiMHj2ag8tE1GA6/2jPZhBmkxBC4M6dO+jYsaOmQyGiZo6JgYiIFHB11Wbo\n/v37mg6BiKgaJgYNqKw48vb2hkwm03Q4REQKmBjULDU1Fb169UJycjIOHz6M1q1bazokIiIFTAxq\nUtlLCAoKwsyZM7nGERFpLc6aUpOsrCycPXuW8xKISOuxKomISEexKomIiJoEE0MTk8lkiIuL03QY\nREQNxsTQhCorjr7//nuUlJRoOhwiogZhYmgCj1ccbdu2DW3atNF0WEREDcKqpEa6fPkyXnnlFdjZ\n2bHiiIh0AquSGqmoqAg7duzA66+/zpVQiUircBE9IiJSwHJVIiJqEk
wMSkpNTUVoaCiKi4s1HQoR\nkUoxMdShasXR0KFDWW1ERDqPVUlPkJqaivDwcFYcEVGLwsHnWpw/fx7+/v5YtmwZRo0axYojImp2\nWJWkAvn5+TA1NVX7cYmImgITAxERKWC5aiPk5eVpOgQiIq3RohNDZcWRl5cXHj58qOlwiIi0QotN\nDJUroaakpOD48eNo166dpkMiItIKLS4x1LQSKstQiYj+0uLmMdy4cQOZmZmcl0BEVAtWJRER6ShW\nJRERUZPQ2cQgk8nw3XffsadBRFRPOpkYKiuOtm7dyjJUIqJ60qnEUFPFUfv27TUdFhFRs6IzVUk5\nOTkYPHgwV0IlImoknalKkslk2LFjB4YOHcqVUImIwEX0iIjoMVpZrpqQkABnZ2c4OjpiyZIlNW7z\n9ttvw9HRET179kRqaqoqwyEiIiWoLDHI5XJMnToVCQkJyMjIwObNm/H7778rbLNr1y5cunQJFy9e\nxOrVqzFlypQ695uamoqXXnoJBQUFqgpdqyUmJmo6BK3BtvgL2+IvbIvGU1liSEpKgoODA+zt7WFo\naIiwsDDEx8crbLNt2zaMGTMGANC7d2/cu3cPf/75Z437q1pxNHLkSBgbG6sqdK3Gf/R/YVv8hW3x\nF7ZF46msKik3Nxe2trbSaxsbG5w4caLObXJyctC5c+dq++vVqxcrjoiI1EBliUHZyqDHB0Zq+9yM\nGTMwevRoVhwREamaUJFjx46JoKAg6fWiRYvE4sWLFbaZNGmS2Lx5s/TayclJ3Lx5s9q+unXrJgDw\ni1/84he/6vHVrVu3Bl2/VdZj8PHxwcWLF3HlyhVYWVkhNjYWmzdvVtgmJCQE0dHRCAsLw/Hjx/H0\n00/XeBvp0qVLqgqTiIgeo7LEYGBggOjoaAQFBUEul2P8+PFwcXHBqlWrAACTJk3CoEGDsGvXLjg4\nOKB9+/ZYt26dqsIhIiIlNYsJbkREpD5atYgeJ8T9pa622LhxI3r27AkPDw/4+fkhPT1dA1GqhzL/\nLgDg5MmTMDAwwI8//qjG6NRHmXZITEyEl5cX3NzcEBAQoN4A1aiutrhz5w6Cg4Ph6ekJNzc3xMTE\nqD9INRk3bhw6d+4Md3f3Wrep93WzQSMTKlBWVia6desmLl++LGQymejZs6fIyMhQ2Gbnzp3ipZde\nEkIIcfz4cdG7d29NhKpyyrTF0aNHxb1794QQQuzevbtFt0Xldv379xeDBw8WW7du1UCkqqVMO+Tn\n5wtXV1eRnZ0thBDi9u3bmghV5ZRpi3nz5okPP/xQCFHRDmZmZqK0tFQT4arcoUOHREpKinBzc6vx\n/YZcN7Wmx9DUE+KaM2Xaom/fvjAxMQFQ0RY5OTmaCFXllGkLAPjqq6/w2muvoWPHjhqIUvWUaYdN\nmzbh1VdfhY2NDQDAwsJCE6GqnDJt0aVLF2l1hIKCApibm8PAQGcWk1bg7+8PU1PTWt9vyHVTaxJD\nTZPdcnNz69xGFy+IyrRFVWvWrMGgQYPUEZraKfvvIj4+XlpSRRfnuijTDhcvXsTdu3fRv39/+Pj4\nYP369eoOUy2UaYuIiAicO3cOVlZW6NmzJ/7973+rO0yt0ZDrptak0KaeENec1eecDh48iLVr1+LI\nkSMqjEhzlGmLd999F4sXL5ZWknz834guUKYdSktLkZKSgv379+Phw4fo27cv+vTpA0dHRzVEqD7K\ntMWiRYvg6emJxMREZGVlYeDAgTh9+nSLXUqnvtdNrUkM1tbWyM7Oll5nZ2dLXeLatsnJyYG1tbXa\nYlQXZdoCANLT0xEREYGEhIQndiWbM2XaIjk5GWFhYQAqBh13794NQ0NDhISEqDVWVVKmHWxtbWFh\nYQEjIyMYGRnhhRdewOnTp3UuMSjTFkePHsXs2bMBAN26dUPXrl1x/vx5+Pj4qDVWbdCg62aTjYA0\nUmlpqXj22WfF5cuXRUlJSZ2Dz8
eOHdPZAVdl2uLq1auiW7du4tixYxqKUj2UaYuqwsPDxQ8//KDG\nCNVDmXb4/fffxYABA0RZWZkoKioSbm5u4ty5cxqKWHWUaYv33ntPREZGCiGEuHnzprC2thZ5eXma\nCFctLl++rNTgs7LXTa3pMXBC3F+UaYv58+cjPz9fuq9uaGiIpKQkTYatEsq0RUugTDs4OzsjODgY\nHh4e0NfXR0REBFxdXTUcedNTpi0+/vhjjB07Fj179kR5eTmWLl0KMzMzDUeuGiNGjMCvv/6KO3fu\nwNbWFp988glKS0sBNPy6yQluRESkQGuqkoiISDswMRARkQImBiIiUsDEQERECpgYiIhIARMDEREp\nYGIgrdGqVSt4eXlJX9euXat126eeeqrRxwsPD8ezzz4LLy8vPPfcczh+/Hi99xEREYHMzEwAFcsw\nVOXn59foGIG/2sXDwwOhoaF48ODBE7c/ffo0du/e3STHppaJ8xhIaxgbG6OwsLDJt63N2LFj8fLL\nLyM0NBR79+7FzJkzcfr06Qbvryliqmu/4eHhcHd3x4wZM2rdPiYmBsnJyfjqq6+aPBZqGdhjIK1V\nVFSE//u//8Nzzz0HDw8PbNu2rdo2N27cwAsvvAAvLy+4u7vjt99+AwD88ssv6NevH5577jm8/vrr\nKCoqqvEYlb8X+fv7S88WX758Odzd3eHu7i6tyllUVITBgwfD09MT7u7uiIuLAwAEBAQgOTkZH374\nIR49egQvLy+MHj0awF+9mrCwMOzatUs6Znh4OH788UeUl5dj1qxZ8PX1Rc+ePbF69eo626Rv377I\nysoCULH8dL9+/eDt7Q0/Pz9cuHABMpkM//znPxEbGwsvLy/ExcWhqKgI48aNQ+/eveHt7V1jOxIp\naKq1Oogaq1WrVsLT01N4enqK0NBQUVZWJgoKCoQQFQ9bcXBwkLZ96qmnhBBCREVFiYULFwohhJDL\n5aKwsFDcvn1bvPDCC+Lhw4dCCCEWL14s5s+fX+144eHh0kN9tmzZIvr06SOSk5OFu7u7ePjwoXjw\n4IHo0aOHSE1NFVu3bhURERHSZ+/fvy+EECIgIEAkJycrxPR4jD/99JMYM2aMEEKIkpISYWtrK4qL\ni8WqVavEggULhBBCFBcXCx8fH3H58uVqcVbup6ysTISGhooVK1YIIYQoKCgQZWVlQggh9u7dK159\n9VUhhBAxMTFi2rRp0uc/+ugjsWHDBiFExcN8unfvLoqKimr8GRAJoUVrJREZGRkpPHawtLQUH330\nEQ4fPgx9fX1cv34dt27dQqdOnaRtfH19MW7cOJSWluKVV15Bz549kZiYiIyMDPTr1w8AIJPJpL9X\nJYTArFmzsGDBAnTq1Alr1qzB3r17ERoaCiMjIwBAaGgoDh8+jODgYMycORMffvghhgwZgueff17p\n8woODsY777wDmUyG3bt3429/+xvatGmDX375BWfOnMHWrVsBVDxQ5tKlS7C3t1f4fGVPJDc3F/b2\n9pg8eTIA4N69e3jzzTdx6dIl6OnpoaysTDovUeUO8S+//ILt27cjKioKAFBSUoLs7Gw4OTkpfQ7U\nsjAxkNbauHEj7ty5g5SUFLRq1Qpdu3ZFcXGxwjb+/v44fPgwduzYgfDwcEyfPh2mpqYYOHAgNm3a\n9MT96+npISoqCqGhodL39u3bp3BRFUJAT08Pjo6OSE1Nxc6dOzFnzhwMGDAAc+fOVeo82rZti4CA\nAOzZswdbtmzBiBEjpPeio6MxcODAJ36+MmE+evQIQUFBiI+Px9ChQzF37lwMGDAAP/30E65evfrE\nZzz/+OOPOrf8NqkOxxhIaxUUFKBTp05o1aoVDh48iKtXr1bb5tq1a+jYsSMmTJiACRMmIDU1FX36\n9MGRI0eke/FFRUW4ePFijccQj9Ve+Pv74+eff8ajR49QVFSEn3/+Gf7+/rhx4wbatm2LN954AzNn
\nzqzxgeqGhobSb+2PGz58ONauXSv1PgAgKCgIK1eulD5z4cIFPHz4sNb2MDIywpdffonZs2dDCIGC\nggJYWVkBgMKKmR06dFAYBA8KCsKXX34pvVbqYfDUojExkNZ4/KlSb7zxBk6dOgUPDw+sX78eLi4u\n1bY9ePAgPD094e3tjS1btuCdd96BhYUFYmJiMGLECPTs2RP9+vXD+fPnlTqml5cXwsPD4evriz59\n+iAiIgI9e/bEmTNn0Lt3b3h5eWH+/PmYM2dOtX1NnDgRHh4e0uBz1X0HBgbi0KFDGDhwoPTs4QkT\nJsDV1RXe3t5wd3fHlClTakwsVffj6ekJBwcHbNmyBe+//z4++ugjeHt7Qy6XS9v1798fGRkZ0uDz\n3LlzUVpaCg8PD7i5uWHevHm1/xCIwHJVIiJ6DHsMRESkgImBiIgUMDEQEZECJgYiIlLAxEBERAqY\nGIiISAETAxERKWBiICIiBf8PvtLQiChINZkAAAAASUVORK5CYII=\n", "text": [ "" ] } ], "prompt_number": 58 }, { "cell_type": "code", "collapsed": false, "input": [ "r50 = auc(*zip(*roccrv[roccrv[0]<0.1].values))\n", "print \"The R50 value for this ROC curve is %.2f\"%r50" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "The R50 value for this ROC curve is 0.06\n" ] } ], "prompt_number": 60 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Which could be right, it's certainly _below_ what they get in the paper and it's obtained using much less training data than used in the paper. Unfortunately, I still can't find another source for what an R50 actually is. I doubt what I'm doing is 100% correct because then the R50 value you get would depend on the _size of your test set_ (if my training set were smaller I could tolerate a higher false positive rate when finding the R50 value, which would obviously improve my results.\n", "\n", "In lieu of any of that I'm just going to try and train this model with _80%_ of the data and see if the result is exactly what they have in the paper. If it is, I'll just assume my interpretation of the R50 value is right." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "#concatenate the positive and negative training sets together\n", "X = pd.concat((nipd[nipfiles[0]].iloc[:,0:162],nipd[nipfiles[1]].iloc[:,0:162]))\n", "#get the class label vector y the same way\n", "y = pd.concat((nipd[nipfiles[0]].iloc[:,162],nipd[nipfiles[1]].iloc[:,162]))\n", "#then shuffle X and y together so rows and labels stay aligned\n", "X,y = skshuffle(X.values,y.values)\n", "#find the index 80% of the way through (must be an integer to slice with)\n", "eighty = int(len(X)*0.8)\n", "#split it into train (first 80%) and test (last 20%)\n", "Xtrain, Xtest = X[:eighty], X[eighty:]\n", "ytrain, ytest = y[:eighty], y[eighty:]" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 61 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The split index has to be an integer: on this version of NumPy, slicing with a float only raises a `DeprecationWarning`, but it will be an error in future versions.\n", "\n", "Running the code below took a while (less than an hour, though):" ] }, { "cell_type": "code", "collapsed": false, "input": [ "#reinitialise a logistic regression model\n", "logmodel = sklm.LogisticRegression()\n", "#fit it on the 80% training split\n", "logmodel.fit(Xtrain,ytrain)\n", "#get class probability estimates for the held-out 20%\n", "estimates = logmodel.predict_proba(Xtest)\n", "#recalc the roc curve using the positive class probabilities\n", "fpr, tpr, thresholds = roc_curve(ytest,estimates[:,1])\n", "#print area under roc curve\n", "roc_auc = auc(fpr,tpr)\n", "print \"Area under ROC curve: %.2f\"%(roc_auc)\n", "#replot the roc curve\n", "plotroc(fpr,tpr)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ 
"Area under ROC curve: 0.94\n" ] }, { "metadata": {}, "output_type": "display_data", "png": "iVBORw0KGgoAAAANSUhEUgAAAYYAAAEZCAYAAACTsIJzAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3X1cjff/B/DXSSEkKaS7ZYpQqSR3yzfzpbZZmzYTY8Jy\ns/Hdvm52h4nR2LC78GVzN8OPzOb+ntwT3cj9TXNTYUiUqFOnz++Ps46OOnW6Oec6nV7Px6MH55zr\nXNf7XNX17vp83p/PRyaEECAiIvqHidQBEBGRYWFiICIiNUwMRESkhomBiIjUMDEQEZEaJgYiIlLD\nxEBac3d3x8GDB6UOQ3KjR4/GjBkz9HrMsLAwTJkyRa/H1JVVq1YhMDCwQu/lz6B+yDiOoXpydnbG\n3bt3UatWLdSvXx+9evXC/Pnz0bBhQ6lDMyrLly/HkiVLcOjQIUnjGDp0KBwdHTF9+nRJ44iIiEBy\ncjJWrlyp82OFhYXB0dERX331lc6PRep4x1BNyWQybNmyBVlZWTh9+jTOnDmj979iq0J+fn6NPLaU\nFApFjTw2aY+JwQg0a9YMvXv3xrlz51TPHT9+HF27doWVlRW8vLxw4MAB1WsPHjzA0KFDYW9vj8aN\nG6Nv376q17Zs2QIvLy9YWVmhW7duOHPmjOo1Z2dn7Nu3D7du3UK9evWQkZGhei0hIQFNmjRR/eIv\nXboUbdu2RePGjREUFISbN2+qtjUxMcGCBQvg6uqK1q1bl/iZNm3ahHbt2sHKygo9evTAxYsX1eKY\nNWsW2rVrh8aNG2PYsGHIzc3V+jN888038PT0hIWFBRQKBWbNmgUXFxc0bNgQ7dq1w59//gkAuHDh\nAkaPHo1jx47BwsICjRs3BqDerBMTEwMHBwfMmzcPzZo1g52dHZYvX646Xnp6Ol5//XVYWlrCz88P\nkydPhr+/v8bv5eHDh1XfNycnJ/z6669q37c+ffqgYcOG6Ny5M/766y/Vax999BGcnJxgaWkJX19f\nHD58WPVaREQE3n77bQwePBiWlpZYsWIFTp48iS5dusDKygp2dnYYO3Ys8vLyVO85d+4cevXqBWtr\na9ja2uLrr7/Gzp078fXXX2Pt2rWwsLCAt7c3AODRo0cYPnw47Ozs4ODggClTpqCgoACA8o6rW7du\nGDduHGxsbBAREYHly5erzoEQAv/973/RrFkzWFpawtPTE+fOncPixYuxevVqfPPNN7CwsMAbb7yh\n+v7t3bsXgDLJREZGqr53vr6+SE1N1XhuqRwEVUvOzs5iz549QgghUlJShIeHh5g2bZoQQojU1FRh\nbW0ttm/fLoQQYvfu3cLa2lrcv39fCCHEq6++KkJDQ8XDhw9FXl6eOHjwoBBCiPj4eNG0aVMRGxsr\nCgoKxIoVK4Szs7OQy+WqY+7du1cIIcTLL78sfv75Z1U8EyZMEKNHjxZCCPHnn38KFxcXcfHiRaFQ\nKMSMGTNE165dVdvKZDLRu3dvkZGRIXJycop9tkuXLon69euLPXv2iPz8fPHNN98IFxcXkZeXJ4QQ\n4oUXXhAeHh4iNTVVPHjwQHTr1k1MnjxZq8/wwgsvCG9vb5Gamqo6dnR0tLh9+7YQQoi1a9eK+vXr\nizt37gghhFi+fLl46aWX1OILCwsTU6ZMEUIIsX//fmFqaiqmTp0q8vPzxbZt20S9evXEw4cPhRBC\n9O/fXwwYMEA8ffpUnD9/Xjg6Ogp/f/8Sv6fXr18XFhYW4v/+7/9Efn6+SE9PF4mJiUIIIYYMGSKs\nra3FyZMnRX5+vnj33XdFaGio6r2//fabePDggVAoFGLu3LnC1tZW5ObmCiGEmDp1qjAzMxMbN24U\nQgjx9OlTERcXJ06cOCEUCoW4fv26aNOmjfj
++++FEEJkZmYKW1tbMW/ePJGbmyuysrLEiRMnhBBC\nREREiMGDB6vF/eabb4pRo0aJJ0+eiLt37wo/Pz+xaNEiIYQQy5YtE6ampiIqKkooFArx9OlTsWzZ\nMtU53bFjh+jQoYN49OiREEKIixcvqr4XRc9zoaI/g998843w8PAQly9fFkIIkZSUJNLT00s8t1Q+\nTAzV1AsvvCAaNGggLCwshEwmE2+++aZQKBRCCCFmzZpV7Jc3MDBQrFixQty6dUuYmJioLlxFjRo1\nqtgvYuvWrVWJo+gv5S+//CJefvllIYQQBQUFwtHRURw6dEgIIURQUJBYsmSJah8KhULUq1dP3Lx5\nUwihTAz79+/X+NmmT58u+vfvr3pcUFAg7O3txYEDB1RxFF54hBBi27ZtomXLllp/hmXLlmk8thBC\neHl5qS6iRS9ihcLCwlSJaP/+/cLc3Fx17oUQomnTpuLEiRMiPz9fmJmZqS5cQggxefLkYvsrFBkZ\nKUJCQkp8LSwsTISHh6t9Zjc3N42fwcrKSiQlJQkhlInhX//6VymfWIjvvvtO9O3bVwghxOrVq4WP\nj0+J202dOlUMGjRI9fjOnTuiTp064unTp6rnVq9eLXr06CGEUJ4/JycntX0UPad79+4VrVq1EseP\nH1c7h4WfufA8Fyr6M9iqVSuxadOmUj8XVQybkqopmUyGjRs3IjMzEzExMdi3bx9OnToFALhx4wai\no6NhZWWl+jpy5Aju3LmDlJQUNG7cGJaWlsX2eePGDcydO1ftfampqbh161axbUNCQnDs2DHcuXMH\nBw8ehImJCV566SXVfj766CPVPqytrQEAaWlpqvc7Ojpq/Gy3b9+Gk5OT2md1dHTU+H4nJydVjNp8\nhueP/euvv8Lb21u1/dmzZ5Genq4xvudZW1vDxOTZr1K9evXw+PFj3Lt3D/n5+WrHc3Bw0Lif1NRU\nvPjiixpfb9asmer/5ubmePz4serxnDlz0LZtWzRq1AhWVlZ49OgR7t+/r/G4ly9fRp8+fdC8eXNY\nWlpi0qRJqs+ckpJSahxF3bhxA3l5eWjevLnq/I0aNQr37t1TbVPa9/rll1/GmDFj8OGHH6JZs2YY\nOXIksrKytDp2amoqWrZsqdW2VD5MDEage/fuGDt2LD799FMAygvl4MGDkZGRofrKysrCJ598AkdH\nRzx48ACPHj0qth8nJydMmjRJ7X2PHz9G//79i21rZWWF3r17Y+3atVi9ejUGDBigtp/Fixer7Sc7\nOxudO3dWbSOTyTR+Hjs7O9y4cUP1WAiBlJQU2Nvbq54r2mdx8+ZN1WvafIaix75x4wZGjBiB+fPn\n48GDB8jIyIC7uzvEP8V6muIsLf5CTZo0gampKVJSUlTPFf3/8xwdHZGcnFzmfp936NAhfPvtt4iO\njsbDhw+RkZEBS0tL1WcoKd7Ro0ejbdu2uHr1Kh49eoSZM2eq+gWcnJzU+i+KKpoAC2OuU6cO0tPT\nVef70aNHav06ZZ2rsWPH4tSpUzh//jwuX76Mb7/9Vqv3OTo64urVq6VuQxXDxGAkPv74Y8TGxuLE\niRMYNGgQNm/ejF27dkGhUCAnJwcxMTFIS0tD8+bN8corr+CDDz7Aw4cPkZeXp6oLDw8Px//+9z/E\nxsZCCIHs7Gxs3bpV7S/TogYOHIgVK1bg999/x8CBA1XPjxo1CpGRkTh//jwAZedkdHS01p/lnXfe\nwdatW7Fv3z7k5eVh7ty5qFu3Lrp27QpAmSgWLFiAtLQ0PHjwADNnzlRd+Mv7GbKzsyGTyWBjY4OC\nggIsW7YMZ8+eVb3erFkzpKamqnXMCmUTbJmfo1atWggJCUFERASePn2KixcvYuXKlRoveO+++y72\n7NmD6Oho5OfnIz09HadPn1YdU5OsrCyYmprCxsYGcrkc06dPR2ZmZqmxPX78GBYWFqhXrx4uXryI\nhQsXql5
77bXXcPv2bfzwww/Izc1FVlYWYmNjVefj+vXrqniaN2+O3r17Y9y4ccjKykJBQQGSk5O1\nHmtw6tQpnDhxAnl5eahXrx7q1q2LWrVqqY6lKUEBwPvvv48pU6bg6tWrEEIgKSkJDx480Oq4VDom\nBiNhY2ODIUOGYPbs2XBwcMDGjRsRGRmJpk2bwsnJCXPnzlX9Rbhy5UqYmZnBzc0NzZo1w48//ggA\n6NChA37++WeMGTMGjRs3hqurK3799VeNF7Lg4GBcvXoVzZs3h4eHh+r5N998E59++ilCQ0NhaWkJ\nDw8P7Ny5U/V6WX8JtmrVCr/99hvGjh2LJk2aYOvWrdi8eTNMTU1V7x84cCB69+6Nli1bwtXVFZMn\nT67QZ2jbti3Gjx+PLl26wNbWFmfPnlU1iQFAz5490a5dO9ja2qJp06aq4xfdX2mfJyoqCo8ePYKt\nrS2GDBmCAQMGoHbt2iVu6+joiG3btmHu3LmwtraGt7c3kpKSSjxm0eMGBQUhKCgIrVq1grOzM8zN\nzYs1xT3/3jlz5mD16tVo2LAhRowYgdDQUNU2FhYW2L17NzZv3ozmzZujVatWiImJAQD069cPgLL5\nzNfXF4CyKU4ul6uq0Pr164c7d+6UGnfhc5mZmRgxYgQaN24MZ2dn2NjYYOLEiQCA4cOH4/z587Cy\nskJISEix8zVu3Di888476N27NywtLREeHo6cnByN3wvSnk4HuA0bNgxbt25F06ZN1W4ti/rPf/6D\n7du3o169eli+fLmqBI5IkxYtWmDJkiV4+eWXpQ6l3D799FPcvXsXy5YtkzoUIo10escwdOhQ7Nix\nQ+Pr27Ztw9WrV3HlyhUsXrwYo0eP1mU4RHp36dIlJCUlQQiB2NhYLF26VG3cCJEhMtXlzv39/XH9\n+nWNr2/atAlDhgwBAHTq1AkPHz7E33//rVZ9QVSdZWVlYcCAAbh16xaaNWuGCRMmIDg4WOqwiEql\n08RQlrS0tGKlfKmpqUwMVKpr165JHYLWfH19ceXKFanDICoXyTufn+/i0KYMkIiIdEfSOwZ7e3u1\nuu7U1FS1WvVCLi4uFarvJiKqyVq2bFmhsR6SJobg4GBERUUhNDQUx48fR6NGjUpsRkpOTtaqbrwm\niIiIQEREhNRhGASei2eKnouCAqAi5fznzgFFZxdfuhS4dw9o1Ej7faSmApaWwD9DTrSSkwP07Am4\nuysfd+wI2Nlp//7n1dSfi4SEBNVU5YsXL4adnV2FW2B0mhgGDBiAAwcO4P79+3B0dMS0adNUA4VG\njhyJV199Fdu2bYOLiwvq16/PEj4iAImJwN276s9t3gzcvAmYaviNPX8eKKwIP3RIeUH/ZyYSraWn\nKy/oPXooH/fvDwwaBJR3iQ87O+CfMWqkJ9999x2+/vprzJkzB4MHD650k7xOE8OaNWvK3CYqKkqX\nIRDpVFYWUJllHR4/BlatAv7+G5g/H7C3B65fB7p0ARo0eLbd06dAUBDg5lbyftauVV7IAWDgQKBz\nZ+W+qGbo2LEjEhMTYVeZW60iJG1KovILCAiQOgSDURXn4ulT4PhxoGhL5ZMnyou0hUXp7719Gzh8\nuHxNLc97+FD5l/3w4cDXXwMhIUDdukDz5uXbj7V1APijoVQTf0eKjtavCtViaU+ZTMY+BipTXp6y\nvRpQXuhXrlRe+DWJjAQK1xoqbD4BALkcqF0bGDWq7GO2aKFsEycyRBW9djIxkEHLywOiopR/1ZfV\nbl3YctmgAaBQKJPC+PGaty8oAEaOVDa5FG22ITJEcrkcM2fORKNGjfDf//5Xq/cwMZBBUiiAEyeA\nfybnVPnpJ+VFv1690t//11/K7YYPB8qaGsnUFHjzTeVf+0TGpKSKI20wMZDBePgQmDULSElRdooq\nFMpql6JNLnl5QHi4sj29LM2aAVZWuouXyFAV3iUsXLiwQhVHFb12svOZt
JKbCyxZorzIF/rxR2Vb\n/vMX93PnlP9OmgQsXw688UbZHblEVNzHH3+MmzdvVmnFkTZ4x0AlunJFWRe/YYOyaaZwiMmYMc+2\nkcuBESOAOnWKv79lS8DcXD+xEhmrrKwsNGjQoMLjEtiURBUmhPLrwgXg4EFlEjh5Ullx4+qqrI+X\nyZTt92zSIao+2JREWrtzRzl1wa+/AtnZylG1hWu3e3gAvr7KzuFOnaSNk6imkMvlyMrKgnV5h6vr\niOSzq5J+5OQAEycCLi7KwVO9egF//AG0awfMnKkceSsEkJSknCOHSYFIPxISEtCxY0csWLBA6lBU\neMdgxJKTlVM2fPstsHq18rmvvgJeew3w9OR8NkRSKqniyFAwMRipGTOAKVOeJYBz54A2bZR9BUQk\nraLjEvRdcaQNdj4bgaws4PXXgQMHns2+mZ8PTJsGfPmltLERUXHz5s2DjY1NlcyEWhpWJdUwiYnK\n/oCVK5XTMV++rJw2wsdH+bpMpnmKZiKqGZgYjNyxY8D//d+zpqAffgBatVI2Fb37LtC+vbK8lIio\nEBODEbtxA2jdGmjbFnjvPeVzZmbK2T/ZgUxkuBISEvDw4UP0KDp9rx5xHIMRyc9XThh36JBy1s/H\nj5Vz9m/aBDg4SB0dEZWlaMWRIZWhaouJwYAMH64cfXz2rLJDee/eZxPPNWjAiiKi6sDQK460wcQg\nodxc4NYt4LffgEuXlEs8/vqrcn2Atm0BW1upIySi8oiKisL06dOrbO1lqbCPQY8uXlSOJ7h+Hfj9\ndyAuTjkRnUymLCv19FQu7UhE1VNcXByaN29uMHcJ7Hw2UAcPAjt2KNfzBZSVRG5uyjUG3n4b8POr\n3JrBRESaMDEYoCdPgPr1gRdfBD78EBgyRNmJTESkD0wMBiQrC/jiC+VaxYByAJqjo7QxEVHVKKw4\nMjExwdSpU6UOp1QVvXZydtUqdPgwEBEBNGyoTAoLFyoXpGdSIDIOhTOhxsXFITw8XOpwdIaJoZKe\nPAFGj1Z2IPv7A0ePAuPHK9c0HjVKuzWNiciwyeVyTJ06FYGBgRg/fjw2b95sMB3MusBy1Ury8lIu\ng7lggXKls8aNpY6IiKrapEmTcOHChWo7LqG82MdQCWPHKpuMrlxRLoBDRMbp6dOnqFu3brUbl8Ap\nMfTg9m0gIQH48UflimgHDgA//8ykQGTszM3NpQ5Br9jHoIXMTGWVkZ0d8M47yqTw4YfKaa7ff1/q\n6Iioqsjlcty5c0fqMCTHpiQtODoCqalAZCTwySec0ZTIGBXOcfTaa68hMjJS6nCqBJuSdGTFCmVS\nOHMGcHeXOhoiqmqGvPayVJgYNEhKApKTgbAwYOJEJgUiY2QMM6HqApuS/nH6tHIcQuF6BxcuAB06\nAM7OwPr1Oj00EUnkl19+Qe3atav1TKil4ZQYlVSnDmBlBezbp3zcoAHg5KTTQxIR6RT7GCph82bl\n9NenTytnPSUiqslqdLnq48fAsGFAcDAwaBCTApGxSkhIwJYtW6QOo9rQaWLYsWMH3Nzc4Orqitmz\nZxd7/f79+wgKCoKXlxfc3d2xfPlyXYajcusWMHCgcpW0lSuVdwwrV+rl0ESkR0XnOMrOzpY6nGpD\nZ30MCoUCrVu3xp49e2Bvb4+OHTtizZo1aNOmjWqbiIgI5Obm4uuvv8b9+/fRunVr/P333zA1VW/h\nquo+hkaNlFNjR0Up7xbs7ats10RkIIpWHC1evLhGVhwZ3LTbsbGxcHFxgbOzM8zMzBAaGoqNGzeq\nbdO8eXNkZmYCADIzM2FtbV0sKVS1/Hzg0SPluITRo5kUiIzR4sWLa8xMqLqgs6twWloaHIssRODg\n4IATJ06obRMeHo6XX34ZdnZ2yMrKwrp163QVjkrhFBZFblyIyMi89NJLHJdQCTpLDNrUBEdGRsLL\nywsxMTFITk5Gr169cPr0aVhYWBTbN
iIiQvX/gIAABAQEVCiuP/8Edu1Srp9ARMapbdu2UocgiZiY\nGMTExFR6PzpLDPb29khJSVE9TklJgUPh6LF/HD16FJMmTQIAtGzZEi1atMClS5fg6+tbbH9FE0NF\nffmlshmpa9dK74qIDIQQwigHp1XE8380T5s2rUL70Vkfg6+vL65cuYLr169DLpdj7dq1CA4OVtvG\nzc0Ne/bsAQD8/fffuHTpEl588UVdhYQZM5RLb9avr7NDEJGeFFYcjR8/XupQjI7O7hhMTU0RFRWF\nwMBAKBQKDB8+HG3atMGiRYsAACNHjsQXX3yBoUOHon379igoKMA333yDxjpaAm3nTkAI5bxHRFS9\nPV9xRFWrRkyJkZkJWFoC//43sHt3FQZGRHpV0kyobEbSjFNilKKwme2fmxUiqqYiIyMRFxfHiiMd\nM/o7hvx85eR4kycDn35axYERkV7J5XKYmZnxLkFLvGPQoGdP5ZxI/ftLHQkRVVbt2rWlDqFGMOpJ\n9CZPBg4eBDZtUq6rQETVg1wux82bN6UOo8Yy2sRw7hwwc6ZyneY+faSOhoi0lZCQgI4dO+L777+X\nOpQay2j7GIYMAY4dAy5f1lFQRFSlWHFU9djH8Jw//gBmzZI6CiLSBtdeNixGecewezfQu7dy+ouG\nDXUYGBFViXXr1iEnJ4d3CVWMaz4XYW8PuLoCVTCXFBFRtcWmpH9s26ZcoW3nTqkjISKqnozujsHR\nEfDyUi7XSUSGJSEhAZcuXUJoaKjUodQIBreCmxQOHgRSU4HvvpM6EiIqqujaywUFBVKHQ2Uwqqak\nTZuAN94AXFykjoSICrHiqPoxqjuGbduALl2kjoKICi1fvpxrL1dDRtPHkJ4O2NgAt28DtrZ6CoyI\nSvXXX3+hbt26TAgSqfFVSStXAtbWTApEhkSXKzKS7hhNU9K8ecr+BSKSRjVofCAtGUVTUl4eULs2\nkJwM8A8UIv0qnOMoLS0Nv/zyi9ThUBE1uikpPV35L5MCkX5x7WXjpHVT0pMnT3QZR6XExyunwSAi\n/Sg6LoEVR8anzMRw9OhRtG3bFq1btwYAJCYm4oMPPtB5YOURGwu0bSt1FEQ1x08//aRae/m9997j\nxHdGpsw+Bj8/P6xfvx5vvPEGEhISAADt2rXDuXPn9BIgUHY72TvvAK1bA199pbeQiGq0/Px81KpV\niwnBwOl0SgwnJye1x6amhtU1cfs20LSp1FEQ1RympqZMCkaszMTg5OSEI0eOAFC2K86ZMwdt2rTR\neWDayskBDh8GAgKkjoTI+Mjlcly5ckXqMEjPykwMCxcuxPz585GWlgZ7e3skJCRg/vz5+ohNKwcO\nKP/18JA2DiJjU7j28neclbLGKbNN6PLly1i9erXac0eOHEG3bt10FlR5REez45moKsnlcsyYMQP/\n+9//MHfuXAwaNEjqkEjPyrxjGDNmjFbPSeXyZWDECKmjIDIOCQkJ8PX1RUJCAhITE7nUZg2l8Y7h\n2LFjOHr0KO7du4d58+aperazsrIMaj71w4eBKVOkjoLIONy5cwcTJ07EoEGDmBBqMI2JQS6XIysr\nCwqFAllZWarnGzZsiPXr1+sluLI8egQIAfj4SB0JkXF45ZVXpA6BDECZ4xiuX78OZ2dnPYVTMk21\nuFFRwNixyuRARETqdDZXUr169TBhwgScP38eT58+VR1s37595Y+yiqWkAOHhUkdBVP3Ex8cjPj4e\n77//vtShkAEqs/P53XffhZubG/766y9ERETA2dkZvr6++oitTGfPAq6uUkdBVH3I5XJ8+eWXCAoK\ngrm5udThkIEqsynJx8cH8fHx8PT0RFJSEgDA19cXp06d0kuAQMm3QwUFQK1awK5dQK9eeguFqNqK\nj49HWFgYXnjhBSxatIiT3tUAOpsSo3bt2gAAW1tbbNmyBfHx8cjIyCh/hFVs0yblv927SxsHUXWw\na
tUqBAUFYeLEidi0aROTApWqzDuGzZs3w9/fHykpKRg7diwyMzMRERGB4OBgfcVYYtYbNUo5R9LG\njXoLg6jaunXrFgAwIdQwOrtjeP3119GoUSN4eHggJiYG8fHxsNVyYeUdO3bAzc0Nrq6umD17donb\nxMTEwNvbG+7u7ggox4RHmZlAq1Zab05Uo9nZ2TEpkNY0ViUVFBTgjz/+QHJyMtzd3fHqq6/i1KlT\n+OKLL3D37l0kJiaWumOFQoExY8Zgz549sLe3R8eOHREcHKw2Ad/Dhw/x4YcfYufOnXBwcMD9+/e1\nDjwtDXjrLa03J6oxCgoKYGJiNMu5kwQ0/vSMGDECCxYsQEZGBmbMmIG33noLQ4YMwQcffKBal6E0\nsbGxcHFxgbOzM8zMzBAaGoqNz7X7rF69Gm+99RYcHBwAADY2NloFnZoKHDzIVduIiiqsOBo4cKDU\noVA1p/GO4fjx40hKSoKJiQlycnJga2uL5ORkWFtba7XjtLQ0ODo6qh47ODjgxIkTattcuXIFeXl5\n6NGjB7KysvDRRx9h8ODBZe57wwblv507axUKkdErrDhycnLi2stUaRoTg5mZmep2tG7dumjRooXW\nSQGAVvOs5OXlIT4+Hnv37sWTJ0/QpUsXdO7cGa5lDE4wMwP4RxGR8i5h5syZWLhwIebMmcNJ76hK\naEwMFy9ehEeRRQ6Sk5NVj2UymWpMgyb29vZISUlRPU5JSVE1GRVydHSEjY0NzM3NYW5uju7du+P0\n6dMlJoaIiAjV/0+fDoCDQ0CpxyeqCZYuXapae5mdyxQTE4OYmJhK70djuer169dLfWNZ8yfl5+ej\ndevW2Lt3L+zs7ODn54c1a9aodT5fvHgRY8aMwc6dO5Gbm4tOnTph7dq1aPvcAgvPl1w1bw5MmgQY\n0OzfRJIoKCiATCbjXQKVqMrnSqrsxHmmpqaIiopCYGAgFAoFhg8fjjZt2mDRokUAgJEjR8LNzQ1B\nQUHw9PSEiYkJwsPDiyWFkty/D/TuXanwiIwCq49IF8oc4GYIima9nBzA3BzIzgbq1ZM4MCI9KVx7\nuV27dlKHQtWIzga4GZqFC5X/MilQTZGYmAg/Pz/MmzdP6lCohtAqMTx58gSXLl3SdSxauXQJ6NdP\n6iiIdE8ul2Pq1Kno3bs3xo0bh19++UXqkKiGKDMxbNq0Cd7e3ggMDASgXBNWn/MkPS87m+MXyPgl\nJSXBz89PVXH03nvvsYOZ9KbMxBAREYETJ07AysoKAODt7Y2//vpL54Fpcu4c8MILkh2eSC8ePXqE\ncePGYfPmzSxDJb0rcwU3MzMzNGrUSO05KSshEhKA1q0lOzyRXvj7+8Pf31/qMKiGKvMK365dO6xa\ntQr5+flhLFNWAAAahElEQVS4cuUKxo4di65du+ojtmLu3VP+W2QoBBERVbEyE8NPP/2Ec+fOoU6d\nOhgwYAAaNmyI77//Xh+xFZOUBJiaKlduIzIGiYmJkv0+EWlSZmK4dOkSIiMjcerUKZw6dQozZ85E\n3bp19RFbMadOAeVYsoHIYBWtOCrPHGRE+lBmH8O4ceNw584d9OvXD/3794e7u7s+4irR2bNAixaS\nHZ6oSiQmJiIsLAwODg6c44gMUpl3DDExMdi/fz9sbGwwcuRIeHh44KuvvtJHbMXcu8dSVarefv/9\nd9W4BFYckaEq15QYZ86cwezZs7F27Vrk5eXpMi41hcO6u3UDpk0D/v1vvR2aqEqlp6cjNzeXCYH0\noson0St0/vx5rFu3DuvXr4e1tTX69+8v2dD8o0cBCwtJDk1UJdifQNVBmXcMnTt3RmhoKPr16wd7\nidbSlMlkKCgQMDEBHj8G6teXJAyiclEoFKjFEjqSUEXvGKrN7Ko3bwo4OQGGHy3VdIWrqp06dQpb\nt26VOhyqwaq8Kalfv36Ijo5WW8Wt6MHKWsGtqqWmAkWWkCYySEU
rjn7++WepwyGqEI2J4YcffgAA\nbNmypVjGkWIyr/v3AfbXkaHi2stkTDSWqxZWTSxYsADOzs5qXwsWLNBbgIXu3gUaN9b7YYm0Eh0d\nzZlQyWiUOY5h165dxZ7btm2bToIpTU4O8M8Er0QGZ+DAgRyXQEZDY1PSwoULsWDBAiQnJ6v1M2Rl\nZaFbt256Ca6o8+cBBwe9H5ZIK7xDIGOisSrp0aNHyMjIwGeffYbZs2er+hksLCz0Xostk8nQvbtA\nz57Al1/q9dBEauRyOc6ePQsfHx+pQyEqU5Wv+SyTyeDs7Iz58+fDwsICDRs2RMOGDSGTyfDgwYNK\nBVsReXmAl5feD0ukwrWXqabQ2JQ0YMAAbN26FR06dCjxNvnatWs6Dex5Fy4AzZrp9ZBEAEquOCIy\nZtVmgBsgkJEBPLeYHJFOnTlzBoMHD4aDgwMWL17MzmWqVqq8KanQkSNH8PjxYwDAypUrMW7cONy4\ncaP8EVaSiQmTAumfQqHgTKhU45R5x+Dh4YHTp0/jzJkzCAsLw/DhwxEdHY0DBw7oK0bIZDJYWAhk\nZurtkERE1Z7O7hhMTU1hYmKCP//8Ex9++CHGjBmDrKysCgVZGfn5ej8kEVGNVGZisLCwQGRkJH77\n7Tf06dMHCoVCr2sxFOIklaRLiYmJki1ARWRoykwMa9euRZ06dbB06VLY2toiLS0NEydO1Edsarik\nJ+lC0bWXX3jhBanDITIIWlUl3blzBydPnoRMJoOfnx+aNm2qj9hUZDIZgoIEtm/X62HJyBWdCZUV\nR2SMdNbHsG7dOnTq1AnR0dFYt24d/Pz8EB0dXaEgK0OC1isyYlu3buXay0QalHnH4OnpiT179qju\nEu7du4eePXvqdT0GmUyGwYMFfv1Vb4ckI5eVlYWsrCwmBDJqOlvzWQiBJk2aqB5bW1tX6ECVVbeu\n3g9JRszCwgIWXECcqERlJoagoCAEBgZi4MCBEEJg7dq1eOWVV/QRmxpOXkkVlZeXBzMzM6nDIKo2\ntOp83rBhAw4fPgwA8Pf3R9++fXUeWFEymQyffy4QGanXw1I1VzjHUUxMDGJiYjg1NtU4Vd6UdPny\nZUycOBFXr16Fp6cnvv32WzhIuCCCpaVkh6ZqqGjF0Zo1a5gUiMpBY1XSsGHD0KdPH/z+++/w8fHB\nf/7zH33GVQxbAkgbRcclsOKIqGI0JobHjx8jPDwcbm5umDhxYoWm2d6xYwfc3Nzg6uqK2bNna9zu\n5MmTMDU1xYYNG8p9DKKidu7cybWXiSpJY1NSTk4O4uPjASgrk54+fYr4+HgIISCTycpcwUqhUGDM\nmDHYs2cP7O3t0bFjRwQHB6NNmzbFtvv0008RFBRUaltYvXrl+VhUU/Xp0wd9+vRhQiCqBI2JwdbW\nFuPHj9f4eP/+/aXuODY2Fi4uLnB2dgYAhIaGYuPGjcUSw08//YS3334bJ0+eLHV/NjalvkwEgGsv\nE1UFjYkhJiamUjtOS0uDo6Oj6rGDgwNOnDhRbJuNGzdi3759qik3NOEkelSUXC7HqVOn0LVrV6lD\nITI6ZU6JUVHa/OX28ccfY9asWaqSqtKaknjHQIUK117+7rvvJBlsSWTsyhzgVlH29vZISUlRPU5J\nSSlW7hoXF4fQ0FAAwP3797F9+3aYmZkhODi42P5WrIjA3r3K/wcEBCAgIEBXoZOBKmntZTYdET1T\nOGansnS25nN+fj5at26NvXv3ws7ODn5+flizZk2xPoZCQ4cOxeuvv46QkJDiQcpkiI8X8PbWRaRU\nHZw/fx4DBw7kTKhE5aCz2VULCgqwcuVKTJ8+HQBw8+ZNxMbGlrljU1NTREVFITAwEG3btkX//v3R\npk0bLFq0CIsWLSp3oOxjqNlq167NcQlEelLmHcOoUaNgYmKCffv24eLFi3jw4AF69+6NU6dO6StG\nyGQyXL0q0LKl3g5JRFTt6Wx
21RMnTiAhIQHe/7TjNG7cmEt7EhEZsTKbkmrXrg2FQqF6fO/ePZiY\n6KyYSSNTnXWTkyFJTEzExIkTWW1EJKEyr/Bjx45F3759cffuXXzxxRfo1q0bPv/8c33EpoaJwbgV\nnePIw8ND6nCIajStqpIuXLiAvf/Uivbs2VNjZZGuyGQy3LsnOJbBSHHtZSLdqGgfQ5mJ4ebNmwCg\n2nlh3biTk1O5D1ZRMpkMDx4IWFnp7ZCkJ3v37sWAAQM4LoFIB3SWGNzd3VW/rDk5Obh27Rpat26N\nc+fOVSzSCpDJZHj8WKB+fb0dkvQkNzcX6enpvEsg0gGdVSWdPXtW7XF8fDzmz59f7gNVFquSjFOd\nOnWYFIgMTLnLi3x8fIpNhqcPTAzVX05OjtQhEJEWyrxjmDt3rur/BQUFiI+Ph729vU6DKgkTQ/VV\nOMfR1q1by5xFl4ikV2ZiePz48bONTU3Rp08fvPXWWzoNqiQSDJ2gKlC04mjTpk1MCkTVQKmJQaFQ\nIDMzU+2ugUgbnAmVqPrSmBjy8/NhamqKI0eOqJbzJNLWsWPHEB8fj8TERHYuE1UzGstVfXx8EB8f\nj1GjRuHWrVvo168f6v2z8LJMJitxemydBVnBkisiopqsystVC3eWk5MDa2tr7Nu3T+11fSYGIiLS\nH42J4d69e5g3bx7nraFSyeVyHDp0CD179pQ6FCKqIhoTg0KhQFZWlj5joWqmsOKoRYsW6NGjhySz\n7hJR1dPYx+Dt7Y2EhAR9x1Mi9jEYFlYcEVUPOpsSg6ioixcvIjQ0FA4ODqw4IjJSGu8Y0tPTYW1t\nre94SsQ7BsNx69Yt7N27F4MGDeJdApGB09nsqoaAiYGIqPwqeu1kbyEREalhYqASJSYmYtSoUSgo\nKJA6FCLSMyYGUlN07eWuXbuyH4GoBmJVEqkUnQmVFUdENRfvGAgAcPToUfTu3Rvjxo3D5s2bmRSI\najBWJREA5Uj3e/fuwdbWVupQiKiKsFyViIjUsFyVtJadnS11CERkwJgYapDCiiM/Pz8oFAqpwyEi\nA8XEUEMkJibCz88PcXFx2L17N2rVqiV1SERkoJgYjFzRcQmsOCIibXAcg5E7c+YMEhMTOS6BiLTG\nqiQiIiPFqiQiIqoSTAxGQi6XY8uWLVKHQURGgInBCBRWHC1evBj5+flSh0NE1ZzOE8OOHTvg5uYG\nV1dXzJ49u9jrq1atQvv27eHp6Ylu3bohKSlJ1yEZjecrjjZu3AhTU9YTEFHl6PQqolAoMGbMGOzZ\nswf29vbo2LEjgoOD0aZNG9U2L774Ig4ePAhLS0vs2LEDI0aMwPHjx3UZllG4evUq3n77bc6ESkRV\nTqd3DLGxsXBxcYGzszPMzMwQGhqKjRs3qm3TpUsXWFpaAgA6deqE1NRUXYZkNKytrfHJJ59wXAIR\nVTmdJoa0tDQ4OjqqHjs4OCAtLU3j9kuWLMGrr76qy5CMhpWVFQYOHMiFdIioyum0Kak8F639+/dj\n6dKlOHLkSImvR0REqP4fEBCAgICASkZHRGRcYmJiEBMTU+n96DQx2NvbIyUlRfU4JSUFDg4OxbZL\nSkpCeHg4duzYASsrqxL3VTQx1CSJiYmYM2cOli1bBjMzM6nDISID9vwfzdOmTavQfnTalOTr64sr\nV67g+vXrkMvlWLt2LYKDg9W2uXnzJkJCQvDbb7/BxcVFl+FUK0Urjnr37s1qIyLSG51ebUxNTREV\nFYXAwEAoFAoMHz4cbdq0waJFiwAAI0eOxPTp05GRkYHRo0cDAMzMzBAbG6vLsAwe114mIilxriQD\nk5CQgMDAQMyZMweDBw9m5zIRVRiX9jQSQgjcv38fTZo0kToUIqrmmBiIiEgNZ1ethh49eiR1CERE\nxTAxSKCw4sjHxwdyuVzqcIiI1DAx6FlCQgI6duyIuLg4HDp0CLVr15Y6JCIiNUwMelJ4lxAYG
IgJ\nEyZwjiMiMlgcNaUnycnJOHv2LMclEJHBY1USEZGRYlUSERFVCSaGKiaXyxEdHS11GEREFcbEUIUK\nK45+/fVX5ObmSh0OEVGFMDFUgecrjjZt2oQ6depIHRYRUYWwKqmSrl27hjfffBNOTk6sOCIio8Cq\npErKzs7Gli1b8M4773AmVCIyKJxEj4iI1LBclYiIqgQTg5YSEhIQEhKCnJwcqUMhItIpJoYyFK04\n6tu3L6uNiMjosSqpFAkJCQgLC2PFERHVKOx81uDSpUvw9/fH3LlzMWjQIFYcEVG1w6okHcjIyICV\nlZXej0tEVBWYGIiISA3LVSshPT1d6hCIiAxGjU4MhRVH3t7eePLkidThEBEZhBqbGApnQo2Pj8fx\n48dRr149qUMiIjIINS4xlDQTKstQiYieqXHjGG7fvo2LFy9yXAIRkQasSiIiMlKsSiIioiphtIlB\nLpdjxYoVvNMgIiono0wMhRVH69evZxkqEVE5GVViKKniqH79+lKHRURUrRhNVVJqaipee+01zoRK\nRFRJRlOVJJfLsWXLFvTt25czoRIRgZPoERHRcwyyXHXHjh1wc3ODq6srZs+eXeI2//nPf+Dq6or2\n7dsjISFBl+EQEZEWdJYYFAoFxowZgx07duD8+fNYs2YNLly4oLbNtm3bcPXqVVy5cgWLFy/G6NGj\ny9xvQkICXnnlFWRmZuoqdIMWExMjdQgGg+fiGZ6LZ3guKk9niSE2NhYuLi5wdnaGmZkZQkNDsXHj\nRrVtNm3ahCFDhgAAOnXqhIcPH+Lvv/8ucX9FK44GDhwICwsLXYVu0PhD/wzPxTM8F8/wXFSezqqS\n0tLS4OjoqHrs4OCAEydOlLlNamoqmjVrVmx/HTt2ZMUREZEe6CwxaFsZ9HzHiKb3jR8/HoMHD2bF\nERGRrgkdOXbsmAgMDFQ9joyMFLNmzVLbZuTIkWLNmjWqx61btxZ37twptq+WLVsKAPziF7/4xa9y\nfLVs2bJC12+d3TH4+vriypUruH79Ouzs7LB27VqsWbNGbZvg4GBERUUhNDQUx48fR6NGjUpsRrp6\n9aquwiQioufoLDGYmpoiKioKgYGBUCgUGD58ONq0aYNFixYBAEaOHIlXX30V27Ztg4uLC+rXr49l\ny5bpKhwiItJStRjgRkRE+mNQk+hxQNwzZZ2LVatWoX379vD09ES3bt2QlJQkQZT6oc3PBQCcPHkS\npqam2LBhgx6j0x9tzkNMTAy8vb3h7u6OgIAA/QaoR2Wdi/v37yMoKAheXl5wd3fH8uXL9R+kngwb\nNgzNmjWDh4eHxm3Kfd2sUM+EDuTn54uWLVuKa9euCblcLtq3by/Onz+vts3WrVvFK6+8IoQQ4vjx\n46JTp05ShKpz2pyLo0ePiocPHwohhNi+fXuNPheF2/Xo0UO89tprYv369RJEqlvanIeMjAzRtm1b\nkZKSIoQQ4t69e1KEqnPanIupU6eKzz77TAihPA+NGzcWeXl5UoSrcwcPHhTx8fHC3d29xNcrct00\nmDuGqh4QV51pcy66dOkCS0tLAMpzkZqaKkWoOqfNuQCAn376CW+//TaaNGkiQZS6p815WL16Nd56\n6y04ODgAAGxsbKQIVee0ORfNmzdXzY6QmZkJa2trmJoazWTSavz9/WFlZaXx9YpcNw0mMZQ02C0t\nLa3MbYzxgqjNuShqyZIlePXVV/URmt5p+3OxceNG1ZQqxjjWRZvzcOXKFTx48AA9evSAr68vVq5c\nqe8w9UKbcxEeHo5z587Bzs4O7du3xw8//KDvMA1GRa6bBpNCq3pAXHVWns+0f/9+LF26FEeOHNFh\nRNLR5lx8/PHHmDVrlmomyed/RoyBNuchLy8P8fHx2Lt3L548eYIuXbqgc+fOcHV11UOE+qPNuYiM\njISXlxdiYmKQnJyMXr164fTp0zV2Kp3yXjcNJjHY29sjJ
SVF9TglJUV1S6xpm9TUVNjb2+stRn3R\n5lwAQFJSEsLDw7Fjx45SbyWrM23ORVxcHEJDQwEoOx23b98OMzMzBAcH6zVWXdLmPDg6OsLGxgbm\n5uYwNzdH9+7dcfr0aaNLDNqci6NHj2LSpEkAgJYtW6JFixa4dOkSfH199RqrIajQdbPKekAqKS8v\nT7z44ovi2rVrIjc3t8zO52PHjhlth6s25+LGjRuiZcuW4tixYxJFqR/anIuiwsLCxO+//67HCPVD\nm/Nw4cIF0bNnT5Gfny+ys7OFu7u7OHfunEQR64425+K///2viIiIEEIIcefOHWFvby/S09OlCFcv\nrl27plXns7bXTYO5Y+CAuGe0ORfTp09HRkaGql3dzMwMsbGxUoatE9qci5pAm/Pg5uaGoKAgeHp6\nwsTEBOHh4Wjbtq3EkVc9bc7FF198gaFDh6J9+/YoKCjAN998g8aNG0scuW4MGDAABw4cwP379+Ho\n6Ihp06YhLy8PQMWvmxzgRkREagymKomIiAwDEwMREalhYiAiIjVMDEREpIaJgYiI1DAxEBGRGiYG\nMhi1atWCt7e36uvmzZsat23QoEGljxcWFoYXX3wR3t7e6NChA44fP17ufYSHh+PixYsAlNMwFNWt\nW7dKxwg8Oy+enp4ICQnB48ePS93+9OnT2L59e5Ucm2omjmMgg2FhYYGsrKwq31aToUOH4vXXX0dI\nSAh2796NCRMm4PTp0xXeX1XEVNZ+w8LC4OHhgfHjx2vcfvny5YiLi8NPP/1U5bFQzcA7BjJY2dnZ\n+Pe//40OHTrA09MTmzZtKrbN7du30b17d3h7e8PDwwOHDx8GAOzatQtdu3ZFhw4d8M477yA7O7vE\nYxT+XeTv769aW3zevHnw8PCAh4eHalbO7OxsvPbaa/Dy8oKHhweio6MBAAEBAYiLi8Nnn32Gp0+f\nwtvbG4MHDwbw7K4mNDQU27ZtUx0zLCwMGzZsQEFBASZOnAg/Pz+0b98eixcvLvOcdOnSBcnJyQCU\n00937doVPj4+6NatGy5fvgy5XI4vv/wSa9euhbe3N6Kjo5GdnY1hw4ahU6dO8PHxKfE8Eqmpqrk6\niCqrVq1awsvLS3h5eYmQkBCRn58vMjMzhRDKxVZcXFxU2zZo0EAIIcScOXPEzJkzhRBCKBQKkZWV\nJe7duye6d+8unjx5IoQQYtasWWL69OnFjhcWFqZa1GfdunWic+fOIi4uTnh4eIgnT56Ix48fi3bt\n2omEhASxfv16ER4ernrvo0ePhBBCBAQEiLi4OLWYno/xjz/+EEOGDBFCCJGbmyscHR1FTk6OWLRo\nkZgxY4YQQoicnBzh6+srrl27VizOwv3k5+eLkJAQMX/+fCGEEJmZmSI/P18IIcTu3bvFW2+9JYQQ\nYvny5WLs2LGq93/++efit99+E0IoF/Np1aqVyM7OLvF7QCSEAc2VRGRubq627GBeXh4+//xzHDp0\nCCYmJrh16xbu3r2Lpk2bqrbx8/PDsGHDkJeXhzfffBPt27dHTEwMzp8/j65duwIA5HK56v9FCSEw\nceJEzJgxA02bNsWSJUuwe/duhISEwNzcHAAQEhKCQ4cOISgoCBMmTMBnn32GPn364KWXXtL6cwUF\nBeGjjz6CXC7H9u3b8a9//Qt16tTBrl27cObMGaxfvx6AckGZq1evwtnZWe39hXciaWlpcHZ2xqhR\nowAADx8+xHvvvYerV69CJpMhPz9f9blEkRbiXbt2YfPmzZgzZw4AIDc3FykpKWjdurXWn4FqFiYG\nMlirVq3C/fv3ER8fj1q1aqFFixbIyclR28bf3x+HDh3Cli1bEBYWhnHjxsHKygq9evXC6tWrS92/\nTCbDnDlzEBISonpuz549ahdVIQRkMhlcXV2RkJCArVu3YvLkyejZsyemTJmi1eeoW7cuAgICsHPn\nTqxbtw4DBgxQvRYVF
YVevXqV+v7ChPn06VMEBgZi48aN6Nu3L6ZMmYKePXvijz/+wI0bN0pd43nD\nhg1GN/026Q77GMhgZWZmomnTpqhVqxb279+PGzduFNvm5s2baNKkCd5//328//77SEhIQOfOnXHk\nyBFVW3x2djauXLlS4jHEc7UX/v7++PPPP/H06VNkZ2fjzz//hL+/P27fvo26devi3XffxYQJE0pc\nUN3MzEz1V/vz+vfvj6VLl6ruPgAgMDAQCxYsUL3n8uXLePLkicbzYW5ujh9//BGTJk2CEAKZmZmw\ns7MDALUZMxs2bKjWCR4YGIgff/xR9VirxeCpRmNiIIPx/KpS7777Lk6dOgVPT0+sXLkSbdq0Kbbt\n/v374eXlBR8fH6xbtw4fffQRbGxssHz5cgwYMADt27dH165dcenSJa2O6e3tjbCwMPj5+aFz584I\nDw9H+/btcebMGXTq1Ane3t6YPn06Jk+eXGxfI0aMgKenp6rzuei+e/fujYMHD6JXr16qtYfff/99\ntG3bFj4+PvDw8MDo0aNLTCxF9+Pl5QUXFxesW7cOn3zyCT7//HP4+PhAoVCotuvRowfOnz+v6nye\nMmUK8vLy4OnpCXd3d0ydOlXzN4EILFclIqLn8I6BiIjUMDEQEZEaJgYiIlLDxEBERGqYGIiISA0T\nAxERqWFiICIiNUwMRESk5v8BRpKNl+tNjnEAAAAASUVORK5CYII=\n", "text": [ "" ] } ], "prompt_number": 62 }, { "cell_type": "markdown", "metadata": {}, "source": [ "To recalculate the R50 will need to know the size of the test set:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "len(ytest)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 63, "text": [ "48050" ] } ], "prompt_number": 63 }, { "cell_type": "markdown", "metadata": {}, "source": [ "So we can only tolerate a false positive rate of:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "r50fpr = 50.0/len(ytest)\n", "print r50fpr" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "0.00104058272633\n" ] } ], "prompt_number": 69 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then we can calculate the R50 for this:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "roccrv = pd.DataFrame(array([fpr,tpr]).T)\n", "r50 = auc(*zip(*roccrv[roccrv[0] ...in our random test set, there are approximately 50 positive items (because 1 in 600 pairs are interacting and we selected 30,000 pairs)...\n", "\n", "What's the problem then? does the test set have too many positive items in it? 
The training and test sets were shuffled in this example, so how many positive examples ended up in the test set?" ] }, { "cell_type": "code", "collapsed": false, "input": [ "sum(ytest)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 72, "text": [ "554" ] } ], "prompt_number": 72 }, { "cell_type": "markdown", "metadata": {}, "source": [ "That's a lot more than 50: 554 positives out of 48,050 is roughly 1 in 87, not the 1 in 600 the paper describes.\n", "\n", "The only other lead is that the paper says it uses an LR classifier that:\n", "\n", "> ...uses a ridge estimator for building a binary LR model.\n", "\n", "This is already a binary LR model, so I guess I should make sure it's using a ridge estimator, i.e. make sure it is regularised with an L2 penalty.\n", "\n", "I asked the machine learning professors about the R50 value and they had never heard of it." ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] } ], "metadata": {} } ] }