{ "metadata": { "name": "" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "Cross Validation: The Right and Wrong Way" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*For a real-world example that showcases the pitfalls of improper cross validation, see [this blog post.](http://followthedata.wordpress.com/2013/10/30/the-importance-of-proper-cross-validation-and-experimental-design/)*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The scenario\n", "\n", "You have 20 datapoints, each of which has 1,000,000 attributes. Each observation also has an associated $y$ value, and you are interested in whether a linear combination of a few attributes can be used to predict $y$. That is, you are looking for a model\n", "\n", "$$\n", "y_i \\sim \\sum_j w_j x_{ij}\n", "$$\n", "\n", "where most of the 1 million $w_j$ values are 0.\n", "\n", "## The problem\n", "\n", "Since there are so many more attributes than datapoints, the chance that a few attributes correlate with $y$ by pure coincidence is fairly high. 
\n", "\n", "You kind of remember that cross-validation helps you detect over-fitting, but you're fuzzy on the details.\n", "\n", "## The wrong way to cross-validate\n", "\n", "* Determine a few attributes of X that correlate well with y\n", "* Use cross-validation to measure how well a linear fit to these attributes predicts y" ] }, { "cell_type": "code", "collapsed": false, "input": [ "%matplotlib inline\n", "\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "\n", "from sklearn.feature_selection import SelectKBest, f_regression\n", "from sklearn.cross_validation import cross_val_score, KFold\n", "from sklearn.linear_model import LinearRegression" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's make the dataset and compute the y values with a \"hidden\" model that we are trying to recover." ] }, { "cell_type": "code", "collapsed": false, "input": [ "def hidden_model(x):\n", "    # y is a linear combination of columns 5 and 10...\n", "    result = x[:, 5] + x[:, 10]\n", "    # ... 
with a little noise\n", "    result += np.random.normal(0, 0.005, result.shape)\n", "    return result\n", "\n", "\n", "def make_x(nobs):\n", "    # nobs rows, one million uniformly random attributes each\n", "    return np.random.uniform(0, 3, (nobs, 10 ** 6))\n", "\n", "x = make_x(20)\n", "y = hidden_model(x)\n", "\n", "print(x.shape)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "(20, 1000000)\n" ] } ], "prompt_number": 15 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Find the 2 attributes in X that correlate best with y:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "selector = SelectKBest(f_regression, k=2).fit(x, y)\n", "best_features = np.where(selector.get_support())[0]\n", "print(best_features)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "[341338 690135]\n" ] } ], "prompt_number": 17 }, { "cell_type": "markdown", "metadata": {}, "source": [ "We know we are already in trouble -- we've selected 2 columns that correlate with y by chance, neither of which is column 5 or 10 (the only 2 columns that *actually* have anything to do with y). 
We can look at the correlations between these columns and Y, and confirm they are pretty good (again, just a coincidence):" ] }, { "cell_type": "code", "collapsed": false, "input": [ "for b in best_features:\n", " plt.plot(x[:, b], y, 'o')\n", " plt.title(\"Column %i\" % b)\n", " plt.xlabel(\"X\")\n", " plt.ylabel(\"Y\")\n", " plt.show()" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "display_data", "png": "iVBORw0KGgoAAAANSUhEUgAAAXwAAAEZCAYAAACU3p4jAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAG4RJREFUeJzt3XtwVOXh//HPkmAuBBCICQRCGRcv3BNlGq+4lCYhRBys\n2JaxDBftOCoJyNSpcikRwXrpyCTp6HRodaReaHXqBVci8bIGrTZVwTroqLNyCZBioxJA2CSbnO8f\n/NwfS3ZDdsnZ3bPn/ZrZmew5J+d5Hp/xw5PnnPMch2EYhgAASa9fvCsAAIgNAh8AbILABwCbIPAB\nwCYIfACwCQIfAGyCwEfC8ng8ys/Pj3c1gKRB4MN0zzzzjKZOnaqBAwcqLy9Ps2bN0rvvvhvvakVk\n+vTpysnJ0aBBgzRu3Dht3Lgx5HGLFy9Wv3799NVXXwW2/f3vf9cVV1yhAQMGaPr06UHHf/PNN7ry\nyiuVnZ2twYMHq7CwUC+++GJg/+bNm3XxxRdr8ODBys7O1s9+9jMdPHgwsH///v2aPXu2hg0bphEj\nRqiiokKdnZ193HokCwIfpnrkkUd05513atWqVfr666/V1NSkO+64Qy+//HK8qxaRmpoaHThwQEeO\nHNGTTz6piooKff7550HHvPPOO/rqq6/kcDiCtg8bNkzLly/X3Xff3e28WVlZevzxx/X111+rtbVV\nVVVV+vnPf65jx45Jkq688ko1NDSotbVVe/fuVWZmppYvXx74/crKSmVnZ6u5uVk7d+7U22+/rUcf\nfdSE/wJIBgQ+TNPa2qo1a9bo0Ucf1Zw5c5SRkaGUlBSVl5frwQcflCS1tbVp2bJlGjlypEaOHKk7\n77xT7e3tIc93+sh54cKFWr16taST0z+jRo3Sww8/rJycHOXl5enFF1/Uq6++qgsvvFDDhg3TAw88\nEPjdH4J1wYIFGjRokCZOnKgPP/wwbFsmTZqk/v37B75nZWVp0KBBge9+v1+VlZWqra3V6Q+vz5gx\nQ3PnztWIESO6nTctLU0XXXSR+vXrp66uLvXr10/Z2dk655xzJEn5+fnKycmRJBmGoZSUlKDz7Nq1\nS7/4xS90zjnnKDc3VzNnztSuXbvCtgP2RuDDNO+99558Pp+uv/76sMesX79ejY2N+vjjj/Xxxx+r\nsbFR69at69X5HQ5H0Gj60KFDamtrU3Nzs9auXatbbrlFTz/9tHbs2KHt27dr7dq12rt3b+D4LVu2\naN68eWptbdV1112nJUuW9Fjetddeq4yMDLlcLj3++ONBwbthwwZdc801mjRpUq/qfrrJkycrIyND\nCxcu1AsvvBAIfOnkXw7nnnuuBg0apH379gX+sZSk0tJSPfPMMzpx4oQOHDigrVu3qqysLKo6wAYM\nwCRPPfWUMXz48B6PcTqdxtatWwPfX3vtNWPMmDGGYRjGW2+9ZYwaNSqwz+FwGF6vN/B94cKFxqpV\nqwLHZmRkGF1dXYZhGMaRI0cMh8NhNDY2Bo6/9NJLjZdeeskwDMNYs2aNUVxcHNi3a9
cuIyMj44xt\n8vv9xnPPPWcMGTLE2Lt3r2EYhrFv3z5j7NixxpEjR0LW8wcbN240XC5X2HO3tbUZNTU1xsiRI42j\nR49223/gwAGjuLjYqKysDGz75ptvjMLCQiM1NdVwOBzGokWLztgG2BcjfJhm2LBhamlpUVdXV9hj\nDh48qB/96EeB76NHjw66KBlpeT+M+DMyMiRJubm5gf0ZGRmBufHT92VmZsrn8/VYV0lKSUnR3Llz\nVVRUpBdeeEGStGzZMv3ud7/TwIEDA9M5RhRrEp5zzjmqqKjQwIED9cYbb3Tbn5eXp/vuu0+bNm0K\nlFFaWqobb7xRx48fV0tLi7799lv99re/jbhs2AOBD9NcfvnlSktLCwRjKHl5edqzZ0/g+759+5SX\nlxfy2MzMTB0/fjzwvbm5udsF0ljp6OjQgAEDJElvvvmm7rrrLo0YMSJQ98svv1ybN28O+p3e1tXv\n9wfOHarczMxMSVJLS4s+/PBDLVmyRP3799fQoUO1cOFCvfrqq9E2C0mOwIdpBg8erLVr1+qOO+7Q\nSy+9pOPHj6ujo0Nbt24NjELnzZundevWqaWlRS0tLVq7dq3mz58f8nwFBQV6+umn1dnZqbq6OjU0\nNMSkHZ9//rm2bt2qEydOqKOjQ0899ZQ++OADlZSUSJK+/PJL/ec//9HHH3+snTt3SpJeeeUVzZkz\nR5LU1dUln8+njo4OdXV1qa2tTR0dHZKkf/3rX3rnnXfU3t6uEydO6MEHH5TP59Nll10mSXr66afV\n1NQkSdq7d69WrlypG264QZKUnZ2tESNG6LHHHlNnZ6cOHz6sJ598UlOmTInJfxdYD4EPUy1fvlyP\nPPKI1q1bp5ycHI0ePVqPPvpo4ELuqlWrNHXqVE2ePFmTJ0/W1KlTtWrVqsDvnzoqrq6u1pYtWzRk\nyBA988wz3S4Gnz6C7mlEffoF356ONwxD9957r3JzczV8+HD9+c9/ltvt1ujRoyWdDN6cnBzl5OQo\nNzdXDodD2dnZSk9PlyRt2rRJmZmZuv3227V9+3ZlZGTo1ltvlXTyLqUlS5YoOztbo0ePVkNDg+rq\n6pSVlSVJ+uyzz3TFFVcoKytLLpdLl19+uR566KFAff/xj39oy5Ytys7O1gUXXKC0tDRt2LAhbLth\nbw4jmsnGXjp8+LBuueUW7dq1Sw6HQ48//nhg5AIAiK1UM0++dOlSzZo1S88//7z8fr++//57M4sD\nAPTAtBF+a2urCgsLgx6UAQDEj2lz+Lt379Z5552nRYsW6ZJLLtGvf/3roDssAACxZVrg+/1+ffTR\nR7r99tv10UcfacCAAUGPtgMAYsysJ7qam5sDT0wahmFs377dKC8vDzrG6XQakvjw4cOHTwQfp9MZ\nVS6bNsIfPny48vPz9cUXX0iSXn/9dU2YMCHoGK/XK8MwkvazZs2auNeB9tE+O7YvmdtmGIa8Xm9U\nuWzqXTq1tbW66aab1N7eLqfTqSeeeMLM4gAAPTA18KdMmaJ///vfZhYBAOglnrQ1kcvlincVTEX7\nrC2Z25fMbTsbpj5pe8bCHQ7Fo3i3u0E1NdvU1paqtDS/KitLVF4+Leb1AIBoRJudpk7pJCK3u0FL\nl74mr3d9YJvXu1KSCH0ASc12Uzo1NduCwl6SvN71qq2tj1ONACA2bBf4bW2h/6jx+VJiXBMAiC3b\nBX5amj/k9vT0zhjXBABiy3aBX1lZIqdzZdA2p3OFKiqK41QjAIgN296lU1tbL58vRenpnaqoKOaC\nLQDLiDY7bRn4AGBl0Wan7aZ0AMCuCHwAsAkCHwBsgsAHAJsg8AHAJgh8ALAJAh8AbILABwCbIPAB\nwCYIfACwCQIfAGyCwAcAmyDwAcAmCHwAsAkCHwBsgsAHAJsg8AHAJgh8ALAJAh8AbCLV7ALGjBmj\nQYMGKSUlRf3791djY6PZRQIAQjA98B0Ohzwej4
YOHWp2UQCAHsRkSieat6sDAPqW6YHvcDj005/+\nVFOnTtXGjRvNLg4AEIbpUzrvvvuuRowYof/9738qLi7WxRdfrKuvvtrsYgEApzE98EeMGCFJOu+8\n83T99dersbExKPCrqqoCP7tcLrlcLrOrBACW4vF45PF4zvo8DsPECfbjx4+rs7NTAwcO1Pfff6+S\nkhKtWbNGJSUlJwt3OJjfB4AIRZudpo7wDx06pOuvv16S5Pf7ddNNNwXCHgAQW6aO8M9YOCN8AIhY\ntNnJk7YAYBMEPgDYBIEPADZB4AOATRD4AGATBD4A2ASBDwA2QeADgE0Q+ABgEwQ+ANgEgQ8ANkHg\nA4BNEPgAYBMEPgDYBIEPADZB4AOATRD4AGATBD4A2ASBDwA2QeADgE2kxrsCsCa3u0E1NdvU1paq\ntDS/KitLVF4+Ld7VAtADAh8Rc7sbtHTpa/J61we2eb0rJYnQBxIYUzqIWE3NtqCwlySvd71qa+vj\nVCMAvUHgI2JtbaH/MPT5UmJcEwCRYEonTqw8B56W5g+5PT29M8Y1ARAJAj8OrD4HXllZIq93ZVD9\nnc4VqqiYGcdaATgTh2EYRtwKdzgUx+LjprR0lbZtWxdi+2rV1d0XhxpFzu1uUG1tvXy+FKWnd6qi\notgS/1gBySDa7GSEHwfJMAdeXj6NgAcshou2ccAcOIB4MD3wOzs7VVhYqNmzZ5tdlGVUVpbI6VwZ\ntO3kHHhxnGoEwA5Mn9Kprq7W+PHjdfToUbOLsowfpkJqa1efMgc+kykSAKYy9aLt/v37tXDhQq1c\nuVKPPPKItmzZEly4TS/a9paVb90EYJ6EvGh755136uGHH9aRI0fMLCYpWf3WTQCJx7TAf+WVV5ST\nk6PCwkJ5PJ6wx1VVVQV+drlccrlcZlXJUsIvX7CawAdsxuPx9JijvWXalM6KFSv017/+VampqfL5\nfDpy5IhuuOEGbdq06f8XzpROWC5Xld5+u6rb9muuqZLH0307APuINjtNu0vn/vvvV1NTk3bv3q3N\nmzfrJz/5SVDYW5Xb3aDS0lVyuapUWrpKbneDKeVw6yaAvhazB68cDkesijJNLOfVWb4AQF9jaYUI\nxHpJBJYvABBKQt6lk2xivSQCyxcA6EssrRAB5tUBWBmBHwGWRABgZczhR4h5dQDxFm12EvgAYDEJ\ndx8+ACCxEPgAYBMEPgDYBPfhAzbBctsg8AEbYLltSEzpALYQfrnt+jjVCPFA4AM2EOtlQZCYCHzA\nBlgWBBKBD9gCy4JA4klbwDZYFiR5sLQCANgESysAAHpE4AOATRD4AGATBD4A2ASBDwA2QeADgE0Q\n+ABgEwQ+ANgEgQ8ANkHgA4BNEPgAYBMEPgDYhKmB7/P5VFRUpIKCAo0fP1733HOPmcUBAHpg+mqZ\nx48fV2Zmpvx+v6666ir94Q9/0FVXXXWycFbLBICI9flqmWVlZdq9e/dZVUqSMjMzJUnt7e3q7OzU\n0KFDz/qcAIDIhQ38xYsXq7S0VOvXr1dHR0fUBXR1damgoEC5ubmaPn26xo8fH/W5AADRC/1mY0k3\n3nijysrKtHbtWk2dOlXz58+Xw+GQdPLPieXLl/eqgH79+mnnzp1qbW1VaWmpPB6PXC5XYH9VVVXg\nZ5fLFbQPACB5PB55PJ6zPk/YwJek/v37KysrSz6fT0ePHlW/ftFf4x08eLDKy8v1wQcfhA18AEB3\npw+G77333qjOEzbw6+rqtHz5cs2ePVs7duwIzMVHoqWlRampqTr33HN14sQJ1dfXa82aNVFVFABw\ndsIG/vr16/Xcc89pwoQJUZ+8ublZCxYsUFdXl7q6ujR//nzNmDEj6vMBAKIX9rZMwzACc/amFc5t\nmbbkdjeopmab2tpSlZbmV2VlicrLp8W7WoBlRJudYUf4Zoc97MntbtDSpa/J610f2Ob1rpQkQh8w\nGUsrIKZqar
YFhb0keb3rVVtbH6caAfZB4COm2tpC/1Hp86XEuCaA/RD4iKm0NH/I7enpnTGuCWA/\nBD5iqrKyRE7nyqBtTucKVVQUx6lGMJPb3aDS0lVyuapUWrpKbndDvKtkaz0+eAX0tR8uzNbWrpbP\nl6L09E5VVMzkgm0S4gJ94jF9tcweC+e2TCBplZau0rZt60JsX626uvviUKPk0eerZQLA2eACfeIh\n8AGYggv0iYfAB2AKLtAnHubwgQSRjEtOuN0Nqq2tP+UCfbHl25QIos1OAh9IAKHuaHE6V6q6upSA\nRDcEPhAjZozEuaMFkejzxdMAdGfWveXc0YJY4KItEAGzFn87mztaeJoVvcUIH4iAWSPxysoSeb0r\nT5vDX6GKipk9/h5PsyISBD4QgZ5G4mcztx/tkhPh/+JYTeCjGwIfiEC4kfhll40665F2efm0iEOa\nuX9EgsAHIhBuJB6vkTZPsyISBD4QoVAj8YcffjPksWaPtKOd+4c9EfhAH4jXSJvlphEJHrwC+kDo\nJ2VXqLqa8EXf40lbIM5YNwaxQuADgE3wAhQAQI8IfACwCe7SAYBTJON7CX5A4APA/5PsaxOZOqXT\n1NSk6dOna8KECZo4caJqamrMLA4AzopZq6EmClNH+P3799eGDRtUUFCgY8eO6dJLL1VxcbHGjRtn\nZrEAEJVkX5vI1BH+8OHDVVBQIEnKysrSuHHjdPDgQTOLBICoJfvaRDG7S2fPnj3asWOHioqKYlUk\nAESksrJETufKoG0n1yYqjlON+lZMLtoeO3ZMc+fOVXV1tbKysoL2VVVVBX52uVxyuVyxqBIAdJOo\naxN5PB55PJ6zPo/pT9p2dHTo2muvVVlZmZYtWxZcOE/aAkDEEnJpBcMwtGDBAg0bNkwbNmzoXjiB\nDwARS8jAf+eddzRt2jRNnjxZDodDkvT73/9eM2eeXKubwAeAyCVk4J+xcAIfACLG4mkAgB6xtAIA\nJLBQa/tEi8AHgATV09o+0WAOH7CIZF7FEaGVlq7Stm3rQuyJLjsZ4QMWkOyrOCK0cGv7RIuLtoAF\nJPsqjggt3No+0SLwAQtI9lUcEVq4tX2ixZQOYAHJvoojQgu3ts+11/4+qvNx0RawgFBz+E7nClVX\nx39hL8QeT9oCSc7tblBtbf0pI71iwr6PWO0OKAIfgK30VUiH/utppaqrSxM29KPNTubwAVhOX96m\nGv4OqNUJG/jR4i4dAJbTl7ep2ukOKAIfgOX0ZUjb6Q4oAh+A5fRlSCf7e2xPxRw+AMuprCyR17uy\n222qFRUzIz5Xor7H1gzcpQPAkux8myq3ZQKATfDGKwBAjwh8ALAJAh8AbILABwCbIPABwCYIfACw\nCR68AoAIWW055R8Q+AAQASu/UJ4pHQCIgJVfKE/gA0AErLycsqmBv3jxYuXm5mrSpElmFgMAMWPl\n5ZRNDfxFixaprq7OzCIAIKasvJyyqRdtr776au3Zs8fMIgAgpqy8nDJ36QBAhMrLp1ki4E/HRVsA\nsIm4j/CrqqoCP7tcLrlcrrjVBQASkcfjkcfjOevzmP4ClD179mj27Nn65JNPuhfOC1AAIGIJ+QKU\nefPm6YorrtAXX3yh/Px8PfHEE2YWBwDoAa84BACLScgRPgAgccT9oi0AJCqrrooZDoEPACFYeVXM\ncJjSAYAQrLwqZjgEPgCEYOVVMcMh8AEgBCuvihkOgQ8AIVh5VcxwuA8fAMJwuxtUW1t/yqqYxQlx\nwTba7CTwAcBiePAKANAjAh8AbILABwCbIPABwCYIfACwCQIfAGyCwAcAmyDwAcAmCHwAsAkCHwBs\ngsAHAJsg8AHAJgh8ALAJAh8AbILABwCbIPABwCYIfACwidCvZQcAC3O7G1RTs01tbalKS/OrsrIk\nIV5NGG8EPoCk4nY3aOnS1+T1rg9s83pPvozc7qFv6pROXV2dLr74Yl1wwQV6
8MEHzSwKACRJNTXb\ngsJekrze9aqtrY9TjRKHaYHf2dmpJUuWqK6uTp9++qmeffZZffbZZ2YVl5A8Hk+8q2Aq2mdtydq+\ntrZUSZ5u232+lJjXJdGYFviNjY0aO3asxowZo/79++uXv/ylXnrpJbOKS0jJ+j/UD2iftSVr+9LS\n/AoV+OnpnTGvS6IxLfAPHDig/Pz8wPdRo0bpwIEDZhUHAJKkysoSDRnyRtA2p3OFKiqK41SjxGHa\nRVuHw2HWqQEgrPLyaZo5c6y+/Xa1fL4Upad3qqJipu0v2EqSDJO89957RmlpaeD7/fffbzzwwANB\nxzidTkMSHz58+PCJ4ON0OqPKZYdhGIZM4Pf7ddFFF+mNN95QXl6efvzjH+vZZ5/VuHHjzCgOAHAG\npk3ppKam6o9//KNKS0vV2dmpm2++mbAHgDgybYQPAEgsMVlLpzcPYFVWVuqCCy7QlClTtGPHjlhU\nq8+cqX0ej0eDBw9WYWGhCgsLtW7dujjUMjqLFy9Wbm6uJk2aFPYYK/fdmdpn5b5ramrS9OnTNWHC\nBE2cOFE1NTUhj7Nq//WmfVbuP5/Pp6KiIhUUFGj8+PG65557Qh4XUf9FfVW2l/x+v+F0Oo3du3cb\n7e3txpQpU4xPP/006Bi3222UlZUZhmEY77//vlFUVGR2tfpMb9r31ltvGbNnz45TDc9OQ0OD8dFH\nHxkTJ04Mud/KfWcYZ26flfuuubnZ2LFjh2EYhnH06FHjwgsvTKr/93rTPiv3n2EYxvfff28YhmF0\ndHQYRUVFxvbt24P2R9p/po/we/MA1ssvv6wFCxZIkoqKinT48GEdOnTI7Kr1id4+YGZYdObs6quv\n1pAhQ8Lut3LfSWdun2Tdvhs+fLgKCgokSVlZWRo3bpwOHjwYdIyV+6837ZOs23+SlJmZKUlqb29X\nZ2enhg4dGrQ/0v4zPfB78wBWqGP2799vdtX6RG/a53A49M9//lNTpkzRrFmz9Omnn8a6mqaxct/1\nRrL03Z49e7Rjxw4VFRUFbU+W/gvXPqv3X1dXlwoKCpSbm6vp06dr/PjxQfsj7T/TV8vs7QNYp/8r\nbJUHt3pTz0suuURNTU3KzMzU1q1bNWfOHH3xxRcxqF1sWLXveiMZ+u7YsWOaO3euqqurlZWV1W2/\n1fuvp/ZZvf/69eunnTt3qrW1VaWlpfJ4PHK5XEHHRNJ/po/wR44cqaampsD3pqYmjRo1qsdj9u/f\nr5EjR5pdtT7Rm/YNHDgw8KdZWVmZOjo69O2338a0nmaxct/1htX7rqOjQzfccIN+9atfac6cOd32\nW73/ztQ+q/ffDwYPHqzy8nJ98MEHQdsj7T/TA3/q1Kn68ssvtWfPHrW3t+tvf/ubrrvuuqBjrrvu\nOm3atEmS9P777+vcc89Vbm6u2VXrE71p36FDhwL/Cjc2NsowjG5zcVZl5b7rDSv3nWEYuvnmmzV+\n/HgtW7Ys5DFW7r/etM/K/dfS0qLDhw9Lkk6cOKH6+noVFhYGHRNp/5k+pRPuAaw//elPkqRbb71V\ns2bN0quvvqqxY8dqwIABeuKJJ8yuVp/pTfuef/55PfbYY0pNTVVmZqY2b94c51r33rx58/T222+r\npaVF+fn5uvfee9XR0SHJ+n0nnbl9Vu67d999V0899ZQmT54cCIr7779f+/btk2T9/utN+6zcf83N\nzVqwYIG6urrU1dWl+fPna8aMGWeVnTx4BQA2wUvMAcAmCHwAsAkCHwBsgsAHAJsg8AHAJgh8ALAJ\nAh84RVNTk84//3x99913kqTvvvtO559/fuDebsDKCHzgFPn5+brtttt09913S5Luvvtu3XrrrRo9\nenScawacPR68Ak7j9/t16aWXatGiRfrLX/6inTt3KiUlJd7VAs6a6UsrAFaTmpqqhx56SGVlZaqv\nryfskTSY0gFC2Lp1q/Ly8vTJJ5/EuypA
nyHwgdPs3LlTr7/+ut577z1t2LBB//3vf+NdJaBPEPjA\nKQzD0G233abq6mrl5+frrrvu0m9+85t4VwvoEwQ+cIqNGzdqzJgxmjFjhiTp9ttv12effabt27fH\nuWbA2eMuHQCwCUb4AGATBD4A2ASBDwA2QeADgE0Q+ABgEwQ+ANgEgQ8ANkHgA4BN/B+Nr3PakD2x\nFQAAAABJRU5ErkJggg==\n", "text": [ "" ] }, { "metadata": {}, "output_type": "display_data", "png": "iVBORw0KGgoAAAANSUhEUgAAAXwAAAEZCAYAAACU3p4jAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAG3JJREFUeJzt3XtwVOUB9/Hf5lJiSAgiNCQkwrBSIaAEpKZWwEULAQKO\nFh2lCihafb0koK31EpCIxAt2ZJL0Mh1Rxwtqp844alcDsWUJKg5lDIwVZnC2ARJJUSqEIGzI5Xn/\n4HVfFhLYTXL2dr6fmZ3JnnNynufwwI9zzvOc5ziMMUYAgLiXEOkKAADCg8AHAJsg8AHAJgh8ALAJ\nAh8AbILABwCbIPAR1Twej3JzcyNdDSAuEPgIizfeeEOTJk1Senq6srOzNXv2bH3yySeRrlbIKioq\nNHLkSKWlpSkvL09fffWVf115ebmGDx+ujIwMzZ8/Xy0tLf51ra2tWrx4sTIyMpSVlaU1a9YE7Peu\nu+7S6NGjlZiYqFdeeSVg3VtvvaXRo0crIyNDgwcP1i9/+Uvt37/fv97lcum8885Tenq60tPTNWbM\nGIuOHrGOwIflnn/+eT3wwANatmyZvvnmGzU0NOi+++7Te++9F+mqhWTt2rV66aWX9MEHH+jo0aNy\nu90aPHiwJOmVV17R66+/rk8//VT79+/X8ePHVVxc7P/dsrIyeb1e7du3Txs3btTq1au1fv16//r8\n/Hz96U9/0sSJE+VwOALKvfLKK1VbW6vm5mbt3btXqampevDBB/3rHQ6H/vjHP6qlpUUtLS3atWuX\nxX8SiFkGsNDhw4dNWlqaefvtt7vdxufzmSVLlpjs7GyTnZ1tli5dalpbW40xxmzcuNHk5OT4t3U4\nHMbr9fq/L1q0yCxbtsy/7bBhw8zq1avNkCFDTFZWlnnnnXeM2+02o0aNMoMGDTJPP/20/3dXrFhh\nbrzxRrNw4UKTnp5uxo4da7Zt29ZlHTs6OkxOTo755z//2eX6efPmmeeee87//dNPPzUpKSnm+PHj\nxhhjsrOzTU1NjX/9448/bm6++eYz9jN58mTzyiuvdPtn1dLSYhYuXGiWLl3qX+ZyuczatWu7/R3g\nB5zhw1JbtmyRz+fT9ddf3+025eXl2rp1q3bs2KEdO3Zo69atWrVqVVD7dzgcAWfEBw4cUGtrq5qa\nmrRy5UrdeeedWrdunerq6rR582atXLlSe/fu9W///vvva/78+Wpubta1116r+++/v8tyGhsb9fXX\nX+uLL77QhRdeqJEjR6qsrEzm/81M4nA4/D9LUmdnp1pbW/XVV1/p0KFDampq0vjx4/3rL730Un35\n5ZdBHaMkffzxxxo4cKAGDBigffv26dlnnw1Y/+ijj2rIkCGaPHmyNm3aFPR+YS8EPiz1v//9T4MH\nD1ZCQvd/1d544w09/vjjGjx4sAYPHqwVK1botddeC7qMU4M2OTlZpaWlSkxM1E033aTvvvtOS5cu\nVf/+/ZWXl6e8vDzt2LHDv/2UKVM0c+ZMORwO3XrrrQHrTtXY2ChJqqmp0b///W9t3LhRb775pl58\n8UVJ0syZM7V27Vrt3btXzc3N/kA+duyYjh49KknKyMjw72/AgAEB9/jPZfLkyTp8+LAaGxuVnJys\nhx56yL/u2WefVX19vfbv36+77rpLc+fO1X/+85+g9w37IPBhqQsuuEAHDx5U
Z2dnt9vs379fw4cP\n93+/8MILAzolQy3vhzP+8847T5KUmZnpX3/eeef5A/j0dampqfL5fF3W9Yd9/e53v9OAAQM0fPhw\n3X333frggw8kSYsXL9b8+fPlcrl0ySWX6Oqrr5Yk5eTkKC0tTZJ05MgR//6am5uVnp4e8vFlZ2fr\nySef1Kuvvupfdvnll6t///5KTk7WwoULdeWVV/rrBZyKwIelrrjiCvXr10/vvPNOt9tkZ2drz549\n/u/79u1TdnZ2l9umpqbq2LFj/u9NTU1ndHJa4eKLL9aPfvSjM5b/ULbD4VBZWZnq6+u1b98+5eXl\nKScnR8OGDdP555+vrKwsbd++3f97O3bs0Lhx43pUl7a2NqWmpvbsQGBrBD4slZGRoZUrV+q+++7T\nu+++q2PHjqmtrU0ffvihHn74YUnS/PnztWrVKh08eFAHDx7UypUrtWDBgi73l5+fr3Xr1qmjo0PV\n1dWqra0Ny3Gkpqbqpptu0urVq3X06FE1NjbqhRde0Jw5cyRJhw4dktfrlTFGO3fu1G9+8xs9/vjj\n/t9fuHChVq1apcOHD2vXrl1au3atbrvtNv/6trY2/9XFiRMn5PP5/Leq3njjDTU0NEiS9u7dq9LS\nUs2bN0/SySuF9evXy+fzqb29XevWrdPmzZs1c+bMsPy5IMZEtMsYtrFu3TozadIk079/fzN06FAz\nZ84cs2XLFmPMyVE6JSUlJisry2RlZZklS5YEjNLJzc3172fbtm1m7NixJj093SxYsMD86le/MsuX\nL+9y27a2NpOQkGD27t3rXzZ58mSzbt06Y4wxZWVlZsGCBf519fX1JiEhwXR0dHR5DEeOHDE333yz\nSU9PN7m5uebJJ5/0r9u9e7e5+OKLTWpqqhk+fLhZs2ZNwO+2traaxYsXmwEDBpjMzMwz1l911VXG\n4XCYhIQE43A4jMPhMJs2bTLGGFNaWmpycnJM//79zYgRI8zDDz/sH/3z7bffmp/+9KcmPT3dDBw4\n0FxxxRXmo48+CqZJYEMOY6x7Acrhw4d155136ssvv5TD4dBLL72kn/3sZ1YVBwA4iyQrd75kyRLN\nnj1bb7/9ttrb2/X9999bWRwA4CwsO8Nvbm7WhAkTGB4GAFHCsk7b+vp6DRkyRLfffrsmTpyoX//6\n1wGjKwAA4WVZ4Le3t+vzzz/Xvffeq88//1z9+/fXM888Y1VxAIBzsao3uKmpyYwYMcL/ffPmzaao\nqChgG6fTaSTx4cOHD58QPk6ns0e5bNkZ/tChQ5Wbm6vdu3dLkj766CONHTs2YJsfxi3H62fFihUR\nrwPHx/HZ8fji+diMMfJ6vT3KZUtH6VRVVemWW27RiRMn5HQ69fLLL1tZHADgLCwN/PHjx+tf//qX\nlUUAAILE1AoWcrlcka6CpTi+2BbPxxfPx9Yblj5pe87CT5tDPFRud60qKzeotTVJ/fq1q6RkhoqK\npvZhDQEg+vQ0Oy29pWMlt7tWS5asl9db7l/m9ZZKEqEPAF2I2Vs6lZUbAsJekrzeclVV1USoRgAQ\n3WI28Ftbu7448fkSw1wTAIgNMRv4/fq1d7k8JaUjzDUBgNgQs4FfUjJDTmdpwDKn8zEVF0+PUI0A\nILrF/Cidqqoa+XyJSknpUHHxdDpsAcS9nmZnTAc+ANhRT7MzZm/pAABCQ+ADgE0Q+ABgEwQ+ANgE\ngQ8ANkHgA4BNEPgAYBMEPgDYBIEPADZB4AOATRD4AGATBD4A2ASBDwA2QeADgE0Q+ABgEwQ+ANgE\ngQ8ANkHgA4BNEPgAYBNJVhcwYsQIDRgwQImJiUpOTtbWrVutLhIA0AXLA9/hcMjj8WjQoEFWFwUA\nOIuw3NLpydvVAQB9y/LAdzgc+sUvfqFJkybphRdesLo4AEA3LL+l88knnygrK0vffvutpk+frtGj\nR2vKlClWFwsAOI3lgZ+VlSVJGjJkiK6/
/npt3bo1IPDLysr8P7tcLrlcLqurBAAxxePxyOPx9Ho/\nDmPhDfZjx46po6ND6enp+v777zVjxgytWLFCM2bMOFm4w8H9fQAIUU+z09Iz/AMHDuj666+XJLW3\nt+uWW27xhz0AILwsPcM/Z+Gc4QNAyHqanTxpCwA2QeADgE0Q+ABgEwQ+ANgEgQ8ANkHgA4BNEPgA\nYBMEPgDYBIEPADZB4AOATRD4AGATBD4A2ASBDwA2QeADgE0Q+ABgEwQ+ANgEgQ8ANkHgA4BNEPgA\nYBMEPgDYRFKkK4C+53bXqrJyg1pbk9SvX7tKSmaoqGhqpKsFIMII/DjjdtdqyZL18nrL/cu83lJJ\nIvQBm+OWTpyprNwQEPaS5PWWq6qqJkI1AhAtCPw409ra9UWbz5cY5poAiDbc0okz/fq1d7k8JaXD\nsjLpMwBiA4EfZ0pKZsjrLQ24reN0Pqbi4pmWlEefARA7HMYYE7HCHQ5FsPi45XbXqqqqRj5folJS\nOlRcPN2y8C0sXKYNG1Z1sXy5qquftKRMwO56mp2c4cehoqKpYTu7ps8AiB102qJXItFnAKBnLA/8\njo4OTZgwQXPnzrW6KERASckMOZ2lActO9hlMj1CNAHTH8ls6FRUVysvLU0tLi9VFIQJ+uHVUVbX8\nlD6DmXTYAlHI0k7bxsZG3XbbbSotLdXzzz+v999/P7DwOO60ZagiAKtEZaftAw88oOeee05Hjhyx\nspiow1BFANHIssD/+9//rh//+MeaMGGCPB5Pt9uVlZX5f3a5XHK5XFZVKWy6n95gOYEPIGQej+es\nORosy27pPPbYY3rttdeUlJQkn8+nI0eOaN68eXr11Vf/f+FxekvH5SrTpk1lZyy/6qoyeTxnLgeA\nUPQ0Oy0bpfPUU0+poaFB9fX1euutt3T11VcHhH08Y6jiydtahYXL5HKVqbBwmdzu2khXCbC9sD14\n5XA4wlVUxIV7eoNoQx8GEJ2YWsEi4ZzeINow3QJgragcpWNn4ZzeINow3QIQnZhaAX2OPgwgOhH4\n6HNMtwBEJ+7hwxJ27sMArNbT7CTwASDGRN04fABAdCHwAcAmCHwAsAnG4cc5pmkG8AMCP44xxQGA\nU3FLJ451P01zTYRqBCCSCPw4xhQHAE5F4McxpjgAcCoCP44xxQGAU/GkbZxjigMg/jC1AgDYBFMr\nAADOisAHAJsg8AHAJgh8ALAJAh8AbILABwCbIPABwCYIfACwCQIfAGyCwAcAmyDwAcAmCHwAsAlL\nA9/n86mgoED5+fnKy8vTo48+amVxAICzsHy2zGPHjik1NVXt7e2aPHmyfv/732vy5MknC2e2TAAI\nWZ/Pljlr1izV19f3qlKSlJqaKkk6ceKEOjo6NGjQoF7vEwAQum4Df/HixSosLFR5ebna2tp6XEBn\nZ6fy8/OVmZmpadOmKS8vr8f7AgD0XNdvuZZ04403atasWVq5cqUmTZqkBQsWyOFwSDp5OfHggw8G\nVUBCQoK2b9+u5uZmFRYWyuPxyOVy+deXlZX5f3a5XAHrAACSx+ORx+Pp9X66DXxJSk5OVlpamnw+\nn1paWpSQ0PM+3oyMDBUVFWnbtm3dBj4A4Eynnww/8cQTPdpPt4FfXV2tBx98UHPnzlVdXZ3/Xnwo\nDh48qKSkJA0cOFDHjx9XTU2NVqxY0aOKAgB6p9vALy8v19/+9jeNHTu2xztvamrSokWL1NnZqc7O\nTi1YsEDXXHNNj/cHAOi5bodlGmP89+wtK5xhmYhBbnetKis3qLU1Sf36taukZIaKiqZGulqwkZ5m\nZ7dn+FaHPRCL3O5aLVmyXl5vuX+Z11sqSYQ+oh5TKwAhqKzcEBD2kuT1lquqqiZCNQKCR+ADIWht\n7fqi2OdLDHNNgNAR+EAI+vVr73J5SkpHmGsChI7AB0JQUjJDTmdpwDKn8zEVF08P6vfd7loVFi6T\ny1Wm
wsJlcrtrragm0KWzPngFINAPHbNVVcvl8yUqJaVDxcUzg+qwpcMXkWb5bJlnLZxhmbCRwsJl\n2rBhVRfLl6u6+skI1Aixqs9nywTQt+jwRaQR+ECY0OGLSCPwgTDpbYcv0Fvcww8Bj9Sjt9zuWlVV\n1ZzS4Tudv0MIWU+zk8APUlcjLJzOUlVUFPIPFkBYEfgWY4RF5HGFBZzU55OnIRAjLCKLMexA79Fp\nG6RoHmFhh6c3mbQM6D3O8INUUjJDXm/paffwH1Nx8cwI1so+Z75cYQG9R+AHqTeP1Fup+zPf5RGv\nW1/qqyss+gFgZwR+CIqKpkZdONjlzLcvrrDscjUEdIfAj3HR3LfQl/riCssuV0NAdwj8GBetfQtW\n6O0Vll2uhoDuEPgxLlr7FqKRXa6GgO7w4BVso+unpR9TRQX/QSK28KQtEATmskE8IPABwCZ4AQoA\n4KwIfACwCUbpABHEk78IJwIfiBCe/EW4WXpLp6GhQdOmTdPYsWM1btw4VVZWWlkcEFOYARThZukZ\nfnJystasWaP8/HwdPXpUl112maZPn64xY8ZYWSwQE3jyF+Fm6Rn+0KFDlZ+fL0lKS0vTmDFjtH//\nfiuLBGIGT/4i3MI2SmfPnj2qq6tTQUFBuIoEolpJyQw5naUBy07OgzQ9QjVCvAtLp+3Ro0d1ww03\nqKKiQmlpaQHrysrK/D+7XC65XK5wVAmIOOZBQrA8Ho88Hk+v92P5k7ZtbW2aM2eOZs2apaVLlwYW\nzpO2ABCyqJxawRijRYsW6YILLtCaNWvOLJzAB4CQRWXgf/zxx5o6daouvfRSORwOSdLTTz+tmTNP\nztVO4ANA6KIy8M9ZOIEPACFj8jQAwFkxtQIABCnW5z4i8AEgCPEw9xH38AEgCIWFy7Rhw6ozlk+c\neJ8GDz4/rGf9Pc1OzvABIAjdzX20c2eLfL4/+r9H81k/nbYAEITu5j7y+YYHfI/mGU8JfAAIQldz\nH6Wk/B9JZ859FK0znnJLBwCC0NXcR9980666ujNv3UTrjKd02gJAD3U1csfpfEwVFdZOgseTtgCi\nTqyPWw+G212rqqqaU2Y8nR61o3QIfACS+j6cuz77LVVFRWHchX64MSwTQI9Z8VBR9+/sXU7gRwij\ndABY8kJ13tkbfQh8AJaEM+/sjT4EPgBLwpl39kYf7uEDUEnJDHm9pWcMLywuntnjffLO3ujDKB0A\nkiIzvBA9w7BMALAJ3ngFADgrAh8AbILABwCbIPABwCYIfACwCQIfAGyCB68AxCw7TL/clwh8ADHJ\nihk+4x23dADEJCtm+Ix3BD6AmMT0y6GzNPAXL16szMxMXXLJJVYWA8CGmH45dJYG/u23367q6mor\niwBgU0y/HDpLO22nTJmiPXv2WFkEAJti+uXQMUoHQMwqKppKwIeATlsAsImIn+GXlZX5f3a5XHK5\nXBGrCwBEI4/HI4/H0+v9WP4ClD179mju3Ln64osvziycF6AAQMii8gUo8+fP189//nPt3r1bubm5\nevnll60sDgBwFrziEABiTFSe4QMAokfEO20BwM7COeMngQ8AERLuGT+5pQMAERLuGT8JfACIkHDP\n+EngA0CEhHvGTwIfACIk3DN+Mg4fACLI7a5VVVXNKTN+Tj9nh21Ps5PAB4AYw4NXAICzIvABwCYI\nfACwCQIfAGyCwAcAmyDwAcAmCHwAsAkCHwBsgsAHAJsg8AHAJgh8ALAJAh8AbILABwCbIPABwCYI\nfACwCQIfAGyCwAcAm+j6lekAgKjhdteqsnKDWluTun3xeTB4xSEARDG3u1ZLlqyX11t+ytIofMVh\ndXW1Ro8erVGjRunZZ5+1sigAiEuVlRtOC/uesyzwOzo6dP/996u6ulo7d+7Um2++qV27dllVXFTy\neDyRroKlOL7YFs/HF0/H1trad3feLQv8rVu36qKLLtKIESOUnJysm2
++We+++65VxUWlePpL1xWO\nL7bF8/HF07H15p796SwL/K+//lq5ubn+7zk5Ofr666+tKg4A4lJJyQw5naV9si/LRuk4HA6rdg0A\ntlFUNFWSVFW1XD5folJSOrR+fQ93ZiyyZcsWU1hY6P/+1FNPmWeeeSZgG6fTaSTx4cOHD58QPk6n\ns0e5bNmwzPb2dl188cX6xz/+oezsbF1++eV68803NWbMGCuKAwCcg2W3dJKSkvSHP/xBhYWF6ujo\n0B133EHYA0AERfTBKwBA+IRlLp1gHsAqKSnRqFGjNH78eNXV1YWjWn3mXMfn8XiUkZGhCRMmaMKE\nCVq1alUEatkzixcvVmZmpi655JJut4nltjvX8cVy2zU0NGjatGkaO3asxo0bp8rKyi63i9X2C+b4\nYrn9fD6fCgoKlJ+fr7y8PD366KNdbhdS+/W4VzZI7e3txul0mvr6enPixAkzfvx4s3PnzoBt3G63\nmTVrljHGmM8++8wUFBRYXa0+E8zxbdy40cydOzdCNeyd2tpa8/nnn5tx48Z1uT6W286Ycx9fLLdd\nU1OTqaurM8YY09LSYn7yk5/E1b+9YI4vltvPGGO+//57Y4wxbW1tpqCgwGzevDlgfajtZ/kZfjAP\nYL333ntatGiRJKmgoECHDx/WgQMHrK5anwj2ATMTo3fOpkyZovPPP7/b9bHcdtK5j0+K3bYbOnSo\n8vPzJUlpaWkaM2aM9u/fH7BNLLdfMMcnxW77SVJqaqok6cSJE+ro6NCgQYMC1ofafpYHfjAPYHW1\nTWNjo9VV6xPBHJ/D4dCnn36q8ePHa/bs2dq5c2e4q2mZWG67YMRL2+3Zs0d1dXUqKCgIWB4v7dfd\n8cV6+3V2dio/P1+ZmZmaNm2a8vLyAtaH2n6WT48c7ANYp/8vHCsPbgVTz4kTJ6qhoUGpqan68MMP\ndd1112n37t1hqF14xGrbBSMe2u7o0aO64YYbVFFRobS0tDPWx3r7ne34Yr39EhIStH37djU3N6uw\nsFAej0culytgm1Daz/Iz/GHDhqmhocH/vaGhQTk5OWfdprGxUcOGDbO6an0imONLT0/3X5rNmjVL\nbW1t+u6778JaT6vEctsFI9bbrq2tTfPmzdOtt96q66677oz1sd5+5zq+WG+/H2RkZKioqEjbtm0L\nWB5q+1ke+JMmTdJXX32lPXv26MSJE/rrX/+qa6+9NmCba6+9Vq+++qok6bPPPtPAgQOVmZlpddX6\nRDDHd+DAAf//wlu3bpUx5ox7cbEqltsuGLHcdsYY3XHHHcrLy9PSpUu73CaW2y+Y44vl9jt48KAO\nHz4sSTp+/Lhqamo0YcKEgG1CbT/Lb+l09wDWX/7yF0nS3XffrdmzZ+uDDz7QRRddpP79++vll1+2\nulp9Jpjje/vtt/XnP/9ZSUlJSk1N1VtvvRXhWgdv/vz52rRpkw4ePKjc3Fw98cQTamtrkxT7bSed\n+/hiue0++eQTvf7667r00kv9QfHUU09p3759kmK//YI5vlhuv6amJi1atEidnZ3q7OzUggULdM01\n1/QqO3nwCgBsgpeYA4BNEPgAYBMEPgDYBIEPADZB4AOATRD4AGATBD5wioaGBo0cOVKHDh2SJB06\ndEgjR470j+0GYhmBD5wiNzdX99xzjx555BFJ0iOPPKK7775bF154YYRrBvQeD14Bp2lvb9dll12m\n22+/XS+++KK2b9+uxMTESFcL6DXLp1YAYk1SUpJWr16tWbNmqaamhrBH3OCWDtCFDz/8UNnZ2fri\niy8iXRWgzxD4wGm2b9+ujz76SFu2bNGaNWv03//+N9JVAvoEgQ+cwhije+65RxUVFcrNzdVDDz2k\n3/72t5GuFtAnCHzgFC+88IJGjBiha665RpJ07733ateuXdq8eXOEawb0HqN0AMAmOMMHAJsg8AHA\nJgh8ALAJAh8AbILABwCbIPABwC
YIfACwCQIfAGzi/wIkJKZiXlk1LQAAAABJRU5ErkJggg==\n", "text": [ "" ] } ], "prompt_number": 18 }, { "cell_type": "markdown", "metadata": {}, "source": [ "A linear regression on the full data looks good. The \"score\" here is the $R^2$ score -- scores close to 1 imply a good fit." ] }, { "cell_type": "code", "collapsed": false, "input": [ "xt = x[:, best_features]\n", "clf = LinearRegression().fit(xt, y)\n", "print(\"Score is \", clf.score(xt, y))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Score is 0.839859389027\n" ] } ], "prompt_number": 25 }, { "cell_type": "code", "collapsed": false, "input": [ "yp = clf.predict(xt)\n", "plt.plot(yp, y, 'o')\n", "plt.plot(y, y, 'r-')\n", "plt.xlabel(\"Predicted\")\n", "plt.ylabel(\"Observed\")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 26, "text": [ "" ] }, { "metadata": {}, "output_type": "display_data", "png": "iVBORw0KGgoAAAANSUhEUgAAAXcAAAEPCAYAAAC5sYRSAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3X1UVHX+B/D3CIoJqLlpmpi0s6YMqICFuz7QqMtgErWW\nR1dtUzQsO4K5v/WcTTTZTG3XLQ9ge8x+1W5Z9uAef+06RmA5QppaQptH3agR1id0fQiVhwFmuL8/\nRmFGB2YG7p37MO/XOXMOc+fhfgf143s+93u/VycIggAiItKUbnIPgIiIxMfiTkSkQSzuREQaxOJO\nRKRBLO5ERBrE4k5EpEGSFveamhrMmDEDMTExMBgMOHDggJS7IyKi60KlfPOlS5di2rRp2L59O+x2\nO+rq6qTcHRERXaeT6iSmK1euICEhASdOnJDi7YmIqAOStWUqKyvRv39/ZGRkIDExEZmZmaivr5dq\nd0RE5EKy4m6321FWVoZnnnkGZWVlCA8Px0svvSTV7oiIyJUgkerqaiE6Orr1fmlpqZCWlub2HL1e\nLwDgjTfeeOPNj5ter/dagyVL7gMHDsSQIUNQUVEBANi9ezdiY2PdnmO1WiEIgmZvq1evln0M/Hz8\nfMH4+bT82QRBgNVq9VqDJZ0tU1BQgLlz56KpqQl6vR5vvfWWlLsjIqLrJC3uo0ePxldffSXlLoiI\nyAOeoSoho9Eo9xAkxc+nblr+fFr+bL6SbJ67TzvX6SDj7olIRczmEuTnF6GxMRRhYXZkZ5uQlpYs\n97Bk4UvtlLQtQ0QkBrO5BEuXfgqrdW3rNqs1BwCCtsB7w7YMESlefn6RW2EHAKt1LQoKimUakUia\nmoDLlyV5axZ3IlK8xkbPTQabLSTAIxHRRx8BYWHAwIGSvD3bMkSkeGFhdo/be/Z0BHgkIrDZgDvv\nBK5eBcaMAQ4dkmQ3TO5EpHjZ2Sbo9Tlu2/T6FcjKSpFpRJ300UfAbbc5C/vevcDXX
wPdpCnDnC1D\nRKpgNpegoKAYNlsIevZ0ICsrRT0HUz2l9S4UdV9qJ4s7EZGUPvoImDnT+fPevUBy1/9D4lRIIiK5\niJzW/cWeOxGR2ALYW28PkzsRkVhkTuuumNyJiMSggLTuismdiKgrFJTWXck/AiIitVJYWnfF5E5E\n5C+FpnVXyhoNEZHSKTitu2JyJyLyhQrSuivljoyISClUktZdMbkTEbVHZWndlTpGSUQUaCpM666Y\n3ImIXKk4rbtS34iJiKSi8rTuismdiEgjad2VukdPRNRVGkrrrpjciSg4aTCtu9LOJyEi8pVG07or\nJnciCh4aT+uuJC/u0dHR6N27N0JCQtC9e3ccOnRI6l0SEd1KgmuZKpnkxV2n08FisaBfv35S74qI\n6FZBlNZdBeQTertKNxGRJIKgt94enSBx5f3pT3+KPn36ICQkBE899RQyMzPbdq7TsfATkfg0ntZ9\nqZ2St2X27duHQYMG4cKFC0hJScGIESMwceJEqXdLRMEqyHrr7ZG8uA8aNAgA0L9/f0yfPh2HDh1y\nK+65ubmtPxuNRhiNRqmHRERapOG0brFYYLFY/HqNpG2Z+vp6OBwOREZGoq6uDiaTCatXr4bJZHLu\nnG0ZIhJDkKV12dsy58+fx/Tp0wEAdrsdc+fObS3sRERdpuG03lWSH1DtcOdM7kTUWUGW1l3JntyJ\niETHtO4T/kaISD2CeN66v5jciUj5mNb9xt8OESkb03qnMLkTkTIxrXcJf1NEpDxM613G5E5EysG0\nLhr+1ohIGZjWRcXkTkTyYlqXBH+DRCQfpnXJMLkTUeAxrUuOv00iCiym9YBgcieiwGBaDyj+ZolI\nekzrAcfkTkTSYVqXDX/LRCQNpnVZMbkTkbiY1hWBv3EiEg/TumIwuRNR1zGtKw5/+0TUNUzrisTk\nTqRiZnMJ8vOL0NgYirAwO7KzTUhLC9CFopnWFY3FnUilzOYSLF36KazWta3brNYcAJC+wH/0ETBz\npvPnvXuB5AD9h0I+0wmCIMi2c50OMu6eSNVSU1eiqOhFD9tXobBwjTQ7ZVpXBF9qJ/9UiFSqsdHz\nF2+bLUSaHbK3ripsyxCpVFiY3eP2nj0dt2zrUm/eNa3fdx9w8CCLugqwuBOpVHa2CVZrjlvPXa9f\ngaysqW7P61Jv3rW3XlICTJwozuBJcuy5E6mY2VyCgoJi2Gwh6NnTgayslFsKdqd681euAH37On9m\nWlccX2onkzuRiqWlJXtN33735mfOdCZ2gDNhVIzFnUjjfO7Nu6Z1AGhuBkJZItRK8u9ZDocDCQkJ\nSE9Pl3pXRORBdrYJen2O2zZnbz6lbcPMmW2Ffd06QBBY2FVO8j+9vLw8GAwGXLt2TepdEZEHN9o2\nBQWrXHrzU53bmdY1S9IDqqdPn8b8+fORk5ODV155Bf/85z/dd84DqkTyce2tr1sHPPdch0+XdakD\nciP7AdVly5Zhw4YNuHr1qpS7ISJ/dCKty7rUAXWKZMV9586dGDBgABISEmCxWNp9Xm5ubuvPRqMR\nRqNRqiERkU7X9rMPaf2G/Pwit8IOAFbrWhQUrGJxDwCLxdJhHfVEsrbMihUr8M477yA0NBQ2mw1X\nr17FY489hrfffrtt52zLEAXG+fPAwIFt9/3srRuNudi7N/eW7Q88kAuL5dbtJC1Z15ZZt24dTp06\nhcrKSrz//vuYPHmyW2EnogDR6doK+7RprTNhzOYSpKauhNGYi9TUlTCbS9p9C3+WOiBlCNhhcZ3r\n10Eikt7Nab2pCejeHYD/PXRflzog5eDyA0Ra5Bqmpk0DzGa3hzuzJIEvSx1QYMg+W4aIAqyDtO6q\nM8sF+7LUASkHVwIi0gpPvXUPhR1gDz0YsLgTqd358+5tmKamW9owN/NpSQJSNfbcidTMS2+9I+yh\nq5cvtZPFnUiNfOytkzbxGqpEWuRHb52CF2fLE
KkF0zr5gcmdSA2Y1slPTO5E1ylySVumdeokFnci\nKHRJ2y7MhCFiW4YIHS1pWxz4wZw86fe8daKbtZvcIyIi2l3sS6fT8QIcpCmdOR1fEq7/5nr3dl5Y\ng6gT2i3utbW1AICVK1firrvuwuOPPw4AePfdd3H27NnAjI4oQGQ/Hf/kSWDo0Lb7NhsQFhaYfZMm\neT2JadSoUfj222+9buvUznkSEymEp567Xr8CeXlTpe+5u6b1yEiA34rJC1FWhQwPD8fWrVsxe/Zs\nAMD777+PiIgIcUZIpBA3CnhBwSqX0/ElLuxM6yQhr8m9srISS5cuxf79+wEA48ePR15eHqKjo7u+\ncyZ3ClZM69QFXFuGSGmY1kkEoqwt891332HKlCmIjY0FAHz77bd48cVbr+BCRF7odG2FPTLSeZYp\nCztJxGtxz8zMxLp169CjRw8AwMiRI7Ft2zbJB0akGTfPW7fZ2IYhyXk9oFpfX4+xY8e23tfpdOjO\n05+JfMPeOsnEa3Lv378/fvjhh9b727dvx6BBgyQdFJHqMa2TzLweULVarVi0aBH279+P22+/Hffc\ncw/effddzpYhag/TOklMlNkyDocDISEhqK2tRUtLC3r37h3QARKpBmfCUICIMlvmnnvuwaJFi3Dw\n4EFERkaKNjgiTeFMGFIYr8X9+PHjmDJlCjZt2oTo6GgsWbIEpaWlgRgbkfKxt04K5ddJTD/++COy\ns7Px3nvvweHo+oJKbMuQqrG3TjIRpS0jCAIsFgsWL16MxMRENDY24sMPPxRtkESqw7ROKuA1uUdH\nRyM+Ph6zZs1Cenq6qIuGMbmT6jCtkwJ0eVVIh8OBBQsW4Pnnn+/UAGw2Gx544AE0NjaiqakJjzzy\nCNavX9+p9yKSFWfCkMp4Te73338/vvrqq07voL6+Hr169YLdbseECRPw5z//GRMmTHDunMmd1IBp\nnRRGlPXcJ0yYgCVLlmDWrFkIDw9v3Z6YmOjTIHr16gUAaGpqgsPhQL9+/Xx6HZHsmNZJxbwmd6PR\n6PFaqnv27PFpBy0tLUhMTITVasXixYvxpz/9qW3nTO6kVEzrpGCiJHeLxdKlQXTr1g3ffPMNrly5\ngtTUVFgsFhiNxtbHc3NzW382Go1ujxEFHNM6KZDFYvG7FntN7ufOnUNOTg7OnDmDwsJCHDt2DF9+\n+SUWLlzo9wDXrFmD2267Db/73e+cO2dyJyVhWieVEGWe+/z582EymXD27FkAwLBhw7Bx40afBnDx\n4kXU1NQAABoaGlBcXIyEhASfXksUMJy3ThrktbhfvHgRs2bNQkhICACge/fuCA312s0BAFRXV2Py\n5MmIj4/H2LFjkZ6ejilTpnRtxERi4powpFFeq3RERAQuXbrUev/AgQPo06ePT28+cuRIlJWVdX50\nRFLxs7duNpcgP78IjY2hCAuzIzvbhLS05AAMlKhzvBb3l19+Genp6Thx4gTGjRuHCxcuYPv27YEY\nG5E0/Oytm80lWLr0U1ita1u3Wa05AMACT4rl08Jhzc3N+O677wAAw4cPF+0yezygSgHVyZkwqakr\nUVR060XhU1NXobBwjZgjJPKJKAdUP/zwQzQ0NCAuLg47duzArFmz2Goh9XHtrUdE+NVbb2z0/AXX\nZgsRa3REovNa3NesWYPevXvjiy++wGeffYYFCxbg6aefDsTYiLrO00yYa9f8eouwMLvH7T17dn3Z\nayKpeC3uN2bJ7Ny5E5mZmXjooYfQ3Nws+cBIfczmEqSmroTRmIvU1JUwm0vkHVAX0rqr7GwT9Poc\nt216/QpkZaWIMUoiSXg9oDp48GAsWrQIxcXF+P3vfw+bzYaWlpZAjI1URFEHHUU+y/TG+AsKVsFm\nC0HPng5kZU3lwVRSNK8HVOvq6lBYWIhRo0Zh2LBhqK6uxpEjR2Aymbq+cx5Q1QzFHHR0bcFERPjd\ngiFSA1HWl
gkPD0d0dDR27dqFbt26Yfz48aIUdtIW2Q86fv89cO+9rjvmyUgU1Lz23F944QXMnz8f\nly9fxoULF5CRkYE1azj9i9zJetBRp3Mv7DzLlMh7W+bee+/Ft99+i549ewJwrhEzevRoVFRUdH3n\nbMtohqeeu16/Anl5Evamb07rDQ3A9b+nRFomSltm8ODBaGhoaC3uNpsNUVFR4oyQNCPgBx1vvsaA\nykIClzMgqbVb3LOysgAAffr0QWxsLFJSUqDT6VBcXIykpKSADZDUIy0tWfoCpYG0rqiZRaRZ7bZl\n/vrXv0Kn06G+vh52u7OfGhoa2nrZvHnz5nV952zLkD8kTuuBStOKmVlEqtWltszcuXORk5ODN998\nE3fffTcA4OTJk8jIyMDatWvbexmR+AKQ1gOZpmWfWURBod3ZMsuXL8fly5dRWVmJsrIylJWV4cSJ\nE6ipqcHy5csDOUYKZp5mwkjQhsnPL3Ir7ABgta5FQUExAHHPvuVyBhQI7Sb3nTt3oqKiAt26tdX/\n3r17Y/PmzRg+fDjy8vICMkAKUjefZSpxb72jNC12qs/ONsFqzbllZlFW1lS/34uoPe0W927durkV\n9htCQkI8bicSjURnmXbUU+8oTbef6ld1qrhzOQMKhHaLe0xMDP72t7/dcuD0nXfewYgRIyQfGAWh\n8+eBgQPb7ot4lqm39N1Rmt6w4XOP79mVHnlAZhZRUGu3uL/66qt49NFH8eabb2LMmDEAgMOHD6O+\nvh47duwI2AApSLim9QcfBHbtEvXtvaXvjtJ0fn6Rx/dkj5yUrN3iHhUVhYMHD+Lzzz/H0aNHodPp\nkJaWxgtck7huTutNTYBIV/py5csMlfbSNHvkpEYdnqGq0+kwZcoUFnSShsRp3VVXZqiwR05q5NM1\nVCXbOU9iCk4BSuuuZFn7hkgivtROFncKrACm9ZuZzSUoKCh2Sd8pLOykSizupBwypHUirfKldnLC\nOklPp2sr7A8+6DzLlIWdSFJel/wl6jSmdSLZsLgHMUlXQZSxt05ELO5BS7JVEKurgbvuarvPtE4k\nC0l77qdOncKkSZMQGxuLuLg45OfnS7k78oO3VRA7RadrK+zsrRPJStLk3r17d2zcuBHx8fGora3F\nmDFjkJKSgpiYGCl3Sz4QdU3xs2eBwYNd3xzo0aOTIyMiMUia3AcOHIj4+HgAQEREBGJiYnD27Fkp\nd0k+Em1NcZ2urbAbDM60zsJOJLuATYWsqqpCeXk5xo4dG6hdUgeys03Q63PctjnXS0nx7Q3OnnU/\naNrYCBw9KuIIiagrAnJAtba2FjNmzEBeXh4iIiLcHsvNzW392Wg0wmg0BmJIQa9L66W4FnWDgUWd\nSGIWiwUWi8Wv10h+hmpzczMeeughPPjgg3j22Wfdd84zVNWFvXUiRZB9+QFBEDBv3jz85Cc/wcaN\nGzs1QFIIpnUixZC9uH/xxRdITk7GqFGjoLteHNavX4+pU6f6PECSGdM6keLIXty9YXFXOKZ1IkXy\npXbyDFW6FdM6keqxuPtA0jVYlIZpnUgTWNy9kGwNFqVhWifSFPbcvUhNXYmiohc9bF+FwsI1Moyo\nczr89sG0TqQq7LmLQNQ1WGTS3reP7levwDTn4bYnMq0TaQavxOSFaGuwyMjTCpD/sO5oK+xPPcU1\nYYg0hsndi+xsE6zWHLfi6FyDZaqMo/KP67ePfriES7ij7cHmZiCUfw2ItIb/qr3o0hosCnHj28cx\nxCAG/wYAbMZT+L/U/ihkYSfSJB5QDQJF2/7p1lsPRTOi9c8jL09d/0kRkRPPUNUgv+fcx8QA/3am\ndXNUIjbo069/+0hhYSdSKc6W0Ri/5txfugTccVNv/dP9CMsvgs0Wivz8Is+vIyJNYHFXkfave7rK\nvUi7pHU89RSweXPwnIxFRAA4FVJVvM65v3TJeULSjcLe3Axs3gxAogtiE5F
isbirSIdz7g2GtjbM\njXnrLjNhtHAyFhH5jm0ZFfE0535M9LMo/DSv7UntzFvXwslYROQ7JncVSUtLRl5eKlJTV+GBB3JR\nFd4fX1ddL+we0rqrLl8Qm4hUhVMh1cjTTBgfTkYym0tQUFDscjIWp0MSqRHnuWuRwQAcP+78+fpM\nGCIKLpznriWdTOtEFJzYc1cDLzNhiIhuxgqhZEzrRNRJTO5KxbRORF3AaqE0TOtEJAImdyVhWici\nkbByyMR16d47dFex3bKx7UGmdSLqIlYQGbiu0HgUBhjgnLf+n6kPY+gnH8s8OiLSArZlZJCfX4Rz\n1ucgQNda2EPRjKeEUTKPjIi0gsVdBnP+XYpaRAIA1mAldBDgQChXaCQi0UjallmwYAHMZjMGDBiA\nI0eOSLkrdaitBSIjMe/63RDY0YK2gs4VGolILJIm94yMDBQWFkq5C/V4/nkg0pnWy/9nJX6mX+FW\n2LlCIxGJSdLkPnHiRFRVVUm5C+W7ntZb2e1ICAlB3qQSFBSsclmhcSpXaCQi0XC2jJSefx5Ys8b5\n89atwNy5rQ+lpSWzmBORZFjcpeAhrSOEB0uJKHBkL+65ubmtPxuNRhiNRtnGIooO0joRUWdYLBZY\nLBa/XiP5xTqqqqqQnp7ucbaMpi7WwbRORAHiS+2UdLbM7NmzMW7cOFRUVGDIkCF46623pNydfFxm\nwmDrVueaMCzsRCQjXmavK5jWiUgGsid3TWNaJyIFY3L3101pfXLyKnTv2YLsbBOnNhJRQPAC2WJz\nmQnz2wEPY+N/PwZKnA9ZrTkAwAJPRIrAtowv7Ha3wv5gynPOwu7Cal2LgoJiOUZHRHQLFndvjh4F\nfvEL4MABoKoKEAQ0NPXw+FSu6khESsHi3h67HVi/HjAagUWLgE8/BYYOBQCEhdk9voSrOhKRUrC4\ne3Ijre/ZAxw+DGRmAjpd68PZ2Sbo9TluL+GqjkSkJJwt48puBzZsAF55BVi3DnjySbei7spsLkFB\nQbHLqo4pPJhKRAHhS+1kcb/h6FFg/nzg9tuB//1f4O675R4REZFHPInJF5566yzsRKRywT3P3TWt\nHz7Mok5EmhGcyZ1pnYg0LviSO9M6EQWB4EnuTOtEFESCI7kzrRNRkNF2cmdaJ6Igpd3kzrROREFM\ne8mdaZ2ISGPJnWmdiAiAVpI70zoRkRv1J3emdSKiW6g3uTOtExG1S53JnWmdiKhD6kruTOtERD5R\nT3JnWici8pnykzvTOhGR35Sd3FWU1s3mEuTnF6GxMRRhYXZkZ5t42T0iko0yi7sf1zJVArO5BEuX\nfgqrdW3rNqvVeQFtFngikoOkbZnCwkKMGDECw4YNwx//+EffXnTsGPCLXwB79jjTemamogs7AOTn\nF7kVdgCwWteioKBYphERUbCTrLg7HA4sWbIEhYWFOHbsGLZt24bjx497f+Hly86CrqLeemOj5y9A\n586dCvBIAstiscg9BEnx86mXlj+bryQr7ocOHcLPfvYzREdHo3v37vj1r3+Njz/+2PsLJ0xwHjhV\neFp3FRZm97i9tvZEgEcSWFr/B8TPp15a/my+kqy4nzlzBkOGDGm9HxUVhTNnzki1O1llZ5ug1+e4\nbdPrVyApSS/TiIgo2El2QFWnouTdVTcOmhYUrILNFoKePR3IypqKr776XOaREVHQEiTy5ZdfCqmp\nqa33161bJ7z00ktuz9Hr9QIA3njjjTfe/Ljp9XqvNVgnCIIACdjtdgwfPhyfffYZ7rrrLiQlJWHb\ntm2IiYmRYndERORCsrZMaGgoNm3ahNTUVDgcDixcuJCFnYgoQCRL7kREJB/Z1pbp1AlOKrFgwQLc\neeedGDlypNxDkcSpU6cwadIkxMbGIi4uDvn5+XIPSVQ2mw1jx45FfHw8DAYDnnvuObmHJDqHw4GE\nhASkp6fLPRTRRUdHY9SoUUhISEBSUpL
cwxFdTU0NZsyYgZiYGBgMBhw4cMDzE0U9iuoju90u6PV6\nobKyUmhqahJGjx4tHDt2TI6hSKKkpEQoKysT4uLi5B6KJKqrq4Xy8nJBEATh2rVrwr333qupPz9B\nEIS6ujpBEAShublZGDt2rFBaWirziMT18ssvC3PmzBHS09PlHorooqOjhUuXLsk9DMk88cQTwhtv\nvCEIgvPvZ01NjcfnyZLcO32Ck0pMnDgRt99+u9zDkMzAgQMRHx8PAIiIiEBMTAzOnj0r86jE1atX\nLwBAU1MTHA4H+vXrJ/OIxHP69Gns2rULTz75JASNdmW1+rmuXLmC0tJSLFiwAIDz2GafPn08PleW\n4h5MJzhpXVVVFcrLyzF27Fi5hyKqlpYWxMfH484778SkSZNgMBjkHpJoli1bhg0bNqBbN+Wv+N0Z\nOp0Ov/zlL3Hffffh9ddfl3s4oqqsrET//v2RkZGBxMREZGZmor6+3uNzZfnTDaYTnLSstrYWM2bM\nQF5eHiIiIuQejqi6deuGb775BqdPn0ZJSYlmTmffuXMnBgwYgISEBM2m23379qG8vByffPIJXn31\nVZSWlso9JNHY7XaUlZXhmWeeQVlZGcLDw/HSSy95fK4sxX3w4ME4daptUa1Tp04hKipKjqFQJzU3\nN+Oxxx7D448/jl/96ldyD0cyffr0QVpaGr7++mu5hyKK/fv34x//+AfuuecezJ49G59//jmeeOIJ\nuYclqkGDBgEA+vfvj+nTp+PQoUMyj0g8UVFRiIqKwv333w8AmDFjBsrKyjw+V5bift999+H7779H\nVVUVmpqa8MEHH+Dhhx+WYyjUCYIgYOHChTAYDHj22WflHo7oLl68iJqaGgBAQ0MDiouLkZCQIPOo\nxLFu3TqcOnUKlZWVeP/99zF58mS8/fbbcg9LNPX19bh27RoAoK6uDkVFRZqatTZw4EAMGTIEFRUV\nAIDdu3cjNjbW43NluViH1k9wmj17Nvbu3YtLly5hyJAheOGFF5CRkSH3sESzb98+bN26tXW6GQCs\nX78eU6dOlXlk4qiursa8efPQ0tKClpYW/OY3v8GUKVPkHpYktNYiPX/+PKZPnw7A2cKYO3cuTCaT\nzKMSV0FBAebOnYumpibo9Xq89dZbHp/Hk5iIiDRIm4fLiYiCHIs7EZEGsbgTEWkQizsRkQaxuBMR\naRCLOxGRBrG4k6qFhIQgISEBI0eOxMyZM9HQ0NDp95o/fz7+/ve/AwAyMzNx/Pjxdp+7d+9efPnl\nl37vIzo6GpcvX+70GIl8xeJOqtarVy+Ul5fjyJEj6NGjBzZv3uz2uN1u9/m9dDpd60k9r7/+eocn\n1u3Zswf79+/3e7xaO2mIlIvFnTRj4sSJ+OGHH7B3715MnDgRjzzyCOLi4tDS0oLly5cjKSkJo0eP\nxpYtWwA4l1FYsmQJRowYgZSUFPz3v/9tfS+j0YjDhw8DcF5YZsyYMYiPj0dKSgr+85//4LXXXsPG\njRuRkJCAffv24cKFC5gxYwaSkpKQlJTUWvgvXboEk8mEuLg4ZGZmanaxLlIeWZYfIBKb3W7Hrl27\nMG3aNABAeXk5jh49iqFDh2LLli3o27cvDh06hMbGRkyYMAEmkwllZWWoqKjA8ePHce7cORgMBixc\nuBBAW4q/cOECFi1ahNLSUgwdOhQ1NTXo27cvnn76aURGRuK3v/0tAGDOnDlYtmwZxo8fj5MnT2Lq\n1Kk4duwY/vCHPyA5ORkrV67Erl278MYbb8j2O6LgwuJOqtbQ0NC6vk1ycjIWLFiAffv2ISkpCUOH\nDgUAFBUV4ciRI9i+fTsA4OrVq/j+++9RWlqKOXPmQKfTYdCgQZg8ebLbewuCgAMHDiA5Obn1vfr2\n7ev2+A27d+9269Ffu3YNdXV1KC0txY4dOwAA06ZN0/RFXEhZWNxJ1W677TaUl5ffsj08PNzt/qZN\nm5C
SkuK2bdeuXV7bJL72yAVBwMGDB9GjRw+PjxEFGnvupHmpqan4y1/+0npwtaKiAvX19UhOTsYH\nH3yAlpYWVFdXY8+ePW6v0+l0+PnPf46SkhJUVVUBQOtMl8jIyNalZQHAZDK5XSj8X//6FwDnt4n3\n3nsPAPDJJ5/gxx9/lOxzErlicSdV85SsXWe9AMCTTz4Jg8GAxMREjBw5EosXL4bD4cD06dMxbNgw\nGAwGzJs3D+PGjbvlve644w5s2bIFjz76KOLj4zF79mwAQHp6Onbs2NF6QDU/Px9ff/01Ro8ejdjY\nWLz22msvjj8SAAAATUlEQVQAgNWrV6OkpARxcXHYsWNHa3uHSGpc8peISIOY3ImINIjFnYhIg1jc\niYg0iMWdiEiDWNyJiDSIxZ2ISINY3ImINIjFnYhIg/4fTxyIMqe4r0AAAAAASUVORK5CYII=\n", "text": [ "" ] } ], "prompt_number": 26 }, { "cell_type": "markdown", "metadata": {}, "source": [ "We're worried about overfitting, and remember that cross-validation is supposed to detect this. Let's look at the average $R^2$ score, when performing 5-fold cross validation. It's not as good, but still not bad..." ] }, { "cell_type": "code", "collapsed": false, "input": [ "cross_val_score(clf, xt, y, cv=5).mean()" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 27, "text": [ "0.61616795686754722" ] } ], "prompt_number": 27 }, { "cell_type": "markdown", "metadata": {}, "source": [ "And even if we make some plots of the predicted and actual data at each cross-validation iteration, \n", "the model seems to predict the \"independent\" data pretty well..." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "for train, test in KFold(len(y), 10):\n", " xtrain, xtest, ytrain, ytest = xt[train], xt[test], y[train], y[test]\n", "\n", " clf.fit(xtrain, ytrain)\n", " yp = clf.predict(xtest)\n", " \n", " plt.plot(yp, ytest, 'o')\n", " plt.plot(ytest, ytest, 'r-')\n", " \n", "\n", "plt.xlabel(\"Predicted\")\n", "plt.ylabel(\"Observed\")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 29, "text": [ "" ] }, { "metadata": {}, "output_type": "display_data", "png": "iVBORw0KGgoAAAANSUhEUgAAAXcAAAEPCAYAAAC5sYRSAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3X9UVHX+P/DngMYUqGlraWniZ9aSHyZYwmZJIwbjhoSa\nZ13BLVFxq48z5H7W89n8sVGuPzZPus7kfuy3a5La1w5ajhEUIoiaq1CZeqJG2DSp9cdSIgw4cL9/\njOBMDM7A3Dt35s7zcc6cA8PM3BeFL5687vu+RyUIggAiIlKUELkLICIi8bG5ExEpEJs7EZECsbkT\nESkQmzsRkQKxuRMRKZCkzb2+vh7Tp09HVFQUoqOjcejQISkPR0REV/WS8sVzc3PxyCOPYMeOHbDZ\nbLh8+bKUhyMioqtUUl3E9OOPPyI+Ph6nTp2S4uWJiOg6JBvL1NTUYODAgcjOzsaYMWOQk5ODxsZG\nqQ5HREQOJGvuNpsNlZWVePrpp1FZWYnw8HCsXr1aqsMREZEjQSJ1dXVCZGRkx+fl5eVCWlqa02M0\nGo0AgDfeeOONt27cNBqN2x4sWXIfNGgQhg4diurqagDAxx9/jJiYGKfHWCwWCIKg2Ntzzz0new38\n/vj9BeP3p+TvTRAEWCwWtz1Y0tUyJpMJWVlZaGlpgUajwVtvvSXl4YiI6CpJm/vo0aPxz3/+U8pD\nEBGRC7xCVUJarVbuEiTF7y+wKfn7U/L35inJ1rl7dHCVCjIenohEVlJixs6dRqhUzRCEMEyZYkBy\ncprcZSmOJ71T0rEMEQWPkhIztm7NRVbWtZN9+fn2j9ngfY9jGSISxc6dRqfGDgBZWRbs2mWSqaIA\nYLMBZ85I8tJs7kQkCpWquYuvWH1aR8A4fhwYNw5YuVKSl2dzJyJRCEJYF19R+7QOv2ezAatWAVot\nMG8esGGDJIdhcyciUUyZYkB+vsbpvi1bNMjI0MtUkR9qT+slJcCRI8D8+YBKJcmhuFqGiERTUmK+\nOmO3AlAjI0PPk6mAPa2vWQOsXQusWAHk5HjV1D3pnWzuRERSOn4cyM4G+vUDXn8dGDbM65f0pHdy\nLENEJIWfz9aLikRp7J7iOnciIrE5pvUjR3za1NsxuRMRiUXmtO6IyZ2ISAx+kNYdMbkTEXnDj9K6\nIyZ3IqKe8rO07ojJnYiou/w0rTticici6g4/TuuOmNyJiDwRAGndEZM7EZE7AZLWHTG5ExF1JcDS\nuiMmdyIiVwIwrTticicichTAad0RkzsRUbsAT+uOmNyJiBSS1h0xuRNRcFNQWnfE5E5EwUmBad0R\nkzsRBR+FpnVHTO5EFDwUntYdSZ7cIyMj0bdvX4SGhqJ37944fPiw1IckIuosCNK6I8mbu0qlQmlp\nKQYMGCD1oYiIOrPZ
gDVrgLVrgRUrgJwcQKWSuyrJ+WTm7u5duomIJBFkad2R5DN3lUqFhx9+GPfd\ndx9ee+01qQ9HRBRUs/WuSJ7cKyoqMHjwYJw7dw4pKSkYOXIkxo8fL/VhiShYBXFadyR5cx88eDAA\nYODAgZg6dSoOHz7s1Nzz8vI6PtZqtdBqtVKXRERKpODZemlpKUpLS7v1HJUg4UC8sbERra2t6NOn\nDy5fvozU1FQ899xzSE1NtR9cpeI8noi855jWX39d8Wndk94paXL/4YcfMHXqVACAzWZDVlZWR2Mn\nIvKagtO6tyRN7m4PzuRORD0VZGndkSe9k1eoElFg4UoYj3BvGSIKHFwJ4zEmdyLyf0zr3cbkTkT+\njWm9R5jcicg/Ma17hcmdiPwP07rXmNyJyH8wrYuGyZ2I/APTuqiY3IlIXkzrkmByJyL5MK1Lhsmd\niHyPaV1yTO5E5FtM6z7B5E5EvsG07lNM7kQkPaZ1n2NyJyLpMK3LhsmdiKTBtC4rJnciEhfTul9g\ncici8TCt+w0mdyLyHtO632FyJyLvMK37JTZ3Ij9QUmLGzp1GqFTNEIQwTJliQHJymtxlXZ/NBqxZ\nA6xdC6xYAeTkACqV3FXRVWzuRDIrKTFj69ZcZGVZOu7Lz7d/7LcNnmnd73HmTiSznTuNTo0dALKy\nLNi1yyRTRdfB2XrAYHInkplK1dzFV6w+rcMjJ08CZWVM6wGAzZ1IZoIQ1sVX1KIdw1xshvEdI5qF\nZoSpwmDINCAtpQcjn1GjgA8/FK0ukg6bO5HMpkwxID/f4jSa2bJFg8xMvSivby42I3dDLizx117f\nssH+cY8aPAUElSAIgmwHV6kg4+GJ/EZJifnqjN0KQI2MDL1oJ1N12ToURRZ1vv9fOhS+Wej6STYb\nsGMHMGMGV8D4IU96J5M7kR9ITk6TbGVMs+B6pm9t62Km/+WXwOzZwIABwOTJQESEJHWRtLhahkjh\nwlSuZ/rqkJ/N9G02YOVKYMIE4Pe/Bz76iI09gEne3FtbWxEfH4/09HSpD0VELhgyDdBUaZzu01Rq\noJ/pMNP/8kvgV78CSkuBo0d5QZICSD6WWb9+PaKjo3Hp0iWpD0VELrSfNDVtNcHaZoU6RA39Ar39\nfpsNePFFYN06e2qfN49NXSEkPaF65swZzJ49G0uWLMHatWvxwQcfOB+cJ1SJ5OM4W3/9deDOOz1+\nqtlshtFoRHNzM8LCwmAwGJCWxpU3viL7CdWFCxdizZo1+Omnn6Q8DBF1h5dp3Ww2Izc3FxaLw9LK\nqx+zwfsPyZr77t27ceuttyI+Ph6lpaVdPi4vL6/jY61WC61WK1VJRLRzp31PmDFj7LP1bqT1dkaj\n0amxA/bmbjKZ2NwlUlpaet0+6opkY5nFixfj7bffRq9evWC1WvHTTz/hsccew+bNm68dnGMZIt+w\nWoGMDKC4GJg1C9i0CQjp2XoKrVaLffv2dbr/oYce6nYDop7xpHdKtlpm5cqVOH36NGpqarBt2zYk\nJyc7NXYi8pGdO4FbbrHvB1NRAWzeDISEwGwug063FFptHnS6pTCbyzx6ubCwLpZWqsXbLoG857OL\nmFQ8A0/kW01NwJQpLtO62VyG3NyPYLGs6Hi4xbIEAJCWlnTdlzUYDLBYLE6jGY1GA71enO0SSBzc\nfoBIiYxGYOFCoE8f+0Zf99/v9GWdbimKiv7S6Wk63TIUFi53+/JmsxkmkwlWqxVqtRp6vZ7zdh+S\nfbUMEflYQwMQHw988w2QmGgfw4SGdnpYc7Prf/pWa+fHupKWlsZm7ue4/QCRUhiN9ndGqq0F3nsP\nOHTIZWMHgLAwm8v71epWCQskX2JzJwp0DQ3AiBFAbi4wdqx9Zcy0add9isGQCo1midN9Gs1i6PUp\nUlZKPsSZO1Ega5+th4QA27e7beqOzOYymEzFsFpDoVa3Qq9PcXsylfyDJ72TzZ0oEH
k4WydlknWd\nOxFJpBuzdQpeXC1DFCiY1qkbmNyJAgHTOnUTkzspmtlcBqOxCM3NvRAWZoPBkBpYJw2Z1qmH2NxJ\nsby5xN4v/O1vwP/8j30lzHvvdWslDBFXy5BieXuJvWzq6wGNBrh4kWmdXPJqtUxERAT69Onj8ta3\nb1/RiyUSm7eX2Mvi2WeB/v2B//wH2LiRs3XqsS7HMg0NDQCApUuX4vbbb8esWbMAAPn5+Th79qxv\nqiPyQkBdYu+Y1ocPB6qrgV6cmlLPuV0t8/777+Ppp59G37590bdvXzz11FPYtWuXL2oj8krAXGK/\neLFzWj91io2dvOb2Jyg8PBxbtmzBzJkzAQDbtm1DRESE5IUReav9pKnJtMzhEvtJ/nMylWmdJOT2\nhGpNTQ1yc3Nx4MABAMADDzyA9evXIzIy0vuD84QqBavFi4FVq+xvTP1//wf8/vdyV0QBhHvLEPkb\npnUSgSh7y3z11VeYOHEiYmJiAABffPEF/vKXzsvLiMgNztbJh9wm96SkJKxZswZPPvkkqqqqIAgC\nYmNjcfz4ce8PzuROwYBpnUQmSnJvbGxEYmKi04v27t3b++qIggHTOsnE7U/ZwIED8c0333R8vmPH\nDgwePFjSoogCHtM6ycztWMZisWD+/Pk4cOAA+vfvj+HDhyM/P5+rZYi6wpUwJDFRVsu0trYiNDQU\nDQ0NaGtrE3XrATZ3UhSmdfIRUWbuw4cPx/z58/Hpp5+iT58+ohVHpCicrZOfcZvcL1++jN27d2Pb\ntm2orKxEeno6ZsyYgfHjx3t/cCZ3CnRM6yQDUZJ7eHg4ZsyYgYKCAnz22Wf48ccfodVqxaqRKHAx\nrZMfc/uTKAgC9u3bh+3bt6OwsBBjx47Fu+++64vaiPwT0zoFALdjmcjISMTFxWHGjBlIT08XddMw\njmUo4HAlDPkBT3rndeNGa2sr5syZgz//+c89KsBqteKhhx5Cc3MzWlpakJGRgVWrVvXotYhkxbRO\nAea6M/fQ0FB88MEHPX5xtVqNvXv34rPPPsMXX3yBvXv3Yv/+/T1+PSJZcLZOAcjtT+iDDz6IBQsW\nYMaMGQgPD++4f8yYMR4d4KabbgIAtLS0oLW1FQMGDOhhqUQ+xrROAcztzF2r1UKlUnW6f+/evR4d\noK2tDWPGjIHFYsFTTz2FF1988drBOXMnf8XZOvkxr2fuAFBaWupVESEhIR1LKHU6HUpLS52WUubl\n5XV8rNVqucyS5MW0Tn6otLS0273YbXL//vvvsWTJEnz33XcoLCzEiRMncPDgQcydO7fbBS5fvhw3\n3ngj/vjHP9oPzuRO/oRpnQKEKBcxzZ49G6mpqTh79iwAYMSIEVi3bp1HBZw/fx719fUAgKamJhQX\nFyM+Pt6j5xL5TH09cMst9sY+fDjQ0sLGTgHPbXM/f/48ZsyYgdDQUABA79690cvDP1Pr6uqQnJyM\nuLg4JCYmIj09HRMnTvSuYiIxcSUMKZTbn+KIiAhcuHCh4/NDhw6hX79+Hr34qFGjUFlZ2fPqiKTS\nzdm6udgM4ztGNAvNCFOFwZBpQFpKmg8LJuoet839pZdeQnp6Ok6dOoVx48bh3Llz2LFjhy9qI5KG\n42x940a3IxhzsRm5G3Jhibd03GfZYP+YDZ78ldsTqgBw5coVfPXVVwCAu+++W7S32eMJVfKpHq6E\n0WXrUBRZ1Pn+f+lQ+GahFJUSXZcoJ1TfffddNDU1ITY2FgUFBZgxYwZHLRR4TCZgwIAezdabhWaX\n91vbrGJWSCQqt819+fLl6Nu3L/bv349PPvkEc+bMwZNPPumL2oi819AAjBgBGAxAQgJw5Uq3V8KE\nqcJc3q8OUYtRIZEk3Db39lUyu3fvRk5ODiZPnowrV65IXhgFljKzGUt1OuRptViq06HMbJa7JHta\n79cPqK0F3nsPOHQIuPrz3B2GTAM0VRqn+zSVGu
hn6kUqlEh8bv8uveOOOzB//nwUFxfjT3/6E6xW\nK9ra2nxRGwWIMrMZH+XmYoXl2gnHJVc/TkqT4YRjQwMQHw988w2QmAhUVPSoqbdrP2lq2mqCtc0K\ndYga+gV6nkwlv+bR2+wVFhbinnvuwYgRI1BXV4djx44hNTXV+4PzhKoiLNXp8Jeiziccl+l0WF7o\n4xOOixcDq1fbm/n27cC0ab49PpEPiLK3THh4OCIjI7Fnzx6EhITggQceEKWxk3L0anZ9wjHU6sMT\njo4rYUaOBL780qu0ThTo3M7cX3jhBcyePRsXL17EuXPnkJ2djeXLl/uiNgoQtjDXJxxb1T464fjz\nq0xPnmRjp6Dndixz11134YsvvoD66j/UpqYmjB49GtXV1d4fnGMZRXA1c1+s0WDS+vXSzty5gyMF\nKVHGMnfccQeampo6mrvVasWQIUPEqZAUob2BLzOZEGq1olWtxiS9XtrG3s2rTJWO2yPQz3XZ3PV6\n+zKvfv36ISYmBikpKVCpVCguLkZCQoLPCqTAkJSW5puVMUzrnXB7BHKly7HMpk2boFKp0NjYCJvN\nBgDo1atXx9vmPfHEE94fnGMZ6o6cHOD112XZb92fkzG3Rwg+Xo1lsrKysGTJErz55pu48847AQDf\nfvstsrOzsWLFCnErJbqe8+eBO+8Emprs+67X1QEi7W/kCX9PxtwegVzpcrXMokWLcPHiRdTU1KCy\nshKVlZU4deoU6uvrsWjRIl/WSMEsJwcYONDe2FessDd6HzZ2ADC+Y3Rq7ABgibfAtNXU8bnZXAad\nbim02jzodEthNpf5rD5uj0CudJncd+/ejerqaoSEXOv/ffv2xcaNG3H33Xdj/fr1PimQgpTjbF2G\ntO7IXTI2m8uQm/sRLJZrf9FaLEsAAGlpSZLXZ8g0wLLB4vQLSFOpgX4Bt0cIZl0295CQEKfG3i40\nNNTl/USief55IC/PPlt/+WXgv//b65csM5tRZDSiV3MzbGFhSDUYPD4B7C4ZG41FTo0dACyWFTCZ\nlvmkuXN7BHKly+YeFRWFf/zjH51OnL799tsYOXKk5IVREHLcE+a//gv46itRVsJ4u/eNu2Tc3Oy6\nRqvVdxdSpaWksZmTky7/5WzYsAHTpk3Dm2++iXvvvRcAcPToUTQ2NqKgoMBnBVKQMJmAZ54BQkLs\nOziKuCdMkdHo1NgBYIXFgmUmk0fN3V0yDguzuXyeWt3qZeVEPddlcx8yZAg+/fRTlJSU4Pjx41Cp\nVEhLS+MbXJO4rFZg8mTgk09E2cHRFTH2vrleMjYYUmGxLHEazWg0i6HXT+peoUQiuu7fvCqVChMn\nTmRDJ2ns2gVkZgJqNbBvH5AkzXxa6r1v2ufqJtMyWK2hUKtboddP8sm8nagrHr2HqmQH50VMwclq\nBTIygOJiYNYsYNMm+zhGIrLtfUMkEU96J5s7+ZZjWt+9G7j/fp8ctsxsRrHD3jcpUu99QyQhNnfy\nHz5O60RKJsqukERec0zrFRU+S+tEwYzRiaRjtQI6HTB1KvDYY8C5c2zsRD7C5K4g3lyFKTqmdSJZ\nsbkrhLdXYYrm8mX7BUicrRPJStJ/dadPn8aECRMQExOD2NhYGI1GKQ8X1Lq6CrPYZOriGRIwmYC+\nfYGDB+1pffNmNnYimUia3Hv37o1169YhLi4ODQ0NuPfee5GSkoKoqCgpDxuUxLgKs8cc94SR6CpT\nIuoeSWPVoEGDEBcXBwCIiIhAVFQUzp49K+Uhg5bUV2F2yWQC+vUDamvte8IcOsTGTuQHfPY3c21t\nLaqqqpCYmOirQwaVVIMBSzQap/sWazRI0Uu0p3dDAzBiBGAwAGPH2lfGiLjZFxF5xycnVBsaGjB9\n+nSsX78eERERTl/Ly8vr+Fir1UKr1fqiJMVpP2m6zOEqzElSXYUp4Q6ORNRZaWkpSktLu/Ucya9Q\nvXLlCiZPno
xf//rXeOaZZ5wPzitUAwtn60R+wZPeKelYRhAEzJ07F9HR0Z0aOwUYztaJAoqkyX3/\n/v1ISkrCPffcA5VKBQBYtWoVJk2y73PN5B4AmNaJ/A43DiPvOM7Wt2/nbJ3IT3DjMOoZpnWigBc0\nlw+Wmc1YqtMhT6vFUp0OZWaz3CX5J87WiRQhKJK73+y74s+Y1okUJShm7kt1OvylqKjT/ct0Oiwv\nLJT8+HIrMZdgp3EnVM0qCGECphimIDkt+doDOFsnCiicuV8l674rMisxl2Br7lZkWbI67su35AMA\nkieOs787UlER0zqRwgTFzF22fVf8wE7jTqfGDgBZliwc+9NLwC23AEeOAAcOcLZOpDBB0dx9vu+K\nH1E1q5w/RwvuwSLov9wDpKXx3ZGIFCooxjI+3XfFzwhh1+Zyt2A/orECbeiN9aMfxcJ335WxMiKS\nUlCcUA1mJeYS/D/9ZqyoOY7+OIofMBFLhw9FpmmW80lVIgoYPKGqMOaSEhh37kSzSoUwQYBhyhSk\nJV+/QSefOo6HajfDqgrF36IeRe3QYcjUZ7CxEykck3uAMJeUIHfrVliyrp0c1eTnY/3Mma4bvIt1\n6+Z9+7r9y4GI/A+Tu4IYd+50auwAYMnKgqmgoHODfv554IUXnPZbd/XLwZJvXxLJBk+kPGzuAaJZ\npXJ5v9NK/YYG4Je/BH74AYiLsy9zvLq8sVu/HIgo4AXFUkglCOviT7COlfrte8JcuABs3AhUVTmt\nW/folwMRKQaTe4AwTJkCS36+88x9yxbkTptmfy9TN3vCuP3lQESKwuYeINpHJ6aCAlhhb8rLb7wR\nY9PTPXov065+OegzMyWunIjkwNUygaiHOziaS0pg2rWr45eDPiOD83aiAMR3YlIi7uBIFPS4FFJJ\nuN86EXUDV8sEAr47EhF1E5O7P2NaJ6IeYnL3V0zrROQFJnd/w7RORCJgcvcnTOtEJBImdxn8/A2r\np+VMgvbZXKZ1IhIN17n72M/fsPoW7MddeA4hoSHo/S7XrRORe1zn7ofa37BahVZE4QUMRDl+wESs\nSh6J9WzsRCQSNncfUzWrcBNqMBJ/hYBeqIQRlxALtBTIXRoRKYikJ1TnzJmD2267DaNGjZLyMIHD\nasXDx4oQh4Wow2RUwWRv7AC3ZyQiUUna3LOzs1FYWCjlIQLHrl3ALbfggSt1WHbHw6jDZAD2Pda3\naLYgQ58hb31EpCiSjmXGjx+P2tpaKQ/h/6xWICMDKC4GZs3CgE2boPuwFAWmArRvz5ipz+QbVhOR\nqDhzl9KuXUBmJqBW25c33n8/ACA5LZnNnIgkxYuYpGC1AjodMHUq8NhjwLlzHY2diMgXZE/ueXl5\nHR9rtVpotVrZahFFF2mdiKinSktLUVpa2q3nSH4RU21tLdLT03Hs2LHOB1fSRUw/m61j0yb7G2oQ\nEYnMk94pafeZOXMmxo0bh+rqagwdOhRvvfWWlIeTz9WVMDhyxJ7WN29mYyciWXH7AW8wrRORDLj9\ngJQ4WyciP8bk3l0Oaf0LrRaLevVCc0sLwsLCYDAYkJaWJneFRKRwTO5ic0jrn7z4In6/cSMsFkvH\nl9s/ZoMnIrlxQOwJmw343/91Wrf+YnGxU2MH7M3dZDLJVCQR0TVM7u4cPw7Mng307w98/jlwdRO0\n5uZmlw+3Wq0+LI6IyDUm967YbMCqVYBWC8yfD3z0UUdjB4CwsDCXT1Orub0jEcmPyd0Vx7R+9Chw\n552dHmIwGGCxWJxGMxqNBnq93oeFEhG5xubuyGYD1qwB1q4FVq4E5s0DVCqXD20/aWoymWC1WqFW\nq6HX63kylYj8ApdCtnNM66+/7jKtExH5A9m3HwgIrmbrbOxEFOCCeyzjwWydiCgQBWdyZ1onIoUL\nvuTOtE5EQSB4kjvTOhEFkeBI7kzrRBRklJ3cmdaJKEgpN7kzrRNREFNecmda
JyJSWHJnWiciAqCU\n5M60TkTkJPCTO9M6EVEngZvcmdaJiLoUmMmdaZ2I6LoCK7kzrRMReSRwkjvTOhGRx/w/uTOtExF1\nm38n9yBJ62ZzGYzGIjQ390JYmA0GQyrS0pLkLouIAph/NvduvJdpoDOby5Cb+xEslhUd91ksSwCA\nDZ6IekzSsUxhYSFGjhyJESNG4K9//atnTzpxArj/fmDvXntaz8lRbGMHAKOxyKmxA4DFsgImU7FM\nFRGREkjW3FtbW7FgwQIUFhbixIkT2Lp1K06ePOn+iRcv2hu6AmbrpaWlbh/T3Oz6jyerNVTkasTn\nyfcXyPj9BS4lf2+ekqy5Hz58GL/85S8RGRmJ3r1747e//S127drl/okPPmg/caqAtO7JD1hYmM3l\n/Wp1q8jViE/p/4D4/QUuJX9vnpKsuX/33XcYOnRox+dDhgzBd999J9XhApbBkAqNZonTfRrNYuj1\nKTJVRERKINkJVZUCkrcvtJ80NZmWwWoNhVrdCr1+Ek+mEpF3BIkcPHhQ0Ol0HZ+vXLlSWL16tdNj\nNBqNAIA33njjjbdu3DQajdserBIEQYAEbDYb7r77bnzyySe4/fbbkZCQgK1btyIqKkqKwxERkQPJ\nxjK9evXCyy+/DJ1Oh9bWVsydO5eNnYjIRyRL7kREJB/Z9pbp0QVOAWLOnDm47bbbMGrUKLlLkcTp\n06cxYcIExMTEIDY2FkajUe6SRGW1WpGYmIi4uDhER0fj2Weflbsk0bW2tiI+Ph7p6elylyK6yMhI\n3HPPPYiPj0dCQoLc5Yiuvr4e06dPR1RUFKKjo3Ho0CHXDxT1LKqHbDaboNFohJqaGqGlpUUYPXq0\ncOLECTlKkURZWZlQWVkpxMbGyl2KJOrq6oSqqipBEATh0qVLwl133aWo/3+CIAiXL18WBEEQrly5\nIiQmJgrl5eUyVySul156ScjMzBTS09PlLkV0kZGRwoULF+QuQzKPP/648MYbbwiCYP/5rK+vd/k4\nWZJ7jy9wChDjx49H//795S5DMoMGDUJcXBwAICIiAlFRUTh79qzMVYnrpptuAgC0tLSgtbUVAwYM\nkLki8Zw5cwZ79uzBvHnzICh0KqvU7+vHH39EeXk55syZA8B+brNfv34uHytLc+cFTspRW1uLqqoq\nJCYmyl2KqNra2hAXF4fbbrsNEyZMQHR0tNwliWbhwoVYs2YNQkL8f8fvnlCpVHj44Ydx33334bXX\nXpO7HFHV1NRg4MCByM7OxpgxY5CTk4PGxkaXj5Xl/y4vcFKGhoYGTJ8+HevXr0dERITc5YgqJCQE\nn332Gc6cOYOysjLFXM6+e/du3HrrrYiPj1dsuq2oqEBVVRU+/PBDbNiwAeXl5XKXJBqbzYbKyko8\n/fTTqKysRHh4OFavXu3ysbI09zvuuAOnT5/u+Pz06dMYMmSIHKVQD125cgWPPfYYZs2ahSlTpshd\njmT69euHtLQ0HDlyRO5SRHHgwAG8//77GD58OGbOnImSkhI8/vjjcpclqsGDBwMABg4ciKlTp+Lw\n4cMyVySeIUOGYMiQIRg7diwAYPr06aisrHT5WFma+3333Yevv/4atbW1aGlpwfbt2/Hoo4/KUQr1\ngCAImDt3LqKjo/HMM8/IXY7ozp8/j/r6egBAU1MTiouLER8fL3NV4li5ciVOnz6NmpoabNu2DcnJ\nydi8ebPcZYmmsbERly5dAgBcvnwZRUVFilq1NmjQIAwdOhTV1dUAgI8//hgxMTEuHyvLm3Uo/QKn\nmTNnYt++fbhw4QKGDh2KF154AdnZ2XKXJZqKigps2bKlY7kZAKxatQqTJk2SuTJx1NXV4YknnkBb\nWxva2tpjtyOuAAAD00lEQVTwu9/9DhMnTpS7LEkobUT6ww8/YOrUqQDsI4ysrCykpqbKXJW4TCYT\nsrKy0NLSAo1Gg7feesvl43gRExGRAinz
dDkRUZBjcyciUiA2dyIiBWJzJyJSIDZ3IiIFYnMnIlIg\nNncKaKGhoYiPj8eoUaPwm9/8Bk1NTT1+rdmzZ+O9994DAOTk5ODkyZNdPnbfvn04ePBgt48RGRmJ\nixcv9rhGIk+xuVNAu+mmm1BVVYVjx47hhhtuwMaNG52+brPZPH4tlUrVcVHPa6+9dt0L6/bu3YsD\nBw50u16lXTRE/ovNnRRj/Pjx+Oabb7Bv3z6MHz8eGRkZiI2NRVtbGxYtWoSEhASMHj0ar776KgD7\nNgoLFizAyJEjkZKSgn//+98dr6XVanH06FEA9jeWuffeexEXF4eUlBT861//wiuvvIJ169YhPj4e\nFRUVOHfuHKZPn46EhAQkJCR0NP4LFy4gNTUVsbGxyMnJUexmXeR/ZNl+gEhsNpsNe/bswSOPPAIA\nqKqqwvHjxzFs2DC8+uqruPnmm3H48GE0NzfjwQcfRGpqKiorK1FdXY2TJ0/i+++/R3R0NObOnQvg\nWoo/d+4c5s+fj/LycgwbNgz19fW4+eab8eSTT6JPnz74wx/+AADIzMzEwoUL8cADD+Dbb7/FpEmT\ncOLECTz//PNISkrC0qVLsWfPHrzxxhuy/Tei4MLmTgGtqampY3+bpKQkzJkzBxUVFUhISMCwYcMA\nAEVFRTh27Bh27NgBAPjpp5/w9ddfo7y8HJmZmVCpVBg8eDCSk5OdXlsQBBw6dAhJSUkdr3XzzTc7\nfb3dxx9/7DSjv3TpEi5fvozy8nIUFBQAAB555BFFv4kL+Rc2dwpoN954I6qqqjrdHx4e7vT5yy+/\njJSUFKf79uzZ43ZM4umMXBAEfPrpp7jhhhtcfo3I1zhzJ8XT6XT4+9//3nFytbq6Go2NjUhKSsL2\n7dvR1taGuro67N271+l5KpUKv/rVr1BWVoba2loA6Fjp0qdPn46tZQEgNTXV6Y3CP//8cwD2vybe\neecdAMCHH36I//znP5J9n0SO2NwpoLlK1o6rXgBg3rx5iI6OxpgxYzBq1Cg89dRTaG1txdSpUzFi\nxAhER0fjiSeewLhx4zq91i9+8Qu8+uqrmDZtGuLi4jBz5kwAQHp6OgoKCjpOqBqNRhw5cgSjR49G\nTEwMXnnlFQDAc889h7KyMsTGxqKgoKBjvEMkNW75S0SkQEzuREQKxOZORKRAbO5ERArE5k5EpEBs\n7kRECsTmTkSkQGzuREQKxOZORKRA/x8L2xx9eDja4gAAAABJRU5ErkJggg==\n", "text": [ "" ] } ], "prompt_number": 29 }, { "cell_type": "markdown", "metadata": {}, "source": [ "**But** -- what if we generated some more data?" 
] }, { "cell_type": "code", "collapsed": false, "input": [ "x2 = make_x(100)\n", "y2 = hidden_model(x2)\n", "x2 = x2[:, best_features]\n", "\n", "y2p = clf.predict(x2)\n", "\n", "plt.plot(y2p, y2, 'o')\n", "plt.plot(y2, y2, 'r-')\n", "plt.xlabel(\"Predicted\")\n", "plt.ylabel(\"Observed\")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 9, "text": [ "[]" ] }, { "metadata": {}, "output_type": "display_data", "png": "iVBORw0KGgoAAAANSUhEUgAAAXAAAAEACAYAAACqOy3+AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3XtwlPW9P/D3EtAEKAgMuDSbM9FFJOGWIEjPjJC1ShYb\n9FQaD5cqHATrqZYAnpl2IIlsxVILrTVJ7elMazv683fAY61HYQEDtZtQKzI24JVqiaDh1iqnMVwS\nQ5Ln/LEke3t299l9vs/9/ZrZmeyy2f1kQz772c/35pIkSQIREVnOIKMDICKi7DCBExFZFBM4EZFF\nMYETEVkUEzgRkUUxgRMRWZSqBP7BBx+gtLR04DJy5EjU19eLio2IiFJwiZoH3tfXh/z8fBw8eBAF\nBQUiHpKIiFIQ1kLZt28fvF4vkzcRkU6EJfDt27dj6dKloh6OiIjSENJC6e7uRn5+Pt5//32MHTtW\nRFxERJTGYBEPsnv3btxwww2yyXvChAlobW0V8TRERI7h9Xpx9OjRlPcR0kLZtm0blixZIvtvra2t\nkCTJ9JeNGzcaHoNd4rRCjIyTcZr9oqTwVZ3AL1y4gH379mHhwoVqH4qIiDKguoUybNgwfPbZZyJi\nISKiDHAl5mU+n8/oEBSxQpxWiBFgnKIxTv0JW8iT9AlcLmj8FEREtqMkd7ICJyKyKCZwIiKLEjIP\nnMwhGGxGfX0jvvhiMK68sgdVVeWoqJhrdFhEpBEmcJsIBpuxZs0raG39wcBtra3VAMAkTmRTbKHY\nRH19Y0zyBoDW1h+goWGvQREROVB7O/Dxx7o9HRO4TXzxhfyHqa6uHJ0jIXKo+npg1Cjg9tt1e0q2\nUGziyit7ZG/Pze3VORIih2lvDyduAKisBJ5/XrenZgVuE1VV5fB6q2Nu83o3YPXqeQZFROQA/VU3\nALzzjq7JG+BCHlsJBpvR0LAXXV05yM3txerV8ziASaQFHapuJbmTCZyIKBP19cCaNeGv33kHmDJF\nk6dRkjvZAyciUsLAXncy7IETEaVjcK87GVbgRETJmLDqjsYKnIhIjkmr7miswImIopm86o7GCpyI\nqJ8Fqu5orMCJiCxUdUdTXYG3t7ejsrISRUVFKC4uxoEDB0TERUSkD4tV3dFUV+Br1qzB1772Nfz2\nt79FT08PLly4ICIuohjc65yEs2jVHU1VAv/888+xf/9+PP300+EHGzwYI0eOFBIYUT/udU7C6bSa\nUmuqWijHjh3D2LFjsWLFCsyYMQP33XcfLl68KCo2cohgsBl+fw18vgD8/hoEg80x/869zkmY9nbA\n5Qon78pKQJIsm7wBlQm8p6cHLS0teOCBB9DS0oJhw4bhscceExUbOUB/dd3Y+CiamgJobHwUa9a8\nEpPEudc5CWHhXncyqlooHo8HHo8Hs2bNAgBUVlbKJvBAIDDwtc/ng8/nU/O0ZCPJq+vagfYI9zon\nVY4fB665Jvy1iXvdoVAIoVAoo+9RlcDdb
jcKCgrw4YcfYuLEidi3bx8mT56ccL/oBE4UTUl1XVVV\njtbW6phEH97rfL7m8ZHFXXcdcPRo+OuXX9b1tJxMxRe33//+99N+j+pZKA0NDfjmN7+J7u5ueL1e\n/OY3v1H7kOQgSqrr/kq8oaE2aq/z+RzApOSiq24g3Ou2Ie4HToaSm2Hi9W5AXR0TNGUpuurevh1Y\ntMjYeLLEAx3IEniSEAlhs6qbCZyInMEmVXc0nshDRPZms6o7U9yNkIis6brrIsl7+3bHJW+AFTgR\nWY3Dq+5orMCJBEi3HQAJwqo7BitwIpW42ZYOWHXLYgVOpBI329KYy8WqOwlW4EQqcbMtjbz1FlBS\nErnOxJ2AFTiRStxsSwMuVyR5P/wwk3cSTOBEKlVVlcPrrY65LbzZ1jyDIrKwt94KJ+9+kgQo2NTJ\nqbgSk0iA+O0AvvKV8Xj99VM8Ai4T0Yn74Ycdn7i5lJ7IAPIbdFWjrs7PJC6HvW5ZSnInWyhEgnFW\nSgbY61aFs1CIBOOsFAWamoDok7mYuLPCCpxIMM5KScPliiTvhQuZvFVgAicSjLNSkmhqSpxh8sIL\nxsVjAxzEJNIAD6mIE524Fy5k4laAs1CIyFjsdWdNlwMdCgsLMWLECOTk5GDIkCE4ePCg2ockIjtg\n1a051Qnc5XIhFAph9OjRIuIhIqtj1a0bIdMI2SIhuwoGm1Ff38gVlUqx6taVkAr81ltvRU5ODu6/\n/37cd999IuIiMlwg8HNs2fI2Ojt/MXCbE/b5zupNi1W3IVQn8Ndeew3jx4/Hp59+innz5mHSpEmY\nM2dOzH0CgcDA1z6fD77oXzTZmlUr2GCwGVu2NKGz87mY28MrKmst8TNkI6vDKVh1CxEKhRAKhTL7\nJkmgQCAg/fjHP465TfBTkIXs3Nkkeb0bpHA5Fr54vRuknTubjA4trfLyagnYGBN7/6WsbKPR4Wkm\n/HMn/sx+f03inZ9+OvZOJJSS3KlqIc/Fixdx7tw5AMCFCxfQ2NiIqVOnqnlIshEr7wkSXg7vvBWV\nircBcLmA5cvDX19/PVsmBlGVwP/2t79hzpw5KCkpwezZs7FgwQKUl5eLio0szsp7goSXw5cDiF1R\nmZd3v61XVKbdBqChIXE15V/+okNkJEdVD/yaa67B4cOHRcVCNmPlPUGqqsrR2voKWlv9AGoB5CAv\n7wi++90y2/a/gf6fuzpuK9wNWL16fmziBlh1mwB3IyTNpEwGJtefpGOXwz9o6+Tdb8SIf2DUqOUA\nvkBh4XD8vxuHYfKCssgdmLhNg0vpSbVUM024J4h1yM1AkcCq2yjcC4WESZakefqMffj9NWhsfBQA\nUINN2ISHI//Iv2Hd6bIXCtlfqrnByWea2HeutF31DzrHV92+so0IGRAPpcf9wCmtVNMBrTzThGJ9\n7+j/xCRvFyS4IFli0NmpWIFTWqmStJ4zTayyqlOvOIU+j8uF26KvIvzR3SqDzk7FBE5ppUrSq1fr\nM9MkqyXeGpNLoAB0iVPY67FqFfDUU5HH3dmEhoa9KOsKXP79zjflmyRdpuFKUOnyAKnWT0Eak18S\nv35gSfzOnU2S318jlZVtlPz+Gk2Wyme0xFsHybYJKC1dqUucQl6P+G8mU1GSO1mBU1qROdG1UdMB\nI5VZRcVczas0s/Xak40LjBq1RPb+ouNU9Xrccgvw6quR65xhYllM4KSIHkk6FbOt6kyWQCXpCtnb\nRceZ9evB1ZS2wlkoZAlmO+k9WQK99trhusSZ8esxc2biHiZM3pbHhTw2ZpVZG0plsqpT659dfgHT\nBtTVhQdv08UpIj7FrwerbktSlDs17sNzENMgVt6LWy29fvZsB291+93ccAMHKS1MSe5kArcps83a\n0JPZf3Zd4uMME8tTkjs5iGlTRs3aMEPbxmwzVuJpGt+0acA770Sus11ia0zgNmXErA01i0tEJv5k\nP/u77
x5BMNhs+DiAZr8b9rqdxwwfA0i8dItvRD9XeXm1NGrUoqxaA+l6wv2PX1a2USovr077M8g9\nHrBeAppMMQ4g/HdTUMB2iQ0pyZ2swG0q3eIbUWKr7oDsfdK1BlLtaAhkvjS9//blyxfh7NkiAL0A\n5gOYi9bWuYbvlCj0d8Oq29GYwG1OuvwHLWn0hx2bfLNrDaTqCWe7XW1FxVxMmfIqmpoCso8LGNuv\nV70wiombICiB9/b2YubMmfB4PNixY4eIhySV9Nr8KTb59h8CnNnGVql6wl1dmQ34RSfld989AqAZ\nQOzPm5vba8rNsRRj8qbLhKzErKurQ3FxMVzx/7HIMKn28I4WDDbD76+BzxeA31+DYLA5o+eJTb5z\nAYQPAR41ajn8/lrU1aVvDaRaVZjJgF9/Um5sfBRNTQGcPfscBg/+L4STeOzjKn19TMXl4mpKiqG6\nAj9x4gR27dqF6upqPP744yJiIgGUTFUTUYUmHlw8F17vHtTVrVT8GOl6wkq3q5VLyj09v8CYMYsx\nZcqrMY+7deurCd8PmGeqYQJW3SRDdQJft24dtm7dio6ODhHxkCBKKle1x6H1tyvy8i5gzJhFcLuv\ngsczLqsBuWQ94UwG/JK9aU2ZMgmhUCDmNrNtjpUUEzeloCqB79y5E+PGjUNpaSlCoVDS+wUCgYGv\nfT4ffD6fmqclBRIr48TKVc2CErnq/aqrqjU5dV7pgF8mSVnJ62M4V+LZlFf6ayy/pw3JC4VCKfOo\nLDXzFNevXy95PB6psLBQcrvd0tChQ6V77rkn47mMpI10e3WoWdJtxuXqcvOr8/K+JW3c+GTS+2t9\nEEVW4l5Up+5p43RKcqew7BoKhaQFCxZkFQQZQ82CkrKyjbIJvKxso/aBp7Bx45NSXt6/SsBGCagx\nzeIdxeJeUDO+UZI+lOROofPAOQvFWtQsKDFrD/n110+hs/O5mNvMsHgnrSS97i98Adm7m3awlXQl\nLIGXlZWhrKxM1MORTrJdUGLWHrLZN7KSlWKg0qxvlGQOXInpICJXHuq1VD9TVkl4wWAzKhbEFTwy\nM0zM+kZJ5sAE7hBarDw0+pxMOVZIeHLJe4J3A+pkdko06xslmQOPVHMIv78GjY2Pytxeiz17NhkQ\nkXYyOXpNz5jq6xvxSmPsvHsXIn8bqX4XZthnnfSlJHeyAncIs/SG9UhEZvtk0P/p52jr5pjbXWiK\nuZ5qfxfL7ttCmmICd4h0vWE9EqtTE1HFgjJURF13oQ+AC0AtojfaStanV7tiluyLCdwhUvWG9Uqs\njkxEcTNMolsmQKTiTtWnz/TTE9stzsEE7hCpBsP8/hpdEms4ETUDaET4v14PgHJzT/HLVkLi7q+6\nI8aM+QumTAmkHZjMZkdGp33KcSomcBMTXUkl6w2nqvBExtDRcQLAK4jeLxyoRkfH37J6PKVS/Qya\nVKtxyTu4swneNTUJn37q6h5Q9FyZzKxx5KccB2MCNyk9K6lkFV5HxwnBMVyB2OQNAD+Ay/Vgwj1F\nJdZUryOQ+XFtKcUvyOnrA1yugf53tlMBRezIaMtPOcRDjc1Kzz0w5PZEcbtXSF/60p1CY1C6f0q6\nQ44zkep1FPoaxz+IIJke6My9U+xDSe5kBW5SIiupdNVsfIXX0XECp09fhXPn/klYDIDyXq7INkA2\nr2NGP1+SqluEbD6FWWEhE4nDBG5SopaEK00C0f1xv78Ghw49CqBGSAz9lCYXkW9eqV5HKckiCcU/\nn8aHLWTzRsaVm87CBG5SoiqpbJJAJIFmd0hxMkqTi8j9TNK9jlm9xhpW3dGyfSMz20Im0g4TuEmJ\nqqSySQKRBNr/XLUAcjBmzF8Uz5xIRklyEdkGUPI6ZvQa63jEmVU25iIDmaERT9rJZlBLzUEPoqQ6\nLSfTgT0h4l/Avj7Nn9IMvwcyjpLcyc2sbC62Bx5eRJOb+wmKioZj06b
FSatNM24I1R9XfE/f661G\nXZ1fu/gMPFjYrL8H0p6S3MkE7gDBYDNqa5/BkSND0NX1nwO3a574NKDrroo69bqJ5CjJnYN0ioUM\nVFExF2PHumOSN9A/mLnXoKiyo9tCFbmqm8mbTIaDmA5hlxV6mg/sseomC1FVgXd1dWH27NkoKSlB\ncXEx1q9fLyouEswuMxqqqsrh9VbH3BaeoTJP/YM7uOoOBpvh99fA5wvA769BMNhsdEikgKoKPDc3\nF3/4wx8wdOhQ9PT04KabbsIf//hH3HTTTaLiI0HsskJPk4Uq8ZtP7QglnlcZx05btnIHQ+tS3UIZ\nOnQoAKC7uxu9vb0YPXq06qBIPDut0BO6UEVmv27v2urwJlQpZujYKeFxB0PrUp3A+/r6MGPGDLS2\ntuLb3/42iouLRcRFGuAKvSgp9utOl7zslvCyHR+x06cQq1KdwAcNGoTDhw/j888/h9/vRygUgs/n\ni7lPIBAY+Nrn8yX8O9mb6f7QU56SE5YqedllQLhfNuMjdvsUYgahUAihUCizbxK5cuiRRx6Rtm7d\nmvFqIlLHkJWJConcGlY1mdWU2axUtduWrdms+LTba2BGSnKnqgr8s88+w+DBg3HVVVehs7MTe/fu\nxcaNG9U8JGUok0rIiErYNO2GJKspsxnctcuAcL9sxkfs9inEqlQl8NOnT2P58uXo6+tDX18f7rnn\nHtxyyy2iYiMFlCZIoz7yGv6HnmZedzbJy+gBYS3eiDMdH7HLtFSrU5XAp06dipaWFlGxUBaUJki9\nKuH45NLRcUb2frr8oSvcwySbwV2jBoTN0nu226cQq+JKTItTWgllUglnW+HJJRe3+yG43Stx5sxT\nA7dp/oeepuo2clBV7XObpSVl9KcQCmMCtzillZDSRK+mwpNLLmfOPI4ZMx7E9Ok6/aGnqbqNrGBF\nPLfhLakonJZqAmYYSSV1Uu2dHX0fJTMN1MwuUHposSYU7tdt5OwJEc/N2R/OoSR3sgK3ASWVkNKP\nvGoqPMMGtjLYrzvZz3fy5LmE26LbHR0dfwfQjREjPFm3XURUz+w9UzQmcAdRkujVJGHdk0sWOwcm\n+/laW08jGGweeH3k2h3h80G/CmBuVm0XEW9w7D1TDDN8DCDzUHuMl5J2TrZxRS9WSughZPA4eXn3\nx337egloimlDJGtVADVZty14RBplQknuZAVOMdRWeFoMbEVXwxLU7dddUTEX1177//Hee+GDmoFe\nAPMBzEVX16sD90vW7gh/T1imA4dGV8+m29KAVGMCpwRmm13QP7slPnnP99dgTxb7defnj8V77yUe\nvxbdykjW7ggn/MT7K+X0+eMkFo9UI9Nv5v9KY2zydqEPLkhZT51TciiE3H2ADQDmyd5fK6J+N8nn\nj1vrSD2KxQrc4UxfmaXYOTDb2S1KWhnx9zl37lNI0hcYMeJV5Obu1aX1IfJ3Y6b54ySQGRrxpL1k\nOxaadl5xXEDea9drOvhnxh0dRf5uTPt7pqSU5E5W4A6QqpJLVZkZNuglM6+7Ltis2eBfIPBzbNny\nNjo7fzFwW3yla8RrIbJq5vxxmzLDuwhpK1X1lezfhg37esJ0O8338Va4mlKk8LTCf01ZnRq1p7no\nqlmrKZ6kDSW5kwncAVItcZdLTuF50Sv1/cgd90R6tTHCSTL1FgBGtR84b9zZlOROtlAcINUKwOjB\nujfe+ATt7f+E8LzoV2W/R/igV7KzKRv13K889QpJowYAjZ43TubHBO4A6fqf/XOTfb4AmpoCl+/R\nKPtYQvc1SXM2Zfw2qVr0ocNvbuUIL5OPvD55efdj9epvRt0nkR57mpttTj6ZCxO4Ayit5GITVWJS\nEzboJbOHie/m7wNNiXftr3K1mu4YfnN7Ba2tfgDh1Zl5eUfw3e+WDTwuBwDJrJjAHUJJJRebqML3\nzctbBK93PPLzvyTm43tc8vaXV6N
q1/60Va5WBxlE3tz2Rr25PZhyTjhbGWQWTOA0QD5RPSgmUaXp\ndd99d/7lNw8/wu2bwcjLO4KvfKUMgLZ9aKXb8TJhk9moSuBtbW1YtmwZ/v73v8PlcuFb3/oWqqqq\nRMVGBtAkUSnodR84UIu7787Hli3/NTAfu7MTePbZasya1cxDdIlkqNoLZciQIfjpT3+K9957DwcO\nHMCTTz6JI0eOiIqNrM7lik3efX3wlW2UvWtXVw5ef/1UzGIaILJfh5L9S4icRlUF7na74Xa7AQDD\nhw9HUVERTp06haKiIiHBkYUlOSUnVSXd1ZW8TcI+NLeDpUTCeuDHjx/HoUOHMHv2bFEPSVaU5pSc\nVDM66utTT110ch/a9JuOkSGEJPDz58+jsrISdXV1GD58eMK/BwKBga99Ph98Pp+IpyWzUXA2ZbpK\nmtP15Gk1C4fMIxQKIRQKZfQ9rstLNrN26dIlLFiwALfddhvWrl2b+AQuF1Q+BZldFmdTJhMMNsdN\n6ZvHBAXELbKKKCsLIBRKvJ2sT0nuVFWBS5KElStXori4WDZ5kwPEJergzibUz6/Nuk/r5DZJKpGx\ng2b0T7MEetDRcca4oMhwqhL4a6+9hmeffRbTpk1DaWkpAOCHP/wh5s/nR17bk6m6g7v2s0+rkaqq\ncrz99kqcOeNG9OrY06cfQjDYzNfXoVS3UNI+AVso9pOk1+3316Cx8dGEu/v9tdizJ/EMSsqM17sI\nH310Hfqr7/B2B3P5+tqU5i0UcpgrrgAuXYpcj+t189gu7QSDzTh1ajSA6DfI8Kcbvr7OxQROyshU\n3fHzkjs6/lf2W7laUr36+kZ0df1n3K0/AFCL3FwjIiIzYAKn1MaNAz79NHL9ctUtNy/Z7V4Jt/sh\nnDnz+MBtyaYBclFKZpJ9usnN/RirV6/SORoyCyZwSi7FvG65eclnzjyF0tJVmD499WpJLkrJXLIV\nrMXFX+Jr5mBM4JTI6wU++ihyXWZed7KKcMQID/bsCaR8eC5KyVyyFayPPLIoq8fjJyB7YAKnWApW\nUwLqTqnhYGfmRO4Fw09A9sEETmEzZgCHDkWup1lNqeaUGtFbwzqlmhS1yImfgOyDCZwUV93R1FSE\nIo8oYzWZOX4Csg8mcCebNQt4883I9Qz3MFFaEcpVyHV1fiHtAFaTmePhGPbBBO5UWVTd2UhWIdfV\n+YWsHmQ1mTke0mwfTOBOo7LqzpTWFTKryczxcAz7YAJ3Ep2q7mhaV8hGVZPBYDNqa7fj2LELcLm+\nQGHhcGzatMwySZC7PtoDE7gT6Fx1R9O6QjaimgwGm7Fq1f/gzJmfD9z2j39UY9Wqp/GrX3HwlPTD\n3QjtzoCqO5pcD9zr3YC6Out+ZE+26yJQC78f3BmQhOBuhE5WWQm88ELkuo5VdzQ79luTtYWAHHR1\n6RoKORwTuB0ZXHXHs1u/NVlbCOjlzoCkq0FGB0ACVVbGJu++PsOTt9kFg83w+2vg8wXg99cgGGxO\n+z1VVeVwux+Ku3UD3O6TWL16njaBEslgBW4XJqu6rSDbVZwVFXPxq18BDz/8II4dOw+gG4WFw7Bp\n07/Z6pMGmR8HMa1u1Srgqaci1w3qdVsRj4AjM9NlEPPee+9FMBjEuHHj8M4776h9OMqERapus242\nxVWcZHWqE/iKFSuwevVqLFu2TEQ8pISFqm4zbzbFVZxkdaoHMefMmYNRo0aJiIWUcLlik7ckmTZ5\nA6mW0u81KKKIqqpyeL3VMbeFV3FyIJKsgYOYVvEf/wE8Hjlr0sxVdzQztynsOEednEWXBB4IBAa+\n9vl88Pl8ejytfSg4Ed4sfeV4Zm9T2G2OOllXKBRCKBTK7JskAY4dOyZNmTJF9t8EPYUzPfSQJIWb\nJOFLX58kSZK0c2eT5PVuiPknr3eDtHNnk8EBJ5KPdb0pYyUyEyW5ky0Us8rwRHizHmLANgWRdlQn\
n8CVLlqCpqQlnz55FQUEBHnnkEaxYsUJEbM60aRPw8MOR6xmcCG+GvrIctimItKE6gW/btk1EHATo\nciI8iWOVcQiyL7ZQzEBB1R2NR2IZz8zz28k5uJTeaFmupgwGm9HQsDeqrzxPNnGwStQGl+GT1rgf\nuJk1NABVVZHrGpwIzypRO1YbhyB74nayRnC5YpO3RqspzbwK0uo4DkFmwASup2ee0XW/blaJ2uEy\nfDIDtlD0YsDOgawStcP57WQGTOBae+YZYPnyyHUd9zD553/+Mvbv/3d0dv5i4DbOVhGH89vJaEzg\nWjJwv+5gsBnPPnsSnZ1LAdQCyEFe3hHcfXeZZZIOZ9AQpcYEroWXXwb+5V8i1w3YOTB2ADOc9Do7\ngQMHanWNI1ucQUOUHgcxRXO5YpO3Qft1W30AkzNoiNJjAhelqclUJ8JbfQDT6m9ARHpgAhfB5QL6\n9zhfuNAUp+RYfZqb1d+AiPTAHrgaTU2RxA2Y6pQcq09zM9t+LxxQJTPiXijZik7UCxcCL7xgXCw2\npXS/Fz3iiB9Q9XqrUVfnZxInzSjJnUzgmXrrLaCkJHLdRFU3aYMbV5ERuJmVaNGJes0a4IknjIuF\nZGnR6uCAKpkVE7gSJqm62YdNTau54xxQJdPS6DzOATo8hbZyciKn8dbWGhaGlQ4yNkp5eXXM69N/\n8ftrVD0uD2YmIyjJnaor8D179mDt2rXo7e3FqlWr8L3vfU/9u4oZfPwxUFgYuW5wr9tKBxkbRatW\nh9Vn9JB9qUrgvb29+M53voN9+/YhPz8fs2bNwh133IGioiJR8RnjiSeAdevCX2/fDixaZGw8YB9W\nCS1bHdy4isxI1UKegwcPYsKECSgsLMSQIUOwePFivPTSS6Ji0197e7jKXrcOuOuu8KdlEyRvgH1Y\nJay+eIkoU6oq8JMnT6KgoGDgusfjwRtvvKE6KENEV93vvgtMnmxsPHHMtrDFjNjqIKdRlcBddpj/\n3N4OjBoV/vquu4D//m9j40mCyUkZtjrISVQl8Pz8fLS1tQ1cb2trg8fjSbhfIBAY+Nrn88EXvfzc\nSCavuuMxORHZVygUQigUyuh7VK3E7OnpwfXXX4/f//73+PKXv4wbb7wR27ZtixnENOVKTItU3UTk\nXEpyp6pBzMGDB+NnP/sZ/H4/iouLsWjRIvPPQHniiUjyfvddJm8isizn7IXCqpuILETzCtwyWHUT\nkQ3Zey8UVt1EZGP2rcBZdRORzdmvAmfVTUQOYa8KnFU3ETmIPSpwVt1E5EDWr8BZdRORQ1m3AmfV\nTUQOZ80KnFU3EZHFKnBW3Y7B8z+J0rNOArfYzoGUPa0OJyayG2u0UFpbY0/JYfK2teTnf+41KCIi\nc7JGBX7ttUBPD5DD8x+dgOd/EiljjQrc5WLydhCe/0mkjDUSODkKDycmUsY5+4GTpQSDzWho2Bt1\n/uc8DmCSoyjJnUzgREQmxAMdiIhsjAmciMiisk7gzz//PCZPnoycnBy0tLSIjImIiBTIOoFPnToV\nL774IubOtcfAUigUMjoERawQpxViBBinaIxTf1kn8EmTJmHixIkiYzGUVX6pVojTCjECjFM0xqk/\n9sCJiCwq5VL6efPm4cyZMwm3b968GbfffrtmQRERkQKSSj6fT/rzn/+c9N+9Xq8EgBdeeOGFlwwu\nXq83bf4VsplVqsnmR48eFfEUREQUJ+se+IsvvoiCggIcOHAAFRUVuO2220TGRUREaWi+lJ6IiLSh\nyywUMy/VHc/IAAAE1klEQVT62bNnDyZNmoTrrrsOP/rRj4wOR9a9996Lq6++GlOnTjU6lJTa2tpw\n8803Y/LkyZgyZQrq6+uNDklWV1cXZs+ejZKSEhQXF2P9+vVGh5RSb28vSktLTT1xoLCwENOmTUNp\naSluvPFGo8OR1d7ejsrKShQVFaG4uBgHDhwwOqQEH3zwAUpLS
wcuI0eOTP13pHYQU4kjR45IH3zw\nQdoBT7319PRIXq9XOnbsmNTd3S1Nnz5dev/9940OK0Fzc7PU0tIiTZkyxehQUjp9+rR06NAhSZIk\n6dy5c9LEiRNN+XpKkiRduHBBkiRJunTpkjR79mxp//79BkeU3E9+8hNp6dKl0u233250KEkVFhZK\nZ8+eNTqMlJYtWyY99dRTkiSFf+/t7e0GR5Rab2+v5Ha7pU8++STpfXSpwM266OfgwYOYMGECCgsL\nMWTIECxevBgvvfSS0WElmDNnDkb1H+ZsYm63GyUlJQCA4cOHo6ioCKdOnTI4KnlDhw4FAHR3d6O3\ntxejR482OCJ5J06cwK5du7Bq1SrT7+pp5vg+//xz7N+/H/feey8AYPDgwRg5cqTBUaW2b98+eL1e\nFBQUJL2PoxfynDx5MubF8Xg8OHnypIER2cfx48dx6NAhzJ492+hQZPX19aGkpARXX301br75ZhQX\nFxsdkqx169Zh69atGDTI3H+qLpcLt956K2bOnIlf/vKXRoeT4NixYxg7dixWrFiBGTNm4L777sPF\nixeNDiul7du3Y+nSpSnvI+x/xbx58zB16tSEy44dO0Q9hXAul8voEGzp/PnzqKysRF1dHYYPH250\nOLIGDRqEw4cP48SJE2hubjbl8uqdO3di3LhxKC0tNXV1CwCvvfYaDh06hN27d+PJJ5/E/v37jQ4p\nRk9PD1paWvDAAw+gpaUFw4YNw2OPPWZ0WEl1d3djx44duOuuu1LeT9ihxnv3Wu/E8Pz8fLS1tQ1c\nb2trg8fjMTAi67t06RK+8Y1v4O6778bXv/51o8NJa+TIkaioqMCbb74Jn89ndDgx/vSnP+Hll1/G\nrl270NXVhY6ODixbtgzPPPOM0aElGD9+PABg7NixuPPOO3Hw4EHMmTPH4KgiPB4PPB4PZs2aBQCo\nrKw0dQLfvXs3brjhBowdOzbl/XT/XGamSmLmzJn461//iuPHj6O7uxvPPfcc7rjjDqPDsixJkrBy\n5UoUFxdj7dq1RoeT1GeffYb29nYAQGdnJ/bu3YvS0lKDo0q0efNmtLW14dixY9i+fTu++tWvmjJ5\nX7x4EefOnQMAXLhwAY2NjaabMeV2u1FQUIAPP/wQQLi/PHnyZIOjSm7btm1YsmRJ+jvqMZr6u9/9\nTvJ4PFJubq509dVXS/Pnz9fjaRXZtWuXNHHiRMnr9UqbN282OhxZixcvlsaPHy9dccUVksfjkX79\n618bHZKs/fv3Sy6XS5o+fbpUUlIilZSUSLt37zY6rARvv/22VFpaKk2fPl2aOnWqtGXLFqNDSisU\nCpl2FspHH30kTZ8+XZo+fbo0efJk0/4dHT58WJo5c6Y0bdo06c477zTtLJTz589LY8aMkTo6OtLe\nlwt5iIgsytxD20RElBQTOBGRRTGBExFZFBM4EZFFMYETEVkUEzgRkUUxgRMRWRQTOBGRRf0fUkOT\n5qBJPvYAAAAASUVORK5CYII=\n", "text": [ "" ] } ], "prompt_number": 9 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Yikes -- there is no correlation at all! Cross-validation did **not** detect the overfitting, because we used the entire data to select \"good\" features." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The right way to cross-validate\n", "\n", "For cross-validation to detect overfitting, we can't let *any* information about the held-out data leak into model fitting. Thus, we must re-select good features in each cross-validation iteration, using only that iteration's training fold." ] }, { "cell_type": "code", "collapsed": false, "input": [ "scores = []\n", "\n", "for train, test in KFold(len(y), n_folds=5):\n", "    xtrain, xtest, ytrain, ytest = x[train], x[test], y[train], y[test]\n", "\n", "    # select features using the training fold only\n", "    b = SelectKBest(f_regression, k=2)\n", "    b.fit(xtrain, ytrain)\n", "    xtrain = xtrain[:, b.get_support()]\n", "    xtest = xtest[:, b.get_support()]\n", "\n", "    # fit on the training fold, score on the held-out fold\n", "    clf.fit(xtrain, ytrain)\n", "    scores.append(clf.score(xtest, ytest))\n", "\n", "    yp = clf.predict(xtest)\n", "    plt.plot(yp, ytest, 'o')\n", "    plt.plot(ytest, ytest, 'r-')\n", "\n", "plt.xlabel(\"Predicted\")\n", "plt.ylabel(\"Observed\")\n", "\n", "print(\"CV Score is \", np.mean(scores))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "CV Score is -1.64839183777\n" ] }, { "metadata": {}, "output_type": "display_data", "png": 
"iVBORw0KGgoAAAANSUhEUgAAAXcAAAEPCAYAAAC5sYRSAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3X9UVHXeB/D3ADZToKatqaWJO1nywwQr3LVkJ1wYN2IF\n8+QKbYqKZY8zZFtnN7HNzdR2faqHmezpx1PtlqSWHbQcIzBEUFO3oHK1EzXi5g90/RH5A2Zwhvv8\nMYIzMcjAzJ079877dc6cM1yHuZ8h/PT2c7/3XpUgCAKIiEhRIqQugIiIAo/NnYhIgdjciYgUiM2d\niEiB2NyJiBSIzZ2ISIFEbe5NTU2YNm0a4uLiEB8fj127dom5OyIiuihKzDcvLCzE3XffjfXr18Ph\ncOD8+fNi7o6IiC5SiXUS048//ojk5GQcOHBAjLcnIqLLEG0s09DQgEGDBiE/Px/jxo1DQUEBmpub\nxdodERG5Ea25OxwO1NbW4uGHH0ZtbS2io6Px7LPPirU7IiJyJ4iksbFRiI2N7fi6pqZGyMzM9HiN\nVqsVAPDBBx988NGDh1ar7bYHi5bchwwZguHDh6O+vh4AsGXLFiQkJHi8xmq1QhAExT6eeuopyWvg\n5+PnC8fPp+TPJggCrFZrtz1Y1NUyZrMZeXl5aG1thVarxZtvvinm7oiI6CJRm/vYsWPxz3/+U8xd\nEBGRFzxDVUQ6nU7qEkTFzydvSv58Sv5svhJtnbtPO1epIOHuSQLVFgvKTSZE2e1wqNXIMBqRmpkp\ndVlEsuJL7xR1LEPkrtpiwceFhVjmdjCo6OJzNniiwOJYhoKm3GTyaOwAsMxqRYXZLFFFRBJrbQVO\nnxblrdncKWii7Hav2yNttiBXQhQC3nsPUKuBIUNEeXuOZShoHGq11+1OjSbIlRBJyGYDBg8GzpwB\nbrsN2L1blN0wuVPQZBiNKNJqPbYt0mqRbjBIVBFRkL33HnDlla7GXl0N/POfQIQ4bZirZSioqi0W\nVJjNiLTZ4NRokG4w8GAqKZ+3tO5HU/eld7K5ExGJ6b33gPvucz2vrgYmTvT7LbkUkohIKgFO6z3F\nmTsRUaAFcbbeFSZ3IqJAkTitu2NyJyIKhBBI6+6Y3ImI/BFCad2d9BUQEclViKV1d0zuREQ9FaJp\n3V1oVUNEFOpCOK27Y3InIvKFDNK6u9CtjIgoVMgkrbtjcici6orM0ro7eVRJRBRsMkzr7pjciYjc\nyTitu5NfxUREYpF5WnfH5E5EpJC07k7e1RMR+UtBad0dkzsRhScFpnV3yvkkRES+Umhad8fkTkTh\nQ+Fp3Z3ozT02Nhb9+vVDZGQk+vTpgz179oi9SyKizkS4l2koE725q1QqVFVVYeDAgWLvioioszBK\n6+6C8gm7u0s3EZEowmC23hWVIHLn/fnPf47+/fsjMjISDz74IAoKCi7tXKVi4yeiwFN4Wveld4o+\nltmxYweGDh2KEydOID09HaNHj8ZEhc+6iEhCYTZb74rozX3o0KEAgEGDBiEnJwd79uzxaO5Llizp\neK7T6aDT6cQuiYiUSMFpvaqqClVVVT36HlHHMs3NzXA6nejbty/Onz+PjIwMPPXUU8jIyHDtnGMZ\nIgqEMEvrko9ljh8/jpycHACAw+FAXl5eR2MnIvKbgtO6v0Q/oHrZnTO5E1FvhVladyd5ciciCjim\ndZ/wJ0JE8hHG69Z7ismdiEIf03qP8adDRKGNab1XmNyJKDQxrfuFPykiCj1M635jciei0MG0HjD8\nqRFRaGBaDygmdyKSFtO6KPgTJCLpMK2LhsmdiIKPaV10/GkSUXAxrQcFkzsRBQfTelDxJ0tE4mNa\nDzomdyISD9O6ZPhTJiJxMK1LismdiAKLaT0k8CdORIHDtB4ymNyJyH9M6yGHP30i8g/TekhicicA\nQLXFgnKTCVF2OxxqNTKMRqRmZkpdFoUypvWQxuZOqLZY8HFhI
ZZZrR3bii4+Z4Mnr957D7jvPtfz\n6mpg4kRp66FOVIIgCJLtXKWChLunixbr9XimvLzT9if1eiwtK5OgIgpZTOshwZfeyf8qhCi73ev2\nSJstyJVQSNu3Dxg4kLN1meBYhuBQq71ud2o0Qa4kOCotldhg2gCVXQVBLSDbmI20zDSpywpdDgew\nciXw/PPA//wPUFAAqFRSV0XdYHMnZBiNKLJaPWbui7RaTDYYJKxKHJWWSqwpXIM8a17HthJrCQCw\nwXuzbx+Qnw/07w989hkwYoTUFZGPOHMnAK6DqhVmMyJtNjg1GqQbDIo8mGrUGzG1fGqn7aX6UhSX\nFUtQUYiy2YBZs4AtW4Dly5nWQ4wvvZPJnQC4VsUosZn/lMreRYPi4YVLNmwA8vIAjQaoqQHi46Wu\niHqBR0MorAjqLtKOMg8v9IzNBuj1wNSprseJE2zsMiZ6c3c6nUhOTkZWVpbYuyLqVrYxGyXaEo9t\nq7WrMcUwRaKKQsSGDcA117jm6tu3A2+/zZUwMif6WKa4uBjx8fE4e/as2Lsi6lb7QdNSc6lrFKMB\ncg254Xsw1WYDpkwBKipco5h//INNXSFEPaB6+PBhzJo1C0VFRXj++efx4Ycfeu6cB1QVxVJZCdOG\nDbCrVFALAozZ2chMC9OmKQfus/UPPwQmTJC6IvKR5AdUFy5ciJUrV+LMmTNi7oZCgKWyEoVr1sCa\nd2mJobXENf5ggw8xTOthQbTmvmnTJlx77bVITk5GVVVVl69bsmRJx3OdTgedTidWSSQi04YNHo0d\nAKx5eTCXlrK5h5LHHnOdjDRggGu2zrQuC1VVVZfto96INpZZtGgR3n77bURFRcFms+HMmTO49957\n8dZbb13aOccyiqErLMS2nJxO239VWoqqYq4fl1xTE/CznwFOJzBkCHDkCNO6jEl6bZnly5fj0KFD\naGhowNq1a5GWlubR2El8lspK6I1G6AoLoTcaYamsFG1f6i5+0bjCMAQ89pgrqTudwEsvAY2NbOxh\nIGgnMal4dltQBXsGbszOhrWkxGN/2tWrYcjNDfi+yEfuaf2aa4Bjx4AonrcYLnj5AYXSG40on9r5\nNHt9aSnKRBqTWCorYd64sX2FIQxTpnDeLpXHHgOee871/KWXgPnzpa2HAkry1TIkHXsX/1IS8yz7\nzLQ0NnOpuaf1gQOB48eZ1sMUB28KxRl4GPrpbP3UKTb2MMb/8grFGXgY4WydvODMXcE4Aw8DnK2H\nJV96J5s7kRwxrYc13kOVSIl+Ols/eZKNnTrhbwSRXDCtUw8wuRPJAdM69RB/O0KYxVINk6kcdnsU\n1GoHjMYMZGamSl0WBRPTOvUSf0tClMVSjcLCj2G1LuvYZrUWAQAbfLjgShjyA1fLhCi9fjHKy5/x\nsv1JlJUtlaAiCppjx4ChQ13PmdbJC79Wy8TExKBv375eH/369Qt4seTJbvf+l9lmiwxyJRRUWVmX\nGvujj3K2Tr3W5W/NuXPnAACLFy/Gddddh/vvvx8AUFJSgqNHjwanujCmVju8btdonEGuhILCPa1H\nRgLNzcAVV0hbE8lat6tlPvjgAzz88MPo168f+vXrh/nz52Pjxo3BqC2sGY0Z0GqLPLZptYtgMKRL\nVBGJ5qdp3eFgYye/dfvvvejoaKxevRozZswAAKxduxYxMTGiFxbu2g+ams1PwmaLhEbjhMEwmQdT\nlYRpnUTU7QHVhoYGFBYWYufOnQCAO+64A8XFxYiNjfV/5zygSuEqKwvYtMn1/NFHL62KIfIBry1D\nFGqY1ikAAnJtmW+++QaTJk1CQkICAOCrr77CM890XqJHRN3gbJ2CqNvknpqaipUrV+Khhx5CXV0d\nBEFAYmIi9u3b5//OmdwpHDCtU4AFJLk3Nzdj/PjxHm/ap08f/6sjCgdM6ySRblfLDBo0CN99913H\n1+vXr8fQ9l9WIvKOaZ0k1
u1Yxmq1Yt68edi5cycGDBiAkSNHoqSkhKtliLrClTAksoCslnE6nYiM\njMS5c+fQ1tYW0EsPsLmTojCtU5AEZOY+cuRIzJs3D7t370bfvn0DVhyRonC2TiGm2+R+/vx5bNq0\nCWvXrkVtbS2ysrIwffp0TJw40f+dM7mT3DGtkwQCktyjo6Mxffp0lJaW4osvvsCPP/4InU4XqBqJ\n5ItpnUJYt6tlBEHAtm3bsG7dOpSVleH222/Hu+++G4zaiEIT0zrJQLdjmdjYWCQlJWH69OnIysoK\n6EXDOJYh2eFKGAoBvvTOyyZ3p9OJ2bNn489//nOvCrDZbPjVr34Fu92O1tZWTJkyBStWrOjVexFJ\nimmdZOayM/fIyEh8+OGHvX5zjUaDrVu34osvvsBXX32FrVu3Yvv27b1+PyJJcLZOMtTtzP3OO+/E\nggULMH36dERHR3dsHzdunE87uOqqqwAAra2tcDqdGDhwYC9LJQoypnWSsW5n7jqdDiqVqtP2rVu3\n+rSDtrY2jBs3DlarFfPnz8ff/va3SzvnzJ1CFWfrFML8nrkDQFVVlV9FREREdCyh1Ov1qKqq8lhK\nuWTJko7nOp2OyyxJWkzrFIKqqqp63Iu7Te7Hjh1DUVERjhw5grKyMuzfvx+ffvop5syZ0+MCly5d\niiuvvBKPPfaYa+dM7hRKmNZJJgJyEtOsWbOQkZGBo0ePAgBGjRqFF154wacCTp48iaamJgBAS0sL\nKioqkJyc7NP3EgXNsWOASuVq7JGRgN3Oxk6y121zP3nyJKZPn47IyEgAQJ8+fRAV1e00BwDQ2NiI\ntLQ0JCUlYfz48cjKysKkSZP8q5gokLgShhSq2y4dExODU6dOdXy9a9cu9O/f36c3HzNmDGpra3tf\nHZFYLs7WWwG8jyHYdec0CP+6gGxLJdIy06Sujshv3Tb35557DllZWThw4AAmTJiAEydOYP369cGo\njUgcF2frpxGBv/bPwG9+/CNyLp5+UWItAQA2eJK9bg+oAsCFCxfwzTffAABuvvnmgN1mjwdUKah+\nshLGeNdDmLplWqeXlepLUVxWHOTiiHwXkAOq7777LlpaWpCYmIjS0lJMnz6doxaSH/fZ+sKFgMMB\n1YVI76+1Ba8sIrF029yXLl2Kfv36Yfv27fjkk08we/ZsPPTQQ8Gojch/3lbCPP88AEBQd5F8NEGs\nj0gk3Tb39lUymzZtQkFBAe655x5cuHBB9MKI/OYlrbuvhMk2ZqNEW+LxLau1qzHFMCWYVfaKpcIC\nfb4eulk66PP1sFRYpC6JQky3B1Svv/56zJs3DxUVFfjTn/4Em82Gtra2YNRG1Ds+nmXaftC01Fzq\nGsVogFxDbsgfTLVUWFC4qhDWZGvHNusq1/PM9EypyqIQ49Nt9srKynDLLbdg1KhRaGxsxN69e5GR\nkeH/znlAlQJNpwO2bXM9X7iwYwSjJPp8Pcpjyztv/7ceZW+USVARBVtAri0THR2N2NhYbN68GRER\nEbjjjjsC0tiJAurwYWD4cNfziAigpUWxJyPZBbvX7bY2HgmmS7qduT/99NOYNWsWTp8+jRMnTiA/\nPx9Lly4NRm1EvtHpLjX2uXMBp1OxjR0A1Cq11+2aCB4Jpku6HcvcdNNN+Oqrr6DRuH5xWlpaMHbs\nWNTX1/u/c45lyB/uaV2lcqV1tffGpyTeZu7aWi2KFxRz5h4mAjKWuf7669HS0tLR3G02G4YNGxaY\nCol6y322Pncu8NprkpbjC4ulGiZTOez2KKjVDhiNGcjMTO3x+7Q3cPMaM2xtNmgiNDAsMLCxk4cu\nm7vBYAAA9O/fHwkJCUhPT4dKpUJFRQVSUlKCViCRB5mmdYulGoWFH8NqXdaxzWotAoBeN3g2c7qc\nLscyf//736FSqdDc3AyHwwEAiIqK6rht3syZM/3fOccyshSoBNpjMkzr7fT6xSgvf8bL9id
RVsZj\nWNQzfo1l8vLyUFRUhDfeeAM33HADAOD7779Hfn4+li1b1tW3kcIFOoH6RKZp3Z3d7v2vms3WxSUQ\niPzU5WqZxx9/HKdPn0ZDQwNqa2tRW1uLAwcOoKmpCY8//ngwa6ReslRWQm80QldYCL3RCEtlpd/v\naTKVezR2ALBal8FsrvD7vb366UqYtjbZNXYAUKsdXrdrNM4gV0LhosvkvmnTJtTX1yMi4lL/79ev\nH15++WXcfPPNKC7mVfNCmaWyEoVr1sCal9exzVriOtU+M633Z2AGLYG6n2Uq07TuzmjMgNVa5PE/\nRq12EQyGyRJWRUrWZXOPiIjwaOztIiMjvW6n0GLasMGjsQOANS8P5tJSv5p7UBLoffcB773nev7I\nI4CPt3UMBZYKC0zvmGAX7FCr1DDmGl0HPy+OrMzmJ2GzRUKjccJgmBycYxUUlrps7nFxcfjHP/7R\n6cDp22+/jdGjR4teGPnHrlJ53e7vOYyiJtCTJ4EbbriU0s+ckdXJSN1d8yUzM5XNnIKmy+a+atUq\nTJ06FW+88QZuvfVWAMDnn3+O5uZmlJaWBq1A6h11F0fS/T2HUbQEOncu8PrrrufLlwNPPOFnpcFn\nesfk0dgBwJpshXmNmcsWKei6bO7Dhg3D7t27UVlZiX379kGlUiEzM5M3uJYJY3Y2rCUlHqMZ7erV\nMOTm+v3eAU2g7mn9mmuAxkYgQHf6CjZe84VCyWXPUFWpVJg0aRIbugy1z9XNpaXtV7OFITfXr3l7\nwCkgrbvjNV8olPh0D1XRds6TmMKTgtK6O17zhYLFl97J5k7BpbC0/lOWCovnNV9m8JovFHhs7hQ6\nFJrWiaTgS+/kgnUS39y5wKBBrsa+bJmr0bOxE4mq20v+EvUa0zqRZJjcSRxM6z6ptFTCqDeiUFcI\no96ISov/1/8hApjcKdCOHwdGjmRa90GlpRJrCtcgz3rpXIQSq+v6P2mZIbRklWRJ1OR+6NAh3HXX\nXUhISEBiYiJMJpOYuyOp5eQAQ4Ywrftog2mDR2MHgDxrHjaaN0pUESmJqMm9T58+eOGFF5CUlIRz\n587h1ltvRXp6OuLi4sTcLQWb+xUcr7gCOHtWVteEkYrK7v36P35fAIgIIif3IUOGICkpCQAQExOD\nuLg4HD16VMxdUrDl5Fxq7H/4A2C3s7H7SFB3sZSNJ7RSAARt5n7w4EHU1dVh/PjxwdoliYlp3W/Z\nxmyUWEs8RjOrtauRa/D/+j9EQWnu586dw7Rp01BcXIyYmBiPP1uyZEnHc51OB51OF4ySyB85OcCG\nDa7nf/gD8N//LW09MtV+0LTUXIr2CwDlGnJ5MJU6qaqqQlVVVY++R/QzVC9cuIB77rkHv/nNb/DI\nI4947pxnqMoL0zpRSJD8DFVBEDBnzhzEx8d3auwkM5ytE8mKqMl9+/btSE1NxS233ALVxTsDrVix\nApMnu+7aw+QuA0zrRCGHFw4j/3C2ThSSfOmdPEOVOmNaJ5I9XluGPHG2Tl3gdXDkhcmdXJjW6TJ4\nHRz5YXInYOZMxaX1aosFi/V6LNHpsFivR7XFInVJssbr4MgPk3s4a2oCtFrg9GngyitdX8u8qQOu\nxv5xYSGWWS/dy7To4vPUTN7yrjd4HRz5YXIPV4sWAQMGAD/8ALz8MtDcrIjGDgDlJpNHYweAZVYr\nKsxmiSqSP14HR36Y3MONe1ofORKorweilPVrEGW3e90eaWPM7C1eB0d+lPW3mi5v0SJgxQpApXKl\n9QcflLoiUTjUaq/bnRrGzN7idXDkhycxhYMwSOvuvM3cF2m1mFxczJk7KQLPUCXPtP6//6vYtP5T\n1RYLKsxmRNpscGo0SDcYfGrslgoLTO+YYBfsUKvUMOYakZnO/yFQaGFzD2dhkNYD3YgtFRYUriqE\nNflS4tfWaVH8X8Vs8BRSePmBcBUGs3Vvjdi6yvW8t43
Y9I7J4/0AwJpshXmNmc2dZIfNXUnOnQNu\nvBE4flyxab2dGI3YLnhfZWNr4yobkh+uc1cKsxno3x84dcqV1g8cUGxjB8RpxGqV91U2mgiusiH5\nUe7f/nBx7hyQnAx89x0wfjywYwcQGSl1VaIToxEbc42wrrJ6ztxrtTAsMPT6PYmkwuYuZ2Yz8Mgj\nQEQE8P77wNSpUlcUNGI04vZxjnmNGbY2GzQRGhgWGDhvJ1niahk5CtO0/lOWCotnI57BRkzhgUsh\nlcg9ra9bF1ZpnYhcuBRSSZjWiagHuFpGDtpXwhw86Jqt79rFxk5El8XkHsqY1omol5jcQxXTOhH5\ngck91DCtE1EAMLmHEqZ1IgoQJvdQwLQuumqLBeUmE6LsdjjUamQYjby2Oykam7vUwvgs02DhDbMp\nHPEkJqnYbMDttwP/+hfTusgW6/V4pry80/Yn9XosLSuToCIi//jSOzlzl8LGjcA11wCHDnG2HgS8\nYTaFI1Gb++zZszF48GCMGTNGzN3Ih80G6PVATg5w772uuyRxDCM63jCbwpGozT0/Px9l/GevS3ta\n/+wz1wjmrbdcc3YSXYbRiCKt1mPbIq0W6QZeypeUS9QDqhMnTsTBgwfF3EXos9mAKVOAigrg/vuB\nv/+dTT3I2g+aPul2w+zJPt4wm0iuuFpGTBs3Arm5gEbjSuu//KXUFYWt1MxMNnMKK4yQYvjpbP3E\nCTZ2IgoqyZP7kiVLOp7rdDrodDrJagkIpnUiCrCqqipUVVX16HtEX+d+8OBBZGVlYe/evZ13rqR1\n7pytE1GQSL7OfcaMGZgwYQLq6+sxfPhwvPnmm2LuTjpcCUNEIYZnqPqDaZ2IJMDb7ImJs3UiCmFs\n7j0lo7RuqayEacMG2FUqqAUBxuxsZKalSV0WEQUBm3tPyCitWyorUbhmDax5eR3brCUlAMAGTxQG\nQjNyhhqHA/jjH2W1bt20YYNHYwcAa14ezBs3SlQREQUTk3t39u0DZs0CBgwAvvwSkMlF0Owqldft\nvA4iUXhgcu+KwwGsWAHodMC8ecDHH8umsQOAuosj6bwOIlF4YHP3Zt8+19hl61bg88+BggKgiyQc\nqozZ2dBenLG3065eDcOUKRJVRETBxHXu7hwOYOVK4PnngeXLgblzZdfU3VkqK2HeuBE2uBK7YcoU\nHkwlUgBfeiebezv32fr//R9www1SV0RE5JXklx+QBW+zdTZ2IpK58F4t457WP/+cTZ2IFCM8kzvT\nOhEpXPgld6Z1IgoD4ZPcmdaJKIyER3JnWieiMKPs5M60TkRhSrnJnWmdiMKY8pI70zoRkcKSO9M6\nEREApSR3pnUiIg/yT+5M60REncg3uTOtExF1SZ7JnWmdiOiy5JXcmdaJiHwin+TOtE5E5LPQT+5M\n60REPRbayZ1pnURgsVTDZCqH3R4FtdoBozEDmZmpUpdFFFCh2dwVdi9TCh0WSzUKCz+G1bqsY5vV\nWgQAbPCkKKKOZcrKyjB69GiMGjUKf/3rX337pv37gV/+Eti61ZXWCwrY2ClgTKZyj8YOAFbrMpjN\nFRJVRCQO0Zq70+nEggULUFZWhv3792PNmjX4+uuvu//G06ddDV0Bs/WqqiqpSxCVHD+f3e79H6s2\nW2SnbXL8fD2h5M+n5M/mK9Ga+549e3DjjTciNjYWffr0we9+9zts3Lix+2+8807XgVMFpHWl/4LJ\n8fOp1Q6v2zUaZ6dtcvx8PaHkz6fkz+Yr0Zr7kSNHMHz48I6vhw0bhiNHjoi1OyKfGI0Z0GqLPLZp\ntYtgMKRLVBGROEQ7oKpSQPIm5Wk/aGo2PwmbLRIajRMGw2QeTCXlEUTy6aefCnq9vuPr5cuXC88+\n+6zHa7RarQCADz744IOPHjy0Wm23PVglCIIAETgcDtx888345JNPcN111yElJQVr1qxBXFycGLsj\nIiI3oo1loqKi8OK
LL0Kv18PpdGLOnDls7EREQSJaciciIulIdm2ZXp3gJBOzZ8/G4MGDMWbMGKlL\nEcWhQ4dw1113ISEhAYmJiTCZTFKXFFA2mw3jx49HUlIS4uPj8cQTT0hdUsA5nU4kJycjKytL6lIC\nLjY2FrfccguSk5ORkpIidTkB19TUhGnTpiEuLg7x8fHYtWuX9xcG9CiqjxwOh6DVaoWGhgahtbVV\nGDt2rLB//34pShFFdXW1UFtbKyQmJkpdiigaGxuFuro6QRAE4ezZs8JNN92kqP9+giAI58+fFwRB\nEC5cuCCMHz9eqKmpkbiiwHruueeE3NxcISsrS+pSAi42NlY4deqU1GWI5oEHHhBef/11QRBcv59N\nTU1eXydJcu/1CU4yMXHiRAwYMEDqMkQzZMgQJCUlAQBiYmIQFxeHo0ePSlxVYF111VUAgNbWVjid\nTgwcOFDiigLn8OHD2Lx5M+bOnQtBoVNZpX6uH3/8ETU1NZg9ezYA17HN/v37e32tJM2dJzgpx8GD\nB1FXV4fx48dLXUpAtbW1ISkpCYMHD8Zdd92F+Ph4qUsKmIULF2LlypWIiAj9K373hkqlwq9//Wvc\ndttteO2116QuJ6AaGhowaNAg5OfnY9y4cSgoKEBzc7PX10ryX5cnOCnDuXPnMG3aNBQXFyMmJkbq\ncgIqIiICX3zxBQ4fPozq6mrFnM6+adMmXHvttUhOTlZsut2xYwfq6urw0UcfYdWqVaipqZG6pIBx\nOByora3Fww8/jNraWkRHR+PZZ5/1+lpJmvv111+PQ4cOdXx96NAhDBs2TIpSqJcuXLiAe++9F/ff\nfz+ys7OlLkc0/fv3R2ZmJj777DOpSwmInTt34oMPPsDIkSMxY8YMVFZW4oEHHpC6rIAaOnQoAGDQ\noEHIycnBnj17JK4ocIYNG4Zhw4bh9ttvBwBMmzYNtbW1Xl8rSXO/7bbb8O233+LgwYNobW3FunXr\n8Nvf/laKUqgXBEHAnDlzEB8fj0ceeUTqcgLu5MmTaGpqAgC0tLSgoqICycnJElcVGMuXL8ehQ4fQ\n0NCAtWvXIi0tDW+99ZbUZQVMc3Mzzp49CwA4f/48ysvLFbVqbciQIRg+fDjq6+sBAFu2bEFCQoLX\n10pysw6ln+A0Y8YMbNu2DadOncLw4cPx9NNPIz8/X+qyAmbHjh1YvXp1x3IzAFixYgUmT54scWWB\n0djYiJkzZ6KtrQ1tbW34/e9/j0mTJkldliiUNiI9fvw4cnJyALhGGHl5ecjIyJC4qsAym83Iy8tD\na2srtFrBia6vAAADoklEQVQt3nzzTa+v40lMREQKpMzD5UREYY7NnYhIgdjciYgUiM2diEiB2NyJ\niBSIzZ2ISIHY3EnWIiMjkZycjDFjxuC+++5DS0tLr99r1qxZeP/99wEABQUF+Prrr7t87bZt2/Dp\np5/2eB+xsbE4ffp0r2sk8hWbO8naVVddhbq6OuzduxdXXHEFXn75ZY8/dzgcPr+XSqXqOKnntdde\nu+yJdVu3bsXOnTt7XK/SThqi0MXmTooxceJEfPfdd9i2bRsmTpyIKVOmIDExEW1tbXj88ceRkpKC\nsWPH4tVXXwXguozCggULMHr0aKSnp+M///lPx3vpdDp8/vnnAFw3lrn11luRlJSE9PR0/Pvf/8Yr\nr7yCF154AcnJydixYwdOnDiBadOmISUlBSkpKR2N/9SpU8jIyEBiYiIKCgoUe7EuCj2SXH6AKNAc\nDgc2b96Mu+++GwBQV1eHffv2YcSIEXj11Vdx9dVXY8+ePbDb7bjzzjuRkZGB2tpa1NfX4+uvv8ax\nY8cQHx+POXPmALiU4k+cOIF58+ahpqYGI0aMQFNTE66++mo89NBD6Nu3Lx599FEAQG5uLhYuXIg7\n7rgD33//PSZPnoz9+/fjL3/5C1JTU7F48WJs3rwZr7/+umQ/IwovbO4kay0tLR3Xt
0lNTcXs2bOx\nY8cOpKSkYMSIEQCA8vJy7N27F+vXrwcAnDlzBt9++y1qamqQm5sLlUqFoUOHIi0tzeO9BUHArl27\nkJqa2vFeV199tceft9uyZYvHjP7s2bM4f/48ampqUFpaCgC4++67FX0TFwotbO4ka1deeSXq6uo6\nbY+Ojvb4+sUXX0R6errHts2bN3c7JvF1Ri4IAnbv3o0rrrjC658RBRtn7qR4er0eL730UsfB1fr6\nejQ3NyM1NRXr1q1DW1sbGhsbsXXrVo/vU6lU+MUvfoHq6mocPHgQADpWuvTt27fj0rIAkJGR4XGj\n8C+//BKA618T77zzDgDgo48+wg8//CDa5yRyx+ZOsuYtWbuvegGAuXPnIj4+HuPGjcOYMWMwf/58\nOJ1O5OTkYNSoUYiPj8fMmTMxYcKETu/1s5/9DK+++iqmTp2KpKQkzJgxAwCQlZWF0tLSjgOqJpMJ\nn332GcaOHYuEhAS88sorAICnnnoK1dXVSExMRGlpacd4h0hsvOQvEZECMbkTESkQmzsRkQKxuRMR\nKRCbOxGRArG5ExEpEJs7EZECsbkTESkQmzsRkQL9P9dUJsPehf/5AAAAAElFTkSuQmCC\n", "text": [ "" ] } ], "prompt_number": 31 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now cross-validation properly detects overfitting, by reporting a low average $R^2$ score and a plot that looks like noise. Of course, it doesn't help us actually *discover* the fact that columns 5 and 10 determine Y (this task is probably hopeless without more data) -- it just lets us know when our fitting approach isn't generalizing to new data." ] } ], "metadata": {} } ] }