{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Feature Selection\n", "\n", "## Subset Selection\n", "\n", "
Probably the most common approach to explicit feature selection is to choose a subset of $k$ features from the total set of $p$ features. The idea scenario would be to consider every possible subset of size $k$ and choose the one that has the best out-of-sample error. Although ideal in theory, in practice, there are $\\binom{p}{k} = \\frac{p!}{k!(p-k)!}$ possible subsets of size $k$. Testing each one is often cost prohibitive when $p$ gets large.
\n",
"\n",
"Forward Stepwise Selection
\n",
"Forward stepwise selection uses a greedy procedure to learn a good subset of $k$ features (though not guaranteed to be the \"best\" subset). The procedure goes as follows:
\n",
"
\n", "\n", "\n", "
" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAY4AAAEWCAYAAABxMXBSAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/d3fzzAAAACXBIWXMAAAsTAAALEwEAmpwYAABU80lEQVR4nO3dd3gU1frA8e+bQkkoIp0ECFKUXhWRYhAFFFH0CkgxxXbVq9j1er2Coj9FxXrtomIBBGyoiKJoQFCRIiC9SOhdeoBA8v7+mEnYJLubBLLZlPfzPPtkd+bMzLvLMu+ec2bOEVXFGGOMyauQYAdgjDGmeLHEYYwxJl8scRhjjMkXSxzGGGPyxRKHMcaYfLHEYYwxJl8scZhCIyLTRCQ+2HEEm4gkiciNwY6jKBKRWBHZHID9dhWRVQW939LKEkcJJSJdROQXEdkvIn+LyBwROdddlyAisws7JlW9VFXfL8h9uifhoyJyyOPRqSCPUZhE5FEROZ7t/TxQAPv8qKBizOMxbxCRlSJyUER2iMhUEalYiMdXEWmU8VpVf1bVswvr+CVdWLADMAVPRCoBXwO3ApOAMkBX4Fgw4wqg21V1zKluLCJhqnqiIANy9yuAqGp6PjedqKpDCzqeU5Xfz0dELgSeBHqr6h8icibQN2ABmkJnNY6SqQmAqk5Q1TRVPaKq01V1iYg0Bd4AOrm/ZvcBiEhZERktIhvdX4hviEh5d12siGwWkf+IyG4RSRaRIe66BiKyT0RC3NdjRGRnRiAi8pGI3OU+z2yiEZFGIjLTrRHtFpGJHtucIyLfuzWlVSIyIL8fgIiEiMh/RWSDiOwUkQ9EpLK7Lsb9RXqDiGwEfhSR90XkXnd9lLv+No9Y/xZHFRH5WkR2iche93m0x3GTROT/RGQOkAKcJSKXuL++94vIK4Dk9/24+75eRFa4x/1OROp7rHtJRDaJyAERWSAiXd3lvYH/AAPdf+/F7vJkEbnYY/vMWom3zye342dzLvCrqv4BoKp/q+r7qnrQ3Y/P75qX91xHRD51P+/1IjLMY12o+51cJ07NZoGI1BWRWW6Rxe57HijZmsBEpKn7b7VPRJaJyBUe68aKyKvi1JIOishcEWmY13+n0sASR8m0GkhzT4aXikiVjBWqugK4Bec/dgVVPcNd9TROwmkDNAKigOEe+6wFVHOXxwNvicjZqroeOAC0dct1BQ6Jk6AAugEzvcT4ODAdqAJEA/8DEJFI4HtgPFADGAS8JiLN8/kZJLiP7sBZQAXglWxlLgSaAr3cGGM9lv/l/s14Dz+rMz5PCPAeUB+oBxzxst/rgJuBisB+4FPgvzif3zqgcz7fCyLSDycBXA1UB34GJngUmYfzb3cmzmc3WUTKqeq3OL/+J7r/3q3zcdjMzycPx/c0193mMRHpLCJls63P7buW8Z5DgK+AxW6ZHsBdItLLLXIPzvfjMqAScD2Qoqrd3PWt3fc8Mdt+w939Tsf5jt0BjBMRz6asQcBjON/PtcD/+XivpZOq2qMEPnD+w48FNgMngC+Bmu66BGC2R1kBDgMNPZZ1Ata7z2PdfUR6rJ8EPOI+/xDnP3EtYBXwDE5yagDsA0LccknAje7zD4C3gOhscQ/EOUl7LnsTGOHjfSbh/LLf5z4WustnALd5lDsbOI7TPBsDKHCWx/qGGbHi1Mj+CWx2170P3OPj+G2AvdniGenxOg74LdtnvTnjc/Cyv0eBVI/3sw+oA0wDbvAoF+K+7/o+9rMX58SZsc+Psq1PBi7OdtyP3OfePp/8Hv9SnJPzPuAQ8DwQSt6+axmfe0dgY7b9PgS85z5fBVzp4/gKNPJ47bnfrsB23O+lu2wC8Kj7fCwwxmPdZcDKYP5/LmoPq3GUUKq6QlUTVDUaaIFz8nnRR/HqQASwwK267wO+dZdn2Kuqhz1eb3D3CSd/rXcDZuGcPC90Hz+r9zb+B3BOIr+7TQXXu8vrAx0z4nBjGYKTlHwZpqpnuI927rI6boye8YYBNT2Wbcp4oqrrcE5wbXBOLF8DW91foRe67xERiRCRN90msAPu+z1DREK97deNw/M4mm29N5M83s8ZqroV53N5yeMz+Rvn84ty47rXbUba766vjFPDOR2ecfo9fnaqOk1V++LUgK7E+bFyI3n7rnkes06278J/OPlvWBenBpdfdYBN2b6XG7K9l+0ez1NwaqzGZZ3jpYCqrhSRsTi/osH5NeZpN06TS3NV3eJjN1VEJNIjedQDlrrPZwLP4vySngnMxvnVfhTvzVSo6nbgJnCuAAN+cNumNwEzVfWSfL3JnDJOthnq4dSaduA0jUHOz2EmcA1QRlW3iMhMnBpDFWCRW+ZenNpLR1XdLiJtgD/I2m/hud9tOCc4ILPDvC75twn4P1Udl32F25/xIE5TzjJVTReRvR4xeRsC+zDOCTyDt8TsuZ3P4/vjnpxniMiPOD9g3ib375rnMderamM/6xty8nuYV1uBuiIS4pE86uE08Zo8sBpHCSRO5/K94nbaikhdnDbb39wiO4BoESkDmf+53wZeEJEa7jZRHm3JGR4TkTLuiepyYLK7/Rqck8FQYJaqHnCP8Q98JA4R6S8nO5X34pyk0nB+6TcRketEJNx9nOvRZ5JXE4C7xem8r8DJdn5/VwfNBG7HqUWAU3O6A6dZL81dVtF9r/vEuVpoRC5xTAWai8jVIhIGDMN/7cmXN4CHMvp6RKSyiPT3iOkEsAsIE5HhOG3+GXYAMW6fQYZFwLXu59sBJ2Ge6vGzEJErReRacS4kEBE5D6fW9ls+vmsAvwMHRORBESkvTmd4C3EvKwfGAI+LSGP3OK1EpKrHez7Lx3uZi5M4H3DffyzOVV8f5/IZGJcljpLpIE778FwROYyTMJbi/FoG5yqZZcB2EdntLnsQpxPwN7cJ5gecX9YZtuOc4LcC44BbVHWlx/qZwB5V3ejxWnB+jXtzrhvfIZz+lztVdb06V970BK51j7UdpzM1ewdrbt7F6XuZBazHqf3ckcs2M3FOwhmJYzbOr/JZHmVeBMrj1NJ+w2lm8UlVdwP9gVHAHqAxMCfvbyNzP5/jfA4fu/8+S3H6EQC+w+mDWI3T5HKUrM1Mk92/e0Rkofv8EZxf63txOoHHn8bxs9uLU5tcg3PhxEfAsx61ldy+axnHTMM5obfB+TfcjZMsKrtFnsfpa5vuHucdnH8bcPps3nebuLJclaeqqcAVbvy7gdeAuGzfZ+OHuJ0/xvjk/iL7yO0vMcaUclbjMMYYky+WOIwxxuSLNVUZY4zJF6txGGOMyZdScR9HtWrVNCYmJthhGGNMsbJgwYLdqprj5sxSkThiYmKYP39+sMMwxphiRUQ2eFtuTVXGGGPyxRKHMcaYfLHEYYwxJl9KRR+HMcePH2fz5s0cPXo02KEYU+SUK1eO6OhowsPD81TeEocpFTZv3kzFihWJiYnBGaDWGAPOnEx79uxh8+bNNGjQIE/blIqmqq1btwY7BBNkR48epWrVqpY0jMlGRKhatWq+auOlInFs27Yt2CGYIsCShjHe5ff/RqlIHMYYYwpOqUkcIoKI8OijjwY7FFNKVaiQc/bRRx99lNGjR5/S/iZPnkzTpk3p3r17luXp6ekMGzaMFi1a0LJlS84991zWr1/vd18vvvgiKSkpXtfFxMSwe/dur+uKs+uvv54aNWrQokWLfG23aNEivvnmG5/ri8rn5e37VlBKTeL45ZdfUFVLHKbEeOedd3jttdf46aefsiyfOHEiW7duZcmSJfz55598/vnnnHHGGX735S9xlFQJCQl8+63febi8yi1xnI60tDS/rz2dOOFvMsvACmjiEJHeIrJKRNaKyL99lIkVkUUissyd4zlj+d3usqUiMkFEyrnL24jIb+42891pKXM1duzYAnlPpvT49Vd46innbzBNmDCBli1b0qJFCx588EEARo4cyezZs7nlllu4//77s5Tftm0btWvXJiTE+e8dHR1NlSpVAJg+fTqdOnWiXbt29O/fn0OHDvHyyy+zdetWunfvnqP24unIkSP07t2bt99+O0DvtHB169aNM88802+ZyZMn06JFC1q3bk23bt1ITU1l+PDhTJw4kTZt2jBx4kT27NlDz549adu2Lf/85z/xNeK4t88enBrKyJEj6dKlC5MnT87x2lNCQgL33HMP3bt358EHH8xRY23RogXJyck5jv3ss89y7rnn0qpVK0aMyG2249wF7HJcEQkFXgUuATYD80TkS1Vd7lHmDJxpG3ur6kbPOYhx5mZupqpHRGQSzlSiY4FngMdUdZqIXOa+jvUXS/ny5fn444954YUXiIiIKOB3aoqbu+6CRYv8l9m/H5YsgfR0CAmBVq2gcmXf5du0gRdfLLgYM2zdupUHH3yQBQsWUKVKFXr27MkXX3zB8OHD+fHHHxk9ejQdOnTIss2AAQPo0qULP//8Mz169GDo0KG0bduW3bt388QTT/DDDz8QGRnJ008/zfPPP8/w4cN5/vnn+emnn6hWrZrXOA4dOsS1115LXFwccXFxBfsm8/IPkl8F9A8ycuRIvvvuO6Kioti3bx9lypRh5MiRzJ8/n1deeQWAYcOG0aVLF4YPH87UqVN56623cuzH32cPzn0Us2fPBuDf//53ltfZrV69mh9++IHQ0NA8taBMnz6dNWvW8Pvvv6OqXHHFFcyaNYtu3bqd4qcS2BrHecBaVf3LneP3Y+DKbGUGA59lzFOtqjs91oUB5UUkDGfe54xrahWo5D6v7LHcp7p163LgwAG++OKLU30vppTZv99JGuD83b8/OHHMmzeP2NhYqlevTlhYGEOGDGHWrFl+t4mOjmbVqlU89dRThISE0KNHD2bMmMFvv/3G8uXL6dy5M23atOH9999nwwavY9jlcOWVV5KYmFjwSaOI69y5MwkJCbz99ts+m41mzZrF0KFDAejTp09m7c5Tbp/9wIEDs5TP/tpT//79CQ0NzfN7mD59OtOnT6dt27a0a9eOlStXsmbNmjxv700gbwCMAjZ5vN4MdMxWpgkQLiJJQEXgJVX9QFW3iMhoYCNwBJiuqtPdbe4CvnPXhwAXeDu4iNwM3AxQr149YmJieO+99xg8eHCBvDlTfOXlh+ivv0KPHpCaCmXKwLhx0KlTwEPL4VQnWitbtiyXXnopl156KTVr1uSLL76gZ8+eXHLJJUyYMCHf++vcuTPTpk1j8ODBBX9ZcyCqagXkjTfeYO7cuUydOpU2bdqwyEfNKLfPRFX9fvaRkZF+X/taFxYWRnrGLxzwei+GqvLQQw/xz3/+02+M+RHIGoe3TzL7/4IwoD3QB+gFPCIiTUSkCk7tpAFQB4gUkaHuNrcCd6tqXeBu4B1vB1fVt1S1g6p2qF69OgkJCcyYMYONGzee/jszJV6nTjBjBjz+uPM3GEkDoGPHjsycOZPdu3eTlpbGhAkTuPDCC/1us3DhwsybXtPT01myZAn169fn/PPPZ86cOaxduxaAlJQUVq9eDUDFihU5ePCgz32OHDmSqlWrcttttxXQOyse1q1bR8eOHRk5ciTVqlVj06ZNOT6rbt26MW7cOACmTZvG3r17c+zH32d/OmJiYli4cCHg/Lt7u3quV69evPvuu5l9Klu2bGHnzp05yuVHIBPHZqCux+tocjYrbQa+VdXDqrobmAW0Bi4G1qvqLlU9DnzGyZpFvPsaYDJOk1iu4uLiUFU++OCDU3ozpvTp1AkeeqjgkkZKSgrR0dGZj+effx6AJ554IstyT7Vr1+app56ie/futG7dmnbt2nHlldlbfLPauXMnffv2pUWLFrRq1YqwsDBuv/12qlevztixYxk0aBCtWrXi/PPPZ+XKlQDcfPPNXHrppX47x1988UWOHj3KAw88cJqfRNEwaNAgOnXqxKpVq4iOjuadd3L+Br3//vszL0zo1q0brVu3pnv37ixfvjyzc3zEiBHMmjWLdu3aMX36dOrVq5djP/4++9Pxj3/8g7///ps2bdrw+uuv06RJkxxlevbsyeDBg+nUqRMtW7bkmmuu8fsjIS8CNue42zexGugBbAHmAYNVdZlHmabAKzi1jTLA7zid4JHAu8C5OE1VY4H5qvo/EVkB3KqqSSLSA3hGVdv7i6VDhw46f/58unfvzqZNm1izZo3dRVzKrFixgqZNmwY7DGOKLG//R0Rkgap2yF42YDUOVT0B3A58B6wAJqnqMhG5RURuccusAL4FluAkjTGqulRV5wKfAAuBP904My5VuAl4TkQWA0/i9mPkRWJiIuvWrWPOnDkF8h6NMaY0CliNoyjJqHEcPnyYWrVqMWDAAK/VUlNyWY3DGP+KRI2jKIqMjGTAgAFMmjSJw4cPBzscU8hKw48kY05Ffv9vlKrEAc6dl4cOHeLTTz8NdiimEJUrV449e/ZY8jAmm4z5OMqVK5fnbUpVUxU4H1Ljxo2pV68eP/74Y5AjM4XFZgA0xjdfMwD6aqoqdTMAiggJCQk88sgjJCcnExMTE+yQTCEIDw/P8+xmxhj/Sl1TFTj3dIgI77//frBDMcaYYqdUJo569erRo0cP3n///Sy36xtjjMldqUwc4HSSr1+/PtcB44wxxmRVahPHVVddRaVKlWyeDmOMyadSmzgiIiIYOHAgkydPPu1xW4wxpjQptYkDnOaqlJQUPvnkk2CHYowxxUapThydOnWiSZMm1lxljDH5UCoSx5Yt3ueNzrinY9asWaxbt67wAzPGmGKoVCSO7dud2dy8JY+4uDhCQkLsng5jjMmjUpE4wJkCNCkp5/KoqCguueQSu6fDGGPyqNQkjpAQiI31vi4hIYGNGzfy008/FWpMxhhTHJWKxBEWBg0b+p4CtF+/flSuXNk6yY0xJg9KReKoVQtWroTly72vL1euHIMGDeLTTz9l//79hRucMcYUM6UicVSt6tQ6/PV/JyYmcuTIESZPnlx4gRljTDFUaubjqFNnPgsWwMaNEBqas4yq0rx5c6pUqWJzkhtjDDZ1LPHxsHUr/PCD9/UZ93T88ssvrF69unCDM8aYYqTUJI7LL4cqVcBf//d1111HSEiIdZIbY4wfpSZxlC0LgwbBF1+Ar/7v2rVr07t3bz744APS0tIKNT5jjCkuSk3iAKe56uhRmDTJd5nExES2bNnCjBkzCi8wY4wpRkpV4jj3XDjnHP9XV/Xt25cqVarw3nvvFV5gxhhTjJSqxCECCQkwZw6sXeu9TNmyZRk8eDCff/45+/btK8zwjDGmWAho4hCR3iKySkTWisi/fZSJFZFFIrJMRGZ6LL/bXbZURCaISDl3+US3/CIRSRaRRfmJaehQZ/iRDz7wXSYxMZFjx47x8ccf52fXxhhTKgTsPg4RCQVWA5cAm4F5wCBVXe5R5gzgF6C3qm4UkRqqulNEooDZQDNVPSIik4BvVHVstmM8B+xX1ZH+YunQoYPOnz8/83WvXrBqFfz1l5NEslNVWrVqRWRkJL/99tupvH1jjCn2gnEfx3nAWlX9S1VTgY+BK7OVGQx8pqobAVR1p8e6MKC8iIQBEcBWzw1FRIABwIT8BhYfDxs2wKxZ3teLCImJicydO5cVK1bkd/fGGFOiBTJxRAGbPF5vdpd5agJUEZEkEVkgInEAqroFGA1sBLbh1CqmZ9u2K7BDVdd4O7iI3Cwi80Vk/q5du7Ks69cPKlb0f0/HkCFDCA0NtXs6jDEmm0AmDvGyLHu7WBjQHugD9AIeEZEmIlIFp3bSAKgDRIrI0GzbDsJPbUNV31LVDqraoXr16lnWRUTAgAHwySdw6JD37WvWrEmfPn348MMPOXHihM83aYwxpU0gE8dmoK7H62iyNTe5Zb5V1cOquhuYBbQGLgbWq+ouVT0OfAZckLGR23x1NTDxVIOLj4fDh+Gzz3yXSUhIYNu2bUyfnr2yY4wxpVcgE8c8oLGINBCRMsC1wJfZykwBuopImIhEAB2BFThNVOeLSITbl9HDXZ7hYmClqm4+1eC6dIGzzvJ/T0efPn2oVq2aNVcZY4yHgCUOVT0B3A58h3PSn6Sqy0TkFhG5xS2zAvgWWAL8DoxR1aWqOhf4BFgI/OnG+ZbH7q/lFDrFPYk4tY6ffnJGzPWmTJkyDBkyhClTpvD333+fzuGMMabEKDXDqntejpshORkaNIAnnoCHH/a+7aJFi2jbti2vvPIK//rXvwIbqDHGFCGlflh1b2Ji4MILneYqX/mzTZs2tGnTxpqrjDHGVSoSh79aVXw8rFkD/u7zS0hIYP78+SxdujQA0RljTPFSKhLHsmXLSE9P97rummucy3P9VSgGDx5MWFiY1TqMMYZSkjiOHTvGzz//7HVdxYpw9dUwcSIcOeJ9++rVq9O3b18+/PBDjh8/HsBIjTGm6CsViSMkJIQP/IxqGB/vTO70ZfaLhT0kJCSwc+dOvv322wBEaIwxxUepuKqqWrVqmpqayvbt24mIiMixPi3N6Shv2RK++cb7Po4fP050dDRdunTh008/DWzAxhhTBBTIVVUiEiIilQourMJRtWpVDh48yJQpU7yuDw2FuDj47jvYts37PsLDwxk6dChfffUVu3fvDmC0xhhTtOWaOERkvIhUEpFIYDmwSkTuD3xoBadixYo899xzdO3a1WeZuDhIT4dx43zvJyEhgePHjzN+/PgARGmMMcVDrk1VIrJIVduIyBCcAQkfBBaoaqvCCLAg+LoBMLtOnZxBD5csce4s97Ev0tLS+OOPPwo4SmOMKVpOp6kqXETCgX7AFHfQwWLXMaKqTJ482WdzFTid5EuXgr+ckJCQwKJFi1i0aFHBB2mMMcVAXhLHm0AyEAnMEpH6wIFABhUIIsJzzz3HiBEjfJYZOBDKlvV/T8egQYMoU6aM3dNhjCm1ck0cqvqyqkap6mXq2AB0L4TYCtx1113H4sWLWbx4sdf1VarAFVfA+PGQmup9H1WrVuWKK65g3LhxpPoqZIwxJVheOsfvdDvHRUTeEZGFwEWFEFuBGzhwIOHh4Xz44Yc+y8THw549vi/LBUhMTGT37t1846+QMcaUUHlpqrpeVQ8APYHqQCIwKqBRBUi1atXo06cP48aN8zmrX69eULOm/3k6evbsSa1atXjvvfcCFKkxxhRdeUkcGdcXXQa8p6qL8T4tbLEQFxdHpUqV2LRpk9f1YWEwdChMnQq+btcICwvjuuuuY+rUqezYsSOA0RpjTNGTl8SxQESm4ySO70SkIuB9xMBi4Morr2TlypU0aNDAZ5n4eDh+HCb4mSoqISGBtLQ0xvm78cMYY0qgvNzHEQK0Af5S1X0iUhWIUtUlhRBfgfB2H8fRo0cBKFeunNdt2rWDkBDwd/tHx44dSUlJYcmSJYivGz+MMaaYOuX7OFQ1HYgG/isio4ELilPS8CY5OZlatWoxwU+VIj4eFixw7uvwJTExkaVLl9rNgMaYUiUvV1WNAu7EGW5kOTBMRJ4KdGCBVL9+fWrUqOF3xNzBg53+Dn+d5AMHDqRs2bLWSW6MKVXy0sdxGXCJqr6rqu8CvYE+gQ0rsESEuLg4kpKS2LBhg9cy1avDZZfBRx+BjwuwqFKlCv369WP8+PEcO3YsgBEbY0zRkdfRcc/weF45AHEUuqFDhwLw0Ucf+SwTHw/bt8P33/veT2JiIn///TdfffVVQYdojDFFUl4Sx1PAHyIyVkTeBxYATwY2rMCLiYmhW7dufPDBBz7nJO/TB848039z1cUXX0xUVJQNQWKMKTXCciugqhNEJAk4F+f+jQeB+gGOq1A89dRThIaG+lxftqzT1/H227BvH5xxRs4yoaGhxMXF8fTTT7Nt2zZq164dsHiNMaYoyFNTlapuU9UvVXWKqm4HJgc4rkJxwQUX0LFjR7+X0sbHw7FjMGmS7/3Ex8eTnp7ut9nLGGNKilOdc7zE3LSwfPly7rzzTp8DFrZvD82a+W+uOvvss+nUqRNjx4712exljDElxakmjjydHUWkt4isEpG1IvJvH2ViRWSRiCwTkZkey+92ly0VkQkiUs5j3R3ufpeJyDOn+B4A2LBhAy+//LLPAQtFnFrHL7/AmjW+95OYmMjy5cuZN2/e6YRjjDFFns/EISJficiXXh5fAVVz27GIhAKvApcCzYBBItIsW5kzgNeAK1S1OdDfXR4FDAM6qGoLIBS41l3XHbgSaOVuMzq/b9rTJZdcQs2aNf3e0zF0qHMXub9ax4ABAyhfvrx1khtjSjx/NY7RwHNeHqNx7u3IzXnAWlX9S1VTgY9xTvieBgOfqepGAFXd6bEuDCgvImFABLDVXX4rMEpVj3nZJt/CwsIYMmQIX3/9NXv27PFapk4duOQS+PBDZ15ybypXrszVV1/NhAkTMoczMcaYkshn4lDVmf4eedh3FOA5BO1md5mnJkAVEUkSkQUiEuceewtOgtoIbAP2q+p0j226ishcEZkpIud6O7iI3Cwi80Vk/q5du/wGGhcXx/Hjx5k4caLPMvHxsHEjJCX53k9CQgL79u3zOz2tMcYUd3kZcuRPEVmS7fGziLzgDnjoc1Mvy7L3jYQB7XHuRO8FPCIiTUSkCk7tpAFQB4gUkaEe21QBzgfuByaJl8uiVPUtVe2gqh2qV6/u9z22bt2a2NhYvzWFfv2gUiX/zVUXXXQRdevWteYqY0yJlut9HMA0IA0Y776+Ficp7AfGAn19bLcZqOvxOpqTzU2eZXar6mHgsIjMAlq769ar6i4AEfkMuAD4yN3mM3UuX/pdRNKBaoD/akUufvrpJ7/ry5d35iQfPx5efRUqVMhZJiQkhPj4eJ588km2bNlCVFT2CpYxxhR/ebmqqrOqPqSqf7qPh4ELVfVpIMbPdvOAxiLSQETK4CScL7OVmYLT7BQmIhFAR2AFThPV+SIS4dYmerjLAb7AnbpWRJoAZQAfUy7lj6qyZcsWn+vj4+HwYfj0U9/7yLinw9/0tMYYU5zlJXFUEJGOGS9E5Dwg4/e2j+H/QFVPALcD3+Gc9Cep6jIRuUVEbnHLrAC+BZYAvwNjVHWpqs4FPgEWAn+6cb7l7vpd4CwRWYrT4R6vBXTzxA033EDnzp1J99EDfsEF0KiR/+aqRo0a0bVrV9577z27p8MYUzKpqt8HzlAjfwLrgWSck/y5QCQwILfti8Kjffv2mhfjxo1TQJOSknyWGTlSFVSTk33v55133lFAf/nllzwd1xhjiiJgvno5p+ZlIqd5qtoSZxbANqrayl12WFX9DMRR/PTr148KFSr4vafjuuucv36K0L9/fyIiImyeDmNMiZSXq6oqi8jzwAzgBxF5TkRKxNDq2UVERNC/f38mT55MSkqK1zIxMRAb6yQOXy1RFStW5JprrmHixIk+92OMMcVVXvo43gUOAgPcxwGgxP6UjouL4+DBg3z5ZfZ+/JPi42HtWmcYEl8SEhI4cOAAX3zxRcEHaYwxQSSaSweuiCxS1Ta5LSvKOnTooPPnz89T2fT0dL7++mt69epF2bJlvZY5eBBq1YIhQ+Ctt7wWIT09nYYNG9KoUSO+9zcTlDHGFFEiskBVO2RfnpcaxxER6eKxo87AkYIMrigJCQnhiiuu8Jk0ACpWhGuugYkT4YiPTyLjno4ZM2awcePGAEVrjDGFLy+J4xbgVRFJFpFk4BXgnwGNKsjS09MZOXIk7/u57jY+Hg4cAH+ji8THx6OqfjvbjTGmuMm1qSqzoEglAFU9ICJ3qeqLgQysIOWnqSpDx44dOXr0KIsXL/a6Pj0dGjRw5uqYNs33frp3786mTZtYs2aN3wmjjDGmqDmdpirASRiqesB9eU+BRVZExcXFsWTJEp+JIyTEuTR3+nTYmn0gFQ8JCQmsW7eOG264IUCRGmNM4Sr1MwD6MnDgQMLDw/0OHRIX59Q8/M0Ye80111ChQgW7p8MYU2IEdAbA4qxatWr06dOHcePGceKE95FVmjSBTp2cIUh8tfhFRkbSv39/AP76669AhWuMMYXG3wyAB0XkgJfHQZyhzku8xMREzj//fP7++2+fZeLjYflyWLAg57pHH30UEcmsbTRs2BAR4eGHHw5UyMYYE3B57hwvzk6lczyv9u1z7um46Sb43/98lxMRBgwYwKRJk2jQoAEvvfQSffv6GpHeGGOC77Q7x0uzv/76i0OHDnldd8YZziRPEyZAaqr//UycOJEffviBcuXKccUVV3D55Zezbt26Ao/XGGMCyRJHLpYuXUrDhg2ZPHmyzzLx8bBnD0yd6ns/I0aMAKBHjx4sWrSIZ599lpkzZ9K8eXOGDx9uY1oZY4oNa6rKhapy9tlnExUV5XOWwBMnoG5d6NgR8jM01ZYtW7j//vuZMGECMTExvPjii1xxxRV2v4cxpkiwpqpTJCLExcWRlJTEhg0bvJYJC4OhQ50ax658TGAbFRXF+PHj+emnn4iIiKBfv3706dOHtWvXFlD0xhhT8E7lqqoDInLA13Yl0dChQwH4yM8NG/HxTs1j/HifRXyKjY1l0aJFPPfcc8yePZvmzZvzyCOPWPOVMaZIysvouCOB7cCHODf+DQEqquozgQ+vYBTEVVWxsbHs2bOHP//802eZ9u2d+zkWLjz142zbto3777+fcePGUb9+fV544QX69etnzVfGmEJ3Ok1VvVT1NVU96A478jrwj4IPsWh7/fXXffZxZIiPhz/+AD+5JVe1a9fmo48+IikpiYoVK3L11Vdz6aWXsnr16lPfqTHGFKC8JI40ERkiIqEiEiIiQ4C0QAdW1DRt2pRq1ar5LTNokNPf4WdQ3Ty78MILWbhwIS+88AK//vorLVu25OGHH+bw4cOnv3NjjDkNeUkcg3Fm/tvhPvq7y0qdH3/8kcsvv5xUHzdsVK8Ol1/ujF3lY5SSfAkPD+euu+5i1apVDBw4kCeffJKmTZvy6aefUhquhjPGFE25Jg5VTVbVK1W1mqpWV9V+qppcCLEVOceOHWPq1Kl88803PsvEx8OOHc6ouQWlVq1afPDBB8yaNYszzjiDa665hl69erFq1aqCO4gxxuRRrolDRJqIyAwRWeq+biUi/w18aEXPJZdcQs2aNf1OzHTZZVC1asE0V2XXtWtXFi5cyEsvvcTcuXNp2bIlDz30kDVfGWMKVV6aqt4GHgKOA6jqEuDaQAZVVIWFhTFkyBC+/vpr9uzZ47VMmTIweLAzM+DevYGJYdiwYaxevZrBgwczatQozjnnHCZPnmzNV8aYQpGXxBGhqr9nW1YALfjFU1xcHMePH2fixIk+y8THw7FjzpzkgVKzZk3Gjh3L7NmzqVatGgMGDKBnz56sXLkycAc1xhjyljh2i0hD3Dk4ROQaYFtedi4ivUVklYisFZF/+ygTKyKLRGSZiMz0WH63u2ypiEwQkXLu8kdFZIu7zSIRuSwvsRSU1q1bM3ToUGrVquWzTLt20Lx5YJqrsuvcuTPz5s3jf//7H/PmzaNVq1Y8+OCDPgdlNMaY06aqfh/AWcAPQAqwBZgN1M/DdqHAOnf7MsBioFm2MmcAy4F67usa7t8oYD1Q3n09CUhwnz8K3Jfb8T0f7du318L2zDOqoLpyZeEdc8eOHZqYmKiARkVF6cSJEzU9Pb3wAjDGlCjAfPVyTs1LjUNV9WKgOnCOqnYhbzWV84C1qvqXqqYCHwNXZiszGPhMVTe6B9rpsS4MKC8iYUAE4Gdm78K3f/9+Fnibvck1dKgzL7mffvQCV6NGDd59911++eUXatSowcCBA7n44otZsWJFZplHH3208AIyxpRIeUkAnwKo6mFVPegu+yQP20UBmzxeb3aXeWoCVBGRJBFZICJx7rG2AKOBjTjNYvtV1fMC19tFZImIvCsiVbwdXERuFpH5IjJ/V35GHsyjoUOHcvXVV5Oenu51fe3a0KsXfPihMy95YerUqRPz5s3j1VdfZeHChbRq1Yr777+fgwcP8thjjxVuMMaYEsffIIfniMg/gMoicrXHIwEol4d9extcKftlP2FAe6AP0At4xL38twpO7aQBzjS1kSIy1N3mdaAh0AYnqTzn7eCq+paqdlDVDtWrV89DuPkzaNAgNm7cyM8//+yzTHw8bNoEuYxUEhChoaHcdtttrF69mvj4eEaPHs0555wDOPejGGPMqfJX4zgbuBynH6Kvx6MdcFMe9r0ZqOvxOpqczU2bgW/d2sxuYBbQGrgYWK+qu1T1OPAZcAGAqu5Q1TRVTce5VPi8PMRS4Pr160eFChX83tNx5ZVQuXLhdJL7Ur16daKjowHYutX5+MuVK4eI8OCDDwYvMGNM8eWt40OzdmB3yq2Mj+3CgL9wag0ZnePNs5VpCsxwy0YAS4EWQEdgmbtMgPeBO9xtantsfzfwcW6xBKpzPDExUStWrKiHDx/2Webmm1UjIlQPHAhICPmSnp6ugPbs2VMBjYyM1DvvvFPXr18f7NCMMUUQp9E5/oeI/EtEXnP7FN4VkXfzkJBOALcD3wErgEmqukxEbhGRW9wyK4BvgSXA78AYVV2qqnNx+lEWAn/i1Izecnf9jIj8KSJLgO5u8giK6667joMHD/odNTc+HlJS4JO89AoFWMbQ7N999x2LFy/mH//4B6+++ioNGzbk2muv5XSHnjfGlBLesolmrRVMBh7HubQ2HpgOvJTbdkXpEagaR1pamq5YscJvmfR01eho1QYNVH/5JSBh5MuIESOyvN60aZPef//9WqlSJQU0NjZWv/76a01LSwtOgMaYIgMfNY68TOT0h6q2FZElqtpKRMKB71T1ooBmtAJUEBM5napff4Vu3ZzRcsPDncEPY2ODEopfBw4cYMyYMbz44ots2rSJpk2bcu+99zJ06FDKli0b7PCMMUFwOhM5HXf/7hORFkBlIKYAYyvWUlNTGThwIK+++qrX9UlJzqyAAMePQ79+wblENzeVKlXinnvuYd26dXz00UeULVuWG2+8kfr16/Pkk0/y999/BztEY0wRkZfE8ZZ7eewjwJc4d3oXm2ljA61MmTIkJyfz9ttve10fG+sMfBgaCmXLQq1aEBcH557rJJWiJjw8nCFDhrBw4UJ++OEH2rZty8MPP0zdunUZNmwY69evD3aIxpggy8t8HGNUda+qzlTVs1S1hqq+URjBFRdxcXEsXryYxYsX51jXqRPMmAGPP+7cz7F8uTPR065d0L07XHEFFMVxCUWEHj16MG3aNJYsWUL//v154403aNSoEQMHDmTevHnBDtEYEyQ++zhE5B5/G6rq8wGJKAAC3cexe/du6tSpw7Bhwxg9enSetjlyBF56CZ580rnq6p//hBEjoEaNgIV52rZs2cLLL7/MG2+8wYEDB+jWrRv3338/l112GSEheam8GmOKk1Pp46joPjoAt+IMFxIF3AI0C0SQxVW1atXo06cP48aN40Qe54wtXx7+/W9YuxZuuQXefBMaNYKnnnKSSlEUFRXF008/zaZNm3j++edZv349ffv2pXnz5owZM4ajR48GO0RjTCHwmThU9TFVfQyoBrRT1XtV9V6cIUKiCyvA4uLWW28lMTEx3yfPGjXglVdg6VKn6eo//4Gzz3aas4paB3qGSpUqcffdd7Nu3TrGjx9P+fLluemmm6hfvz5PPPGEz0mujDElQ14ux10JtFbVY+7rssBiVT2nEOIrEMG8HDe/kpLg3nth4UJo3x5Gjy6al+96UlV++uknRo8ezbRp04iIiOD666/n7rvv5qyzzgp2eMaYU3Q6l+N+CPzuTqA0ApgLFOJg4cXH8ePHmTp1KgcOHDjlfcTGwrx5ziW7O3c6tZArryyaHegZRISLLrqIb775hj///JMBAwbw5ptv0rhxYwYMGMDvv2edQNKGdjemmPN2V2D2B87Ahne6j7Z52aYoPQprIqdff/1VAX3nnXcKZH8pKapPPaVasaJqaKjqbbep7txZILsOuC1btui///1vrVy5sgLatWtXnTJliqalpanztTPGFHXk985xEamkqgdE5EwfCafY3BFWWE1VqsrZZ59NVFSU3/Gr8mvnTnjsMacDPSLC6Qe5806ng72oO3jwIO+88w4vvPACGzdu5Oyzz2bVqlUkJydTv379YIdnjPHjVJqqxrt/FwDzPR4Zr002IkJcXBxJSUls2LChwPZbowa8+urJDvSHHoJzzoFx44puB3qGihUrctdddxEfHw/AqlWrAIiJiUFEaNOmDWPHjiU5OTmIURpj8iPXzvGSoDA7x5OTk2nQoAFPPPEEDz/8cECOkb0D/bnn4MILA3KoAqeqhISE8NJLLzFz5kxmzpyZeRVWvXr1iI2N5cILLyQ2NpYGDRpkjuhrjCl8vmoc/pqq2vnboaouLKDYAq6wr6qKjY0lNDSUGTNmBOwY6ekwfrzTbLVpk3MH+jPPOJfyFnUiktF3Rnp6OsuWLWPmzJkkJSUxc+ZMdu/eDUB0dDSxsbGZyaRhw4aWSIwpRKeSOPw10qva6Lg+7dixg+rVqxfK3dRHjsCLLzo3DqakODcTjhgBAZgtt8A8+uijPq+sUlWWL1+emUiSkpLImDM+KioqszYSGxtLo0aNLJEYE0D5ThwlSbDu41DVQjuxeXagR0Y6NZFhw4pHB7o/qsrKlSszayNJSUns2LEDgNq1a2dp2mrSpIklEmMK0GklDnc49WZAuYxlqlps7uUIRuL46KOPeOqpp1i4cGGhzmexciU88AB89RXUq+eMhTVoEJSUoaRUldWrV2fWRmbOnMm2bdsAqFWrFhdeeGFmIjnnnHMskRhzGk45cbg3/cXiJI5vgEuB2ap6TQDiDIhgJI5p06Zx2WWX8dlnn3HVVVcV6rHBGYn3vvucDvQOHZw70MuUcTrWY2OdUXtLAlVlzZo1WWokW7duBaBGjRqZSeTCCy+kWbNmWRKJvyYzY8zpJY4/gdbAH6raWkRq4swN3jcwoRa8YCSOEydOEB0dTadOnfj8888L9dgZsnegZ9Q6ypZ1hnovKcnDk6qybt26LIlk8+bNAFSvXp1u3bpl9pG0bNmS0tBUa8ypOp3E8buqniciC4DuwEFgqao2D0yoBS9YfRz33nsv//vf/9i2bRtVq1Yt9ONnOHLEmXlw+vSTyxIS4N13oaS35Kgq69evz2zaSkpKYtOmTZnrO3fuTJs2bWjTpg1t27alefPmlCtXzs8ejSk9TidxvAb8B7gWuBc4BCxS1cRABBoIwUocixcvpk2bNrz66qvcdttthX58T7/+ChddBMeOnZzKtmlT5w70665z7kgvDUaMGMHIkSN9rg8NDaVp06aZySTjEczEb0ywnMrluK8A41X1F49lMUAlVV0SqEADIZij4z788MP069ePc889NyjH9/Trr04fR+fOsHEjvPCC0wdStaozkdRtt0FUVLCjLDwZ95Okp6fz119/sWjRoiyPLVu2ZJaNjo7Okkjatm1LTEyMTWBlSrRTSRx34tQyagMTgQmquiiQQQZKcRpWvTCpwuzZTgL54gtnXvSBA+Huu5070ks6zxsRvdm1axeLFy/OTCR//PEHK1euJN0d56VSpUq0bt06S0Jp3rx5oV5FZ0wgnU5TVX2cBHItzuW4E4CPVXV1IAINhGAnjkWLFrFt2zYuvfTSoMWQm7/+gpdfhnfegUOHoGtXuOsuZ0j30NBgRxcYp3JV1ZEjR1i6dGmWmsnixYs5fPgwAGFhYVmautq2bUvr1q0580yvY4XalV2mSBORbapaJ8fy/FxVIiJtgXeBVqpabE4nwU4cl156KcuXL2fdunWEhYUFLY682L/f6TR/+WVIToYGDZwbCa+/HipVCnZ0RVN6ejrr1q3L0dSVcVkwOONwZe83yWjqsiu7TFHl1spzXEKTlxpHONAbp8bRA5iJ02z1RR4O2ht4CQjFuYR3lJcyscCLQDiwW1UvdJffDdwIKPAnkKiqRz22uw94Fqiuqrv9xRHsxPHJJ5/Qv39/zjnnHB599FH69+9f5NvG09JgyhSnGWv2bKhYEW64wUkiDRoEO7riYceOHVmauhYtWsSqVasym7oqV67M/v37GTx4MNHR0VkedevWpUaNGkX+e2ICoyBrounp6Rw5coRDhw5x+PDhHH+9Lcv4+/777+cvcYjIJcAgoA/wO/Ax8IWqHs5LsCISCqwGLgE2A/OAQaq63KPMGcAvQG9V3SgiNVR1p4hEAbOBZqp6REQmAd+o6lh3u7rAGOAcoH1RTxyqymeffcbw4cNZvnw5559/PnPmzCk2J4X5853xsCZOdO4NufJKpx+kS5eSfzlvQUtJSeHOO+9kzJgxuZYNCwsjKioqR0LxfF2rVi1CS2pbYpAEo/lQVTl69CgpKSkcOXKEunXrsmDBgnyd5H2tS0lJyVcsoaGhpKWlecaWr8TxE86cHJ+eyqRNItIJeFRVe7mvH3KDeMqjzG1AHVX9b7Zto4DfcG48PAB8AbysqtPd9Z8AjwNTgA5FPXFkSEtLY+LEiezevZthw4ahqsyePZsuXboUi6Extmxx5gV58034+2+nA/3uu6F/f+eudJN/GR30qsru3bvZtGkTmzdv9vrYtGkTR48ezbJ9aGgotWvXzpFQPB916tTJtYm0KPS1FIUYIOfozUeOHMk8oaekpGQ+sr/OSxl/2+Q3xsjISCpUqJDlr7dl+VkXERGR5btyyk1Vp0pErsGpSdzovr4O6Kiqt3uUeRGniao5UBF4KWMMLPeqrv8DjgDTVXWIu/wKoIeq3ikiyfhIHCJyM3AzQL169doX5MRKBeX777+nZ8+edOrUiSeeeIKLLioeAw6npDhzor/4ojM2Vp068K9/OZf02u0O+ZPblV2eVJW///7bZ1LJ+Jv9JBQSEkKtWrW8NodlPG/QoAHbt2/PPE5GTNmf57Y+P2WzL2vevDkLFy4kNTWV48ePZz48XxfUc3/r1q5dS5UqVThy5EiORJ1X5cuXp3z58kRERGR5+Fs2Z84cr1Mx3HDDDdx1111ZTvLly5cvlB+bwUgc/YFe2RLHeap6h0eZV4AOOH0n5YFfcZrGdgGfAgOBfcBk4BPgM+AnoKeq7veXODx1qFhR5xfB60vTVdm+fTsbkpM5lprKGZUr06BBAypXrhzs0PJEcWoemzfD3r3OkCY1a0J0NESWkhsKT1dycjIxMTEFtj8F0k6c4NixY5mPox7PMx6eTREliQASEkKICCKS5+dHjxwh5ciRHPs7o3JlzqxaldCQEEJCQ52/ns+9/A0JCeF0T+lJM2cSWwRmZ5OZM71eVRXIS3w2A3U9XkcDW72U2e32mxwWkVk4zVMA61V1F4CIfAZcACwGGgCL3WwbDSwUkfNUdXvA3kmAhIhQp3ZtatWsydZt29i4YQPL3D6QkGLQdCVA1TOdx+HDTgLZsR22bYMzqzgJpMqZnPZ/opKsIJMGOJ91WFgYYWFhREZG+ix3Ii2Nv/76K8uVXxnOPPNMqnlWHT2+i1n+LfOwPLdtdu3axa7dOX/31axRg9q1ayMhIYiIc5LP5bmIFMh3raictIuInF8QOFlVzP4A6vpZ19XXOo8yYcBfOCf6Mjgn/ebZyjQFZrhlI4ClQAugI7DMXSbA+8AdXo6RDFTLLZb27dtrcXD48GFduHChqqoeO3ZMb7zxRv3jjz+CG1Q+7dyp+vjjqrVqqYJqs2aqb72lmpIS7MiMP86pwGJQLRpxjBgxItghqKoqMF+9nFP9XdYzU0QeEJHMWomI1BSRj4Dnc0tTqnoCuB34DlgBTFLVZSJyi4jc4pZZAXwLLMG5cmuMqi5V1bk4TVMLcS7FDQHeyu2YxV1ERARt27YFYNmyZXzyySe0bduW/v37s2zZsiBHlzfVq8N//+vcA/LBB85IvDffDHXrOsu/+sqZrfDXX4MdqTHejRgxItghFImLBPzylk2cREMV4E2cE/dFwJ3ABuBfQIiv7Yrio7jUOLLbu3evDh8+XCtWrKgiooMHD9YDBw4EO6x8SU9XTUpSvfJKpwaS8ShTRvX774MdnclQFH7hFoUYTFb4qHHk5QbAO4EXcNq6zlfVzYFLY4FRVC7HPVV79uzh2WefZfbs2cyaNYuQkBAOHz7stw27KLrvPnj++ZOj84aEQPfu0KeP82jSJLjxGWOy8jVWlc+mKhE5Q0TeBBJx7hz/BJgmIsXjmtESpGrVqowaNSozafz999/ExMRwyy23ZJlboqj7xz+gXDln7KuyZeHaa52O9HvugbPPhsaNnfGxvv/eGf7dGFM0+evjWAiswbncdbqq3gVcBzwhIhMKIziTVcad5unp6QwYMIB3332XRo0aceedd2Zeg1+UderkzDz4+OPO1LbjxsGyZbB+PbzyipM43nwTevZ07gfp1w/eftu58dAYU3T4u3M82lezlIjcpKpvBzSyAlTcm6p8SU5O5oknnmDs2LGULVuWNWvWUKdOjkuui5WUFCepTJ3qPDZudJa3aXOySeu880ruiL3GFCWnPKx6SVBSE0eGtWvX8uWXX3LPPfcA8PnnnxMbG0uVKlWCHNnpUXVqJBlJ5JdfnMEXq1WD3r2dJNKrFxTzt2lMkWWJowQnDk87duygbt26REREcO+993LnnXdSqYSMh753L3z3nZNEpk2DPXucmscFF5ysjTRvbgMvGlNQ8t05boqnmjVrMm/ePGJjYxk+fDgNGjTg6aefzpxoqDirUsXpUP/wQ9ixw6mB/PvfcPCg87dlS4iJcabA/fprp9nLGFPwrMZRgs2fP5/hw4czY8YM1qxZQ7169YIdUsBs2eLUQqZOda7KOnzYuYLL83LfAh7dw5gSz5qqSmHiyLBhwwbq168PwIABA2jUqBG33nordevWzWXL4unYMZg162TfyNq1zvJmzU4mkZAQZ4Kq2Fjnai9jTE6WOEpx4shw7NgxBg0axJQpUxARrrrqKu644w66du1aLOYDOVWrV59MIrNmwfHjJ9eFh8Prr0NcnPPcGHOSJQ5LHJmSk5N57bXXGDNmDHv37uW9994jISEh2GEVioMH4dZbYfz4k3ewA5QvD+efD127OjMbduoEFSoEL05jigJLHJY4ckhJSWH8+PH079+fypUr8/HHH7Nw4UJuu+22Ah/uuyj59Vfo0QNSU51axn//Czt3Ok1XixY50+OGhjr3jnTpcjKZ1KwZ7MiNKVyWOCxx5Orhhx/m6aefRlW54oorGDZsGLGxsSWyGevXXyEpKWcfx8GDzrrZs+Hnn2HuXMiY36dxYyeBZCSTRo3s0l9TslnisMSRJ5s2beL111/nrbfeYs+ePQwaNIjx48cHO6ygSU2FP/5wksjs2c5jzx5nXY0aWWskbdpALlN7G1OsWOKwxJEvR48e5eOPP6Z69er06dOHvXv38uSTT3Lrrbdy1llnBTu8oElPh1WrTtZIZs92xtoCiIx0ai8ZyaRjR2eZMcWVJQ5LHKfl66+/5qqrriItLY0+ffpwxx13cMkll5TIZqz82rLlZG3k559hyRKn4z0sDNq1O9m81aWLM9GVMcWFJQ5LHKdty5YtvPnmm7z55pvs3LmTc845h99//52KFSsGO7QiZf9+5672jGQyd+7JYeLPPvtk01ZEBKxZ49ykaPeSmKLIEocljgJz7NgxJk+ezLx583jppZcAGDt2LJ07d6Zx48ZBjq7oOXYMFiw42bQ1Z44z7laGkBC46irnSq+mTZ0bFatXt453E3yWOCxxBMyBAweoXbs2KSkp9O7dm2HDhtGrV6/M+UNMVunpzuRVL7988l6SMmWcjvgMVaueTCLNmp18HhVlCcUElqrzYycpCS69NHqL6ubo7GUscZgCsX37dt566y3eeOMNtm3bRqNGjTJrISYnz3tJypSBH36AevVg+XLnsWLFyed//31yu4oVsyaSjOcxMU7NxRR/vi4Vzy411RmT7fBhOHTo5HNfy/JTJj094ygdUJ2f46eKJQ5ToFJTU/nss8949dVXGT9+PHXr1mXJkiWEh4fTtGnTYIdXpOTlBKEKu3Z5Tyiekz6WLw/nnJMzoTRsaEOp5EdeT9oZ0tKcX+dHjzp//T3yUmb9evj8c2e/oaHQoYPzw8LbCf7Eify9t4gI5yq/ChWcv56P7Mvmz3cGC1W1xBHsMEqtyy+/nKlTp3LJJZdwxx13cNlllxFqU/idtr17TyYSz4SSMWsiOEmjSZOsCaVZM2dZ2bJOmfyeLAPBM4bzz3dOnKmpzrhiqamF83zbNieGtDSn9taihfMZ+Tvhp6UVzPsPCXGOpeocL0N0tHPjqb+TfF6WlS+fvxppRo34yJH2qrogx5aWOEzA7dq1i7fffpvXXnuNLVu2EBMTwwMPPMCtt94a7NBKpEOHYOXKnLWUv/462QQREuLURmrVck4SGb9w4+Kgdm3n12xamvM3r89PZZu0NOfO/P37C+/zCQ93fsVn/M14fvAg7N59stxZZzkJtlw556Tu7eFvXW7rPddl3DiavQlzxozgJvMLLrA+jmCHUeodP36cL774grfffpuLL76YBx54gJSUFEaNGkX//v1p0aKF3RcSQEeOOCMFe9ZOfv7ZGafLU0iIcyILDXX+Zn/ub92plFu8GH7/3fm1LeLUOrp3z3pSP9Xn2V+Hhfm+uKConLSLQg0wg11VZYmjSEpKSqJHjx6kp6fTtGlTBgwYwMCBA60/pJB466S/4ILgxhDsX9lF5aRdFAQlcYhIb+AlIBQYo6qjvJSJBV4EwoHdqnqhu/xu4EZAgT+BRFU9KiKPA1cC6cBOIEFVt/qLwxJH0bZz504+++wzJk6cyMyZM1FV/vzzT1q0aEFqaiplypQJdoglWlE4WRaFGExOhZ44RCQUWA1cAmwG5gGDVHW5R5kzgF+A3qq6UURqqOpOEYkCZgPNVPWIiEwCvlHVsSJSSVUPuNsPc8vc4i8WSxzFx/bt25k2bRoJCQmICDfccAMLFixgwIABmbMXGmMKh6/EEcgrv88D1qrqX6qaCnyMU1PwNBj4TFU3AqiqZ2trGFBeRMKACGCrW+aAR5lInBqJKSFq1apFYmJiZl/HBRdcQGRkJA8//DCNGzemffv2vPXWW0GO0pjSLZCJIwrY5PF6s7vMUxOgiogkicgCEYkDUNUtwGhgI7AN2K+q0zM2EpH/E5FNwBBguLeDi8jNIjJfRObv2rWrwN6UKVw33HADc+bMYePGjTz33HOEh4ezdOlSANLT03n55ZfZsGFDkKM0pnQJZFNVf6CXqt7ovr4OOE9V7/Ao8wrQAegBlAd+BfoAu4BPgYHAPmAy8ImqfpTtGA8B5VR1hL9YrKmqZElLSyM0NJSFCxfSvn17ADp27MiAAQPo378/devWDXKExpQMwWiq2gx4/g+Oxm1uylbmW1U9rKq7gVlAa+BiYL2q7lLV48BngLdrPcYD/yjwyE2RlnHzYLt27Vi3bh2jRo0iNTWVe++9l3r16jFnzhwASsMVg8YEQyATxzygsYg0EJEywLXAl9nKTAG6ikiYiEQAHYEVOE1U54tIhDiN3T3c5YiI5/CrVwArA/geTBF31lln8eCDD7Jw4UJWr17NqFGjOPfccwEYPnw43bp145VXXmHbtm1BjtSYkiNgiUNVTwC3A9/hnPQnqeoyEblFRG5xy6wAvgWWAL/jXLK7VFXnAp8AC3EuxQ0BMnpER4nIUhFZAvQE7gzUezDFS+PGjXnwwQczL9+Njo5m79693HHHHURFRdG9e3fee++9IEdpTPFnNwCaEm/ZsmVMnjyZiRMn0rJlSyZNmgTA+++/T5cuXWjYsGGQIzSmaLI7xy1xlHqqSkpKCpGRkWzcuJH69esDTk3l0ksvpXfv3sTGxlK+fPkgR2pM0RCMznFjihQRITIyEoB69eqxZs0aXn75ZRo3bszbb7/NZZddxieffALA7t27Wb16tXWwG+OFJQ5TajVq1Ig77riDqVOnsmfPHr799lv69OkDwPjx4zn77LNp1KgRt99+O19//TWHDx8OcsTGFA3WVGWMF5s2beKrr75i2rRp/Pjjj6SkpBAREcHOnTuJjIzkwIEDVKxY0UbzNSWa9XFY4jCn6OjRo8yePZulS5dy1113AXDxxRezdu3azL6RHj16UKFCheAGakwBs8RhicMUoPfee48pU6YwY8YMDh06RHh4OHfccQfPPfdcsEMzpsD4ShxhwQjGmOIuMTGRxMREUlNTmTNnDtOmTaNZs2YA7N+/n3bt2tGjRw969+7NxRdfTKVKlYIcsTEFx2ocxhSw5ORk7r33Xr7//nsOHjxIWFgYnTt35plnnuG8884LdnjG5JldjmtMIYmJieHTTz9lz549JCUlcd9997Fv377MS4GnTp3K9ddfz+TJk9m3b19wgzXmFFiNw5hC9tprr/Gf//yH/fv3IyI0a9aMjh078sYbbxAeHh7s8IzJZJ3jljhMEXLixAnmzp3LDz/8wNy5c9m5cycZ39GEhASSk5Pp2LEj559/Ph07dqROnTpBjtiURtY5bkwRktHv0blz5xzrGjZsyIoVK3jhhRc4fvw4AH369OHrr78GYPHixTRu3JiIiIhCjdmYDJY4jCliHnnkER555BGOHj3KokWLmDt3LpUrVwacmsoFF1zAsWPHaNmyZWatJDY2lpiYmOAGboq09PR0Dh8+nOPRpk0bypcvz7Jly/jll18yl0dFZZ+w9SRrqjKmGDl+/Djffvstc+fOZe7cufz+++8cOHCAxx57jOHDh7N//36effbZzCau6tWrBztk40fG+VdESEtL4+DBg5w4cYJjx45lnsDr16/PmWeeybZt2/jhhx9ynPhvuOEGmjRpwpw5c3jyySezrEtJSeHzzz+nbdu2jBkzhptuuilHDEuXLqV58+a89NJLmTe4AnTr1o1Zs2ZZU5UxxV14eDh9+/alb9++gPMrctWqVVSsWBFwhpAfNWoUaWlpADRo0ICOHTvy0EMP0apVq6DFHQhpaWkcP34cVc0c0XjDhg0cPXqU1NTUzEeVKlU455xzAJgyZUqO9U2bNiU2Npb09HQee+wxjh8/zokTJzL/XnTRRVx11VUcPnyYm2++Ocf6oUOHMmTIEHbu3Enfvn2zrDt+/Dj/+c9/SExMZPXq1XTq1CnH+jFjxnDDDTcwf/58zj///Bzv8+OPP2bgwIEsX76cuLi4LOvKli3LRRddRJMmTUhNTWXHjh1ERkZSvXp1YmJiiIyMzLyar2PHjjzzzDOZyzIe9erVA5y+tauvvjpzeZkyZQgJ8X7hrSUOY4qxkJAQmjZtmvn6ggsuYP/+/SxcuJDffvuNuXPnMnv2bE6cOAHApEmTGD16dJaO94YNG/occ+vEiROkpqZm9qfs2rWLvXv3cuzYMY4dO0ZqamrmcQF+/vlnNmzYkLn+2LFjVKhQIfOX7muvvcaKFStITU3NXF+3bl2eeeYZAG666SaWLl2a5cTerl07JkyYADjTBWdsn56eDkDfvn358ktnctGOHTuyY8eOLO9h0KBBjB8/HoAhQ4bkGKzypptuIjY2FhFh5MiRhIWFER4envn3zDPP5KqrrkJV+f3333OsP3LkCOBMaXzmmWdmWR8WFkbNmjUBqFy5MoMGDcqybVhYGG3atAGgfv36PP/884SHhxMeHp55As+496dTp06sWbMmc3lERARhYSdP4d27d8dfy0rLli1p2bKlz/WVK1fObBLNjTVVGVMKqCoiwpdffsnzzz/P/PnzM0+gVatWZefOnYSEhPDAAw/w9ttvZ57U09PTKV++PCkpKYBz4s04CWeoUaNG5sm6X79+TJkyJcv6s846i3Xr1gFOJ/8vv/xC2bJlMx8tW7bk008/BeDWW29l3bp1lClThrJly1KmTBmaNWvGI488AsCoUaPYu3cvZcqUyXw0btyYq6++GoDJkydz4sSJLOvr1KmTecL8888/CQsLy7I+MjIyc5yxjM/JOOxyXEscxmQ6ceIEy5cvz+wnee211wgPD2fixInMmTMny4m9fPny3HfffYBTo9i4cWPmSb1s2bJERkbSpUsXwBlV+OjRo1m2z9iHKX4scVjiMMaYfLEhR4wxxhQISxzGGGPyxRKHMcaYfLHEYYwxJl8scRhjjMkXSxzGGGPyxRKHMcaYfLHEYYwxJl9KxQ2AIrIL2BDkMKoBu4McQ1Fhn8VJ9lmcZJ/FSUXls6ivqjmGWC4ViaMoEJH53u7ALI3sszjJPouT7LM4qah/FtZUZYwxJl8scRhjjMkXSxyF561gB1CE2Gdxkn0WJ9lncVKR/iysj8MYY0y+WI3DGGNMvljiMMYYky+WOAJMROqKyE8iskJElonIncGOKdhEJFRE/hCRr4MdSzCJyBki8omIrHS/H52CHVOwiMjd7v+PpSIyQUTKBTumwiIi74rIThFZ6rHsTBH5XkTWuH+rBDPG7CxxBN4J4F5VbQqcD/xLRJoFOaZguxNYEewgioCXgG9V9RygNaX0MxGRKGAY0EFVWwChwLXBjapQjQV6Z1v2b2CGqjYGZriviwxLHAGmqttUdaH7/CDOySEquFEFj4hEA32AMcGOJZhEpBLQDXgHQFVTVXVfUIMKrjCgvIiEARHA1iDHU2hUdRbwd7bFVwLvu8/fB/oVZky5scRRiEQkBmgLzA1yKMH0IvAAkB7kOILtLGAX8J7bbDdGRCKDHVQwqOoWYDSwEdgG7FfV6cGNKuhqquo2cH58AjWCHE8WljgKiYhUAD4F7lLVA8GOJxhE5HJgp6ouCHYsRUAY0A54XVXbAocpYs0RhcVtv78SaADUASJFZGhwozL+WOIoBCISjpM0xqnqZ8GOJ4g6A1eISDLwMXCRiHwU3JCCZjOwWVUzap+f4CSS0uhiYL2q7lLV48BnwAVBjinYdohIbQD3784gx5OFJY4AExHBacdeoarPBzueYFLVh1Q1WlVjcDo/f1TVUvnLUlW3A5tE5Gx3UQ9geRBDCqaNwPkiEuH+f+lBKb1QwMOXQLz7PB6YEsRYcggLdgClQGfgOuBPEVnkLvuPqn4TvJBMEXEHME5EygB/AYlBjicoVHWuiHwCLMS5CvEPiviQGwVJRCYAsUA1EdkMjABGAZNE5AacxNo/eBHmZEOOGGOMyRdrqjLGGJMvljiMMcbkiyUOY4wx+WKJwxhjTL5Y4jDGGJMvljhMvohITREZLyJ/icgCEflVRK4K0LHGisg17vMxGYNDikiyiFTLx34edkdeXSIii0SkY16Pm894Y0Rk8Cls5/V4InK+iMx1Y14hIo/m4fhL/ZXJYzyxIpKvG/BEpKyI/ODGOjDbOq/vQ0SuEJHTulvejbVUj7IcDHYfh8kz9+asL4D3VXWwu6w+cIWXsmGqeqKgjq2qN57Kdu5Q5ZcD7VT1mJtwyhRUXNnEAIOB8QW0v/eBAaq6WERCgbNz26CAxAKHgF/ysU1bIFxV23hZ5/V9qOqXODe6mWLGahwmPy4CUlX1jYwFqrpBVf8HICIJIjJZRL4CpotIpDvXwDx3IL8r3XKhIvKsu3yJiPzTXS4i8oqILBeRqXgM7CYiSSLSwTMYEXlcPOY3EZH/E5Fh2WKuDexW1WNuvLtVdatbvr2IzHRrTt9lDPGQ7Rhey4hII/cX9mIRWSgiDXFu2urq/rK++1TeZzY1cAb9Q1XTVHW5u/2jInKfR4xLxRlAEyBMRN53j/eJiES4ZUa5x1siIqPdZdVF5FM3vnki0tndzy3A3e776Jrt8zhTRL5w9/ObiLQSkRrAR0Abd5uGeXwfCSLyivt8kcfjiIhc6Ov744uInOuWO8tfOVMAVNUe9sjTA2fOhBf8rE/AGYPpTPf1k8BQ9/kZwGogErgZ+K+7vCwwH2eAu6uB73HmY6gD7AOuccsl4czXAJAMVMP5hb/QXRYCrAOqZoupArDIPfZrwIXu8nCcX9TV3dcDgXfd52OBa3IpMxe4yn1eDmco8Fjga49j5/t9Zot9OLAX+Bz4J1DOXf4ocJ9HuaXuZxEDKNDZXf4ucB9wJrCKkzf8nuH+HQ90cZ/XwxkWJ8f+s8X0P2CE+/wiYJH7PMt7z+P7SABeyVa2L/Cz+9l7/f5kKx8LfI0zttUCoF6w/5+Uhoc1VZlTJiKvAl1waiHnuou/V9WMuQV64gxqmPHruBzOCaon0EpOtutXBhrjzE8xQVXTgK0i8qO/46tqsojsEZG2QE3gD1Xdk63MIRFpD3QFugMT3Xb1+UAL4HunBY5Q3F/FHs72VkZEKgJRqvq5e4yj7ueRPcTTep+qOlJExrn7GQwMwjlR+rNJVee4zz/CSfYvAkeBMW4NJ6NP4GKgmUfcldz35k8X4B9ufD+KSFURqexvg7y+DxFpDDwLXKSqx0XE1/cn+zhWTXGGKOmpbm3SBJYlDpMfy3BPGgCq+i9x+gzme5Q57PFcgH+o6irPnYhzprpDVb/LtvwynF/M+TEG55drLZxf2Dm4J+gkIElE/sQZNG4BsExV/U3XKt7KiDMJU16c9vtU1XXA6yLyNrBLRKrijOfk2czsOc1q9v2qqp4QkfNwBg+8Frgdp7YQAnRS1SPZ4svtPeUI8xTfh+cxI4FJwE0eJ3+v3x8vtuF8Bm0pRRNABZP1cZj8+BEoJyK3eiyL8FP+O+AON1Hg1gwylt8qznDziEgT98QxC7jW7RuojVNDyM3nONNunuvuNwsROdv9JZuhDbABp+mmurjzfItIuIg0z7a51zLqzKeyWUT6ucvLun0JBwHPX+yn9T5FpI+cPIs3BtJwmrWScYdgF5F2OM1fGerJybnLBwGzxZkLprI6A2ve5X4GANNxkkjG8TKWZ38fnmYBQ9zysTj9R37nl/HzPjy9B7ynqj97LPP1/cluH86skk+6MZkAsxqHyTNVVfdk+YKIPIAzg91h4EEfmzyO00yyxP3Pn4xzhdMY3P4Jd/kunKkxP8f5JfwnTnv2zDzElCoiPwH73JpFdhWA/4nIGTi/1NcCN7vbXQO87Da1hLmxLsu2b19lrgPeFJGRwHGc0UuXACdEZDFOP8lLp/k+r8P5rFPc2IeoapqIfArEiTPa8jx3HxlWAPEi8iawBngdp4lsioiUw/kVf7dbdhjwqogscd/bLJyO8a+AT9zO6DuyncwfxZm1cAmQwsmhv/3x9T6AzCvzrgGaiMj17jY34vv7k4Oq7hCRvsA0EbleT85zYgLARsc1xZqIhOAMx91fVdcEOx5jSgNrqjLFljg3BK4FZljSMKbwWI3DGGNMvliNwxhjTL5Y4jDGGJMvljiMMcbkiyUOY4wx+WKJwxhjTL78P1X7M09B1NqzAAAAAElFTkSuQmCC\n", "text/plain": [ "The above plot will support any of the stopping criteria defined above. Assuming we haven't predefined $k$, we might choose $k=7$, as that is where the error really seems to plateau. Additionally, we could be more conservative. The red line shows the min value of (mean+1 std error). Any value of $k$ that is less than this value is statistically the same as the $k$ with the lowest mean error. If we were using that rule, we'd potentially stop at $k=3$.
\n",
"
Another way to bring the learning down to $k$ features is to find a projection matrix that produces a rank-$k$ approximation of our training data $X$. The best way to get a low-rank approximation, from a signal preservation perspective, is by the use of the Singular Value Decomposition. A review of the SVD can be found here:
\n",
"http://nbviewer.ipython.org/github/briandalessandro/DataScienceCourse/blob/master/ipython/Lecture3_PhotoSVD.ipynb\n",
"
\n",
"We'll put it to work here by doing the following:\n",
"
Now let's decompose the training data.
" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "U, sig, Vt = np.linalg.svd(X_train_norm)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Out of curiosity, let's plot the spectrum to get a sense of how independent the features are."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXQAAAD4CAYAAAD8Zh1EAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/d3fzzAAAACXBIWXMAAAsTAAALEwEAmpwYAAAh00lEQVR4nO3deXhV5bn+8e9DmGcxYUoIiYwCMgZwaosDigpFW1txaB0PRaW2nh6n/o6d7GltT+3RtgilFueKY1tULM6i4kDCEGYMYUrCEKYwJWR6fn8kbdM0mA3snbWH+3NdXGbt9bL3vSXcrKz9rneZuyMiIrGvWdABREQkPFToIiJxQoUuIhInVOgiInFChS4iEieaB/XCycnJnpGREdTLi4jEpJycnF3untLQvsAKPSMjg+zs7KBeXkQkJpnZ5qPt0ykXEZE4oUIXEYkTKnQRkTihQhcRiRMqdBGRONFooZvZHDPbaWYrj7LfzOw3ZpZnZrlmNjL8MUVEpDGhHKE/Bkz4nP0XAf1qf00FZp54LBEROVaNzkN394VmlvE5QyYDT3jNOrwfm1lnM+vh7tvCFVJEJNbtO1zOisISVhSWMDS1M2f3Sw77a4TjwqJUYGud7YLax/6t0M1sKjVH8aSnp4fhpUVEok/d8l5ZWEJuQQkFe0v/sf/mcX2ittCtgccavGuGu88GZgNkZWXpzhoiEvPql/eKwhK27vlnead3acuwtM5cPbY3p6V2YkhqRzq3bRmRLOEo9AKgV53tNKAoDM8rIhJVQinvoamduWpM5Mu7IeEo9HnAdDObC4wFSnT+XERiXbSXd0MaLXQzewYYBySbWQHwQ6AFgLvPAuYDFwN5wGHg+kiFFRGJhH2Hy1lZuJ/cwn0NlnevLm2irrwbEsoslysb2e/ArWFLJCISYeu2H+DttTtZUbgvZsu7IYEtnysi0pQ2FB/kleXbeCW3iM92HgRiu7wbokIXkbi1dc9hXs4t4pXl21i9bT9mMDqjC/dNHsyEIT1I6dAq6IhhpUIXkbiyraSUV3O38XLuNpZv3QfAiPTO3DtxEJec1oPunVoHGzCCVOgiEvN2HijjtRXbeSW3iMWb9gIwJLUjd180kEtO60GvLm0DTtg0VOgiEpP2HirntZU1Jf5x/m6qHQZ068D3xvdn4rCeZCa3Czpik1Ohi0jM2F9WweurdvDy8iI+zNtFZbWTmdyO6ef0ZeKwnvTv1iHoiIFSoYtIVDt0pJI31+zg5eXbWLi+mPKqalI7t+GmL5zCxKE9GNyzI2YNrUCSeFToIhJ1yiqqeGftTl7J3cZba3dQVlFNt46t+MYZvZk4tAfDe3VWiTdAhS4iUeFIZRXvr9/FK7lFvLF6B4fKq0hu35KvjerFpGE9yep9Es2aqcQ/jwpdRAJTUVXNog27eXl5EQtWbedAWSWd27Zg0rCeTBrWk7GZXWiepDtlhkqFLiJNrvjAEf70yRae+mQzxQeO0KFVc8YP7sakoT05q28yLZurxI+HCl1EmsyKghIeXbSRV5Zvo7yqmnEDUpgyOp1xA1Jo3SIp6HgxT4UuIhFVUVXNglXbeezDTWRv3kvblklcOaYX3zwzgz4p7YOOF1dU6CISEXsOlfPMp1t46uPNbCspI71LW+6dOIivZaXRsXWLoOPFJRW6iITVmm37efTDjfxlWRHlldWc3TeZ+yYP4ZyBXUnSLJWIUqGLyAmrqnbeWL2DRz/cyCcb99C6RTMuH5XGdWdmJPzVm01JhS4ix63kcAVzF2/hiY82U7ivlNTObbjnooFcMbpXTK8rHqtCKnQzmwA8BCQBj7j7/fX2nwTMAfoAZcAN7r4yzFlFJEqs33GAxxZt4s9LCimtqGJsZhfunXgq55/aTfPGAxTKPUWTgBnAeKAAWGxm89x9dZ1h3weWuftlZjawdvx5kQgsIsGornbeXruTxxZt4oO8XbRs3oxLh/fkujMzGdSzY9DxhNCO0McAee6eD2Bmc4HJQN1CHwT8HMDd15pZhpl1c/cd4Q4sIk1rf1kFz2cX8MRHm9i8+zDdO7bmjgsHcOWYdLq002mVaBJKoacCW+tsFwBj641ZDnwF+MDMxgC9gTTgXwrdzKYCUwHS09OPM7KINIUNxQd5YtEmXsgp4FB5FaN6n8QdFw7gwsHdaaHTKlEplEJvaJ6R19u+H3jIzJYBK4ClQOW//Sb32cBsgKysrPrPISIBq652Fn5WzKMfbuK99cW0SDImDe3JdWdlMDStc9DxpBGhFHoB0KvOdhpQVHeAu+8HrgewmjUtN9b+EpEYcPBIJS8tKeCxRZvILz5ESodW3H5+f64c24uuHeL3HpzxJpRCXwz0M7NMoBCYAlxVd4CZdQYOu3s5cBOwsLbkRSSK7TtcziPvb+TxjzZxoKySYWmdePCK4Vx8Wg8tkBWDGi10d680s+nAAmqmLc5x91VmNq12/yzgVOAJM6ui5sPSGyOYWURO0N5D5TzyQT6PL9rMwSOVXDSkO//xxVMYoRtHxLSQ5qG7+3xgfr3HZtX5+iOgX3ijiUi47TlUzh/ez+eJRZs4XFHFxUN68O3z+jKwu6YdxgNdKSqSAHYfPMLs9/N58qPNlFZUcclpPbjtvH66LD/OqNBF4tiug0f4w8J8nvhoM2WVVUwa2pNvn9uXfiryuKRCF4lDxQeOMHvhBp76eAtHKquYNKymyPt2VZHHMxW6SBzZeaCM37+Xz9OfbKa8sprJw1OZfm5f3UgiQajQReLAzv1lzKot8oqqai4dkcr0c/pyioo8oajQRWLYjv1lzHx3A898uoXKaufS2iPyzOR2QUeTAKjQRWLQ9pIyZr6bxzOLt1JV7XxlRE2R9z5ZRZ7IVOgiMWRbSSkz393A3E+3Uu3OV0emces5fUk/uW3Q0SQKqNBFYkDRvlIefjeP5xYXUO3O5aNqirxXFxW5/JMKXSSKFe4r5eF38nguu2YF68tH9eKWcX1U5NIgFbpIFCrYe5gZ72zghZyaIv96Vi9uHteHtJNU5HJ0KnSRKLJ1z2FmvJPHCzkFNDNjyuh0bh7Xh56d2wQdTWKACl0kChTtK+W3b3/G89k1RX7V2Joi79FJRS6hU6GLBGj3wSM8/O4Gnvx4MzhcPTadm8f1pXsn3VRCjp0KXSQA+8sqeGRhPn/8YCOlFVVcPiqN287rp3PkckJU6CJNqLS8iscWbWLWexsoKa3gkqE9uP38/vTtqkv05cSFVOhmNgF4iJo7Fj3i7vfX298JeApIr33OX7n7o2HOKhKzyiurmbt4C799O4/iA0c4Z0AK37tgAENSOwUdTeJIo4VuZknADGA8NTeMXmxm89x9dZ1htwKr3X2SmaUA68zs6dp7jIokrKpq589LC3nwzfUU7C1lTEYXHr56JKMzugQdTeJQKEfoY4A8d88HMLO5wGRq7h36dw50sJqbEbYH9gCVYc4qEjPcnb+t3M4Db6wnb+dBhqR25H8uO40v9kvWPTslYkIp9FRga53tAmBsvTG/A+YBRUAH4Ap3rw5LQpEY4u4s/GwXv1qwjhWFJfRJacfMq0cyYUh3FblEXCiF3tB3odfbvhBYBpwL9AHeMLP33X3/vzyR2VRgKkB6evoxhxWJZtmb9vDLBev4dOMeUju34VdfG8alw3vSPKlZ0NEkQYRS6AVArzrbadQcidd1PXC/uzuQZ2YbgYHAp3UHuftsYDZAVlZW/X8URGLSysISHnh9He+sKya5fSt+MnkwV4zuRavmSUFHkwQTSqEvBvqZWSZQCEwBrqo3ZgtwHvC+mXUDBgD54QwqEm02FB/k12+s59XcbXRq04K7Jgzk2jN707alZgNLMBr9znP3SjObDiygZtriHHdfZWbTavfPAu4DHjOzFdScornL3XdFMLdIYAr3lfLQm+t5IaeA1i2S+Pa5fbnpC6fQqU2LoKNJggvpUMLd5wPz6z02q87XRcAF4Y0mEl2KDxzh4XfzePrjLQBcd2Ymt5zTh+T2rQJOJlJDPxuKNKKktII/LMxnzocbOVJZzeUj07jt/H6kagVEiTIqdJGjOFxeWXOZ/rsb2F9WycShPbh9fH/6pOgyfYlOKnSReo5UVjH306389u08dh08wrkDu/K9C/ozuKcu05fopkIXqeXuvLZyO//z6hoK95UyJrMLs64ZSZYu05cYoUIXAdZtP8CP5q3io/zdDOzegSduGMMXdJm+xBgVuiS0ksMV/N+b63ny4820b9Wc+yYP5sox6bq6U2KSCl0SUlW181z2Vv53wTr2Hi7nqjHp/NcFAzipXcugo4kcNxW6JJyczXv50bxVrCgsYXTGSfxw0hitSy5xQYUuCWPn/jLuf20tLy0tpFvHVjw0ZThfHtZT58klbqjQJe6VV1bz6Icb+c1bn1FR5dwyrg+3ntOXdq307S/xRd/REtfeXbeTn7y8mvxdhzhvYFfunTiIjOR2QccSiQgVusSlTbsO8dNXV/Pmmp1kJrfj0etGc87ArkHHEokoFbrElUNHKnn43Tz+sHAjLZKMuy8ayA1nZdKyuaYhSvxToUtccHfmLS/i5/PXsn1/GZeNSOXuiwbSrWProKOJNBkVusS81UX7+dG8VXy6aQ+De3bkd1eN0OX6kpBU6BKz9h4q54E31vGnT7bQqU0LfnbZaVwxuhdJzTQNURKTCl1iTlW186dPt/DA6+vYX1rBN8/I4Pbz+9Opre4YJIktpEI3swnAQ9Tcgu4Rd7+/3v47gKvrPOepQIq77wljVhE+3biHH85bxZpt+zn9lC786MuDGdi9Y9CxRKJCo4VuZknADGA8UAAsNrN57r7672Pc/X+B/60dPwm4XWUu4bS9pIyfzV/DvOVF9OzUmhlXjeTi07rrKk+ROkI5Qh8D5Ll7PoCZzQUmA6uPMv5K4JnwxJNEd6Syikfe38iMd/KorHZuO7cv08b1oW1LnS0UqS+UvxWpwNY62wXA2IYGmllbYAIw/Sj7pwJTAdLT048pqCQWd+fttTv5ySur2bz7MBcM6sa9EwfRq0vboKOJRK1QCr2hn2n9KGMnAR8e7XSLu88GZgNkZWUd7Tkkwe0+eITv/3kFC1btoE9KO564YQxf7J8SdCyRqBdKoRcAvepspwFFRxk7BZ1ukRPwzrqd3PF8LvtLK7hrwkBuPFtXeYqEKpRCXwz0M7NMoJCa0r6q/iAz6wR8CbgmrAklIZSWV/Gz+Wt48uPN9O/WniduGMOgnpq9InIsGi10d680s+nAAmqmLc5x91VmNq12/6zaoZcBr7v7oYillbiUW7CP7z67jPziQ9x4diZ3XDiA1i2Sgo4lEnPMPZhT2VlZWZ6dnR3Ia0t0qKyqZtZ7G3jwzc9Ibt+KB74+jLP6JgcdSySqmVmOu2c1tE9zvyQQW3Yf5vbnlpGzeS8Th/bgp5cOoXNb3c9T5ESo0KVJuTvPZxfw45dX0ayZ8eAVw5k8XLeBEwkHFbo0mT2HyrnnpVwWrNrB2Mwu/PqK4aR2bhN0LJG4oUKXJvHOup3c+UIu+w6Xc89FA7npC6doVUSRMFOhS0TVn474+PWajigSKSp0iZgVBSV859ml5Bcf4oazMrlzgqYjikSSCl3Crqramflu3j+mIz5141jO7qfpiCKRpkKXsKo7HfGSoT34H01HFGkyKnQJC3fn+ZwCfjxvFc3M+L8rhnHp8FRNRxRpQip0OWH1pyM+8PVhpJ2kZW5FmpoKXU6IpiOKRA8VuhyX0vIqfv7aGp74qGY64mPXj2Zwz05BxxJJaCp0OWYrCkr47rNL2aDpiCJRRYUuIdN0RJHopkKXkGzdc5jbn11GtqYjikQtFbp8Lk1HFIkdKnQ5qpLSCu5+MZfXVm7XdESRGBBSoZvZBOAham5B94i739/AmHHAg0ALYJe7fylsKaXJrSoq4Zanl1C4t5S7Jgxk6hc1HVEk2jVa6GaWBMwAxgMFwGIzm+fuq+uM6Qw8DExw9y1m1jVCeaUJPJ+9lf/+y0o6t23B3Kmnk5XRJehIIhKCUI7QxwB57p4PYGZzgcnA6jpjrgJecvctAO6+M9xBJfLKKqr48cureObTrZxxysn85soRpHRoFXQsEQlRKIWeCmyts10AjK03pj/QwszeBToAD7n7E/WfyMymAlMB0tPTjyevRMjWPYe5+ekcVhbu55ZxffjP8f1pntQs6FgicgxCKfSGTpx6A88zCjgPaAN8ZGYfu/v6f/lN7rOB2QBZWVn1n0MC8vbaHXx37jIc+MM3sxg/qFvQkUTkOIRS6AVArzrbaUBRA2N2ufsh4JCZLQSGAeuRqFVV7fzfG+v53Tt5DOrRkZnXjKT3ye2CjiUixymUn6kXA/3MLNPMWgJTgHn1xvwV+IKZNTezttScklkT3qgSTrsPHuGbcz7hd+/k8fWsNF665UyVuUiMa/QI3d0rzWw6sICaaYtz3H2VmU2r3T/L3deY2d+AXKCamqmNKyMZXI5fzua9TP/TEnYfKucXXz2NK0br8wyReGDuwZzKzsrK8uzs7EBeO1G5O48v2sRPX11Dj86tmXn1KIakaoVEkVhiZjnuntXQPl0pmiAOHank7pdW8PLyIs4/tSsPfG04ndq2CDqWiISRCj0B5O08wLSnlpBffJA7LhzAzV/qQzNd9SkSd1Toce6V3CLufCGXNi2SePLGsZzVV8vdisQrFXqcKq+s5uevreHRDzcxMr0zD189iu6dWgcdS0QiSIUeh7aXlHHrn5aQs3kv15+VwT0XnUrL5rrqUyTeqdDjzId5u7jtmaWUVlTxu6tGMHFoz6AjiUgTUaHHiepqZ+Z7G3jg9XWcktKeZ68ZSd+uHYKOJSJNSIUeB0oOV/Cfzy3jrbU7mTSsJ/d/5TTatdIfrUii0d/6GLeysISbn85he0kZP5o0iGvPzNDt4UQSlAo9hj27eAv3/nUVXdq2ZO7UMxjV+6SgI4lIgFToMaisooof/HUlz2UXcHbfZB6aMpyT2+tGFCKJToUeY7bsPsy0p3JYvW0/3z63L989v7/u9SkigAo9pry5ege3P7eMZmbMuS6LcwfqRhQi8k8q9BhQWVXNr99Yz8PvbmBIakdmXj2KXl3aBh1LRKKMCj3KuTt3vpjLS0sKuXJMOj+cNIjWLZKCjiUiUUiFHuUeeuszXlpSyHfO68ft4/sHHUdEolhIC3yY2QQzW2dmeWZ2dwP7x5lZiZktq/31g/BHTTwv5hTw4Juf8dWRaXz3/H5BxxGRKNfoEbqZJQEzgPHU3Ax6sZnNc/fV9Ya+7+4TI5AxIS3asIu7X8rlzD4n8/OvnKaLhUSkUaEcoY8B8tw9393LgbnA5MjGSmyf7TjAt57MIePkdsy8ZpRWShSRkITSFKnA1jrbBbWP1XeGmS03s9fMbHBY0iWgnQfKuO7RxbRukcSj14+mUxvdJk5EQhPKh6IN/axf/87SS4De7n7QzC4G/gL820lfM5sKTAVIT9ed5us7XF7JTY9ns+dQOc9+63TSTtLURBEJXShH6AVArzrbaUBR3QHuvt/dD9Z+PR9oYWb/dq8zd5/t7lnunpWSknICseNPVbXznbnLWFFYwm+uHMHQtM5BRxKRGBNKoS8G+plZppm1BKYA8+oOMLPuVvupnZmNqX3e3eEOG89++upq3li9gx9OHMT4QboCVESOXaOnXNy90symAwuAJGCOu68ys2m1+2cBlwM3m1klUApMcff6p2XkKOZ8sJFHP9zEDWdlct1ZmUHHEZEYZUH1blZWlmdnZwfy2tHk9VXb+dZTOYw/tRszrxmlhbZE5HOZWY67ZzW0T/PhArR86z5um7uUoamdeGjKCJW5iJwQFXpAtu45zI2PZ5PcvhWPXDuaNi21PouInBit5RKAktIKrn9sMeWVVcydOpaUDro5hYicOBV6EyuvrGbakzls3n2IJ24YS9+uHYKOJCJxQoXehNydu1/K5aP83fz668M4o8/JQUcSkTiic+hN6O9L4d5+fn++MjIt6DgiEmdU6E3khTpL4d52Xt+g44hIHFKhN4FFG3Zxj5bCFZEIU6FHmJbCFZGmonaJIC2FKyJNSYUeIXWXwv3jtVlaCldEIk7TFiOg7lK4s7+RpaVwRaRJ6Ag9ArQUrogEQYUeZloKV0SCokIPo9dXbee+V1dzwaBu/L9LTg06jogkGBV6mGgpXBEJmgo9DLQUrohEg5AK3cwmmNk6M8szs7s/Z9xoM6sys8vDFzG61V0K97HrR2spXBEJTKOFbmZJwAzgImAQcKWZDTrKuF9Qc+/RhFB3KdzffyNLS+GKSKBCOUIfA+S5e767lwNzgckNjPs28CKwM4z5olbdpXB/eflQLYUrIoELpdBTga11tgtqH/sHM0sFLgNmfd4TmdlUM8s2s+zi4uJjzRpV6i6Fe9kILYUrIsELpdAbmq7h9bYfBO5y96rPeyJ3n+3uWe6elZKSEmLE6KOlcEUkGoVy6X8B0KvOdhpQVG9MFjC3dlnYZOBiM6t097+EI2Q00VK4IhKtQin0xUA/M8sECoEpwFV1B7j7Py6JNLPHgFfiscy1FK6IRLNGC93dK81sOjWzV5KAOe6+ysym1e7/3PPm8WLf4XJueFxL4YpI9ApptUV3nw/Mr/dYg0Xu7tedeKzoUl3t3P7sMraXlPHct87QUrgiEpV0ziAEv307j3fWFfODSYMZkX5S0HFERBqkQm/Ee+uLefCt9Vw2IpVrxqYHHUdE5KhU6J+jYO9hvjN3KQO6deBnl2lGi4hENxX6UZRVVHHL00uoqnJmXjNKC26JSNTTLeiO4scvrya3oITZ3xhFZnK7oOOIiDRKR+gNeD57K898uoVpX+rDBYO7Bx1HRCQkKvR6VhWV8N9/WckZp5zMf13QP+g4IiIhU6HXUXK4gpufWkLnti347VUjaJ6k/z0iEjt0Dr1WdbXzveeXUbSvlGe/dTrJ7XWjChGJLToErTXzvQ28uWYn/33JqYzq3SXoOCIix0yFDnzw2S4eeH0dXx7Wk2vPzAg6jojIcUn4Qi/aV8ptc5fSt2t77v+qLh4SkdiV0IV+pLLm4qHyympmXjOKti31kYKIxK6EbrCfvrKGZVv3MfPqkfRJaR90HBGRE5KwR+h/XlrAkx9vZuoXT+Gi03oEHUdE5IQlZKGv3b6fe15awZjMLtx54YCg44iIhEXCFfr+sgqmPZlDx9Yt+J0uHhKROBJSm5nZBDNbZ2Z5ZnZ3A/snm1mumS0zs2wzOzv8UU+cu/Nfzy2nYG8pM64eSdcOrYOOJCISNo1+KGpmScAMYDxQACw2s3nuvrrOsLeAee7uZjYUeA4YGInAJ+L3C/N5ffUO7p04iNEZunhIROJLKEfoY4A8d89393JgLjC57gB3P+juXrvZDnCizKINu/jl39ZyydAe3HBWRtBxRETCLpRCTwW21tkuqH3sX5jZZWa2FngVuKGhJzKzqbWnZLKLi4uPJ+9x2V5Sxm3PLCUzuR2/+OpQXTwkInEplEJvqP3+7Qjc3f/s7gOBS4H7Gnoid5/t7lnunpWSknJMQY9XeWU1tzydQ2l5Fb//xijat0roqfciEsdCKfQCoFed7TSg6GiD3X0h0MfMkk8wW1j8bP4almzZxy8uH0rfrh2CjiMiEjGhFPpioJ+ZZZpZS2AKMK/uADPra7XnMcxsJNAS2B3usMfqr8sKeWzRJm44K5OJQ3sGHUdEJKIaPf/g7pVmNh1YACQBc9x9lZlNq90/C/gq8E0zqwBKgSvqfEgaiPU7DnD3iysYnXES91wcdRNuRETCzoLq3aysLM/Ozo7Icx8oq2DyjA/ZX1rJq7edTbeOmm8uIvHBzHLcPauhfXH3CaG7c+cLuWzefZinbxqrMheRhBF3173/8YONvLZyO3dNGMDpp5wcdBwRkSYTV4X+Sf5ufv7aWi4a0p3/+MIpQccREWlScVPoO/eXMf2ZpfTu0pZfXq6Lh0Qk8cTFOfSKqmpu/dMSDpZV8vRNY+nQukXQkUREmlxcFPovXlvL4k17eWjKcPp308VDIpKYYv6Uy6u523jkg41cd2YGk4f/2xIzIiIJI6YLPW/nAe58YTkj0zvz/YtPDTqOiEigYrbQDx2pZNpTS2jdIokZV4+kZfOYfSsiImERk+fQ3Z27Xswlv/ggT904lh6d2gQdSUQkcDF5WPvoh5t4JXcbd1w4kDP7RsWijiIigYu5Qs/etIefzV/D+EHdmPYlXTwkIvJ3MVfobVomcUafk3ng68N08ZCISB0xdw59cM9OPHnj2KBjiIhEnZg7QhcRkYap0EVE4oQKXUQkToRU6GY2wczWmVmemd3dwP6rzSy39tciMxsW/qgiIvJ5Gi10M0sCZgAXAYOAK81sUL1hG4EvuftQ4D5gdriDiojI5wvlCH0MkOfu+e5eDswFJtcd4O6L3H1v7ebHQFp4Y4qISGNCKfRUYGud7YLax47mRuC1hnaY2VQzyzaz7OLi4tBTiohIo0Ip9Iau3vEGB5qdQ02h39XQfnef7e5Z7p6VkpISekoREWlUKBcWFQC96mynAUX1B5nZUOAR4CJ3393Yk+bk5Owys82hBq0nGdh1nL83Vuk9Jwa958RwIu+599F2mHuDB9v/HGDWHFgPnAcUAouBq9x9VZ0x6cDbwDfdfdFxhgyZmWW7e1akXyea6D0nBr3nxBCp99zoEbq7V5rZdGABkATMcfdVZjatdv8s4AfAycDDteurVCbaH5CISNBCWsvF3ecD8+s9NqvO1zcBN4U3moiIHItYvVI0Eee56z0nBr3nxBCR99zoOXQREYkNsXqELiIi9ajQRUTiRMwVemMLhcUbM+tlZu+Y2RozW2Vm3wk6U1MwsyQzW2pmrwSdpamYWWcze8HM1tb+eZ8RdKZIMrPba7+nV5rZM2bWOuhMkWBmc8xsp5mtrPNYFzN7w8w+q/3vSeF4rZgq9BAXCos3lcD33P1U4HTg1gR4zwDfAdYEHaKJPQT8zd0HAsOI4/dvZqnAbUCWuw+hZkr0lGBTRcxjwIR6j90NvOXu/YC3ardPWEwVOiEsFBZv3H2buy+p/foANX/JP28tnZhnZmnAJdRceZwQzKwj8EXgjwDuXu7u+wINFXnNgTa1Fy+2pYEr0OOBuy8E9tR7eDLweO3XjwOXhuO1Yq3Qj3WhsLhiZhnACOCTgKNE2oPAnUB1wDma0ilAMfBo7ammR8ysXdChIsXdC4FfAVuAbUCJu78ebKom1c3dt0HNQRvQNRxPGmuFHvJCYfHGzNoDLwLfdff9QeeJFDObCOx095ygszSx5sBIYKa7jwAOEaYfw6NR7TnjyUAm0BNoZ2bXBJsq9sVaoYe0UFi8MbMW1JT50+7+UtB5Iuws4MtmtomaU2rnmtlTwUZqEgVAgbv//aevF6gp+Hh1PrDR3YvdvQJ4CTgz4ExNaYeZ9QCo/e/OcDxprBX6YqCfmWWaWUtqPkSZF3CmiLKaxXH+CKxx918HnSfS3P0ed09z9wxq/nzfdve4P3Jz9+3AVjMbUPvQecDqACNF2hbgdDNrW/s9fh5x/CFwA+YB19Z+fS3w13A8aUhruUSLoy0UFnCsSDsL+AawwsyW1T72/dr1dSS+fBt4uvZgJR+4PuA8EePun5jZC8ASamZyLSVOlwAws2eAcUCymRUAPwTuB54zsxup+cfta2F5LV36LyISH2LtlIuIiByFCl1EJE6o0EVE4oQKXUQkTqjQRUTihApdRCROqNBFROLE/wf1Z7GM/73wNAAAAABJRU5ErkJggg==\n",
"text/plain": [
" We can see above that the first 5 or 6 singular values account for 70-80% of the sum of squares of our data. Now let's see how a rank-$k$ approximation does in cross-validation.\n",
" Now let's plot the results. We'll also plot it with the stepwise forward selection results above.\n",
" This is a very interesting plot. We can see that SVD approach does almost universally worse than the stepwise approach. Its not until we get to the last two features do we see them equal each other.
\n",
"My interpretation of this is that the SVD approach does not reduce the feature space with the outcome or learning task in mind. It is purely untangling the covariance structure of the matrix $X$. It is quite possible (based on the evidence above), that the most predictive features corresponded to the lowest singular values. That is why we don't see a large reduction in the cross-validated error until the last few columns are added into the matrix.
\n",
"\n",
"This is not to say or prove that dimensionality reduction with SVD is a generally inferior approach. Because the matrix $X$ is dense and not very high dimensional, the SVD approach just might not be appropriate here. It is also slower, as the cost of generating the SVD isn't always cheap.\n",
"\n",
"\n",
"