{ "metadata": { "name": "", "signature": "sha256:9b550808aac6fe6a5d1ca87554b77c90785eb1a5c9e1b8a24272932cac50efaf" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "This self-paced tutorial uses multivariate regression to predict house prices. The high-level goal is to use multiple features (size, number of bedrooms, number of bathrooms, etc.) to predict the price of a house.\n", "\n", "The language used throughout is Python, together with its libraries for scientific computing and machine learning.\n", "\n", "One of those Python tools is the IPython notebook (interactive Python rendered as HTML), which you are reading right now. We'll go over other practical tools, widely used in the data science industry, below." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "**You can run this notebook interactively**\n", "
    \n", "
  1. Install the (free) [Anaconda](https://store.continuum.io/cshop/anaconda/) Python distribution, which includes Python itself.
\n", "
  2. Install the TextBlob library: [TextBlob](http://textblob.readthedocs.org/en/dev/install.html).
\n", "
  3. Download the source for this notebook to your computer: [https://github.com/nmishra/mvregression/blob/master/predict_house_price_python.ipynb](https://github.com/nmishra/mvregression/blob/master/predict_house_price_python.ipynb) and run it with:
    \n", " `$ ipython notebook predict_house_price_python.ipynb`
\n", "
  4. Watch the [IPython tutorial video](https://www.youtube.com/watch?v=H6dLGQw9yFQ) for notebook navigation basics.
\n", "
  5. Run the first code cell below; if it executes without errors, you're good to go!\n", "
\n", "
" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Python tutorial: Multivariate linear regression to predict house prices" ] }, { "cell_type": "code", "collapsed": false, "input": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "import csv\n", "from textblob import TextBlob\n", "import pandas\n", "import sklearn\n", "import pickle\n", "import numpy as np\n", "from sklearn import preprocessing\n", "from sklearn import cross_validation as cv\n", "from sklearn.metrics import explained_variance_score, mean_squared_error,r2_score \n", "from sklearn.pipeline import Pipeline\n", "from sklearn.grid_search import GridSearchCV\n", "from sklearn.cross_validation import cross_val_score, train_test_split \n", "from sklearn.learning_curve import learning_curve\n", "from sklearn.externals import joblib\n", "from sklearn.linear_model import LinearRegression\n" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 1 }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Step 1: Load data, plot data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Labeled data is crucial for supervised learning. Courtesy of redfin.com, we focus on the last 3 years of sales data for ZIP code 94555. The houses in this data set have a lot size of at least 4,500 sq ft, 3+ bedrooms, 2+ bathrooms, and a house size of 1,500 to 2,000 sq ft, and were listed on www.redfin.com for no more than 30 days." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```bash\n", "$ mkdir data && cd data \n", "$ ls -ltrh\n", "-rw-r--r--@ 1 user staff 41K Feb 17 23:19 redfin_2015-02-17_94555_results.csv\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pandas is an amazing library for loading and manipulating data, and we will use it to load the data set."
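, "\n", "
The loading step in the next cell can be tried out on a tiny in-memory file first. The sketch below is only an illustration: it reuses the first three rows printed by `houses.head()` in the next cell, cut down to four columns, and it assumes the CSV has no header row, which is why the column names are passed in through `names=`. The names `csv_text` and `houses_demo` are made up for this example.

```python
import csv
import io

import pandas

# Header-less CSV text standing in for the real data file
# (three rows, four columns, for illustration only).
csv_text = u'''739000,3,2.5,1988
749888,3,2.5,1642
713500,4,2.0,1504
'''

# Column names are supplied by hand because the file has no header row.
labels = ['LIST PRICE', 'BEDS', 'BATHS', 'SQFT']

# QUOTE_NONE tells the parser to treat quote characters as ordinary text.
houses_demo = pandas.read_csv(io.StringIO(csv_text), quoting=csv.QUOTE_NONE, names=labels)

print(houses_demo.shape)
print(houses_demo.dtypes)
```

Without `names=`, pandas would treat the first data row as the header, so one house would silently disappear from the data.
"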
] }, { "cell_type": "code", "collapsed": false, "input": [ "labels=[\"LIST PRICE\",\"BEDS\",\"BATHS\",\"SQFT\",\"LOT SIZE\",\"YEAR BUILT\",\n", " \"PARKING SPOTS\",\"PARKING TYPE\",\"ORIGINAL LIST PRICE\",\"LAST SALE PRICE\"]\n", "houses=pandas.read_csv('./data/redfin_2015-02-17_94555_results_massaged.csv', quoting=csv.QUOTE_NONE,names=labels)\n", "print(houses.head())" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ " LIST PRICE BEDS BATHS SQFT LOT SIZE YEAR BUILT PARKING SPOTS \\\n", "0 739000 3 2.5 1988 5595 1991 2 \n", "1 749888 3 2.5 1642 9876 1986 2 \n", "2 713500 4 2.0 1504 5800 1969 2 \n", "3 749000 3 3.0 1781 5800 1971 2 \n", "4 835000 4 2.5 1857 5061 1990 2 \n", "\n", " PARKING TYPE ORIGINAL LIST PRICE LAST SALE PRICE \n", "0 Garage 739000 755000 \n", "1 Garage 749888 682500 \n", "2 Garage 599000 713500 \n", "3 Garage 749000 750000 \n", "4 Garage 835000 835000 \n" ] } ], "prompt_number": 14 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The matrix size of the input data is:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print(houses.shape)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "(116, 10)\n" ] } ], "prompt_number": 16 }, { "cell_type": "markdown", "metadata": {}, "source": [ "It's always advisable to visualize the data. Let's create a scatter plot with X-axis as the SQFT of the house and Y-axis as the LAST SALE PRICE of the house." 
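, "\n", "
A quick numeric check can complement the scatter plot. The sketch below computes the Pearson correlation and the least-squares slope of LAST SALE PRICE against SQFT in plain Python. It uses only the five rows printed by `houses.head()` above, so the numbers are illustrative and say little about all 116 houses; the helper name `pearson_and_slope` is made up for this example.

```python
from math import sqrt

# SQFT and LAST SALE PRICE for the five rows shown by houses.head().
sqft = [1988, 1642, 1504, 1781, 1857]
price = [755000, 682500, 713500, 750000, 835000]


def pearson_and_slope(xs, ys):
    '''Return (Pearson r, least-squares slope) of ys against xs.'''
    n = float(len(xs))
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy), sxy / sxx


r, slope = pearson_and_slope(sqft, price)
print('Pearson r: %.3f' % r)
print('Slope: %.2f dollars per square foot' % slope)
```

A value of r close to 1 would suggest that size alone explains most of the price variation; a weaker value is one motivation for adding more features, which is the point of multivariate regression.
"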
] }, { "cell_type": "code", "collapsed": false, "input": [ "houses.plot(x='SQFT',y='LAST SALE PRICE', kind='scatter')" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 12, "text": [ "" ] }, { "metadata": {}, "output_type": "display_data", "text": [ "[Figure: scatter plot of LAST SALE PRICE (y-axis) against SQFT (x-axis)]" ] } ], "prompt_number": 12 }, { "cell_type": "markdown", 
"metadata": {}, "source": [ "pandas also helps with aggregate statistics." ] }, { "cell_type": "code", "collapsed": false, "input": [ "houses.groupby('SQFT').describe()" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
BATHSBEDSLAST SALE PRICELIST PRICELOT SIZEORIGINAL LIST PRICEPARKING SPOTSYEAR BUILT
SQFT
1500count 1.0 1.000000 1.000000 1.000000 1.000000 1.000000 1 1.000000
mean 2.0 3.000000 583000.000000 583000.000000 6139.000000 557300.000000 2 1978.000000
std NaN NaN NaN NaN NaN NaNNaN NaN
min 2.0 3.000000 583000.000000 583000.000000 6139.000000 557300.000000 2 1978.000000
25% 2.0 3.000000 583000.000000 583000.000000 6139.000000 557300.000000 2 1978.000000
50% 2.0 3.000000 583000.000000 583000.000000 6139.000000 557300.000000 2 1978.000000
75% 2.0 3.000000 583000.000000 583000.000000 6139.000000 557300.000000 2 1978.000000
max 2.0 3.000000 583000.000000 583000.000000 6139.000000 557300.000000 2 1978.000000
1504count 3.0 3.000000 3.000000 3.000000 3.000000 3.000000 3 3.000000
mean 2.0 3.666667 731166.666667 687462.666667 5837.666667 649296.000000 2 1974.333333
std 0.0 0.577350 42826.199146 34081.757310 467.639106 50501.236104 0 10.115994
min 2.0 3.000000 700000.000000 648888.000000 5390.000000 599000.000000 2 1968.000000
25% 2.0 3.500000 706750.000000 674444.000000 5595.000000 623944.000000 2 1968.500000
50% 2.0 4.000000 713500.000000 700000.000000 5800.000000 648888.000000 2 1969.000000
75% 2.0 4.000000 746750.000000 706750.000000 6061.500000 674444.000000 2 1977.500000
max 2.0 4.000000 780000.000000 713500.000000 6323.000000 700000.000000 2 1986.000000
1507count 1.0 1.000000 1.000000 1.000000 1.000000 1.000000 1 1.000000
mean 2.0 4.000000 498000.000000 450000.000000 6323.000000 450000.000000 2 1969.000000
std NaN NaN NaN NaN NaN NaNNaN NaN
min 2.0 4.000000 498000.000000 450000.000000 6323.000000 450000.000000 2 1969.000000
25% 2.0 4.000000 498000.000000 450000.000000 6323.000000 450000.000000 2 1969.000000
50% 2.0 4.000000 498000.000000 450000.000000 6323.000000 450000.000000 2 1969.000000
75% 2.0 4.000000 498000.000000 450000.000000 6323.000000 450000.000000 2 1969.000000
max 2.0 4.000000 498000.000000 450000.000000 6323.000000 450000.000000 2 1969.000000
1523count 1.0 1.000000 1.000000 1.000000 1.000000 1.000000 1 1.000000
mean 2.0 3.000000 626000.000000 620000.000000 9241.000000 650000.000000 2 1974.000000
std NaN NaN NaN NaN NaN NaNNaN NaN
min 2.0 3.000000 626000.000000 620000.000000 9241.000000 650000.000000 2 1974.000000
25% 2.0 3.000000 626000.000000 620000.000000 9241.000000 650000.000000 2 1974.000000
50% 2.0 3.000000 626000.000000 620000.000000 9241.000000 650000.000000 2 1974.000000
..............................
1936std 0.0 0.000000 255265.548008 248265.190875 1675.135965 218849.548777 0 1.414214
min 3.0 4.000000 485000.000000 484900.000000 6400.000000 526500.000000 2 1973.000000
25% 3.0 4.000000 575250.000000 572675.000000 6992.250000 603875.000000 2 1973.500000
50% 3.0 4.000000 665500.000000 660450.000000 7584.500000 681250.000000 2 1974.000000
75% 3.0 4.000000 755750.000000 748225.000000 8176.750000 758625.000000 2 1974.500000
max 3.0 4.000000 846000.000000 836000.000000 8769.000000 836000.000000 2 1975.000000
1942count 1.0 1.000000 1.000000 1.000000 1.000000 1.000000 1 1.000000
mean 2.0 4.000000 595000.000000 564900.000000 6205.000000 564900.000000 2 1973.000000
std NaN NaN NaN NaN NaN NaNNaN NaN
min 2.0 4.000000 595000.000000 564900.000000 6205.000000 564900.000000 2 1973.000000
25% 2.0 4.000000 595000.000000 564900.000000 6205.000000 564900.000000 2 1973.000000
50% 2.0 4.000000 595000.000000 564900.000000 6205.000000 564900.000000 2 1973.000000
75% 2.0 4.000000 595000.000000 564900.000000 6205.000000 564900.000000 2 1973.000000
max 2.0 4.000000 595000.000000 564900.000000 6205.000000 564900.000000 2 1973.000000
1958 count 1.0 1.000000 1.000000 1.000000 1.000000 1.000000 1 1.000000
mean 2.0 4.000000 690000.000000 649000.000000 6828.000000 649000.000000 2 1985.000000
std NaN NaN NaN NaN NaN NaN NaN NaN
min 2.0 4.000000 690000.000000 649000.000000 6828.000000 649000.000000 2 1985.000000
25% 2.0 4.000000 690000.000000 649000.000000 6828.000000 649000.000000 2 1985.000000
50% 2.0 4.000000 690000.000000 649000.000000 6828.000000 649000.000000 2 1985.000000
75% 2.0 4.000000 690000.000000 649000.000000 6828.000000 649000.000000 2 1985.000000
max 2.0 4.000000 690000.000000 649000.000000 6828.000000 649000.000000 2 1985.000000
1988 count 4.0 4.000000 4.000000 4.000000 4.000000 4.000000 4 4.000000
mean 2.5 3.750000 683750.000000 678997.000000 5538.500000 683747.000000 2 1992.000000
std 0.0 0.500000 113164.702978 102146.130793 884.457838 92866.140417 0 1.154701
min 2.5 3.000000 515000.000000 529000.000000 4576.000000 548000.000000 2 1991.000000
25% 2.5 3.750000 672500.000000 657241.000000 5105.500000 661991.000000 2 1991.000000
50% 2.5 4.000000 732500.000000 719494.000000 5438.500000 719494.000000 2 1992.000000
75% 2.5 4.000000 743750.000000 741250.000000 5871.500000 741250.000000 2 1993.000000
max 2.5 4.000000 755000.000000 748000.000000 6701.000000 748000.000000 2 1993.000000
\n", "

504 rows \u00d7 8 columns

\n", "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 20, "text": [ " BATHS BEDS LAST SALE PRICE LIST PRICE LOT SIZE \\\n", "SQFT \n", "1500 count 1.0 1.000000 1.000000 1.000000 1.000000 \n", " mean 2.0 3.000000 583000.000000 583000.000000 6139.000000 \n", " std NaN NaN NaN NaN NaN \n", " min 2.0 3.000000 583000.000000 583000.000000 6139.000000 \n", " 25% 2.0 3.000000 583000.000000 583000.000000 6139.000000 \n", " 50% 2.0 3.000000 583000.000000 583000.000000 6139.000000 \n", " 75% 2.0 3.000000 583000.000000 583000.000000 6139.000000 \n", " max 2.0 3.000000 583000.000000 583000.000000 6139.000000 \n", "1504 count 3.0 3.000000 3.000000 3.000000 3.000000 \n", " mean 2.0 3.666667 731166.666667 687462.666667 5837.666667 \n", " std 0.0 0.577350 42826.199146 34081.757310 467.639106 \n", " min 2.0 3.000000 700000.000000 648888.000000 5390.000000 \n", " 25% 2.0 3.500000 706750.000000 674444.000000 5595.000000 \n", " 50% 2.0 4.000000 713500.000000 700000.000000 5800.000000 \n", " 75% 2.0 4.000000 746750.000000 706750.000000 6061.500000 \n", " max 2.0 4.000000 780000.000000 713500.000000 6323.000000 \n", "1507 count 1.0 1.000000 1.000000 1.000000 1.000000 \n", " mean 2.0 4.000000 498000.000000 450000.000000 6323.000000 \n", " std NaN NaN NaN NaN NaN \n", " min 2.0 4.000000 498000.000000 450000.000000 6323.000000 \n", " 25% 2.0 4.000000 498000.000000 450000.000000 6323.000000 \n", " 50% 2.0 4.000000 498000.000000 450000.000000 6323.000000 \n", " 75% 2.0 4.000000 498000.000000 450000.000000 6323.000000 \n", " max 2.0 4.000000 498000.000000 450000.000000 6323.000000 \n", "1523 count 1.0 1.000000 1.000000 1.000000 1.000000 \n", " mean 2.0 3.000000 626000.000000 620000.000000 9241.000000 \n", " std NaN NaN NaN NaN NaN \n", " min 2.0 3.000000 626000.000000 620000.000000 9241.000000 \n", " 25% 2.0 3.000000 626000.000000 620000.000000 9241.000000 \n", " 50% 2.0 3.000000 626000.000000 620000.000000 9241.000000 \n", "... ... ... ... ... ... 
\n", "1936 std 0.0 0.000000 255265.548008 248265.190875 1675.135965 \n", " min 3.0 4.000000 485000.000000 484900.000000 6400.000000 \n", " 25% 3.0 4.000000 575250.000000 572675.000000 6992.250000 \n", " 50% 3.0 4.000000 665500.000000 660450.000000 7584.500000 \n", " 75% 3.0 4.000000 755750.000000 748225.000000 8176.750000 \n", " max 3.0 4.000000 846000.000000 836000.000000 8769.000000 \n", "1942 count 1.0 1.000000 1.000000 1.000000 1.000000 \n", " mean 2.0 4.000000 595000.000000 564900.000000 6205.000000 \n", " std NaN NaN NaN NaN NaN \n", " min 2.0 4.000000 595000.000000 564900.000000 6205.000000 \n", " 25% 2.0 4.000000 595000.000000 564900.000000 6205.000000 \n", " 50% 2.0 4.000000 595000.000000 564900.000000 6205.000000 \n", " 75% 2.0 4.000000 595000.000000 564900.000000 6205.000000 \n", " max 2.0 4.000000 595000.000000 564900.000000 6205.000000 \n", "1958 count 1.0 1.000000 1.000000 1.000000 1.000000 \n", " mean 2.0 4.000000 690000.000000 649000.000000 6828.000000 \n", " std NaN NaN NaN NaN NaN \n", " min 2.0 4.000000 690000.000000 649000.000000 6828.000000 \n", " 25% 2.0 4.000000 690000.000000 649000.000000 6828.000000 \n", " 50% 2.0 4.000000 690000.000000 649000.000000 6828.000000 \n", " 75% 2.0 4.000000 690000.000000 649000.000000 6828.000000 \n", " max 2.0 4.000000 690000.000000 649000.000000 6828.000000 \n", "1988 count 4.0 4.000000 4.000000 4.000000 4.000000 \n", " mean 2.5 3.750000 683750.000000 678997.000000 5538.500000 \n", " std 0.0 0.500000 113164.702978 102146.130793 884.457838 \n", " min 2.5 3.000000 515000.000000 529000.000000 4576.000000 \n", " 25% 2.5 3.750000 672500.000000 657241.000000 5105.500000 \n", " 50% 2.5 4.000000 732500.000000 719494.000000 5438.500000 \n", " 75% 2.5 4.000000 743750.000000 741250.000000 5871.500000 \n", " max 2.5 4.000000 755000.000000 748000.000000 6701.000000 \n", "\n", " ORIGINAL LIST PRICE PARKING SPOTS YEAR BUILT \n", "SQFT \n", "1500 count 1.000000 1 1.000000 \n", " mean 557300.000000 2 1978.000000 \n", " std NaN 
NaN NaN \n", " min 557300.000000 2 1978.000000 \n", " 25% 557300.000000 2 1978.000000 \n", " 50% 557300.000000 2 1978.000000 \n", " 75% 557300.000000 2 1978.000000 \n", " max 557300.000000 2 1978.000000 \n", "1504 count 3.000000 3 3.000000 \n", " mean 649296.000000 2 1974.333333 \n", " std 50501.236104 0 10.115994 \n", " min 599000.000000 2 1968.000000 \n", " 25% 623944.000000 2 1968.500000 \n", " 50% 648888.000000 2 1969.000000 \n", " 75% 674444.000000 2 1977.500000 \n", " max 700000.000000 2 1986.000000 \n", "1507 count 1.000000 1 1.000000 \n", " mean 450000.000000 2 1969.000000 \n", " std NaN NaN NaN \n", " min 450000.000000 2 1969.000000 \n", " 25% 450000.000000 2 1969.000000 \n", " 50% 450000.000000 2 1969.000000 \n", " 75% 450000.000000 2 1969.000000 \n", " max 450000.000000 2 1969.000000 \n", "1523 count 1.000000 1 1.000000 \n", " mean 650000.000000 2 1974.000000 \n", " std NaN NaN NaN \n", " min 650000.000000 2 1974.000000 \n", " 25% 650000.000000 2 1974.000000 \n", " 50% 650000.000000 2 1974.000000 \n", "... ... ... ... 
\n", "1936 std 218849.548777 0 1.414214 \n", " min 526500.000000 2 1973.000000 \n", " 25% 603875.000000 2 1973.500000 \n", " 50% 681250.000000 2 1974.000000 \n", " 75% 758625.000000 2 1974.500000 \n", " max 836000.000000 2 1975.000000 \n", "1942 count 1.000000 1 1.000000 \n", " mean 564900.000000 2 1973.000000 \n", " std NaN NaN NaN \n", " min 564900.000000 2 1973.000000 \n", " 25% 564900.000000 2 1973.000000 \n", " 50% 564900.000000 2 1973.000000 \n", " 75% 564900.000000 2 1973.000000 \n", " max 564900.000000 2 1973.000000 \n", "1958 count 1.000000 1 1.000000 \n", " mean 649000.000000 2 1985.000000 \n", " std NaN NaN NaN \n", " min 649000.000000 2 1985.000000 \n", " 25% 649000.000000 2 1985.000000 \n", " 50% 649000.000000 2 1985.000000 \n", " 75% 649000.000000 2 1985.000000 \n", " max 649000.000000 2 1985.000000 \n", "1988 count 4.000000 4 4.000000 \n", " mean 683747.000000 2 1992.000000 \n", " std 92866.140417 0 1.154701 \n", " min 548000.000000 2 1991.000000 \n", " 25% 661991.000000 2 1991.000000 \n", " 50% 719494.000000 2 1992.000000 \n", " 75% 741250.000000 2 1993.000000 \n", " max 748000.000000 2 1993.000000 \n", "\n", "[504 rows x 8 columns]" ] } ], "prompt_number": 20 }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Step 2: Data preprocessing" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this section, we will convert the data to a format, that can be normalized and consumed by our learning algorithm. We have 9 columns of input data based on which, we need to predict the house price. We have around 116 rows of data (training samples).\n", "\n", "We will extract the LAST SALE PRICE column and save that to our target values. We will normalize the rest of the columns. By normalization, we are making sure the features are scaled to an uniform scale and no particular feature has more weight over the other. 
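
As a minimal sketch of what this scaling does (the numbers below are made up for illustration and are not taken from the Redfin data, and the helper name zscore is an assumption), standardizing a column means subtracting its mean and dividing by its standard deviation:

```python
from statistics import mean, pstdev

# Made-up sample: living area in square feet and bedroom count.
sqft = [1500.0, 1642.0, 1781.0, 1988.0]
beds = [3.0, 3.0, 4.0, 4.0]


def zscore(values):
    # Subtract the column mean, then divide by the column standard deviation.
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    return [(v - mu) / sigma for v in values]


sqft_norm = zscore(sqft)
beds_norm = zscore(beds)
print(sqft_norm)
print(beds_norm)
```

After scaling, both columns have mean 0 and standard deviation 1, so the large raw SQFT values no longer outweigh the small BEDS values simply because of their units.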
\n" ] }, { "cell_type": "heading", "level": 4, "metadata": {}, "source": [ "Step 2.1 Extract target values" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We need to extract the LAST SALE PRICE column into a target array 'y' and the rest of the columns into a 'houses_features' DataFrame." ] }, { "cell_type": "code", "collapsed": false, "input": [ "y = houses['LAST SALE PRICE'].copy()\n", "houses_features = houses.iloc[:, [0, 1, 2, 3, 4, 5, 6, 7, 8]].copy()\n", "\n", "print(\"features: \\n\", houses_features.head())\n", "print(\"target: \\n\", y.head())\n" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "features: \n", " LIST PRICE BEDS BATHS SQFT LOT SIZE YEAR BUILT PARKING SPOTS \\\n", "0 739000 3 2.5 1988 5595 1991 2 \n", "1 749888 3 2.5 1642 9876 1986 2 \n", "2 713500 4 2.0 1504 5800 1969 2 \n", "3 749000 3 3.0 1781 5800 1971 2 \n", "4 835000 4 2.5 1857 5061 1990 2 \n", "\n", " PARKING TYPE ORIGINAL LIST PRICE \n", "0 Garage 739000 \n", "1 Garage 749888 \n", "2 Garage 599000 \n", "3 Garage 749000 \n", "4 Garage 835000 \n", "target: \n", " 0 755000\n", "1 682500\n", "2 713500\n", "3 750000\n", "4 835000\n", "Name: LAST SALE PRICE, dtype: int64\n" ] } ], "prompt_number": 15 }, { "cell_type": "heading", "level": 4, "metadata": {}, "source": [ "Step 2.2 Normalize data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first step is to give the PARKING TYPE column a numerical value: 1 when the house has a garage and 0 when it does not.
" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import math\n", "def convert_parking_type(features):\n", "    # Encode PARKING TYPE as 1 for a garage and 0 for anything else.\n", "    features['PARKING TYPE'] \\\n", "        = features['PARKING TYPE'].map(lambda x: 1 if x.strip().lower() == 'garage' else 0)\n", "    return features\n", "\n", "houses_features = convert_parking_type(houses_features)\n", "print(houses_features['PARKING TYPE'].head())" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "0 1\n", "1 1\n", "2 1\n", "3 1\n", "4 1\n", "Name: PARKING TYPE, dtype: int64\n" ] } ], "prompt_number": 16 }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are many ways to normalize features. We will use a simple one, standardization: for each feature column $X_j$, we subtract the column mean and divide by the column standard deviation:\n", "\n", "\\begin{equation*} X_j := \\frac{X_j - mean(X_j)}{\\sigma(X_j)} \\end{equation*}\n", "\n", "where $j = 1 \\dots n$ and $n$ is the total number of features.\n", "\n", "In our case, we will normalize each column of features." ] }, { "cell_type": "code", "collapsed": false, "input": [ "def feature_normalize(matrix, mean=None, std=None):\n", "    # Reuse a given mean/std (e.g. from the training data) or compute them per column.\n", "    mean = matrix.mean() if mean is None else mean\n", "    std = matrix.std() if std is None else std\n", "    matrix = (matrix - mean)/std\n", "    # Columns with zero standard deviation produce NaN; replace those with 0.\n", "    return {'matrix_norm': matrix.fillna(0), 'mean': mean, 'std': std}" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 17 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Sanity check!" 
] }, { "cell_type": "code", "collapsed": false, "input": [ "results = feature_normalize(houses_features)\n", "houses_norm = results['matrix_norm']\n", "X_mean = results['mean']\n", "X_std = results['std']\n", "\n", "results = feature_normalize(y)\n", "y_norm = results['matrix_norm']\n", "y_mean = results['mean']\n", "y_std = results['std']\n", "\n", "print('{0}\\n'.format(houses_norm.head()))\n", "print('{0}\\n'.format(y_norm.head()))\n" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ " LIST PRICE BEDS BATHS SQFT LOT SIZE YEAR BUILT \\\n", "0 0.783326 -1.417852 0.444568 2.093853 -0.446035 1.547770 \n", "1 0.899198 -1.417852 0.444568 -0.591009 1.947155 0.851334 \n", "2 0.511950 0.517099 -0.949213 -1.661850 -0.331435 -1.516551 \n", "3 0.889748 -1.417852 1.838348 0.487591 -0.331435 -1.237976 \n", "4 1.804976 0.517099 0.444568 1.077330 -0.744555 1.408483 \n", "\n", " PARKING SPOTS PARKING TYPE ORIGINAL LIST PRICE \n", "0 0 0 0.777064 \n", "1 0 0 0.894669 \n", "2 0 0 -0.735122 \n", "3 0 0 0.885077 \n", "4 0 0 1.813991 \n", "\n", "0 0.679040\n", "1 -0.086546\n", "2 0.240808\n", "3 0.626241\n", "4 1.523824\n", "Name: LAST SALE PRICE, dtype: float64\n", "\n" ] } ], "prompt_number": 19 }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also use the preprocessing module of sklearn (StandardScaler) to scale each feature to zero mean and unit standard deviation. Let's use that and check the results. The values differ slightly from ours because StandardScaler divides by the population standard deviation, while the pandas std() method uses the sample standard deviation." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "scaler = preprocessing.StandardScaler().fit(houses_features.values)\n", "houses_norm = scaler.transform(houses_features)\n", "print(\"Mean: {0}, STD: {1}\".format(scaler.mean_, scaler.std_))\n", "print(houses_norm)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Mean: [ 6.65394267e+05 3.73275862e+00 2.34051724e+00 1.71816379e+03\n", " 6.39287931e+03 1.97988793e+03 2.00000000e+00 1.00000000e+00\n", " 6.67058466e+05], STD: [ 9.35597222e+04 5.14576468e-01 3.57186913e-01 1.28313997e+02\n", " 1.78109820e+03 7.14839209e+00 1.00000000e+00 1.00000000e+00\n", " 9.21813058e+04]\n", "[[ 0.78672458 -1.42400336 0.44649665 ..., 0. 0. 0.78043519]\n", " [ 0.90309944 -1.42400336 0.44649665 ..., 0. 0. 0.89855024]\n", " [ 0.51417139 0.5193424 -0.95333068 ..., 0. 0. -0.73831093]\n", " ..., \n", " [-1.24406384 0.5193424 -0.95333068 ..., 0. 0. -1.28072025]\n", " [-0.39019213 0.5193424 -0.95333068 ..., 0. 0. -0.41408033]\n", " [-2.83769833 -1.42400336 -0.95333068 ..., 0. 0. -2.89818487]]\n" ] } ], "prompt_number": 22 }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Step 3: Training our model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's use a simple linear model that minimizes the least-squares error function.\n", "\n", "Optional: a quick mathematical background.\n", "\n", "\\begin{equation*} y' = h_\\theta(X)\\end{equation*}\n", "\n", "where\n", "\n", "\\begin{equation*}h_\\theta(X) = \\theta_0 + \\theta_1 x_1 + \\dots + \\theta_n x_n\\end{equation*}\n", "\n", "Here y' is the predicted value and y is the actual value. Our goal is to minimize the difference between these two values. 
So, if we can choose the best values for $\\Theta$, we can accurately predict the house prices.\n", "\n", "To achieve that, we define the least-squares error function $J(\\Theta)$ and look for the values of $\\Theta = [\\theta_0, \\theta_1, \\dots, \\theta_n]$ that minimize it:\n", "\n", "\\begin{equation*} J(\\Theta) = \\sum_{i=1}^m(h_\\theta(X^{(i)})-y^{(i)})^2\\end{equation*}\n", "\n", "where $m$ is the number of training samples and $X^{(i)}$, $y^{(i)}$ are the features and price of the i-th sample.\n", "\n", "One way to minimize $J$ is gradient descent: we choose some number of iterations and in each iteration we simultaneously update\n", "\n", "\\begin{equation*} \\theta_0 := \\theta_0 - \\alpha\\frac{\\partial}{\\partial\\theta_0}J(\\Theta)\\end{equation*}\n", "\n", "\\begin{equation*} \\theta_1 := \\theta_1 - \\alpha\\frac{\\partial}{\\partial\\theta_1}J(\\Theta)\\end{equation*}\n", "\n", "\\begin{equation*} \\vdots \\end{equation*}\n", "\n", "\\begin{equation*} \\theta_n := \\theta_n - \\alpha\\frac{\\partial}{\\partial\\theta_n}J(\\Theta)\\end{equation*}\n", "\n", "where $\\alpha$ is a scalar learning rate.\n", "\n", "Phew! Thankfully, we can use the linear models available in sklearn.\n", "\n", "Let's define a train_model function in which:\n", "\n", "1. We create a regressor with fit_intercept set to true. The intercept is the value of y when all feature values are 0. We can either add a column $x_0 = 1$ to our features matrix or set the fit_intercept parameter to true. This means that even if we are not given any features of a house, the model can still come up with a mean value for the house.\n", "2. We fit (train) the model with our house features and prices. 
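
The update rule above can be sketched in a few lines of plain Python. This is only an illustration on made-up data: the function name gradient_descent, the learning rate of 0.1 and the 1000 iterations are assumptions chosen for this toy example, and the cost is averaged over the samples, which only rescales the learning rate. The LinearRegression class used in this tutorial solves the least-squares problem for us, so nothing like this has to be written by hand.

```python
# Toy data generated from y = 1 + 2*x1 + 3*x2 (made-up numbers, no noise).
X = [(-1.0, 0.5), (0.0, -1.0), (1.0, 1.0), (2.0, 0.0), (-2.0, -0.5)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in X]


def predict(theta, row):
    # h_theta(x) = theta_0 + theta_1*x_1 + ... + theta_n*x_n
    return theta[0] + sum(t * x for t, x in zip(theta[1:], row))


def gradient_descent(X, y, alpha=0.1, iterations=1000):
    m = len(X)
    theta = [0.0] * (len(X[0]) + 1)
    for _ in range(iterations):
        errors = [predict(theta, row) - target for row, target in zip(X, y)]
        # Partial derivatives of the averaged squared error (constant factors
        # are folded into alpha); x_0 is implicitly 1 for the intercept.
        grad = [sum(errors) / m]
        for j in range(len(X[0])):
            grad.append(sum(e * row[j] for e, row in zip(errors, X)) / m)
        # Simultaneous update: every gradient is computed before theta changes.
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta


theta = gradient_descent(X, y)
print([round(t, 3) for t in theta])
```

On this noise-free toy data the recovered parameters should land very close to the generating values 1, 2 and 3. With a learning rate that is too large the updates would diverge instead, which is one reason features are normalized before training.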
" ] }, { "cell_type": "code", "collapsed": false, "input": [ "def train_model(X,y):\n", " regressor = LinearRegression(fit_intercept=True)\n", " regressor.fit(X,y)\n", " return regressor" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 23 }, { "cell_type": "code", "collapsed": false, "input": [ "regr = train_model(houses_norm, y_norm)\n", "print(regr.coef_)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "[ 0.95294642 0.02315098 -0.09140542 0.00173405 0.00790588 -0.03303855\n", " 0. 0. 0.01064799]\n" ] } ], "prompt_number": 24 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's run a prediction and see if we can predict the price of a house. We are going to use the 3rd sample from our training set.\n", "The input features (house_test) for this sample :\n", "{\"LIST PRICE\" : 749888,\n", " \"BEDS\" : 3,\n", " \"BATHS\": 2.5,\n", " \"SQFT\": 1642,\n", " \"LOT SIZE\" : 9876,\n", " \"YEAR BUILT\" : 1968,\n", " \"PARKING SPOTS\" : 2, \n", " \"PARKING TYPE\" : \"Garage\",\n", " \"ORIGINAL LIST PRICE\": 74988}\n", " \n", " and the actual output is \"LAST SALE PRICE\" which is 713,550." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "house_test = pandas.DataFrame({\"LIST PRICE\" : 749888,\n", " \"BEDS\" : 3,\n", " \"BATHS\": 2.5,\n", " \"SQFT\": 1642,\n", " \"LOT SIZE\" : 9876,\n", " \"YEAR BUILT\" : 1968,\n", " \"PARKING SPOTS\" : 2, \n", " \"PARKING TYPE\" : \"Garage\",\n", " \"ORIGINAL LIST PRICE\": 74988},index=[0])\n", "house_test_converted = convert_parking_type(house_test)\n", "house_test_norm = feature_normalize(house_test_converted, X_mean, X_std)\n", "\n", "# Caution: with older pandas, a DataFrame built from a dict sorts its columns alphabetically\n", "# (see the printed input below), which differs from the training column order; reordering with\n", "# house_test_norm['matrix_norm'][houses_features.columns] before predict() would be safer.\n", "y_predicted_norm = regr.predict(house_test_norm['matrix_norm'])\n", "# Undo the target normalization to get a price in dollars.\n", "y_predicted = y_predicted_norm*y_std + y_mean\n", "print('Predicted Price: {0}'.format(y_predicted[0].round(0)))\n", "\n", "\n", "print('Normalized Input: \\n {0} \\n'.format(house_test_norm['matrix_norm']))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Predicted Price: 713785.0\n", "Normalized Input: \n", " BATHS BEDS LIST PRICE LOT SIZE ORIGINAL LIST PRICE \\\n", "0 0.444568 -1.417852 0.899198 1.947155 -6.395146 \n", "\n", " PARKING SPOTS PARKING TYPE SQFT YEAR BUILT \n", "0 0 0 -0.591009 -1.655838 \n", "\n" ] } ], "prompt_number": 25 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's compute some scores to find out how well our model performs." ] }, { "cell_type": "code", "collapsed": false, "input": [ "print(\"Mean square error: {0} \\n\".format(mean_squared_error([y_norm[2]],y_predicted_norm)))\n", "print(\"R2 Score: {0} \\n\".format(regr.score(houses_norm, y_norm)))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Mean square error: 9.067023349686855e-06 \n", "\n", "R2 Score: 0.8903593961815572 \n", "\n" ] } ], "prompt_number": 178 }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Step 5. Now what? Optimizations and explorations - the fun part!" 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the above \"evaluation\" we didn't break the data into training and test sets. That's a **big no-no** when preparing machine learning models. Let's fix that.\n", "\n", "A proper way is to split the data into a training set and a test set, where the model only ever sees the **training data** during model fitting and parameter tuning. The **test data** is used for the final evaluation of our model to find out its prediction performance." ] }, { "cell_type": "code", "collapsed": false, "input": [ "houses_norm_train, houses_norm_test, y_norm_train, y_norm_test = \\\n", "    train_test_split(houses_norm, y_norm, test_size=0.2) #<=== 20% of the samples are used for testing.\n", "print(len(houses_norm_train), len(houses_norm_test), len(houses_norm_train)+len(houses_norm_test))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "92 24 116\n" ] } ], "prompt_number": 29 }, { "cell_type": "markdown", "metadata": {}, "source": [ "So, as requested, the test set is about 20% of the entire dataset (24 houses out of 116), and the training set is the rest (92 out of 116). Let me introduce the scikit-learn Pipeline concept here. A pipeline helps us break our learning into sub-steps and assemble them together. This way, we can run the pipeline on different sets of parameters and cross-validation sets without changing the code of each individual step." ] }, { "cell_type": "code", "collapsed": false, "input": [ "linear_regressor_pipeline = Pipeline([\n", "    ('normalize', preprocessing.StandardScaler()),\n", "    ('regressor', LinearRegression())\n", "])" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 27 }, { "cell_type": "markdown", "metadata": {}, "source": [ "A common practice is to split the training data further into smaller subsets, for example 5 equally sized subsets. 
Then, train the model on 4 subsets of data and validate the accuracy on the last subset (called the \"validation set\"). If we repeat this training with a different subset as the validation set each time, we can test the stability of the model. If the model gives very different scores for different subsets, then we need to go back and check the model and/or the data." ] }, { "cell_type": "code", "collapsed": false, "input": [ "scores = cross_val_score(linear_regressor_pipeline,\n", "    houses_norm_train, #<== training samples\n", "    y_norm_train, #<== training output labels\n", "    cv=5, #<=== split data into 5 folds, 4 for training and 1 for scoring\n", "    scoring='r2', #<== use R^2 score\n", "    n_jobs=-1)\n", "print(scores, scores.mean(), scores.std())" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "[ 0.89656312 0.91393417 0.887133 0.90887704 0.63537356] 0.848376178916 0.106913414232\n" ] } ], "prompt_number": 30 }, { "cell_type": "markdown", "metadata": {}, "source": [ "A score of 1 implies we have predicted perfectly. As you can see, we are a little off: the mean score is about 0.85. Four of the five folds score close to 0.9, but one drops to 0.64, so the scores are not fully consistent across folds. We can definitely do better.\n", "\n", "How can we fix this?\n", "\n", "There are a few design principles that we need to consider:\n", "1. **Get more Data**\n", "    * Sometimes more training data doesn't help. More data may not help if the model is underfitting, that is, if it is too simple to capture the pattern in the training data.\n", "    * But, most of the time, this is a good place to start. \n", "2. **Try smaller sets of features**\n", "    * We can do this by hand or use dimensionality reduction techniques such as PCA.\n", "3. **Polynomial Features**\n", "    * If we think that some of the features have a heavier influence than other features, we may want to use higher degrees of those features.\n", "3. **Combining Features**\n", "    * If we have better knowledge of the domain, we can combine features and build better ones.\n", "4. 
**New models**\n", "    * We can try new regression models and see if they improve our prediction score.\n", "\n", "All of the above advice is good. But where should we start?\n", "\n", "Let's start with Step 1, Step 3 and Step 4 and see if we can make our predictions better. \n", "\n", "We will get more data and use cubic interpolation to fix missing data. You can read more about interpolation [here](http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html)." ] }, { "cell_type": "code", "collapsed": false, "input": [ "labels=[\"LIST PRICE\",\"BEDS\",\"BATHS\",\"SQFT\",\"LOT SIZE\",\"YEAR BUILT\",\n", " \"PARKING SPOTS\",\"PARKING TYPE\",\"ORIGINAL LIST PRICE\",\"LAST SALE PRICE\"]\n", "data = pandas.read_csv('./data/redfin_more_data1.csv', quoting=csv.QUOTE_NONE, names=labels)\n", "print(data.head())\n", "data.plot(x='SQFT', y='LAST SALE PRICE', kind='scatter')\n", "data = convert_parking_type(data)\n", "data.interpolate(method='cubic', inplace=True)\n", "# fillna returns a new DataFrame, so the result has to be assigned back.\n", "data = data.fillna(0)\n", "\n", "\n", "y = data['LAST SALE PRICE'].copy()\n", "X = data.iloc[:, [0, 1, 2, 3, 4, 5, 6, 7]].copy()\n", "\n", "print(\"features {0}: \\n{1}\".format(X.shape, X.head()))\n", "print(\"target:{0}: \\n{1}\".format(y.shape, y.head()))\n", "\n", "X_train, X_test, y_train, y_test = \\\n", "    train_test_split(X, y, test_size=0.2, random_state=42) #<=== 20% of the samples are used for testing.\n", "print(len(X_train), len(X_test), len(X_train)+len(X_test))\n", "print(len(y_train) + len(y_test))\n", "print(X_train.shape)\n", "print(y_train.shape)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ " LIST PRICE BEDS BATHS SQFT LOT SIZE YEAR BUILT PARKING SPOTS \\\n", "0 538000 3 2.5 1252 1932 1993 2 \n", "1 538888 3 2.0 1253 2100 1973 2 \n", "2 539500 3 2.5 1298 2437 1984 2 \n", "3 545000 3 2.5 1298 2482 1988 2 \n", "4 548888 4 2.0 1298 2700 1973 2 \n", "\n", " PARKING TYPE ORIGINAL LIST PRICE LAST SALE PRICE \n", "0 Garage 538000 393500 \n", "1 
Garage 538888 400000 \n", "2 Garage 539500 401000 \n", "3 Garage 545000 404500 \n", "4 Garage 548888 415000 \n", "features (318, 8): \n", " LIST PRICE BEDS BATHS SQFT LOT SIZE YEAR BUILT PARKING SPOTS \\\n", "0 538000 3 2.5 1252 1932 1993 2 \n", "1 538888 3 2.0 1253 2100 1973 2 \n", "2 539500 3 2.5 1298 2437 1984 2 \n", "3 545000 3 2.5 1298 2482 1988 2 \n", "4 548888 4 2.0 1298 2700 1973 2 \n", "\n", " PARKING TYPE \n", "0 1 \n", "1 1 \n", "2 1 \n", "3 1 \n", "4 1 " ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "target:(318,): \n", "0 393500\n", "1 400000\n", "2 401000\n", "3 404500\n", "4 415000\n", "Name: LAST SALE PRICE, dtype: int64\n", "254 64 318\n", "318\n", "(254, 8)\n", "(254,)\n" ] }, { "metadata": {}, "output_type": "display_data", "png": "iVBORw0KGgoAAAANSUhEUgAAAaEAAAEPCAYAAADrvntcAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3Xt8VdWd9/HPL4RAuIkJyKVYoIhjmdoSdcCptdKWQxjb\n4q1TcGbahPLUaZmWtsQqCFU7JiJO8dZX1dpaEmxrtVU6dEZzCAo+4zxVVMBLFUEqVlBRSQFFBEJ+\nzx9rnWQTkwDJOWeflfzer9d+Ze919t7ne46Slb3W2muLqmKMMcbEIS/uAMYYY7ovq4SMMcbExioh\nY4wxsbFKyBhjTGysEjLGGBMbq4SMMcbEJqOVkIh8R0SeFZHnROQ7vqxIROpEZJOIrBSRgZH954vI\nZhHZKCJTIuWn+/NsFpGbI+W9ROQeX/6YiIyMvFbm32OTiHw1k5/TGGNMx2SsEhKRjwH/B/g74BPA\nF0RkDDAPqFPVk4GH/DYiMg6YDowDpgK3ioj4090GzFLVscBYEZnqy2cBO335jcBif64i4Epggl+u\nilZ2xhhjckMmr4ROAR5X1fdV9RDwCHARMA2o8fvUAOf79fOAu1X1oKpuBV4CJorIMKC/qq71+y2L\nHBM9133A5/x6KbBSVXep6i6gDlexGWOMySGZrISeA872zW99gHOBEcAQVd3h99kBDPHrw4FtkeO3\nAR9qpXy7L8f/fBVAVRuA3SJS3M65jDHG5JD8TJ1YVTeKyGJgJbAX2AAcarGPiojNG2SMMd1Uxioh\nAFX9BfALABGpwl2R7BCRoar6hm9qe9Pvvh04MXL4CL//dr/esjx1zIeB10QkHzhOVXeKyHZgUuSY\nE4GHW+azCtAYYzpGVeXIex1ZpkfHneB/fhi4EPg1sAIo87uUAb/36yuAGSJSICKjgbHAWlV9A9gj\nIhP9QIWvAP8ZOSZ1ri/hBjqAu/qaIiIDReR4IAEkW8uoqsEuV111VewZLH/8Obpj/pCzd4X86ZTR\nKyHgd76P5iAwW1V3i8h1wL0iMgvYCnwZQFWfF5F7geeBBr9/6tPOBqqBQuABVa315XcCd4nIZmAn\nMMOfq15ErgGe8Pv9UN0AhS5l69atcUfoFMsfr5Dzh5wdws+fTplujvt0K2X1wOQ29r8WuLaV8qeA\nU1sp34+vxFp5bSmw9BgjG2OMySKbMSFg5eXlcUfoFMsfr5D
zh5wdws+fTpLu9r2QiIh2589vjDEd\nISJoCAMTTGatWbMm7gidYvnjFXL+kLND+PnTySohY4wxsbHmuG78+Y0xpiOsOc4YY0yXYJVQwEJv\nV7b88Qo5f8jZIfz86WSVkDHGmNhYn1A3/vzGGNMR1idkjDGmS7BKKGChtytb/niFnD/k7BB+/nSy\nSsgYY0xsrE+oG39+Y4zpCOsTMsYY0yVYJRSw0NuVLX+8Qs4fcnYIP386WSVkjDEmNtYn1I0/vzHG\ndIT1CRljjOkSrBIKWOjtypY/XiHnDzk7hJ8/nawSMsYYExvrE+rGn98YYzrC+oSMMaYbSCaTFBef\ngEgxIsUkEom4I6WdVUIBC71d2fLHK+T8IWeHD+Y/6aSTmiqak046CYDy8nKmTv089fX7gRuAG1i1\nam2Xq4jy4w5gjDHd2fDhw3n99b3ALQBs2TKH4uJi6usbgeNwFVBZ0/6rVlXEETNjrE+oG39+Y0y8\nEokEq1Y9BdxIc0VTA8wFTgY2cXglVANUoPp2tqMeJp19QnYlZIwxMSgvL2fVqieAHm3scRbwFDAn\nUjaHyZMnZDxbNlmfUMC6Wrt4aCx/fHIle1VVFcXFJ1FcfBJVVVVHvd/1119PTc0fgP7AblxFU+OX\nOcBfgZ/RXAF9D5jL5MkTqKury+RHyjq7EjLGmA6oqqpi4cLrSfXlLFzoKowFCxYcYb/Z5OXl4379\n9gN6Ae/gmuAA3qGy8hoAbrhhKTCYuXNnfuC8XYaqZmwB5gN/Ap4Ffo37touAOlxj50pgYIv9NwMb\ngSmR8tP9OTYDN0fKewH3+PLHgJGR18r8e2wCvtpGPjXGmGNVWVmpUKxQraB+qdaiojEf2LeoaEyL\n/c702z0V+igMUBihMEjheC0rK8v+BzpG/ndnWuqJjDXHicgo4OvAaap6Kq7hcwYwD6hT1ZOBh/w2\nIjIOmA6MA6YCt4pIquPrNmCWqo4FxorIVF8+C9jpy28EFvtzFQFXAhP8cpWIDMzUZzXGdB/NVzaF\nnTxTb+CQX38POAC8T3V1dSfPG5ZM9gntAQ4CfUQkH+gDvAZMwzV84n+e79fPA+5W1YOquhV4CZgo\nIsOA/qq61u+3LHJM9Fz3AZ/z66XASlXdpaq7cFdeqYqry8iVdvGOsvzxCjl/nNmrqm7FNa19g5Z9\nOXPnzvzA/q4sut8z5OXNAU7E/W0+DjcSroHJk8/KxkfIKRnrE1LVehFZAvwF2AckVbVORIao6g6/\n2w5giF8fjmtSS9kGfAhXkW2LlG/35fifr/r3axCR3SJS7M+1rZVzGWNMp+zb975fS/XRLATeo7Ly\nslb7bVJlN9zg+nnmzr2Cnj17smrV4zz11FvU128ChMmTP9nlBh0cjYzdJyQiY4A/AGfjhn/8Fne1\n8mNVPT6yX72qFonIj4HHVPVXvvznwIPAVuA6VU348rOBy1T1iyLyLFCqqq/5114CJgLlQG9VrfLl\nC4F9qrqkRUbN1Oc3xnQd7n6e9bgms4NAAamBBjAHkQYaG/fGli/bQrlP6Azg/6nqTgARuR/4e+AN\nERmqqm/4prY3/f7bcdenKSNwVzDb/XrL8tQxHwZe801+x6nqThHZDkyKHHMi8HBrIcvLyxk1ahQA\nAwcOZPz48Uya5A5NXfLbtm3bdvfdnjt3LuvXbwG+iRsH1QC8BVyKs5+PfGRMzuTNxPaaNWua+qpS\nvy/TJl0jHFouwCeA53C9d4JrDP034Hrgcr/PPNxVDriG0Q24PzFGA1tovlJ7HHeFI8ADwFRfPhu4\nza/PAH7j14uAPwMDgeNT661k7MC4kNyxevXquCN0iuWPV8j5M529trZWE4kLtaTkLIXjI6PbxihU\nKPT1I9pGKPTS2traYzp/yN+9anpHx2WyT+hpEVkGPAk0AuuAO3B3Z90rIrNwTW1f9vs/LyL3As/j\n/tSY7T9sqrKpxlVoD6h
qrS+/E7hLRDYDO31FlOqPugZ4wu/3Q3UDFIwxpk3l5eUsW/ZbVPNxzW0/\nwv3tmzIU9/f0N4D/BTYxZswYSktLsx+2i7C547rx5zfGNCsvL6em5re4xpibcLcanoRregNXKT2L\nu2Pk4wAUFGxkxYrfdLtKyJ4nZIwxbUgmk0yZchFTplxEMplsmjKnf//hFBcPo2fPIQwYMPID0+z8\n8pcP4CqXfpHSof7nPtyMBkuB9ykp6UUiMbxbVkBpl652vRAXrE8oVpY/XiHnbyt7bW2tFhYO8X04\n1Zqf39fPSFARmZ2g2i8DtLKysulYNwPCmQqn+NkLqiPHnaIwQkWOP+yYdOcPBSHMmGCMMdm2ZMkd\n7Nu3GNeUVkZDw/G4ZrSXceOVbml6DW7xc7M5w4b1BZ7B3VN/ALgd+F/y8qCkZAiJxAQefPDurjuH\nW0xsAtOApYZShsryxyvk/G1lf/vtnR0+59Kld3DuuefR2JiPG0u1kTFjRvGTn9yf9ia3kL/7dLNK\nyBjThTTQfP8OQD1uypyvA7W0fDbP3LmXNW2VlpbywAP/yZIldwBQUXGJ9fdkgY2OC/jzr1mzJui/\nqCx/vELO31b24uJR1Nd/Cdf8BjCa3r1/Tp8+gzhw4D0KCpQ9exopLOzN5ZdfElvTWsjfPYQzY4Ix\nxmRVff1O3H08P/Ill/L++4fYt++lGFOZ9tiVUDf+/MZ0NW7+4plEr4RgKX72MJMmdp+QMcZ4qfuA\nCgpSj8r+Ge4pL9P8+u4445kjsEooYKkJBkNl+eMVcv5U9tQD5urrh3LwYB7QE9iLu7F0rl/vGVfM\nNoX83aebVULGmJyWTCY57bRPUVx8EqedNolkMtn02uLFd+Du/dlM86MVUs/Q7APk06NHthObY2F9\nQt348xuTqw5/fs8hXIXiBhsUFHyfFSvuAmDq1IuBG4HLcBP0bwOuAo7zZ9pNZeUP7QbTNEtnn5BV\nQt348xuTi1wFtBZ3ZXMNMBg3a3WZ36OGRGIFAHV1a4E9wKm4yUVvwT1L82EKCgq48spvWwWUATYw\nwQDhtytb/njFnb/lRKMpq1ato3l6nXdaPba+PjWz9QggAbyBm+WgAlhDWdk09u9/I2croLi/+1xi\n9wkZYzKiqqqqaW62uXNnHlYhJJNJLrigzM/zBo8+Wsby5TWtnKUH7hFj0VkQvsOXv3wFn/jEJ3jk\nkRkcOPASLZvqbKaDgKRrJtQQFwKfRduYXFRbW6tFRYPbnbE6kbgw8rRSVajWROJCXz4gcuxFh81i\nDcdpWVnZYe9VUnKWFhWN0ZKSc475CaemY0jjLNrWJ9SNP78x6ZRMJpk//xrWr38SN5DgRqL9OP37\nX8mePa8AMGXKRdTVTaP1fp7RuOa4/v61v9K//4n07NnzA1dUJh7WJ2SA8NuVLX+80pk/mUzy+c9f\nzPr1b+AGEnxwXPS+fe83rVdUXEJh4eW4KXZqKCy8nIqKS3z5L3ETjZ5MXp5QWXkNe/a8ws6dLzVV\nQPbddx3WJ2SM6bSZM+dw6FBqqHQB0JuWM1aPHDmiaau0tJTly2siM1bXNPXjNJcPp6Liauvf6eKs\nOa4bf35j0sXN2XYDMJ/Dp8kp8q/v5MEHl1uF0kVYc5wxJidUVVX5OdsO4EawfdK/UoRraNlLUVEP\nq4BMm6wSCljo7cqWP16dyV9eXk5eXl8WLryegwd7A/uB94DtuApoL5BHZWUFO3duTXsF1J2/+67G\nKiFjzDEpLy+npmY5qv1wo9jygF646XVeAt7HVUiHbCSbOSLrE+rGn9+YjujZcwgNDdfjZidYAqwG\nfomriD7u93qGyZM/SV1dXUwpTSZZn5AxJmtS0+ucdtqnGD58NA0NB/0rihsB9xngTFyT3CZgk1VA\n5qhZJRSw0NuVLX+8Wsvfcj63qqoqzj33YurqRrN+/Xpef70eN0dbqvLZDywEXgHyqayci
+rOjFdA\nXfG7767sPiFjurFkMsmSJXfwxz8+xLvvHsL9Xeqey7N69XQOHeqB6o3ACtyAg0rczNZDgf/B3ZS6\nH2iksvIH1gdkjpn1CXXjz2+6t2QyyRe+8M80NPQFdgEfxg2zTk2l8/f+5zdwldBaXCX0M+BFUpOG\nwqWUlPwN69Y9mrXsJl7WJ2SM6bR/+7d5NDQsAd7FXf3sa2Wvs4DLgdFAPa4Z7kzcfUG3A7dTUNDA\nokU/yE5o0+W0WQmJyGcj66NbvHbh0ZxcRP5GRNZHlt0iMkdEikSkTkQ2ichKERkYOWa+iGwWkY0i\nMiVSfrqIPOtfuzlS3ktE7vHlj4nIyMhrZf49NonIV48mc0hCb1e2/PHasmVLi5KhuCuhGr88j8id\nwL8A/4tID4qK+pCffxeFhb0YM+Z9EonhrFjxm6zfiBr6dx96/nRqr09oCVDi1++PrAP8wJe1S1Vf\nTB0nInm4O9mWA/OAOlW9XkQu99vzRGQcMB0YB3wIWCUiY32b2W3ALFVdKyIPiMhUVa0FZgE7VXWs\niEwHFgMzRKQIuBI43cd5SkRWqOquI+U2pqurqqoCDuIqnX64K5yv455OejsABQV5XHnlpTzyyDps\nHjeTKW32CYnIelUtabne2vZRvZG7qvmBqp4tIhuBc1R1h4gMBdao6ikiMh9oVNXF/pha4Grc0JuH\nVfWjvnwGMElVv+H3uUpVHxeRfOB1VR0sIhcDn1bVb/pjbvfv85tIJusTMt1O8+OzD+AaQz4OPIO7\nz+cg/fsP4Mwzz6Si4hKrdEyr0tknlM3RcTOAu/36EFXd4dd3AEP8+nDgscgx23BXRAf9esp2X47/\n+SqAqjb4Jr9if65trZzLmG6rvLycVauewvUBzcTNeL0NN/LtLeAAe/a0/khtYzKhvUroIyLyB78+\nOrIOrpfyqIlIAfBFXA/nYVRVRSS2y5Hy8nJGjRoFwMCBAxk/fjyTJk0Cmtttc3X7pptuCiqv5Y93\ne9asWdTU/A44HkdonvMNv978HKC487a3He1TyYU8XT3/mjVrqK6uBmj6fZk2bT1yFZgEnON/tlzO\nOZbHtwLnAbWR7Y3AUL8+DNjo1+cB8yL71QITcT2mL0TKLwZui+xzpl/PB97y6zOA2yPH/BSY3iJX\n+8+wzXGrV6+OO0KnWP7sqa2tVRioUOQfmT1AoadfBvmlp/br1y/uqEclpO++NaHnJxuP9xaRE4DB\nqvqnFuV/63/Rv3m0FZ2I/AZ4UFVr/Pb1uMEEi0VkHjBQVVMDE34NTMAPTABOUlUVkcdxvadrgf8G\nblHVWhGZDZyqqt/0fUXnq2pqYMKTwGm4P/meAk7TyMAE6xMy3cVJJ5WwZcvbwB5fkgAexzXFHQ8I\n/fod4J13rCnOHFm27hP6MTColfJi4KajfQMR6QtM5vDRdNcBCRHZBHzWb6OqzwP3As8DDwKzI7XE\nbODnwGbgJXUj4wDuBIpFZDPwXdzVFKpaj7u1+wlcxfVDtZFxphsqLy9ny5a/4EbBNeCa3R73r/ai\nsrIC1Z1WAZl4tHWJBDzVzmt/StelWJwL1hwXK8vfcbW1tTps2IcVirVHj8FaVlamtbW1WlJylhYV\njdGSknO0trZWy8rKfNPbIP9zgMIIv91Py8rKYvsMnWH/78SLNDbHtTcwoX87r/VMWy1ojDkmyWSS\nc889j8bGXsAtHDoENTWzueuu+2hs7A38iPp6mDbtKxw6BG4k3DbgWtzUPO8CByktPaups9mY2LRV\nOwEPAJ9vpfxcXP9O7FcynV0I/ErIdB1lZWWan3+C5uef0HR1UllZqUVFY7SoaIxWVlY27ZtIXOiv\nZKoV1C9n+iVaVq1QHCmr9FdBRYedz5hjRZauhL4L/JeI/COuU19wsw98EvhC5qpFY7qX1JNKU7NX\n19TM4ZlnnmH9+i1NZQsXzgFgwYIFvP32zqM+97Bhf
Xn99Tl+awSwh7KyC2y2a5M72quhgN7A14Ab\ncNP4fA0oTFcNGPdC4FdCobcrd/X8tbW1mkhcqInEhVpbW6uqqiUlJX6YdJGWlJSoqmp+/gmtXMEU\nfaCsqGiMP8dZCr18/061X/poXl6/yBVStRYUDG7qF2p5lXU0+XNZyNlVw89Plq6EUNX3gV9ktBY0\npotJJpPMn38NGzb8CVU3kPSRR77CiScOYMuWt0hd3axfP4fTTjvtmM8/aNAQ4FvAb3GP2IZhwwax\ndOkdzJ9/Da+8cg0jR45g0aK7KC0tpbS0FOv6MTmrrdoJ13v5ThvLnnTVgnEuBH4lZOKRusIZNuzD\n2qPH4MOuMGpra7WgYLDCx47q6gaKIiPYUlc1A/wV0+FlqX6c2tpaLSwc0vRaYeGQpistY7KBbFwJ\nqWq/TFeAxoQmmUwybdpXOHDgZNyD4Jr7cQCeeWYrBw78B+4WtaOTGqH2q19dBsA///MFVFdXU1VV\nxQ03uPPMnXtZUz9OaWkpy5fXsGTJHQBUVNTYRKMmXMdaa+HueLs8XbVgnAuBXwmF3q4cUv7Kykrt\n129Y5OrkBIXLD7uqyc8/QYuKxvjXzzqsfwYGae/efVq94olLSN9/SyFnVw0/P2m8EmrvoXbDReTH\n/tk914tIPxH5Hm7eN5uN2nRJiUQCkUGIDCKRSADu2TsLF17Lu+/uJjrBZ2tGjkw9GO6DTx/9/e/v\np6RkDDAXmEtJyRjWrVuXyY9jTM5rb+64VcCjuEcrTAXO9+vfVdU3spYwg2zuOBPV/JydW3zJHCZP\nnsC6dS9TXz8Yd8NnP+Bt4G9wD4Br3res7AIuvvhipk2bwYEDpwC7ENnB+PEfZ9Gi+dZkZrqMdM4d\n114ltEFVx0e2twEjVfVQOt44F1glZKLcY6huAMp8SQ0wl6Ki4yOV0HTcdIWnAC8Bh+jRI59/+Zdz\nm/p2kslkpL/GHgxnup5sTWCaJyJFfikG6oHjUmXpeHPTOdFnkoQolPxz587EPXm0HvgZ7onyAPso\nK5tGQ8Obh01/U1paysqV97Fy5X05XQGF8v23JuTsEH7+dGrvPqEBuJkSolLbCnwkI4mMic1u3NNC\nUuYAe5tGpV133U/Yu3c3IjX07duHf/zH6dx5551xBDWmy2izOa47sOY4EyVSiHuS/HG+ZDfQE9V9\n8YUyJgdlpTlORIaIyM0i8t8iskhEBqTjDY3JbflAH7+0O6GIMSYN2usTWoabNeHHuMc63NLOviYG\nobcr51L+8vJy3FVQS62VObmUvyNCzh9ydgg/fzq196feUFVNTbVbKyLrsxHImDjU1PwB6AvsBd7z\npQ0UFdkYHGMyqb0h2s8Ak1KbwOrINuoenx006xMykLo/6CncvT97gB3+lSEkEqewcuV98YUzJgdl\n6z6hrbhRcK1RVQ1+dJxVQgZS9wftxc2G0Af4kX9lDrW19+b0MGtj4pCVgQmqOkpVR7exBF8BdQWh\ntyvnQn73KAUFzsD1/xzATavjZktorwLKhfydEXL+kLND+PnTqb2BCcZ0eevXv4LrA3oWmAAU4Cql\nd6mrq4szmjHdgt0n1I0/v0k1xZ0AbMENAgV4h5KSCaxb92h8wYzJYdmatseYILU2E3Zrkskk7plA\nfwEKgZP90pNFi36QlazGdHft3az62cj66BavXZjJUObohN6unIn8zTNhLwGWsGrV2jYrIjfJ6HDg\nfaAR2AT8iWHDBh3VYAT7/uMTcnYIP386tXcltCSyfn+L1+zPRJOTVq1ah7uvuswvt/iytkzHXQUN\nxI2MO8jSpXdkPKcxxrHmuIBNmjQp7gidku78rnnt6FVUXEJh4S+BbwAjyMvbR2XlwqMekm3ff3xC\nzg7h508nmxzLdAnJZJILLijD3Wzacibs1icgLS0tZfnymqZmuYqKq+2eIGOyrL2bVXcDj+BmSzgb\n+J/Iy2er6sDMx
8us0EfHrVmzJui/qNKZf8qUi6irmwbMBvYTnQlbpBeNjXvT8j5R9v3HJ+TsEH7+\nbI2OOw/3mMkluEd7L4ks5x3tG4jIQBH5nYi8ICLPi8hE/2C8OhHZJCIrRWRgZP/5IrJZRDaKyJRI\n+eki8qx/7eZIeS8RuceXPyYiIyOvlfn32CQiXz3azCY8b7+9068dB/SieSbsXvTte1ybxxljYqaq\nx7wA9xzDvjXA1/x6Pu63xPXAZb7scuA6vz4O2AD0BEbhnp+culpbC0zw6w8AU/36bOBWvz4d+I1f\nL8Ld/DHQL1uAgS2yqcl9tbW1mkhcqInEhVpbW9vqPiUlZykMUrhIoY/CmX7po5WVlVlObEzX5n93\ndqj+aLl0dGDCJ49mJxE5Dtd09wv/G79BVXcD03CVE/7n+X79POBuVT2oqltxldBEERkG9FfVtX6/\nZZFjoue6D/icXy8FVqrqLlXdBdQBU4/5k5pYpfp66uqmUVc3jQsuKGt1AMIrr2zDjYZT3N8vW4FN\nVFZe0fRkVGNM7sn06LjRwFsislRE1onIz0SkLzBEVVNTFe8Ahvj14cC2yPHbgA+1Ur7dl+N/vgqu\nkgN2i7sNvq1zdRmh3mtQXl5OXl5fRAbQq9dQqqqq2tx3yZI72LdvMakh1/v2LfYDCQ5XX1+P+1tk\nGnAZ0AAczGgFFOr3nxJy/pCzQ/j506nN0XEicjqtz6ItuOayoz3/acC3VPUJEbkJmBfdQVVVRGIb\nHVBeXs6oUaMAGDhwIOPHj2/qMEz9j5Kr2xs2bMipPEezfd1115FMrsH123yeAwc+zMKF1wNw1lln\nfWD/+vq3aLYGeKF567DzFwCfBe4EBuMqrZ8d1gFs33/Xym/b2dtes2YN1dXVAE2/L9OmrXY63L/4\n1W0tR9PWBwwFXo5sfwr4b9xvkqG+bBiw0a/PA+ZF9q8FJvrzvBApvxi4LbLPmdrc5/SWX58B3B45\n5qfA9Bb5Ot02atpXVlam+fknaH7+CU3rMEKhWkH9Uq1FRWNaPb62tlYLC4f4/au1sHBIq/1CkydP\nVhjQtB8M0MmTJ2f64xnTLZHGPqGODkwoOIZ9/y9wsl+/Gjco4Xrgcm2ueFoOTCjANeVtoXlgwuO+\nQhI+ODAhVSHN4PCBCX/GDUo4PrXeIlt6/8uYw5SVlX2gYhApPqZKSPXoBiaopiqiYoViq4CMyaBY\nKiH/y38yrr1jxzEc9wngCeBp3PQ/x/kKYhVusq6V0coBuAI3IGEjUBopPx033/5LwC2R8l7AvcBm\n4DFgVOS1mb58M1DWSrb0/pfJstWrV8cdoV15eYM+UNlAL78MULi8qXIKcQRbrn//RxJy/pCzq4af\nP52V0BFnTBCRv8c1f53vK49vAd8/0nEpqvo08HetvDS5jf2vBa5tpfwp4NRWyvcDX27jXEuBpUeb\n1aRHMplkyZI7aGxsbOXVvpSVfZFly36L6q0UFPThyisvsxFsxnRT7c2YsAi4CNeMdS/we+ApVR3d\n6gEBCn3GhFyTTCaZP/8aNmx4DtWbgX/FXaje4veYQ17eAQ4dan0aHWNMGNI5Y0J7ldBbwFPAbcCD\nqnpARF62Ssi0JplMcu65M2hsLMZNsl4GFAPvEJ1Cp3fv/uzbt7Ot0xhjApCtaXuGATcDFwJbROQu\noFBEjnZ4tsmw1BDKXPBP//SvNDbehKt0Uo7H3bdznF+m0afP8U2v5lL+jrD88Qk5O4SfP53a7BNS\nd+Png8CDItIb+AJuMq5tIvKQqv5TljKaANTXpyqfHsClfn08bqKK5ua4uXMvy3Y0Y0wOa7M5rs0D\nRAYA56vqssxEyh5rjksfkf5Ab+Ac3N8uH/evPEVh4WAKCwuZO3emDUAwpgtIZ3NcezMmTABeVdXX\n/XYZbqDCVtz9PsY0KSoqpL5+L25GpSLgRUCorLzKKh5jTJva6xP6Ke7BLIjIp4H
rcJNz7QHs+cc5\nIJfalUeOHIebOHQz7iFywygpObXdCiiX8neE5Y9PyNkh/Pzp1N59QnmqWu/XpwM/VdX7gPtE5OnM\nRzMh2bP5XrfRAAAXKklEQVSnHngT97gpgEvZs6dHjImMMSFob4j2c0CJqh4UkReBS1T1Ef/an1T1\nb7OYMyOsTyh9iotPor7+fOBlXzKaoqLfs3PnS3HGMsZkQFb6hIC7gUdE5G3gPfzjvUVkLLArHW9u\nuo5evRTXWvsjX3IpvXoNiDGRMSYEbfYJqWoVUIGb9uZTqpqag0WAb2chmzmCXGpXfuuteuAAcLtf\nDviytuVS/o6w/PEJOTuEnz+d2p07TlX/2ErZpszFMSEqLy+noUGAr9PcHHcWDQ02bZ8xpn3HfJ9Q\nV2J9Qp1XXl5OTc1vgUbcEziab0yFg6i+F1s2Y0xmZGXuuO7AKqHOy88/gUOHxuCesPEO7qmmAG8x\nbNgQXnvtlfjCGWMyIitzx4nIynS8gcmcuNuVk8kkhw4d8ltl/uf7fmlk6dL2byeLO39nWf74hJwd\nws+fTu3drDq4ndeMYf78RcBHgWdwI+O+BZwEHKKy8ipKS0vjjGeMCUB79wn9GTcTZWuXXKqq92cy\nWDZYc1znDBgwknfe+XdgNe5xUz2AQ9TW3mMVkDFdWLaeJ7QTWNHWgao6Mx0B4mSVUOf06XMC+/Yp\n0XuDCguF9957M85YxpgMy9bzhP6iqjPbWtLx5qZz4mxXTiaT7N9/gJb3BvXoccQnxjcJvV3c8scn\n5OwQfv50aq8SMqZVyWSSCy4oo7Fx4AdeGzJkUAyJjDGhaq857mOq+lxkexDwaeAVVX0qS/kyyprj\nOmbKlIuoq5sG/Ax4FhjnX3mekpJTWbfu0fjCGWMyLlvNcYtF5GP+DYcBzwEzgbtE5HvpeHMTuiG4\nWRKG++XrDBo0JN5IxpigtFcJjYpcCc0EVqrqF4GJwNcynswcUVztyhUVl5CfXwGMxg3NngZMo7Dw\nl1RUXHLU5wm9Xdzyxyfk7BB+/nRqrxI6GFmfjHtmM6r6Dm6OFtNNPfnkkzQ07AP+FxgEfIcxY25i\n+fIaG5ptjDkm7fUJ/ReQxD2v+U7gI6r6VxHpAzxhzxPqvtyzg35A8ywJNRQVXWPPDjKmm8hWn9As\n4GO43zTTVfWvvnwi7vEOxhhjTKe09zyhHar6r6p6nqpG55F7DPhL5qOZI8l0u3IikUBkECKDSCQS\nTeVz587EzZJd45c5vuzYhN4ubvnjE3J2CD9/Oh3VfUIi0kNEPi8ivwS2AtMzmsrELpFIsGrVWmAJ\nsIRVq9Y2VURnnHEG+fmHgIXAQvLzD3HGGWfEmNYYE6r2+oQEOAe4GDgXeBw4Gxitx/CQGBHZCuwB\nDgEHVXWCiBQB9wAjcZXal1V1l99/Pm703SFgTuoqTEROB6qB3sADqvodX94LWAacBuzENR2+4l8r\nAxb4KJWquqxFNusTaoNIMXAD0X4fqED17ch9Qs2vJRIrWLnyvjiiGmOyLFt9Qq8CV+BmpzxFVb8E\nvHcsFZCnwCRVLVHVCb5sHlCnqicDD/ltRGQc7iprHDAVuNVXhgC3AbNUdSwwVkSm+vJZwE5ffiOw\n2J+rCLgSmOCXq0Tkg7f4dwPl5eX07DmEnj2HUF5efsT9k8lk5kMZYwztV0K/w83LPx34ooj07cT7\ntKwxp+H+tMb/PN+vnwfcraoHVXUr7klpE/3Nsv1Vda3fb1nkmOi57gM+59dLcfc27fJXWXW4iq3L\nOJp2Zffk0+U0NFxPQ8P11NQsP2JFtGTJHcBQWvb7TJ5cArj7hAoLL296rbDw8mO6P+hY8ucyyx+f\nkLND+PnTqb2BCd/FVUI/xv1ifxEYLCLTRaTfMbyHAqtE5EkR+bovG6KqO/z6Dtyt9+Buu98WOXYb\n8KFWyrf7cvzPV33mBmC3uLakts7VrfzqVw/
iHrld5pdbfNmRXIa7gKwA5lJU1Iu6ujoASktLWb7c\nNcElEivs/iBjTIe1O+WxqjYCDwMPi0gB7uriYuAnuLsUj8ZZqvq6iAwG6kRkY4v3UBGJrWOmvLyc\nUaNGATBw4EDGjx/PpEmTgOa/VnJ1O1XW3v6Njfsjn3YN8AKpfrC2zl9RcQmPPlrGvn3lwOkUFlbz\n61/XHLZ/aWkpvXr1ynj+XN62/PFtT5o0KafydPX8a9asobq6GqDp92W6tDkwod2DROar6qIOHHcV\n8C5uwrFJqvqGb2pbraqniMg8AFW9zu9fC1wFvOL3+agvvxj4tKp+0+9ztao+JiL5wOuqOlhEZvj3\n+IY/5qfAw6p6TyRPlx+Y0L9/Ee++ewh3NQQwh379evDOO/XtHpdMJn2znKuU7ErHGJOSrYEJ7Zl9\nNDuJSB8R6e/X+wJTcNMur6B5aFUZ7rGc+PIZIlIgIqOBscBaVX0D2CMiE/1Aha8A/xk5JnWuL+EG\nOgCsBKaIyEAROR5I4GaA6DJSf6mkJJNJpky5iClTLmoaXPD++z2BU3HNa5cBp/qy9pWWlrJy5X2s\nXHlfxiqglvlDY/njE3J2CD9/Oh39E8g6Zgiw3A9wywd+paorReRJ4F4RmYUfog2gqs+LyL3A80AD\nMDtyqTIbN0S7EDdEu9aX34mb2Xszboj2DH+uehG5BnjC7/fD1DDwruKuu+7ioov+DwBf/OKnuPfe\nWvbtWwzAo4+WsXx5Db179+Ddd18k+vTT3r2PXAkZY0w2dLQ57lVVPTEDebIq5Oa4qqoqFi68nuZm\ntu8CN9Hy3p1zzjmNhQuvBT7uy5+hsvIKFixYgDHGdEQ6m+PavBISkXdxI9ta0ycdb246bvHiO2ge\n9Qbu8doflKpsbrjBTfc3d65VQMaY3NHeEO1+qtq/jaVHNkOaD9q79z3ghUjJWUTv64neu7NgwQJ2\n7nyJnTtfyqkKKPR2ccsfn5CzQ/j50ynTfUImQ9zEFT8GPupLfgbsJ5FYAUBFhd27Y4zJfR3qE+oq\nQu4T6tPnBPbt2wsU+ZJ6Cgv78t57b8YZyxjTDWSlT8jkrqqqKt5/f6/fGuF/1jN8+OC4IhljTId0\n9D4hE5PUqDjVjwNnAG/55R/4yEdOiTfcMQq9Xdzyxyfk7BB+/nSyK6HAuFFut+AmGL0YN3E4fiBC\nTTtHGmNM7rE+ocA+f3HxSdTX/wA3NDsJXE1+/p/5r/9aZgMRjDFZkQvT9piYHP5o7TeA57n66jlW\nARljgmSVUGAWLFhAZeVlFBVdQ//+V1BZeVlO3ftzLEJvF7f88Qk5O4SfP52sEgpQ6ubTFSt+FWwF\nZIwxYH1CwfUJGWNM3KxPyBhjTJdglVDAQm9XtvzxCjl/yNkh/PzpZJWQMcaY2FifUDf+/MYY0xHW\nJ2SMMaZLsEooYKG3K1v+eIWcP+TsEH7+dLJKKDDJZJIpUy5iypSLWLt2bdxxjDGmU6xPKKDPn0wm\nueCCMvbtWwy4SUuXL7eH1xljsiudfUJWCQX0+adMuYi6umm4yUsBakgkVrBy5X1xxjLGdDM2MKGb\nevvtnS1KXmilLByht4tb/viEnB3Cz59O9jyhoDQAl0a2bwNOjSmLMcZ0njXHBfT5XXPcaOBlXzKa\nROJla44zxmSVNcd1UxUVl1BQsAyYBkyjoGAZFRWXxB3LGGM6zCqh4BwEbgdup7HxvbjDdEro7eKW\nPz4hZ4fw86eTVUIBWbLkDg4cuAn4I/BHGhq+xZIld8QdyxhjOswqoaB9NO4AnTJp0qS4I3SK5Y9P\nyNkh/PzplPFKSER6iMh6EfmD3y4SkToR2SQiK0VkYGTf+SKyWUQ2isiUSPnpIvKsf+3mSHkvEbnH\nlz8mIiMjr5X599gkIl/N9OfMBtcn9H2gBqihoOD71idkjAlaNq6EvgM8D6SGoc0D6lT1ZOAhv42I\njAOmA+O
AqcCtIpIafXEbMEtVxwJjRWSqL58F7PTlNwKL/bmKgCuBCX65KlrZhayx8QCpPqGGBusT\nipPlj0/I2SH8/OmU0UpIREYA5wI/B1IVyjTcn/L4n+f79fOAu1X1oKpuBV4CJorIMKC/qqYmSlsW\nOSZ6rvuAz/n1UmClqu5S1V1AHa5iy6roPG/JZLLT55s/fxENDTeT6hNqbPwW8+cv6vR5jTEmLpm+\nWfVG4PvAgEjZEFXd4dd3AEP8+nDgsch+24AP4YaDbYuUb/fl+J+vAqhqg4jsFpFif65trZwra1rO\n8/boo2WdnuftlVe2tSj5KK+88rtOpIxX6O3ilj8+IWeH8POnU8YqIRH5AvCmqq4XkUmt7aOqKiLh\n3C16DObPX+QrIDfP2759bnRbZyqhkSOHUl8fnTHhUkaO/JvOBTXGmBhl8krok8A0ETkX6A0MEJG7\ngB0iMlRV3/BNbW/6/bcDJ0aOH4G7gtnu11uWp475MPCaiOQDx6nqThHZDkyKHHMi8HBrIcvLyxk1\nahQAAwcOZPz48U1/paTabY91e//+/Tz99HPAC8Capij19W+xZs2aDp9/xoxpPPvsv9PQcDsAeXnv\nMmPGtKbP0tG8cW3fdNNNafm+LX/3yx/tU8mFPF09/5o1a6iurgZo+n2ZNqqa8QU4B/iDX78euNyv\nzwOu8+vjgA1AATAa2ELztEKPAxNx/UoPAFN9+WzgNr8+A/iNXy8C/gwMBI5PrbeSSzMhkbhQoUJh\niEK1QrXm5R2vtbW1nT53bW2tJhIXaiJxoS5evDgNaeOzevXquCN0iuWPT8jZVcPP7393pqV+yMrc\ncSJyDlChqtP8yLV7cVcwW4Evqxs8gIhcAXwNN1Pnd1Q16ctPB6qBQuABVZ3jy3sBdwElwE5ghrpB\nDYjITOAKH6FSVVMDGKK5NBOfv/mRC0OBO4DXKCnpwbp1j6b9vYwxJtvseUJpkqlKyB4+Z4zpymwC\n0xxXWlrK8uXugXOJxIqMVUDRduUQWf54hZw/5OwQfv50sucJZUhpaald+RhjzBFYc1w3/vzGGNMR\n1hxnjDGmS7BKKGChtytb/niFnD/k7BB+/nSySsgYY0xsrE+oG39+Y4zpCOsTMsYY0yVYJRSw0NuV\nLX+8Qs4fcnYIP386WSVkjDEmNtYn1I0/vzHGdIT1CRljjOkSrBIKWOjtypY/XiHnDzk7hJ8/nawS\nMsYYExvrE+rGn98YYzrC+oSMMcZ0CVYJBSz0dmXLH6+Q84ecHcLPn05WCRljjImN9Ql1489vjDEd\nYX1CxhhjugSrhAIWeruy5Y9XyPlDzg7h508nq4SMMcbExvqEuvHnN8aYjrA+IWOMMV2CVUIBC71d\n2fLHK+T8IWeH8POnk1VCxhhjYmN9Qt348xtjTEdYn5AxxpguIWOVkIj0FpHHRWSDiDwvIot8eZGI\n1InIJhFZKSIDI8fMF5HNIrJRRKZEyk8XkWf9azdHynuJyD2+/DERGRl5rcy/xyYR+WqmPmecQm9X\ntvzxCjl/yNkh/PzplLFKSFXfBz6jquOBjwOfEZFPAfOAOlU9GXjIbyMi44DpwDhgKnCriKQu924D\nZqnqWGCsiEz15bOAnb78RmCxP1cRcCUwwS9XRSu7rmLDhg1xR+gUyx+vkPOHnB3Cz59OGW2OU9X3\n/GoB0AP4KzANqPHlNcD5fv084G5VPaiqW4GXgIkiMgzor6pr/X7LIsdEz3Uf8Dm/XgqsVNVdqroL\nqMNVbF3Krl274o7QKZY/XiHnDzk7hJ8/nTJaCYlInohsAHYAq1X1T8AQVd3hd9kBDPHrw4FtkcO3\nAR9qpXy7L8f/fBVAVRuA3SJS3M65jDHG5JD8TJ5cVRuB8SJyHJAUkc+0eF1FxIanddDWrVvjjtAp\nlj9eIecPOTuEnz+tVDUrC/AD4FJgIzDUlw0DNvr1ecC8yP61wERgKPBCp
Pxi4LbIPmf69XzgLb8+\nA7g9csxPgemtZFJbbLHFFluOfUlX3ZCxKyERGQQ0qOouESkEEsAPgRVAGW4QQRnwe3/ICuDXInID\nrulsLLDWXy3tEZGJwFrgK8AtkWPKgMeAL+EGOgCsBK71gxHEv/flLTOma5y7McaYjslkc9wwoEZE\n8nB9T3ep6kMish64V0RmAVuBLwOo6vMici/wPNAAzI7cSTobqAYKgQdUtdaX3wncJSKbgZ24KyBU\ntV5ErgGe8Pv90A9QMMYYk0O69YwJxhhj4tWlZkwQkV+IyA4ReTZS9h8i8oKIPC0i9/tBEqnXjunm\n2JjyX+OzbxCRh0TkxJDyR16rEJFGfw9XMPlF5GoR2SYi6/3yDyHl9+Xf9v8GnhORxSHlF5HfRL77\nl31LSs7lbyP7BBFZ67M/ISJ/l4vZ28n/CRH5o4g8IyIrRKR/RvJna2BClgY/nA2UAM9GyhJAnl+/\nDrjOr48DNgA9gVG4+5JSV4ZrgQl+/QFgaoz5+0fWvw38PKT8vvxE3CCSl4GikPIDVwFzW9k3lPyf\nwd0n19NvDw4pf4vXfwQszMX8bXz3a4BSv/4PuNtUci57O/mfAM726zOBf89E/i51JaSq/4O7ITZa\nVqduqDjA48AIv96Rm2Mzqo3870Q2+wFv+/Ug8ns3AJe1KAspf2sDWELJ/01gkaoe9Pu85ctDyQ+A\niAiu//huX5RT+dvI/jqQankZiLvHEXIsO7SZf6wvB1gFXOTX05q/S1VCR+FruNoZOnZzbCxEpEpE\n/gKUA4t8cRD5ReQ8YJuqPtPipSDye9/2TaJ3SvP0T6HkHwt8WtzcimtE5AxfHkr+lLOBHaq6xW+H\nkH8esMT/2/0PYL4vDyE7wJ/8v1+Af8S1aECa83ebSkhEFgAHVPXXcWc5Vqq6QFU/DCwFboo7z9ES\nkT7AFbgmrabimOJ01G3AaGA87i/bJfHGOWb5wPGqeibwfeDemPN01MVAaP927wTm+H+73wN+EXOe\nY/U1YLaIPIlrhTmQiTfJ6IwJuUJEyoFzaZ5bDlwtfWJkewSuFt9Oc5Ndqnw7ueHXNF/JhZB/DK7N\n+GnXmsII4Clx93yFkB9VfTO1LiI/B/7gN4PIj8t0P4CqPuEHhwwinPyISD5wAXBapDiE/BNUdbJf\n/x3wc78eQnZU9UXcPJyIyMnA5/1Lac3f5a+ExM24/X3gPHUze6esAGaISIGIjKb55tg3gD0iMtG3\nQ3+F5htqs05ExkY2zwNSo4NyPr+qPquqQ1R1tKqOxv2Pepq6uQNzPj+Ab+dOuQBIjR4KIr9/789C\n0y+SAlV9m3DyA0zGzZryWqQshPwvicg5fv2zwCa/HkJ2RGSw/5kHLMS1CkC682dj5EW2Flyn5Wu4\ny8ZXcZeTm4FXcL+81wO3Rva/AtepthE/isWXn477ZfMScEvM+X/ns2zAzRR+QgD59/v8M1u8/mf8\n6Lgczx/9/pcBzwBP+39QQwLI3/T940Yw3eXzPAVMCim/L18KXNLK/jmTv5X/d2YCZ+AGQ20A/giU\n5GL2dv7fnwO86JdrM/Xd282qxhhjYtPlm+OMMcbkLquEjDHGxMYqIWOMMbGxSsgYY0xsrBIyxhgT\nG6uEjDHGxMYqIWOyREQWiHucwtN+ev8J/oa/m/zU95tE5Pci8qHIMYek+VEG60WkPLJ+wE+zv15E\nro3zsxnTUd1i2h5j4iYif4+b9qREVQ+Ke65SL+BaoC9wsqqqn2LqfmCiP/Q9VS1pcbpqf86XcTef\n1mfhIxiTEXYlZEx2DAXe1uZHKtQDu3Ezo39P/V3jqloN7BeRSfHENCa7rBIyJjtWAieKyIsi8hMR\n+TRwEvAXVX23xb5PAh/z630izW/3ZTOwMdlgzXHGZIGq7hWR03HPxfkMcA+uKa4tqX+brTXHGdNl\nWCVkTJaoe8LvI8AjIvIs8A3c1VG/F
ldDZwD/HUdGY7LNmuOMyQIRObnFYzlKgBdws3Tf4KfLR0S+\ninuA2MPZT2lM9tmVkDHZ0Q/4sX88eAPuESOXAO/iHv38oogUAruAKdo8vX1709zbFPgmePYoB2Ny\nhIgMwT259T9U9bdx5zEmG6wSMsYYExvrEzLGGBMbq4SMMcbExiohY4wxsbFKyBhjTGysEjLGGBMb\nq4SMMcbExiohY4wxsfn/rOJs9eexHG4AAAAASUVORK5CYII=\n", "text": [ "" ] } ], "prompt_number": 181 }, { "cell_type": "code", "collapsed": false, "input": [ "from sklearn.linear_model import Ridge\n", "from sklearn.metrics import explained_variance_score, r2_score, mean_squared_error, mean_absolute_error\n", "\n", "# Standardize the features, then fit a ridge (L2-regularized) linear regression\n", "ridge_regressor_pipeline = Pipeline([\n", "    ('normalize', preprocessing.StandardScaler()),\n", "    ('regressor', Ridge())\n", "])\n", "\n", "scores = cross_val_score(ridge_regressor_pipeline,\n", "                         X_train,       # training samples\n", "                         y_train,       # training target values\n", "                         cv=5,          # 5-fold cross-validation: train on 4 folds, score on the remaining fold\n", "                         scoring='r2',  # use the R^2 score\n", "                         n_jobs=-1)     # use all available CPU cores\n", "\n", "print(scores,scores.mean(),scores.std())\n", "# Note: for a regressor this is the mean R^2 (+/- two standard deviations), not a classification accuracy\n", "print(\"Accuracy: %0.5f (+/- %0.5f)\" % (scores.mean(), scores.std() * 2))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "[ 0.9822377 0.97144495 0.97506792 0.98181215 0.96860433] 0.975833408526 0.0054564552039\n", "Accuracy: 0.97583 (+/- 0.01091)\n" ] } ], "prompt_number": 86 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, let's fit the pipeline on the full training set and compare its predictions on the held-out test set with the actual values." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "# Fit on the full training set, then score the predictions on the held-out test set\n", "ridge_regressor_pipeline.fit(X_train,y_train)\n", "y_predicted = ridge_regressor_pipeline.predict(X_test)\n", "print(\"Variance Score: %0.05f and R2 Score: %0.5f \\n\" % (explained_variance_score(y_test,y_predicted), r2_score(y_test,y_predicted)))\n" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Variance: 0.97419 and R2 Score: 0.97418 \n", "\n" ] } ], "prompt_number": 87 }, { "cell_type": "code", "collapsed": false, "input": [ "def plot_learning_curve(estimator, title, X, y, ylim=None, cv=None,\n", "                        n_jobs=-1, train_sizes=np.linspace(.1, 1.0, 5)):\n", "    \"\"\"\n", "    Generate a simple plot of the training and cross-validation learning curves.\n", "\n", "    Parameters\n", "    ----------\n", "    estimator : object type that implements the \"fit\" and \"predict\" methods\n", "        An object of that type which is cloned for each validation.\n", "\n", "    title : string\n", "        Title for the chart.\n", "\n", "    X : array-like, shape (n_samples, n_features)\n", "        Training vector, where n_samples is the number of samples and\n", "        n_features is the number of features.\n", "\n", "    y : array-like, shape (n_samples) or (n_samples, n_features), optional\n", "        Target relative to X for classification or regression;\n", "        None for unsupervised learning.\n", "\n", "    ylim : tuple, shape (ymin, ymax), optional\n", "        Defines the minimum and maximum y-values plotted.\n", "\n", "    cv : integer, cross-validation generator, optional\n", "        If an integer is passed, it is the number of folds (defaults to 3).\n", "        Specific cross-validation objects can be passed; see the\n", "        sklearn.cross_validation module for the list of possible objects.\n", "\n", "    n_jobs : integer, optional\n", "        Number of jobs to run in parallel (default -1, which uses all CPU cores).\n", "\n", "    train_sizes : array-like, optional\n", "        Fractions of the training set (or absolute sample counts) at which the\n", "        curve is evaluated (default: 5 evenly spaced values from 0.1 to 1.0).\n", "    \"\"\"\n", "    plt.figure()\n", "    plt.title(title)\n", "    if ylim is not None:\n", "        plt.ylim(*ylim)\n", "    plt.xlabel(\"Training examples\")\n", "    plt.ylabel(\"Score\")\n", " 
   train_sizes, train_scores, test_scores = learning_curve(\n", "        estimator, X, y, cv=cv, n_jobs=n_jobs, train_sizes=train_sizes)\n", "    # Mean and standard deviation of the scores across CV folds, per training-set size\n", "    train_scores_mean = np.mean(train_scores, axis=1)\n", "    train_scores_std = np.std(train_scores, axis=1)\n", "    test_scores_mean = np.mean(test_scores, axis=1)\n", "    test_scores_std = np.std(test_scores, axis=1)\n", "    plt.grid()\n", "\n", "    # Shaded bands: mean score +/- one standard deviation\n", "    plt.fill_between(train_sizes, train_scores_mean - train_scores_std,\n", "                     train_scores_mean + train_scores_std, alpha=0.1,\n", "                     color=\"r\")\n", "    plt.fill_between(train_sizes, test_scores_mean - test_scores_std,\n", "                     test_scores_mean + test_scores_std, alpha=0.1, color=\"g\")\n", "    # Lines: mean training and cross-validation scores\n", "    plt.plot(train_sizes, train_scores_mean, 'o-', color=\"r\",\n", "             label=\"Training score\")\n", "    plt.plot(train_sizes, test_scores_mean, 'o-', color=\"g\",\n", "             label=\"Cross-validation score\")\n", "\n", "    plt.legend(loc=\"best\")\n", "    return plt" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 177 }, { "cell_type": "code", "collapsed": false, "input": [ "%time plot_learning_curve(ridge_regressor_pipeline, \"accuracy vs. 
training set size\", X_train, y_train, cv=5)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "CPU times: user 265 ms, sys: 48.7 ms, total: 314 ms\n", "Wall time: 520 ms\n" ] }, { "metadata": {}, "output_type": "pyout", "prompt_number": 184, "text": [ "" ] }, { "metadata": {}, "output_type": "display_data", "png": "iVBORw0KGgoAAAANSUhEUgAAAY4AAAEZCAYAAACAZ8KHAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzsnXl8VOXV+L9nZrJCSMK+yBqtQvUFK6JCQaoWUFBrrQtU\nLWqtvi64vlD0tWBbq6j4KtafUkWxWpfWlmpBFFtBURSkiltREWQJILtASDLr+f0xd5Ykk2SSzGQ9\n38/nfuY+271nTm7umeecZxFVxTAMwzCSxdXUAhiGYRgtCzMchmEYRp0ww2EYhmHUCTMchmEYRp0w\nw2EYhmHUCTMchmEYRp0ww2EYTYiI/FREXkt13eaOiPQRkYMiIk0ti1F3xOZxGEb9EJH5wBZVvb2p\nZWkKRKQfsAHwqGqoaaUxGhPrcRhNhjg0tRzpQkQ8TS1DI9Fq/4ZGYsxwtHFE5Jci8pWIHBCRz0Tk\nR5XKrxCR/8SVH+vk9xaRv4nIThHZLSIPOfkzReTpuPb9RCQkIi4nvUxEfisi7wCHgAEicmncPdaL\nyC8qyXC2iKwRkf2OrGNF5DwRWV2p3k0i8vcE3/ECEXm/Ut6NIvKSc36G890OiEixiNychN5+AUwC\npjoul8i1NorIVBH5GDgoIu6adCwik0VkeVw6JCJXisiXIrJPRH5fz7ouEZktIrtEZIOIXBv/d0jw\nfaY53/2AiHwuIqc4+RIn/24ReUFECp1mbzmf3zo6OCHBdYeJyGrnb/eNiMx28qPPhYic5LSPHOUi\n8nXc96ju/kZToap2tOED+AnQ3Tk/HygBujnp84Bi4DgnXQT0AdzAR8BsIAfIAoY7dWYAT8ddvx8Q\nAlxOehmwERhI+IeLBzgD6O+UjyJsUI510sOAb4FTnXRP4EggE9gDHBV3rw+BcxJ8xxzgAHB4XN77\nwPnO+XZghHOeH7l3Erp7Evh1pbyNwAdALyArCR1PBpbHtQ8BLwMdgN7ATmBsPepeBXzm6KsA+CcQ\njPwdKsl8JLA5TsY+wADn/HpghXOdDOBR4FmnrG/837YaHb0L/NQ5zwVOSPRcxNX3OM/InbXd344m\nfG80tQB2NK/Defme6Zy/BlyXoM5Jzksq0UtoJjUbjqXAzFpkWABMcc7nArOrqfcI8Fvn/LvAXiCj\nmrpPA7c750cQNiTZTnoT8AugQx119STwm0p5XwOTk9DxWc55ImMwPC79AjCtDnWnOudvAFfElZ1a\n3UseOBzY4dTJqFT2H+CUuHQPwEfY6Cd8+Vdq/6bzTHSulF+d4XgEeDmZ+zf1/0pbPsxV1cYRkUtE\n5EPH1bEPOBro7BQfBqxP0Kw3sEnrHxDdUkmG00XkPRHZ48hwBtCpFhkAniLsLgK4GHhBVf3V1H0W\nmOicTwIWqGq5kz7XuedGx5V2Yp2/UUUqf79EOu6UuCkA38SdlwLt6lC3vXPeo5IcxdVdQFW/Am4g\n/ILfISLPiUgPp7gfsCBO9v8AAaBbDTLFcznwHWCtiKwSkfHVVRSRKwn3OCfFZTf0/kYaMMPRhhGR\nvsAfgGuAjqpaCHxKLNi5hfCv0cpsAfqIiDtBWQlhl0SE7gnqRIfyiUgW8Ffg
HqCrI8MrSciAqr4H\n+ERkFGGj8HSieg7/BLqIyGDgQsKGJHKd1ar6I6AL8HfgzzVcJ+H3qC4/CR2ni+2EDXyE3tVVBFDV\n51R1JGH3kwKznKLNwDhVLYw7clV1O9V///jrfqWqk1S1i3PNF0Ukp3I9ERkJ/Bo4W1VL4opqur/R\nRJjhaNu0I/zPvxtwicilhH8NR3gcuEVEvucESQ8XkT7ASsIvprtFJFdEskVkuNNmDTBKwsHzfGB6\ngvvGvzQznWM3EBKR04ExceXzgEtF5BQnUNpLRI6MK38a+D3gU9UV1X1RpyfyF+A+oBB4HUBEMiQ8\nPyJfVYPAQcKxgGTYAQyopU5tOq4NIXkjE1/3z8D1ItJTRAqAaVTzoheR7zj6zQK8QDkxHTwK/M75\nuyMiXUTkLKdsF2F3U1G1AolcJCJdnOR+R4ZQpTq9HXkvdno/8dR0f6OJMMPRhlHV/xAOcL9L2OVx\nNPB2XPmLwJ2Ef50fAP4GFDouqjMJ9wQ2E+4VnO+0+SdhX/vHhAPQ/6DqCyuaVtWDwBTCL469hHsO\nL8WVvw9cCvwf4SD5UsLB2whPE45vPJPEV36WsB//L5XcbBcBX4vIfsKxjp9ChUlqh1VzvXnAIMeN\n8rdEFWrTMWFdaKU01ZTXpe5jwBLCf4d/A4uAYDXuxSzgLsKGYDthV2XE4D9IOAC/REQOON9jmPPd\nSgk/H+84OhiW4NpjgU9F5CDhv+GFquqtJP+pQFfgr3Ejqz6p7f5G05HWCYAi8gQwHtipqsdUU2cO\ncDph/+xkVf3QyR8HPEB4BM/jqjorUXujbeO4PXYQHglVXSykzeP05B5R1X5NLYvR8kl3j+NJYFx1\nhSJyBuEhkkcQ/qX3iJPvJux+GAcMAiaKyMA0y2q0TP4bWGVGoyKO+/AMEfGISC/Cw6QT9ooMo66k\ndWarqi6X8LIE1XEW4ZExqOpKESkQke5Af+ArVd0IICLPA2cDa9Mpr9GyEJGNhN0dP6qlaltECI+S\neh4oAxYCv2pKgYzWQ1MvidCLqkMGexGe7FM5v8qsVKNtY26X6lHVMiwWYKSJ5hAct3VuDMMwWhBN\n3ePYSsXx5YcR7l1kUHUMepUJTCJiS/sahmHUA1Wt94/2pu5xvAxcAuDM1v1WVXcAq4EjnIXQMoEL\nnLpVaOqp983lmDFjRpPL0FwO04XpwnRR89FQ0trjEJHngJOBziKyhfDIjgwAVZ2rqq84Iz++Iryw\n3aVOWUBEriW8VpIbmKeqFhivgY0bNza1CM0G00UM00UM00XqSPeoqolJ1Lm2mvzFwOKUC2UYhmE0\niKZ2VRkpYvLkyU0tQrPBdBHDdBHDdJE6WvTWsSKiLVl+wzCMpkBE0BYcHDdSxLJly5pahGaD6SKG\n6SKG6SJ1mOEwDMMw6oS5qgzDMNoYDXVVNfUEwGbPW4sWsWTOHDxeL4GsLMZMmcKo8dVuYmYYhtHq\nMcNRA28tWsRr11/PnetjC6/etm4dlJYy6vTTQaRuRxpZtmwZo0ePTus9Wgqmiximiximi9RhhqMG\nlsyZU8FoANz59dfc/stfMmrDBsjNhXbtwkf8ebt2kJVV9YIi4HLFPpM5Khue6oxSMAihUKMYKcMw\n2jZmOGrA4/UmzHeXlMCqVXDoUMWjpCT26XaHDUj79uEjch5vXCofiQxR5NMdt713grjO6N69Id7I\n1dNIvfXqqyx59NGway47mzFXXx12zVU2VK7mO67CflXGMF3EMF2kDjMcNRBI1GsAgkcfDf/v/0Eg\nEP6VHwiEf/FHXuihEPh8FY1J5CgrC3+WloaPb7+FrVvD5yUlFY1P5LO0FDIzExuhynmJjFRlg5Sd\nHX75R+RVBVXeWrqU1+68kzs3b45+19u+/BJ27WLUD34Qqx/f+0lknDyemntPlXtSTeDaawoiAznU\n2TG1pnRd6kbSkXYhZ3fYmj4V5b3Fr7Ni
7hPR+N2JV07mhLGnxeSlYesaRWSqd3u0ml3Sk2+f6P6r\nlyxl1R/+SLsAaE6uxS3rgRmOGhgzZQq3rV9fwV11a1ER4266Cbp1S9woFAofqlXPI+6kQKCq0YnU\nU429NCPnoVDY4ESMTcQARc5LS1n22WeMzsuDHTtiBife+MSfe72Qk1PFyCz54gvu3Lu3wte5c/Nm\nbr/7bkZt3Rp2v2VkhI1Y5IikPZ7wefzh8cTqRMrje06ViTdM1Rkltzt81ODSW/b224w++eQajVJI\nQxWONxcuZOnDj+DxevFnZTHqv3/BiePCL9EaX8DOy7FyWeSFG6JiOnIuIiiKINF0FT1E6kLFutXp\njfBoGTRcF+C9Fas4acQJEPdYicL7ry/j+em3UxLahTcDsvzw9c1rwfcbhv3wB9Hrxt9LatsBoSYj\nU/k71lQ/lGhbdJBQ3epX1suK997HXVrGh3fczf2bY4tt3+b8f5vxSB4zHDUQeZBuf+gh3OXlBLOz\nGXfddTU/YJEXWX2IGJhERidyVDY6gUC4Tq9eMGxYVaMTfx6RS7WiIXJ6O55f/QoqGQ4Ad3k5bNsW\n7kX5/WHDEzn3+cLpyHn8kahuxKjEG59EhqiykYo3QpH8uHP1eNAMD6HMDPybNuJd9zkhj5tgpoeA\nx0XA7Qp/elwEMt0EMzMhMwPNzub9VR/ywX0Pc++WbdHv/D9ffEHJrdczdNRw8AcQFAmGkFAo/BkM\nIRqCkOIKBHAFQ4gqBEO4nB8CElInL4gEQ6AhCIb/bhL5ewbDPygkUqYhJBCs8Hev0DY+HV8eOY9/\nVkIhCrdspcsr/6pYHgzx59df46Mu+1l/XuzvXPSXXbhumsrpI0+u9FLXCh9A+HtFnqX457dSk4pt\nq2sjlfLjXvhK1Tbx1ChH5DNsTPP2fcuyjZsZdehbxvYkajCnbFnP6w89ZIajDpjhqIVR48c33gMl\nUvMv8hoYPWBAzUZHtaLRycmBDh0q9IQCXbrAl19WuXawqAimTq2YGW+YEqUT5UWMXSIDE294nDoh\nr5eQrxz1eVGfD/X7UK+XoM+LesvRkgOEfH7UV474/EgggPh8nOjz4/9kDeLz4/b78fgD4XJ/APH7\nnfPwJ34fT5Z7GZ1JxZfJjq28999TuQDQaG/HhUr4M/IDQZ18xIU6n5Hy+HbR8rhekrrdzt/chbrc\n4JJK1430rCRc7g73rDSS7zwvkfsG3eB1g9cFfjf43EqfPnmsc5Xicys+l3O44e1Oh/jmvIp/rvXn\nQcmfS3npux5cTv/ChYTPJfzpElc4rYpIuFa0LJyKa+OKXiea73KFrxtX5oq2jFyfuPNwDzEiQ+z6\nLgR1ZHI79SXWq6n8qcrxIsx78EH+0pMKBnP9X2DYrirb/Rg1YIajNRHpUdTH+KgyZto0brvxxoqu\nuQEDwq653r2r1K8xnSAvFApWcA8FQwFCGiIQChAIBfGH/PgDPoKhAEENhi9BZReF4lJwiRuXhF8g\nLnEldt9UlqmyS8NJf/nj83mBDVVeJkMYwPoXn8Ef8hPQID4N4A/58YUC+J1zv3PuC/qdtB+fBgmE\n/PhCfvwaxB90zkP+cDunri8UwB/0EQgFouW+kD9a7g/54ur6otcJhAKxawbD9XxBPwBZ7kwy3Jlk\nuDLIcHucz0wyXRlkuDPIcGWS4fawt2PiX/D7C2D+4QeruOEicZGQKloh7ZwrKKFou5CGCBH+G1Ru\nH3bnxedH3H5OW0LOfdQpd/JD4TKNylBVPogzOuIKG5OIERMXpX0OoadV/M7rzwP92zcJ9WEkxgxH\nK6HBY9RFGHXmmeByJe2a07iXQlCDcQYh6LwMfY5RCBuCUChU1c8NuNwuxCO4JBOXZJMl7oT1kmXF\n8hUMHzk8KmOJr4Q9ZXvYXbqbPaV7oueR9KsdNlM+tuI11p8H6/+1gYUv/sB54WaQ6c6MnifMi+S7\nM8h0
hfMz3Zl4XJ4KeRmZ7WjnysTj9sTy4suTvE6mYyA8Lg+Z7kzcrqo/GOJ1Ec8ZC8byEZ9WyT+q\n4Cjmn/NUvXXf1GgVQxUzgO8uf5ffvf0bPmddlXZ5fXo0gbQtFzMcRgUirjlvwBv9BzzgPYA/6I8a\ngUAo/Gu7csAXwkHIqEtBXLhdbjLcGWRJ4hFq9aHMX1bBEOwucwyCc/7Vv78iuCnInrJwntvlpnNu\nZzrldKJTbic653Smc25nenXoxeBug/kgbzWbq+5MzKD2R/L69W+kTO7mxE1X/Q9T5/wPO0bujOZ1\ne6srN17/P00oVcMREdzixk1VI5qTkUO3zj0TGo7uXXo1hnitBjMcrYRUjlHfUxp+KbskFuSPdf3D\nR44np0G9gnh8QV+0JxC5d+Xz+M9gKEjHnI50zg0bgPjz73T+Dmd+58xoulNOJ3Iycmq8/1+7/TWh\n4ejapXF+hVY3DDeZ4bmVyyNlkb/N4BMGc8h3qOL9UE4acRK/9v+Gp//+NN6QlyxXFhdffTEnjTiJ\nEm9JhVFfldvWNS/+OUk0MitSHl+WirzKDB85nNJAKRsf38im4zZF84s+KOK6a69L2MZIjC1yaFTg\nhUUv8NBzDxEkSJYri8t+chmn/eC02hvGEQwF2Ve+L+YOqsYg7C7dzd6yvRzyH6JjTscKPYJOuRV7\nB51yO9EppxOdczvTPrN9yoyWqrLkjSXcMe8ONg2NvUz6vN+H2y+7PWqQa3rOEr04a7pfRHZVxeVy\n4XIWqZaoP16iRjtiqGs7j3zGv0irO4+/V+Q8/jtWZ5zqm1d5qHI68ioPf07kFo3ofumypTz996dR\nUdp72nPdxOsY/8O2NaKqoYscmuFoJaRiHZ4/L/oztzxyC1uO3xLN6/vvvsy8fCbHn3R8xZd+2W72\nlu6NnsfHDvaX76dDVoeEPYJ4AxApK8guqNC7aQiqyjtvvcMJ3z+hoq+7mviKiOARD2++9Sbz/zYf\nb8hLjjuHn//k54w7dVyVl3PlF24y51D9izzd2PpMMaOydNnSCj8EEsWE2gq2Oq6REg56DzLnuTkV\njAbApuM2cen9l9Lh9A4JewQDCgcwrNewCgahMKcQjyt1j1Yk4B5vCNQZdVP5BSwiBDWIW9xke7Lx\nuDx4XB7cLnc45iLuCi63SPt+Z/fjZ2f/LGUyG82HyN84vpeWZAfRqAbrcRgc8h2i+EAxk2+YzHtH\nvFelfNiXw1jw6IKU3a86QwBUMQaRYGdk5FBlQ1DZGDTWL3nDaMlYj8NoEGX+MooPFJObkUuWK/HI\np1xPbo3XiB+GW9kQVCbeEGS5s8hwZ5ghMIwWhhmOVkJ9fNnlgXKKDxSTk5GD2+Xm0nMv5b0H38N7\ncmxV4D7v92HSZZMqjMyJ7xXEG4JsT3Z0DkK8IYg3Bo1hCMyvH8N0EcN0kTrMcLRRfEEfxQeKo+4f\ngKyiLPIH5fOdr76DL+Qj15PLFVdewRmnnYHH7akyJNfdwIl6hmG0TCzG0QbxB/1sObAFl7jIdGcC\n4V7EhGcn8Iuhv+DUfqfSO793rfMfDMNomTQ0xtF8d+Mx0kIgFKD4QDGCRI0GwKtfvYo/5Gf8EeOj\nbifDMIxEmOFoJSxbtqzWOsFQkOL9xagqWZ6sCvn3rLiHaSOm4Q/6KcgpaNEuqGR00VYwXcQwXaQO\nMxxthJCG2HpwK0ENkp1RsTfxt8//Rn5WPqf0P4WgBmmX0a6JpDQMoyWQ1hiHiIwDHgDcwOOqOqtS\neSHwBDAAKAcuU9XPnLKNwAEgCPhVdViC61uMIwlCGmL7we2U+cvIzaw4tNYX9DHqyVE8MO4BhvYc\nSigUol9hv6YR1DCMRqHZxjhExA38HhgHDAImisjAStVuBT5Q1cHAJcCDcWUKjFbVYxMZDSM5VJUd\nJTsoC1Q1GgDPfvIsh3c8nBMPOxFvwEthTmETSGkYRksina6qYcBXqr
pRVf3A88DZleoMBJYCqOoX\nQD8R6RJX3nId7Y1MIv+tqrLz0E4O+g6Sm1HVaJT6S5mzcg7TRkwL10dpl9ny3VTmy45huohhukgd\n6TQcvYD4hY+Knbx4PgJ+DCAiw4C+wGFOmQL/FJHVInJFGuVstUQWHGyf2T5h+RMfPsHxvY7nmG7H\n4A14yfXkpnSNKcMwWidpi3GIyLnAOFW9wklfBJygqtfF1ckj7J46FvgEOAr4uap+LCI9VXWb0wN5\nHbhOVZdXuofFOKohsoptXlZewvL95fv5/pPfZ8EFCzi84+GUeEvo1aFXq+hxGIZRM815raqtQPxG\n1b2h4m45qnoQuCySFpGvgQ1O2Tbnc5eILCDs+qpgOAAmT55Mv379ACgoKGDIkCHRZQUiXdO2lh58\nwmB2l+7m45UfIyLRrUNXLF8BhDe0eWT1IwwuG8zOz3ZS9P0iRISV76zEJa4ml9/SlrZ0atPLli1j\n/vz5ANH3ZUNIZ4/DA3wBnApsA1YBE1V1bVydfKBMVX2OO2qEqk4WkVzAraoHRaQdsAS4Q1WXVLqH\n9Tgcljnr8BwoP8D2ku01bna069AuRj81miUXLaFXh16U+cvIy8qja7uujSx1eojowjBdxGO6iNFs\nexyqGhCRa4HXCA/Hnaeqa0XkSqd8LuHRVvNFRIFPgcud5t2ABc6LzwP8qbLRMKpS4ithe8l22mW2\nq3EC35yVczh34Ln06hAOOQVDQfIyE7u0DMMwKmNrVbUSSv2lbNm/hXaZ7WrcTa/4QDFjnxnLm5Pf\npHNuZ0Iawhfw0b+wf4ueLW4YRvI023kcRuMRv6dGbVuwzn53Nj8b/DM653YGwBvwtvglRgzDaFzM\ncLRwvAEvxQeK+fDdD2vdQ3ndnnX8a8O/uPK4K6N5QQ1WO1y3pRIJChqmi3hMF6nDDEcLJn5PjdqM\nBsA9K+7hqqFXkZ+dD4RXys1yZ1VYJdcwDKM2LMbRQkm0p0ZNfPTNR1z60qW8c9k70X02DvkO0a1d\nNzpkd0i3uIZhNCMsxtEGieypASTdW5j1ziymnDClwuZMiiZcv8owDKMmzHC0MIKhIFsPbEVVK2y2\nFJncl4gVW1bw9bdfM+mYSdE8b8BL+8z2rXKJEfNlxzBdxDBdpA4zHC2IkIbYdnAbgVCgyp4a1aGq\nzHpnFjefdHOF3kkgFCA/Kz9dohqG0YqxGEcLQVXZdnBbwj01auL1Da9z1/K7eP3i16MBdFWlzF/G\ngI4Dah2+axhG68NiHG0AVeWbkm8o9ZfWyWiENMSst2cxdcTUCqOuygPldMjuYEbDMIx6YW+OFsDO\nQzsp8ZXUuHJtohjHy1+8TLYnm7FFYyvkBzVIh6zWO5LKfNkxTBcxTBepwwxHM2f3od18W/5tnZc7\n9wf93LviXqZ9f1qFWeHBUBCPeCoE1g3DMOqCxTiaMXvL9rLr0K5q99SoiWc+foZ/fPkPXvjJCxXy\nS32ldMrtZFvEGkYbptmujms0jG/LvmXXoV31Wg6kzF/G/733fzx25mNVykKEbLMmwzAahLmqmiEH\nyg/wzaFvatxTozLxMY6nPnqKId2G8L0e36tQxx/0k+3JbvVLjJgvO4bpIobpInVYj6OZcch3iG0H\nt9E+K3mjEc9B70EeWf1IFRcVhNe26t6+eyrENAyjDWMxjmZEqb+U4v3F5GTkJLVoYSJmr5jNpv2b\nmHP6nCplh3yHGFA4oN7XNgyjdWAxjlZCeaCc4gMNMxp7y/byxJoneGXSK1XKvAEv7TLbmdEwDKPB\nWIyjGRDZUyPbk13vF/uK5St4aNVDnH3k2fQt6Ful3B/yU5Bd0FBRWwTmy45huohhukgd1uNoYiJ7\namS4Mhq04OCe0j38+as/88bP3qhSpqq4cJHjyUnQ0jAMo25YjKMJieypIQhZnqwGXWvq61PJz8rn\ntlG3VSkr85eRn5VP53adG3QPwz
BaBxbjaKEEQgG2HtiaEqOxYd8GXln3CssvXZ6wPKhB2me1ru1h\nDcNoOizG0QQEQ0G2HdxGSEMNNhoQHkk1xj0m4WzwYCiIx9W2lhgxX3YM00UM00XqMMPRyET21PAH\n/UnvqVETn+36jHe2vMOEIyYkLPcGvHTM7tjg+xiGYUSwGEcjoqpsP7id0kApuRmp2bL1Z3//GSP7\njOTn3/t5wvISbwn9C/uT4c5Iyf0Mw2j52H4cLQRVZcehHRzyH0qZ0Xh/6/us3bWWi//r4oTlkV6N\nGQ3DMFKJGY5GYnfpbg6UH0jZAoOqyt1v381NJ91Elicr4X4c3qCXjjltz01lvuwYposYpovUYYaj\nEdh9aDf7yveldGTTm5veZFfpLn4y6CfV1hHE5m4YhpFyLMaRZhqyp0Z1qCqn/+l0rhl2DWd+58yE\ndcoD5eR6cumeZ4saGoZREYtxNGP2l+9nZ8nOeu2pUROL1i0CYPwR46utEwgF6JDdereHNQyj6Uir\n4RCRcSLyuYisE5FpCcoLRWSBiHwkIitF5LvJtm3uHPQe5JuSb8jLyqvX8ujVEQgFwlvCjpiGS2J/\nvvgYh6riFnebdVOZLzuG6SKG6SJ1pM1wiIgb+D0wDhgETBSRgZWq3Qp8oKqDgUuAB+vQttkS2VOj\nXWa7lBoNgL/+5690zunM6H6jq61THignPys/5fc2DMOANMY4ROQkYIaqjnPSvwRQ1bvj6iwE7lbV\nt530V8BwoKi2tk5+s4txlPnL2Lx/M7kZuSlfwtwb8DLyyZE8fMbDHN/r+GrrlXhL6FvQNyWz0g3D\naH005xhHL2BLXLrYyYvnI+DHACIyDOgLHJZk22ZHKvbUqIlnPn6GozofVaPRCIaCZLgzzGgYhpE2\n0rnIYTJdgbuBB0XkQ+AT4EMgmGRbACZPnky/fv0AKCgoYMiQIYwePRqI+TQbI+0NePnrK3/F7XIz\n6uRRQCzuMHzk8AanD/kOMfu52dw2Mrb6bXx55LzcX86ZY89s9O/fnNKRvOYiT1Om16xZww033NBs\n5GnK9AMPPNBk74emTi9btoz58+cDRN+XDSGdrqoTgZlx7qbpQEhVZ9XQ5mvgGODoZNo2F1dVZHl0\nl7jIdGem5R5zVs5h7e61PDL+kYTlK5avYPjI4bbECOF/mMg/T1vHdBHDdBGjoa6qdBoOD/AFcCqw\nDVgFTFTVtXF18oEyVfWJyBXACFWdnExbp32TG45AKMDm/ZtTsjx6dewr28fIJ0fy0sSXKCosqrae\nL+jDIx4Oyz8sLXIYhtE6aLb7cahqQESuBV4D3MA8VV0rIlc65XMJj5iaLyIKfApcXlPbdMlaX4Kh\nIMX7i9NqNAAeXf0opx9+eo1GA8I9n855tlmTYRjpxWaO15OQhig+UIw/6CcnI33zJXaU7OCUP57C\nkouX0Cuv+vEBK5avYPAJgxlQOCAtgfmWhLkkYpguYpguYjTnUVWtlpCG2H5we9qNBoRjG+cNOq9G\nowFhN1VeZl6bNxqGYaQf63HUEVXlm5JvUro8enVs3r+Z0/90Om9NfotOuZ1qrFviLaF3fu+0GzLD\nMFo+1uNoRFSVnYd2ctB3MO1GA2D2u7O5dMiltRqNkIZwu9xtantYwzCaDjMcdWB36W72l+9P+aKF\nifhi9xdLIKnOAAAgAElEQVQs/XopVx53Za11vQEvn6761JYYcYifz9HWMV3EMF2kDjMcSbKndA97\ny/amdE+Nmrh3xb1cffzVSS3HHgwFzUVlGEajYTGOJDjkO0TxgeKU7qlREx9u/5Cf/+PnvH3p27Ua\nhEAoQCgUol9hv0aRzTCMlo/FOBqBkIYqLGGebma9M4sbTrwhqV6EN+ClMKewEaQyDMMIY4ajmfH2\n5rfZsn8LF373wqTqqyrtMtuZ/zYO00UM00UM00XqMMPRjFBV7n77bm4ZfktSa035gj5yM3LxuNK5
\nVqVhGEZFLMaRBJHd/NpltkvrfZasX8I979zDkouXJOUaO+Q7RM+8nmmXyzCM1oXFOFoJwVCQWW/P\nYuqIqUkZjYjBtNFUhmE0NmY4mgkvffES7TLb8cMBP0yqvjfopUNWh6iRMf9tDNNFDNNFDNNF6jDD\n0QzwBX3ct+I+fvn9XyY9iS8QDNAhq0OaJTMMw6iKxTiSIN0xjj9+9EcWf7WY5859Lqn6IQ3hDXgZ\nUDjAZosbhlFnmu1+HEZylPnLePC9B3ni7CeSbhOZu2FGwzCMpsBcVU3M/DXz+V6P7zG4++Ck2wRD\nQdplVOz9mP82hukihukihukidViPowk54D3AI6sf4cXzX0y6TSAUINOdmdYdBw3DMGrCYhxJkK4Y\nx73v3MvWg1t5YNwDSbcp9ZfSJbcL+dn5KZXFMIy2g8U4Wii7S3cz/6P5vPrTV+vULqQhm/BnGEaT\nYjGOJuKhVQ9xzlHn0Du/d9JtfEEfuZ7ES4yY/zaG6SKG6SKG6SJ1JNXjEJFcoLeqfpFmedoEWw9s\n5cX/vMjSny2tUztfwEeXDl3SJJVhGEZy1BrjEJGzgHuBLFXtJyLHAneo6lmNIWBNtNQYxy1LbqFT\nbiemf3960m1UlVJ/KUUdixp1iXfDMFofjbFW1UzgBGAfgKp+CAyo7w3bOl/t/YrX1r/Gfw/97zq1\nKw+Uk5+db0bDMIwmJ5m3kF9Vv62UF0qHMG2B+1bcxy+O+wUF2QV1ahcMBcnLrH4HQvPfxjBdxDBd\nxDBdpI5kDMdnIvJTwCMiR4jIQ8CKNMvVKvl056es3LqSy4+9vE7tQhrC4/KQ7clOk2SGYRjJk0yM\nIxf4X2CMk/Ua8BtVLU+zbLXS0mIcF//tYk7pfwqXHntpndqV+csozCmkY07HBt3fMAwD0jyPQ0Q8\nwCJV/QFwa31vYsDK4pWs27uOx896vM5tgxqkfWb7NEhlGIZRd2p0ValqAAiJSN0c8kYFVJW737mb\nm066qc5LhQRCAbLcWWS6M2usZ/7bGKaLGKaLGKaL1JHMPI5DwCci8rpzDqCqOqW2hiIyDngAcAOP\nq+qsSuWdgWeA7o4s96nqfKdsI3AACBIO0A9L5gs1R5ZuXMq+sn2cO/DcOrf1Brx0a9ctDVIZhmHU\nj2RiHJOd00hFIWw4nqqlnRv4AjgN2Aq8D0xU1bVxdWYSnh8y3TEiXwDdVDUgIl8Dx6nq3hru0exj\nHCENMe6Zcdxw4g2cccQZdW5f4ithQOGAhLPFDcMw6kPa16pS1fkikgV8x8n6XFX9SVx7GPCVqm50\nBH0eOBtYG1dnO/BfznkHYI/jHovQ4jecWPjlQjwuD6cffnqd23oDXtpntjejYRhGs6LW4bgiMhr4\nEnjYOdaJyMlJXLsXsCUuXezkxfMY8F0R2QZ8BFwfV6bAP0VktYhckcT9mh2BUIB7V9xbpy1h4/EH\n/eRnJbcKrvlvY5guYpguYpguUkcyP2XvB8ZE1qkSke8AzwPfq6VdMj6kW4E1qjpaRIqA10VksKoe\nBEao6nYR6eLkf66qyytfYPLkyfTr1w+AgoIChgwZwujRo4HYg9LQ9HEnHQfAiuXh6SvDRw5PKn3X\n03eRvSWbkX1G1rm9qvL+O+9T3KGYU35wSkq/T2tPR2gu8jRles2aNc1KnqZMr1mzplnJ05jpZcuW\nMX/+fIDo+7IhJBPj+FhV/6u2vATtTgRmquo4Jz0dCMUHyEXkFeBOVX3HSf8LmKaqqytdawZQoqqz\nK+U32xhHeaCckU+O5JHxjzC059A637PMX0ZeVh5d23Wtc1vDMIyaaIy1qv4tIo+LyGgR+YGIPA6s\nrrVVuM4RItJPRDKBC4CXK9X5nHDwHBHpBhwJbBCRXBHJc/LbEZ58+ElyX6l58PTHT/PdLt+tl9GA\n8NyNDlkdUiyVYRhGw0nGcPw34YD2FOA64DMnr0acIPe1hGea
", "text": [ "" ] } ], "prompt_number": 184 }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can see, the score improves with more training data and reaches almost 97%. We have a limited number of samples, yet at around 150 training examples the score is already close to perfect." ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Step 6: Find the best parameters for our regressor" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's use GridSearchCV to find the best parameters for our model."
] }, { "cell_type": "code", "collapsed": false, "input": [ "from sklearn.linear_model import Ridge\n", "from sklearn.cross_validation import StratifiedKFold\n", "\n", "#load the data\n", "houses=pandas.read_csv('./data/redfin_more_data1.csv', quoting=csv.QUOTE_NONE,names=[\"LIST PRICE\",\"BEDS\",\"BATHS\",\"SQFT\",\"LOT SIZE\",\"YEAR BUILT\",\n", " \"PARKING SPOTS\",\"PARKING TYPE\",\"ORIGINAL LIST PRICE\",\"LAST SALE PRICE\"])\n", "houses.interpolate(inplace=True) #fill missing values by interpolating within each column\n", "y= houses['LAST SALE PRICE'].copy()\n", "houses_features =houses.iloc[:,[0,1,2,3,4,5]].copy()\n", "\n", "#split 80% of data to training set and 20% to test set\n", "houses_train,houses_test, y_train, y_test = \\\n", "train_test_split(houses_features,y,test_size=0.2) #<=== 20% of the samples are used for testing.\n", "\n", "#build the pipeline and use StandardScaler to standardize the features\n", "ridge_regressor_pipeline = Pipeline([\n", "('normalize', preprocessing.StandardScaler()),\n", "('regressor', Ridge())\n", "])\n", "\n", "#candidate parameters for the regressor; each dict is one sub-grid\n", "param_ridge = [\n", "{'regressor__alpha':[0.01, 0.1, 1], 'regressor__solver':['lsqr'],'regressor__tol': [.99], 'regressor__max_iter': [500] },\n", "{'regressor__alpha':[0.01, 0.1, 1], 'regressor__solver':['cholesky'],'regressor__tol': [.99], 'regressor__max_iter': [500] },\n", "]\n", "\n", "#search for the best-fitting params for our regressor\n", "grid_ridge = GridSearchCV(\n", " ridge_regressor_pipeline, #pipeline from above\n", " param_grid=param_ridge, #parameters to tune via cross validation\n", " refit=True, #refit the best regressor on the whole training set\n", " n_jobs=-1,\n", " scoring='r2',\n", " cv=StratifiedKFold(y_train,n_folds=5) #note: stratification targets class labels; KFold is the usual choice for a continuous target\n", ")\n" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 267 }, { "cell_type": "code", "collapsed": false, "input": [ "%time price_predictor = grid_ridge.fit(houses_train,y_train)\n", "print 
(price_predictor.grid_scores_)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "CPU times: user 196 ms, sys: 40.6 ms, total: 236 ms\n", "Wall time: 522 ms\n", "[mean: 0.96209, std: 0.07023, params: {'regressor__tol': 0.99, 'regressor__alpha': 0.01, 'regressor__solver': 'lsqr', 'regressor__max_iter': 500}, mean: 0.96208, std: 0.07026, params: {'regressor__tol': 0.99, 'regressor__alpha': 0.1, 'regressor__solver': 'lsqr', 'regressor__max_iter': 500}, mean: 0.96190, std: 0.07056, params: {'regressor__tol': 0.99, 'regressor__alpha': 1, 'regressor__solver': 'lsqr', 'regressor__max_iter': 500}, mean: 0.98043, std: 0.01240, params: {'regressor__tol': 0.99, 'regressor__alpha': 0.01, 'regressor__solver': 'cholesky', 'regressor__max_iter': 500}, mean: 0.98045, std: 0.01332, params: {'regressor__tol': 0.99, 'regressor__alpha': 0.1, 'regressor__solver': 'cholesky', 'regressor__max_iter': 500}, mean: 0.97933, std: 0.01996, params: {'regressor__tol': 0.99, 'regressor__alpha': 1, 'regressor__solver': 'cholesky', 'regressor__max_iter': 500}]\n" ] } ], "prompt_number": 188 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Judging by the scores above, solver='cholesky' with alpha=0.1 gives the best mean cross-validated R2 (0.98045), barely ahead of alpha=0.01 (0.98043); tol is fixed at 0.99 in every candidate. Let's run our regressor on the test set for a sanity check." ] }, { "cell_type": "code", "collapsed": false, "input": [ "yp = price_predictor.predict(houses_test)\n", "print(\"Variance: %0.02f and R2 Score: %0.02f \\n\" % (explained_variance_score(y_test,yp), r2_score(y_test,yp)))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Variance: 0.97 and R2 Score: 0.95 \n", "\n" ] } ], "prompt_number": 268 }, { "cell_type": "markdown", "metadata": {}, "source": [ "We get an R2 score of 0.95 (explained variance 0.97) on the 20% of samples that we held out as the test set. This test set has never been seen before by our predictor. 
The R2 score is a little lower than the cross-validation scores, but that reflects a real-world scenario with unseen data. Let's predict the price of a house." ] }, { "cell_type": "code", "collapsed": false, "input": [ "import math\n", "data = {\"LIST PRICE\":879000,\n", "\"BEDS\": 3,\n", "\"BATHS\": 2.5,\n", "\"SQFT\": 1900,\n", "\"LOT SIZE\" : 6800,\n", "\"YEAR BUILT\":2015} \n", "house_test = pandas.DataFrame(data,index=[0],columns=[\"LIST PRICE\",\"BEDS\",\"BATHS\",\"SQFT\",\"LOT SIZE\",\"YEAR BUILT\"])\n", "%time yp=price_predictor.predict(house_test)\n", "print(\"Predicted Price: {0}\".format(round(yp[0])))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "CPU times: user 409 \u00b5s, sys: 1 \u00b5s, total: 410 \u00b5s\n", "Wall time: 419 \u00b5s\n", "Predicted Price: 864631.0\n" ] } ], "prompt_number": 269 }, { "cell_type": "markdown", "metadata": {}, "source": [ "For production, we may need to save the model, the parameters, the cross-validation scores and the training scores, and then run the model on a production database. We would then compare the model's training, CV and test scores with its prediction scores on production data to figure out whether the model needs further tuning." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is a basic example of multivariate linear regression to predict a continuous output. What's next?\n", "\n", "I will try to explore the following topics in future posts.\n", "1. Noise detection using Expectation Maximization models.\n", "2. Feature exploration using novelty detection.\n", "3. Feature weighting using polynomial degrees of features.\n", "4. Develop your own multivariate linear regression model, optimize the parameters and predict a continuous output." ] } ], "metadata": {} } ] }