{ "metadata": { "name": "", "signature": "sha256:239456687a244ac88cbca278abd216a6ecb16bf0f754b7b9283117213bd9a4c7" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Prediction Using Different Machine Learning Methods\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Support Vector Regression and Cross Validation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "In this notebook, we'll train a Support Vector Regression (SVR) model to predict building energy consumption from historical energy data, several weather variables, hour of the day, day of the week, weekends and holidays. \n", "\n", "To do this, we'll fit the model to daily and hourly energy and weather data from 2012-01-01 to 2014-10-31 and compute the mean squared residual of the predictions.\n", "\n", "During development, we used cross-validation to fine-tune the SVR parameters. Because SVR takes a long time to train, in this final notebook we'll set the parameters to the optimal values found with cross-validation. 
We'll still show the range of parameters that was provided as input to cross-validation.\n" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# special IPython command to prepare the notebook for matplotlib\n", "%matplotlib inline \n", "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "from sklearn.svm import SVR\n", "from sklearn.grid_search import GridSearchCV\n", "from sklearn import cross_validation\n", "from sklearn import grid_search\n", "\n", "pd.options.display.mpl_style = 'default'" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 1 }, { "cell_type": "code", "collapsed": false, "input": [ "# load each dataset and drop the interval-boundary columns, which are not used as features\n", "dailyElectricity = pd.read_excel('Data/dailyElectricityWithFeatures.xlsx')\n", "dailyElectricity = dailyElectricity.drop(['startDay', 'endDay'], axis=1)\n", "\n", "dailyChilledWater = pd.read_excel('Data/dailyChilledWaterWithFeatures.xlsx')\n", "dailyChilledWater = dailyChilledWater.drop(['startDay', 'endDay'], axis=1)\n", "\n", "dailySteam = pd.read_excel('Data/dailySteamWithFeatures.xlsx')\n", "dailySteam = dailySteam.drop(['startDay', 'endDay'], axis=1)\n", "\n", "hourlyElectricity = pd.read_excel('Data/hourlyElectricityWithFeatures.xlsx')\n", "hourlyElectricity = hourlyElectricity.drop(['startTime', 'endTime'], axis=1)\n", "\n", "hourlyChilledWater = pd.read_excel('Data/hourlyChilledWaterWithFeatures.xlsx')\n", "hourlyChilledWater = hourlyChilledWater.drop(['startTime', 'endTime'], axis=1)\n", "\n", "hourlySteam = pd.read_excel('Data/hourlySteamWithFeatures.xlsx')\n", "hourlySteam = hourlySteam.drop(['startTime', 'endTime'], axis=1)\n", "\n", "# display the first rows of one dataframe\n", "dailyElectricity.head()" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
\n", "| | electricity-kWh | RH-% | T-C | Tdew-C | pressure-mbar | solarRadiation-W/m2 | windDirection | windSpeed-m/s | humidityRatio-kg/kg | coolingDegrees | heatingDegrees | dehumidification | occupancy |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2012-01-01 | 2800.244977 | 76.652174 | 7.173913 | 3.073913 | 1004.956522 | 95.260870 | 236.086957 | 4.118361 | 0.004796 | 0.086957 | 7.826087 | 0 | 0.0 |
| 2012-01-02 | 3168.974047 | 55.958333 | 5.833333 | -2.937500 | 994.625000 | 87.333333 | 253.750000 | 5.914357 | 0.003415 | 0.000000 | 9.166667 | 0 | 0.3 |
| 2012-01-03 | 5194.533376 | 42.500000 | -3.208333 | -12.975000 | 1002.125000 | 95.708333 | 302.916667 | 6.250005 | 0.001327 | 0.000000 | 18.208333 | 0 | 0.3 |
| 2012-01-04 | 5354.861935 | 41.541667 | -7.083333 | -16.958333 | 1008.250000 | 98.750000 | 286.666667 | 5.127319 | 0.000890 | 0.000000 | 22.083333 | 0 | 0.3 |
| 2012-01-05 | 5496.223993 | 46.916667 | -0.583333 | -9.866667 | 1002.041667 | 90.750000 | 258.333333 | 5.162041 | 0.001746 | 0.000000 | 15.583333 | 0 | 0.3 |\n", "