{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Combine national and NERC data\n", "Take the various file (EIA facilities, EIA state-level totals, EPA emissions, etc) and combine them to determine total generation, CO₂ emissions, etc nationally and at NERC regions.\n", "\n", "## Instructions\n", "Make sure the `file_date` parameter below is set to whatever value you would like appended to file names.\n", "\n", "In **Calculate the extra gen/fuel consumption at the state levels** add years after 2017 in the `for` loop if needed.\n", "\n", "In **Fraction of estimated gen/fuels in each NERC region** update the years that plant ids for annual plants are fetched from EIA-923 files.\n", "\n", "The entire notebook can be run at once using *Run All Cells*" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T16:59:58.123948Z", "start_time": "2017-11-06T16:59:57.181973Z" }, "collapsed": true }, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "import os\n", "from os.path import join\n", "import sys\n", "import json\n", "\n", "idx = pd.IndexSlice\n", "cwd = os.getcwd()\n", "data_path = join(cwd, '..', 'Data storage')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Date string for filenames\n", "This will be inserted into all filenames (reading and writing)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "file_date = '2018-03-06'" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T16:59:59.331142Z", "start_time": "2017-11-06T16:59:59.317352Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "pandas 0.22.0\n", "numpy 1.14.2\n", "json 2.0.9\n", "CPython 3.6.4\n", "IPython 6.2.1\n" ] } ], "source": [ "%load_ext watermark\n", "%watermark -iv -v" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:00:18.821185Z", "start_time": "2017-11-06T17:00:18.788505Z" }, "collapsed": true }, "outputs": [], "source": [ "# Load the \"autoreload\" extension\n", "%load_ext autoreload\n", "\n", "# always reload modules marked with \"%aimport\"\n", "%autoreload 1" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:00:18.827565Z", "start_time": "2017-11-06T17:00:18.822931Z" }, "collapsed": true }, "outputs": [], "source": [ "# add the 'src' directory as one where we can import modules\n", "src_dir = join(os.getcwd(), os.pardir, 'src')\n", "sys.path.append(src_dir)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:00:18.846980Z", "start_time": "2017-11-06T17:00:18.830372Z" }, "collapsed": true }, "outputs": [], "source": [ "%aimport Data.make_data\n", "from Data.make_data import states_in_nerc\n", "%aimport Analysis.index\n", "from Analysis.index import facility_emission_gen, group_facility_data, add_quarter\n", "%aimport Analysis.index\n", "from Analysis.index import g2lb, change_since_2005, generation_index\n", "%aimport Analysis.index\n", "from Analysis.index import facility_co2, adjust_epa_emissions, group_fuel_cats\n", "%aimport util.utils\n", "from util.utils import rename_cols, add_facility_location" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load data" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:00:25.783044Z", "start_time": "2017-11-06T17:00:18.849846Z" }, "collapsed": true }, "outputs": [], "source": [ "cwd = os.getcwd()\n", "path = join(data_path, 'Derived data',\n", " 'Facility gen fuels and CO2 {}.csv'.format(file_date))\n", "eia_fac = pd.read_csv(path)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:00:26.096215Z", "start_time": "2017-11-06T17:00:25.785006Z" }, "collapsed": true }, "outputs": [], "source": [ "rename_cols(eia_fac)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:00:26.526228Z", "start_time": "2017-11-06T17:00:26.144542Z" }, "collapsed": true }, "outputs": [], "source": [ "path = join(data_path, 'Derived data',\n", " 'Monthly EPA emissions {}.csv'.format(file_date))\n", "epa = pd.read_csv(path)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Adjusted facility emissions and generation\n", "I have both the state and custom fuel categories here, but am only using the state fuel categories for now." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:00:26.546818Z", "start_time": "2017-11-06T17:00:26.528478Z" }, "collapsed": true }, "outputs": [], "source": [ "fuel_cat_folder = join(data_path, 'Fuel categories')\n", "state_cats_path = join(fuel_cat_folder, 'State_facility.json')\n", "\n", "with open(state_cats_path, 'r') as f:\n", " state_fuel_cat = json.load(f)\n", " \n", "custom_cats_path = join(fuel_cat_folder, 'Custom_results.json')\n", "with open(custom_cats_path, 'r') as f:\n", " custom_fuel_cat = json.load(f)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:00:35.435401Z", "start_time": "2017-11-06T17:00:28.197939Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Renaming columns\n", "Grouping facilities\n", "Adjusting EPA emissions\n", "Caculating CO2\n", "Gen/fuels to state categories\n" ] } ], "source": [ "co2, gen_fuels_state = facility_emission_gen(eia_facility=eia_fac, epa=epa,\n", " state_fuel_cat=state_fuel_cat,\n", " custom_fuel_cat=custom_fuel_cat,\n", " export_state_cats=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Extra gen/fuels from non-reporting" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:00:52.946442Z", "start_time": "2017-11-06T17:00:52.942136Z" }, "collapsed": true }, "outputs": [], "source": [ "%aimport Analysis.index\n", "from Analysis.index import extra_emissions_gen" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Total EIA generation/fuel consumption and emission factors" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:00:55.398938Z", "start_time": "2017-11-06T17:00:55.366360Z" }, "collapsed": true }, "outputs": [], "source": [ "cwd = os.getcwd()\n", "path = join(data_path, 'Derived data',\n", " 'EIA country-wide gen fuel CO2 {}.csv'.format(file_date))\n", "eia_total = pd.read_csv(path)\n", "\n", "path = join(data_path,\n", " 'Final emission factors.csv')\n", "ef = pd.read_csv(path, index_col=0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Calculate CO₂, generation, and fuel consumption that is not captured by facilities" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:01:02.921291Z", "start_time": "2017-11-06T17:01:02.438735Z" }, "collapsed": true }, "outputs": [], "source": [ "extra_co2, extra_gen_fuel = extra_emissions_gen(gen_fuels_state, eia_total, ef)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Total CO₂ (national)\n", "Combine adjusted CO₂ at facilities and CO₂ from fuel consumption that is not captured by facility data." ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:01:14.842434Z", "start_time": "2017-11-06T17:01:14.679060Z" }, "collapsed": true }, "outputs": [], "source": [ "facility_co2 = co2.groupby(['year', 'month']).sum()" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:01:28.030583Z", "start_time": "2017-11-06T17:01:28.023099Z" }, "collapsed": true }, "outputs": [], "source": [ "national_co2 = (facility_co2.loc[:, 'final co2 (kg)']\n", " + extra_co2.loc[:, 'elec fuel co2 (kg)']\n", " .groupby(['year', 'month']).sum())\n", "national_co2.name = 'final co2 (kg)'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## National Index and gen by fuels" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Total national generation (all fuels)\n", "Add generation/fuel consumption reported by facilities (with state-level fuel codes) and extra generation/fuel consumption (which also uses state-level fuel codes)." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
plant idtotal fuel (mmbtu)generation (mwh)elec fuel (mmbtu)latlonquarterall fuel fossil co2 (kg)elec fuel fossil co2 (kg)all fuel total co2 (kg)elec fuel total co2 (kg)
typeyearmonth
COW2001185502841.848875e+091.762754e+081.810304e+0925142.050268-53175.85936451.755757e+111.719629e+111.755757e+111.719629e+11
285502841.568672e+091.488241e+081.536017e+0925142.050268-53175.85936451.490065e+111.459476e+111.490065e+111.459476e+11
385502841.626246e+091.541157e+081.591151e+0925142.050268-53175.85936451.544286e+111.511424e+111.544286e+111.511424e+11
485502841.471568e+091.398943e+081.439361e+0925142.050268-53175.859312901.397181e+111.367018e+111.397181e+111.367018e+11
584982591.595267e+091.507916e+081.563698e+0925147.539168-53176.228812901.515424e+111.485856e+111.515424e+111.485856e+11
\n", "
" ], "text/plain": [ " plant id total fuel (mmbtu) generation (mwh) \\\n", "type year month \n", "COW 2001 1 8550284 1.848875e+09 1.762754e+08 \n", " 2 8550284 1.568672e+09 1.488241e+08 \n", " 3 8550284 1.626246e+09 1.541157e+08 \n", " 4 8550284 1.471568e+09 1.398943e+08 \n", " 5 8498259 1.595267e+09 1.507916e+08 \n", "\n", " elec fuel (mmbtu) lat lon quarter \\\n", "type year month \n", "COW 2001 1 1.810304e+09 25142.050268 -53175.8593 645 \n", " 2 1.536017e+09 25142.050268 -53175.8593 645 \n", " 3 1.591151e+09 25142.050268 -53175.8593 645 \n", " 4 1.439361e+09 25142.050268 -53175.8593 1290 \n", " 5 1.563698e+09 25147.539168 -53176.2288 1290 \n", "\n", " all fuel fossil co2 (kg) elec fuel fossil co2 (kg) \\\n", "type year month \n", "COW 2001 1 1.755757e+11 1.719629e+11 \n", " 2 1.490065e+11 1.459476e+11 \n", " 3 1.544286e+11 1.511424e+11 \n", " 4 1.397181e+11 1.367018e+11 \n", " 5 1.515424e+11 1.485856e+11 \n", "\n", " all fuel total co2 (kg) elec fuel total co2 (kg) \n", "type year month \n", "COW 2001 1 1.755757e+11 1.719629e+11 \n", " 2 1.490065e+11 1.459476e+11 \n", " 3 1.544286e+11 1.511424e+11 \n", " 4 1.397181e+11 1.367018e+11 \n", " 5 1.515424e+11 1.485856e+11 " ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gen_fuels_state.groupby(['type', 'year', 'month']).sum().head()" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:01:40.579779Z", "start_time": "2017-11-06T17:01:40.407271Z" }, "collapsed": true }, "outputs": [], "source": [ "national_gen = (gen_fuels_state\n", " .groupby(['type', 'year', 'month'])['generation (mwh)'].sum()\n", " .add(extra_gen_fuel['generation (mwh)'], fill_value=0))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Regroup generation from state codes to my custom fuel codes" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:01:50.832843Z", "start_time": "2017-11-06T17:01:50.784218Z" }, "collapsed": true }, "outputs": [], "source": [ "national_gen = group_fuel_cats(national_gen.reset_index(), custom_fuel_cat,\n", " 'type', 'fuel category').set_index(['fuel category', 'year', 'month'])" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:01:53.709009Z", "start_time": "2017-11-06T17:01:53.695991Z" } }, "outputs": [], "source": [ "total_gen = national_gen.groupby(['year', 'month']).sum()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### National Index" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "national_index = total_gen.copy()\n", "national_index['final co2 (kg)'] = national_co2\n", "national_index['index (g/kwh)'] = (national_index['final co2 (kg)']\n", " / national_index['generation (mwh)'])\n", "national_index.reset_index(inplace=True)\n", "add_quarter(national_index)\n", "g2lb(national_index)\n", "change_since_2005(national_index)" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": true }, "outputs": [], "source": [ "path = join(data_path, 'National data',\n", " 'National index {}.csv'.format(file_date))\n", "national_index.to_csv(path, index=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Percent generation by custom fuel type" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "ExecuteTime": { "end_time": "2017-10-31T19:16:32.688527Z", "start_time": "2017-10-31T19:16:32.649752Z" }, "collapsed": true }, "outputs": [], "source": [ "df_list = []\n", "for fuel in national_gen.index.get_level_values('fuel category').unique():\n", " percent_gen = national_gen.loc[fuel].divide(total_gen, fill_value=0)\n", " percent_gen['fuel category'] = fuel\n", " percent_gen.set_index('fuel category', inplace=True, append=True)\n", " df_list.append(percent_gen)\n", "percent_gen = pd.concat(df_list)" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "collapsed": true }, "outputs": [], "source": [ "path = join(data_path, 'National data',\n", " 'National generation {}.csv'.format(file_date))\n", "national_gen.to_csv(path)\n", "\n", "path = join(data_path, 'National data',\n", " 'National percent gen {}.csv'.format(file_date))\n", "percent_gen.to_csv(path)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Facility state and lat/lon file generation\n", "~~Create a .csv with lat/lon and state code for each facility. Need to manually add the NERC region label.~~ ~~I've done this with a spatial join between the lat/lon and NERC shapefiles in QGIS. It should be possible to do the spatial join in [GeoPandas](http://geopandas.org/). Could also add custom region labels (e.g. eGRID subregions, ISO/RTO boundaries, etc).~~\n", "\n", "This isn't possible with a spatial join because NERC regions aren't strictly based on geography. Instead, I've taken the NERC codes from EIA-860 and assigned unknown plants (mostly those that retired before 2012 when modern NERCs were mostly defined, and those that were assigned a plant id in 2017) NERC labels using a k-nearest neighbors algorithm and lat/lon information.\n", "\n", "See the `Assign NERC region labels` notebook for the creation of this file." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fraction of estimated gen/fuels in each NERC region\n", "\n", "**NOTE** EIA changed the reporting requirements for wind and solar facilities in 2017. I'm now going to use the list of annual facilities from 2017 rather than from 2015, but still use the reported generation by those facilities in 2015." ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:06:53.740258Z", "start_time": "2017-11-06T17:06:53.735390Z" }, "collapsed": true }, "outputs": [], "source": [ "%aimport Data.make_data\n", "from Data.make_data import get_annual_plants" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:07:38.755682Z", "start_time": "2017-11-06T17:07:11.747334Z" }, "collapsed": true }, "outputs": [], "source": [ "annual_ids_2015 = get_annual_plants(2015)" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:07:57.886913Z", "start_time": "2017-11-06T17:07:38.757559Z" }, "collapsed": true }, "outputs": [], "source": [ "annual_ids_2017 = get_annual_plants(2017)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Combine the lists of annual plants in 2015 and 2017. This lets us catch facilities that have gone from monthly to annual since 2015, but it also includes plants that were annual in 2015 an may have retired.\n", "\n", "There is the possibility of an error in allocation to NERC regions for 2016 state-level generation when using the facilities that changed to annual in 2017. But since the state-level generation is so much smaller in 2016 I don't think this is much of an issue." ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:08:20.140523Z", "start_time": "2017-11-06T17:08:20.128770Z" }, "collapsed": true }, "outputs": [], "source": [ "annual_ids = set(annual_ids_2015.tolist() + annual_ids_2017.tolist())" ] }, { "cell_type": "code", "execution_count": 57, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:08:20.153975Z", "start_time": "2017-11-06T17:08:20.142993Z" } }, "outputs": [ { "data": { "text/plain": [ "7146" ] }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(annual_ids)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Add NERC region labels" ] }, { "cell_type": "code", "execution_count": 58, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:08:20.165890Z", "start_time": "2017-11-06T17:08:20.156452Z" }, "collapsed": true }, "outputs": [], "source": [ "%aimport Analysis.state2nerc\n", "from Analysis.state2nerc import fraction_state2nerc, add_region" ] }, { "cell_type": "code", "execution_count": 59, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:08:20.206633Z", "start_time": "2017-11-06T17:08:20.168272Z" }, "collapsed": true }, "outputs": [], "source": [ "cwd = os.getcwd()\n", "path = join(data_path, 'Facility labels',\n", " 'Facility locations_RF.csv')\n", "location_labels = pd.read_csv(path)" ] }, { "cell_type": "code", "execution_count": 60, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:08:20.225873Z", "start_time": "2017-11-06T17:08:20.209690Z" }, "collapsed": true }, "outputs": [], "source": [ "nerc_state_path = join(data_path, 'Derived data',\n", " 'NERC_states.json')\n", "\n", "with open(nerc_state_path, 'r') as f:\n", " nerc_states = json.load(f)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Added the filter that year must be 2015 - was getting all 2015 annual plants, but for all years!" ] }, { "cell_type": "code", "execution_count": 61, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:08:20.746670Z", "start_time": "2017-11-06T17:08:20.229488Z" }, "collapsed": true }, "outputs": [], "source": [ "eia_2015_annual = eia_fac.loc[(eia_fac['plant id'].isin(annual_ids)) & \n", " (eia_fac['year'] == 2015)].copy()\n", "\n", "# Group to state-level fuel categories\n", "eia_2015_annual = group_fuel_cats(eia_2015_annual, state_fuel_cat)" ] }, { "cell_type": "code", "execution_count": 62, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:08:20.808948Z", "start_time": "2017-11-06T17:08:20.748469Z" }, "collapsed": true }, "outputs": [], "source": [ "eia_2015_annual_nerc = add_facility_location(eia_2015_annual, location_labels, \n", " labels=['state', 'nerc', 'year'])" ] }, { "cell_type": "code", "execution_count": 63, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
typeyearmonthplant idtotal fuel (mmbtu)generation (mwh)elec fuel (mmbtu)latlonquarterall fuel fossil co2 (kg)elec fuel fossil co2 (kg)all fuel total co2 (kg)elec fuel total co2 (kg)statenerc
0COW2015170.00.0000.034.012800-85.97080010.00.00.00.0ALSERC
1COW20151472643817.0255715.1712643817.034.743900-87.8486001246668126.1246668126.1246668126.1246668126.1ALSERC
2COW2015159593567.053757.467593567.040.854765-98.348222157694712.457694712.457694712.457694712.4NEMRO
3COW2015179312015.016494.000285618.064.847743-147.735063130327858.027762069.630327858.027762069.6AKASCC
4COW201515080.00.0000.038.033327-102.53791510.00.00.00.0COWECC
\n", "
" ], "text/plain": [ " type year month plant id total fuel (mmbtu) generation (mwh) \\\n", "0 COW 2015 1 7 0.0 0.000 \n", "1 COW 2015 1 47 2643817.0 255715.171 \n", "2 COW 2015 1 59 593567.0 53757.467 \n", "3 COW 2015 1 79 312015.0 16494.000 \n", "4 COW 2015 1 508 0.0 0.000 \n", "\n", " elec fuel (mmbtu) lat lon quarter \\\n", "0 0.0 34.012800 -85.970800 1 \n", "1 2643817.0 34.743900 -87.848600 1 \n", "2 593567.0 40.854765 -98.348222 1 \n", "3 285618.0 64.847743 -147.735063 1 \n", "4 0.0 38.033327 -102.537915 1 \n", "\n", " all fuel fossil co2 (kg) elec fuel fossil co2 (kg) \\\n", "0 0.0 0.0 \n", "1 246668126.1 246668126.1 \n", "2 57694712.4 57694712.4 \n", "3 30327858.0 27762069.6 \n", "4 0.0 0.0 \n", "\n", " all fuel total co2 (kg) elec fuel total co2 (kg) state nerc \n", "0 0.0 0.0 AL SERC \n", "1 246668126.1 246668126.1 AL SERC \n", "2 57694712.4 57694712.4 NE MRO \n", "3 30327858.0 27762069.6 AK ASCC \n", "4 0.0 0.0 CO WECC " ] }, "execution_count": 63, "metadata": {}, "output_type": "execute_result" } ], "source": [ "eia_2015_annual_nerc.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is 2015 data on annual reporting facilities (from both 2015 and 2017)" ] }, { "cell_type": "code", "execution_count": 64, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:08:36.285221Z", "start_time": "2017-11-06T17:08:36.277897Z" }, "collapsed": true }, "outputs": [], "source": [ "# Get a list of all state abbreviations\n", "\n", "all_states = []\n", "for value in nerc_states.values():\n", " all_states.extend(value)\n", "all_states = set(all_states)" ] }, { "cell_type": "code", "execution_count": 65, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:08:38.924387Z", "start_time": "2017-11-06T17:08:36.287865Z" } }, "outputs": [], "source": [ "df_list = []\n", "\n", "for state in all_states:\n", " try:\n", " df_list.append(fraction_state2nerc(eia_2015_annual_nerc,\n", " state, region_col='nerc', fuel_col='type'))\n", " except:\n", " print(state)\n", " pass" ] }, { "cell_type": "code", "execution_count": 66, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:08:38.970750Z", "start_time": "2017-11-06T17:08:38.928892Z" }, "collapsed": true }, "outputs": [], "source": [ "nerc_fraction = pd.concat(df_list)\n", "nerc_fraction.set_index(['state', 'nerc', 'type'], inplace=True)\n", "nerc_fraction.sort_index(inplace=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With the values below I can allocate extra state-level generation and fuel use to each of the NERC regions!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Making sure that no values are greater than 1 (within tolerance)" ] }, { "cell_type": "code", "execution_count": 67, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:16:30.022250Z", "start_time": "2017-11-06T17:16:30.012576Z" } }, "outputs": [ { "data": { "text/plain": [ "% generation False\n", "% total fuel False\n", "% elec fuel False\n", "dtype: bool" ] }, "execution_count": 67, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(nerc_fraction.groupby(['state', 'type']).sum() > 1.0001).any()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Allocate extra gen from the state-level to regions\n", "\n", "I still need to generate state-level total generation and fuel use!" ] }, { "cell_type": "code", "execution_count": 68, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:17:13.954207Z", "start_time": "2017-11-06T17:17:13.950071Z" }, "collapsed": true }, "outputs": [], "source": [ "idx = pd.IndexSlice" ] }, { "cell_type": "code", "execution_count": 69, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:17:21.163335Z", "start_time": "2017-11-06T17:17:21.158742Z" }, "collapsed": true }, "outputs": [], "source": [ "# a dictionary to match column names\n", "nerc_frac_match = {'% generation': 'generation (mwh)',\n", " '% total fuel': 'total fuel (mmbtu)',\n", " '% elec fuel': 'elec fuel (mmbtu)'}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Load state-level total gen/fuel consumption" ] }, { "cell_type": "code", "execution_count": 70, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:17:24.615168Z", "start_time": "2017-11-06T17:17:24.212473Z" }, "collapsed": true }, "outputs": [], "source": [ "path = join(data_path, 'Derived data',\n", " 'EIA state-level gen fuel CO2 {}.csv'.format(file_date))\n", "\n", "state_total = pd.read_csv(path, parse_dates=['datetime'])" ] }, { "cell_type": "code", "execution_count": 71, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:17:26.684456Z", "start_time": "2017-11-06T17:17:26.625319Z" }, "collapsed": true }, "outputs": [], "source": [ "rename_cols(state_total)\n", "state_total['state'] = state_total['geography'].str[-2:]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Simplify the dataframe" ] }, { "cell_type": "code", "execution_count": 72, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:17:33.931547Z", "start_time": "2017-11-06T17:17:33.823487Z" } }, "outputs": [], "source": [ "cols = list(nerc_frac_match.values())\n", "state_total = state_total.groupby(['state', 'year', 'month', 'type'])[cols].sum()" ] }, { "cell_type": "code", "execution_count": 73, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:17:37.589401Z", "start_time": "2017-11-06T17:17:37.584045Z" }, "collapsed": true }, "outputs": [], "source": [ "# list of NERC regions\n", "nercs = nerc_fraction.index.get_level_values('nerc').unique()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Group the facility data to state fuel categories and add state labels" ] }, { "cell_type": "code", "execution_count": 74, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:18:04.557540Z", "start_time": "2017-11-06T17:17:58.399196Z" }, "collapsed": true }, "outputs": [], "source": [ "cols = list(nerc_frac_match.values())\n", "eia_fac_type = group_fuel_cats(eia_fac, state_fuel_cat)\n", "eia_fac_type = add_facility_location(eia_fac_type, location_labels, ['state', 'year'])\n", "eia_fac_type = eia_fac_type.groupby(['state', 'year', 'month', 'type'])[cols].sum()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Calculate the extra gen/fuel consumption at the state levels\n", "**Only worrying about extra generation from 2016 forward.**" ] }, { "cell_type": "code", "execution_count": 75, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:18:29.739841Z", "start_time": "2017-11-06T17:18:29.638943Z" } }, "outputs": [], "source": [ "state_extra = (state_total.loc[idx[:, 2016:, :, :], :]\n", " - eia_fac_type.loc[idx[:, 2016:, :, :], :])\n", "state_extra.dropna(how='all', inplace=True)\n", "state_extra = state_extra.reorder_levels(['year', 'state', 'month', 'type'])\n", "state_extra.sort_index(inplace=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Sort the index of each dataframe to make sure they can be easily combined." ] }, { "cell_type": "code", "execution_count": 76, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:18:29.824834Z", "start_time": "2017-11-06T17:18:29.801409Z" } }, "outputs": [], "source": [ "nerc_fraction.sort_index(inplace=True)" ] }, { "cell_type": "code", "execution_count": 77, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:18:29.851377Z", "start_time": "2017-11-06T17:18:29.826926Z" } }, "outputs": [], "source": [ "state_extra.sort_index(inplace=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a copy of the `nerc_fraction` dataframe with repeated values for every month of the year, so that they MultiIndex matches the `state_extra` MultiIndex" ] }, { "cell_type": "code", "execution_count": 78, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:18:29.913157Z", "start_time": "2017-11-06T17:18:29.853842Z" }, "collapsed": true }, "outputs": [], "source": [ "df_list = []\n", "for month in range(1, 13):\n", " df = nerc_fraction.copy()\n", " df['month'] = month\n", " df.set_index('month', append=True, inplace=True)\n", " df_list.append(df)\n", "\n", "nerc_frac_monthly = pd.concat(df_list, axis=0)\n", "nerc_frac_monthly.sort_index(inplace=True)\n", "nerc_frac_monthly = (nerc_frac_monthly\n", " .reorder_levels(['nerc', 'state', 'month', 'type']))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Cycle through each year (2016 and 2017 in this case) and each NERC, multiplying the state-level extra generation, total fuel consumption, and fuel consumption for electricity by the share that should be allocated to each NERC." ] }, { "cell_type": "code", "execution_count": 79, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:18:31.126440Z", "start_time": "2017-11-06T17:18:29.923008Z" }, "collapsed": true }, "outputs": [], "source": [ "df_list_outer = []\n", "for year in [2016, 2017]:\n", " df_list_inner = []\n", " for nerc in nercs:\n", " df = pd.concat([(nerc_frac_monthly\n", " .loc[nerc]['% generation']\n", " * state_extra\n", " .loc[year]['generation (mwh)']).dropna(),\n", " (nerc_frac_monthly.\n", " loc[nerc]['% total fuel']\n", " * state_extra\n", " .loc[year]['total fuel (mmbtu)']).dropna(),\n", " (nerc_frac_monthly\n", " .loc[nerc]['% elec fuel']\n", " * state_extra\n", " .loc[year]['elec fuel (mmbtu)']).dropna()],\n", " axis=1)\n", " df.columns = nerc_frac_match.values()\n", " df['nerc'] = nerc\n", " df['year'] = year\n", " df = df.groupby(['year', 'nerc', 'month', 'type']).sum()\n", " df_list_inner.append(df)\n", "\n", " df_list_outer.append(pd.concat(df_list_inner))\n", "final = pd.concat(df_list_outer)\n", "final.sort_index(inplace=True)" ] }, { "cell_type": "code", "execution_count": 80, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:18:31.155192Z", "start_time": "2017-11-06T17:18:31.129610Z" }, "collapsed": true }, "outputs": [], "source": [ "nerc_frac_monthly.sort_index(inplace=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Although the dataframe is called `final`, it's really just the final allocated extra state-level generation/fuel consumption" ] }, { "cell_type": "code", "execution_count": 81, "metadata": { "ExecuteTime": { "end_time": "2017-11-06T17:19:01.406924Z", "start_time": "2017-11-06T17:19:01.377498Z" }, "collapsed": true }, "outputs": [], "source": [ "path = join(data_path, 'Derived data',\n", " 'NERC extra gen fuels {}.csv'.format(file_date))\n", "final.to_csv(path)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "psci", "language": "python", "name": "psci" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" }, "varInspector": { "cols": { "lenName": 16, "lenType": 16, "lenVar": 40 }, "kernels_config": { "python": { "delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())" }, "r": { "delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) " } }, "types_to_exclude": [ "module", "function", "builtin_function_or_method", "instance", "_Feature" ], "window_display": false } }, "nbformat": 4, "nbformat_minor": 2 }