{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\"dsi\n", "\n", "# Electricity production plants - Visualisation Plots\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "Data origins from https://opendata.swiss/en/dataset/elektrizitatsproduktionsanlagen and we use the following .csv files.\n", "\n", "- ElectricityProductionPlant.csv \n", "- MainCategoryCatalogue.csv\n", "- SubCategoryCatalogue.csv\n", "- PlantCategoryCatalogue.csv\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Loading data\n", "\n", "We use `pd.read_csv` to read the csv files into a `DataFrame`. \n", "\n", "After reading we set the index to the corresponding column which makes it easier to join tables and join the (sub)-category names in order to have all information in one `epp` table " ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
AddressPostCodeMunicipalityCantonBeginningOfOperationInitialPowerTotalPowerMainCategorySubCategoryPlantCategory_x_yMainCategoryNameSubCategoryNamePlantCategoryName
xtf_id
5646Rue des Creusets 411948FionnayVS1958-03-071872000.01872000.0maincat_1subcat_1plantcat_62589880.01097661.0WasserkraftWasserkraftPumpspeicherkraftwerk
5686Binenweg 53904NatersVS1969-09-01349576.0349576.0maincat_1subcat_1plantcat_72644115.01131390.0WasserkraftWasserkraftSpeicherkraftwerk
5726Robbia 504G7741San CarloGR1910-11-0329150.029150.0maincat_1subcat_1plantcat_22801863.01136379.0WasserkraftWasserkraftAusleitkraftwerk
5727Via Principale 167744CampocolognoGR1907-03-0155000.055000.0maincat_1subcat_1plantcat_72808646.01123676.0WasserkraftWasserkraftSpeicherkraftwerk
5730Büdemli 65B7240KüblisGR1922-01-0144200.044200.0maincat_1subcat_1plantcat_72778481.01198505.0WasserkraftWasserkraftSpeicherkraftwerk
\n", "
" ], "text/plain": [ " Address PostCode Municipality Canton \\\n", "xtf_id \n", "5646 Rue des Creusets 41 1948 Fionnay VS \n", "5686 Binenweg 5 3904 Naters VS \n", "5726 Robbia 504G 7741 San Carlo GR \n", "5727 Via Principale 16 7744 Campocologno GR \n", "5730 Büdemli 65B 7240 Küblis GR \n", "\n", " BeginningOfOperation InitialPower TotalPower MainCategory \\\n", "xtf_id \n", "5646 1958-03-07 1872000.0 1872000.0 maincat_1 \n", "5686 1969-09-01 349576.0 349576.0 maincat_1 \n", "5726 1910-11-03 29150.0 29150.0 maincat_1 \n", "5727 1907-03-01 55000.0 55000.0 maincat_1 \n", "5730 1922-01-01 44200.0 44200.0 maincat_1 \n", "\n", " SubCategory PlantCategory _x _y MainCategoryName \\\n", "xtf_id \n", "5646 subcat_1 plantcat_6 2589880.0 1097661.0 Wasserkraft \n", "5686 subcat_1 plantcat_7 2644115.0 1131390.0 Wasserkraft \n", "5726 subcat_1 plantcat_2 2801863.0 1136379.0 Wasserkraft \n", "5727 subcat_1 plantcat_7 2808646.0 1123676.0 Wasserkraft \n", "5730 subcat_1 plantcat_7 2778481.0 1198505.0 Wasserkraft \n", "\n", " SubCategoryName PlantCategoryName \n", "xtf_id \n", "5646 Wasserkraft Pumpspeicherkraftwerk \n", "5686 Wasserkraft Speicherkraftwerk \n", "5726 Wasserkraft Ausleitkraftwerk \n", "5727 Wasserkraft Speicherkraftwerk \n", "5730 Wasserkraft Speicherkraftwerk " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import pandas as pd\n", "import seaborn as sns\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "\n", "plt.rcParams[\"figure.figsize\"] = (20, 10)\n", "\n", "epp = pd.read_csv('../../data-science-course/data/ch.bfe.elektrizitaetsproduktionsanlagen/ElectricityProductionPlant.csv', parse_dates=['BeginningOfOperation']).set_index('xtf_id')\n", "mainCat = pd.read_csv('../../data-science-course/data/ch.bfe.elektrizitaetsproduktionsanlagen/MainCategoryCatalogue.csv').set_index('Catalogue_id')\n", "subCat = pd.read_csv('../../data-science-course/data/ch.bfe.elektrizitaetsproduktionsanlagen/SubCategoryCatalogue.csv').set_index('Catalogue_id')\n", "plantCat = pd.read_csv('../../data-science-course/data/ch.bfe.elektrizitaetsproduktionsanlagen/PlantCategoryCatalogue.csv').set_index('Catalogue_id')\n", "\n", "lang='de'\n", "\n", "epp = epp.merge(mainCat[lang].rename(\"MainCategoryName\"),how='left', left_on='MainCategory', right_index=True)\n", "epp = epp.merge(subCat[lang].rename(\"SubCategoryName\"),how='left', left_on='SubCategory', right_index=True)\n", "epp = epp.merge(plantCat[lang].rename(\"PlantCategoryName\"),how='left', left_on='PlantCategory', right_index=True)\n", "\n", "display(epp.head())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "***\n", " ## Pie chart\n", "\n", "
\n", "Exercise: Plot the energy mix in a pie chart a Series \n", "\n", "1. Create a pie chart that displays the relative contribution of the main categories to the total power produced in Switzerland.\n", "\n", "2. Create a pie chart that displays the relative contribution of the main categories to the total number of power plants in Switzerland.\n", " \n", "3. (Optional) Can you plot both figures as subplots in one plot?\n", "\n", "4. (Optional) Plot the relative contribution for the sub categories for `maincat_2`.\n", "\n", " \n", " \n", "
\n", "\n", "*Hints*\n", "- To prepare the necessary DataFrame you can use the `DataFrame.groupby()` function and sum on the `TotalPower` column: `epp.groupby('MainCategoryName')['TotalPower'].sum()`\n", "- To plot you can use directly the function `Series.plot.pie()` [Pandas Doc](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.plot.pie.html)\n", "- To count instead of sum a grouped value use `.count()`\n", "\n", "***\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "***\n", " ## Bar chart\n", "\n", "
\n", "Exercise: Plot the energy data by canton \n", "\n", "1. Use `sns.countplot` to create a bar chart that displays the number of fossil fuel power plants (`epp.MainCategory == 'maincat_4'`) for each canton. Can you also display the different sub-categories?\n", "\n", "2. Create a stacked bar chart that displays the contribution to alternative energies (`epp.MainCategory == 'maincat_2'`) for each canton.\n", " \n", "3. (Optional) Can you create a similar bar chart with seaborn, i.e. `sns.barplot()`? P\n", " \n", "4. (Optional) Play around with the different `seaborn` themes `darkgrid`, `whitegrid`, `dark`, `white`, and `ticks`.\n", " \n", " \n", "
\n", "\n", "*Hints*\n", "- With `sns.countplot` you can provide your entier dataset \n", "- To prepare the necessary DataFrame you can use the `DataFrame.groupby()` function and sum on the `TotalPower` column: `epp.groupby(['Canton', 'SubCategoryName'], as_index=False)['TotalPower'].sum()`. Furthermore, you can use `DataFrame.pivot_table()` to explode the sub categories into columns. \n", "\n", "\n", "***\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "***\n", " ## History Plots\n", "\n", "
\n", "Exercise: Plot the energy mix over time \n", "\n", "**Step 1**: Create a line plot over the time that cumulates the energy that is produced until then.\n", "\n", "**Step 2**: Create a line plot with multiple lines. Each line represents the cumulative sum of the energy produced by one main category \n", "\n", "**Step 3**: Create an area plot with the same data as in step 2. With the area plot you can show the individual as well as the total values.\n", " \n", "
\n", "\n", "*Hints*\n", "- **Step 1**:\n", " - With `df = epp.groupby('BeginningOfOperation')['TotalPower' ].sum().sort_index()` you can create a `DataFrame` with the added `TotalPower` for each `BeginningOfOperations` date\n", " - use the `df.cumsum` function to return the cumulative sum that you can then plot directly with pandas into a `df.plot.line` plot.\n", "\n", "- **Step 2**:\n", " - Do the `groupby` not only by `BeginningOfOperation` but by `['BeginningOfOperation', 'MainCategoryName']`. Use `as_index=False` to have a `DataFrame` as a result of the grouping and not a `Series`\n", " - use `DataFrame.pivot_table()` to explode the main categories into columns.\n", " - use `data.replace(np.nan, 0)` to replace missing values with 0\n", "***\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.2" } }, "nbformat": 4, "nbformat_minor": 2 }