{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Return period\n", "> Let's calculate the intensity of once-in-a-century rainfall\n", "\n", "- toc: false \n", "- badges: true\n", "- comments: false\n", "- categories: [jupyter]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bilbao, Spain\n", "\n", "![](hydrology_figures/bilbao-map.png)\n", "\n", "## Today\n", "\n", "![](https://upload.wikimedia.org/wikipedia/commons/9/90/Guggenheim_museum_Bilbao_HDR-image.jpg)\n", "\n", "## August 1983\n", "\n", "![](https://www.mascontext.com/wp-content/uploads/2017/02/30_31_turning_point_one_1983_flooding_06.jpg)\n", "\n", "[more photos](https://www.mascontext.com/issues/30-31-bilbao/turning-point-one1983-flooding/)\n", "\n", "On Friday, August 26, 1983, Bilbao was celebrating its Aste Nagusia or Great Week, the main annual festivity in the city, when it and other municipalities of the Basque Country, Burgos, and Cantabria suffered devastating flooding due to heavy rains. In 24 hours, the volume of water registered 600 liters per square meter. Across all the affected areas, the weather service recorded 1.5 billion tons of water. In areas of Bilbao, the water reached a height of 5 meters (15 feet). Transportation, electricity and gas services, drinking water, food, telephone, and many other basic services were severely affected. 32 people died in Biscay, 4 people died in Cantabria, 2 people died in Alava, and 2 people died Burgos. 5 more people went missing.\n", "\n", "## How often will such rainfall happen?\n", "\n", "How often does it rain 50 mm in 1 day? What about 100 mm in 1 day? How big is a \"once-in-a-century event\"?\n", "\n", "Let's examine Bilbao's daily rainfall (mm), between 1947 to 2021\n", "\n", "![](hydrology_figures/bilbao-1947-2021.png)\n", "\n", "On the week of 22-28 August 1983, Bilbao's weather station measured 4.5 m of rainfall!\n", "\n", "![](hydrology_figures/bilbao-august-1983.png)\n", "\n", "Let's analyze this data and find out how rare such events are. First we need to find the annual maximum for each hydrological year." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.VConcatChart(...)" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#collapse-hide\n", "\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import pandas as pd\n", "\n", "df = pd.read_csv('BILBAO_daily.csv', sep=\",\")\n", "df['DATE'] = pd.to_datetime(df['DATE'])\n", "df = df.set_index('DATE')\n", "# IMPORTANT!! daily precipitation data is in tenths of mm, divide by 10 to get it in mm.\n", "df['PRCP'] = df['PRCP'] / 10\n", "\n", "import altair as alt\n", "alt.data_transformers.disable_max_rows()\n", "\n", "# Altair only recognizes column data; it ignores index values.\n", "# You can plot the index data by first resetting the index\n", "# I know that I've just made 'DATE' the index, but I want to have this here nonetheless so I can refer to this in the future\n", "df_new = df.reset_index()#.replace({0.0:np.nan})\n", "source = df_new[['DATE', 'PRCP']]\n", "\n", "brush = alt.selection(type='interval', encodings=['x'])\n", "\n", "base = alt.Chart(source).mark_line().encode(\n", " x = 'DATE:T',\n", " y = 'PRCP:Q'\n", ").properties(\n", " width=600,\n", " height=200\n", ")\n", "\n", "upper = base.encode(\n", " alt.X('DATE:T', scale=alt.Scale(domain=brush)),\n", " alt.Y('PRCP:Q', scale=alt.Scale(domain=(0,100)))\n", ")\n", "\n", "lower = base.properties(\n", " height=60\n", ").add_selection(brush)\n", "\n", "alt.vconcat(upper, lower)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will consider a hydrological year starting on 1 August.\n", "![](hydrology_figures/monthly_average_bilbao.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Histogram of annual maximum events\n", "![](hydrology_figures/hist_count_cumulative_bilbao.png)\n", "\n", "![](hydrology_figures/pdf_cdf_bilbao.png)\n", "\n", "How many years, **on average**, do we have to wait to get an annual maximum above a given threshold?\n", "\n", "![](hydrology_figures/return_prob_050.png)\n", "\n", "![](hydrology_figures/return_prob_033.png)\n", "\n", "![](hydrology_figures/return_prob_020.png)\n", "\n", "![](hydrology_figures/return_prob_010.png)\n", "\n", "![](hydrology_figures/return_prob_005.png)\n", "\n", "Now everything together in one gif:\n", "\n", "![](hydrology_figures/return_prob.gif)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Return Period\n", "\n", "We will follow Brutsaert's derivation (\"Hydrology, an introduction\", page 513). It defines quantities is a little different from what we did above.\n", "\n", "$F(x)$ is the CDF of the PDF $f(x)$. $F(x)$ indicates the non-exceedance probability, i.e., the probability that a certain event above $x$ has **not** occurred (or that an event below $x$ has occurred, same thing). Modifying the graph shown above, we have\n", "\n", "![](hydrology_figures/return_prob_050_reversed.png)\n", "\n", "\n", "$1-F(x)$ is the probability that a certain event above $x$ *has* occurred. It's reciprocal is the return period:\n", "\n", "$$\n", "T_r(x) = \\frac{1}{1-F(x)}\n", "$$\n", "\n", "This return period is the expected number of observations required until $x$ is exceeded once.\n", "In our case, we can ask the question: how many years will pass (on average) until we see a rainfall event greater that that of 26 August 1983?\n", "\n", "Let's call $p=F(x)$ the probability that we measured once and that an event greater than $x$ has *not* occurred.\n", "What is the probability that a rainfall above $x$ will occur only on year number $k$? \n", "* it hasn't occurred on year 1 (probability p)\n", "* it hasn't occurred on year 2 (probability p)\n", "* it hasn't occurred on year 3 (probability p)\n", "* ...\n", "* it has occurred on year k (probability 1-p)\n", "\n", "$P\\{k \\text{ trials until }X>x\\} = p^{k-1}(1-p)$\n", "\n", "Every time the number $k$ will be different. What will be $k$ *on average*?\n", "\n", "$$\\bar{k} = \\displaystyle\\sum_{k=1}^{\\infty} k P(k) = \\displaystyle\\sum_{k=1}^{\\infty} k p^{k-1}(1-p)$$\n", "\n", "Let's open that up:\n", "\n", "$$\n", "\\begin{align}\n", "\\bar{k} &= 1-p + 2p(1-p) + 3p^2(1-p) + 4p^3(1-p)+ \\cdots\\\\\n", "\\bar{k} &= 1-p + 2p - 2p^2 + 3p^2 - 3p^4 + 4p^3 - 4p^4+ \\cdots \\\\\n", "\\bar{k} &= 1 + p + p^2 + p^3 + p^4 + \\cdots\n", "\\end{align}\n", "$$\n", "\n", "For $p<1$, the series converges to\n", "$$\n", "1 + p + p^2 + p^3 + p^4 + \\cdots = \\frac{1}{1-p},\n", "$$\n", "therefore\n", "$$\n", "\\bar{k} = \\frac{1}{1-p}.\n", "$$\n", "\n", "We conclude that if we know the exceedance probability, we immediately can say what the return times are. We now need a way of estimating this exceedance probability." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Plotting Position\n", "\n", "Source: **Brutsaert, Hydrology, pages 514-516** \n", "\n", "The **Plotting Position** is used as an estimate of the exceedance probability. Many formulas have been suggested (see source above), we will use the **Weibull plotting position**:\n", "\n", "$P_m=$ plotting position, or probability of occurence for each event \n", "$n=$ total number of events \n", "$m=$ rank of each event, where $m=1$ is the lowest value, and $m=n$ is the highest\n", "\n", "\n", "### Return period:\n", "$$\n", "\\text{Return period} = T_r = \\frac{1}{1-P_m}\n", "$$\n", "\n", "#### Weibull plotting position:\n", "\n", "$$\n", "P_m = \\frac{m}{n+1}\n", "$$\n", "\n", "Now let's calculate that for Bilbao:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
PRCPmPmTr
DATE
2011-07-3127.010.0131581.013333
2002-07-3128.520.0263161.027027
2021-07-3135.830.0394741.041096
2001-07-3138.640.0526321.055556
2004-07-3141.150.0657891.070423
...............
2010-07-31108.1710.93421115.200000
1960-07-31137.2720.94736819.000000
1964-07-31143.5730.96052625.333333
1954-07-31172.6740.97368438.000000
1984-07-31252.6750.98684276.000000
\n", "

75 rows × 4 columns

\n", "
" ], "text/plain": [ " PRCP m Pm Tr\n", "DATE \n", "2011-07-31 27.0 1 0.013158 1.013333\n", "2002-07-31 28.5 2 0.026316 1.027027\n", "2021-07-31 35.8 3 0.039474 1.041096\n", "2001-07-31 38.6 4 0.052632 1.055556\n", "2004-07-31 41.1 5 0.065789 1.070423\n", "... ... .. ... ...\n", "2010-07-31 108.1 71 0.934211 15.200000\n", "1960-07-31 137.2 72 0.947368 19.000000\n", "1964-07-31 143.5 73 0.960526 25.333333\n", "1954-07-31 172.6 74 0.973684 38.000000\n", "1984-07-31 252.6 75 0.986842 76.000000\n", "\n", "[75 rows x 4 columns]" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#collapse-hide\n", "\n", "# resample daily data into yearly data (maximum yearly value)\n", "max_annual = df['PRCP'].resample('A-JUL').max().to_frame()\n", "# sort yearly max from lowest to highest\n", "max_annual = max_annual.sort_values(by=['PRCP'], ascending=True)\n", "max_annual['m'] = np.arange(1, len(max_annual) + 1)\n", "n = len(max_annual['m'])\n", "max_annual['Pm'] = max_annual['m'] / (n+1)\n", "max_annual['Tr'] = 1 / (1 - max_annual['Pm'])\n", "max_annual" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "How well does $P_m$ approximate $F(x)$?\n", "\n", "![](hydrology_figures/weibull_plotting_position.png)\n", "\n", "We can now see in this graph how long it takes, **on average**, for an annual maximum event above any threshold.\n", "\n", "![](hydrology_figures/annual_max_vs_return_period.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For times longer than $n$, we need to extrapolate from the curve above.\n", "\n", "![](hydrology_figures/extrapolation_exceedance1.png)\n", "\n", "![](hydrology_figures/extrapolation_exceedance2.png)\n", "\n", "![](hydrology_figures/extrapolation_exceedance3.png)\n", "\n", "![](hydrology_figures/extrapolation_exceedance4.png)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.7" } }, "nbformat": 4, "nbformat_minor": 4 }