{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*This is a Jupyter Notebook. It is an interactive document that contains both rich text elements such as figures, links, equations, etc. and executable code - in this case Python code (the grey boxes).\n",
    "**How to use a Jupyter Notebook**: You can execute the blocks of code one at the time by placing the mouse in the grey box and pressing shift + enter. An asterisk will appear in the brackets at the top left of the box while the code is being exectued (this may take few seconds) and turns into a number when the execution is over. Alternatively, you can run all the code in the Notebook in a single step by clicking on the menu Cell -> Run All.*\n",
    "\n",
    "# What are the most important parameters in a mathematical model?\n",
    "Mathematical models often encompass a large number of parameters. The value of these parameters for a particular application are decided by the modeller based on information about the quantities that the parameters represent (for example, measurements from field work or lab experiments). Alternatively, if the modeller possesses some measurements of the system inputs and outputs, then the parameter values can be inferred by finding the values that makes the model best fit those measurements (this is called model 'calibration'). In both cases, determining the 'right' parameter values is often difficult and time consuming, as it may require acquiring and handling a lot of data and/or using of a lot of computing power (if the model is calibrated using a computer algorithm). And yet, parameter estimates may still be uncertain if the available data are sparse or affected by large measurement errors. In this Notebook we see how Global Sensitivity Analysis (GSA) can help in this context.\n",
    "\n",
    "Specifically, we will show how GSA can determine if any of the model parameters have negligible influence on the output we are interested in predicting (or for which we have measurements). If this is the case, then we do not need to invest too much efforts in the estimation of these uninfluential parameters, and we can focus instead on the smaller set of influential parameters, thus saving time while achieving better results. Indeed applications of GSA have shown that the set of parameters controlling a particular output variable is often quite small, although different sets of parameters may control different outputs. Also, if GSA shows that some parameters are uninfluential for all model outputs, then this may mean that the model is unnecessarily complex and give the modeller indications on how to simplify it.\n",
    "\n",
    "## A hydrological example\n",
    "In this Notebook we will apply GSA to determine the influential and uninfluential parameters of a hydrological model. For the sake of illustration, we will use a very simple model with only five parameters. But what is a hydrological model in the first place? Imagine that we want to predict the water flows that drain into a river from the amount of rainfall that has fallen in the river catchment area. For this purpose, we can use a hydrological model, which describes the hydrological processes occuring in the catchment (such as evaporation, surface and underground flows, etc.) via a set of equations. These equations encompass various parameters, describing the properties of the catchment, such as the vegetation and soil characteristics. \n",
    "\n",
    "<left><img src=\"util/hydrological/rain_runoff_diagram8.gif\" width=\"700px\">\n",
    "    \n",
    "Above: schematic of the main hydrological processes occuring in a catchment. Below: schematic of how the processes are represented in a hydrological model to finally predict the river flow.\n",
    "\n",
    "<left><img src=\"util/hydrological/hymod_diagram_simple4.gif\" width=\"1000px\">\n",
    "\n",
    "Our simple hydrological model (which is an adaptation of the HyMOD model *(Ref. 1)*) has 5 parameters:\n",
    "- Soil storage capacity (mm): capacity of the soil to store rainwater\n",
    "- Evaporation ratio: proportion of rainwater that is evaporated\n",
    "- Infiltration rate: proportion of effective rain that actually infiltrates into the soil\n",
    "- Travel time - surface flow (days): average time for the surface water to reach the river \n",
    "- Travel time - underground flow (days): average time for the underground water to reach the river\n",
    "\n",
    "<left><img src=\"util/hydrological/rain_runoff_param.png\" width=\"400px\">\n",
    "    \n",
    "## Estimating the model parameters\n",
    "Now imagine we have a time series of measurements of daily rainfall and river flow for a particular catchment. We can use these data to calibrate our model, that is, to find the parameter values that make the model best fit the data. The simplest way to do this is by changing the parameter values one-at-a-time and looking at how this changes the model predictions. Can we fit the predicted river flows to the measured ones?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "303c4f456f6d4a9a807ba197b4225ada",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "VBox(children=(VBox(children=(FloatSlider(value=50.0, continuous_update=False, description='Soil storage capac…"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from ipywidgets import widgets\n",
    "from util.hydrological.hydrological_interactive import hydrological_model\n",
    "\n",
    "Soil_sto,Evap_rate,Inf_rate, Time_surf,Time_under,slider_box_layout,\\\n",
    "           fig_sto,fig_flo,fig_hyd,hbox_layout,vbox_layout,param_obs = hydrological_model()\n",
    "\n",
    "widgets.VBox([widgets.VBox([Soil_sto,Evap_rate,Inf_rate, Time_surf,Time_under],layout=slider_box_layout),\n",
    "              widgets.HBox([fig_sto,fig_flo,fig_hyd],layout=hbox_layout)],layout = vbox_layout)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The interactive plot above also reports a quantitative indicator of the model fit-to-the data, or better, lack of fit: the calibration error. This is the sum of the absolute differences between predicted and measured river flows over the whole time series. Ideally we would like the error to be zero; more realistically, we would want to reduce it as much as possible. However, finding a good set of parameters by varying each of them one-at-a-time is time consuming.\n",
    "\n",
    "## Finding influential parameters by GSA\n",
    "The search of a good parameters combination would be easier if we could focus on fewer of them, the ones that mostly influence the model predictions. To identify such parameters, we can apply Global Sensitivity Analysis (GSA). Here we use one particular GSA method, PAWN *(Ref. 2)*. It provides a sensitivity index for each model parameter. The index measures the relative influence of that parameter on the calibration error: the lower the index, the smaller the influence. The method also provides a threshold value to identify uninfluential parameters: if the index of a parameter is below the threshold, then the parameter can be considered uninfluential. Now run the cell below to obtain the sensivity indices and the influence threshold (black dotted line).\n",
    "<left><img src=\"util/hydrological/hyd_model_screening_figure.png\" width=\"600px\">\n",
    "Can any of the parameters be neglected?\n",
    "## Estimating the model parameters, again...\n",
    "Now try to calibrate the hydrological model but only focusing on the parameters that, according to GSA, are influential. For your reference, consider that a **calibration error <50 ML** is a good result, and an **error <35 ML** is very good. Can you get a **good** calibration by only **changing the value of 2 parameters**? and a **very good** calibration by only **changing 3 parameters**?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "61822339b9444b48bc7bb474229b2d5f",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "VBox(children=(VBox(children=(FloatSlider(value=50.0, continuous_update=False, description='Soil storage capac…"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "Soil_sto,Evap_rate,Inf_rate, Time_surf,Time_under,slider_box_layout,\\\n",
    "           fig_sto,fig_flo,fig_hyd,hbox_layout,vbox_layout,param_obs = hydrological_model()\n",
    "\n",
    "widgets.VBox([widgets.VBox([Soil_sto,Evap_rate,Inf_rate, Time_surf,Time_under],layout=slider_box_layout),\n",
    "              widgets.HBox([fig_sto,fig_flo,fig_hyd],layout=hbox_layout)],layout = vbox_layout)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Another piece of information provided by GSA is about the possible interactions between the model parameters. For example, the coloured scatter plots below are a qualitative tool to analyse such interactions. Each point in the plots corresponds to a possible parameters combination (you can read the parameter values on the horizontal and vertical axes). The color tells us the calibration error of the model when run with that combination of parameters.\n",
    "<left><img src=\"util/hydrological/hyd_model_mapping_figure.png\" width=\"600px\">\n",
    "### References\n",
    "1. [HyMOD model - Boyle (2003)](https://doi.org/10.1002/9781118665671.ch14)\n",
    "2. [PAWN method - Pianosi and Wagener (2018)](https://doi.org/10.1016/j.envsoft.2018.07.019)\n",
    "3. [Global Sensitivity Analysis . The primer - Saltelli et al. (2008)](http://www.andreasaltelli.eu/file/repository/A_Saltelli_Marco_Ratto_Terry_Andres_Francesca_Campolongo_Jessica_Cariboni_Debora_Gatelli_Michaela_Saisana_Stefano_Tarantola_Global_Sensitivity_Analysis_The_Primer_Wiley_Interscience_2008_.pdf) "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  },
  "varInspector": {
   "cols": {
    "lenName": 16,
    "lenType": 16,
    "lenVar": 40
   },
   "kernels_config": {
    "python": {
     "delete_cmd_postfix": "",
     "delete_cmd_prefix": "del ",
     "library": "var_list.py",
     "varRefreshCmd": "print(var_dic_list())"
    },
    "r": {
     "delete_cmd_postfix": ") ",
     "delete_cmd_prefix": "rm(",
     "library": "var_list.r",
     "varRefreshCmd": "cat(var_dic_list()) "
    }
   },
   "types_to_exclude": [
    "module",
    "function",
    "builtin_function_or_method",
    "instance",
    "_Feature"
   ],
   "window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}