{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Uncertainty propagation\n",
    "\n",
    "This notebook demonstrates uncertainty propagation with PyCO2SYS.\n",
    "\n",
    "In this example, we imagine that we've measured dissolved inorganic carbon (DIC) and Total scale pH for four seawater samples.  The DIC and pH values are different from each other, but all other conditions are the same for the samples.  We want to know the seawater <i>p</i>CO<sub>2</sub> in the samples.\n",
    "\n",
    "The instructions follow [the docs](https://pyco2sys.readthedocs.io/en/latest/uncertainty/) for the Pythonic `pyco2.sys` interface.\n",
    "\n",
    "The normal way to run PyCO2SYS (without uncertainties) is as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import PyCO2SYS as pyco2\n",
    "\n",
    "# Set up CO2SYS inputs\n",
    "pyco2_kws = dict(\n",
    "    par1 = np.array([2150.1, 2175.3, 2199.8, 2220.4]),  # dissolved inorganic carbon in μmol/kg\n",
    "    par2 = np.array([ 8.121,  8.082,  8.041,  8.001]),  # pH on the Total scale\n",
    "    par1_type = 2,\n",
    "    par2_type = 3,\n",
    "    salinity = 33.1,\n",
    "    temperature = 25.0,\n",
    "    temperature_out = 25.0,\n",
    "    pressure = 0.0,\n",
    "    pressure_out = 0.0,\n",
    "    total_silicate = 5.0,\n",
    "    total_phosphate = 1.0,\n",
    "    total_ammonia = 0.5,\n",
    "    opt_pH_scale = 1,\n",
    "    opt_k_carbonic = 12,\n",
    ")\n",
    "# Run PyCO2SYS and print out the pCO2 values\n",
    "results = pyco2.sys(**pyco2_kws)\n",
    "print(\"pCO2: \", results[\"pCO2\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Uncertainties from measurements\n",
    "\n",
    "### Constant uncertainties\n",
    "\n",
    "First, we want to know how the uncertainties in our DIC and pH measurements propagate into calculated <i>p</i>CO<sub>2</sub>.  Let's assume that the uncertainties in DIC and pH are (1) independent from each other and (2) constant for all measurements, with a one-standard-deviation uncertainty of 3 μmol/kg for DIC, and 0.005 for pH.\n",
    "\n",
    "We need to assemble:\n",
    "\n",
    "  1. A list of parameters that we want to propagate uncertainties into: `uncertainty_into`.\n",
    "  2. A dict of parameters that we want to propagate uncertaintes from, and their uncertainties: `uncertainty_from`.\n",
    "  \n",
    "For both of these, the entries in the list (or keys in the dict) come from the corresponding fields in the [results dict](https://pyco2sys.readthedocs.io/en/latest/co2sys_nd/#results).\n",
    "\n",
    "**The uncertainties must be *independent* from each other and given in terms of *standard deviations* for the maths to work!**\n",
    "\n",
    "To do the uncertainty propogation, we need to include `uncertainty_into` and `uncertainty_from` as kwargs to `pyco2.sys` (along with all the normal kwargs for the calculations):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Which parameter(s) do we want to propagate uncertainties into?\n",
    "uncertainty_into = [\"pCO2\"]\n",
    "\n",
    "# Define measurement uncertainties\n",
    "uncertainty_from = {\n",
    "    \"par1\": 3,     # uncertainty in dissolved inorganic carbon in μmol/kg\n",
    "    \"par2\": 0.005, # uncertainty in pH\n",
    "}\n",
    "\n",
    "# Propagate and print out results\n",
    "results = pyco2.sys(**pyco2_kws,\n",
    "    uncertainty_into=uncertainty_into, uncertainty_from=uncertainty_from, \n",
    ")\n",
    "print(\"Uncertainty in pCO2 from par1 (i.e. DIC):     \", results[\"u_pCO2__par1\"])\n",
    "print(\"Uncertainty in pCO2 from par2 (i.e. pH):      \", results[\"u_pCO2__par2\"])\n",
    "print(\"Total uncertainty in pCO2 from par1 and par2: \", results[\"u_pCO2\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `results` dict now contains additional keys corresponding to the elements of `uncertainty_into`.  The values with keys `\"u_<result>\"` are the total uncertainty in the corresponding result.  For example, `results[\"u_pCO2\"]` contains the total uncertainty in calculated <i>p</i>CO<sub>2</sub> at input conditions (i.e. `\"pCO2\"`).\n",
    "\n",
    "The component of uncertainty from each argument is also provided.  For example, `result[\"u_pCO2__par1\"]` contains the uncertainty in calculated <i>p</i>CO<sub>2</sub> at input conditions (i.e. `\"pCO2\"`) resulting from uncertainty in `par1` (here, DIC).\n",
    "\n",
    "We can see from the results above that the uncertainty in pH dominates the overall uncertainty in <i>p</i>CO<sub>2</sub>.\n",
    "\n",
    "### Different uncertainties for different samples\n",
    "\n",
    "Now imagine that the first pH measurement was done with a more accurate method than the others, so its uncertainty is smaller.  We can account for this by using an array instead of a single value in the `uncertainty_from` dict.\n",
    "\n",
    "You could also use this approach if `\"par1\"` and/or `\"par2\"` contained a mixture of different parameter types, with different uncertainties."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Redefine measurement uncertainties\n",
    "uncertainty_from = {\n",
    "    \"par1\": 3,\n",
    "    \"par2\": np.array([0.001, 0.005, 0.005, 0.005]),\n",
    "}\n",
    "\n",
    "# Re-propagate and print out results\n",
    "results = pyco2.sys(**pyco2_kws,\n",
    "    uncertainty_into=uncertainty_into, uncertainty_from=uncertainty_from, \n",
    ")\n",
    "print(\"Uncertainty in pCO2 from par1 (i.e. DIC):     \", results[\"u_pCO2__par1\"])\n",
    "print(\"Uncertainty in pCO2 from par2 (i.e. pH):      \", results[\"u_pCO2__par2\"])\n",
    "print(\"Total uncertainty in pCO2 from par1 and par2: \", results[\"u_pCO2\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The lower uncertainty in the first pH measurement has significantly reduced the total uncertainty in calculated <i>p</i>CO<sub>2</sub>.\n",
    "\n",
    "### Other function inputs\n",
    "\n",
    "We can include any of the non-settings arguments of `pyco2.sys` by extending the `uncertainties_from` dict in the same way.  For example, to also get the uncertainty in aragonite saturation state, and to also propagate uncertainties in the temperature (of 0.05 °C) and salinity (of 0.002) values:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Redefine which parameter(s) we want to propagate uncertainties into\n",
    "uncertainty_into = [\"pCO2\", \"saturation_aragonite\"]\n",
    "\n",
    "# Redefine measurement uncertainties\n",
    "uncertainty_from = {\n",
    "    \"par1\": 3,\n",
    "    \"par2\": np.array([0.001, 0.005, 0.005, 0.005]),\n",
    "    \"temperature\": 0.05,\n",
    "    \"salinity\": 0.002,\n",
    "}\n",
    "\n",
    "# Re-propagate and print out results\n",
    "results = pyco2.sys(**pyco2_kws,\n",
    "    uncertainty_into=uncertainty_into, uncertainty_from=uncertainty_from, \n",
    ")\n",
    "print(\"Uncertainty in pCO2 from temperature:                 \", results[\"u_pCO2__temperature\"])\n",
    "print(\"Uncertainty in Omega-aragonite from salinity:         \", results[\"u_saturation_aragonite__salinity\"])\n",
    "print(\"Total uncertainty in Omega-aragonite from all inputs: \", results[\"u_saturation_aragonite\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Uncertainties from internal parameters\n",
    "\n",
    "We can also determine how uncertainties in PyCO2SYS's internally evaluated parameters (totals from salinity and equilibrium constants) propagate through to the calculated results.  The approach is very similar to that used above: we just need to find the name of the variable we want to propagate uncertainty from in the results dict.\n",
    "\n",
    "### Absolute uncertainty known\n",
    "\n",
    "First, imagine that we know the absolute uncertainty in an equilibrium constant.  Let's say we think the uncertainty in <i>K</i><sub>1</sub> (first dissociation constant of carbonic acid) is on the order of 10<sup>-8</sup> (as a standard deviation).  The approach is now virtually identical to that used above:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Add K1 uncertainty to the dict\n",
    "uncertainty_from.update({\"k_carbonic_1\": 1e-8})\n",
    "\n",
    "# Re-propagate and print out results\n",
    "results = pyco2.sys(**pyco2_kws,\n",
    "    uncertainty_into=uncertainty_into, uncertainty_from=uncertainty_from, \n",
    ")\n",
    "print(\"Uncertainty in pCO2 from K1:               \", results[\"u_pCO2__k_carbonic_1\"])\n",
    "print(\"Total uncertainty in pCO2 from all inputs: \", results[\"u_pCO2\"])\n",
    "print(\"Uncertainty in Omega-aragonite from K1:               \", results[\"u_saturation_aragonite__k_carbonic_1\"])\n",
    "print(\"Total uncertainty in Omega-aragonite from all inputs: \", results[\"u_saturation_aragonite\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Above we see that this uncertainty in <i>K</i><sub>1</sub> would dominate the total uncertainty in <i>p</i>CO<sub>2</sub>, but be less important for the aragonite saturation state.\n",
    "\n",
    "### Fractional uncertainty known\n",
    "\n",
    "If you know the fractional uncertainty but not an absolute value, then you could just extract the relevant parameter values from the `results` dict and multiply through.  For example, if we thought the uncertainty in <i>K</i><sub>2</sub> (second dissociation constant of carbonic acid) was on the order of 5%:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Get K2 values from the results\n",
    "k2 = results[\"k_carbonic_2\"]\n",
    "\n",
    "# Add K2 uncertainty to the dict\n",
    "uncertainty_from.update({\"k_carbonic_2\": k2 * 0.05})\n",
    "\n",
    "# Re-propagate and print out results\n",
    "results = pyco2.sys(**pyco2_kws,\n",
    "    uncertainty_into=uncertainty_into, uncertainty_from=uncertainty_from, \n",
    ")\n",
    "print(\"Uncertainty in pCO2 from K2:               \", results[\"u_pCO2__k_carbonic_2\"])\n",
    "print(\"Total uncertainty in pCO2 from all inputs: \", results[\"u_pCO2\"])\n",
    "print(\"Uncertainty in Omega-aragonite from K2:               \", results[\"u_saturation_aragonite__k_carbonic_2\"])\n",
    "print(\"Total uncertainty in Omega-aragonite from all inputs: \", results[\"u_saturation_aragonite\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this example, we can see that the <i>K</i><sub>2</sub> uncertainty has a much bigger influence on aragonite saturation state than <i>K</i><sub>1</sub> did.\n",
    "\n",
    "### Uncertainty in p<i>K</i> values, not <i>K</i>\n",
    "\n",
    "If uncertainties in equilibrium constants are given in terms of p<i>K</i> values rather than *K* we can simply append `p` to the start of the corresponding parameter name in `uncertainties_from`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Redefine measurement uncertainties\n",
    "uncertainty_from = {\n",
    "    \"par1\": 3,\n",
    "    \"par2\": np.array([0.001, 0.005, 0.005, 0.005]),\n",
    "    \"temperature\": 0.05,\n",
    "    \"salinity\": 0.002,\n",
    "    \"pk_carbonic_1\": 0.0075,\n",
    "    \"pk_carbonic_2\": 0.015,\n",
    "}\n",
    "\n",
    "# Re-propagate and print out results\n",
    "results = pyco2.sys(**pyco2_kws,\n",
    "    uncertainty_into=uncertainty_into, uncertainty_from=uncertainty_from, \n",
    ")\n",
    "print(\"Total uncertainty in pCO2 from all inputs: \", results[\"u_pCO2\"])\n",
    "print(\"Total uncertainty in Omega-aragonite from all inputs: \", results[\"u_saturation_aragonite\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Uncertainties of Orr et al. (2018)\n",
    "\n",
    "The uncertainties in equilibrium constants suggested by [Orr et al. (2018)](https://www.sciencedirect.com/science/article/pii/S030442031830149X) are available as a dict under `pyco2.uncertainty_OEDG18`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Propagate OEDG18 uncertainties and print out results\n",
    "results = pyco2.sys(**pyco2_kws,\n",
    "    uncertainty_into=[\"saturation_aragonite\"], uncertainty_from=pyco2.uncertainty_OEDG18, \n",
    ")\n",
    "print(\"Total uncertainty in Omega-aragonite from all inputs: \", results[\"u_saturation_aragonite\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Covarying uncertainties\n",
    "\n",
    "If your uncertainties are covarying then you need to assemble a variance-covariance matrix of the uncertainties and a Jacobian matrix for the parameters that you are interested in.  The maths of how to do this are beyond the scope of this tutorial, but you can use PyCO2SYS to get all the derivatives that you need.  Instead of using the `uncertainty_from` and `uncertainty_into` kwargs, we use `grads_wrt` (w.r.t. = with respect to) and `grads_of` instead:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Get raw derivatives\n",
    "results = pyco2.sys(**pyco2_kws,\n",
    "    grads_of=uncertainty_into, grads_wrt=uncertainty_from, \n",
    ")\n",
    "\n",
    "# Print out example results\n",
    "print(\"d(pCO2)/d(par1) =\", results[\"d_pCO2__d_par1\"])\n",
    "print(\"d(saturation_aragonite)/d(pk_carbonic_1) =\", results[\"d_saturation_aragonite__d_pk_carbonic_1\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The derivative of each result in `grads_of` with respect to each argument in `grads_wrt` is stored in the `results` dict under the key `\"d_<result>__d_<argument>\"`."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}