{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Protocol Example"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-block alert-info\">\n",
    "\n",
    "This example illustrates how to work with the Protocol API in the [Protocols](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.html) module. These classes represent (psychological) protocols. One instance of these protocols represents a particular study the protocol was conducted, and data collected during the study.  \n",
    "    \n",
    "The Protocol API further offers functions to process different kind of data collected during that study without the need of setting up \"common\" processing pipelines each time (see the [<code>ECG_Analysis_Example.ipynb</code>](ECG_Analysis_Example.ipynb) for further information on data processing pipelines).  \n",
    "    \n",
    "As input, this notebook uses the heart rate processing outputs generated from the ECG processing pipeline as shown in [<code>ECG_Processing_Example.ipynb</code>](ECG_Processing_Example.ipynb)) as well as saliva data.\n",
    "    \n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setup and Helper Functions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "import re\n",
    "\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "\n",
    "from fau_colors import cmaps\n",
    "import biopsykit as bp\n",
    "from biopsykit.protocols import BaseProtocol, MIST, TSST\n",
    "\n",
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns\n",
    "\n",
    "%matplotlib widget\n",
    "%load_ext autoreload\n",
    "%autoreload 2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.close(\"all\")\n",
    "\n",
    "palette = sns.color_palette(cmaps.faculties)\n",
    "sns.set_theme(context=\"notebook\", style=\"ticks\", font=\"sans-serif\", palette=palette)\n",
    "\n",
    "plt.rcParams[\"figure.figsize\"] = (8, 4)\n",
    "plt.rcParams[\"pdf.fonttype\"] = 42\n",
    "plt.rcParams[\"mathtext.default\"] = \"regular\"\n",
    "\n",
    "palette"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def display_dict_structure(dict_to_display):\n",
    "    _display_dict_recursive(dict_to_display)\n",
    "\n",
    "\n",
    "def _display_dict_recursive(dict_to_display):\n",
    "    if isinstance(dict_to_display, dict):\n",
    "        display(dict_to_display.keys())\n",
    "        _display_dict_recursive(list(dict_to_display.values())[0])\n",
    "    else:\n",
    "        display(\"Dataframe shape: {}\".format(dict_to_display.shape))\n",
    "        display(dict_to_display.head())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Overview"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The Protocol API of `BioPsyKit` offers classes that represent various psychological protocols. \n",
    "\n",
    "The classes allow to add *time-series data* (such as heart rate) as well as *tabular data* (such as saliva) to an instance of a `Protocol` object. Further, it offers functions to process the data and aggregate the results on a study level (such as computing mean and standard error over all subjects within different phases during the protocol).  \n",
    "\n",
    "For this, it uses functions from the [biopsykit.utils.data_processing](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.utils.data_processing.html) module to split data, compute aggregated results, etc. While these functions can also be used stand-alone, without using the Protocol API, it is, however, recommended to make use of the Protocol API whenever possible since you don't have to take care of calling each function manually, and in the right order. \n",
    "\n",
    "See the [<code>ECG_Analysis_Example.ipynb</code>](ECG_Analysis_Example.ipynb) notebook for further examples."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## General Usage – [BaseProtocol](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.base.html#biopsykit.protocols.base.BaseProtocol)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The base class for all protocols is [BaseProtocol()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.base.html#biopsykit.protocols.base.BaseProtocol). It provides the general functionality. Besides that, there exist a couple of predefined subclasses that represent standardized psychological protocols, such as [MIST](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.mist.html#biopsykit.protocols.mist.MIST) (Montreal Imaging Stress Task) and [TSST](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.tsst.html#biopsykit.protocols.tsst.TSST) (Trier Social Stress Test).\n",
    "\n",
    "\n",
    "Each protocol is expected to have a certain structure which can be specified by passing a dictionary describing the protocol structure when creating the Protocol object. The structure can be nested, up to three nested structure levels are supported:\n",
    "\n",
    "* 1st level: ``study part``: These can be different, disjunct parts of the complete study, such as: \"Preface\", \"Test\", and \"Questionnaires\"\n",
    "* 2nd level: ``phase``: One *study part* of the psychological protocol can consist of different *phases*, such as: \"Preparation\", \"Stress\", \"Recovery\"\n",
    "* 3rd level: ``subphase``: One *phase* can consist of different *subphases*, such as: \"Baseline\", \"Arithmetic Task\", \"Feedback\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The following examples illustrate different possibilities of defining the protocol structure:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Example 1: study with three parts, no finer division into phases\n",
    "structure = {\"Preface\": None, \"Test\": None, \"Questionnaires\": None}\n",
    "BaseProtocol(name=\"Base\", structure=structure)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Example 2: study with three parts, all parts have different phases with specific durations\n",
    "structure = {\n",
    "    \"Preface\": {\"Questionnaires\": 240, \"Baseline\": 60},\n",
    "    \"Test\": {\"Preparation\": 120, \"Test\": 240, \"Recovery\": 120},\n",
    "    \"Recovery\": {\"Part1\": 240, \"Part2\": 240},\n",
    "}\n",
    "BaseProtocol(name=\"Base\", structure=structure)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Example 3: only certain study parts have different phases (example: TSST)\n",
    "structure = {\"Before\": None, \"TSST\": {\"Preparation\": 300, \"Talk\": 300, \"Math\": 300}, \"After\": None}\n",
    "BaseProtocol(name=\"Base\", structure=structure)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Example 4: study with phases and subphases, only certain study parts have different phases (example: MIST)\n",
    "structure = {\n",
    "    \"Before\": None,\n",
    "    \"MIST\": {\n",
    "        \"MIST1\": {\"BL\": 60, \"AT\": 240, \"FB\": 120},\n",
    "        \"MIST2\": {\"BL\": 60, \"AT\": 240, \"FB\": 120},\n",
    "        \"MIST3\": {\"BL\": 60, \"AT\": 240, \"FB\": 120},\n",
    "    },\n",
    "    \"After\": None,\n",
    "}\n",
    "BaseProtocol(name=\"Base\", structure=structure)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Load Data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Heart Rate and R Peak Data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Load processed heart rate time-series data of all subjects and concatenate into one dict."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "subject_data_dict_hr = {}\n",
    "# or use your own path\n",
    "ecg_path = bp.example_data.get_ecg_processing_results_path_example()\n",
    "\n",
    "for file in sorted(ecg_path.glob(\"hr_result_*.xlsx\")):\n",
    "    subject_id = re.findall(\"hr_result_(Vp\\w+).xlsx\", file.name)[0]\n",
    "    subject_data_dict_hr[subject_id] = pd.read_excel(file, sheet_name=None, index_col=\"time\")\n",
    "\n",
    "subject_data_dict_rpeaks = {}\n",
    "for file in sorted(ecg_path.glob(\"rpeaks_result_*.xlsx\")):\n",
    "    subject_id = re.findall(\"rpeaks_result_(Vp\\w+).xlsx\", file.name)[0]\n",
    "    subject_data_dict_rpeaks[subject_id] = pd.read_excel(file, sheet_name=None, index_col=\"time\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Structure of the [SubjectDataDict](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.utils.datatype_helper.html#biopsykit.utils.datatype_helper.SubjectDataDict):\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "display_dict_structure(subject_data_dict_hr)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Subject Conditions\n",
    "(will be needed later)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Note**: This is only an example, thus, the condition list is manually constructed. Usually, you should import a condition list from a file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "condition_list = pd.DataFrame(\n",
    "    [\"Control\", \"Intervention\"], columns=[\"condition\"], index=pd.Index([\"Vp01\", \"Vp02\"], name=\"subject\")\n",
    ")\n",
    "condition_list"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Create `Protocol` instance"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's assume our example protocol consists of only one study part, so we can ignore this level. The study consists of four different phases: \"Baseline\", \"Intervention\", \"Stress\", and \"Recovery\". Each of this phase has, in turn, three subphases: \"Start\", \"Middle\", and \"End\" with the durations 20, 30, and 10 seoncds, respectively. Our structure dict then looks like this:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "subphases = {\"Start\": 20, \"Middle\": 30, \"End\": 10}\n",
    "structure = {\"Baseline\": subphases, \"Intervention\": subphases, \"Stress\": subphases, \"Recovery\": subphases}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "base_protocol = BaseProtocol(name=\"Base\", structure=structure)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Add Heart Rate Data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The data to be added to the Protocol object is expected to be a [SubjectDataDict](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.utils.datatype_helper.html#biopsykit.utils.datatype_helper.SubjectDataDict), a special nested dictionary structure that contains processed data of all subjects. The dictionaries `subject_data_dict_hr` and `subject_data_dict_rpeaks` are already in this structure, so they can be added to the Protocol object. \n",
    "\n",
    "Since we only have one study part, we can directly add this data without specifying a `study_part` parameter. The data is added to a *study part* named \"Study\" for better internal handling. The added data can be accessed via the [hr_data](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.base.html#biopsykit.protocols.base.BaseProtocol.hr_data) property.\n",
    "\n",
    "**Note**: See [BaseProtocol.add_hr_data()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.base.html#biopsykit.protocols.base.BaseProtocol.add_hr_data) and [BaseProtocol.hr_data](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.base.html#biopsykit.protocols.base.BaseProtocol.hr_data) for further information!\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "base_protocol.add_hr_data(hr_data=subject_data_dict_hr, rpeak_data=subject_data_dict_rpeaks)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "display_dict_structure(base_protocol.hr_data)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Compute *Aggregated Results*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we can compute *Aggregated Results*. For this, we can use the method [BaseProtocol.compute_hr_results()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.base.html#biopsykit.protocols.base.BaseProtocol.compute_hr_results)  which allows us to enable/disable the specific processing steps and provide different parameters to the individual step (see the [<code>ECG_Analysis_Example.ipynb</code>](ECG_Analysis_Example.ipynb) notebook for further explanations on the different processing steps).\n",
    "\n",
    "Since different kinds of aggregated results can be computed (e.g., aggregating only over phases vs. aggregating over subphases, normalization of heart rate data vs. no normalization, ...) we can speficy an identifier for these results via the `result_id` parameter.\n",
    "\n",
    "\n",
    "[BaseProtocol.compute_hr_results()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.base.html#biopsykit.protocols.base.BaseProtocol.compute_hr_results) can do the following steps (`True`/`False` indicates which steps are enabled by default – have also a look at the function documentation!):\n",
    "\n",
    "* Resample (`resample_sec`): `True`\n",
    "* Normalize (`normalize_to`): `False`\n",
    "* Select Phases (`select_phases`): `False`\n",
    "* Split into Subphases (`split_into_subphases`: `False`\n",
    "* Aggregate per Subject (`mean_per_subject`): `True`\n",
    "* Add Conditions (`add_conditions`): `False`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Aggregate over Phases (Without Normalization)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "base_protocol.compute_hr_results(\"hr_phases\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Aggregate over Phases, Add Subject Conditions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "base_protocol.compute_hr_results(\"hr_phases_cond\", add_conditions=True, params={\"add_conditions\": condition_list})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Aggregate over Phases (With Normalization)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If, for example, we want to normalize the heart rate data to the \"Baseline\" phase, we can enable Normalization (`normalize_to`). Since all data were normalized to the \"Baseline\" phase we can drop this phase for later analysis by enabling the `select_phases` step. The parameters for these steps is supplied via the `params` dict:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "base_protocol.compute_hr_results(\n",
    "    \"hr_phases_norm\",\n",
    "    normalize_to=True,\n",
    "    select_phases=True,\n",
    "    params={\"normalize_to\": \"Baseline\", \"select_phases\": [\"Intervention\", \"Stress\", \"Recovery\"]},\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Aggregate over Subphases (With Normalization)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "base_protocol.compute_hr_results(\n",
    "    \"hr_subphases_norm\",\n",
    "    normalize_to=True,\n",
    "    select_phases=True,\n",
    "    split_into_subphases=True,\n",
    "    params={\n",
    "        \"normalize_to\": \"Baseline\",\n",
    "        \"select_phases\": [\"Intervention\", \"Stress\", \"Recovery\"],\n",
    "        \"split_into_subphases\": subphases,\n",
    "    },\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Access Results"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The results can then be accessed via the [hr_results](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.base.html#biopsykit.protocols.base.BaseProtocol.hr_results) property:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "base_protocol.hr_results[\"hr_phases\"].head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "base_protocol.hr_results[\"hr_phases_cond\"].head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "base_protocol.hr_results[\"hr_phases_norm\"].head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "base_protocol.hr_results[\"hr_subphases_norm\"].head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Compute *Ensemble Time-Series*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we can compute *Ensemble Time-Series*. For this, we can use the method [BaseProtocol.compute_hr_ensemble()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.base.html#biopsykit.protocols.base.BaseProtocol.compute_hr_ensemble) which allows us to enable/disable the specific processing steps and provide different parameters to the individual step (see the [<code>ECG_Analysis_Example.ipynb</code>](ECG_Analysis_Example.ipynb) notebook for further explanations on the different processing steps).\n",
    "\n",
    "Since different kinds of ensemble time-series can be computed (e.g., normalization of heart rate data vs. no normalization, ...), we can speficy an identifier for these results via the `ensemble_id` parameter.\n",
    "\n",
    "[BaseProtocol.compute_hr_ensemble()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.base.html#biopsykit.protocols.base.BaseProtocol.compute_hr_ensemble) can do the following steps (`True`/`False` indicates which steps are enabled by default – have also a look at the function documentation!):\n",
    "\n",
    "* Resample (`resample_sec`): `True`\n",
    "* Normalize (`normalize_to`): `True`\n",
    "* Select Phases (`select_phases`): `False`\n",
    "* Cut Phases to Shortest Duration (`cut_phases`): `True`\n",
    "* Merge Dictionary (`merge_dict`): `True`\n",
    "* Add Conditions (`add_conditions`): `False`\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "base_protocol.compute_hr_ensemble(\"hr_ensemble\", params={\"normalize_to\": \"Baseline\"})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "display_dict_structure(base_protocol.hr_ensemble[\"hr_ensemble\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Compute *HRV Parameters*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we can compute *HRV Parameters*. For this, we can use the method [BaseProtocol.compute_hrv_results()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.base.html#biopsykit.protocols.base.BaseProtocol.compute_hrv_results) which allows us to enable/disable the specific processing steps and provide different parameters to the individual step (see the [<code>ECG_Analysis_Example.ipynb</code>](ECG_Analysis_Example.ipynb) notebook for further explanations on the different processing steps).\n",
    "\n",
    "Since different kinds of HRV results can be computed (e.g., aggregating only over phases vs. aggregating over subphases, ...) we can speficy an identifier for these results via the `result_id` parameter.\n",
    "\n",
    "[BaseProtocol.compute_hrv_results()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.base.html#biopsykit.protocols.base.BaseProtocol.compute_hrv_results) can do the following steps (`True`/`False` indicates which steps are enabled by default – have also a look at the function documentation!):\n",
    "\n",
    "* Select Phases (`select_phases`): `False`\n",
    "* Split into Subphases (`split_into_subphases`): `False`\n",
    "* Add Conditions (`add_conditions`): `False`\n",
    "\n",
    "\n",
    "Additionally, the function has arguments to further specify HRV processing:\n",
    "\n",
    "* `dict_levels`: The names of the dictionary levels. Corresponds to the index level names of the resulting, concatenated dataframe. Default: `None` for the levels [\"subject\", \"phase\"] (or [\"subject\", \"phase\", \"subphase\"], when `split_into_subphases` is `True`)\n",
    "* `hrv_params`: Dictionary with parameters to configure HRV parameter computation. Parameters are passed to [EcgProcessor.hrv_process()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.signals.ecg.html#biopsykit.signals.ecg.EcgProcessor.hrv_process). Examples:\n",
    "  * `hrv_types`: list of HRV types to be computed. Subset of [\"hrv_time\", \"hrv_nonlinear\", \"hrv_frequency\"].\n",
    "  * `correct_rpeaks`: Whether to apply R peak correction before computing HRV parameters or not"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Compute HRV Parameter over Phases"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "base_protocol.compute_hrv_results(\"hrv_phases\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "base_protocol.hrv_results[\"hrv_phases\"].head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Compute HRV Parameter over Subphases"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Assuming we only want to compute a certain type of HRV measures (in this example, only time-domain measures), we can specify this in the `hrv_params` (see [EcgProcessor.hrv_process()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.signals.ecg.html#biopsykit.signals.ecg.EcgProcessor.hrv_process) for an overview of available arguments) argument."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "base_protocol.compute_hrv_results(\n",
    "    \"hrv_subphases\",\n",
    "    split_into_subphases=True,\n",
    "    params={\"split_into_subphases\": subphases},\n",
    "    hrv_params={\"hrv_types\": \"hrv_time\"},\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "base_protocol.hrv_results[\"hrv_subphases\"].head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Add Precomputed Results"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If processing results were already computed (e.g., in another notebook or because it's data from an old study), the precomputed results can be imported and directly added to the `Protocol` instance."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Ensemble Time-series\n",
    "\n",
    "In this example, we have *Ensemble Time-series* data from a study that has 3 phases (*Phase1*, *Phase2*, *Phase3*). This data is already in the correct format, so we can directly add it to a [Protocol](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.html)  object (that also has the correct structure – the duration of the single phases is not of importance, so we just leave it empty here)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dict_merged_norm = bp.example_data.get_hr_ensemble_sample()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "base_protocol = BaseProtocol(\"TimeSeriesDemo\", structure={\"Phase1\": None, \"Phase2\": None, \"Phase3\": None})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "base_protocol.add_hr_ensemble(\"hr_ensemble\", dict_merged_norm)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "display_dict_structure(base_protocol.hr_ensemble[\"hr_ensemble\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## MIST"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The [MIST](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.mist.html#biopsykit.protocols.mist.MIST) represents the *Montreal Imaging Stress Task*. \n",
    "\n",
    "Typically, the MIST has the following structure:\n",
    "\n",
    "* \"Before\": The study part before conducting the \"actual\" MIST. Here, subjects typically fill out questionnaires and provide a neutral baseline before stress induction\n",
    "* \"MIST\": The \"actual\" MIST. The MIST protocol consists of three repetitions of the same subphases: *Baseline* (BL), *Arithmetic Tasks* (AT), *Feedback* (FB). Each subphase has a duration that is specified in seconds.\n",
    "* \"After\": The study part after the MIST. Here, subjects typically fill out further questionnaires and remain in the examination room to provide further saliva samples.\n",
    "\n",
    "\n",
    "In summary:\n",
    "\n",
    "* *Study Parts*: Before, MIST, After\n",
    "* *Phases*: MIST1, MIST2, MIST3\n",
    "* *Subphases*: BL, AT, FB\n",
    "* *Subphase Durations*: 60, 240, 0 (Feedback Interval has length 0 because it's of variable length for each MIST phase and will later be inferred from the data)\n",
    "\n",
    "For further information please visit the documentatin of [protocols.MIST](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.mist.html).\n",
    "\n",
    "Hence, the [structure](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.tsst.html#biopsykit.protocols.tsst.TSST.structure)  dictionary would look like the following:\n",
    "\n",
    "```\n",
    "structure = {\n",
    "    \"Before\": None,\n",
    "    \"MIST\": {\n",
    "        \"MIST1\": {\"BL\": 60, \"AT\": 240, \"FB\": 0},\n",
    "        \"MIST2\": {\"BL\": 60, \"AT\": 240, \"FB\": 0},\n",
    "        \"MIST3\": {\"BL\": 60, \"AT\": 240, \"FB\": 0}\n",
    "    },\n",
    "    \"After\": None\n",
    "}\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "structure = {\n",
    "    \"Before\": None,\n",
    "    \"MIST\": {\n",
    "        \"MIST1\": [\"BL\", \"AT\", \"FB\"],\n",
    "        \"MIST2\": [\"BL\", \"AT\", \"FB\"],\n",
    "        \"MIST3\": [\"BL\", \"AT\", \"FB\"],\n",
    "    },\n",
    "    \"After\": None,\n",
    "}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mist = MIST(structure=structure)\n",
    "mist"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## TSST"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The [TSST](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.tsst.html#biopsykit.protocols.tsst.TSST)  represents the *Trier Social Stress Test*. \n",
    "\n",
    "\n",
    "Typically, the TSST has the following structure:\n",
    "\n",
    "* \"Before\": The study part before conducting the \"actual\" TSST. Here, subjects typically fill out questionnaires and provide a neutral baseline before stress induction\n",
    "* \"TSST\": The \"actual\" TSST. The TSST protocol consists two phases: *Talk* and *Math*, both with a duration of 5 minutes (300 seconds).\n",
    "* \"After\": The study part after the TSST. Here, subjects typically fill out further questionnaires and remain in the examination room to provide further saliva samples.\n",
    "\n",
    "\n",
    "In summary:\n",
    "\n",
    "* *Study Parts*: Before, TSST, After\n",
    "* *Phases*: Talk, Math\n",
    "* *Phase Durations*: 300, 300\n",
    "\n",
    "For further information please visit the documentatin of [protocols.TSST](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.tsst.html).\n",
    "\n",
    "Hence, the [structure](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.tsst.html#biopsykit.protocols.tsst.TSST.structure) dictionary would look like the following:\n",
    "\n",
    "```\n",
    "structure = {\n",
    "    \"Before\": None,\n",
    "    \"TSST\": {\n",
    "        \"Talk\": 300,\n",
    "        \"Math\": 300,\n",
    "    },\n",
    "    \"After\": None\n",
    "}\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "structure = {\n",
    "    \"Before\": None,\n",
    "    \"TSST\": {\n",
    "        \"Talk\": 300,\n",
    "        \"Math\": 300,\n",
    "    },\n",
    "    \"After\": None,\n",
    "}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "tsst = TSST(structure=structure)\n",
    "tsst"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Plotting"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The Protocol API offers functions to directly visualize the processed study data of the `Protocol` instance. So far, the visualization of heart rate data ([hr_mean_plot()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.plotting.html#biopsykit.protocols.plotting.hr_mean_plot) and [hr_ensemble_plot()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.plotting.html#biopsykit.protocols.plotting.hr_ensemble_plot)) and saliva data ([saliva_plot()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.plotting.html#biopsykit.protocols.plotting.saliva_plot)) is supported."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create Protocol object\n",
    "\n",
    "*Note*: For this example we assume that data was collected during the MIST."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mist = MIST()\n",
    "mist"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Aggregated Results"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Load Example Aggregated Results Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "hr_result = bp.example_data.get_hr_result_sample()\n",
    "# Alternatively: Load your own aggregated results\n",
    "# hr_result = bp.io.load_long_format_csv(\"<path-to-aggregated-results-in-long-format>\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "hr_result.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Add Aggregated Results Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mist.add_hr_results(\"hr_result\", hr_result)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Plot Mean HR Plot"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": [
     "nbsphinx-thumbnail"
    ]
   },
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots()\n",
    "mist.hr_mean_plot(\"hr_result\", ax=ax);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Ensemble Time-series"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Load Example Ensemble Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dict_hr_ensemble = bp.example_data.get_hr_ensemble_sample()\n",
    "\n",
    "# Alternatively: Load your own ensemble time-series data\n",
    "# dict_hr_ensemble = bp.io.load_pandas_dict_excel(\"<path-to-heart-rate-ensemble-data.xlsx>\")\n",
    "\n",
    "# Note: We rename the phase names (Phase1-3) of this example dataset to match\n",
    "# it with the standard phase names of the MIST (MIST1-3)\n",
    "dict_hr_ensemble = {key: value for key, value in zip([\"MIST1\", \"MIST2\", \"MIST3\"], dict_hr_ensemble.values())}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "display_dict_structure(dict_hr_ensemble)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Add Ensemble Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mist.add_hr_ensemble(\"hr_ensemble\", dict_hr_ensemble)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Plot HR Ensemble Plot"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*Note*: Since the MIST contains subphases the [MIST.hr_ensemble_plot()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.mist.html#biopsykit.protocols.mist.MIST.hr_ensemble_plot) function makes use of [structure](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.mist.html#biopsykit.protocols.mist.MIST.structure) property of the [MIST](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.mist.html#biopsykit.protocols.mist.MIST)  object and automatically highlights the subphases in the HR Ensemble Plot."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots()\n",
    "mist.hr_ensemble_plot(\"hr_ensemble\", ax=ax);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Saliva"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Just like heart rate data we can also add saliva data to the [Protocols](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.html) object. We just need to specify which type of data it is (e.g., \"cortisol\", \"amylase\", etc.) – see the (see the [<code>Saliva_Example.ipynb</code>](Saliva_Example.ipynb) notebook for further information.\n",
    "\n",
    "\n",
    "Saliva data can be added to the `Protocol` object in two different format:\n",
    "\n",
    "* Saliva data per subject as [SalivaRawDataFrame](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.utils.datatype_helper.html#biopsykit.utils.datatype_helper.SalivaRawDataFrame), a specialized dataframe containing \"raw\" (unaggregated) saliva data in a standardized format. `BioPsyKit` will then interanally compute mean and standard error.\n",
    "* Aggregated saliva data (mean and standard error) as [SalivaMeanSeDataFrame](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.utils.datatype_helper.html#biopsykit.utils.datatype_helper.SalivaMeanSeDataFrame) , a specialized dataframe containing mean and standard error of saliva samples in a standardized format."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### SalivaRawDataFrame"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Load Example Cortisol Data in [SalivaRawDataFrame](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.utils.datatype_helper.html#biopsykit.utils.datatype_helper.SalivaRawDataFrame) format.\n",
    "\n",
    "*Note*: The sample times are provided relative to the psychological protocol. The first sample is dropped for plotting."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "saliva_data = bp.example_data.get_saliva_example(sample_times=[-30, -1, 0, 10, 20, 30, 40])\n",
    "# alternatively: load your own saliva data\n",
    "# saliva_data = bp.io.saliva.load_saliva_wide_format(\"<path-to-saliva-data>\")\n",
    "# drop first saliva sample (not needed because it was only assessed to control for high baseline)\n",
    "saliva_data = saliva_data.drop(\"0\", level=\"sample\")\n",
    "saliva_data.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Add Saliva Data to [MIST](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.mist.html#biopsykit.protocols.mist.MIST)  object"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mist.add_saliva_data(saliva_data, \"cortisol\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = mist.saliva_plot(saliva_type=\"cortisol\")\n",
    "fig.tight_layout()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### SalivaMeanSeDataFrame"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create Protocol object – For this example, we're assuming the data was collected during the TSST."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "structure = {\n",
    "    \"Before\": None,\n",
    "    \"TSST\": {\n",
    "        \"Talk\": 300,\n",
    "        \"Math\": 300,\n",
    "    },\n",
    "    \"After\": None,\n",
    "}\n",
    "\n",
    "tsst = TSST(structure=structure)\n",
    "tsst"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Load Example Saliva Data (cortisol, amylase, and IL-6) in [SalivaMeanSeDataFrame](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.utils.datatype_helper.html#biopsykit.utils.datatype_helper.SalivaMeanSeDataFrame) format (or, more presicely, a dictionary of such)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "saliva_dict = bp.example_data.get_saliva_mean_se_example()\n",
    "display_dict_structure(saliva_dict)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Add Saliva Data to [TSST](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.tsst.html#biopsykit.protocols.tsst.TSST) object. Since it's a dictionary, we don't have to specify the `saliva_types` – the names are automatically inferred from the dictionary keys (we can, however, also manually specify the saliva type if we want)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "tsst.add_saliva_data(saliva_dict)\n",
    "display_dict_structure(tsst.saliva_data)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Plot Cortisol Data\n",
    "\n",
    "*Note*: This function will automatically plot data from all conditions. If you want to change that behavior, and plot only selected conditions (or just one), you have to manually call [saliva_plot()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.plotting.html#biopsykit.protocols.plotting.saliva_plot) from the module [biopsykit.protocols.plotting](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.plotting.html)(see below)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = tsst.saliva_plot(saliva_type=\"cortisol\", legend_loc=\"upper right\")\n",
    "fig.tight_layout()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Plot Cortisol Data – Only \"Intervention\" Group"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots()\n",
    "bp.protocols.plotting.saliva_plot(\n",
    "    data=tsst.saliva_data[\"cortisol\"].xs(\"Intervention\"),\n",
    "    saliva_type=\"cortisol\",\n",
    "    test_times=tsst.test_times,\n",
    "    test_title=\"TSST\",\n",
    "    legend_loc=\"upper right\",\n",
    "    ax=ax,\n",
    ")\n",
    "fig.tight_layout()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Plot Cortisol and IL-6 Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots()\n",
    "tsst.saliva_plot(\n",
    "    saliva_type=[\"cortisol\", \"il6\"], linestyle=[\"-\", \"--\"], marker=[\"o\", \"P\"], legend_loc=\"upper right\", ax=ax\n",
    ");"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "biopsykit",
   "language": "python",
   "name": "biopsykit"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}