{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Saliva Example"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<div class=\"alert alert-block alert-info\">\n",
    "This example illustrates how to import saliva data (cortisol, amylase, etc.), how to compute often used parameters and how to export it to perform futher analysis.\n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setup and Helper Functions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns\n",
    "from fau_colors import cmaps\n",
    "\n",
    "import biopsykit as bp\n",
    "\n",
    "%matplotlib widget\n",
    "%load_ext autoreload\n",
    "%autoreload 2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.close(\"all\")\n",
    "\n",
    "palette = sns.color_palette(cmaps.faculties_light)\n",
    "sns.set_theme(context=\"notebook\", style=\"ticks\", font=\"sans-serif\", palette=palette)\n",
    "\n",
    "plt.rcParams[\"figure.figsize\"] = (8, 4)\n",
    "plt.rcParams[\"pdf.fonttype\"] = 42\n",
    "plt.rcParams[\"mathtext.default\"] = \"regular\"\n",
    "\n",
    "palette"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Set Saliva Time Points"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "sample_times = [-30, -1, 30, 40, 50, 60, 70]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load Condition List"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Load example data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "condition_list = bp.example_data.get_condition_list_example()\n",
    "condition_list.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Alternatively: Load your own data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# condition_list = bp.io.load_subject_condition_list(\"<path-to-condition-list-file>\")\n",
    "# condition_list.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load Data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Option 0: Load BioPsyKit example data "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The example data will be imported and converted into a long-format dataframe:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_cort = bp.example_data.get_saliva_example()\n",
    "df_cort.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Compute Mean and Standard Error"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Mean and standard error is computed in [saliva.mean_se()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.saliva.html#biopsykit.saliva.mean_se) for the two conditions and for each sample separately."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "bp.saliva.mean_se(df_cort, saliva_type=\"cortisol\")"
   ]
  },
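  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough illustration of what is computed per condition and sample, the following sketch derives the mean and the standard error of the mean ($SE = s / \\sqrt{n}$, with $s$ the sample standard deviation) using only the Python standard library. The cortisol values are made up for illustration, and the use of the sample standard deviation is an assumption, not a statement about the internals of `saliva.mean_se()`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import math\n",
    "import statistics\n",
    "\n",
    "# hypothetical cortisol values (nmol/l) of four subjects for one condition and one sample\n",
    "values = [4.0, 6.0, 5.0, 7.0]\n",
    "\n",
    "mean = statistics.mean(values)\n",
    "# standard error of the mean: sample standard deviation (n - 1) divided by sqrt(n)\n",
    "se = statistics.stdev(values) / math.sqrt(len(values))\n",
    "\n",
    "print(f\"mean = {mean:.2f}, se = {se:.3f}\")"
   ]
  },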
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Plot Saliva Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "bp.plotting.lineplot(\n",
    "    data=df_cort, x=\"sample\", y=\"cortisol\", hue=\"condition\", style=\"condition\", palette=cmaps.faculties\n",
    ");"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Example: Subject Exclusion\n",
    "\n",
    "For this example, we assume we want to exclude the subjects 'Vp01' and 'Vp02' from both the condition list and the dataframe with cortisol samples using [data_processing.exclude_subjects()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.utils.data_processing.html#biopsykit.utils.data_processing.exclude_subjects)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dict_result = bp.utils.data_processing.exclude_subjects(\n",
    "    [\"Vp01\", \"Vp02\"], condition_list=condition_list, cortisol=df_cort\n",
    ")\n",
    "\n",
    "dict_result\n",
    "# uncomment to reassign the data with excluded subjects to the original variable names\n",
    "# df_cort = dict_result['cortisol']\n",
    "# condition_list = dict_result['condition_list']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Option 1: Use BioPsyKit to load saliva data in 'plate' format\n",
    "\n",
    "The data is converted into long-format ([SalivaRawDataFrame](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.utils.datatype_helper.html#biopsykit.utils.datatype_helper.SalivaRawDataFrame)) and returned."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Load saliva data into pandas dataframe (example data using [biopsykit.example_data.get_saliva_example_plate_format()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.example_data.html#biopsykit.example_data.get_saliva_example_plate_format) or own data in 'plate'-format using [biopsykit.io.saliva.load_saliva_plate()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.io.saliva.html#biopsykit.io.saliva.load_saliva_plate)):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_cort = bp.example_data.get_saliva_example_plate_format()\n",
    "\n",
    "# alternatively: load own data that is present in \"plate\" format\n",
    "# df_cort = bp.io.saliva.load_saliva_plate(file_path=\"<path-to-saliva-data-in-plate-format>\", saliva_type=\"cortisol\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_cort.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can additionally directly pass a 'condition list' to the data loader function (for example data loader as well as for your own data). This allows us to directly assign subject conditions to the saliva samples:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_cort = bp.example_data.get_saliva_example_plate_format(condition_list=condition_list)\n",
    "# alternatively: load own data in \"plate\" format and directly assign subject conditions\n",
    "# df_cort = bp.io.saliva.load_saliva_plate(file_path=\"<path-to-saliva-data-in-plate-format>\", saliva_type=\"cortisol\", condition_list=condition_list)\n",
    "\n",
    "df_cort.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Speficy your custom regular expression string to extract Subject ID and Saliva ID (see the documentation of [biopsykit.io.saliva.load_saliva_plate()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.io.saliva.html#biopsykit.io.saliva.load_saliva_plate) for further information).\n",
    "\n",
    "For example, the regex string in `regex_str` will extract the subject IDs **without** the `Vp` prefix and sample IDs **without** the `S` prefix:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "regex_str = r\"Vp(\\d+) S(\\d)\"\n",
    "df_cort = bp.example_data.get_saliva_example_plate_format(regex_str=regex_str)\n",
    "# works analogously for bp.io.saliva.load_saliva_plate()\n",
    "\n",
    "df_cort.head()"
   ]
  },
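  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see what such a pattern captures, the sketch below applies the same regular expression with Python's built-in `re` module. The label `\"Vp01 S0\"` is a hypothetical example of how a sample might be named in the plate file; the actual labels in your data may differ."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import re\n",
    "\n",
    "regex_str = r\"Vp(\\d+) S(\\d)\"\n",
    "\n",
    "# hypothetical sample label as it might appear in a plate file\n",
    "label = \"Vp01 S0\"\n",
    "\n",
    "match = re.search(regex_str, label)\n",
    "# first capture group: subject ID without \"Vp\", second group: sample ID without \"S\"\n",
    "subject_id, sample_id = match.groups()\n",
    "print(subject_id, sample_id)"
   ]
  },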
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Option 2: Use BioPsyKit to load saliva data that's already in the \"correct\" wide format"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "During import, the data is converted into long-format ([SalivaRawDataFrame](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.utils.datatype_helper.html#biopsykit.utils.datatype_helper.SalivaRawDataFrame)) and returned."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_cort = bp.example_data.get_saliva_example()\n",
    "# Alternatively, use your own data:\n",
    "# df_cort = bp.io.saliva.load_saliva_wide_format(file_path=\"<path-to-cortisol-data-in-wide-format.csv>\", saliva_type=\"cortisol\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_cort.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Save Data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Save Dataframe as csv (in standardized format)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# bp.io.saliva.save_saliva(\"<export-path.csv>\", df_cort)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Saliva Data Processing"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Saliva Data in [SalivaRawDataFrame](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.utils.datatype_helper.html#biopsykit.utils.datatype_helper.SalivaRawDataFrame)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_cort.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Mean and Standard Error over all Subjects"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Computing mean and standard error over all subjects returns a [SalivaMeanSeDataFrame](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.utils.datatype_helper.html#biopsykit.utils.datatype_helper.SalivaMeanSeDataFrame). This dataframe can directly be used for plotting functions from `BioPsyKit`, such as [plotting.saliva_plot()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.plotting.html#biopsykit.protocols.plotting.saliva_plot) (more on that later)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_cort_mean_se = bp.saliva.mean_se(df_cort)\n",
    "df_cort_mean_se.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Saliva Features"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Standard Features"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Compute a set of \"Standard Features\", including:\n",
    "\n",
    "* `argmax`: location of maximum\n",
    "* `mean`: mean value\n",
    "* `std`: standard deviation\n",
    "* `skew`: skewness\n",
    "* `kurt`: kurtosis\n",
    "\n",
    "using [saliva.standard_features()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.saliva.html#biopsykit.saliva.standard_features)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "bp.saliva.standard_features(df_cort).head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### AUC"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Area under the Curve (AUC), in different variations (according to Pruessner et al. 2003):\n",
    "\n",
    "* `auc_g`: Total Area under the Curve\n",
    "* `auc_i`: Area under the Curve with respect to increase\n",
    "* `auc_i_post`: (if `compute_auc_post=True`) Area under the Curve with respect to increase *after* the stressor: This is only relevant for an acute stress scenario where we collected saliva samples before and after the stressor. Per `BioPsyKit` convention, saliva samples collected *after* the stressor have saliva times $t \\geq 0$.\n",
    "\n",
    "(see the documentation of [saliva.auc()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.saliva.html#biopsykit.saliva.auc) for further information)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Note**: For these examples we neglect the first saliva sample (S0) by setting `remove_s0` to `True` because this sample was only assessed to control for high cortisol baseline and is not relevant for feature computation. The feature computation is performed on the remaining saliva samples (S1-S6)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# sample times must directly be supplied if they are not already part of the saliva dataframe\n",
    "bp.saliva.auc(df_cort, remove_s0=True, sample_times=sample_times).head()"
   ]
  },
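  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For intuition, the trapezoidal formulas from Pruessner et al. (2003) can be written out in a few lines of plain Python. This is a minimal sketch, not the `BioPsyKit` implementation: the cortisol values are made up, the names `times`, `values`, `auc_g_example`, and `auc_i_example` are hypothetical, and `auc_i_post` is omitted."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# sample times without S0 (in minutes) and hypothetical cortisol values (nmol/l) of one subject\n",
    "times = [-1, 30, 40, 50, 60, 70]\n",
    "values = [4.0, 6.0, 8.0, 7.0, 5.0, 4.0]\n",
    "\n",
    "# AUC_G: sum of the trapezoids between consecutive samples\n",
    "auc_g_example = sum(\n",
    "    (values[i] + values[i + 1]) / 2 * (times[i + 1] - times[i]) for i in range(len(values) - 1)\n",
    ")\n",
    "# AUC_I: AUC_G minus the rectangle spanned by the first sample over the total duration\n",
    "auc_i_example = auc_g_example - values[0] * (times[-1] - times[0])\n",
    "\n",
    "print(auc_g_example, auc_i_example)"
   ]
  },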
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Maximum Increase"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Absolute maximum increase (or the relative increase in percent if `percent=True`) between the *first* sample in the data and *all others*:\n",
    "\n",
    "Note: see the documentation of [saliva.max_increase()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.saliva.html#biopsykit.saliva.max_increase) for further information)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "bp.saliva.max_increase(df_cort, remove_s0=True).head()"
   ]
  },
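  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch of this computation for a single subject, written in plain Python. The values are made up, and the sketch assumes that the relative increase is expressed in percent of the first sample; see the `saliva.max_increase()` documentation for the exact definition."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# hypothetical cortisol values (nmol/l) of one subject, S0 already removed\n",
    "values = [4.0, 6.0, 8.0, 7.0, 5.0, 4.0]\n",
    "\n",
    "first, others = values[0], values[1:]\n",
    "# absolute maximum increase: largest later sample minus the first sample\n",
    "max_inc = max(others) - first\n",
    "# relative maximum increase in percent of the first sample (assumed definition)\n",
    "max_inc_percent = max_inc / first * 100\n",
    "\n",
    "print(max_inc, max_inc_percent)"
   ]
  },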
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Slope"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Slope between two saliva samples (specified by `sample_idx`):\n",
    "Note: see the documentation of [saliva.slope()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.saliva.html#biopsykit.saliva.slope) for further information)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "bp.saliva.slope(df_cort, sample_idx=(1, 4), sample_times=sample_times).head()"
   ]
  },
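  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The slope is the change in concentration divided by the elapsed time between the two samples. The sketch below illustrates this for one subject with made-up values, assuming that `sample_idx` refers to positions in the list of samples (so `(1, 4)` selects S1 and S4)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# sample times in minutes (same values as above) and hypothetical cortisol values (nmol/l)\n",
    "times = [-30, -1, 30, 40, 50, 60, 70]\n",
    "values = [5.0, 4.0, 6.0, 8.0, 7.0, 5.0, 4.0]\n",
    "\n",
    "idx_start, idx_end = 1, 4\n",
    "# slope in nmol/l per minute between the two selected samples\n",
    "slope_example = (values[idx_end] - values[idx_start]) / (times[idx_end] - times[idx_start])\n",
    "\n",
    "print(round(slope_example, 4))"
   ]
  },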
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Plotting"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Using Seaborn (some very simple Examples)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots()\n",
    "sns.lineplot(\n",
    "    data=df_cort.reset_index(),\n",
    "    x=\"sample\",\n",
    "    y=\"cortisol\",\n",
    "    hue=\"condition\",\n",
    "    hue_order=[\"Control\", \"Intervention\"],\n",
    "    errorbar=None,  # errorbar = None => no error bars\n",
    "    ax=ax,\n",
    ")\n",
    "ax.set_ylabel(\"Cortisol [nmol/l]\")\n",
    "ax.set_xlabel(\"Sample Times\")\n",
    "fig.tight_layout()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots()\n",
    "sns.lineplot(\n",
    "    data=df_cort.reset_index(), x=\"sample\", y=\"cortisol\", hue=\"condition\", hue_order=[\"Control\", \"Intervention\"], ax=ax\n",
    ")\n",
    "ax.set_ylabel(\"Cortisol [nmol/l]\")\n",
    "ax.set_xlabel(\"Sample Times\")\n",
    "fig.tight_layout()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots()\n",
    "sns.boxplot(\n",
    "    data=df_cort.reset_index(),\n",
    "    x=\"sample\",\n",
    "    y=\"cortisol\",\n",
    "    hue=\"condition\",\n",
    "    hue_order=[\"Control\", \"Intervention\"],\n",
    "    ax=ax,\n",
    ")\n",
    "fig.tight_layout()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots()\n",
    "sns.boxplot(\n",
    "    data=bp.saliva.max_increase(df_cort).reset_index(),\n",
    "    x=\"condition\",\n",
    "    y=\"cortisol_max_inc\",\n",
    "    order=[\"Control\", \"Intervention\"],\n",
    ")\n",
    "fig.tight_layout()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots()\n",
    "\n",
    "data_features = bp.saliva.utils.saliva_feature_wide_to_long(bp.saliva.standard_features(df_cort), \"cortisol\")\n",
    "\n",
    "sns.boxplot(\n",
    "    data=data_features.reset_index(),\n",
    "    x=\"saliva_feature\",\n",
    "    y=\"cortisol\",\n",
    "    hue=\"condition\",\n",
    "    hue_order=[\"Control\", \"Intervention\"],\n",
    "    ax=ax,\n",
    ")\n",
    "fig.tight_layout()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Using functions from `BioPsyKit`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_cort_mean_se.T"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Note**: This example shows how to use the function [plotting.saliva_plot()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.plotting.html#biopsykit.protocols.plotting.saliva_plot). If you recorded saliva data within a psychological protocol, however, it's recommended to use the Protocol API and create a [Protocol](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.html) object, where you can add all data and use plotting functions in a more convenient way."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Saliva Plot"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": [
     "nbsphinx-thumbnail"
    ]
   },
   "outputs": [],
   "source": [
    "# plot without first saliva sample\n",
    "bp.protocols.plotting.saliva_plot(\n",
    "    df_cort_mean_se.drop(\"0\", level=\"sample\"),\n",
    "    saliva_type=\"cortisol\",\n",
    "    sample_times=sample_times[1:],\n",
    "    sample_times_absolute=True,\n",
    "    test_times=[0, 30],\n",
    "    test_title=\"TEST\",\n",
    ");"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Saliva Features"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "All features in one plot using [saliva_feature_boxplot()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.plotting.html#biopsykit.protocols.plotting.saliva_feature_boxplot), `x` variable separates the features, `hue` variable separates the conditions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots()\n",
    "bp.protocols.plotting.saliva_feature_boxplot(\n",
    "    data=data_features,\n",
    "    x=\"saliva_feature\",\n",
    "    saliva_type=\"cortisol\",\n",
    "    hue=\"condition\",\n",
    "    hue_order=[\"Control\", \"Intervention\"],\n",
    "    ax=ax,\n",
    ")\n",
    "fig.tight_layout()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One subplot per feature using [saliva_multi_feature_boxplot()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.plotting.html#biopsykit.protocols.plotting.saliva_multi_feature_boxplot), `x` variable separates the conditions. `features` is provided a list of the features to plot."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "bp.protocols.plotting.saliva_multi_feature_boxplot(\n",
    "    data=data_features,\n",
    "    saliva_type=\"cortisol\",\n",
    "    features=[\"argmax\", \"kurtosis\", \"std\", \"mean\"],\n",
    "    hue=\"condition\",\n",
    "    hue_order=[\"Control\", \"Intervention\"],\n",
    "    palette=cmaps.faculties_light,\n",
    ")\n",
    "fig.tight_layout()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Grouping of features per subplot, `features` is provided a dictionary that specifies the mapping."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "bp.protocols.plotting.saliva_multi_feature_boxplot(\n",
    "    data=data_features,\n",
    "    saliva_type=\"cortisol\",\n",
    "    features={\"mean\": [\"mean\", \"argmax\"], \"std\": [\"std\", \"skew\", \"kurtosis\"]},\n",
    "    hue=\"condition\",\n",
    "    hue_order=[\"Control\", \"Intervention\"],\n",
    "    palette=cmaps.faculties_light,\n",
    ")\n",
    "fig.tight_layout()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "biopsykit",
   "language": "python",
   "name": "biopsykit"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.14"
  },
  "toc-showtags": false
 },
 "nbformat": 4,
 "nbformat_minor": 4
}