{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Saliva Example" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "This example illustrates how to import saliva data (cortisol, amylase, etc.), how to compute commonly used parameters, and how to export the results for further analysis.\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setup and Helper Functions" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from pathlib import Path\n", "\n", "import re\n", "\n", "import pandas as pd\n", "import numpy as np\n", "\n", "from fau_colors import cmaps\n", "import biopsykit as bp\n", "\n", "import matplotlib.pyplot as plt\n", "import seaborn as sns\n", "\n", "%matplotlib widget\n", "%load_ext autoreload\n", "%autoreload 2" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.close(\"all\")\n", "\n", "palette = sns.color_palette(cmaps.faculties)\n", "sns.set_theme(context=\"notebook\", style=\"ticks\", font=\"sans-serif\", palette=palette)\n", "\n", "plt.rcParams[\"figure.figsize\"] = (8, 4)\n", "plt.rcParams[\"pdf.fonttype\"] = 42\n", "plt.rcParams[\"mathtext.default\"] = \"regular\"\n", "\n", "palette" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Set Saliva Time Points" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sample_times = [-30, -1, 30, 40, 50, 60, 70]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load Condition List" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Load the example data:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "condition_list = bp.example_data.get_condition_list_example()\n", "condition_list.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alternatively, load your own data:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# condition_list = bp.io.load_subject_condition_list(\"\")\n", "# condition_list.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load Data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Option 0: Load BioPsyKit example data" ] }, { 
"cell_type": "markdown", "metadata": {}, "source": [ "The example data is imported and converted into a long-format dataframe:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_cort = bp.example_data.get_saliva_example()\n", "df_cort.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Compute Mean and Standard Error" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Mean and standard error are computed with [saliva.mean_se()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.saliva.html#biopsykit.saliva.mean_se), separately for each of the two conditions and for each sample." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "bp.saliva.mean_se(df_cort, saliva_type=\"cortisol\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Plot Saliva Data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "bp.plotting.lineplot(data=df_cort, x=\"sample\", y=\"cortisol\", hue=\"condition\", style=\"condition\");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Example: Subject Exclusion\n", "\n", "For this example, we assume that we want to exclude subjects 'Vp01' and 'Vp02' from both the condition list and the dataframe with cortisol samples using [data_processing.exclude_subjects()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.utils.data_processing.html#biopsykit.utils.data_processing.exclude_subjects)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dict_result = bp.utils.data_processing.exclude_subjects(\n", " [\"Vp01\", \"Vp02\"], condition_list=condition_list, cortisol=df_cort\n", ")\n", "\n", "dict_result\n", "# uncomment to reassign the data with excluded subjects to the original variable names\n", "# df_cort = dict_result['cortisol']\n", "# condition_list = dict_result['condition_list']" ] }, { 
"cell_type": "markdown", "metadata": {}, "source": [ "### Option 1: Use BioPsyKit to load saliva data in 'plate' format\n", "\n", "The data is converted into long-format ([SalivaRawDataFrame](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.utils.datatype_helper.html#biopsykit.utils.datatype_helper.SalivaRawDataFrame)) and returned." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Load the saliva data into a pandas dataframe (example data using [biopsykit.example_data.get_saliva_example_plate_format()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.example_data.html#biopsykit.example_data.get_saliva_example_plate_format), or your own data in 'plate' format using [biopsykit.io.saliva.load_saliva_plate()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.io.saliva.html#biopsykit.io.saliva.load_saliva_plate)):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_cort = bp.example_data.get_saliva_example_plate_format()\n", "\n", "# alternatively: load own data that is present in \"plate\" format\n", "# df_cort = bp.io.saliva.load_saliva_plate(file_path=\"\", saliva_type=\"cortisol\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_cort.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also pass a 'condition list' directly to the data loader function (both for the example data loader and for your own data). 
This allows us to directly assign subject conditions to the saliva samples:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_cort = bp.example_data.get_saliva_example_plate_format(condition_list=condition_list)\n", "# alternatively: load own data in \"plate\" format and directly assign subject conditions\n", "# df_cort = bp.io.saliva.load_saliva_plate(file_path=\"\", saliva_type=\"cortisol\", condition_list=condition_list)\n", "\n", "df_cort.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Specify a custom regular expression string to extract the subject ID and saliva ID (see the documentation of [biopsykit.io.saliva.load_saliva_plate()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.io.saliva.html#biopsykit.io.saliva.load_saliva_plate) for further information).\n", "\n", "For example, the regex string in `regex_str` will extract the subject IDs **without** the `Vp` prefix and sample IDs **without** the `S` prefix:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# raw string, so that the regex escapes are not interpreted by Python\n", "regex_str = r\"Vp(\\d+) S(\\d)\"\n", "df_cort = bp.example_data.get_saliva_example_plate_format(regex_str=regex_str)\n", "# works analogously for bp.io.saliva.load_saliva_plate()\n", "\n", "df_cort.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Option 2: Use BioPsyKit to load saliva data that's already in the \"correct\" wide format" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "During import, the data is converted into long-format ([SalivaRawDataFrame](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.utils.datatype_helper.html#biopsykit.utils.datatype_helper.SalivaRawDataFrame)) and returned." 
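, "

As a rough picture of what this conversion involves, the following sketch reshapes a small, made-up wide-format table into long format with plain pandas. It is not BioPsyKit's loader: the column names `S0` to `S2`, the toy values, and the resulting index layout are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical wide-format table: one row per subject, one column per saliva sample
df_wide = pd.DataFrame(
    {
        'subject': ['Vp01', 'Vp02'],
        'condition': ['Control', 'Intervention'],
        'S0': [4.2, 3.9],
        'S1': [6.1, 5.0],
        'S2': [5.5, 4.7],
    }
)

# Melt the sample columns into rows: one row per (subject, sample) pair
df_long = df_wide.melt(
    id_vars=['condition', 'subject'], var_name='sample', value_name='cortisol'
)
# Strip the 'S' prefix so that the sample IDs become '0', '1', ...
df_long['sample'] = df_long['sample'].str.replace('S', '', regex=False)
df_long = df_long.set_index(['condition', 'subject', 'sample']).sort_index()
```

After this step, `df_long` has one cortisol value per row, which is the shape that the feature and plotting functions in this example work on."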
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_cort = bp.example_data.get_saliva_example()\n", "# Alternatively, use your own data:\n", "# df_cort = bp.io.saliva.load_saliva_wide_format(file_path=\"\", saliva_type=\"cortisol\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_cort.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Save Data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Save the dataframe as a CSV file (in standardized format):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# bp.io.saliva.save_saliva(\"\", df_cort)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Saliva Data Processing" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Saliva data as a [SalivaRawDataFrame](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.utils.datatype_helper.html#biopsykit.utils.datatype_helper.SalivaRawDataFrame):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_cort.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Mean and Standard Error over all Subjects" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Computing mean and standard error over all subjects returns a [SalivaMeanSeDataFrame](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.utils.datatype_helper.html#biopsykit.utils.datatype_helper.SalivaMeanSeDataFrame). This dataframe can be passed directly to plotting functions from `BioPsyKit`, such as [plotting.saliva_plot()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.plotting.html#biopsykit.protocols.plotting.saliva_plot) (more on that later)." 
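, "

The aggregation behind this step can be sketched in plain pandas. This is a minimal illustration on made-up data, not BioPsyKit's actual implementation: the toy values, the names `df_toy` and `toy_mean_se`, and the output column name `sem` are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical long-format data with a (condition, subject, sample) index
df_toy = pd.DataFrame(
    {
        'condition': ['Control'] * 4 + ['Intervention'] * 4,
        'subject': ['Vp01', 'Vp01', 'Vp02', 'Vp02', 'Vp03', 'Vp03', 'Vp04', 'Vp04'],
        'sample': ['0', '1'] * 4,
        'cortisol': [4.0, 6.0, 6.0, 8.0, 3.0, 5.0, 5.0, 9.0],
    }
).set_index(['condition', 'subject', 'sample'])

# Aggregate over subjects: mean and standard error per condition and sample
toy_mean_se = df_toy['cortisol'].groupby(level=['condition', 'sample']).agg(['mean', 'sem'])
```

Each row of `toy_mean_se` holds the mean and standard error of one sample within one condition, computed across subjects."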
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_cort_mean_se = bp.saliva.mean_se(df_cort)\n", "df_cort_mean_se.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Saliva Features" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Standard Features" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compute a set of \"Standard Features\", including:\n", "\n", "* `argmax`: location of maximum\n", "* `mean`: mean value\n", "* `std`: standard deviation\n", "* `skew`: skewness\n", "* `kurt`: kurtosis\n", "\n", "using [saliva.standard_features()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.saliva.html#biopsykit.saliva.standard_features)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "bp.saliva.standard_features(df_cort).head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### AUC" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Area under the Curve (AUC), in different variations (according to Pruessner et al. 2003):\n", "\n", "* `auc_g`: Total Area under the Curve\n", "* `auc_i`: Area under the Curve with respect to increase\n", "* `auc_i_post`: (if `compute_auc_post=True`) Area under the Curve with respect to increase *after* the stressor: This is only relevant for an acute stress scenario where we collected saliva samples before and after the stressor. Per `BioPsyKit` convention, saliva samples collected *after* the stressor have saliva times $t \\geq 0$.\n", "\n", "(see the documentation of [saliva.auc()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.saliva.html#biopsykit.saliva.auc) for further information)." 
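, "

The first two variants listed above can be sketched in plain Python using the trapezoidal formulas from Pruessner et al. (2003). This is a minimal sketch for a single subject, not BioPsyKit's implementation: the helper names `auc_ground` and `auc_increase` and the cortisol values are made up for illustration, while the time points are the sample times of S1-S6 defined earlier in this example.

```python
def auc_ground(values, times):
    # AUC_G: trapezoidal area between the curve and zero
    return sum(
        (v0 + v1) / 2 * (t1 - t0)
        for v0, v1, t0, t1 in zip(values[:-1], values[1:], times[:-1], times[1:])
    )


def auc_increase(values, times):
    # AUC_I: AUC_G minus the rectangle spanned by the first sample over the full duration
    return auc_ground(values, times) - values[0] * (times[-1] - times[0])


# Sample times of S1-S6 (S0 removed) and hypothetical cortisol values of one subject
times_toy = [-1, 30, 40, 50, 60, 70]
cortisol_toy = [4.0, 8.0, 10.0, 9.0, 7.0, 5.0]

auc_g = auc_ground(cortisol_toy, times_toy)
auc_i = auc_increase(cortisol_toy, times_toy)
```

Because `auc_i` subtracts the baseline given by the first sample, it can become negative when cortisol mostly decreases over the measurement period."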
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Note**: For these examples we neglect the first saliva sample (S0) by setting `remove_s0` to `True` because this sample was only assessed to control for a high cortisol baseline and is not relevant for feature computation. The feature computation is performed on the remaining saliva samples (S1-S6)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# sample times must directly be supplied if they are not already part of the saliva dataframe\n", "bp.saliva.auc(df_cort, remove_s0=True, sample_times=sample_times).head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Maximum Increase" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Absolute maximum increase (or the relative increase in percent if `percent=True`) between the *first* sample in the data and *all others*.\n", "\n", "(see the documentation of [saliva.max_increase()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.saliva.html#biopsykit.saliva.max_increase) for further information)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "bp.saliva.max_increase(df_cort, remove_s0=True).head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Slope" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Slope between two saliva samples (specified by `sample_idx`).\n", "\n", "(see the documentation of [saliva.slope()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.saliva.html#biopsykit.saliva.slope) for further information)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "bp.saliva.slope(df_cort, sample_idx=(1, 4), sample_times=sample_times).head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Plotting" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using Seaborn (some very simple Examples)" ] }, { 
"cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots()\n", "sns.lineplot(\n", " data=df_cort.reset_index(),\n", " x=\"sample\",\n", " y=\"cortisol\",\n", " hue=\"condition\",\n", " hue_order=[\"Control\", \"Intervention\"],\n", " ci=None, # ci = None => no error bars\n", " ax=ax,\n", ")\n", "ax.set_ylabel(\"Cortisol [nmol/l]\")\n", "ax.set_xlabel(\"Sample Times\")\n", "fig.tight_layout()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots()\n", "sns.lineplot(\n", " data=df_cort.reset_index(), x=\"sample\", y=\"cortisol\", hue=\"condition\", hue_order=[\"Control\", \"Intervention\"], ax=ax\n", ")\n", "ax.set_ylabel(\"Cortisol [nmol/l]\")\n", "ax.set_xlabel(\"Sample Times\")\n", "fig.tight_layout()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots()\n", "sns.boxplot(\n", " data=df_cort.reset_index(),\n", " x=\"sample\",\n", " y=\"cortisol\",\n", " hue=\"condition\",\n", " hue_order=[\"Control\", \"Intervention\"],\n", " palette=cmaps.faculties_light,\n", " ax=ax,\n", ")\n", "fig.tight_layout()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots()\n", "sns.boxplot(\n", " data=bp.saliva.max_increase(df_cort).reset_index(),\n", " x=\"condition\",\n", " y=\"cortisol_max_inc\",\n", " order=[\"Control\", \"Intervention\"],\n", " palette=cmaps.faculties_light,\n", ")\n", "fig.tight_layout()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots()\n", "\n", "data_features = bp.saliva.utils.saliva_feature_wide_to_long(bp.saliva.standard_features(df_cort), \"cortisol\")\n", "\n", "sns.boxplot(\n", " data=data_features.reset_index(),\n", " x=\"saliva_feature\",\n", " y=\"cortisol\",\n", " hue=\"condition\",\n", " hue_order=[\"Control\", 
\"Intervention\"],\n", " palette=cmaps.faculties_light,\n", " ax=ax,\n", ")\n", "fig.tight_layout()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using functions from `BioPsyKit`" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_cort_mean_se.T" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Note**: This example shows how to use the function [plotting.saliva_plot()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.plotting.html#biopsykit.protocols.plotting.saliva_plot). If you recorded saliva data within a psychological protocol, however, it's recommended to use the Protocol API and create a [Protocol](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.html) object, where you can add all data and use plotting functions in a more convenient way." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Saliva Plot" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "nbsphinx-thumbnail" ] }, "outputs": [], "source": [ "# plot without first saliva sample\n", "bp.protocols.plotting.saliva_plot(\n", " df_cort_mean_se.drop(\"0\", level=\"sample\"),\n", " saliva_type=\"cortisol\",\n", " sample_times=sample_times[1:],\n", " sample_times_absolute=True,\n", " test_times=[0, 30],\n", " test_title=\"TEST\",\n", ");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Saliva Features" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "All features in one plot using [saliva_feature_boxplot()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.plotting.html#biopsykit.protocols.plotting.saliva_feature_boxplot), `x` variable separates the features, `hue` variable separates the conditions." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots()\n", "bp.protocols.plotting.saliva_feature_boxplot(\n", " data=data_features,\n", " x=\"saliva_feature\",\n", " saliva_type=\"cortisol\",\n", " hue=\"condition\",\n", " hue_order=[\"Control\", \"Intervention\"],\n", " palette=cmaps.faculties_light,\n", " ax=ax,\n", ")\n", "fig.tight_layout()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One subplot per feature using [saliva_multi_feature_boxplot()](https://biopsykit.readthedocs.io/en/latest/api/biopsykit.protocols.plotting.html#biopsykit.protocols.plotting.saliva_multi_feature_boxplot): the `hue` variable separates the conditions, and `features` is given as a list of the features to plot." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# keep the returned figure so that tight_layout() is applied to this plot\n", "fig, axs = bp.protocols.plotting.saliva_multi_feature_boxplot(\n", " data=data_features,\n", " saliva_type=\"cortisol\",\n", " features=[\"argmax\", \"kurt\", \"std\", \"mean\"],\n", " hue=\"condition\",\n", " hue_order=[\"Control\", \"Intervention\"],\n", " palette=cmaps.faculties_light,\n", ")\n", "fig.tight_layout()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Several features can be grouped per subplot: in this case, `features` is given as a dictionary that specifies the mapping." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# keep the returned figure so that tight_layout() is applied to this plot\n", "fig, axs = bp.protocols.plotting.saliva_multi_feature_boxplot(\n", " data=data_features,\n", " saliva_type=\"cortisol\",\n", " features={\"mean\": [\"mean\", \"argmax\"], \"std\": [\"std\", \"skew\", \"kurt\"]},\n", " hue=\"condition\",\n", " hue_order=[\"Control\", \"Intervention\"],\n", " palette=cmaps.faculties_light,\n", ")\n", "fig.tight_layout()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "biopsykit", "language": "python", "name": "biopsykit" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.3" }, "toc-showtags": false }, "nbformat": 4, "nbformat_minor": 4 }