{
 "metadata": {
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.3"
  },
  "orig_nbformat": 2,
  "kernelspec": {
   "name": "python383jvsc74a57bd0449a6a5da217c2d6cffa80a6a7a4724fa6726b57ec2f120da10e91a7f29f4eb0",
   "display_name": "Python 3.8.3 64-bit ('runpandas_dev': conda)"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2,
 "cells": [
  {
   "source": [
    "# Analyzing the heart data and building the respective training zones\n",
    "Many runners use the heart rate (HR) data to plan their training strategies. Heart rate is an individual metric and differs between athletes running the same pace, therefore, it can help them to pace themselves properly and can be a useful metric to gauge fatigue and fitness level.  For many experienced runners, it would be important to analyze your historical workouts, explore your heart rate range variation and check the ellapsed time through each training zone in order to ensure that you are running at the right effort level to maximize your workout. In this notebook we will present how we can explore our heart rate (HR) records and the respective training zones based on the data provided from the smartwatches or tracking running apps.\n",
    "\n",
    "\n",
    "\n"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "source": [
    "## Finding my maximum HR"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "source": [
    "The easiest way resorts to a empirical equation, ``HRmax = 220 - Age``. It is drawn from some epidemiological studies; thus, may not personalized. A better way is to monitor HR while running uphill intervals or use your historical data to compute your HRmax. In this scenario let's use the empirical HRmax. The current runner's age is 36, so my max heart rate would be 184 (HRmax = 220-36)."
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "source": [
    "## Computing the training zones"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "source": [
    "As we explained above, the heart rate is grouped by zones: active recovery (easy running), general aerobic (marathon pace), basic endurance or the lactate threshold zone (tempo run) and the anaerobic or the VO2 max work(interval). For the maximum heart rate of 184, based on the Garmin's thresholds, the zones can be calculated as:\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "| Zone|  | heart_rate|\n",
    "| :- | :- | :-: |\n",
    "| Z1 (Active recovery)| | 110 |\n",
    "| Z2  (Endurance)| | 129 |\n",
    "| Z3 (Tempo)| | 147 |\n",
    "| Z4 (Threshold)| | 166 |\n",
    "| Z5 (VO2 Max) | | 184 |"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "source": [
    "## Using the Heart Rate Zones to Review a Workout"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "source": [
    "Now that we have the training zones, by using runpandas we could play with our workouts and evaluate the quality of the workouts based on its training zones. In this example, I selected one of workouts to further explore these zones and how they correlate with the sensors data available recorded.\n",
    "\n",
    "\n",
    "\n"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import warnings\n",
    "warnings.filterwarnings('ignore')"
   ]
  },
  {
   "source": [
    "import runpandas\n",
    "activity = runpandas.read_file('./11km.tcx')\n",
    "print('Start', activity.index[0],'End:', activity.index[-1])\n",
    "print('Distance in km:', activity.distance / 1000)"
   ],
   "cell_type": "code",
   "metadata": {},
   "execution_count": 2,
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "Start 0 days 00:00:00 End: 0 days 01:16:06\nDistance in km: 11.035920898\n"
     ]
    }
   ]
  },
  {
   "source": [
    "First, let's perform a QC evaluation on the data, to check if there's any invalid or missing data required for the analysis. As you can see in the cell below, there are 5 records with heart rate data missing. We will replace all these with the first HR sensor data available.\n",
    "\n"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "There are nan records: 5\n          run_cadence         alt       dist  hr        lon       lat  \\\ntime                                                                    \n00:00:00          NaN  668.801819   0.000000 NaN -36.577568 -8.364486   \n00:00:07          NaN  668.714722   5.749573 NaN -36.577465 -8.364492   \n00:00:10          NaN  668.680603  11.615299 NaN -36.577423 -8.364470   \n00:00:12         83.0  668.639099  17.306795 NaN -36.577366 -8.364449   \n00:00:15         82.0  668.600464  22.672394 NaN -36.577312 -8.364429   \n\n             speed  \ntime                \n00:00:00  0.000000  \n00:00:07  0.000000  \n00:00:10  0.000000  \n00:00:12  2.262762  \n00:00:15  2.317986  \nTotal nan after fill: 0\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "\n",
    "group_hr = activity['hr'].isnull().sum()\n",
    "print(\"There are nan records: %d\" % group_hr)\n",
    "\n",
    "#There is 5 missing values in HR. Let's see the positions where they are placed in the frame.\n",
    "print(activity[activity['hr'].isnull()])\n",
    "\n",
    "#We will replace all NaN values with the first HR sensor data available\n",
    "activity['hr'].fillna(activity.iloc[5]['hr'], inplace=True)\n",
    "\n",
    "print('Total nan after fill:', activity['hr'].isnull().sum())"
   ]
  },
  {
   "source": [
    "Let's see how to add a column with the heart rate zone label to the data frame.  For this task, we will use the special method `runpandas.compute.heart_zone` . The parameters are the bins argument which contains the left and right bounds for each training zone and the labels argument corresponding to the zone labels\n",
    "\n"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "time\n",
       "00:00:00    Z1\n",
       "00:00:07    Z1\n",
       "00:00:10    Z1\n",
       "00:00:12    Z1\n",
       "00:00:15    Z1\n",
       "Name: heartrate_zone, dtype: category\n",
       "Categories (6, object): [Rest < Z1 < Z2 < Z3 < Z4 < Z5]"
      ]
     },
     "metadata": {},
     "execution_count": 4
    }
   ],
   "source": [
    "activity['heartrate_zone'] = activity.compute.heart_zone(\n",
    "                        labels=[\"Rest\", \"Z1\", \"Z2\", \"Z3\", \"Z4\", \"Z5\"],\n",
    "                    bins=[0, 92, 110, 129, 147, 166, 184])\n",
    "activity[\"heartrate_zone\"].head()"
   ]
  },
  {
   "source": [
    "To calculate the time in zone, there is also a special method `runpandas.compute.time_in_zone` which computes the time spent for each training zone.\n",
    "\n"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "hr_zone\n",
       "Rest   00:00:00\n",
       "Z1     00:04:10\n",
       "Z2     00:07:05\n",
       "Z3     00:31:45\n",
       "Z4     00:33:06\n",
       "Z5     00:00:00\n",
       "Name: time_diff, dtype: timedelta64[ns]"
      ]
     },
     "metadata": {},
     "execution_count": 5
    }
   ],
   "source": [
    "time_in_zone = activity.compute.time_in_zone(\n",
    "                        labels=[\"Rest\", \"Z1\", \"Z2\", \"Z3\", \"Z4\", \"Z5\"],\n",
    "                    bins=[0, 92, 110, 129, 147, 166, 184])\n",
    "\n",
    "time_in_zone"
   ]
  },
  {
   "source": [],
   "cell_type": "markdown",
   "metadata": {}
  }
 ]
}