{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Schedulers" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "`timm` provides four different learning rate schedulers: \n", "\n", "1. [SGDR: Stochastic Gradient Descent with Warm Restarts](https://arxiv.org/abs/1608.03983)\n", "2. [Stochastic Gradient Descent with Hyperbolic-Tangent Decay on Classification](https://arxiv.org/abs/1806.01593)\n", "3. [StepLR](https://github.com/rwightman/pytorch-image-models/blob/master/timm/scheduler/step_lr.py#L13)\n", "4. [PlateauLRScheduler](https://github.com/rwightman/pytorch-image-models/blob/master/timm/scheduler/plateau_lr.py#L12)\n", "\n", "In this tutorial we will look at each of them in detail, and also see how we can train our models with these schedulers using the `timm` training script, or use them as standalone schedulers in custom PyTorch training scripts." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Available Schedulers" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "In this section we will look at the various schedulers available in `timm`." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### SGDR" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "First, let's look at the `SGDR` scheduler, also referred to as the `cosine` scheduler in `timm`. \n", "\n", "The `SGDR` scheduler, or `Stochastic Gradient Descent with Warm Restarts` scheduler, anneals the learning rate with a cosine schedule, but with a tweak: it resets (restarts) the learning rate to the initial value after a certain number of epochs. " ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "*(figure: SGDR schedule, cosine annealing with warm restarts)*" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "> NOTE: Unlike the built-in PyTorch schedulers, the `timm` schedulers are intended to be consistently called at the END of each epoch, before incrementing the epoch count, to calculate the next epoch's value, and at the END of each optimizer update, after incrementing the update count, to calculate the next update's value. The training loop sketch in the standalone-usage section below shows this pattern." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### StepLR" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "The `StepLR` is a basic step LR schedule with support for warmup and noise. \n", "\n", "> NOTE: PyTorch's `StepLR` implementation does not support warmup or noise. " ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "The `StepLR` annealing schedule looks something like: " ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "*(figure: StepLR schedule, lr halved every 30 epochs)*" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "After every `decay_epochs` epochs, the learning rate is updated to `lr * decay_rate`. In the `StepLR` schedule above, `decay_epochs` is set to 30 and `decay_rate` to 0.5, with an initial `lr` of 1e-4." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### Stochastic Gradient Descent with Hyperbolic-Tangent Decay on Classification " ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "This scheduler is also referred to as `tanh` annealing, where `tanh` stands for the hyperbolic tangent function used for the decay. The annealing using this scheduler looks something like: " ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "*(figure: tanh annealing schedule)*" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "It is similar to `SGDR` in the sense that the learning rate is reset to the initial `lr` after a certain number of epochs, but the annealing is done using the `tanh` function. " ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### PlateauLRScheduler" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "This scheduler is very similar to PyTorch's [ReduceLROnPlateau](https://pytorch.org/docs/stable/_modules/torch/optim/lr_scheduler.html#ReduceLROnPlateau) scheduler. The basic idea is to track an eval metric, and when that metric stops improving for a certain number of epochs, reduce the `lr` by a given decay factor. " ] },
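{ "cell_type": "markdown", "metadata": {}, "source": [ "## Using the schedulers standalone in custom PyTorch training scripts" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "The schedulers above can also be constructed directly for use in your own training loop. Below is a minimal sketch, assuming a recent version of `timm` that exports the scheduler classes from `timm.scheduler`; constructor argument names such as `t_initial`, `decay_t` and `patience_t` may differ slightly between versions, so check the signatures in your installed copy." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import torch\n", "from timm.scheduler import CosineLRScheduler, TanhLRScheduler, StepLRScheduler, PlateauLRScheduler\n", "\n", "# Stand-in model and optimizer; in practice you would pick ONE scheduler\n", "# per optimizer rather than constructing all four on the same one as here.\n", "model = torch.nn.Linear(10, 2)\n", "optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)\n", "\n", "# SGDR / cosine: anneal over t_initial epochs, then restart\n", "cosine = CosineLRScheduler(optimizer, t_initial=50, lr_min=1e-6, warmup_t=5, warmup_lr_init=1e-6)\n", "\n", "# tanh: hyperbolic-tangent annealing over t_initial epochs\n", "tanh = TanhLRScheduler(optimizer, t_initial=50)\n", "\n", "# step: multiply lr by decay_rate every decay_t epochs\n", "# (decay_t=30, decay_rate=0.5 matches the StepLR plot described above)\n", "step = StepLRScheduler(optimizer, decay_t=30, decay_rate=0.5)\n", "\n", "# plateau: reduce lr when the tracked eval metric stagnates for patience_t\n", "# epochs; its step() additionally expects the metric value\n", "plateau = PlateauLRScheduler(optimizer, decay_rate=0.1, patience_t=10)" ] },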
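{ "cell_type": "markdown", "metadata": {}, "source": [ "And here is a sketch of the stepping pattern from the NOTE in the `SGDR` section: at the end of each epoch, `.step()` receives the incremented epoch index so the scheduler can compute the next epoch's value, and at the end of each optimizer update, `.step_update()` receives the running update count. The tiny loop below is only a stand-in for a real training loop." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import torch\n", "from timm.scheduler import CosineLRScheduler\n", "\n", "model = torch.nn.Linear(10, 2)\n", "optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)\n", "scheduler = CosineLRScheduler(optimizer, t_initial=50, lr_min=1e-6)\n", "\n", "num_epochs, updates_per_epoch = 3, 5  # tiny numbers, just for demonstration\n", "num_updates = 0\n", "for epoch in range(num_epochs):\n", "    for _ in range(updates_per_epoch):  # stand-in for iterating a DataLoader\n", "        loss = model(torch.randn(4, 10)).sum()\n", "        optimizer.zero_grad()\n", "        loss.backward()\n", "        optimizer.step()\n", "        num_updates += 1\n", "        # END of each optimizer update, after incrementing the update count\n", "        scheduler.step_update(num_updates=num_updates)\n", "    # END of each epoch, before incrementing the epoch count: pass the\n", "    # incremented index so the scheduler computes the next epoch's value\n", "    scheduler.step(epoch + 1)\n", "    print(epoch, optimizer.param_groups[0]['lr'])" ] },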
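{ "cell_type": "markdown", "metadata": {}, "source": [ "One note on this pattern: because the epoch and update indices are passed in explicitly, the schedulers do not have to track training progress themselves, which makes resuming from a checkpoint straightforward. For `PlateauLRScheduler`, `.step()` additionally takes the tracked eval metric, e.g. `plateau.step(epoch + 1, metric=eval_metric)`, where `eval_metric` is a placeholder for whatever metric you monitor." ] },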
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### PlateauLRScheduler" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This scheduler is very similar to PyTorch's [ReduceLROnPlateau](https://pytorch.org/docs/stable/_modules/torch/optim/lr_scheduler.html#ReduceLROnPlateau) scheduler. The basic idea is to track an eval metric and based on the evaluation metric's value, the `lr` is reduced using `StepLR` if the eval metric is stagnant for a certain number of epochs. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using the various schedulers in the `timm` training script" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is very easy to train our models using the `timm`'s training script. Essentially, we simply pass in a parameter using the `--sched` flag to specify which scheduler to use and the various hyperparameters alongside. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- For `SGDR`, we pass in `--sched cosine`. \n", "- For `PlatueLRScheduler` we pass in `--sched plateau`. \n", "- For `TanhLRScheduler`, we pass in `--sched tanh`.\n", "- For `StepLR`, we pass in `--sched step`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Thus the call to the training script looks something like: \n", "\n", "```python \n", "python train.py --sched cosine --epochs 200 --min-lr 1e-5 --lr-cycle-mul 2 --lr-cycle-limit 2 \n", "```" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 4 }