{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Learning Rate Schedulers with Skorch\n",
    "\n",
    "This notebook demonstrates three learning rate schedulers in skorch: StepLR, ReduceLROnPlateau, and CosineAnnealingLR. It was contributed by [Parag Ekbote](https://github.com/ParagEkbote).\n",
    "\n",
    "First, install the following libraries: skorch, numpy, matplotlib, scikit-learn, and torch."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<table align=\"left\"><td>\n",
    "<a target=\"_blank\" href=\"https://colab.research.google.com/github/skorch-dev/skorch/blob/master/notebooks/Learning_Rate_Scheduler.ipynb\">\n",
    "    <img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a>  \n",
    "</td><td>\n",
    "<a target=\"_blank\" href=\"https://github.com/skorch-dev/skorch/blob/master/notebooks/Learning_Rate_Scheduler.ipynb\"><img width=32px src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" />View source on GitHub</a></td></table>\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Imports"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "from sklearn.datasets import load_breast_cancer\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn.preprocessing import StandardScaler\n",
    "import torch\n",
    "import torch.optim as optim\n",
    "import torch.nn as nn\n",
    "import numpy as np\n",
    "from skorch import NeuralNetClassifier\n",
    "from skorch.callbacks import LRScheduler\n",
    "from skorch.callbacks import Callback\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data Preparation\n",
    "\n",
    "The dataset is split into training and test sets. The features are standardized and cast to float32, and the labels are cast to float32 and reshaped into a column vector, the shape that the binary cross-entropy loss used below expects.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "def prepare_data():\n",
    "    # Load the dataset\n",
    "    data = load_breast_cancer()\n",
    "    X, y = data.data, data.target\n",
    "    \n",
    "    # Split into train and test sets\n",
    "    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)\n",
    "    \n",
    "    # Standardize the features\n",
    "    scaler = StandardScaler()\n",
    "    X_train_scaled = scaler.fit_transform(X_train).astype(np.float32)\n",
    "    X_test_scaled = scaler.transform(X_test).astype(np.float32)\n",
    "    \n",
    "    # Reshape the labels for compatibility\n",
    "    y_train = y_train.astype(np.float32).reshape(-1, 1)\n",
    "    y_test = y_test.astype(np.float32).reshape(-1, 1)\n",
    "    \n",
    "    return X_train_scaled, X_test_scaled, y_train, y_test\n",
    " "
   ]
  },
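  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick check of the function above, the following cell calls `prepare_data()` and prints the shape and dtype of each returned array. It is only a usage sketch; the variable names prefixed with `bc_` are chosen for illustration so that they do not collide with names used later in the notebook."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Usage sketch: inspect what prepare_data() returns\n",
    "bc_X_train, bc_X_test, bc_y_train, bc_y_test = prepare_data()\n",
    "\n",
    "arrays = {\n",
    "    'X_train': bc_X_train,\n",
    "    'X_test': bc_X_test,\n",
    "    'y_train': bc_y_train,\n",
    "    'y_test': bc_y_test,\n",
    "}\n",
    "for name, arr in arrays.items():\n",
    "    # Features should be float32 with 30 columns; labels should be float32 column vectors\n",
    "    print(f'{name}: shape={arr.shape}, dtype={arr.dtype}')"
   ]
  },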
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Neural Net Parameters\n",
    "\n",
    "The BreastCancerNet is a small feed-forward network for binary classification. It consists of one hidden layer with a ReLU activation, followed by a linear output layer that produces a single logit. The input and hidden dimensions are constructor arguments (`input_dim`, `hidden_dim`), so they can be adjusted without changing the class."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "class BreastCancerNet(nn.Module):\n",
    "    def __init__(self, input_dim=30, hidden_dim=64):\n",
    "        super(BreastCancerNet, self).__init__()\n",
    "        self.fc1 = nn.Linear(input_dim, hidden_dim)\n",
    "        self.relu = nn.ReLU()\n",
    "        self.fc2 = nn.Linear(hidden_dim, 1)\n",
    "\n",
    "    def forward(self, x):\n",
    "        x = self.fc1(x)\n",
    "        x = self.relu(x)\n",
    "        x = self.fc2(x)\n",
    "        return x\n"
   ]
  },
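  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the architecture concrete, the following cell is a small sketch that passes a random batch through the network and checks the output shape. The batch size of 8 and the names `check_net` and `check_batch` are arbitrary choices for illustration. The network returns raw logits, so a sigmoid is applied to obtain probabilities."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sketch: forward a random batch to confirm the output shape\n",
    "check_net = BreastCancerNet(input_dim=30, hidden_dim=64)\n",
    "check_batch = torch.randn(8, 30)  # 8 samples with 30 features each\n",
    "\n",
    "with torch.no_grad():\n",
    "    logits = check_net(check_batch)\n",
    "\n",
    "# One logit per sample\n",
    "assert logits.shape == (8, 1)\n",
    "\n",
    "# The module outputs logits; sigmoid maps them to probabilities in (0, 1)\n",
    "probs = torch.sigmoid(logits)\n",
    "print(logits.shape, float(probs.min()), float(probs.max()))"
   ]
  },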
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Learning Rate Scheduler Parameters\n",
    "\n",
    "1) StepLR:\n",
    "\n",
    "- Multiplies the learning rate by a fixed factor (gamma=0.3) every 100 epochs (step_size=100).\n",
    "- Useful for steady, predictable learning rate decay.\n",
    "\n",
    "2) ReduceLROnPlateau:\n",
    "\n",
    "- Reduces the learning rate dynamically when the monitored metric (e.g., loss) plateaus.\n",
    "- Multiplies the learning rate by a factor (factor=0.7) once the metric has failed to improve for more than 5 consecutive epochs (patience=5).\n",
    "- Ideal for tasks where loss stagnation indicates the need for smaller learning rates.\n",
    "\n",
    "3) CosineAnnealingLR:\n",
    "\n",
    "- Decays the learning rate along a cosine curve over 1000 epochs (T_max=1000).\n",
    "- With T_max equal to the number of training epochs, the learning rate falls smoothly from its initial value to eta_min (0 by default) with no restarts.\n",
    "\n",
    "We will now train the neural network with each of these schedulers.\n",
    "\n",
    "Note: The cell below trains on randomly generated synthetic data, so the test scores are close to chance. The breast cancer dataset returned by `prepare_data()` can be used instead."
   ]
  },
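  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The decay rules listed above can be traced without training a model. The following cell is a minimal sketch that attaches each PyTorch scheduler to a throwaway SGD optimizer and records the learning rate epoch by epoch. The helper `trace_lr` and the constant metric fed to ReduceLROnPlateau are assumptions made for illustration; they are not part of skorch or of the training code in this notebook."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "from torch.optim.lr_scheduler import StepLR, ReduceLROnPlateau, CosineAnnealingLR\n",
    "\n",
    "\n",
    "def trace_lr(make_scheduler, epochs, lr=0.05, metric=None):\n",
    "    # Hypothetical helper: return the learning rate in effect at the start of each epoch\n",
    "    param = torch.nn.Parameter(torch.zeros(1))  # dummy parameter, never trained\n",
    "    optimizer = torch.optim.SGD([param], lr=lr)\n",
    "    scheduler = make_scheduler(optimizer)\n",
    "    lrs = []\n",
    "    for _ in range(epochs):\n",
    "        lrs.append(optimizer.param_groups[0]['lr'])\n",
    "        optimizer.step()  # no gradients, so nothing is updated; keeps the expected call order\n",
    "        if metric is None:\n",
    "            scheduler.step()\n",
    "        else:\n",
    "            scheduler.step(metric)  # ReduceLROnPlateau needs the monitored value\n",
    "    return lrs\n",
    "\n",
    "\n",
    "step_lrs = trace_lr(lambda opt: StepLR(opt, step_size=100, gamma=0.3), epochs=300)\n",
    "cosine_lrs = trace_lr(lambda opt: CosineAnnealingLR(opt, T_max=1000), epochs=1000)\n",
    "# A constant metric never improves after the first epoch, which simulates a plateau\n",
    "plateau_lrs = trace_lr(\n",
    "    lambda opt: ReduceLROnPlateau(opt, mode='min', factor=0.7, patience=5),\n",
    "    epochs=10,\n",
    "    metric=1.0,\n",
    ")\n",
    "\n",
    "print('StepLR at epochs 0, 100, 200:', step_lrs[0], step_lrs[100], step_lrs[200])\n",
    "print('CosineAnnealingLR at epochs 0, 500, 999:', cosine_lrs[0], cosine_lrs[500], cosine_lrs[999])\n",
    "print('ReduceLROnPlateau on a flat metric:', plateau_lrs)"
   ]
  },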
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "Training with StepLR scheduler...\n",
      "StepLR Test Score: 0.4250\n",
      "StepLR Recorded Learning Rates: [0.05, 0.05, 0.05, 0.05, 0.05]...\n",
      "\n",
      "Training with ReduceLROnPlateau scheduler...\n",
      "ReduceLROnPlateau Test Score: 0.5750\n",
      "ReduceLROnPlateau Recorded Learning Rates: [0.05, 0.05, 0.05, 0.05, 0.05]...\n",
      "\n",
      "Training with CosineAnnealingLR scheduler...\n",
      "CosineAnnealingLR Test Score: 0.5250\n",
      "CosineAnnealingLR Recorded Learning Rates: [0.04999987663004646, 0.049999506521403426, 0.04999888967772375, 0.04999802610509541, 0.04999691581204153]...\n",
      "\n",
      "Final Results Summary:\n",
      "\n",
      "Scheduler: StepLR\n",
      "Test Score: 0.4250\n",
      "First 5 Learning Rates: [0.05, 0.05, 0.05, 0.05, 0.05]\n",
      "\n",
      "Scheduler: ReduceLROnPlateau\n",
      "Test Score: 0.5750\n",
      "First 5 Learning Rates: [0.05, 0.05, 0.05, 0.05, 0.05]\n",
      "\n",
      "Scheduler: CosineAnnealingLR\n",
      "Test Score: 0.5250\n",
      "First 5 Learning Rates: [0.04999987663004646, 0.049999506521403426, 0.04999888967772375, 0.04999802610509541, 0.04999691581204153]\n"
     ]
    }
   ],
   "source": [
    "class LRCaptureCallback(Callback):\n",
    "    def on_epoch_end(self, net, **kwargs):\n",
    "        # Record the optimizer's current learning rate in the history row of this epoch\n",
    "        lr = net.optimizer_.param_groups[0]['lr']\n",
    "        net.history.record('lr', lr)\n",
    "\n",
    "# Training function with learning rate tracking\n",
    "def train_schedulers(X_train, X_test, y_train, y_test, lr=0.05, epochs=1000, hidden_dim=128):\n",
    "    # Cast features to float32 and reshape labels into float32 column vectors\n",
    "    X_train = X_train.astype(np.float32)\n",
    "    X_test = X_test.astype(np.float32)\n",
    "    y_train = y_train.astype(np.float32).reshape(-1, 1)\n",
    "    y_test = y_test.astype(np.float32).reshape(-1, 1)\n",
    "\n",
    "    # Hold out 40% of the training data (X_val and y_val are not used below)\n",
    "    X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.4, random_state=42)\n",
    "\n",
    "    # Define learning rate schedulers\n",
    "    schedulers = [\n",
    "        {\"name\": \"StepLR\", \"scheduler_class\": torch.optim.lr_scheduler.StepLR, \"params\": {\"step_size\": 100, \"gamma\": 0.3}},\n",
    "        # mode=\"min\" because the monitored quantity is a loss, which should decrease\n",
    "        {\"name\": \"ReduceLROnPlateau\", \"scheduler_class\": torch.optim.lr_scheduler.ReduceLROnPlateau, \"params\": {\"mode\": \"min\", \"factor\": 0.7, \"patience\": 5}},\n",
    "        {\"name\": \"CosineAnnealingLR\", \"scheduler_class\": torch.optim.lr_scheduler.CosineAnnealingLR, \"params\": {\"T_max\": 1000}},\n",
    "    ]\n",
    "\n",
    "    results = {}\n",
    "    for scheduler_info in schedulers:\n",
    "        print(f\"\\nTraining with {scheduler_info['name']} scheduler...\")\n",
    "\n",
    "        # Set up the neural network with the specified scheduler\n",
    "        net = NeuralNetClassifier(\n",
    "            module=BreastCancerNet,\n",
    "            max_epochs=epochs,\n",
    "            lr=lr,\n",
    "            optimizer=optim.SGD,\n",
    "            criterion=nn.BCEWithLogitsLoss,\n",
    "            callbacks=[\n",
    "                LRScheduler(\n",
    "                    policy=scheduler_info[\"scheduler_class\"],\n",
    "                    **scheduler_info[\"params\"]\n",
    "                ),\n",
    "                LRCaptureCallback(),\n",
    "            ],\n",
    "            iterator_train__shuffle=True,\n",
    "            train_split=None,\n",
    "            module__input_dim=X_train.shape[1],\n",
    "            module__hidden_dim=hidden_dim,\n",
    "            verbose=0\n",
    "        )\n",
    "\n",
    "        # Train the model\n",
    "        net.fit(X_train, y_train)\n",
    "\n",
    "        # Evaluate the model on the test set\n",
    "        score = net.score(X_test, y_test)\n",
    "        print(f\"{scheduler_info['name']} Test Score: {score:.4f}\")\n",
    "\n",
    "        # Extract learning rates \n",
    "        lrs = [event['lr'] for event in net.history if 'lr' in event]\n",
    "        print(f\"{scheduler_info['name']} Recorded Learning Rates: {lrs[:5]}...\")\n",
    "\n",
    "        # Save results\n",
    "        results[scheduler_info[\"name\"]] = {\n",
    "            \"model\": net,\n",
    "            \"learning_rates\": lrs,\n",
    "            \"score\": score,\n",
    "        }\n",
    "\n",
    "    print(\"\\nFinal Results Summary:\")\n",
    "    for scheduler_name, result in results.items():\n",
    "        print(f\"\\nScheduler: {scheduler_name}\")\n",
    "        print(f\"Test Score: {result['score']:.4f}\")\n",
    "        print(f\"First 5 Learning Rates: {result['learning_rates'][:5]}\")\n",
    "\n",
    "    return results\n",
    "\n",
    "\n",
    "# Generate synthetic data\n",
    "X_train = np.random.rand(100, 30)\n",
    "X_test = np.random.rand(40, 30)\n",
    "y_train = np.random.randint(0, 2, size=(100,))\n",
    "y_test = np.random.randint(0, 2, size=(40,))\n",
    "\n",
    "# Train with schedulers and evaluate\n",
    "results = train_schedulers(X_train, X_test, y_train, y_test, lr=0.05, epochs=1000, hidden_dim=128)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Visualization of Results\n",
    "\n",
    "We observe the following:\n",
    "\n",
    "1) StepLR reduces the learning rate in fixed steps, ReduceLROnPlateau lowers it adaptively when progress stagnates, and CosineAnnealingLR decays it smoothly along a cosine curve.\n",
    "\n",
    "2) The right scheduler depends on the task: StepLR suits predefined decay schedules, ReduceLROnPlateau suits dynamic adjustments driven by a monitored metric, and CosineAnnealingLR suits a smooth decay over a fixed training budget. For periodic restarts, PyTorch provides CosineAnnealingWarmRestarts."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "",
      "text/plain": [
       "<Figure size 1200x600 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "def plot_results(results):\n",
    "    plt.figure(figsize=(12, 6))\n",
    "    \n",
    "    for scheduler_name, result in results.items():\n",
    "        # Extract learning rates \n",
    "        learning_rates = result[\"model\"].history[:, 'lr']\n",
    "        plt.plot(learning_rates, label=scheduler_name)\n",
    "    \n",
    "    plt.title(\"Learning Rate Schedules\")\n",
    "    plt.xlabel(\"Epochs\")\n",
    "    plt.ylabel(\"Learning Rate\")\n",
    "    plt.legend()\n",
    "    plt.grid(True)\n",
    "    plt.show()\n",
    "\n",
    "plot_results(results)\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}