{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Cross entropy baseline model for ordinal regression and deep learning -- cement strength dataset"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This is a regular cross entropy classifier as a baseline for comparison with ordinal regression methods."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 0 -- Obtaining and preparing the cement_strength dataset"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We will be using the cement_strength dataset from [https://github.com/gagolews/ordinal_regression_data/blob/master/cement_strength.csv](https://github.com/gagolews/ordinal_regression_data/blob/master/cement_strength.csv).\n",
    "\n",
    "First, we are going to download and prepare the and save it as CSV files locally. This is a general procedure that is not specific to CORN.\n",
    "\n",
    "This dataset has 5 ordinal labels (1, 2, 3, 4, and 5). Note that we require labels to be starting at 0, which is why we subtract \"1\" from the label column."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Number of features: 8\n",
      "Number of examples: 998\n",
      "Labels: [0 1 2 3 4]\n"
     ]
    }
   ],
   "source": [
    "import pandas as pd\n",
    "import numpy as np\n",
    "\n",
    "\n",
    "data_df = pd.read_csv(\"https://raw.githubusercontent.com/gagolews/ordinal_regression_data/master/cement_strength.csv\")\n",
    " \n",
    "data_df[\"response\"] = data_df[\"response\"]-1 # labels should start at 0\n",
    "\n",
    "data_labels = data_df[\"response\"]\n",
    "data_features = data_df.loc[:, [\"V1\", \"V2\", \"V3\", \"V4\", \"V5\", \"V6\", \"V7\", \"V8\"]]\n",
    "\n",
    "print('Number of features:', data_features.shape[1])\n",
    "print('Number of examples:', data_features.shape[0])\n",
    "print('Labels:', np.unique(data_labels.values))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "tags": []
   },
   "source": [
    "### Split into training and test data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.model_selection import train_test_split\n",
    "\n",
    "\n",
    "X_train, X_test, y_train, y_test = train_test_split(\n",
    "    data_features.values,\n",
    "    data_labels.values,\n",
    "    test_size=0.2,\n",
    "    random_state=1,\n",
    "    stratify=data_labels.values)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Standardize features"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.preprocessing import StandardScaler\n",
    "\n",
    "\n",
    "sc = StandardScaler()\n",
    "X_train_std = sc.fit_transform(X_train)\n",
    "X_test_std = sc.transform(X_test)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1 -- Setting up the dataset and dataloader"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this section, we set up the data set and data loaders. This is a general procedure that is not specific to the method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Training on cuda:0\n"
     ]
    }
   ],
   "source": [
    "import torch\n",
    "\n",
    "\n",
    "##########################\n",
    "### SETTINGS\n",
    "##########################\n",
    "\n",
    "# Hyperparameters\n",
    "random_seed = 1\n",
    "learning_rate = 0.001\n",
    "num_epochs = 50\n",
    "batch_size = 128\n",
    "\n",
    "# Architecture\n",
    "NUM_CLASSES = 5\n",
    "\n",
    "# Other\n",
    "DEVICE = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")\n",
    "print('Training on', DEVICE)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "from torch.utils.data import Dataset\n",
    "\n",
    "\n",
    "class MyDataset(Dataset):\n",
    "\n",
    "    def __init__(self, feature_array, label_array, dtype=np.float32):\n",
    "    \n",
    "        self.features = feature_array.astype(np.float32)\n",
    "        self.labels = label_array\n",
    "\n",
    "    def __getitem__(self, index):\n",
    "        inputs = self.features[index]\n",
    "        label = self.labels[index]\n",
    "        return inputs, label\n",
    "\n",
    "    def __len__(self):\n",
    "        return self.labels.shape[0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Input batch dimensions: torch.Size([128, 8])\n",
      "Input label dimensions: torch.Size([128])\n"
     ]
    }
   ],
   "source": [
    "import torch\n",
    "from torch.utils.data import DataLoader\n",
    "\n",
    "\n",
    "# Note transforms.ToTensor() scales input images\n",
    "# to 0-1 range\n",
    "train_dataset = MyDataset(X_train_std, y_train)\n",
    "test_dataset = MyDataset(X_test_std, y_test)\n",
    "\n",
    "\n",
    "train_loader = DataLoader(dataset=train_dataset,\n",
    "                          batch_size=batch_size,\n",
    "                          shuffle=True, # want to shuffle the dataset\n",
    "                          num_workers=0) # number processes/CPUs to use\n",
    "\n",
    "test_loader = DataLoader(dataset=test_dataset,\n",
    "                         batch_size=batch_size,\n",
    "                         shuffle=False,\n",
    "                         num_workers=0)\n",
    "\n",
    "# Checking the dataset\n",
    "for inputs, labels in train_loader:  \n",
    "    print('Input batch dimensions:', inputs.shape)\n",
    "    print('Input label dimensions:', labels.shape)\n",
    "    break"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2 - Implementing a simple MLP with cross entropy loss"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this section, we are implementing a simple MLP for ordinal regression. To implement the Beckham et al. method, we add the parameter layer `a` as `self.a`, which is used to compute the predictions for the loss function later in the training loop:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "class MLP(torch.nn.Module):\n",
    "\n",
    "    def __init__(self, in_features, num_classes, num_hidden_1=300, num_hidden_2=300):\n",
    "        super().__init__()\n",
    "        \n",
    "        self.num_classes = num_classes\n",
    "        self.my_network = torch.nn.Sequential(\n",
    "            \n",
    "            # 1st hidden layer\n",
    "            torch.nn.Linear(in_features, num_hidden_1, bias=False),\n",
    "            torch.nn.LeakyReLU(),\n",
    "            torch.nn.Dropout(0.2),\n",
    "            torch.nn.BatchNorm1d(num_hidden_1),\n",
    "            \n",
    "            # 2nd hidden layer\n",
    "            torch.nn.Linear(num_hidden_1, num_hidden_2, bias=False),\n",
    "            torch.nn.LeakyReLU(),\n",
    "            torch.nn.Dropout(0.2),\n",
    "            torch.nn.BatchNorm1d(num_hidden_2),\n",
    "            \n",
    "            # Output layer\n",
    "            torch.nn.Linear(num_hidden_2, num_classes)\n",
    "        )\n",
    "\n",
    "    def forward(self, x):\n",
    "        logits = self.my_network(x)\n",
    "        return logits\n",
    "    \n",
    "    \n",
    "    \n",
    "torch.manual_seed(random_seed)\n",
    "model = MLP(in_features=8, num_classes=NUM_CLASSES)\n",
    "model.to(DEVICE)\n",
    "\n",
    "optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3 - Using the reformulated squared error loss loss for model training"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch: 001/050 | Batch 000/007 | Cost: 1.8506\n",
      "Epoch: 002/050 | Batch 000/007 | Cost: 1.2779\n",
      "Epoch: 003/050 | Batch 000/007 | Cost: 1.0849\n",
      "Epoch: 004/050 | Batch 000/007 | Cost: 1.0136\n",
      "Epoch: 005/050 | Batch 000/007 | Cost: 1.0655\n",
      "Epoch: 006/050 | Batch 000/007 | Cost: 0.9198\n",
      "Epoch: 007/050 | Batch 000/007 | Cost: 0.9269\n",
      "Epoch: 008/050 | Batch 000/007 | Cost: 0.8566\n",
      "Epoch: 009/050 | Batch 000/007 | Cost: 0.9192\n",
      "Epoch: 010/050 | Batch 000/007 | Cost: 0.8459\n",
      "Epoch: 011/050 | Batch 000/007 | Cost: 0.8595\n",
      "Epoch: 012/050 | Batch 000/007 | Cost: 0.8126\n",
      "Epoch: 013/050 | Batch 000/007 | Cost: 0.7344\n",
      "Epoch: 014/050 | Batch 000/007 | Cost: 0.7982\n",
      "Epoch: 015/050 | Batch 000/007 | Cost: 0.7587\n",
      "Epoch: 016/050 | Batch 000/007 | Cost: 0.7278\n",
      "Epoch: 017/050 | Batch 000/007 | Cost: 0.5626\n",
      "Epoch: 018/050 | Batch 000/007 | Cost: 0.6570\n",
      "Epoch: 019/050 | Batch 000/007 | Cost: 0.6695\n",
      "Epoch: 020/050 | Batch 000/007 | Cost: 0.8091\n",
      "Epoch: 021/050 | Batch 000/007 | Cost: 0.6433\n",
      "Epoch: 022/050 | Batch 000/007 | Cost: 0.5846\n",
      "Epoch: 023/050 | Batch 000/007 | Cost: 0.6255\n",
      "Epoch: 024/050 | Batch 000/007 | Cost: 0.6438\n",
      "Epoch: 025/050 | Batch 000/007 | Cost: 0.6645\n",
      "Epoch: 026/050 | Batch 000/007 | Cost: 0.6947\n",
      "Epoch: 027/050 | Batch 000/007 | Cost: 0.5889\n",
      "Epoch: 028/050 | Batch 000/007 | Cost: 0.6015\n",
      "Epoch: 029/050 | Batch 000/007 | Cost: 0.6087\n",
      "Epoch: 030/050 | Batch 000/007 | Cost: 0.5184\n",
      "Epoch: 031/050 | Batch 000/007 | Cost: 0.5749\n",
      "Epoch: 032/050 | Batch 000/007 | Cost: 0.5191\n",
      "Epoch: 033/050 | Batch 000/007 | Cost: 0.5260\n",
      "Epoch: 034/050 | Batch 000/007 | Cost: 0.6051\n",
      "Epoch: 035/050 | Batch 000/007 | Cost: 0.5267\n",
      "Epoch: 036/050 | Batch 000/007 | Cost: 0.5485\n",
      "Epoch: 037/050 | Batch 000/007 | Cost: 0.4345\n",
      "Epoch: 038/050 | Batch 000/007 | Cost: 0.5198\n",
      "Epoch: 039/050 | Batch 000/007 | Cost: 0.4047\n",
      "Epoch: 040/050 | Batch 000/007 | Cost: 0.5052\n",
      "Epoch: 041/050 | Batch 000/007 | Cost: 0.5436\n",
      "Epoch: 042/050 | Batch 000/007 | Cost: 0.4116\n",
      "Epoch: 043/050 | Batch 000/007 | Cost: 0.4640\n",
      "Epoch: 044/050 | Batch 000/007 | Cost: 0.5765\n",
      "Epoch: 045/050 | Batch 000/007 | Cost: 0.5034\n",
      "Epoch: 046/050 | Batch 000/007 | Cost: 0.5579\n",
      "Epoch: 047/050 | Batch 000/007 | Cost: 0.4336\n",
      "Epoch: 048/050 | Batch 000/007 | Cost: 0.5188\n",
      "Epoch: 049/050 | Batch 000/007 | Cost: 0.5183\n",
      "Epoch: 050/050 | Batch 000/007 | Cost: 0.5013\n"
     ]
    }
   ],
   "source": [
    "for epoch in range(num_epochs):\n",
    "    \n",
    "    model = model.train()\n",
    "    for batch_idx, (features, class_labels) in enumerate(train_loader):\n",
    "\n",
    "        class_labels = class_labels.to(DEVICE)\n",
    "        features = features.to(DEVICE)\n",
    "        logits = model(features)\n",
    "        \n",
    "        logits = model(features)\n",
    "        loss = torch.nn.functional.cross_entropy(logits, class_labels)\n",
    "        \n",
    "        optimizer.zero_grad()\n",
    "        loss.backward()\n",
    "        optimizer.step()\n",
    "        \n",
    "        ### LOGGING\n",
    "        if not batch_idx % 200:\n",
    "            print ('Epoch: %03d/%03d | Batch %03d/%03d | Cost: %.4f' \n",
    "                   %(epoch+1, num_epochs, batch_idx, \n",
    "                     len(train_loader), loss))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4 -- Evaluate model\n",
    "\n",
    "Finally, after model training, we can evaluate the performance of the model. For example, via the mean absolute error and mean squared error measures.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "def beckham_logits_to_labels(logits, model, num_classes):\n",
    "    predictions = beckham_logits_to_predictions(logits, model, num_classes)\n",
    "    return torch.round(predictions).float()\n",
    "    \n",
    "\n",
    "def compute_mae_and_mse(model, data_loader, device):\n",
    "\n",
    "    with torch.no_grad():\n",
    "    \n",
    "        mae, mse, acc, num_examples = 0., 0., 0., 0\n",
    "\n",
    "        for i, (features, targets) in enumerate(data_loader):\n",
    "\n",
    "            features = features.to(device)\n",
    "            targets = targets.float().to(device)\n",
    "\n",
    "            logits = model(features)\n",
    "            predicted_labels = torch.argmax(logits, dim=1)\n",
    "\n",
    "            num_examples += targets.size(0)\n",
    "            mae += torch.sum(torch.abs(predicted_labels - targets))\n",
    "            mse += torch.sum((predicted_labels - targets)**2)\n",
    "\n",
    "        mae = mae / num_examples\n",
    "        mse = mse / num_examples\n",
    "        return mae, mse"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "train_mae, train_mse = compute_mae_and_mse(model, train_loader, DEVICE)\n",
    "test_mae, test_mse = compute_mae_and_mse(model, test_loader, DEVICE)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Mean absolute error (train/test): 0.22 | 0.37\n",
      "Mean squared error (train/test): 0.27 | 0.41\n"
     ]
    }
   ],
   "source": [
    "print(f'Mean absolute error (train/test): {train_mae:.2f} | {test_mae:.2f}')\n",
    "print(f'Mean squared error (train/test): {train_mse:.2f} | {test_mse:.2f}')"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}