{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Cross entropy baseline model for ordinal regression and deep learning -- cement strength dataset" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is a regular cross entropy classifier as a baseline for comparison with ordinal regression methods." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 0 -- Obtaining and preparing the cement_strength dataset" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will be using the cement_strength dataset from [https://github.com/gagolews/ordinal_regression_data/blob/master/cement_strength.csv](https://github.com/gagolews/ordinal_regression_data/blob/master/cement_strength.csv).\n", "\n", "First, we are going to download and prepare the and save it as CSV files locally. This is a general procedure that is not specific to CORN.\n", "\n", "This dataset has 5 ordinal labels (1, 2, 3, 4, and 5). Note that we require labels to be starting at 0, which is why we subtract \"1\" from the label column." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Number of features: 8\n", "Number of examples: 998\n", "Labels: [0 1 2 3 4]\n" ] } ], "source": [ "import pandas as pd\n", "import numpy as np\n", "\n", "\n", "data_df = pd.read_csv(\"https://raw.githubusercontent.com/gagolews/ordinal_regression_data/master/cement_strength.csv\")\n", " \n", "data_df[\"response\"] = data_df[\"response\"]-1 # labels should start at 0\n", "\n", "data_labels = data_df[\"response\"]\n", "data_features = data_df.loc[:, [\"V1\", \"V2\", \"V3\", \"V4\", \"V5\", \"V6\", \"V7\", \"V8\"]]\n", "\n", "print('Number of features:', data_features.shape[1])\n", "print('Number of examples:', data_features.shape[0])\n", "print('Labels:', np.unique(data_labels.values))" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "### Split into training and test data" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "from sklearn.model_selection import train_test_split\n", "\n", "\n", "X_train, X_test, y_train, y_test = train_test_split(\n", " data_features.values,\n", " data_labels.values,\n", " test_size=0.2,\n", " random_state=1,\n", " stratify=data_labels.values)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Standardize features" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "from sklearn.preprocessing import StandardScaler\n", "\n", "\n", "sc = StandardScaler()\n", "X_train_std = sc.fit_transform(X_train)\n", "X_test_std = sc.transform(X_test)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1 -- Setting up the dataset and dataloader" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this section, we set up the data set and data loaders. This is a general procedure that is not specific to the method." 
] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Training on cuda:0\n" ] } ], "source": [ "import torch\n", "\n", "\n", "##########################\n", "### SETTINGS\n", "##########################\n", "\n", "# Hyperparameters\n", "random_seed = 1\n", "learning_rate = 0.001\n", "num_epochs = 50\n", "batch_size = 128\n", "\n", "# Architecture\n", "NUM_CLASSES = 5\n", "\n", "# Other\n", "DEVICE = torch.device(\"cuda:0\" if torch.cuda.is_available() else \"cpu\")\n", "print('Training on', DEVICE)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "from torch.utils.data import Dataset\n", "\n", "\n", "class MyDataset(Dataset):\n", "\n", " def __init__(self, feature_array, label_array, dtype=np.float32):\n", " \n", " self.features = feature_array.astype(np.float32)\n", " self.labels = label_array\n", "\n", " def __getitem__(self, index):\n", " inputs = self.features[index]\n", " label = self.labels[index]\n", " return inputs, label\n", "\n", " def __len__(self):\n", " return self.labels.shape[0]" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Input batch dimensions: torch.Size([128, 8])\n", "Input label dimensions: torch.Size([128])\n" ] } ], "source": [ "import torch\n", "from torch.utils.data import DataLoader\n", "\n", "\n", "# Note transforms.ToTensor() scales input images\n", "# to 0-1 range\n", "train_dataset = MyDataset(X_train_std, y_train)\n", "test_dataset = MyDataset(X_test_std, y_test)\n", "\n", "\n", "train_loader = DataLoader(dataset=train_dataset,\n", " batch_size=batch_size,\n", " shuffle=True, # want to shuffle the dataset\n", " num_workers=0) # number processes/CPUs to use\n", "\n", "test_loader = DataLoader(dataset=test_dataset,\n", " batch_size=batch_size,\n", " shuffle=False,\n", " num_workers=0)\n", "\n", "# Checking the dataset\n", "for inputs, labels in train_loader: \n", " print('Input batch dimensions:', inputs.shape)\n", " print('Input label dimensions:', labels.shape)\n", " break" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2 - Implementing a simple MLP with cross entropy loss" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this section, we are implementing a simple MLP for ordinal regression. To implement the Beckham et al. 
{ "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "class MLP(torch.nn.Module):\n", "\n", "    def __init__(self, in_features, num_classes, num_hidden_1=300, num_hidden_2=300):\n", "        super().__init__()\n", "\n", "        self.num_classes = num_classes\n", "        self.my_network = torch.nn.Sequential(\n", "\n", "            # 1st hidden layer\n", "            torch.nn.Linear(in_features, num_hidden_1, bias=False),\n", "            torch.nn.LeakyReLU(),\n", "            torch.nn.Dropout(0.2),\n", "            torch.nn.BatchNorm1d(num_hidden_1),\n", "\n", "            # 2nd hidden layer\n", "            torch.nn.Linear(num_hidden_1, num_hidden_2, bias=False),\n", "            torch.nn.LeakyReLU(),\n", "            torch.nn.Dropout(0.2),\n", "            torch.nn.BatchNorm1d(num_hidden_2),\n", "\n", "            # Output layer\n", "            torch.nn.Linear(num_hidden_2, num_classes)\n", "        )\n", "\n", "    def forward(self, x):\n", "        logits = self.my_network(x)\n", "        return logits\n", "\n", "\n", "torch.manual_seed(random_seed)\n", "model = MLP(in_features=8, num_classes=NUM_CLASSES)\n", "model.to(DEVICE)\n", "\n", "optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## 3 -- Using cross entropy loss for model training" ] },
{ "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch: 001/050 | Batch 000/007 | Cost: 1.8506\n", "Epoch: 002/050 | Batch 000/007 | Cost: 1.2779\n", "Epoch: 003/050 | Batch 000/007 | Cost: 1.0849\n", "Epoch: 004/050 | Batch 000/007 | Cost: 1.0136\n", "Epoch: 005/050 | Batch 000/007 | Cost: 1.0655\n", "Epoch: 006/050 | Batch 000/007 | Cost: 0.9198\n", "Epoch: 007/050 | Batch 000/007 | Cost: 0.9269\n", "Epoch: 008/050 | Batch 000/007 | Cost: 0.8566\n", "Epoch: 009/050 | Batch 000/007 | Cost: 0.9192\n", "Epoch: 010/050 | Batch 000/007 | Cost: 0.8459\n", "Epoch: 011/050 | Batch 000/007 | Cost: 0.8595\n", "Epoch: 012/050 | Batch 000/007 | Cost: 0.8126\n", "Epoch: 013/050 | Batch 000/007 | Cost: 0.7344\n", "Epoch: 014/050 | Batch 000/007 | Cost: 0.7982\n", "Epoch: 015/050 | Batch 000/007 | Cost: 0.7587\n", "Epoch: 016/050 | Batch 000/007 | Cost: 0.7278\n", "Epoch: 017/050 | Batch 000/007 | Cost: 0.5626\n", "Epoch: 018/050 | Batch 000/007 | Cost: 0.6570\n", "Epoch: 019/050 | Batch 000/007 | Cost: 0.6695\n", "Epoch: 020/050 | Batch 000/007 | Cost: 0.8091\n", "Epoch: 021/050 | Batch 000/007 | Cost: 0.6433\n", "Epoch: 022/050 | Batch 000/007 | Cost: 0.5846\n", "Epoch: 023/050 | Batch 000/007 | Cost: 0.6255\n", "Epoch: 024/050 | Batch 000/007 | Cost: 0.6438\n", "Epoch: 025/050 | Batch 000/007 | Cost: 0.6645\n", "Epoch: 026/050 | Batch 000/007 | Cost: 0.6947\n", "Epoch: 027/050 | Batch 000/007 | Cost: 0.5889\n", "Epoch: 028/050 | Batch 000/007 | Cost: 0.6015\n", "Epoch: 029/050 | Batch 000/007 | Cost: 0.6087\n", "Epoch: 030/050 | Batch 000/007 | Cost: 0.5184\n", "Epoch: 031/050 | Batch 000/007 | Cost: 0.5749\n", "Epoch: 032/050 | Batch 000/007 | Cost: 0.5191\n", "Epoch: 033/050 | Batch 000/007 | Cost: 0.5260\n", "Epoch: 034/050 | Batch 000/007 | Cost: 0.6051\n", "Epoch: 035/050 | Batch 000/007 | Cost: 0.5267\n", "Epoch: 036/050 | Batch 000/007 | Cost: 0.5485\n", "Epoch: 037/050 | Batch 000/007 | Cost: 0.4345\n", "Epoch: 038/050 | Batch 000/007 | Cost: 0.5198\n", "Epoch: 039/050 | Batch 000/007 | Cost: 0.4047\n", "Epoch: 040/050 | Batch 000/007 | Cost: 0.5052\n", "Epoch: 041/050 | Batch 000/007 | Cost: 0.5436\n", "Epoch: 042/050 | Batch 000/007 | Cost: 0.4116\n", "Epoch: 043/050 | Batch 000/007 | Cost: 0.4640\n", "Epoch: 044/050 | Batch 000/007 | Cost: 0.5765\n", "Epoch: 045/050 | Batch 000/007 | Cost: 0.5034\n", "Epoch: 046/050 | Batch 000/007 | Cost: 0.5579\n", "Epoch: 047/050 | Batch 000/007 | Cost: 0.4336\n", "Epoch: 048/050 | Batch 000/007 | Cost: 0.5188\n", "Epoch: 049/050 | Batch 000/007 | Cost: 0.5183\n", "Epoch: 050/050 | Batch 000/007 | Cost: 0.5013\n" ] } ], "source": [ "for epoch in range(num_epochs):\n", "\n", "    model = model.train()\n", "    for batch_idx, (features, class_labels) in enumerate(train_loader):\n", "\n", "        class_labels = class_labels.to(DEVICE)\n", "        features = features.to(DEVICE)\n", "\n", "        logits = model(features)\n", "        loss = torch.nn.functional.cross_entropy(logits, class_labels)\n", "\n", "        optimizer.zero_grad()\n", "        loss.backward()\n", "        optimizer.step()\n", "\n", "        ### LOGGING\n", "        if not batch_idx % 200:\n", "            print('Epoch: %03d/%03d | Batch %03d/%03d | Cost: %.4f'\n", "                  % (epoch+1, num_epochs, batch_idx,\n", "                     len(train_loader), loss))" ] },
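{ "cell_type": "markdown", "metadata": {}, "source": [ "As a quick spot check after training (an added illustration, not part of the original notebook), we can pass a single standardized test example through the network and read off the predicted label as the argmax over the five logits:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Predict the ordinal label for one standardized test example\n", "model.eval()  # disable dropout and use the running batchnorm statistics\n", "with torch.no_grad():\n", "    example = torch.tensor(X_test_std[:1], dtype=torch.float32).to(DEVICE)\n", "    logits = model(example)\n", "    print('Logits:', logits)\n", "    print('Predicted label:', torch.argmax(logits, dim=1).item())" ] },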
0.5052\n", "Epoch: 041/050 | Batch 000/007 | Cost: 0.5436\n", "Epoch: 042/050 | Batch 000/007 | Cost: 0.4116\n", "Epoch: 043/050 | Batch 000/007 | Cost: 0.4640\n", "Epoch: 044/050 | Batch 000/007 | Cost: 0.5765\n", "Epoch: 045/050 | Batch 000/007 | Cost: 0.5034\n", "Epoch: 046/050 | Batch 000/007 | Cost: 0.5579\n", "Epoch: 047/050 | Batch 000/007 | Cost: 0.4336\n", "Epoch: 048/050 | Batch 000/007 | Cost: 0.5188\n", "Epoch: 049/050 | Batch 000/007 | Cost: 0.5183\n", "Epoch: 050/050 | Batch 000/007 | Cost: 0.5013\n" ] } ], "source": [ "for epoch in range(num_epochs):\n", " \n", " model = model.train()\n", " for batch_idx, (features, class_labels) in enumerate(train_loader):\n", "\n", " class_labels = class_labels.to(DEVICE)\n", " features = features.to(DEVICE)\n", " logits = model(features)\n", " \n", " logits = model(features)\n", " loss = torch.nn.functional.cross_entropy(logits, class_labels)\n", " \n", " optimizer.zero_grad()\n", " loss.backward()\n", " optimizer.step()\n", " \n", " ### LOGGING\n", " if not batch_idx % 200:\n", " print ('Epoch: %03d/%03d | Batch %03d/%03d | Cost: %.4f' \n", " %(epoch+1, num_epochs, batch_idx, \n", " len(train_loader), loss))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4 -- Evaluate model\n", "\n", "Finally, after model training, we can evaluate the performance of the model. For example, via the mean absolute error and mean squared error measures.\n", "\n" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "def beckham_logits_to_labels(logits, model, num_classes):\n", " predictions = beckham_logits_to_predictions(logits, model, num_classes)\n", " return torch.round(predictions).float()\n", " \n", "\n", "def compute_mae_and_mse(model, data_loader, device):\n", "\n", " with torch.no_grad():\n", " \n", " mae, mse, acc, num_examples = 0., 0., 0., 0\n", "\n", " for i, (features, targets) in enumerate(data_loader):\n", "\n", " features = features.to(device)\n", " targets = targets.float().to(device)\n", "\n", " logits = model(features)\n", " predicted_labels = torch.argmax(logits, dim=1)\n", "\n", " num_examples += targets.size(0)\n", " mae += torch.sum(torch.abs(predicted_labels - targets))\n", " mse += torch.sum((predicted_labels - targets)**2)\n", "\n", " mae = mae / num_examples\n", " mse = mse / num_examples\n", " return mae, mse" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "train_mae, train_mse = compute_mae_and_mse(model, train_loader, DEVICE)\n", "test_mae, test_mse = compute_mae_and_mse(model, test_loader, DEVICE)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Mean absolute error (train/test): 0.22 | 0.37\n", "Mean squared error (train/test): 0.27 | 0.41\n" ] } ], "source": [ "print(f'Mean absolute error (train/test): {train_mae:.2f} | {test_mae:.2f}')\n", "print(f'Mean squared error (train/test): {train_mse:.2f} | {test_mse:.2f}')" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.12" } }, "nbformat": 4, "nbformat_minor": 4 }