{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# The building blocks of DL 2\n",
    "\n",
    "> Simplest Neural Network - linear layer with activation function\n",
    "\n",
    "- toc: false \n",
    "- badges: true\n",
    "- comments: true\n",
    "- categories: [ML, medical]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the [previous post](/blog/ml/medical/2020/07/20/building-blocks-of-dl-sgd.html) I've explained what is the most important concept in neural networks - technique that allows us to incrementally find minimum of a function. This is called Gradient Descent algorithm!\n",
    "\n",
    "In this post I will build on this concept and show you how to create a basic linear model to predict what is on a medical image!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Download data\n",
    "\n",
    "As in the [first post](https://ml4med.github.io/blog/medicalnist/mnist/basic/2020/06/19/fist_post.html) showing how to build medical image recognition with pure statistics, we need to download data first. For some basic description of how the data looks like, see that [first post](https://ml4med.github.io/blog/medicalnist/mnist/basic/2020/06/19/fist_post.html) first ;)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Cloning into 'Medical-MNIST-Classification'...\n",
      "remote: Enumerating objects: 58532, done.\u001b[K\n",
      "remote: Total 58532 (delta 0), reused 0 (delta 0), pack-reused 58532\u001b[K\n",
      "Receiving objects: 100% (58532/58532), 77.86 MiB | 17.62 MiB/s, done.\n",
      "Resolving deltas: 100% (506/506), done.\n",
      "Checking connectivity... done.\n",
      "Checking out files: 100% (58959/58959), done.\n"
     ]
    }
   ],
   "source": [
    "! git clone https://github.com/apolanco3225/Medical-MNIST-Classification.git\n",
    "! rm -rf ./medical_mnist\n",
    "! mv Medical-MNIST-Classification/resized/ ./medical_mnist\n",
    "! rm -rf Medical-MNIST-Classification"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "\n",
    "PATH = Path(\"medical_mnist/\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We have much more powerful tools now, so we will deal with all 6 classes now, but first need to prepare data:\n",
    " - all data needs to be numerical\n",
    " - it needs to be in arrays\n",
    " - it needs to be labeled"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['AbdomenCT', 'BreastMRI', 'CXR', 'ChestCT', 'Hand', 'HeadCT']"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "classes = [cls.name for cls in PATH.iterdir()]\n",
    "classes"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Prepare data\n",
    "\n",
    "The plan is to "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "images = {}\n",
    "for cls in classes:\n",
    "    images[cls] = list((PATH/cls).iterdir())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "from PIL import Image"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "from torchvision.transforms import ToTensor\n",
    "\n",
    "image_tensors = {}\n",
    "\n",
    "for cls in classes:\n",
    "    image_tensors[cls] = torch.stack( # converts iterable of tensors into higher dimention single tensor\n",
    "        [\n",
    "            ToTensor()( # converts images to tensors\n",
    "                Image.open(path)\n",
    "            ).view(-1, 64 * 64).squeeze().float()/255 # reshape tensor from 64x64 to vector tensor of size 4096 and convert values\n",
    "            \n",
    "            for path in (PATH/cls).iterdir()]\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "so let's see what we got there"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "AbdomenCT has 10000 images of a size torch.Size([4096])\n",
      "BreastMRI has 8954 images of a size torch.Size([4096])\n",
      "CXR has 10000 images of a size torch.Size([4096])\n",
      "ChestCT has 10000 images of a size torch.Size([4096])\n",
      "Hand has 10000 images of a size torch.Size([4096])\n",
      "HeadCT has 10000 images of a size torch.Size([4096])\n"
     ]
    }
   ],
   "source": [
    "for cls in classes:\n",
    "    class_shape = image_tensors[cls].shape\n",
    "    print(f\"{cls} has {class_shape[0]} images of a size {class_shape[1:]}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "x_train = torch.cat([image_tensors[cls] for cls in classes], dim=0)\n",
    "y_train = torch.cat([torch.tensor([index] * image_tensors[cls].shape[0]) for index, cls in enumerate(classes)])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "shuffle the dataset. This is important as if we don't do this, images from the classes that we train first will be effectively not as \"fresh\" in the memmory of the network"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "permutations = torch.randperm(x_train.shape[0])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "x_train = x_train[permutations]\n",
    "y_train = y_train[permutations]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "create validation set that is 20% of the training set - this is important to asses the performance of the model. Validation set doesn't take part in training, so model is not biased towards those images (it cannot remember those exact images from training). This is essential to see how well model generalizes on examples it has not seen."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "11790"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "valid_pct = 0.2\n",
    "valid_index = int(x_train.shape[0] * valid_pct)\n",
    "valid_index"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "# we take out first 20% of examples from the training set\n",
    "x_valid = x_train[:valid_index]\n",
    "y_valid = y_train[:valid_index]\n",
    "\n",
    "x_train = x_train[valid_index:]\n",
    "y_train = y_train[valid_index:]"
   ]
  },
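  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The shuffle and split steps above can be sketched on a toy dataset. This is a minimal illustration rather than the notebook's data: `toy_x`, `toy_y` and `split_train_valid` are hypothetical names introduced only here. It shows that indexing both tensors with the same permutation keeps every example paired with its label.\n",
    "\n",
    "```python\n",
    "import torch\n",
    "\n",
    "def split_train_valid(x, y, valid_pct=0.2, seed=0):\n",
    "    # hypothetical helper: shuffle x and y with one shared permutation,\n",
    "    # then hold out the first valid_pct of the examples for validation\n",
    "    generator = torch.Generator().manual_seed(seed)\n",
    "    perm = torch.randperm(x.shape[0], generator=generator)\n",
    "    x, y = x[perm], y[perm]\n",
    "    n_valid = int(x.shape[0] * valid_pct)\n",
    "    return x[n_valid:], y[n_valid:], x[:n_valid], y[:n_valid]\n",
    "\n",
    "# toy data: 10 examples with 3 features each; the label equals the first feature of the row\n",
    "toy_x = torch.arange(30, dtype=torch.float32).view(10, 3)\n",
    "toy_y = toy_x[:, 0].long()\n",
    "\n",
    "x_tr, y_tr, x_va, y_va = split_train_valid(toy_x, toy_y)\n",
    "print(x_tr.shape, x_va.shape)\n",
    "# every row still carries its own label after shuffling\n",
    "print(torch.equal(x_tr[:, 0].long(), y_tr))\n",
    "```\n",
    "\n",
    "Fixing the seed of the generator is a design choice for this sketch only: it makes the split reproducible between runs."
   ]
  },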
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(torch.Size([47164, 4096]),\n",
       " torch.Size([47164]),\n",
       " torch.Size([11790, 4096]),\n",
       " torch.Size([11790]))"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x_train.shape, y_train.shape, x_valid.shape, y_valid.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we need to build the model. Let's use the most basic building blocks of the neural network - linear layer (`linear_layer` function) and nonlinearity (`softmax` function)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "# it normalizes all 10 classes so we can treat each class prediction as probability that add up to 1.0\n",
    "def softmax(x):\n",
    "    return x - x.exp().sum(-1).unsqueeze(-1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "def linear_layer(x):\n",
    "    return x @ weights + bias"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "the model is a simple function composition (results of linear layer and fed into nonlinear layer (`softmax`)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [],
   "source": [
    "def model(x):\n",
    "    return softmax(linear_layer(x))"
   ]
  },
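  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a sanity check on the nonlinearity, the sketch below assumes the intended function is a log-softmax (the logarithm of the softmax probabilities) and compares a hand-written version with the built-in `torch.log_softmax`. `log_softmax_sketch` and `scores` are hypothetical names used only for this illustration, and the scores are random numbers.\n",
    "\n",
    "```python\n",
    "import torch\n",
    "\n",
    "def log_softmax_sketch(x):\n",
    "    # log of softmax: x_i - log(sum_j exp(x_j)), computed over the last dimension\n",
    "    return x - x.exp().sum(-1).log().unsqueeze(-1)\n",
    "\n",
    "torch.manual_seed(0)\n",
    "scores = torch.randn(4, 6)  # 4 examples, 6 class scores each (made-up values)\n",
    "log_probs = log_softmax_sketch(scores)\n",
    "\n",
    "# exponentiating log-probabilities gives probabilities that sum to 1 per example\n",
    "print(log_probs.exp().sum(-1))\n",
    "# the hand-written version agrees with the built-in one up to rounding error\n",
    "print(torch.allclose(log_probs, torch.log_softmax(scores, dim=-1), atol=1e-6))\n",
    "```"
   ]
  },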
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For the Gradient Descent to work we also need to specify the loss function - this is crucial, as this is the function on which we compute gradients for our parameters. Just a quick recap: Gradient Descent algorithm finds out the values to change function parameters so the function values decreese.\n",
    "\n",
    "In your case we minimize `loss_func`. Parameters of this function (passed in a form of `preds`) are in the `model`: `wegiths` and `bias`. Gradient Descent will give us values to change each of those parameters so we minimize the `loss_func`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "def loss_func(preds, targets):\n",
    "    return -preds[range(targets.shape[0]), targets].mean()\n",
    "\n",
    "def accuracy(preds, targets):\n",
    "    return (torch.argmax(preds, dim=-1) == targets).float().mean()"
   ]
  },
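  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the indexing in `loss_func` concrete: for every example it picks the prediction for the correct class and averages the negated values. Assuming `preds` holds log-probabilities, this is the negative log-likelihood, which the sketch below compares with `torch.nn.functional.nll_loss` on made-up numbers. The names `nll_sketch`, `log_probs` and `targets` are chosen for this example only.\n",
    "\n",
    "```python\n",
    "import torch\n",
    "import torch.nn.functional as F\n",
    "\n",
    "def nll_sketch(log_probs, targets):\n",
    "    # pick log_probs[i, targets[i]] for every row i, then average the negated values\n",
    "    return -log_probs[range(targets.shape[0]), targets].mean()\n",
    "\n",
    "torch.manual_seed(0)\n",
    "log_probs = torch.log_softmax(torch.randn(5, 6), dim=-1)  # 5 examples, 6 classes\n",
    "targets = torch.tensor([0, 3, 5, 1, 2])                   # made-up class indices\n",
    "\n",
    "print(nll_sketch(log_probs, targets))\n",
    "print(F.nll_loss(log_probs, targets))  # built-in counterpart, expected to match\n",
    "```"
   ]
  },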
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And here is the Gradient Descent loop - see comments in the code for details"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 0 accuracy: 0.7695504426956177%, loss: 2.5434305667877197\n",
      "Epoch 1 accuracy: 0.787277340888977%, loss: 2.3623108863830566\n",
      "Epoch 2 accuracy: 0.7912637591362%, loss: 2.232290267944336\n",
      "Epoch 3 accuracy: 0.8318066000938416%, loss: 2.134671926498413\n",
      "Epoch 4 accuracy: 0.9314673542976379%, loss: 2.058687448501587\n",
      "Epoch 5 accuracy: 0.9561492800712585%, loss: 1.9977604150772095\n",
      "Epoch 6 accuracy: 0.9601356983184814%, loss: 1.9477065801620483\n",
      "Epoch 7 accuracy: 0.9603053331375122%, loss: 1.9057585000991821\n",
      "Epoch 8 accuracy: 0.9602205157279968%, loss: 1.8700224161148071\n",
      "Epoch 9 accuracy: 0.9598812460899353%, loss: 1.8391577005386353\n",
      "Epoch 10 accuracy: 0.9594571590423584%, loss: 1.8121910095214844\n",
      "Epoch 11 accuracy: 0.9597116112709045%, loss: 1.788395881652832\n",
      "Epoch 12 accuracy: 0.9596267938613892%, loss: 1.7672215700149536\n",
      "Epoch 13 accuracy: 0.9598812460899353%, loss: 1.7482411861419678\n",
      "Epoch 14 accuracy: 0.9598812460899353%, loss: 1.7311174869537354\n",
      "CPU times: user 1min 5s, sys: 173 ms, total: 1min 5s\n",
      "Wall time: 9.22 s\n"
     ]
    }
   ],
   "source": [
    "%%time\n",
    "\n",
    "# number of training examples\n",
    "n  = x_train.shape[0] \n",
    "# batch size - this is necessary as we won't be able to fit all\n",
    "# the examples into the memmory, so we need to do the computations in batches\n",
    "bs = 64 \n",
    "# how many epochs to train for\n",
    "epochs = 15  \n",
    "weights = torch.zeros((64 * 64, 10), requires_grad=True) # define weights matrix\n",
    "bias = torch.zeros(10, requires_grad=True) # and bias term\n",
    "\n",
    "# in each of those epochs algorithm sees all the images. So in this case\n",
    "# we see all the images 15 times\n",
    "for epoch in range(epochs): \n",
    "    # here is the loop for batches: in each batch we:\n",
    "    #  - see 64 images\n",
    "    #  - compute predictions based on the model\n",
    "    #  - compute the loss\n",
    "    #  - compute gradients and update parameters (wegiths and bias)\n",
    "    for i in range((n - 1) // bs + 1):\n",
    "        # select images for this batch\n",
    "        start_i = i * bs\n",
    "        end_i = start_i + bs\n",
    "        xb = x_train[start_i:end_i]\n",
    "        yb = y_train[start_i:end_i]\n",
    "        \n",
    "        # compute predictions\n",
    "        preds = model(xb)\n",
    "        \n",
    "        # compute loss\n",
    "        loss = loss_func(preds, yb)\n",
    "\n",
    "        # compute gradients (this is done for us by PyTorch with this backwards function!)\n",
    "        loss.backward()\n",
    "        \n",
    "        # this block is necessary, so computations we do below, are not taken into account when\n",
    "        # computing next gradients\n",
    "        with torch.no_grad():\n",
    "            # update parameters\n",
    "            weights -= weights.grad\n",
    "            bias -= bias.grad\n",
    "            # zero out the gradients so they are ready for the next batch (otherwise they accumulate values)\n",
    "            weights.grad.zero_()\n",
    "            bias.grad.zero_()\n",
    "            \n",
    "    # eventually after each epoch (seeing all the images) we print out how we did\n",
    "    print(f\"Epoch {epoch} accuracy: {accuracy(model(x_valid),y_valid)}%, loss: {loss_func(model(x_valid), y_valid)}\")\n"
   ]
  },
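  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The same loop can be tried on synthetic data to watch the loss fall. This is a sketch under stated assumptions, not the run above: the data is random and linearly separable by construction, the learning rate `lr` is made explicit (the loop above effectively uses 1.0), the built-in `torch.log_softmax` stands in for the nonlinearity, and all names (`toy_x`, `toy_y`, `w`, `b`, `toy_model`, `nll`) are hypothetical.\n",
    "\n",
    "```python\n",
    "import torch\n",
    "\n",
    "torch.manual_seed(0)\n",
    "n_classes, n_features = 3, 8\n",
    "# synthetic data: labels come from a hidden linear rule, so a linear model can fit them\n",
    "toy_x = torch.randn(600, n_features)\n",
    "toy_y = (toy_x @ torch.randn(n_features, n_classes)).argmax(-1)\n",
    "\n",
    "w = torch.zeros(n_features, n_classes, requires_grad=True)\n",
    "b = torch.zeros(n_classes, requires_grad=True)\n",
    "lr, bs = 0.5, 64\n",
    "\n",
    "def toy_model(x):\n",
    "    return torch.log_softmax(x @ w + b, dim=-1)\n",
    "\n",
    "def nll(log_probs, targets):\n",
    "    return -log_probs[range(targets.shape[0]), targets].mean()\n",
    "\n",
    "with torch.no_grad():\n",
    "    initial_loss = nll(toy_model(toy_x), toy_y).item()\n",
    "\n",
    "for epoch in range(20):\n",
    "    for start in range(0, toy_x.shape[0], bs):\n",
    "        xb, yb = toy_x[start:start + bs], toy_y[start:start + bs]\n",
    "        loss = nll(toy_model(xb), yb)\n",
    "        loss.backward()\n",
    "        with torch.no_grad():\n",
    "            # gradient descent step scaled by the learning rate\n",
    "            w -= lr * w.grad\n",
    "            b -= lr * b.grad\n",
    "            w.grad.zero_()\n",
    "            b.grad.zero_()\n",
    "\n",
    "with torch.no_grad():\n",
    "    final_loss = nll(toy_model(toy_x), toy_y).item()\n",
    "    final_acc = (toy_model(toy_x).argmax(-1) == toy_y).float().mean().item()\n",
    "print(f'loss {initial_loss:.3f} -> {final_loss:.3f}, accuracy {final_acc:.3f}')\n",
    "```\n",
    "\n",
    "A smaller `lr` makes each step more cautious at the cost of needing more epochs; the value 0.5 is an arbitrary choice for this toy problem."
   ]
  },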
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### What did we achieve?\n",
    "\n",
    "With this simplest neural network we got to almost 96% accuracy - this is pretty good."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}