{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "<!--NAVIGATION-->\n", "# < [CNN & LSTM](5-CNN-LSTM.ipynb) | Transfer Learning |" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Transfer Learning" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### What is transfer learning?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Transfer Learning is the re-use of pretrained models on new tasks. Most often, the two tasks are different but somehow related to each other. For example, a model which was trained on image classification might have learnt image features which can also be harnessed for other image related tasks. This technique became increasingly popular in the field of Deep Learning since it enables one to train a model on comparatively little data." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The dataset can be downloaded from [Kaggle](https://www.kaggle.com/pmigdal/alien-vs-predator-images)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Freeze the intermediate layers and only train a few layers close to the output." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "Figures taken from https://www.kaggle.com/pmigdal/alien-vs-predator-images" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "_____" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Only for Google Colaboratory users" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# execute only if you're using Google Colab\n", "!wget -q https://raw.githubusercontent.com/ahug/amld-pytorch-workshop/master/binder/requirements.txt -O requirements.txt\n", "!pip install -qr requirements.txt\n", "\n", "!mkdir -p data\n", "!curl -L -o alien-vs-predator.zip \"https://drive.google.com/uc?id=1IGiEW3Vtf-ZiLINHCGVDM0NRSkyiYT98&export=download\"\n", "!unzip -oq alien-vs-predator.zip -d data/\n", "!rm alien-vs-predator.zip\n", "!ls -l data/alien-vs-predator/\n", "\n", "# for PIL.Image\n", "!pip install --no-cache-dir -I pillow" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Imports" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "\n", "import torch\n", "import torch.nn as nn\n", "import torch.optim as optim\n", "from torch.optim import lr_scheduler\n", "\n", "import torchvision\n", "from torchvision import datasets, models, transforms\n", "\n", "import numpy as np\n", "import matplotlib\n", "import matplotlib.pyplot as plt\n", "import colorama\n", "\n", "matplotlib.rc('font', size=16)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create dataloaders which loads images from a local folder with the following structure" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```\n", "data/alien-vs-predator\n", "│\n", "└───train\n", "│ │\n", "│ │───alien\n", "│ │ │ 20.jpg\n", "│ │ │ 104.jpg\n", "│ │ └ ...\n", "│ │\n", "│ └───predator\n", "│ │ 1.jpg\n", "│ │ 78.jpg\n", "│ └ ...\n", "│ \n", "└───validation\n", " │\n", " │───alien\n", " │ │ 233.jpg\n", " │ │ 12.jpg\n", " │ └ ...\n", " │\n", " └───predator\n", " │ 22.jpg\n", " │ 77.jpg\n", " └ ...\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the training dataloader, we can very easily add preprocessing steps to augment the data (scaling, flipping, etc.)" ] }, { "cell_type": 
"code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data_dir = os.path.join(os.getcwd(), \"data\", \"alien-vs-predator\")\n", "\n", "train_data = datasets.ImageFolder(os.path.join(data_dir, \"train\"),\n", " transform = transforms.Compose([\n", " transforms.RandomResizedCrop(224), # randomly crops and scales it\n", " transforms.RandomHorizontalFlip(),\n", " transforms.ToTensor()\n", " ]))\n", "\n", "test_data = datasets.ImageFolder(os.path.join(data_dir, \"validation\"),\n", " transform = transforms.Compose([\n", " transforms.Resize(256),\n", " transforms.CenterCrop(224),\n", " transforms.ToTensor()\n", " ]))\n", "\n", "train_loader = torch.utils.data.DataLoader(train_data, batch_size=32, shuffle=True)\n", "test_loader = torch.utils.data.DataLoader(test_data, batch_size=64, shuffle=True)\n", "class_names = train_data.classes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Transformers (Data augmentation)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Transformers can be used and stacked on top of each other in a similar fashion as modules. For training, they provide easy-to-use functionalities for [data-augmentation](https://en.wikipedia.org/wiki/Convolutional_neural_network#Artificial_data)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "resize_transformer = transforms.Resize(400)\n", "horizontal_flip_transformer = transforms.RandomHorizontalFlip()\n", "random_resize_crop_transformer = transforms.RandomResizedCrop(250, scale=(0.5, 1))\n", "tensor_transformer = transforms.ToTensor()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "preview_data = datasets.ImageFolder(os.path.join(data_dir, \"train\"))\n", "img, label = next(iter(preview_data))\n", "\n", "fig = plt.figure(figsize=(16,9))\n", "plt.subplot(1, 4, 1)\n", "plt.xlabel('Original')\n", "plt.imshow(tensor_transformer(img).permute(1, 2, 0))\n", "plt.subplot(1, 4, 2)\n", "plt.xlabel('Resized (400x400)')\n", "plt.imshow(tensor_transformer(resize_transformer(img)).permute(1, 2, 0))\n", "plt.subplot(1, 4, 3)\n", "plt.xlabel('Random Horizontal Flip')\n", "plt.imshow(tensor_transformer(horizontal_flip_transformer(img)).permute(1, 2, 0))\n", "plt.subplot(1, 4, 4)\n", "plt.xlabel('Random resizing + croping')\n", "plt.imshow(tensor_transformer(random_resize_crop_transformer(img)).permute(1, 2, 0))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visualize some training samples" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data, labels = next(iter(test_loader))\n", "data, labels = data[:5], labels[:5]\n", "\n", "fig = plt.figure(figsize=(16,9))\n", "for i in range(0, 5):\n", " fig.add_subplot(1, 5, i+1)\n", " plt.imshow(data[i].permute(1, 2, 0))\n", " plt.xlabel(class_names[labels[i]])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "________" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# How can we load a pretrained model?" 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### List of available pretrained models" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from torchvision import models\n", "print(dir(models))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Let us use the ResNet-18 architecture:\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Load pretrained model (previously downloaded)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model_ft = models.resnet18(pretrained=True)\n", "model_ft" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Let's have a closer look at the ResNet-18" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Last fully connected layer has a 1000 output neurons (has been trained on ImageNet which consists of 1000 categories)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model_ft.fc" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We would like to perform binary classification (alien/predator). Therefore, we have to replace the last fully-connected layer to suit our needs (two output units)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model_ft.fc = nn.Linear(in_features=512, out_features=2)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "model_ft.fc" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Setup the training function" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So, now the architecture contains two output units, we can therefore use it to perform binary classification." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The *train_cnn* and _accuracy_ function are almost identical to the functions we used when traininig the CNN. This again nicely demonstrates the modularity of PyTorch and its simple interface." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def train(model, train_loader, test_loader, device, num_epochs=3, lr=0.1, use_scheduler=False):\n", " model.train() # not necessary in our example, but still good practice since modules\n", " # like nn.Dropout, nn.BatchNorm require it\n", " \n", " # define an optimizer\n", " optimizer = torch.optim.Adam(model.parameters(), lr=lr)\n", " criterion = torch.nn.CrossEntropyLoss()\n", " \n", " if use_scheduler:\n", " scheduler = torch.optim.lr_scheduler.StepLR(optimizer, 1, 0.85)\n", " \n", " for epoch in range(num_epochs):\n", " print(\"=\"*40, \"Starting epoch %d\" % (epoch + 1), \"=\"*40)\n", " \n", " if use_scheduler:\n", " scheduler.step()\n", " \n", " cum_loss = 0\n", " # dataloader returns batches of images for 'data' and a tensor with their respective labels in 'labels'\n", " for batch_idx, (data, labels) in enumerate(train_loader):\n", " data, labels = data.to(device), labels.to(device)\n", "\n", " optimizer.zero_grad()\n", " \n", " output = model(data)\n", " loss = criterion(output, labels)\n", " loss.backward()\n", " optimizer.step()\n", " \n", " cum_loss += loss.item()\n", " \n", " if batch_idx % 5 == 0:\n", " print(\"Batch %d/%d\" % (batch_idx, len(train_loader)))\n", "\n", " train_acc = accuracy(model, train_loader, device)\n", " test_acc = accuracy(model, test_loader, device)\n", " print(colorama.Fore.GREEN + \"\\nEpoch %d/%d, Loss=%.4f, Train-Acc=%d%%, Valid-Acc=%d%%\" \n", " % (epoch+1, num_epochs, cum_loss/len(train_data), 100*train_acc, 100*test_acc), colorama.Fore.RESET)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def accuracy(model, dataloader, device):\n", " \"\"\" Computes the model's accuracy on the data provided by 'dataloader'\n", " \"\"\"\n", " model.eval()\n", " \n", " num_correct = 0\n", " num_samples = 0\n", " with torch.no_grad(): # deactivates autograd, reduces memory usage and speeds up computations\n", " for data, labels in dataloader:\n", " data, labels = data.to(device), labels.to(device)\n", "\n", " predictions = model(data).max(1)[1] # indices of the maxima along the second dimension\n", " num_correct += (predictions == labels).sum().item()\n", " num_samples += predictions.shape[0]\n", " \n", " return num_correct / num_samples" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Let's now freeze all the layers except the last fully-connected one" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for name, param in model_ft.named_parameters():\n", " if name not in [\"fc.weight\", \"fc.bias\"]:\n", " param.requires_grad = False" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')\n", "model_ft = model_ft.to(device)\n", "\n", "train(model_ft, train_loader, test_loader, device, num_epochs=2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Let's look at some of the model's predictions" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def visualize_predictions(model, dataloader, device):\n", " data, labels = next(iter(dataloader))\n", " data, labels = data[:10].to(device), labels[:10]\n", " predictions = model(data).max(1)[1].cpu()\n", " \n", " predictions, data = predictions.cpu(), data.cpu() # put it back on CPU for visualization\n", " \n", " plt.figure(figsize=(16,9))\n", " for 
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')\n", "model_ft = model_ft.to(device)\n", "\n", "train(model_ft, train_loader, test_loader, device, num_epochs=2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Let's look at some of the model's predictions" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def visualize_predictions(model, dataloader, device):\n", " model.eval()\n", " data, labels = next(iter(dataloader))\n", " data, labels = data[:5].to(device), labels[:5]\n", " \n", " with torch.no_grad(): # no gradients needed for visualization\n", " predictions = model(data).max(1)[1]\n", " \n", " predictions, data = predictions.cpu(), data.cpu() # put it back on CPU for visualization\n", " \n", " plt.figure(figsize=(16,9))\n", " for i in range(5):\n", " plt.subplot(1, 5, i+1)\n", " plt.imshow(data[i].permute(1, 2, 0))\n", " # label shows the prediction with the true class in parentheses\n", " plt.xlabel(\"%s\\n (%s)\" % (test_data.classes[predictions[i].item()], test_data.classes[labels[i]]), fontsize=18)\n", " plt.xticks([])\n", " plt.yticks([])\n", " \n", "visualize_predictions(model_ft, test_loader, device)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "___" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Don't forget to download the notebook, otherwise your changes may be lost!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<!--NAVIGATION-->\n", "# < [CNN & LSTM](5-CNN-LSTM.ipynb) | Transfer Learning |" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.8" } }, "nbformat": 4, "nbformat_minor": 2 }