{ "cells": [ { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "UEBilEjLj5wY" }, "source": [ "Deep Learning Models -- A collection of various deep learning architectures, models, and tips for TensorFlow and PyTorch in Jupyter Notebooks.\n", "- Author: Sebastian Raschka\n", "- GitHub Repository: https://github.com/rasbt/deeplearning-models" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "colab": { "autoexec": { "startup": false, "wait_interval": 0 }, "base_uri": "https://localhost:8080/", "height": 119 }, "colab_type": "code", "executionInfo": { "elapsed": 536, "status": "ok", "timestamp": 1524974472601, "user": { "displayName": "Sebastian Raschka", "photoUrl": "//lh6.googleusercontent.com/-cxK6yOSQ6uE/AAAAAAAAAAI/AAAAAAAAIfw/P9ar_CHsKOQ/s50-c-k-no/photo.jpg", "userId": "118404394130788869227" }, "user_tz": 240 }, "id": "GOzuY8Yvj5wb", "outputId": "c19362ce-f87a-4cc2-84cc-8d7b4b9e6007" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Sebastian Raschka \n", "\n", "CPython 3.6.8\n", "IPython 7.2.0\n", "\n", "torch 1.0.1.post2\n" ] } ], "source": [ "%load_ext watermark\n", "%watermark -a 'Sebastian Raschka' -v -p torch" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "rH4XmErYj5wm" }, "source": [ "# Model Zoo -- CNN Gender Classifier (VGG16 Architecture, CelebA) with Data Parallelism" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are multiple ways of leveraging multiple GPUs when using PyTorch. One of these approaches is to send a copy of the model to each available GPU and split each minibatch across them using `DataParallel`.\n", "\n", "To break it down into conceptual steps, this is what `DataParallel` does:\n", "\n", "1. each GPU performs a forward pass on a chunk of the minibatch (on a copy of the model) to obtain the predictions;\n", "2. 
the first/default GPU gathers these predictions from all GPUs to compute the loss of each minibatch chunk with respect to the true labels (this is done on the first/default GPU because we typically define the loss, such as `torch.nn.CrossEntropyLoss`, outside the model);\n", "3. each GPU then performs backpropagation to compute the gradient of the loss on its sub-batch with respect to the neural network weights;\n", "4. the first GPU sums up the gradients obtained from each GPU (computer engineers usually refer to this step as \"reduce\");\n", "5. the first GPU updates the weights in the neural network via gradient descent and sends copies to the individual GPUs for the next round.\n", "\n", "While the list above may look a bit complicated at first, the `DataParallel` class automatically takes care of it all, and it is very easy to use in practice.\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Data Parallelism vs Regular Backpropagation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that using `DataParallel` will result in slightly different models compared to regular backpropagation. The reason is that with data parallelism on 4 GPUs, we combine the gradients from 4 individual forward and backward passes before updating the model once. In regular backprop, we would update the model after each minibatch. 
The following figure illustrates regular backpropagation, showing 2 iterations:\n", "\n", "![](../images/dataparallel/minibatch-update.png)\n", "\n", "The next figure shows one model update iteration with `DataParallel`, assuming 2 GPUs:\n", "\n", "![](../images/dataparallel/minibatch-update-dataparallel.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Implementation Details" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To use `DataParallel`, in the \"Model\" section (i.e., the corresponding code cell) we keep the call\n", "\n", "```python\n", "model.to(device)\n", "```\n", "\n", "but precede it with\n", "\n", "```python\n", "model = VGG16(num_features=num_features, num_classes=num_classes)\n", "if torch.cuda.device_count() > 1:\n", "    print(\"Using\", torch.cuda.device_count(), \"GPUs\")\n", "    model = nn.DataParallel(model)\n", "```\n", "\n", "so that the `DataParallel` class can take care of the rest. Note that in order for this to work, the data currently needs to be on the first CUDA device, \"cuda:0\". Otherwise, we will get a `RuntimeError: all tensors must be on devices[0]`. For this reason, we define `device` below and use it to move the input data to that GPU during training. 
In other words, make sure you set\n", "\n", "```python\n", "device = torch.device(\"cuda:0\")\n", "```\n", "\n", "and not\n", "\n", "```python\n", "device = torch.device(\"cuda:1\")\n", "```\n", "\n", "(or any other CUDA device number), so that in the training loop, we can use\n", "\n", "```python\n", "for i, (features, targets) in enumerate(data_loader):\n", "\n", "    features = features.to(device)\n", "    targets = targets.to(device)\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you look at the implementation part\n", "\n", "\n", "```python\n", "\n", "#### DATA PARALLEL START ####\n", "\n", "model = VGG16(num_features=num_features, num_classes=num_classes)\n", "if torch.cuda.device_count() > 1:\n", "    print(\"Using\", torch.cuda.device_count(), \"GPUs\")\n", "    model = nn.DataParallel(model)\n", "\n", "#### DATA PARALLEL END ####\n", "\n", "model.to(device)\n", "\n", "cost_fn = torch.nn.CrossEntropyLoss()\n", "optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)\n", "```\n", "\n", "you will notice that the `CrossEntropyLoss` (we could also use the one implemented in `torch.nn.functional`) is not part of the model. Hence, the loss will be computed on the device where the target labels are, which is the default device (usually the first GPU). This is the reason why the outputs are gathered on the first/default GPU. I sketched a more detailed outline of the whole process below:\n", "\n", "![](../images/dataparallel/dataparallel.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Speed Comparison" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- When using the same batch size as in the 1-GPU version of this code with four GPUs, each minibatch of 64 examples gets split into four chunks of 16 examples that are distributed across the different GPUs. 
In that setting, I noticed that the computation time with 4 GPUs is approximately half of that with 1 GPU (using GeForce 1080Ti cards).\n", "\n", "- When I multiplied the batch size by 4 in the `DataParallel` version, so that each GPU gets a minibatch of size 64, I noticed that the model trains approximately 3x faster on 4 GPUs compared to the single-GPU version." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Network Architecture" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The network in this notebook is an implementation of the VGG-16 [1] architecture, trained on the CelebA face dataset [2] as a gender classifier.\n", "\n", "\n", "References\n", "\n", "- [1] Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.\n", "- [2] Zhang, K., Tan, L., Li, Z., & Qiao, Y. (2016). Gender and smile classification using deep convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (pp. 34-38).\n", "\n", "\n", "The following table (taken from Simonyan & Zisserman referenced above) summarizes the VGG16 architecture:\n", "\n", "![](../images/vgg16/vgg16-arch-table.png)\n", "\n", "**Note that the CelebA images are 218 x 178, not 256 x 256. 
We center-crop them to 178 x 178 and then resize them to 128 x 128.**" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "MkoGLH_Tj5wn" }, "source": [ "## Imports" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "colab": { "autoexec": { "startup": false, "wait_interval": 0 } }, "colab_type": "code", "id": "ORj09gnrj5wp" }, "outputs": [], "source": [ "import os\n", "import time\n", "\n", "import numpy as np\n", "import pandas as pd\n", "\n", "import torch\n", "import torch.nn as nn\n", "import torch.nn.functional as F\n", "\n", "from torch.utils.data import Dataset\n", "from torch.utils.data import DataLoader\n", "\n", "from torchvision import datasets\n", "from torchvision import transforms\n", "\n", "import matplotlib.pyplot as plt\n", "from PIL import Image\n", "\n", "\n", "if torch.cuda.is_available():\n", "    torch.backends.cudnn.deterministic = True" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Dataset" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Downloading the Dataset" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the CelebA dataset, with its ~200,000 face images, is relatively large (~1.3 GB). The files listed below are provided by the dataset authors on the official CelebA website at http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "1) Download and unzip the file `img_align_celeba.zip`, which contains the images in JPEG format.\n", "\n", "2) Download the `list_attr_celeba.txt` file, which contains the class labels.\n", "\n", "3) Download the `list_eval_partition.txt` file, which contains the training/validation/test partitioning info." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Preparing the Dataset" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n", "  <thead>\n", "    <tr><th></th><th>Male</th></tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr><th>000001.jpg</th><td>0</td></tr>\n", "    <tr><th>000002.jpg</th><td>0</td></tr>\n", "    <tr><th>000003.jpg</th><td>1</td></tr>\n", "    <tr><th>000004.jpg</th><td>0</td></tr>\n", "    <tr><th>000005.jpg</th><td>0</td></tr>\n", "  </tbody>\n", "</table>
" ], "text/plain": [ " Male\n", "000001.jpg 0\n", "000002.jpg 0\n", "000003.jpg 1\n", "000004.jpg 0\n", "000005.jpg 0" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df1 = pd.read_csv('list_attr_celeba.txt', sep=\"\\s+\", skiprows=1, usecols=['Male'])\n", "\n", "# Make 0 (female) & 1 (male) labels instead of -1 & 1\n", "df1.loc[df1['Male'] == -1, 'Male'] = 0\n", "\n", "df1.head()" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n", "  <thead>\n", "    <tr><th></th><th>Partition</th></tr>\n", "    <tr><th>Filename</th><th></th></tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr><th>000001.jpg</th><td>0</td></tr>\n", "    <tr><th>000002.jpg</th><td>0</td></tr>\n", "    <tr><th>000003.jpg</th><td>0</td></tr>\n", "    <tr><th>000004.jpg</th><td>0</td></tr>\n", "    <tr><th>000005.jpg</th><td>0</td></tr>\n", "  </tbody>\n", "</table>
" ], "text/plain": [ " Partition\n", "Filename \n", "000001.jpg 0\n", "000002.jpg 0\n", "000003.jpg 0\n", "000004.jpg 0\n", "000005.jpg 0" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df2 = pd.read_csv('list_eval_partition.txt', sep=\"\\s+\", skiprows=0, header=None)\n", "df2.columns = ['Filename', 'Partition']\n", "df2 = df2.set_index('Filename')\n", "\n", "df2.head()" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n", "  <thead>\n", "    <tr><th></th><th>Male</th><th>Partition</th></tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr><th>000001.jpg</th><td>0</td><td>0</td></tr>\n", "    <tr><th>000002.jpg</th><td>0</td><td>0</td></tr>\n", "    <tr><th>000003.jpg</th><td>1</td><td>0</td></tr>\n", "    <tr><th>000004.jpg</th><td>0</td><td>0</td></tr>\n", "    <tr><th>000005.jpg</th><td>0</td><td>0</td></tr>\n", "  </tbody>\n", "</table>
" ], "text/plain": [ " Male Partition\n", "000001.jpg 0 0\n", "000002.jpg 0 0\n", "000003.jpg 1 0\n", "000004.jpg 0 0\n", "000005.jpg 0 0" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df3 = df1.merge(df2, left_index=True, right_index=True)\n", "df3.head()" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n", "  <thead>\n", "    <tr><th></th><th>Male</th><th>Partition</th></tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr><th>000001.jpg</th><td>0</td><td>0</td></tr>\n", "    <tr><th>000002.jpg</th><td>0</td><td>0</td></tr>\n", "    <tr><th>000003.jpg</th><td>1</td><td>0</td></tr>\n", "    <tr><th>000004.jpg</th><td>0</td><td>0</td></tr>\n", "    <tr><th>000005.jpg</th><td>0</td><td>0</td></tr>\n", "  </tbody>\n", "</table>
" ], "text/plain": [ " Male Partition\n", "000001.jpg 0 0\n", "000002.jpg 0 0\n", "000003.jpg 1 0\n", "000004.jpg 0 0\n", "000005.jpg 0 0" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df3.to_csv('celeba-gender-partitions.csv')\n", "df4 = pd.read_csv('celeba-gender-partitions.csv', index_col=0)\n", "df4.head()" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "df4.loc[df4['Partition'] == 0].to_csv('celeba-gender-train.csv')\n", "df4.loc[df4['Partition'] == 1].to_csv('celeba-gender-valid.csv')\n", "df4.loc[df4['Partition'] == 2].to_csv('celeba-gender-test.csv')" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(218, 178, 3)\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "img = Image.open('img_align_celeba/000001.jpg')\n", "print(np.asarray(img, dtype=np.uint8).shape)\n", "plt.imshow(img);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Implementing a Custom DataLoader Class" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "class CelebaDataset(Dataset):\n", " \"\"\"Custom Dataset for loading CelebA face images\"\"\"\n", "\n", " def __init__(self, csv_path, img_dir, transform=None):\n", " \n", " df = pd.read_csv(csv_path, index_col=0)\n", " self.img_dir = img_dir\n", " self.csv_path = csv_path\n", " self.img_names = df.index.values\n", " self.y = df['Male'].values\n", " self.transform = transform\n", "\n", " def __getitem__(self, index):\n", " img = Image.open(os.path.join(self.img_dir,\n", " self.img_names[index]))\n", " \n", " if self.transform is not None:\n", " img = self.transform(img)\n", " \n", " label = self.y[index]\n", " return img, label\n", "\n", " def __len__(self):\n", " return self.y.shape[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Running the VGG16 on this dataset with a minibatch size of 64 uses approximately 6.6 Gb of GPU memory. However, since we will split the batch size over for GPUs now, along with the model, we can actually comfortably use 64*4 as the batch size." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "# Note that transforms.ToTensor()\n", "# already divides pixels by 255. 
internally\n", "\n", "custom_transform = transforms.Compose([transforms.CenterCrop((178, 178)),\n", "                                       transforms.Resize((128, 128)),\n", "                                       #transforms.Grayscale(),\n", "                                       #transforms.Lambda(lambda x: x/255.),\n", "                                       transforms.ToTensor()])\n", "\n", "train_dataset = CelebaDataset(csv_path='celeba-gender-train.csv',\n", "                              img_dir='img_align_celeba/',\n", "                              transform=custom_transform)\n", "\n", "valid_dataset = CelebaDataset(csv_path='celeba-gender-valid.csv',\n", "                              img_dir='img_align_celeba/',\n", "                              transform=custom_transform)\n", "\n", "test_dataset = CelebaDataset(csv_path='celeba-gender-test.csv',\n", "                             img_dir='img_align_celeba/',\n", "                             transform=custom_transform)\n", "\n", "# 64 examples per GPU; fall back to 64 if no GPU is visible\n", "BATCH_SIZE = 64 * max(1, torch.cuda.device_count())\n", "\n", "\n", "train_loader = DataLoader(dataset=train_dataset,\n", "                          batch_size=BATCH_SIZE,\n", "                          shuffle=True,\n", "                          num_workers=4)\n", "\n", "valid_loader = DataLoader(dataset=valid_dataset,\n", "                          batch_size=BATCH_SIZE,\n", "                          shuffle=False,\n", "                          num_workers=4)\n", "\n", "test_loader = DataLoader(dataset=test_dataset,\n", "                         batch_size=BATCH_SIZE,\n", "                         shuffle=False,\n", "                         num_workers=4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that for `DataParallel` to work, the data currently needs to be on the first CUDA device, \"cuda:0\". Otherwise, we will get a `RuntimeError: all tensors must be on devices[0]`. Hence, we define `device` below and use it to move the input data to that GPU during training."
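, "\n", "\n", "As a minimal sketch of the batch-size arithmetic used above (64 examples per GPU), the following pure-Python helpers compute the total minibatch size and the chunk each GPU would receive. The names `effective_batch_size` and `chunk_sizes` are hypothetical and not part of PyTorch, and the ceil-division split is an assumption modeled on `torch.chunk`, which `DataParallel` is assumed to follow when it scatters a minibatch:

```python
import math


def effective_batch_size(per_device_batch, num_devices):
    '''Total minibatch size so that each device sees per_device_batch examples.

    Hypothetical helper: falls back to a single device when num_devices is 0,
    for example on a CPU-only machine.
    '''
    return per_device_batch * max(1, num_devices)


def chunk_sizes(batch_size, num_devices):
    '''Sizes of the chunks a minibatch is split into across devices.

    Assumption: ceil-division chunking as in torch.chunk, so every chunk has
    ceil(batch_size / num_devices) examples except possibly the last one.
    '''
    step = math.ceil(batch_size / num_devices)
    sizes = []
    remaining = batch_size
    while remaining > 0:
        sizes.append(min(step, remaining))
        remaining -= step
    return sizes


if __name__ == '__main__':
    total = effective_batch_size(64, 4)
    print(total, chunk_sizes(total, 4))  # 256 [64, 64, 64, 64]
    print(chunk_sizes(64, 4))            # [16, 16, 16, 16]
```

With 4 GPUs this gives a total minibatch of 256 split into four chunks of 64, whereas keeping the single-GPU batch size of 64 gives four chunks of 16."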
] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch: 1 | Batch index: 0 | Batch size: 256\n", "Epoch: 2 | Batch index: 0 | Batch size: 256\n" ] } ], "source": [ "device = torch.device(\"cuda:0\")\n", "torch.manual_seed(0)\n", "\n", "num_epochs = 2\n", "for epoch in range(num_epochs):\n", "\n", " for batch_idx, (x, y) in enumerate(train_loader):\n", " \n", " print('Epoch:', epoch+1, end='')\n", " print(' | Batch index:', batch_idx, end='')\n", " print(' | Batch size:', y.size()[0])\n", " \n", " x = x.to(device)\n", " y = y.to(device)\n", " break" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "I6hghKPxj5w0" }, "source": [ "## Model" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "colab": { "autoexec": { "startup": false, "wait_interval": 0 }, "base_uri": "https://localhost:8080/", "height": 85 }, "colab_type": "code", "executionInfo": { "elapsed": 23936, "status": "ok", "timestamp": 1524974497505, "user": { "displayName": "Sebastian Raschka", "photoUrl": "//lh6.googleusercontent.com/-cxK6yOSQ6uE/AAAAAAAAAAI/AAAAAAAAIfw/P9ar_CHsKOQ/s50-c-k-no/photo.jpg", "userId": "118404394130788869227" }, "user_tz": 240 }, "id": "NnT0sZIwj5wu", "outputId": "55aed925-d17e-4c6a-8c71-0d9b3bde5637" }, "outputs": [], "source": [ "##########################\n", "### SETTINGS\n", "##########################\n", "\n", "# Hyperparameters\n", "random_seed = 1\n", "learning_rate = 0.001\n", "num_epochs = 3\n", "\n", "# Architecture\n", "num_features = 128*128\n", "num_classes = 2" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "colab": { "autoexec": { "startup": false, "wait_interval": 0 } }, "colab_type": "code", "id": "_lza9t_uj5w1" }, "outputs": [], "source": [ "##########################\n", "### MODEL\n", "##########################\n", "\n", "\n", "class VGG16(torch.nn.Module):\n", "\n", " def __init__(self, num_features, 
num_classes):\n", " super(VGG16, self).__init__()\n", " \n", " # calculate same padding:\n", " # (w - k + 2*p)/s + 1 = o\n", " # => p = (s(o-1) - w + k)/2\n", " \n", " self.block_1 = nn.Sequential(\n", " nn.Conv2d(in_channels=3,\n", " out_channels=64,\n", " kernel_size=(3, 3),\n", " stride=(1, 1),\n", " # (1(32-1)- 32 + 3)/2 = 1\n", " padding=1), \n", " nn.ReLU(),\n", " nn.Conv2d(in_channels=64,\n", " out_channels=64,\n", " kernel_size=(3, 3),\n", " stride=(1, 1),\n", " padding=1),\n", " nn.ReLU(),\n", " nn.MaxPool2d(kernel_size=(2, 2),\n", " stride=(2, 2))\n", " )\n", " \n", " self.block_2 = nn.Sequential(\n", " nn.Conv2d(in_channels=64,\n", " out_channels=128,\n", " kernel_size=(3, 3),\n", " stride=(1, 1),\n", " padding=1),\n", " nn.ReLU(),\n", " nn.Conv2d(in_channels=128,\n", " out_channels=128,\n", " kernel_size=(3, 3),\n", " stride=(1, 1),\n", " padding=1),\n", " nn.ReLU(),\n", " nn.MaxPool2d(kernel_size=(2, 2),\n", " stride=(2, 2))\n", " )\n", " \n", " self.block_3 = nn.Sequential( \n", " nn.Conv2d(in_channels=128,\n", " out_channels=256,\n", " kernel_size=(3, 3),\n", " stride=(1, 1),\n", " padding=1),\n", " nn.ReLU(),\n", " nn.Conv2d(in_channels=256,\n", " out_channels=256,\n", " kernel_size=(3, 3),\n", " stride=(1, 1),\n", " padding=1),\n", " nn.ReLU(), \n", " nn.Conv2d(in_channels=256,\n", " out_channels=256,\n", " kernel_size=(3, 3),\n", " stride=(1, 1),\n", " padding=1),\n", " nn.ReLU(),\n", " nn.Conv2d(in_channels=256,\n", " out_channels=256,\n", " kernel_size=(3, 3),\n", " stride=(1, 1),\n", " padding=1),\n", " nn.ReLU(),\n", " nn.MaxPool2d(kernel_size=(2, 2),\n", " stride=(2, 2))\n", " )\n", " \n", " \n", " self.block_4 = nn.Sequential( \n", " nn.Conv2d(in_channels=256,\n", " out_channels=512,\n", " kernel_size=(3, 3),\n", " stride=(1, 1),\n", " padding=1),\n", " nn.ReLU(), \n", " nn.Conv2d(in_channels=512,\n", " out_channels=512,\n", " kernel_size=(3, 3),\n", " stride=(1, 1),\n", " padding=1),\n", " nn.ReLU(), \n", " nn.Conv2d(in_channels=512,\n", " 
out_channels=512,\n", " kernel_size=(3, 3),\n", " stride=(1, 1),\n", " padding=1),\n", " nn.ReLU(),\n", " nn.Conv2d(in_channels=512,\n", " out_channels=512,\n", " kernel_size=(3, 3),\n", " stride=(1, 1),\n", " padding=1),\n", " nn.ReLU(), \n", " nn.MaxPool2d(kernel_size=(2, 2),\n", " stride=(2, 2))\n", " )\n", " \n", " self.block_5 = nn.Sequential(\n", " nn.Conv2d(in_channels=512,\n", " out_channels=512,\n", " kernel_size=(3, 3),\n", " stride=(1, 1),\n", " padding=1),\n", " nn.ReLU(), \n", " nn.Conv2d(in_channels=512,\n", " out_channels=512,\n", " kernel_size=(3, 3),\n", " stride=(1, 1),\n", " padding=1),\n", " nn.ReLU(), \n", " nn.Conv2d(in_channels=512,\n", " out_channels=512,\n", " kernel_size=(3, 3),\n", " stride=(1, 1),\n", " padding=1),\n", " nn.ReLU(),\n", " nn.Conv2d(in_channels=512,\n", " out_channels=512,\n", " kernel_size=(3, 3),\n", " stride=(1, 1),\n", " padding=1),\n", " nn.ReLU(), \n", " nn.MaxPool2d(kernel_size=(2, 2),\n", " stride=(2, 2)) \n", " )\n", " \n", " self.classifier = nn.Sequential(\n", " nn.Linear(512*4*4, 4096),\n", " nn.ReLU(),\n", " nn.Linear(4096, 4096),\n", " nn.ReLU(),\n", " nn.Linear(4096, num_classes)\n", " )\n", " \n", " \n", " for m in self.modules():\n", " if isinstance(m, torch.nn.Conv2d):\n", " #n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels\n", " #m.weight.data.normal_(0, np.sqrt(2. 
/ n))\n", " m.weight.detach().normal_(0, 0.05)\n", " if m.bias is not None:\n", " m.bias.detach().zero_()\n", " elif isinstance(m, torch.nn.Linear):\n", " m.weight.detach().normal_(0, 0.05)\n", " m.bias.detach().detach().zero_()\n", " \n", " \n", " def forward(self, x):\n", "\n", " x = self.block_1(x)\n", " x = self.block_2(x)\n", " x = self.block_3(x)\n", " x = self.block_4(x)\n", " x = self.block_5(x)\n", "\n", " logits = self.classifier(x.view(-1, 512*4*4))\n", " probas = F.softmax(logits, dim=1)\n", "\n", " return logits, probas" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Using 4 GPUs\n" ] } ], "source": [ "torch.manual_seed(random_seed)\n", "\n", "#### DATA PARALLEL START ####\n", "\n", "model = VGG16(num_features=num_features, num_classes=num_classes)\n", "if torch.cuda.device_count() > 1:\n", " print(\"Using\", torch.cuda.device_count(), \"GPUs\")\n", " model = nn.DataParallel(model)\n", "\n", "#### DATA PARALLEL END ####\n", " \n", "model.to(device)\n", "\n", "#### DATA PARALLEL START ####\n", " \n", "optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate) " ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "RAodboScj5w6" }, "source": [ "## Training" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "colab": { "autoexec": { "startup": false, "wait_interval": 0 }, "base_uri": "https://localhost:8080/", "height": 1547 }, "colab_type": "code", "executionInfo": { "elapsed": 2384585, "status": "ok", "timestamp": 1524976888520, "user": { "displayName": "Sebastian Raschka", "photoUrl": "//lh6.googleusercontent.com/-cxK6yOSQ6uE/AAAAAAAAAAI/AAAAAAAAIfw/P9ar_CHsKOQ/s50-c-k-no/photo.jpg", "userId": "118404394130788869227" }, "user_tz": 240 }, "id": "Dzh3ROmRj5w7", "outputId": "5f8fd8c9-b076-403a-b0b7-fd2d498b48d7", "scrolled": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch: 001/003 | Batch 
0000/0636 | Cost: 8824.2812\n", "Epoch: 001/003 | Batch 0050/0636 | Cost: 0.7300\n", "Epoch: 001/003 | Batch 0100/0636 | Cost: 0.5690\n", "Epoch: 001/003 | Batch 0150/0636 | Cost: 0.4973\n", "Epoch: 001/003 | Batch 0200/0636 | Cost: 0.4219\n", "Epoch: 001/003 | Batch 0250/0636 | Cost: 0.3375\n", "Epoch: 001/003 | Batch 0300/0636 | Cost: 0.4227\n", "Epoch: 001/003 | Batch 0350/0636 | Cost: 0.2883\n", "Epoch: 001/003 | Batch 0400/0636 | Cost: 0.2740\n", "Epoch: 001/003 | Batch 0450/0636 | Cost: 0.2414\n", "Epoch: 001/003 | Batch 0500/0636 | Cost: 0.3492\n", "Epoch: 001/003 | Batch 0550/0636 | Cost: 0.2444\n", "Epoch: 001/003 | Batch 0600/0636 | Cost: 0.2145\n", "Epoch: 001/003 | Train: 92.041% | Valid: 93.149%\n", "Time elapsed: 6.44 min\n", "Epoch: 002/003 | Batch 0000/0636 | Cost: 0.2167\n", "Epoch: 002/003 | Batch 0050/0636 | Cost: 0.1764\n", "Epoch: 002/003 | Batch 0100/0636 | Cost: 0.2338\n", "Epoch: 002/003 | Batch 0150/0636 | Cost: 0.2053\n", "Epoch: 002/003 | Batch 0200/0636 | Cost: 0.1492\n", "Epoch: 002/003 | Batch 0250/0636 | Cost: 0.2303\n", "Epoch: 002/003 | Batch 0300/0636 | Cost: 0.2100\n", "Epoch: 002/003 | Batch 0350/0636 | Cost: 0.1400\n", "Epoch: 002/003 | Batch 0400/0636 | Cost: 0.1781\n", "Epoch: 002/003 | Batch 0450/0636 | Cost: 0.1286\n", "Epoch: 002/003 | Batch 0500/0636 | Cost: 0.1550\n", "Epoch: 002/003 | Batch 0550/0636 | Cost: 0.1845\n", "Epoch: 002/003 | Batch 0600/0636 | Cost: 0.1180\n", "Epoch: 002/003 | Train: 94.716% | Valid: 95.550%\n", "Time elapsed: 12.84 min\n", "Epoch: 003/003 | Batch 0000/0636 | Cost: 0.1429\n", "Epoch: 003/003 | Batch 0050/0636 | Cost: 0.1716\n", "Epoch: 003/003 | Batch 0100/0636 | Cost: 0.1373\n", "Epoch: 003/003 | Batch 0150/0636 | Cost: 0.1546\n", "Epoch: 003/003 | Batch 0200/0636 | Cost: 0.2527\n", "Epoch: 003/003 | Batch 0250/0636 | Cost: 0.1723\n", "Epoch: 003/003 | Batch 0300/0636 | Cost: 0.1730\n", "Epoch: 003/003 | Batch 0350/0636 | Cost: 0.1218\n", "Epoch: 003/003 | Batch 0400/0636 | Cost: 0.1303\n", 
"Epoch: 003/003 | Batch 0450/0636 | Cost: 0.1377\n", "Epoch: 003/003 | Batch 0500/0636 | Cost: 0.1203\n", "Epoch: 003/003 | Batch 0550/0636 | Cost: 0.1719\n", "Epoch: 003/003 | Batch 0600/0636 | Cost: 0.1307\n", "Epoch: 003/003 | Train: 95.134% | Valid: 95.661%\n", "Time elapsed: 19.24 min\n", "Total Training Time: 19.24 min\n" ] } ], "source": [ "def compute_accuracy(model, data_loader):\n", " correct_pred, num_examples = 0, 0\n", " for i, (features, targets) in enumerate(data_loader):\n", " \n", " features = features.to(device)\n", " targets = targets.to(device)\n", "\n", " logits, probas = model(features)\n", " _, predicted_labels = torch.max(probas, 1)\n", " num_examples += targets.size(0)\n", " correct_pred += (predicted_labels == targets).sum()\n", " return correct_pred.float()/num_examples * 100\n", " \n", "\n", "start_time = time.time()\n", "for epoch in range(num_epochs):\n", " \n", " model.train()\n", " for batch_idx, (features, targets) in enumerate(train_loader):\n", " \n", " features = features.to(device)\n", " targets = targets.to(device)\n", " \n", " ### FORWARD AND BACK PROP\n", " logits, probas = model(features)\n", " cost = F.cross_entropy(logits, targets)\n", " optimizer.zero_grad()\n", " \n", " cost.backward()\n", " \n", " ### UPDATE MODEL PARAMETERS\n", " optimizer.step()\n", " \n", " ### LOGGING\n", " if not batch_idx % 50:\n", " print ('Epoch: %03d/%03d | Batch %04d/%04d | Cost: %.4f' \n", " %(epoch+1, num_epochs, batch_idx, \n", " len(train_loader), cost))\n", "\n", " \n", "\n", " model.eval()\n", " with torch.set_grad_enabled(False): # save memory during inference\n", " print('Epoch: %03d/%03d | Train: %.3f%% | Valid: %.3f%%' % (\n", " epoch+1, num_epochs, \n", " compute_accuracy(model, train_loader),\n", " compute_accuracy(model, valid_loader)))\n", " \n", " print('Time elapsed: %.2f min' % ((time.time() - start_time)/60))\n", " \n", "print('Total Training Time: %.2f min' % ((time.time() - start_time)/60))" ] }, { "cell_type": "markdown", 
"metadata": { "colab_type": "text", "id": "paaeEQHQj5xC" }, "source": [ "## Evaluation" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "colab": { "autoexec": { "startup": false, "wait_interval": 0 }, "base_uri": "https://localhost:8080/", "height": 34 }, "colab_type": "code", "executionInfo": { "elapsed": 6514, "status": "ok", "timestamp": 1524976895054, "user": { "displayName": "Sebastian Raschka", "photoUrl": "//lh6.googleusercontent.com/-cxK6yOSQ6uE/AAAAAAAAAAI/AAAAAAAAIfw/P9ar_CHsKOQ/s50-c-k-no/photo.jpg", "userId": "118404394130788869227" }, "user_tz": 240 }, "id": "gzQMWKq5j5xE", "outputId": "de7dc005-5eeb-4177-9f9f-d9b5d1358db9" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Test accuracy: 94.96%\n" ] } ], "source": [ "model.eval()\n", "\n", "with torch.set_grad_enabled(False): # save memory during inference\n", " print('Test accuracy: %.2f%%' % (compute_accuracy(model, test_loader)))" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "for batch_idx, (features, targets) in enumerate(test_loader):\n", "\n", " features = features\n", " targets = targets\n", " break\n", " \n", "plt.imshow(np.transpose(features[0], (1, 2, 0)))" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Probability Female 97.02%\n" ] } ], "source": [ "logits, probas = model(features.to(device)[0, None])\n", "print('Probability Female %.2f%%' % (probas[0][0]*100))" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "numpy 1.15.4\n", "pandas 0.23.4\n", "torch 1.0.1.post2\n", "PIL.Image 5.3.0\n", "\n" ] } ], "source": [ "%watermark -iv" ] } ], "metadata": { "accelerator": "GPU", "colab": { "collapsed_sections": [], "default_view": {}, "name": "convnet-vgg16.ipynb", "provenance": [], "version": "0.3.2", "views": {} }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.1" }, "toc": { "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": true, "toc_position": { "height": "calc(100% - 180px)", "left": "10px", "top": "150px", "width": "371px" }, "toc_section_display": true, "toc_window_display": true } }, "nbformat": 4, "nbformat_minor": 2 }