{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Artificial Neural Networks in PyTorch\n", "> In this second chapter, we delve deeper into Artificial Neural Networks, learning how to train them with real datasets. This is a summary of the lecture \"Introduction to Deep Learning with PyTorch\", via DataCamp.\n", "\n", "- toc: true \n", "- badges: true\n", "- comments: true\n", "- author: Chanseok Kang\n", "- categories: [Python, Datacamp, PyTorch, Deep_Learning]\n", "- image: images/activation.png" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import torch\n", "import torch.nn as nn\n", "import numpy as np\n", "import pandas as pd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Activation functions\n", "![activation](image/activation.png)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Neural networks\n", "Let us compare neural networks that apply `ReLU` with those that do not.\n", "\n", "We are going to convince ourselves that a network with multiple layers but no non-linearity can be expressed as a network with a single layer. This works because matrix multiplication is associative: multiplying the input by each weight matrix in turn gives the same result as multiplying it by the product of all the weight matrices.\n", "\n", "The network and the shapes of its layers and weights are shown below.\n", "![net-ex](image/net-ex.jpg)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "input_layer = torch.tensor([[ 0.0401, -0.9005, 0.0397, -0.0876]])\n", "\n", "weight_1 = torch.tensor([[-0.1094, -0.8285, 0.0416, -1.1222],\n", " [ 0.3327, -0.0461, 1.4473, -0.8070],\n", " [ 0.0681, -0.7058, -1.8017, 0.5857],\n", " [ 0.8764, 0.9618, -0.4505, 0.2888]])\n", "\n", "weight_2 = torch.tensor([[ 0.6856, -1.7650, 1.6375, -1.5759],\n", " [-0.1092, -0.1620, 0.1951, -0.1169],\n", " [-0.5120, 1.1997, 0.8483, -0.2476],\n", " [-0.3369, 0.5617, -0.6658, 0.2221]])\n", "\n", "weight_3 = torch.tensor([[ 0.8824, 0.1268, 1.1951, 1.3061],\n", " [-0.8753, -0.3277, -0.1454, -0.0167],\n", " [ 0.3582, 
0.3254, -1.8509, -1.4205],\n", " [ 0.3786, 0.5999, -0.5665, -0.3975]])" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "tensor([[0.2653, 0.1311, 3.8219, 3.0032]])\n", "tensor([[0.2653, 0.1311, 3.8219, 3.0032]])\n" ] } ], "source": [ "# Calculate the first and second hidden layer\n", "hidden_1 = torch.matmul(input_layer, weight_1)\n", "hidden_2 = torch.matmul(hidden_1, weight_2)\n", "\n", "# Calculate the output\n", "print(torch.matmul(hidden_2, weight_3))\n", "\n", "# Calculate weight_composed_1 and weight\n", "weight_composed_1 = torch.matmul(weight_1, weight_2)\n", "weight = torch.matmul(weight_composed_1, weight_3)\n", "\n", "# Multiply input_layer with weight\n", "print(torch.matmul(input_layer, weight))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### ReLU activation\n", "Now we are going to build a neural network with non-linearities. By doing so, we will convince ourselves that a network with multiple layers and non-linear activation functions cannot be expressed as a neural network with one layer." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "tensor([[-0.2770, -0.0345, -0.1410, -0.0664]])\n", "tensor([[-0.2117, -0.4782, 4.0438, 3.0417]])\n" ] } ], "source": [ "relu = nn.ReLU()\n", "\n", "# Apply non-linearity on hidden_1 and hidden_2\n", "hidden_1_activated = relu(torch.matmul(input_layer, weight_1))\n", "hidden_2_activated = relu(torch.matmul(hidden_1_activated, weight_2))\n", "print(torch.matmul(hidden_2_activated, weight_3))\n", "\n", "# Apply non-linearity in the product of first two weights\n", "weight_composed_1_activated = relu(torch.matmul(weight_1, weight_2))\n", "\n", "# Multiply `weight_composed_1_activated` with `weight_3`\n", "weight = torch.matmul(weight_composed_1_activated, weight_3)\n", "\n", "# Multiply input_layer with weight\n", "print(torch.matmul(input_layer, weight))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### ReLU activation again\n", "Neural networks don't need to have the same number of units in each layer. Here, you are going to experiment with the `ReLU` activation function again, but this time we are going to have a different number of units in the layers of the neural network. 
The input layer will still have 4 features, but the hidden layer will have 6 units and the output layer will have 2 units.\n", "![net-ex2](image/net-ex2.jpg)\n" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "tensor([[0., 0.]])\n" ] } ], "source": [ "# Instantiate weight_1 and weight_2 with random numbers\n", "weight_1 = torch.rand(4, 6)\n", "weight_2 = torch.rand(6, 2)\n", "\n", "# Multiply input_layer with weight_1\n", "hidden_1 = torch.matmul(input_layer, weight_1)\n", "\n", "# Apply ReLU activation function over hidden_1 and multiply with weight_2\n", "hidden_1_activated = relu(hidden_1)\n", "print(torch.matmul(hidden_1_activated, weight_2))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loss functions\n", "- Process\n", " - Initialize the neural network with random weights\n", " - Do a forward pass\n", " - Calculate the loss function (a single number)\n", " - Calculate the gradients\n", " - Change the weights based on the gradients" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Calculating loss function in PyTorch\n", "You are going to code the previous exercise in PyTorch and check that the loss was computed correctly. Predicted scores are -1.2 for class 0 (cat), 0.12 for class 1 (car) and 4.8 for class 2 (frog). The ground truth is class 2 (frog). 
Compute the loss function in PyTorch.\n", "\n", "| Class | Predicted Score |\n", "| ------ | ------- |\n", "| Cat | -1.2 |\n", "| Car | 0.12 |\n", "| Frog | 4.8 |" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "tensor(0.0117)\n" ] } ], "source": [ "# Initialize the scores and ground truth\n", "logits = torch.tensor([[-1.2, 0.12, 4.8]])\n", "ground_truth = torch.tensor([2])\n", "\n", "# Instantiate cross entropy loss\n", "criterion = nn.CrossEntropyLoss()\n", "\n", "# Compute and print the loss\n", "loss = criterion(logits, ground_truth)\n", "print(loss)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The loss PyTorch calculated matches the loss you calculated by hand. Being able to understand and calculate loss functions is a very important skill in deep learning." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Loss function of random scores\n", "If the neural network predicts random scores, what would its loss be? Let's find out in PyTorch. The neural network is going to have 1000 classes, each having a random score. For ground truth, it will have class 111. 
Calculate the loss function.\n", "\n" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "tensor(7.0403)\n" ] } ], "source": [ "# Initialize logits and ground truth\n", "logits = torch.rand(1, 1000)\n", "ground_truth = torch.tensor([111])\n", "\n", "# Instantiate cross-entropy loss\n", "criterion = nn.CrossEntropyLoss()\n", "\n", "# Calculate and print the loss\n", "loss = criterion(logits, ground_truth)\n", "print(loss)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Preparing a dataset in PyTorch" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Preparing MNIST dataset\n", "You are going to prepare dataloaders for MNIST training and testing set. As we explained in the lecture, MNIST has some differences to CIFAR-10, with the main difference being that MNIST images are grayscale (1 channel based) instead of RGB (3 channels).\n", "\n" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to mnist/MNIST/raw/train-images-idx3-ubyte.gz\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "ef177fe0999044fb8076320620b5e0c4", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(FloatProgress(value=1.0, bar_style='info', max=1.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Extracting mnist/MNIST/raw/train-images-idx3-ubyte.gz to mnist/MNIST/raw\n", "Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to mnist/MNIST/raw/train-labels-idx1-ubyte.gz\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "0a29edf278b54072af97f1c02c4581a6", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(FloatProgress(value=1.0, 
bar_style='info', max=1.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Extracting mnist/MNIST/raw/train-labels-idx1-ubyte.gz to mnist/MNIST/raw\n", "Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to mnist/MNIST/raw/t10k-images-idx3-ubyte.gz\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "8cfa953de67a442e97af7c439065346f", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(FloatProgress(value=1.0, bar_style='info', max=1.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Extracting mnist/MNIST/raw/t10k-images-idx3-ubyte.gz to mnist/MNIST/raw\n", "Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to mnist/MNIST/raw/t10k-labels-idx1-ubyte.gz\n", "\n", "\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "4820488736684ad4a0cc8c9e05bf9c9a", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(FloatProgress(value=1.0, bar_style='info', max=1.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Extracting mnist/MNIST/raw/t10k-labels-idx1-ubyte.gz to mnist/MNIST/raw\n", "Processing...\n", "Done!\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/opt/conda/conda-bld/pytorch_1591914855613/work/torch/csrc/utils/tensor_numpy.cpp:141: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. 
This type of warning will be suppressed for the rest of this program.\n" ] } ], "source": [ "import torchvision\n", "import torchvision.transforms as transforms\n", "\n", "# Transform the data to torch tensors and normalize it\n", "# (mean and std are passed as one-element tuples, one entry per channel)\n", "transform = transforms.Compose([\n", " transforms.ToTensor(),\n", " transforms.Normalize((0.1307,), (0.3081,))\n", "])\n", "\n", "# Preparing training set and test set\n", "trainset = torchvision.datasets.MNIST('mnist', train=True, download=True, transform=transform)\n", "testset = torchvision.datasets.MNIST('mnist', train=False, download=True, transform=transform)\n", "\n", "# Prepare training loader and test loader\n", "train_loader = torch.utils.data.DataLoader(trainset, batch_size=32, shuffle=True, num_workers=0)\n", "test_loader = torch.utils.data.DataLoader(testset, batch_size=32, shuffle=False, num_workers=0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Inspecting the dataloaders\n", "Now you are going to explore the dataloaders you created in the previous exercise. In particular, you will compute the shape of the dataset in addition to the minibatch size." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Note: The `train_data` and `test_data` attributes of the dataset have been replaced by `data`." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "torch.Size([60000, 28, 28]) torch.Size([10000, 28, 28])\n", "32 32\n" ] } ], "source": [ "# Compute the shape of the training set and test set\n", "trainset_shape = train_loader.dataset.data.shape\n", "testset_shape = test_loader.dataset.data.shape\n", "\n", "# Print the computed shapes\n", "print(trainset_shape, testset_shape)\n", "\n", "# Compute the size of the minibatch for training set and test set\n", "trainset_batchsize = train_loader.batch_size\n", "testset_batchsize = test_loader.batch_size\n", "\n", "# Print sizes of the minibatch\n", "print(trainset_batchsize, testset_batchsize)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Training neural networks" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Building a neural network - again\n", "You haven't created a neural network since the end of the first chapter, so this is a good time to build one (practice makes perfect). Build a class for a neural network which will be used to train on the MNIST dataset. The dataset contains images of shape (28, 28, 1), so you should deduce the size of the input layer from that. For the hidden layer use 200 units, and for the output layer use 10 units (one for each class). For the activation function, use `relu` in its functional form.\n", "\n", "For context, the same net will be trained and used to make predictions in the next two exercises." 
] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "import torch.nn.functional as F\n", "\n", "# Define the class Net\n", "class Net(nn.Module):\n", " def __init__(self):\n", " # Define all the parameters of the net\n", " super(Net, self).__init__()\n", " self.fc1 = nn.Linear(28 * 28 * 1, 200)\n", " self.fc2 = nn.Linear(200, 10)\n", " \n", " def forward(self, x):\n", " # Do the forward pass\n", " x = F.relu(self.fc1(x))\n", " x = self.fc2(x)\n", " return x" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Training a neural network\n", "Given the fully connected neural network (called `model`) which you built in the previous exercise and a train loader called `train_loader` containing the MNIST dataset (which we created for you), you're to train the net in order to predict the classes of digits. You will use the Adam optimizer to optimize the network, and considering that this is a classification problem you are going to use cross entropy as loss function.\n", "\n" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "import torch.optim as optim\n", "\n", "# Instantiate the Adam optimizer and Cross-Entropy loss function\n", "model = Net()\n", "optimizer = optim.Adam(model.parameters(), lr=3e-4)\n", "criterion = nn.CrossEntropyLoss()\n", "\n", "for batch_idx, data_target in enumerate(train_loader):\n", " data = data_target[0]\n", " target = data_target[1]\n", " \n", " data = data.view(-1, 28 * 28)\n", " optimizer.zero_grad()\n", " \n", " # Compute a forward pass\n", " output = model(data)\n", " \n", " # Compute the loss gradients and change the weights\n", " loss = criterion(output, target)\n", " loss.backward()\n", " optimizer.step()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using the network to make predictions\n", "Now that you have trained the network, use it to make predictions for the data in the testing set. 
The network is called `model` (same as in the previous exercise), and the loader is called `test_loader`. We have already initialized variables `total` and `correct` to 0." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The test set accuracy of the network is: 95 %\n" ] } ], "source": [ "correct, total = 0, 0\n", "\n", "# Set the model in eval mode\n", "model.eval()\n", "\n", "for i, data in enumerate(test_loader, 0):\n", " inputs, labels = data\n", " \n", " # Put each image into a vector\n", " inputs = inputs.view(-1, 28 * 28)\n", " \n", " # Do the forward pass and get the predictions\n", " outputs = model(inputs)\n", " _, outputs = torch.max(outputs.data, 1)\n", " total += labels.size(0)\n", " correct += (outputs == labels).sum().item()\n", "print('The test set accuracy of the network is: %d %%' % (100 * correct / total)) " ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.6" } }, "nbformat": 4, "nbformat_minor": 4 }