{ "cells": [ { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false, "nbgrader": { "checksum": "42d798a03926cde5ab0a0b773d323e29", "grade": false, "grade_id": "cell-4e4261db2c4407db", "locked": true, "schema_version": 1, "solution": false } }, "source": [ "# Part 3: Visualising Convolutional Networks" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false, "nbgrader": { "checksum": "8f7b02984d508010e854561de75ff6d3", "grade": false, "grade_id": "cell-a649bda900315eaa", "locked": true, "schema_version": 1, "solution": false } }, "source": [ "__Before starting, we recommend you enable GPU acceleration if you're running on Colab. You'll also need to upload the weights you downloaded previously using the following block and using the upload button to upload your bettercnn.weights file:__" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Execute this code block to install dependencies when running on colab\n", "try:\n", " import torch\n", "except:\n", " from os.path import exists\n", " from wheel.pep425tags import get_abbr_impl, get_impl_ver, get_abi_tag\n", " platform = '{}{}-{}'.format(get_abbr_impl(), get_impl_ver(), get_abi_tag())\n", " cuda_output = !ldconfig -p|grep cudart.so|sed -e 's/.*\\.\\([0-9]*\\)\\.\\([0-9]*\\)$/cu\\1\\2/'\n", " accelerator = cuda_output[0] if exists('/dev/nvidia0') else 'cpu'\n", "\n", " !pip install -q http://download.pytorch.org/whl/{accelerator}/torch-1.0.0-{platform}-linux_x86_64.whl torchvision\n", "\n", "try: \n", " import torchbearer\n", "except:\n", " !pip install torchbearer\n", " \n", "try:\n", " from google.colab import files\n", " uploaded = files.upload()\n", "except:\n", " print(\"Not running on colab. Ignoring.\")\n", "\n", "!wget http://comp6248.ecs.soton.ac.uk/labs/lab5/0.PNG\n", "!wget http://comp6248.ecs.soton.ac.uk/labs/lab5/1.PNG\n", "!wget http://comp6248.ecs.soton.ac.uk/labs/lab5/2.PNG\n", "!wget http://comp6248.ecs.soton.ac.uk/labs/lab5/3.PNG\n", "!wget http://comp6248.ecs.soton.ac.uk/labs/lab5/4.PNG\n", "!wget http://comp6248.ecs.soton.ac.uk/labs/lab5/5.PNG\n", "!wget http://comp6248.ecs.soton.ac.uk/labs/lab5/6.PNG\n", "!wget http://comp6248.ecs.soton.ac.uk/labs/lab5/7.PNG\n", "!wget http://comp6248.ecs.soton.ac.uk/labs/lab5/8.PNG\n", "!wget http://comp6248.ecs.soton.ac.uk/labs/lab5/9.PNG" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false, "nbgrader": { "checksum": "08fe8a49e72d2b717e24d8a336779c6f", "grade": false, "grade_id": "cell-c83aaff989e87ceb", "locked": true, "schema_version": 1, "solution": false } }, "source": [ "## Visualising the first layers filters and responses\n", "\n", "In our previous `BetterCNN` convolutional network, the first layer was a Convolutional layer. Because this convolutional layer is applied directly to the greylevel input MNIST images the filters that are learned can themselves just be considered to be small (5x5 in this case) greylevel images. \n", "\n", "We'll start by doing a few imports and then loading our pre-trained model. Once again, please copy-paste the forward method from the first workbook:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "checksum": "87de33eb79dfec8861fd9437cf71a989", "grade": false, "grade_id": "cell-558c383ad9759aed", "locked": false, "schema_version": 1, "solution": true } }, "outputs": [], "source": [ "%matplotlib inline\n", "# automatically reload external modules if they change\n", "%load_ext autoreload\n", "%autoreload 2\n", "\n", "import torch \n", "import torch.nn.functional as F\n", "import matplotlib.pyplot as plt\n", "from torch import nn\n", "\n", "import torch \n", "import torch.nn.functional as F\n", "from torch import nn\n", "\n", "# Model Definition\n", "class BetterCNN(nn.Module):\n", " def __init__(self):\n", " super(BetterCNN, self).__init__()\n", " self.conv1 = nn.Conv2d(1, 30, (5, 5), padding=0)\n", " self.conv2 = nn.Conv2d(30, 15, (3, 3), padding=0)\n", " self.fc1 = nn.Linear(15 * 5**2, 128)\n", " self.fc2 = nn.Linear(128, 50)\n", " self.fc3 = nn.Linear(50, 10)\n", " \n", " def forward(self, x):\n", " # YOUR CODE HERE\n", " raise NotImplementedError()\n", "\n", "# build the model and load state\n", "model = BetterCNN()\n", "model.load_state_dict(torch.load('bettercnn.weights'))" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false, "nbgrader": { "checksum": "17db475c8135fcad22ed0a5bce95a7cf", "grade": false, "grade_id": "cell-de20621d24cc0077", "locked": true, "schema_version": 1, "solution": false } }, "source": [ "We can extract the weights of the first layer filters directly from the trained network and visualise them using `matplotlib` like this:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "checksum": "498b26024fded2ac188cba15e67fe7c7", "grade": false, "grade_id": "cell-0a9f4f87c60ba0cf", "locked": true, "schema_version": 1, "solution": false } }, "outputs": [], "source": [ "from scipy.misc import imread\n", "\n", "weights = model.conv1.weight.data.cpu()\n", "\n", "# plot the first layer features\n", "for i in range(0,30):\n", "\tplt.subplot(5,6,i+1)\n", "\tplt.imshow(weights[i, 0, :, :], cmap=plt.get_cmap('gray'))\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false, "nbgrader": { "checksum": "29f5b6fd6822465d1c53a7d130544b51", "grade": false, "grade_id": "cell-e63cf51ff4cf8e11", "locked": true, "schema_version": 1, "solution": false } }, "source": [ "Note that `model.conv1.data` is the tensor holding the weights. Calling `cpu()` ensures data is moved over from the GPU if necessary.\n", "\n", "__Answer the following question (enter the answer in the box below):__\n", "\n", "__1.__ What sort of features do the filters resemble? How does this relate to your knowledge of the training data?" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "nbgrader": { "checksum": "e13855c661d083543ba506aaae35dec6", "grade": true, "grade_id": "cell-2a6cfda360da7de9", "locked": false, "points": 2, "schema_version": 1, "solution": true } }, "source": [ "YOUR ANSWER HERE" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false, "nbgrader": { "checksum": "548977341456f673d50cbb1e1638f15f", "grade": false, "grade_id": "cell-c5c8d0378f255698", "locked": true, "schema_version": 1, "solution": false } }, "source": [ "## Visualising feature maps\n", "\n", "If we forward propagate an input through the network we can also visualise the response maps generated by the filters. The advantage of this kind of visualisation is that we can compute it at any layer, not just the first one. In order to do this in PyTorch, we can propagate the given input through the network to the required point and use a `hook` to intercept the feature maps as they are created. The following code shows how this can be achieved to generate the response maps of the second convolutional layer of our network:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "checksum": "35654c5ee8d457863c4f402cd4ad4864", "grade": false, "grade_id": "cell-8fb551f4810e7d84", "locked": true, "schema_version": 1, "solution": false } }, "outputs": [], "source": [ "from PIL import Image\n", "import torchvision\n", "\n", "transform = torchvision.transforms.ToTensor()\n", "im = transform(Image.open(\"1.PNG\")).unsqueeze(0)\n", "\n", "def hook_function(module, grad_in, grad_out):\n", " for i in range(grad_out.shape[1]):\n", " conv_output = grad_out.data[0, i]\n", " plt.subplot(5, int(1+grad_out.shape[1]/5), i+1)\n", " plt.imshow(conv_output, cmap=plt.get_cmap('gray'))\n", " \n", "hook = model.conv2.register_forward_hook(hook_function) # register the hook\n", "model(im) # forward pass\n", "hook.remove() #Tidy up" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false, "nbgrader": { "checksum": "eaff622c48d684ffe466a627534deff5", "grade": false, "grade_id": "cell-106a1c8abf99792d", "locked": true, "schema_version": 1, "solution": false } }, "source": [ "__Use the following code block to visualise the feature maps of the first convolutional layer__:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false, "nbgrader": { "checksum": "d65d2123a4d36fa419fd0157c5f2c06f", "grade": false, "grade_id": "cell-63d2c948620d68d2", "locked": true, "schema_version": 1, "solution": false } }, "source": [ "A final way of visualising what the filters (at any depth) are learning is to find the input image that maximises the response of the filter. We can do this by starting with a random image and using gradient ascent to optimise the image to maximise the chosen filter (see http://www.iro.umontreal.ca/~lisa/publications2/index.php/publications/show/247 and https://distill.pub/2017/feature-visualization/ for more info on this approach). The following code snippet shows how this can be achieved:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def visualise_maximum_activation(model, target, num=10, alpha = 1.0):\n", " for selected in range(num):\n", " input_img = torch.randn(1, 1, 28, 28, requires_grad=True)\n", "\n", " # we're interested in maximising outputs of the 3rd layer:\n", " conv_output = None\n", "\n", " def hook_function(module, grad_in, grad_out):\n", " nonlocal conv_output\n", " # Gets the conv output of the selected filter/feature (from selected layer)\n", " conv_output = grad_out[0, selected]\n", "\n", " hook = target.register_forward_hook(hook_function)\n", "\n", " for i in range(30):\n", " model(input_img)\n", " loss = torch.mean(conv_output)\n", " loss.backward()\n", "\n", " norm = input_img.grad.std() + 1e-5\n", " input_img.grad /= norm\n", " input_img.data = input_img + alpha * input_img.grad\n", "\n", " hook.remove()\n", "\n", " input_img = input_img.detach()\n", "\n", " plt.subplot(2,num/2,selected+1)\n", " plt.imshow(input_img[0,0], cmap=plt.get_cmap('gray'))\n", "\n", " plt.show()\n", " \n", "visualise_maximum_activation(model, model.fc3)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.7" } }, "nbformat": 4, "nbformat_minor": 2 }