{ "cells": [ { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "checksum": "512b9d458893e474da69aa7e23a01e24", "grade": false, "grade_id": "cell-9e42398fb4955fba", "locked": true, "schema_version": 1, "solution": false } }, "outputs": [], "source": [ "# Execute this code block to install dependencies when running on Colab\n", "try:\n", "    import torch\n", "except ImportError:\n", "    from os.path import exists\n", "    from wheel.pep425tags import get_abbr_impl, get_impl_ver, get_abi_tag\n", "    platform = '{}{}-{}'.format(get_abbr_impl(), get_impl_ver(), get_abi_tag())\n", "    cuda_output = !ldconfig -p|grep cudart.so|sed -e 's/.*\\.\\([0-9]*\\)\\.\\([0-9]*\\)$/cu\\1\\2/'\n", "    accelerator = cuda_output[0] if exists('/dev/nvidia0') else 'cpu'\n", "\n", "    !pip install -q http://download.pytorch.org/whl/{accelerator}/torch-1.0.0-{platform}-linux_x86_64.whl torchvision" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false, "nbgrader": { "checksum": "57488d0734123ec82683f50de0d1496d", "grade": false, "grade_id": "cell-9daf9d2f6ba17cf3", "locked": true, "schema_version": 1, "solution": false } }, "source": [ "## The Fashion-MNIST Dataset\n", "\n", "Fashion-MNIST is a dataset of Zalando’s article images consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28×28 grayscale image, associated with a label from 10 classes. Fashion-MNIST is intended to serve as a direct drop-in replacement for the original MNIST dataset when benchmarking machine learning algorithms, while being a more challenging dataset.\n", "\n", "For this lab, let's start by loading the Fashion-MNIST dataset using the Torchvision library. As the data is loaded, we transform each image into a flattened vector of dimension 784 (= 28 × 28).\n", "\n", "Once the data has been downloaded, we can plot some of the examples to see what the classes look like."
] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false, "nbgrader": { "checksum": "a7eaa07fe30212adedb7388dbd85af8d", "grade": false, "grade_id": "cell-2122f281579eb211", "locked": true, "schema_version": 1, "solution": false } }, "source": [ "### Loading the Dataset using PyTorch and the Torchvision Library" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "checksum": "2cf6a4b32ba0c3cfe5b2265f3bb9980f", "grade": false, "grade_id": "cell-71a9243a4d066e4b", "locked": true, "schema_version": 1, "solution": false } }, "outputs": [], "source": [ "import torch\n", "import torchvision\n", "import torchvision.transforms as transforms\n", "\n", "batch_size = 256\n", "\n", "# dataset construction\n", "transform = transforms.Compose([\n", "    transforms.ToTensor(),  # convert to tensor\n", "    # flatten each 1x28x28 image into a 784-vector; view(-1) avoids relying on\n", "    # image_dim, which is only defined in a later cell\n", "    transforms.Lambda(lambda x: x.view(-1))\n", "])\n", "\n", "train_set = torchvision.datasets.FashionMNIST(\n", "    root='./data/FashionMNIST',\n", "    train=True,\n", "    download=True,\n", "    transform=transform,\n", ")\n", "\n", "train_loader = torch.utils.data.DataLoader(\n", "    train_set, batch_size=batch_size\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "checksum": "c4f8d5982641adb7486fd2aa2ba27e53", "grade": false, "grade_id": "cell-f128ce750ee024be", "locked": true, "schema_version": 1, "solution": false } }, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "import matplotlib.pyplot as plt\n", "\n", "# plot the first eight training images on a 2x4 grid\n", "for i in range(8):\n", "    plt.subplot(2, 4, i + 1)\n", "    plt.imshow(train_set.train_data[i], cmap=plt.get_cmap('gray'))\n", "\n", "# show the plot\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false, "nbgrader": { "checksum": "d28cdaefd305274758b338af6fe64f68", "grade": false, "grade_id": "cell-c4a66cd9fec76585", "locked": 
true, "schema_version": 1, "solution": false } }, "source": [ "# Implement an Autoencoder\n", "\n", "The next step is to implement a very simple autoencoder.\n", "\n", "Recall from the lecture that an autoencoder is an unsupervised model consisting of an encoder and a decoder. The input passes through the encoder, which typically contains a bottleneck that reduces the dimensionality of the input. This latent code, a reduced-dimensionality representation of the input, is then passed through the decoder to reconstruct the input. The reconstruction will be a lossy version of the input. The encoder and decoder are neural networks, and they are trained in the same way as any other neural network.\n", "\n", "For this implementation, assume the Encoder (defined below) has only an input and an output layer; there is no hidden layer in the encoder. Assume the dimensionality of the latent space is $64$.\n", "\n", "The Decoder is likewise a simple dense layer without a hidden layer (simply an input and an output layer). Its output layer should use a Sigmoid non-linearity as opposed to ReLU (which may be used for the other layers).\n", "\n", "Start by defining the Encoder and Decoder classes." ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "checksum": "9b480dd0c33e23b0b5dfd8c4a546393b", "grade": true, "grade_id": "cell-9c0efdfaad95aa4d", "locked": false, "points": 6, "schema_version": 1, "solution": true } }, "outputs": [], "source": [ "import torch\n", "import torch.nn as nn\n", "import torch.nn.functional as F\n", "\n", "class Encoder(nn.Module):\n", "    '''\n", "    simple encoder with no hidden dense layer;\n", "    hidden_dim is the size of the latent code\n", "    '''\n", "    def __init__(self, input_dim, hidden_dim):\n", "        super(Encoder, self).__init__()\n", "        # YOUR CODE HERE\n", "        raise NotImplementedError()\n", "\n", "    def forward(self, x):\n", "        # YOUR CODE HERE\n", "        raise NotImplementedError()\n", "\n", "class Decoder(nn.Module):\n", "    '''\n", "    simple decoder: a single dense output layer (no hidden layer)\n", "    with a sigmoid to squish values into [0, 1]\n", "    '''\n", "    def __init__(self, input_dim, output_dim):\n", "        super(Decoder, self).__init__()\n", "        # YOUR CODE HERE\n", "        raise NotImplementedError()\n", "\n", "    def forward(self, x):\n", "        # YOUR CODE HERE\n", "        raise NotImplementedError()" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false, "nbgrader": { "checksum": "3db39fdb6caa610d5867f8d5cc543d03", "grade": false, "grade_id": "cell-5b3a0d1e71e6d5c6", "locked": true, "schema_version": 1, "solution": false } }, "source": [ "Next, let's test the autoencoder implementation to make sure it is functioning and see what the reconstructed images look like.\n", "\n", "The code to test your autoencoder is written below. You only need to write the code to display your reconstructed images." ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "checksum": "5fd6a5d1b81d78fb6e25cbad5a6e4b56", "grade": true, "grade_id": "cell-33025b3e04e43ebb", "locked": false, "points": 2, "schema_version": 1, "solution": true } }, "outputs": [], "source": [ "import matplotlib.gridspec as gridspec\n", "import os\n", "import torch.optim as optim\n", "import numpy as np\n", "\n", "from tqdm.autonotebook import tqdm\n", "from itertools import chain\n", "\n", "enc_dim = 64\n", "image_dim = 784  # 28 x 28, flattened\n", "nEpoch = 10\n", "\n", "# construct the encoder, decoder and optimiser\n", "enc = Encoder(image_dim, enc_dim)\n", "dec = Decoder(enc_dim, image_dim)\n", "optimizer = optim.Adam(chain(enc.parameters(), dec.parameters()), lr=1e-3)\n", "\n", "# training loop\n", "for epoch in range(nEpoch):\n", "    losses = []\n", "    trainloader = tqdm(train_loader)\n", "\n", "    for i, data in enumerate(trainloader, 0):\n", "        inputs, _ = data  # labels are not needed for reconstruction\n", "        optimizer.zero_grad()\n", "\n", "        z = enc(inputs)\n", "        outputs = dec(z)\n", "\n", "        # reconstruction loss: summed over pixels, averaged over the batch\n", "        loss = F.binary_cross_entropy(outputs, inputs, reduction='sum') / inputs.shape[0]\n", "        loss.backward()\n", "        optimizer.step()\n", "\n", "        # keep track of the loss and update the stats\n", "        losses.append(loss.item())\n", "        trainloader.set_postfix(loss=np.mean(losses), epoch=epoch)\n", "\n", "\n", "    ## Display some of the reconstructed images\n", "    # YOUR CODE HERE\n", "    raise NotImplementedError()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.7" } }, "nbformat": 4, "nbformat_minor": 2 }