{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Image Classification Laboratory\n", "\n", "\n", "The images used in this lab are from CIFAR-10 (https://en.wikipedia.org/wiki/CIFAR-10). The CIFAR-10 dataset contains 60,000 32x32 color images in 10 different classes. The classes are airplanes, cars, birds, cats, deer, dogs, frogs, horses, ships, and trucks. There are 6,000 images of each class. Your task is to build a classifier that assigns each image to the correct class.\n", "\n", "Let's start by making sure that our script can see the graphics card that will be used. The graphics card performs the time-consuming convolutions in every training iteration." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "import warnings\n", "\n", "# Ignore FutureWarning from numpy\n", "warnings.simplefilter(action='ignore', category=FutureWarning)\n", "\n", "import keras.backend as K\n", "import tensorflow as tf\n", "\n", "os.environ[\"CUDA_DEVICE_ORDER\"] = \"PCI_BUS_ID\"\n", "\n", "# The GPU id to use, usually either \"0\" or \"1\"\n", "os.environ[\"CUDA_VISIBLE_DEVICES\"] = \"0\"\n", "\n", "# Allow growth of GPU memory (otherwise it will look like all the memory is being used, even if you only use 10 MB)\n", "config = tf.ConfigProto()\n", "config.gpu_options.allow_growth = True\n", "K.tensorflow_backend.set_session(tf.Session(config=config))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "## Load data\n", "Load the images and labels from `keras.datasets`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from keras.datasets import cifar10\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "\n", "classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']\n", "\n", "(Xtrain, Ytrain), (Xtest, Ytest) = 
cifar10.load_data()\n", "\n", "print(\"Training images have size {} and labels have size {} \".format(Xtrain.shape, Ytrain.shape))\n", "print(\"Test images have size {} and labels have size {} \\n \".format(Xtest.shape, Ytest.shape))\n", "\n", "# Reduce the number of images for training and testing to 10000 and 2000 respectively,\n", "# to reduce processing time for this lab\n", "Xtrain = Xtrain[0:10000]\n", "Ytrain = Ytrain[0:10000]\n", "\n", "Xtest = Xtest[0:2000]\n", "Ytest = Ytest[0:2000]\n", "\n", "print(\"Reduced training images have size %s and labels have size %s \" % (Xtrain.shape, Ytrain.shape))\n", "print(\"Reduced test images have size %s and labels have size %s \\n\" % (Xtest.shape, Ytest.shape))\n", "\n", "# Check that we have some training examples from each class\n", "for i in range(10):\n", " print(\"Number of training examples for class {} is {}\".format(i, np.sum(Ytrain == i)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Plotting\n", "\n", "Let's look at some of the training examples." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "plt.figure(figsize=(12,4))\n", "for i in range(18):\n", " # Pick a random index that is valid for the current size of Xtrain\n", " idx = np.random.randint(Xtrain.shape[0])\n", " label = Ytrain[idx,0]\n", " \n", " plt.subplot(3,6,i+1)\n", " plt.tight_layout()\n", " plt.imshow(Xtrain[idx])\n", " plt.title(\"Class: {} ({})\".format(label, classes[label]))\n", " plt.axis('off')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Split data into training, validation and testing\n", "Split your training data into training (Xtrain, Ytrain) and validation (Xval, Yval).\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from sklearn.model_selection import train_test_split\n", "\n", "# Use the train_test_split function to divide the training data into training and validation sets\n", "Xtrain, Xval, Ytrain, Yval = train_test_split(Xtrain, Ytrain, 
test_size=0.25)\n", "\n", "print('The training, validation and testing sets contain {}, {} and {} images respectively.'.format(Xtrain.shape[0], Xval.shape[0], Xtest.shape[0]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Preprocessing of images\n", "\n", "Let's perform some preprocessing. The images are stored as uint8, i.e. 8-bit unsigned integers, but need to be converted to 32-bit floats. We also rescale the values from the range 0 to 255 to the range -1 to 1." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"Data type is \",Xtrain.dtype)\n", "\n", "# Convert datatype for Xtrain, Xval, Xtest, to float32\n", "Xtrain = Xtrain.astype('float32')\n", "Xval = Xval.astype('float32')\n", "Xtest = Xtest.astype('float32')\n", "\n", "# Rescale pixel values from [0, 255] to [-1, 1]\n", "Xtrain = Xtrain / 127.5 - 1\n", "Xval = Xval / 127.5 - 1\n", "Xtest = Xtest / 127.5 - 1\n", "\n", "print(\"Data type is \",Xtrain.dtype)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Preprocessing of labels\n", "\n", "The labels need to be converted from e.g. '3' to \"one-hot encoded\", i.e. to a vector of type [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from keras.utils import to_categorical\n", "\n", "print(\"Shape of Ytrain before one-hot encoding is \", Ytrain.shape)\n", "\n", "numClasses = 10\n", "\n", "# Your code\n", "\n", "print(\"Shape of Ytrain after one-hot encoding is \",Ytrain.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2D CNN\n", "Finish this code to create the image classifier, using a 2D CNN. Start with a simple network with 2 convolutional layers, 2 max pooling layers and a final dense layer. The convolutional layers should have 16 and 32 filters with a kernel size of 3 x 3. The max pooling layers should have a pool size of 2 x 2. 
The final dense layer should have 10 nodes (= the number of classes in this lab).\n", "\n", "Start by using SGD (stochastic gradient descent) as the optimization algorithm.\n", "\n", "Relevant functions are:\n", "\n", "`model.add()`, adds a layer to the network\n", "\n", "`Conv2D()`, performs 2D convolutions with a number of filters of a certain size (e.g. 3 x 3). Use `'relu'` activation and `'same'` padding.\n", "\n", "`MaxPooling2D()`, keeps the maximum within each pooling window, which downsamples the feature maps\n", "\n", "`Flatten()`, flattens a multi-channel tensor into a long vector\n", "\n", "`Dense()`, a dense (fully connected) network layer\n", "\n", "---\n", "\n", "Questions\n", "\n", "Why do we have to use a batch size? Why can't we simply use all the images at once? This is more relevant for larger images, as the images in this lab are only 32 x 32 pixels.\n", "\n", "How busy is the GPU for a batch size of 100? Hint: run 'nvidia-smi' on the cloud computer. Increase the batch size until the GPU utilization is 100%.\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from keras.models import Sequential, Model\n", "from keras.layers import Input, Conv2D, BatchNormalization, MaxPooling2D, Flatten, Dense, Dropout\n", "from keras.optimizers import SGD, Adam\n", "from keras.losses import categorical_crossentropy as CC\n", "from keras import regularizers\n", "\n", "# Set the seed for the random number generator, for better comparisons\n", "from numpy.random import seed\n", "seed(123)\n", "\n", "def build_CNN(input_shape, reg_parameter=0.0):\n", "\n", " # Setup a sequential model\n", " model = Sequential()\n", "\n", " # Add layers to the model\n", " \n", " # Your code\n", " \n", " # Compile model\n", " \n", " # Your code\n", " \n", " return model" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "# Setup some training parameters (feel free to change)\n", "batch_size = 
100\n", "epochs = 100\n", "\n", "# Build model\n", "input_shape = Xtrain.shape[1:]\n", "model = build_CNN(input_shape)\n", "\n", "# Train the model\n", "history = # Your code\n", "\n", "# Evaluate the trained model on test set, not used in training or validation\n", "score = model.evaluate(Xtest, Ytest, batch_size = batch_size, verbose=0)\n", "print('Test loss: %.4f' % score[0])\n", "print('Test accuracy: %.4f' % score[1])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Plot training and validation losses and accuracy\n", "# Look up the curves by key instead of relying on dictionary order;\n", "# the accuracy key is 'acc' or 'accuracy' depending on the Keras version\n", "hist = history.history\n", "loss, val_loss = hist['loss'], hist['val_loss']\n", "acc = hist.get('acc', hist.get('accuracy'))\n", "val_acc = hist.get('val_acc', hist.get('val_accuracy'))\n", "\n", "plt.figure(figsize=(10,4))\n", "plt.xlabel('Epochs')\n", "plt.ylabel('Loss')\n", "plt.plot(loss)\n", "plt.plot(val_loss)\n", "plt.legend(['Training','Validation'])\n", "\n", "plt.figure(figsize=(10,4))\n", "plt.xlabel('Epochs')\n", "plt.ylabel('Accuracy')\n", "plt.plot(acc)\n", "plt.plot(val_acc)\n", "plt.legend(['Training','Validation'])\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Improving performance\n", "Write down the test accuracy. Are you satisfied with the classifier performance (random chance is 10%)? How many epochs does it take to reach 100% accuracy on the training data?\n", "\n", "Change the number of filters in Conv2D from 16 and 32 to 32 and 64. Write down the test accuracy.\n", "\n", "Add 2 more convolutional layers, with 128 and 256 filters. Also add 2 more max pooling layers, one per convolutional layer. Write down the test accuracy.\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Batch normalization\n", "\n", "Now add batch normalization after each Conv2D layer. How many epochs does it take to reach 100% accuracy on the training data?\n", "\n", "`BatchNormalization()`, normalizes the activations of the previous layer at each batch, i.e. 
applies a transformation that maintains the mean activation close to 0 and the activation standard deviation close to 1.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Optimizer\n", "\n", "Try changing the optimizer from SGD to Adam (with default parameters). Can you see any difference in the test accuracy? How many epochs does each optimization algorithm need to reach 100% accuracy on the training data? This is a very small comparison, so don't draw far-reaching conclusions from it." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Regularization - L2\n", "\n", "How can the accuracy for validation and test data be improved?\n", "\n", "Try adding kernel regularization (L2) for each Conv2D layer, see https://keras.io/regularizers/\n", "\n", "Train the network again, with Adam, and write down the test accuracy. Try at least 5 different settings for the degree of regularization, and write down the test accuracy for each." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Regularization - dropout\n", "\n", "Set the L2 regularization parameter to 0.0.\n", "\n", "Add a second dense layer with 20 nodes (before the existing dense layer) with 'relu' activation. Train the network and write down the test accuracy.\n", "\n", "Now add a Dropout layer between the two dense layers, with a dropout probability of 0.5. Train the network and write down the test accuracy.\n", "\n", "Set the L2 regularization parameter back to some value > 0, train the network and write down the test accuracy.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Augmentation using Keras `ImageDataGenerator`\n", "\n", "We can increase the number of training images using data augmentation (we now ignore that CIFAR-10 actually has 50,000 training images). Keras has its own built-in augmentation (for 2D images, but not for 3D). The generated images are created on-the-fly, and are therefore not stored anywhere. 
Let's try it; see https://keras.io/preprocessing/image/ for details.\n", "\n", "How does the test accuracy change compared to training without augmentation?\n", "\n", "How does the training time change compared to training without augmentation?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Reload the first 10 000 training images. ImageDataGenerator manages the validation split on its own\n", "(Xtrain, Ytrain), _ = cifar10.load_data()\n", "\n", "Xtrain = Xtrain[0:10000]\n", "Xtrain = Xtrain.astype('float32')\n", "Xtrain = Xtrain / 127.5 - 1\n", "\n", "Ytrain = Ytrain[0:10000]\n", "Ytrain = to_categorical(Ytrain, numClasses)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Set up a data generator with on-the-fly data augmentation\n", "from keras.preprocessing.image import ImageDataGenerator\n", "\n", "datagen = ImageDataGenerator(\n", " rotation_range=20,\n", " width_shift_range=0.2,\n", " height_shift_range=0.2,\n", " horizontal_flip=True,\n", " validation_split=0.2)\n", "\n", "train_datagen = datagen.flow(Xtrain, Ytrain, batch_size=batch_size, subset='training')\n", "val_datagen = datagen.flow(Xtrain, Ytrain, batch_size=batch_size, subset='validation')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Plot some augmented images\n", "plot_datagen = datagen.flow(Xtrain, Ytrain, batch_size=1)\n", "\n", "plt.figure(figsize=(12,4))\n", "for i in range(18):\n", " (im, label) = next(plot_datagen)\n", " # Map the image back from [-1, 1] to [0, 255] for display\n", " im = (im[0] + 1) * 127.5\n", " im = im.astype('int')\n", " label = np.flatnonzero(label)[0]\n", " \n", " plt.subplot(3,6,i+1)\n", " plt.tight_layout()\n", " plt.imshow(im)\n", " plt.title(\"Class: {} ({})\".format(label, classes[label]))\n", " plt.axis('off')\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Build a new model\n", "model = build_CNN(input_shape)\n", "\n", "# Train the model using 
on-the-fly augmentation\n", "history = model.fit_generator(train_datagen, epochs=epochs, steps_per_epoch=15, validation_data=val_datagen, validation_steps=5, use_multiprocessing=True)\n", "\n", "# Evaluate the trained model on test set, not used in training or validation\n", "score = model.evaluate(Xtest, Ytest, batch_size = batch_size, verbose=0)\n", "print('Test loss: %.4f' % score[0])\n", "print('Test accuracy: %.4f' % score[1])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Plot training and validation losses and accuracy\n", "# Look up the curves by key instead of relying on dictionary order;\n", "# the accuracy key is 'acc' or 'accuracy' depending on the Keras version\n", "hist = history.history\n", "loss, val_loss = hist['loss'], hist['val_loss']\n", "acc = hist.get('acc', hist.get('accuracy'))\n", "val_acc = hist.get('val_acc', hist.get('val_accuracy'))\n", "\n", "plt.figure(figsize=(10,4))\n", "plt.xlabel('Epochs')\n", "plt.ylabel('Loss')\n", "plt.plot(loss)\n", "plt.plot(val_loss)\n", "plt.legend(['Training','Validation'])\n", "\n", "plt.figure(figsize=(10,4))\n", "plt.xlabel('Epochs')\n", "plt.ylabel('Accuracy')\n", "plt.plot(acc)\n", "plt.plot(val_acc)\n", "plt.legend(['Training','Validation'])\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Plot misclassified images" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Find misclassified images\n", "# The predicted class is the index of the largest output of the network\n", "y_pred = np.argmax(model.predict(Xtest), axis=1)\n", "y_correct = np.argmax(Ytest,axis=1)\n", "\n", "miss = np.flatnonzero(y_correct != y_pred)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Plot a few of them\n", "plt.figure(figsize=(15,4))\n", "perm = np.random.permutation(miss)\n", "for i in range(18):\n", " im = (Xtest[perm[i]] + 1) * 127.5\n", " im = im.astype('int')\n", " label_correct = y_correct[perm[i]]\n", " label_pred = y_pred[perm[i]]\n", " \n", " plt.subplot(3,6,i+1)\n", " plt.tight_layout()\n", " plt.imshow(im)\n", " plt.axis('off')\n", " plt.title(\"{}, classified as {}\".format(classes[label_correct], classes[label_pred]))\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, 
"source": [ "## Optimize parameters\n", "\n", "Now you have seen the most important parts for training a CNN. Try to optimize the hyper parameters (number of convolutional layers, number of filters per layer, learning rate, degree of regularization) to get as high test accuracy as possible." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Report\n", "\n", "Write a short report where you discuss how different settings and parameters affect the test accuracy.\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 2 }