{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Image Segmentation Laboratory\n", "\n", "60 different blood retinal vessels images with corresponding ground-truths have been collected from the available image databases [STARE](http://cecas.clemson.edu/~ahoover/stare/) and [DRIVE](https://www.isi.uu.nl/Research/Databases/DRIVE/). The aim of the assignment is to develop a [U-Net](https://lmb.informatik.uni-freiburg.de/people/ronneber/u-net/) convolutional neural network able to segment and highlight the retinal vessels from such images.\n", "\n", "\n", "Let's start being sure that our script can see the graphics card that will be used. The graphics cards will perform all the time consuming convolutions in every training iteration." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "import warnings\n", "\n", "# Ignore FutureWarning from numpy\n", "warnings.simplefilter(action='ignore', category=FutureWarning)\n", "\n", "import keras.backend as K\n", "import tensorflow as tf\n", "\n", "os.environ[\"CUDA_DEVICE_ORDER\"]=\"PCI_BUS_ID\";\n", " \n", "# The GPU id to use, usually either \"0\" or \"1\";\n", "os.environ[\"CUDA_VISIBLE_DEVICES\"]=\"0\";\n", "\n", "# Allow growth of GPU memory (otherwise it will look like all the memory is being used, even if you only use 10 MB)\n", "config = tf.ConfigProto()\n", "config.gpu_options.allow_growth = True\n", "K.tensorflow_backend.set_session(tf.Session(config=config))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "## Load data\n", "Load your data into two numpy arrays, one for the images and one for the ground truth segmentations.\n", "\n", "In the end, tranform the ground-truth dataset into a categorical one.\n", "\n", "Ignore the warnings" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "from matplotlib import pyplot as plt\n", "from keras.preprocessing.image import load_img, img_to_array, array_to_img, ImageDataGenerator\n", "from keras.utils import to_categorical\n", "\n", "classes = [\"Background\", \"Retinal Vessels\"]\n", "Nclasses = len(classes)\n", "newSize = 320\n", "\n", "image_list = []\n", "label_list = []\n", "for i in range(60):\n", " filename_im = 'Data/Images/image' +str(i+1)+ '.tif'\n", " filename_gt = 'Data/Masks/mask' +str(i+1)+ '.gif'\n", " im = img_to_array(load_img(filename_im,target_size=(newSize,newSize)))/255\n", " image_list.append(im)\n", " label = img_to_array(load_img(filename_gt,target_size=(newSize,newSize),color_mode=\"grayscale\"))/255\n", " label_list.append(label)\n", " \n", "# Convert lists into numpy arrays \n", "imds = np.asarray(image_list)\n", "gtds = np.asarray(label_list, dtype=int)\n", "\n", "# Show the first retinal image and the first ground truth image\n", "plt.figure(figsize=(10,10))\n", "plt.subplot(121)\n", "plt.imshow(array_to_img(imds[0]))\n", "plt.title('Original Image')\n", "plt.subplot(122)\n", "plt.imshow(array_to_img(gtds[0,:,:]))\n", "plt.title('Ground-truth')\n", "\n", "# Transform ground truth images into categorical\n", "gtds = to_categorical(gtds, Nclasses)\n", "print('The image dataset has shape: {}'.format(imds.shape))\n", "print('The ground-truth dataset has shape: {}'.format(gtds.shape))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Split data into training, validation and testing\n", "Define your training (Xtrain, Ytrain), validation (Xvalid, Yvalid) and testing 
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Split data into training, validation and testing\n", "Define your training (Xtrain, Ytrain), validation (Xvalid, Yvalid) and testing (Xtest, Ytest) sets so that they contain 45, 9 and 6 images, respectively." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
 "from sklearn.model_selection import train_test_split\n",
 "\n",
 "X, Xtest, Y, Ytest = train_test_split(imds, gtds, test_size=0.1)\n",
 "Xtrain, Xvalid, Ytrain, Yvalid = train_test_split(X, Y, test_size=0.15)\n",
 "\n",
 "print('The training, validation and testing sets contain {}, {} and {} images respectively.'.format(Xtrain.shape[0], Xvalid.shape[0], Xtest.shape[0]))\n",
 "print('\\nThe training images dataset has shape: {}'.format(Xtrain.shape))\n",
 "print('The training ground-truth dataset has shape: {}'.format(Ytrain.shape))\n",
 "print('The validation images dataset has shape: {}'.format(Xvalid.shape))\n",
 "print('The validation ground-truth dataset has shape: {}'.format(Yvalid.shape))\n",
 "print('The testing images dataset has shape: {}'.format(Xtest.shape))\n",
 "print('The testing ground-truth dataset has shape: {}'.format(Ytest.shape))" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Pre-processing\n", "Perform pre-processing; here we simply subtract the per-channel mean computed on the training images.\n", "\n", "Then, plot a pre-processed image next to the original." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
 "def myPreProc(imgs, mean):\n",
 "    X = imgs - mean\n",
 "    return X\n",
 "\n",
 "# Pre-process the training, validation and testing sets separately,\n",
 "# always using the per-channel mean computed on the training set\n",
 "meanTrain = np.mean(Xtrain, axis=(0,1,2))\n",
 "Xtrain_preprocessed = myPreProc(Xtrain, meanTrain)\n",
 "Xvalid_preprocessed = myPreProc(Xvalid, meanTrain)\n",
 "Xtest_preprocessed = myPreProc(Xtest, meanTrain)\n",
 "\n",
 "plt.figure(figsize=(10,10))\n",
 "plt.subplot(121)\n",
 "plt.imshow(array_to_img(Xtrain[0,:,:,:]))\n",
 "plt.title('Original Image')\n",
 "plt.subplot(122)\n",
 "plt.imshow(array_to_img(Xtrain_preprocessed[0,:,:,:]))\n",
 "plt.title('Pre-processed image')" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Balance the classes\n", "\n", "Extract and print below the class weights of the training set, using the function described at this [link](https://scikit-learn.org/stable/modules/generated/sklearn.utils.class_weight.compute_class_weight.html)." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
 "from sklearn.utils.class_weight import compute_class_weight\n",
 "\n",
 "y = Ytrain[:,:,:,1].flatten()\n",
 "class_weights = compute_class_weight('balanced', np.arange(Nclasses), y)\n",
 "print('The class weights for the background and the foreground are, respectively:\\n{}'.format(class_weights))" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## U-Net\n", "Finish this code to create the U-Net architecture.\n", "\n", "Relevant [Keras](https://keras.io/layers/about-keras-layers/) layers are:\n", "\n", "`Conv2D` performs 2D convolutions with a given number of filters of a given size. Use `he_normal` weight initialization and `same` padding;\n", "\n", "`BatchNormalization` normalizes the activations of the previous layer over each batch, i.e. it applies a transformation that keeps the mean activation close to 0 and the activation standard deviation close to 1;\n", "\n", "`Activation`, e.g. the `relu` activation function, which sets all negative values to 0;\n", "\n", "`MaxPooling2D` keeps the maximum within each pooling window, which downsamples the feature maps;\n", "\n", "`Conv2DTranspose` increases the resolution by a given factor through a learned, transposed convolution;\n", "\n", "`concatenate` concatenates a list of inputs.\n", "\n", "\n", "![U-Net architecture](u-net-architecture.png)" ] },
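{ "cell_type": "markdown", "metadata": {}, "source": [ "Before filling in the class below, here is a minimal sketch (one possible pattern, not the required solution) of how the layers listed above chain together in the Keras functional API; the helper functions in the next cell can follow the same idea.\n", "\n", "```python\n", "# Minimal illustration of a Conv2D -> BatchNormalization -> Activation('relu') block.\n", "# The filter count and input size here are arbitrary choices for the example.\n", "from keras.layers import Input, Conv2D, BatchNormalization, Activation\n", "from keras.models import Model\n", "\n", "inp = Input((320, 320, 3))\n", "x = Conv2D(32, (3, 3), padding='same', kernel_initializer='he_normal')(inp)\n", "x = BatchNormalization()(x)\n", "x = Activation('relu')(x)\n", "\n", "demo = Model(inp, x)\n", "demo.summary()  # the spatial size stays 320 x 320 thanks to 'same' padding\n", "```" ] },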
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
 "from keras.models import Model\n",
 "from keras.layers import Input, Conv2D, Activation, BatchNormalization, MaxPooling2D, Conv2DTranspose, Dropout\n",
 "from keras.optimizers import Adam\n",
 "from keras.layers.merge import concatenate\n",
 "from keras.callbacks import ModelCheckpoint, EarlyStopping\n",
 "from keras import backend as K\n",
 "\n",
 "class unet(object):\n",
 "    \n",
 "    def __init__(self, img_size, Nclasses, class_weights, model_name='myWeights.h5', Nfilter_start=64, depth=3, batch_size=3):\n",
 "        self.img_size = img_size\n",
 "        self.Nclasses = Nclasses\n",
 "        self.class_weights = class_weights\n",
 "        self.model_name = model_name\n",
 "        self.Nfilter_start = Nfilter_start\n",
 "        self.depth = depth\n",
 "        self.batch_size = batch_size\n",
 "\n",
 "        inputs = Input(img_size)\n",
 "        \n",
 "        def dice(y_true, y_pred, eps=1e-5):\n",
 "            num = 2.*K.sum(self.class_weights*K.sum(y_true * y_pred, axis=[0,1,2]))\n",
 "            den = K.sum(self.class_weights*K.sum(y_true + y_pred, axis=[0,1,2]))+eps\n",
 "            return num/den\n",
 "\n",
 "        def diceLoss(y_true, y_pred):\n",
 "            return 1-dice(y_true, y_pred)\n",
 "        \n",
 "        def bceLoss(y_true, y_pred):\n",
 "            bce = K.sum(-self.class_weights*K.sum(y_true*K.log(y_pred), axis=[0,1,2]))\n",
 "            return bce\n",
 "        \n",
 "        # Helper function that performs 2 convolutions (filter size (3 x 3), he_normal initialization, same padding),\n",
 "        # each followed by batch normalization and a ReLU activation; Nf is the number of filters\n",
 "        def convs(layer, Nf):\n",
 "            \n",
 "            # Your code\n",
 "            \n",
 "            return x\n",
 "        \n",
 "        # Helper function that defines what happens in each layer of the encoder (downstream path),\n",
 "        # which calls \"convs\" and then max pooling (2 x 2). Save each layer (before max pooling)\n",
 "        # for later concatenation in the upstream path.\n",
 "        def encoder_step(layer, Nf):\n",
 "            \n",
 "            # Your code\n",
 "            \n",
 "            return y, x\n",
 "        \n",
 "        # Helper function that defines what happens in each layer of the decoder (upstream path),\n",
 "        # which contains a transpose convolution (filter size (3 x 3), stride (2,2), he_normal initialization, same padding),\n",
 "        # batch normalization, concatenation with the corresponding layer from the encoder, and lastly \"convs\"\n",
 "        def decoder_step(layer, layer_to_concatenate, Nf):\n",
 "            \n",
 "            # Your code\n",
 "            \n",
 "            return x\n",
 "        \n",
 "        layers_to_concatenate = []\n",
 "        x = inputs\n",
 "        \n",
 "        # Make the encoder with 'self.depth' layers;\n",
 "        # note that the number of filters in each layer is doubled compared to the previous \"step\" in the encoder\n",
 "        for d in range(self.depth-1):\n",
 "            y,x = encoder_step(x, self.Nfilter_start*np.power(2,d))\n",
 "            layers_to_concatenate.append(y)\n",
 "        \n",
 "        # Make the bridge that connects the encoder and decoder using \"convs\" between them.\n",
 "        # Use Dropout before and after the bridge, for regularization. 
Use dropout probability of 0.2.\n", " x = Dropout(0.2)(x)\n", " x = convs(x,self.Nfilter_start*np.power(2,self.depth-1))\n", " x = Dropout(0.2)(x) \n", " \n", " # Make decoder with 'self.depth' layers, \n", " # note that the number of filters in each layer will be halved compared to the previous \"step\" in the decoder\n", " for d in range(self.depth-2, -1, -1):\n", " y = layers_to_concatenate.pop()\n", " x = decoder_step(x, y, self.Nfilter_start*np.power(2,d)) \n", " \n", " # Make classification (segmentation) of each pixel, using convolution with 1 x 1 filter\n", " final = Conv2D(filters=self.Nclasses, kernel_size=(1,1), activation = 'softmax')(x)\n", " \n", " # Create model\n", " self.model = Model(inputs=inputs, outputs=final)\n", " self.model.compile(loss=diceLoss, optimizer=Adam(lr=1e-4), metrics=['accuracy',dice]) \n", " \n", " def train(self, X, Y, x, y, nEpochs):\n", " print('Training process:')\n", " callbacks = [ModelCheckpoint(self.model_name, verbose=1, save_best_only=True, save_weights_only=True),\n", " EarlyStopping(patience=10)]\n", " results = self.model.fit(X, Y, validation_data=(x,y), batch_size=self.batch_size, epochs=nEpochs, callbacks=callbacks)\n", " return results\n", " \n", " def train_with_aug(self, im_gen_train, gt_gen_train, im_gen_valid, gt_gen_valid, nEpochs): \n", " print('Training process:')\n", " # we save in a dictionary the metrics obtained after each epoch\n", " results_dict = {}\n", " results_dict['loss'] = []\n", " results_dict['acc'] = []\n", " results_dict['dice'] = []\n", " results_dict['val_loss'] = []\n", " results_dict['val_acc'] = []\n", " results_dict['val_dice'] = []\n", " \n", " val_loss0 = np.inf\n", " steps_val_not_improved = 0\n", " for e in range(nEpochs):\n", " print('\\nEpoch {}/{}'.format(e+1, nEpochs))\n", " Xb_train, Yb_train = im_gen_train.next(), gt_gen_train.next()\n", " Xb_valid, Yb_valid = im_gen_valid.next(), gt_gen_valid.next()\n", " # Transform ground truth images into categorical\n", " Yb_train = to_categorical(np.argmax(Yb_train, axis=-1), self.Nclasses)\n", " Yb_valid = to_categorical(np.argmax(Yb_valid, axis=-1), self.Nclasses) \n", "\n", " results = self.model.fit(Xb_train, Yb_train, validation_data=(Xb_valid,Yb_valid), batch_size=self.batch_size)\n", "\n", " if results.history['val_loss'][0] <= val_loss0:\n", " self.model.save_weights(self.model_name)\n", " print('val_loss decreased from {:.4f} to {:.4f}. 
Hence, new weights are now saved in {}.'.format(val_loss0, results.history['val_loss'][0], self.model_name))\n",
 "                val_loss0 = results.history['val_loss'][0]\n",
 "                steps_val_not_improved = 0\n",
 "            else:\n",
 "                print('val_loss did not improve.')\n",
 "                steps_val_not_improved += 1\n",
 "\n",
 "            # save the metrics of this epoch\n",
 "            results_dict['loss'].append(results.history['loss'][0])\n",
 "            results_dict['acc'].append(results.history['acc'][0])\n",
 "            results_dict['dice'].append(results.history['dice'][0])\n",
 "            results_dict['val_loss'].append(results.history['val_loss'][0])\n",
 "            results_dict['val_acc'].append(results.history['val_acc'][0])\n",
 "            results_dict['val_dice'].append(results.history['val_dice'][0])\n",
 "            \n",
 "            if steps_val_not_improved==10:\n",
 "                print('\\nThe training stopped because the validation loss did not decrease during the last 10 epochs.')\n",
 "                break\n",
 "\n",
 "        return results_dict\n",
 "    \n",
 "    def evaluate(self, X, Y):\n",
 "        print('Evaluation process:')\n",
 "        loss, acc, dice = self.model.evaluate(X, Y, self.batch_size)\n",
 "        print('Accuracy: {:.4f}'.format(acc*100))\n",
 "        print('Dice: {:.4f}'.format(dice*100))\n",
 "        return acc, dice\n",
 "    \n",
 "    def predict(self, X):\n",
 "        print('Segmenting unseen images')\n",
 "        segmentation = self.model.predict(X)\n",
 "        return segmentation" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Training and Evaluation process\n", "Start the training process and write down the evaluation results obtained on the test set." ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [
 "img_size = Xtrain_preprocessed[0].shape\n",
 "net = unet(img_size, Nclasses, class_weights, Nfilter_start=32, batch_size=3, depth=1)\n",
 "# net.model.summary()\n",
 "results = net.train(Xtrain_preprocessed, Ytrain, Xvalid_preprocessed, Yvalid, nEpochs=50)\n",
 "net.model.load_weights('myWeights.h5')\n",
 "print(' ')\n",
 "acc, dice = net.evaluate(Xtest_preprocessed,Ytest)\n",
 "\n",
 "# accuracy trend\n",
 "plt.plot(results.history['acc'])\n",
 "plt.plot(results.history['val_acc'])\n",
 "plt.ylabel('Accuracy')\n",
 "plt.xlabel('Epochs')\n",
 "plt.legend(['train', 'valid'], loc='lower right')\n",
 "plt.show()\n",
 "# dice trend\n",
 "plt.plot(results.history['dice'])\n",
 "plt.plot(results.history['val_dice'])\n",
 "plt.ylabel('Dice')\n",
 "plt.xlabel('Epochs')\n",
 "plt.legend(['train', 'valid'], loc='lower right')\n",
 "plt.show()\n",
 "# loss trend\n",
 "plt.plot(results.history['loss'])\n",
 "plt.plot(results.history['val_loss'])\n",
 "plt.ylabel('Loss')\n",
 "plt.xlabel('Epochs')\n",
 "plt.legend(['train', 'valid'], loc='upper right')\n",
 "plt.show()" ] },
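{ "cell_type": "markdown", "metadata": {}, "source": [ "To make the Dice score printed by `evaluate` easier to interpret, the sketch below computes a plain, unweighted Dice coefficient between two small binary masks with NumPy (illustrative only; the `dice` metric inside the `unet` class additionally applies the class weights).\n", "\n", "```python\n", "# Plain (unweighted) Dice coefficient between two binary masks.\n", "def dice_binary(gt, pred, eps=1e-5):\n", "    gt, pred = gt.astype(float), pred.astype(float)\n", "    return 2.0 * np.sum(gt * pred) / (np.sum(gt) + np.sum(pred) + eps)\n", "\n", "a = np.array([[0, 1], [1, 1]])\n", "b = np.array([[0, 1], [0, 1]])\n", "print(dice_binary(a, b))  # 2*2 / (3 + 2) = 0.8 (up to eps)\n", "```" ] },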
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Ypred = net.predict(Xtest_preprocessed)\n", "\n", "plt.figure(figsize=(12,12))\n", "plt.subplot(131)\n", "plt.imshow(Xtest[0])\n", "plt.title('Image')\n", "plt.subplot(132)\n", "# Segmentation in each pixel is the class of the strongest prediction / activation\n", "plt.imshow(np.argmax(Ypred[0], axis=-1))\n", "plt.title('Segmentation result')\n", "plt.subplot(133)\n", "plt.imshow(Ytest[0,:,:,1])\n", "plt.title('Ground truth segmentation')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Hyper parameters\n", "As you have noticed, U-Net has a number of hyper parameters; the depth of the network (the number of layers) and the number of filters per layer.\n", "\n", "Run the training with the following settings and write down the evaluation results for each setting.\n", "\n", "Depth 1, Nfilters 32\n", "\n", "Depth 1, Nfilters 64\n", "\n", "Depth 2, Nfilters 32\n", "\n", "Depth 2, Nfilters 64\n", "\n", "It is possible to have a deeper U-Net, but our graphics card only has 8 GB of memory...\n", "\n", "\n", "How does the performance change with the depth and number of filters?\n", "\n", "How does the training time change with the depth and number of filters?\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Augmentation functions\n", "\n", "45 images is a rather small training set, we can increase it through data augmentation.\n", "\n", "We use Keras builtin image augmentation provided as `ImageDataGenerator` [link](https://keras.io/preprocessing/image/)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from keras.preprocessing.image import ImageDataGenerator\n", "\n", "def apply_augmentation(X, Y, N_new_images):\n", " data_gen_args = dict(rotation_range=90,\n", " width_shift_range=0.1,\n", " height_shift_range=0.1,\n", " horizontal_flip = True,\n", " vertical_flip = True,\n", " zoom_range=0.2)\n", "\n", " image_datagen = ImageDataGenerator(**data_gen_args)\n", " mask_datagen = ImageDataGenerator(**data_gen_args)\n", " \n", " # Provide the same seed and keyword arguments to the fit and flow methods\n", " seed = np.random.randint(123456789)\n", " image_generator = image_datagen.flow(X, batch_size=N_new_images, seed=seed)\n", " mask_generator = mask_datagen.flow(Y, batch_size=N_new_images, seed=seed)\n", " \n", " return image_generator, mask_generator\n", "\n", "image_generator_train, mask_generator_train = apply_augmentation(Xtrain_preprocessed, Ytrain, len(Xtrain))\n", "image_generator_valid, mask_generator_valid = apply_augmentation(Xvalid_preprocessed, Yvalid, len(Xvalid))\n", "\n", "# let's generate one batch of augmented images for the training set and plot few of them\n", "X_batch, Y_batch = image_generator_train.next(), mask_generator_train.next()\n", "\n", "plt.figure(figsize=(15,15))\n", "for i in range(5):\n", " plt.subplot(2,5,i+1)\n", " plt.imshow(array_to_img(X_batch[i]))\n", " plt.subplot(2,5,i+6)\n", " plt.imshow(np.argmax(Y_batch[i], axis=-1))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Training with augmented data\n", "\n", "Now run the training with the augmented dataset, for the following settings\n", "\n", "Depth 1, Nfilters 32\n", "\n", "Depth 1, Nfilters 64\n", "\n", "Depth 2, Nfilters 32\n", "\n", "Depth 2, Nfilters 64\n", "\n", "It is possible to have a deeper U-Net, but our graphics card only has 8 GB of memory...\n", "\n", "How does the performance change compared to the 
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Training with augmented data\n", "\n", "Now run the training with the augmented dataset, for the following settings:\n", "\n", "- Depth 1, Nfilters 32\n", "- Depth 1, Nfilters 64\n", "- Depth 2, Nfilters 32\n", "- Depth 2, Nfilters 64\n", "\n", "It is possible to have a deeper U-Net, but our graphics card only has 8 GB of memory...\n", "\n", "How does the performance change compared to the un-augmented training dataset?\n", "\n", "How does the training time change compared to the un-augmented training dataset?" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [
 "img_size = Xtrain_preprocessed[0].shape\n",
 "modelName = 'myWeightsAug.h5'\n",
 "\n",
 "net = unet(img_size, Nclasses, class_weights, modelName, Nfilter_start=32, batch_size=3, depth=1)\n",
 "# net.model.summary()\n",
 "results = net.train_with_aug(image_generator_train, mask_generator_train, image_generator_valid, mask_generator_valid, nEpochs=50)\n",
 "net.model.load_weights(modelName)\n",
 "print(' ')\n",
 "acc, dice = net.evaluate(Xtest_preprocessed,Ytest)\n",
 "\n",
 "# accuracy trend\n",
 "plt.plot(results['acc'])\n",
 "plt.plot(results['val_acc'])\n",
 "plt.ylabel('Accuracy')\n",
 "plt.xlabel('Epochs')\n",
 "plt.legend(['train', 'valid'], loc='lower right')\n",
 "plt.show()\n",
 "# dice trend\n",
 "plt.plot(results['dice'])\n",
 "plt.plot(results['val_dice'])\n",
 "plt.ylabel('Dice')\n",
 "plt.xlabel('Epochs')\n",
 "plt.legend(['train', 'valid'], loc='lower right')\n",
 "plt.show()\n",
 "# loss trend\n",
 "plt.plot(results['loss'])\n",
 "plt.plot(results['val_loss'])\n",
 "plt.ylabel('Loss')\n",
 "plt.xlabel('Epochs')\n",
 "plt.legend(['train', 'valid'], loc='upper right')\n",
 "plt.show()" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Results" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [
 "Ypred = net.predict(Xtest_preprocessed)\n",
 "\n",
 "plt.figure(figsize=(12,12))\n",
 "plt.subplot(131)\n",
 "plt.imshow(Xtest[0])\n",
 "plt.title('Image')\n",
 "plt.subplot(132)\n",
 "plt.imshow(np.argmax(Ypred[0], axis=-1))\n",
 "plt.title('Segmentation result')\n",
 "plt.subplot(133)\n",
 "plt.imshow(Ytest[0,:,:,1])\n",
 "plt.title('Ground truth segmentation')" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Report\n", "\n", "Write a short report where you discuss the influence of the number of layers and the number of filters per layer." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" } }, "nbformat": 4, "nbformat_minor": 2 }