{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Neural Networks for Data Science Applications\n", "\n", "## Lab session 4: Model building with an application to OCR\n", "\n", "**Contents of the lab session**:\n", "* Building CNNs with keras.Model and tf.layers.\n", "* Effectively using the TensorBoard for debugging.\n", "* Using Keras' ImageDataGenerator for handling datasets of images.\n", "* Keras callbacks for checkpointing and TensorBoard." ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "5ZUWq7n3815E" }, "source": [ "## Virtual machine setup" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "LuINk8ttOZxr" }, "outputs": [], "source": [ "# Enable the GPU runtime and restart the kernel\n", "# Do it at the very beginning " ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "xLi3DGBDNtNS" }, "outputs": [], "source": [ "# The TensorFlow version currently available on Colab is the 1.15.0\n", "\n", "# Install the new 2.0 version with GPU support\n", "!pip install --quiet tensorflow-gpu==2.0.0\n", "\n", "# < 400MB\n", "# ignore the errors during the installation procedure" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "hQ00m5_SOyzx" }, "outputs": [], "source": [ "# Import TensorFlow and check the version\n", "import tensorflow as tf\n", "print(tf.__version__)" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "tadQrO-CSR5h" }, "source": [ "## Download the dataset" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "zPS8lS2t3wos" }, "outputs": [], "source": [ "# Download the dataset from the \"In Codice Ratio\" website\n", "!wget http://www.inf.uniroma3.it/db/icr/dataset_icr.zip\n", "# < 5MB" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "saSoWzUt8SEk" }, "outputs": [], "source": [ "# Unzip the compressed file\n", "!unzip -q dataset_icr.zip" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "Eixjnvq38Xt0" }, "outputs": [], "source": [ "# Have a look to the dataset main folder\n", "\n", "!ls -1 dataset\n", "\n", "# + We have a separate folder for each character\n", "# + We have 23 classes\n", "# + There are some duplicated letters (same letter, different shape) and a special \"no_char\" character" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# If you want to read more about the project, check out the publications here:\n", "# http://www.inf.uniroma3.it/db/icr/publications.html" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "TImhNnuA85hU" }, "source": [ "## Load the dataset" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "4yhavMZh8z3t" }, "outputs": [], "source": [ "import tensorflow as tf" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "sDO0bCDg8awk" }, "outputs": [], "source": [ "# Generate batches of images with real-time data augmentation\n", "# Define the abstract image generator\n", "image_generator = tf.keras.preprocessing.image.ImageDataGenerator(rotation_range=5, # up to ±5 degrees\n", " width_shift_range=0.05, # up to ±5% of the image\n", " height_shift_range=0.05, # up to ±5% of the image\n", " rescale=1.0/255, # convert from uint [0, 255] to float [0.0, 1.0]\n", " validation_split=0.2) # 20% of the samples for the validation set\n", "\n", "# You can also make a more flexible (and more complex) solution using tf.data" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "mOuY_Xmx8e4r" }, "outputs": [], "source": [ "# Define the specific data generator for our images\n", "train_data_gen = image_generator.flow_from_directory(directory='dataset',\n", " subset='training', \n", " batch_size=32,\n", " shuffle=True,\n", " target_size=(56, 56))" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "Pv3_l_Tj9PRB" }, "outputs": [], "source": [ "test_data_gen = image_generator.flow_from_directory(directory='dataset',\n", " subset='validation', \n", " batch_size=32,\n", " shuffle=False,\n", " target_size=(56, 56))" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "8pm2BEbKQdGx" }, "source": [ "## Look at the dataset" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "0eVzVGzEPkkN" }, "source": [ "> \"Become one with the data\" \n", "> Andrej Karpathy\n", "\n", "(rule n. 1 of [A Recipe for Training Neural Networks](https://karpathy.github.io/2019/04/25/recipe/))" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "D3Yno_crrXQi" }, "outputs": [], "source": [ "# Plot one (original) image\n", "import matplotlib.pyplot as plt\n", "image = plt.imread('./dataset/a/1.png')\n", "print('image shape:', image.shape)\n", "plt.imshow(image, cmap='gray')" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "0H-lSzOr9Sss" }, "outputs": [], "source": [ "# Plot one (augmented) image\n", "import matplotlib.pyplot as plt\n", "x_batch, y_batch = next(train_data_gen)\n", "image = x_batch[0]\n", "label_one_hot = y_batch[0] # one-hot encoded\n", "plt.imshow(image)\n", "print('label (one-hot encoded):', label_one_hot)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "class_dictionary = train_data_gen.class_indices\n", "label = list(class_dictionary.keys())[np.argmax(label_one_hot)]\n", "print('label:', label)" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "oTxyoI289_R9" }, "outputs": [], "source": [ "print(x_batch.shape)\n", "# Note: Channel last" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "J0xCZKtb9cO9" }, "source": [ "## Download a CNN from keras.applications" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "Dm-d0QxAp32m" }, "outputs": [], "source": [ "# Download a pretrained neural network\n", "from tensorflow.keras import applications\n", "vgg = applications.vgg16.VGG16(weights='imagenet')\n", "\n", "vgg.summary()\n", "\n", "# ImageNet images are 224x224x3 images belonging to 1000 classes\n", "# VGG: Oxford Visual Geometry Group" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "LQZ9LGQF9WH8" }, "outputs": [], "source": [ "vgg = applications.vgg16.VGG16(weights='imagenet', include_top=False, input_shape=(56, 56, 3))\n", "vgg.summary()" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "qi0ilDt8-DqM" }, "outputs": [], "source": [ "# Make all the layers not trainable\n", "for layer in vgg.layers:\n", " layer.trainable = False" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "77xjt4l7ryL1" }, "outputs": [], "source": [ "vgg.summary()\n", "# Non-trainable parameters: 14714688" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "FNBSR4VQ-Hc8" }, "outputs": [], "source": [ "from tensorflow.keras import models, layers\n", "\n", "def build_model():\n", " model = models.Sequential()\n", " # use the pretrained network as a feature extractor (not trainable)\n", " model.add(vgg)\n", " # add a linear classifier (single layer, trainable) on top of it\n", " model.add(layers.Flatten())\n", " model.add(layers.Dense(23, activation='softmax'))\n", "\n", " model.compile(loss='categorical_crossentropy',\n", " optimizer='adam',\n", " metrics=['accuracy'])\n", "\n", " return model" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "2k9jTgKS-It8" }, "outputs": [], "source": [ "model = build_model()\n", "model.summary()" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "ZuITDH2vnw_J" }, "outputs": [], "source": [ "tf.keras.utils.plot_model(model)" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "xAhAR8wP_gR9" }, "source": [ "## Basic fine-tuning (linear classifier only)" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "ZaIq9Hjn-Qss" }, "outputs": [], "source": [ "history = model.fit_generator(train_data_gen, epochs=10, shuffle=True)\n", "\n", "# ~1.5 minutes for each epoch\n", "# Very inefficient computing strategy. Why?\n", "# Loss is decreasing, accuracy is increasing. What about overfitting?\n", "# Train loss curve only at the end VS while the training is running" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "U-0q0FjlwG6z" }, "outputs": [], "source": [ "plt.plot(history.history['accuracy'])\n", "plt.xlabel('epoch')\n", "plt.ylabel('accuracy')" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "5Z0V49tT-bJE" }, "outputs": [], "source": [ "model.evaluate(test_data_gen)" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "I6aeViPq_kH-" }, "source": [ "## Checkpointing" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "PeMyo7JT_lpm" }, "outputs": [], "source": [ "# Let's start again from the beginning\n", "model = build_model()" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "L13CQ6g6_VNP" }, "outputs": [], "source": [ "from tensorflow.keras import callbacks\n", "checkpoint_callback = callbacks.ModelCheckpoint(filepath='./checkpoints/model-{epoch:02d}.ckpt', \n", " save_weights_only=True,\n", " save_freq='epoch',\n", " verbose=1)" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "DziS53Um_XaM" }, "outputs": [], "source": [ "model.fit_generator(train_data_gen,\n", " epochs=10,\n", " shuffle=True,\n", " callbacks=[checkpoint_callback])" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "ESCa_Kni_Z-k" }, "outputs": [], "source": [ "!ls ./checkpoints" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "55gutzdg_nlM" }, "outputs": [], "source": [ "latest_checkpoint = tf.train.latest_checkpoint('./checkpoints')\n", "print(latest_checkpoint)" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "l9SuGGY-y3GR" }, "source": [ "## Load previous checkpoints" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "Pbg3mcO0_psz" }, "outputs": [], "source": [ "model = build_model() # start again from skratch\n", "model.evaluate(test_data_gen)\n", "\n", "# - Trained feature extractor (on ImageNet)\n", "# - Untrained linear classifier" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "dInFdyDIdovM" }, "outputs": [], "source": [ "model = build_model()\n", "model.load_weights(latest_checkpoint)\n", "model.evaluate(test_data_gen)\n", "\n", "# - Trained feature extractor (on ImageNet)\n", "# - Trained linear classifier (on In Codice Ratio)" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "5bCuca8G_sae" }, "source": [ "## TensorBoard" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "7rTfA_6bADP3" }, "outputs": [], "source": [ "model = build_model()" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "cgolUXPF1srT" }, "outputs": [], "source": [ "!pip install tensorboard==2.0.0" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "MHfB-TvX14kV" }, "outputs": [], "source": [ "import tensorboard\n", "print(tensorboard.__version__)" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "BInLKRNV_qsL" }, "outputs": [], "source": [ "%load_ext tensorboard\n", "#%reload_ext tensorboard" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "euTXGvhNAqGg" }, "outputs": [], "source": [ "logdir = './logs'" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "31EAjvbL_wT0" }, "outputs": [], "source": [ "%tensorboard --logdir=$logdir\n", "#!kill 3510" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "F-zwIBb0_1Vk" }, "outputs": [], "source": [ "tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=logdir, update_freq='batch')#10" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "icmyy1aEAAON" }, "outputs": [], "source": [ "history = model.fit(train_data_gen, epochs=10, shuffle=True, validation_data=test_data_gen, callbacks=[checkpoint_callback, tensorboard_callback])" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "GbvhS5UPfX9O" }, "outputs": [], "source": [ "#!rm -r ./logs/*" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "v7jfe-Q2paZ6" }, "outputs": [], "source": [ "#vgg = applications.vgg16.VGG16(include_top=False, input_shape=(56, 56, 3), weights=None)" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "lBxEL86DqYNZ" }, "outputs": [], "source": [ "shrinked_vgg = models.Sequential([layers.InputLayer(input_shape=(56, 56, 3)), \n", " *vgg.layers[0:6], \n", " layers.MaxPool2D(pool_size=(5, 5)), \n", " layers.Flatten(), \n", " layers.Dense(23, activation='softmax')])" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "ovhxiRaUqYke" }, "outputs": [], "source": [ "shrinked_vgg.summary()" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "4_0L1x0R-D0C" }, "outputs": [], "source": [ "# Make all the layers trainable\n", "for layer in shrinked_vgg.layers:\n", " layer.trainable = True\n", "\n", "shrinked_vgg.summary()" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "OXCTYIW-_xUQ" }, "outputs": [], "source": [ "shrinked_vgg.compile(loss='categorical_crossentropy',\n", " optimizer='adam',\n", " metrics=['accuracy'])" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "Az93txAh_azi" }, "outputs": [], "source": [ "history = shrinked_vgg.fit(train_data_gen, \n", " epochs=10, \n", " shuffle=True, \n", " validation_data=test_data_gen, \n", " callbacks=[checkpoint_callback, tensorboard_callback])\n" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "colab": {}, "colab_type": "code", "id": "WbGL4t4Q_pvR" }, "outputs": [], "source": [ "# Homework: define a proper architecture and try to reach the highest possible validation accuracy" ] } ], "metadata": { "accelerator": "GPU", "colab": { "collapsed_sections": [], "name": "2019.10.25_esercitazione_Simone.ipynb", "provenance": [] }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.8" } }, "nbformat": 4, "nbformat_minor": 1 }