{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Credits: Forked from [deep-learning-keras-tensorflow](https://github.com/leriomaggio/deep-learning-keras-tensorflow) by Valerio Maggio" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Practical Deep Learning" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Constructing and training your own ConvNet from scratch can be Hard and a long task.\n", "\n", "A common trick used in Deep Learning is to use a **pre-trained** model and finetune it to the specific data it will be used for. " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Famous Models with Keras\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "This notebook contains code and reference for the following Keras models (gathered from [https://github.com/fchollet/deep-learning-models]())\n", "\n", "- VGG16\n", "- VGG19\n", "- ResNet50\n", "- Inception v3\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "## References\n", "\n", "- [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556) - please cite this paper if you use the VGG models in your work.\n", "- [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385) - please cite this paper if you use the ResNet model in your work.\n", "- [Rethinking the Inception Architecture for Computer Vision](http://arxiv.org/abs/1512.00567) - please cite this paper if you use the Inception v3 model in your work.\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "All architectures are compatible with both TensorFlow and Theano, and upon instantiation the models will be built according to the image dimension ordering 
set in your Keras configuration file at `~/.keras/keras.json`.\n", "\n", "For instance, if you have set `image_dim_ordering=tf`, then any model loaded from this repository will get built according to the TensorFlow dimension ordering convention, \"Height-Width-Depth\"." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Keras Configuration File" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{\r\n", "    \"image_dim_ordering\": \"th\",\r\n", "    \"floatx\": \"float32\",\r\n", "    \"epsilon\": 1e-07,\r\n", "    \"backend\": \"theano\"\r\n", "}" ] } ], "source": [ "!cat ~/.keras/keras.json" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{\r\n", "    \"image_dim_ordering\": \"th\",\r\n", "    \"floatx\": \"float32\",\r\n", "    \"epsilon\": 1e-07,\r\n", "    \"backend\": \"tensorflow\"\r\n", "}" ] } ], "source": [ "!sed -i 's/theano/tensorflow/g' ~/.keras/keras.json\n", "!cat ~/.keras/keras.json" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Using TensorFlow backend.\n" ] } ], "source": [ "import keras" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Using gpu device 0: GeForce GTX 760 (CNMeM is enabled with initial size: 90.0% of memory, cuDNN 4007)\n" ] } ], "source": [ "import theano" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", 
"output_type": "stream", "text": [ "{\r\n", " \"image_dim_ordering\": \"th\",\r\n", " \"backend\": \"theano\",\r\n", " \"floatx\": \"float32\",\r\n", " \"epsilon\": 1e-07\r\n", "}" ] } ], "source": [ "!sed -i 's/tensorflow/theano/g' ~/.keras/keras.json\n", "!cat ~/.keras/keras.json" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Using Theano backend.\n", "Using gpu device 0: GeForce GTX 760 (CNMeM is enabled with initial size: 90.0% of memory, cuDNN 4007)\n" ] } ], "source": [ "import keras" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false, "slideshow": { "slide_type": "skip" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Using Theano backend.\n", "Using gpu device 0: GeForce GTX 760 (CNMeM is enabled with initial size: 90.0% of memory, cuDNN 4007)\n" ] } ], "source": [ "# %load deep_learning_models/imagenet_utils.py\n", "import numpy as np\n", "import json\n", "\n", "from keras.utils.data_utils import get_file\n", "from keras import backend as K\n", "\n", "CLASS_INDEX = None\n", "CLASS_INDEX_PATH = 'https://s3.amazonaws.com/deep-learning-models/image-models/imagenet_class_index.json'\n", "\n", "\n", "def preprocess_input(x, dim_ordering='default'):\n", " if dim_ordering == 'default':\n", " dim_ordering = K.image_dim_ordering()\n", " assert dim_ordering in {'tf', 'th'}\n", "\n", " if dim_ordering == 'th':\n", " x[:, 0, :, :] -= 103.939\n", " x[:, 1, :, :] -= 116.779\n", " x[:, 2, :, :] -= 123.68\n", " # 'RGB'->'BGR'\n", " x = x[:, ::-1, :, :]\n", " else:\n", " x[:, :, :, 0] -= 103.939\n", " x[:, :, :, 1] -= 116.779\n", " x[:, :, :, 2] -= 123.68\n", " # 'RGB'->'BGR'\n", " x = x[:, :, :, ::-1]\n", " return x\n", "\n", "\n", "def decode_predictions(preds):\n", " global CLASS_INDEX\n", " assert len(preds.shape) == 2 and preds.shape[1] == 1000\n", " if 
CLASS_INDEX is None:\n", "        fpath = get_file('imagenet_class_index.json',\n", "                         CLASS_INDEX_PATH,\n", "                         cache_subdir='models')\n", "        # Close the file handle once the class index is loaded\n", "        with open(fpath) as f:\n", "            CLASS_INDEX = json.load(f)\n", "    indices = np.argmax(preds, axis=-1)\n", "    results = []\n", "    for i in indices:\n", "        results.append(CLASS_INDEX[str(i)])\n", "    return results\n" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": true, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "IMAGENET_FOLDER = 'imgs/imagenet'  # in the repo" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# VGG16" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "# %load deep_learning_models/vgg16.py\n", "'''VGG16 model for Keras.\n", "\n", "# Reference:\n", "\n", "- [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556)\n", "\n", "'''\n", "from __future__ import print_function\n", "\n", "import numpy as np\n", "import warnings\n", "\n", "from keras.models import Model\n", "from keras.layers import Flatten, Dense, Input\n", "from keras.layers import Convolution2D, MaxPooling2D\n", "from keras.preprocessing import image\n", "from keras.utils.layer_utils import convert_all_kernels_in_model\n", "from keras.utils.data_utils import get_file\n", "from keras import backend as K\n", "\n", "TH_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_th_dim_ordering_th_kernels.h5'\n", "TF_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels.h5'\n", "TH_WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_th_dim_ordering_th_kernels_notop.h5'\n", "TF_WEIGHTS_PATH_NO_TOP = 
'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5'\n", "\n", "\n", "def VGG16(include_top=True, weights='imagenet',\n", "          input_tensor=None):\n", "    '''Instantiate the VGG16 architecture,\n", "    optionally loading weights pre-trained\n", "    on ImageNet. Note that when using TensorFlow,\n", "    for best performance you should set\n", "    `image_dim_ordering=\"tf\"` in your Keras config\n", "    at ~/.keras/keras.json.\n", "\n", "    The model and the weights are compatible with both\n", "    TensorFlow and Theano. The dimension ordering\n", "    convention used by the model is the one\n", "    specified in your Keras config file.\n", "\n", "    # Arguments\n", "        include_top: whether to include the 3 fully-connected\n", "            layers at the top of the network.\n", "        weights: one of `None` (random initialization)\n", "            or \"imagenet\" (pre-training on ImageNet).\n", "        input_tensor: optional Keras tensor (i.e. output of `layers.Input()`)\n", "            to use as image input for the model.\n", "\n", "    # Returns\n", "        A Keras model instance.\n", "    '''\n", "    if weights not in {'imagenet', None}:\n", "        raise ValueError('The `weights` argument should be either '\n", "                         '`None` (random initialization) or `imagenet` '\n", "                         '(pre-training on ImageNet).')\n", "    # Determine proper input shape\n", "    if K.image_dim_ordering() == 'th':\n", "        if include_top:\n", "            input_shape = (3, 224, 224)\n", "        else:\n", "            input_shape = (3, None, None)\n", "    else:\n", "        if include_top:\n", "            input_shape = (224, 224, 3)\n", "        else:\n", "            input_shape = (None, None, 3)\n", "\n", "    if input_tensor is None:\n", "        img_input = Input(shape=input_shape)\n", "    else:\n", "        if not K.is_keras_tensor(input_tensor):\n", "            img_input = Input(tensor=input_tensor)\n", "        else:\n", "            img_input = input_tensor\n", "    # Block 1\n", "    x = Convolution2D(64, 3, 3, activation='relu', border_mode='same', name='block1_conv1')(img_input)\n", "    x = Convolution2D(64, 3, 3, activation='relu', 
border_mode='same', name='block1_conv2')(x)\n", "    x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)\n", "\n", "    # Block 2\n", "    x = Convolution2D(128, 3, 3, activation='relu', border_mode='same', name='block2_conv1')(x)\n", "    x = Convolution2D(128, 3, 3, activation='relu', border_mode='same', name='block2_conv2')(x)\n", "    x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)\n", "\n", "    # Block 3\n", "    x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv1')(x)\n", "    x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv2')(x)\n", "    x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv3')(x)\n", "    x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)\n", "\n", "    # Block 4\n", "    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv1')(x)\n", "    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv2')(x)\n", "    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv3')(x)\n", "    x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)\n", "\n", "    # Block 5\n", "    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv1')(x)\n", "    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv2')(x)\n", "    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv3')(x)\n", "    x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)\n", "\n", "    if include_top:\n", "        # Classification block\n", "        x = Flatten(name='flatten')(x)\n", "        x = Dense(4096, activation='relu', name='fc1')(x)\n", "        x = Dense(4096, activation='relu', name='fc2')(x)\n", "        x = Dense(1000, activation='softmax', name='predictions')(x)\n", "\n", "    # Create model\n", "    model = Model(img_input, x)\n", "\n", "    # load weights\n", "    if weights == 'imagenet':\n", "        print('K.image_dim_ordering:', 
K.image_dim_ordering())\n", "        if K.image_dim_ordering() == 'th':\n", "            if include_top:\n", "                weights_path = get_file('vgg16_weights_th_dim_ordering_th_kernels.h5',\n", "                                        TH_WEIGHTS_PATH,\n", "                                        cache_subdir='models')\n", "            else:\n", "                weights_path = get_file('vgg16_weights_th_dim_ordering_th_kernels_notop.h5',\n", "                                        TH_WEIGHTS_PATH_NO_TOP,\n", "                                        cache_subdir='models')\n", "            model.load_weights(weights_path)\n", "            if K.backend() == 'tensorflow':\n", "                warnings.warn('You are using the TensorFlow backend, yet you '\n", "                              'are using the Theano '\n", "                              'image dimension ordering convention '\n", "                              '(`image_dim_ordering=\"th\"`). '\n", "                              'For best performance, set '\n", "                              '`image_dim_ordering=\"tf\"` in '\n", "                              'your Keras config '\n", "                              'at ~/.keras/keras.json.')\n", "                convert_all_kernels_in_model(model)\n", "        else:\n", "            if include_top:\n", "                weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels.h5',\n", "                                        TF_WEIGHTS_PATH,\n", "                                        cache_subdir='models')\n", "            else:\n", "                weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5',\n", "                                        TF_WEIGHTS_PATH_NO_TOP,\n", "                                        cache_subdir='models')\n", "            model.load_weights(weights_path)\n", "            if K.backend() == 'theano':\n", "                convert_all_kernels_in_model(model)\n", "    return model" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "K.image_dim_ordering: th\n", "Input image shape: (1, 3, 224, 224)\n", "Predicted: [['n07745940', 'strawberry']]\n" ] } ], "source": [ "import os\n", "\n", "model = VGG16(include_top=True, weights='imagenet')\n", "\n", "img_path = os.path.join(IMAGENET_FOLDER, 'strawberry_1157.jpeg')\n", "img = image.load_img(img_path, target_size=(224, 224))\n", "x = image.img_to_array(img)\n", "x = np.expand_dims(x, axis=0)\n", "x = preprocess_input(x)\n", "print('Input image shape:', x.shape)\n", "\n", "preds = model.predict(x)\n", 
"print('Predicted:', decode_predictions(preds))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Fine Tuning of a Pre-Trained Model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```python\n", "def VGG16_FT(weights_path = None, \n", " img_width = 224, img_height = 224, \n", " f_type = None, n_labels = None ):\n", " \n", " \"\"\"Fine Tuning of a VGG16 based Net\"\"\"\n", "\n", " # VGG16 Up to the layer before the last!\n", " model = Sequential()\n", " model.add(ZeroPadding2D((1, 1), \n", " input_shape=(3, \n", " img_width, img_height)))\n", "\n", " model.add(Convolution2D(64, 3, 3, activation='relu', \n", " name='conv1_1'))\n", " model.add(ZeroPadding2D((1, 1)))\n", " model.add(Convolution2D(64, 3, 3, activation='relu', \n", " name='conv1_2'))\n", " model.add(MaxPooling2D((2, 2), strides=(2, 2)))\n", "\n", " model.add(ZeroPadding2D((1, 1)))\n", " model.add(Convolution2D(128, 3, 3, activation='relu', \n", " name='conv2_1'))\n", " model.add(ZeroPadding2D((1, 1)))\n", " model.add(Convolution2D(128, 3, 3, activation='relu', \n", " name='conv2_2'))\n", " model.add(MaxPooling2D((2, 2), strides=(2, 2)))\n", "\n", " model.add(ZeroPadding2D((1, 1)))\n", " model.add(Convolution2D(256, 3, 3, activation='relu', \n", " name='conv3_1'))\n", " model.add(ZeroPadding2D((1, 1)))\n", " model.add(Convolution2D(256, 3, 3, activation='relu', \n", " name='conv3_2'))\n", " model.add(ZeroPadding2D((1, 1)))\n", " model.add(Convolution2D(256, 3, 3, activation='relu', \n", " name='conv3_3'))\n", " model.add(MaxPooling2D((2, 2), strides=(2, 2)))\n", "\n", " model.add(ZeroPadding2D((1, 1)))\n", " model.add(Convolution2D(512, 3, 3, activation='relu', \n", " name='conv4_1'))\n", " model.add(ZeroPadding2D((1, 1)))\n", " model.add(Convolution2D(512, 3, 3, activation='relu', \n", " name='conv4_2'))\n", " model.add(ZeroPadding2D((1, 1)))\n", " model.add(Convolution2D(512, 3, 3, activation='relu', \n", " name='conv4_3'))\n", " 
model.add(MaxPooling2D((2, 2), strides=(2, 2)))\n", "\n", "    model.add(ZeroPadding2D((1, 1)))\n", "    model.add(Convolution2D(512, 3, 3, activation='relu',\n", "                            name='conv5_1'))\n", "    model.add(ZeroPadding2D((1, 1)))\n", "    model.add(Convolution2D(512, 3, 3, activation='relu',\n", "                            name='conv5_2'))\n", "    model.add(ZeroPadding2D((1, 1)))\n", "    model.add(Convolution2D(512, 3, 3, activation='relu',\n", "                            name='conv5_3'))\n", "    model.add(MaxPooling2D((2, 2), strides=(2, 2)))\n", "    model.add(Flatten())\n", "\n", "    # Plugging new Layers\n", "    model.add(Dense(768, activation='sigmoid'))\n", "    model.add(Dropout(0.0))\n", "    model.add(Dense(768, activation='sigmoid'))\n", "    model.add(Dropout(0.0))\n", "\n", "    last_layer = Dense(n_labels, activation='sigmoid')\n", "    loss = 'categorical_crossentropy'\n", "    optimizer = optimizers.Adam(lr=1e-4, epsilon=1e-08)\n", "    batch_size = 128\n", "\n", "    assert os.path.exists(weights_path), 'Model weights not found (see \"weights_path\" variable in script).'\n", "    #model.load_weights(weights_path)\n", "    # Open the weights file read-only and copy the weights layer by layer\n", "    f = h5py.File(weights_path, 'r')\n", "    for k in range(len(f.attrs['layer_names'])):\n", "        g = f[f.attrs['layer_names'][k]]\n", "        weights = [g[g.attrs['weight_names'][p]]\n", "                   for p in range(len(g.attrs['weight_names']))]\n", "        if k >= len(model.layers):\n", "            break\n", "        else:\n", "            model.layers[k].set_weights(weights)\n", "    f.close()\n", "    print('Model loaded.')\n", "\n", "    model.add(last_layer)\n", "\n", "    # set the first 25 layers (up to the last conv block)\n", "    # to non-trainable (weights will not be updated)\n", "    for layer in model.layers[:25]:\n", "        layer.trainable = False\n", "\n", "    # compile the model with the Adam optimizer defined above\n", "    # and a low learning rate (1e-4).\n", "    model.compile(loss=loss,\n", "                  optimizer=optimizer,\n", "                  metrics=['accuracy'])\n", "    return model, batch_size\n", "\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Hands On:\n", "\n", "### Try 
to do the same with other models " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "%load deep_learning_models/vgg19.py" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "%load deep_learning_models/resnet50.py" ] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.4.3" } }, "nbformat": 4, "nbformat_minor": 0 }