{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "**Chapter 11 – Training Deep Neural Networks**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "_This notebook contains all the sample code and solutions to the exercises in chapter 11._" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Setup" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, let's import a few common modules, ensure Matplotlib plots figures inline, and prepare a function to save the figures. We also check that Python 3.5 or later is installed (although Python 2.x may work, it is deprecated, so we strongly recommend you use Python 3 instead), as well as Scikit-Learn ≥0.20 and TensorFlow ≥2.0." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# Python ≥3.5 is required\n", "import sys\n", "assert sys.version_info >= (3, 5)\n", "\n", "# Scikit-Learn ≥0.20 is required\n", "import sklearn\n", "assert sklearn.__version__ >= \"0.20\"\n", "\n", "try:\n", "    # %tensorflow_version only exists in Colab.\n", "    %tensorflow_version 2.x\n", "except Exception:\n", "    pass\n", "\n", "# TensorFlow ≥2.0 is required\n", "import tensorflow as tf\n", "from tensorflow import keras\n", "assert tf.__version__ >= \"2.0\"\n", "\n", "%load_ext tensorboard\n", "\n", "# Common imports\n", "import numpy as np\n", "import os\n", "\n", "# To make this notebook's output stable across runs\n", "np.random.seed(42)\n", "\n", "# To plot pretty figures\n", "%matplotlib inline\n", "import matplotlib as mpl\n", "import matplotlib.pyplot as plt\n", "mpl.rc('axes', labelsize=14)\n", "mpl.rc('xtick', labelsize=12)\n", "mpl.rc('ytick', labelsize=12)\n", "\n", "# Where to save the figures\n", "PROJECT_ROOT_DIR = \".\"\n", "CHAPTER_ID = \"deep\"\n", "IMAGES_PATH = os.path.join(PROJECT_ROOT_DIR, \"images\", CHAPTER_ID)\n", "os.makedirs(IMAGES_PATH, exist_ok=True)\n", "\n", "def save_fig(fig_id, tight_layout=True, fig_extension=\"png\", resolution=300):\n", "    path = os.path.join(IMAGES_PATH, fig_id + \".\" + fig_extension)\n", "    print(\"Saving figure\", fig_id)\n", "    if tight_layout:\n", "        plt.tight_layout()\n", "    plt.savefig(path, format=fig_extension, dpi=resolution)" ] }, { "cell_type": "markdown", 
"metadata": {}, "source": [ "# Vanishing/Exploding Gradients Problem" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# Note: despite its name, this computes the logistic (sigmoid) function,\n", "# not the logit (which is the inverse of the sigmoid).\n", "def logit(z):\n", "    return 1 / (1 + np.exp(-z))" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Saving figure sigmoid_saturation_plot\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "z = np.linspace(-5, 5, 200)\n", "\n", "plt.plot([-5, 5], [0, 0], 'k-')\n", "plt.plot([-5, 5], [1, 1], 'k--')\n", "plt.plot([0, 0], [-0.2, 1.2], 'k-')\n", "plt.plot([-5, 5], [-3/4, 7/4], 'g--')\n", "plt.plot(z, logit(z), \"b-\", linewidth=2)\n", "props = dict(facecolor='black', shrink=0.1)\n", "plt.annotate('Saturating', xytext=(3.5, 0.7), xy=(5, 1), arrowprops=props, fontsize=14, ha=\"center\")\n", "plt.annotate('Saturating', xytext=(-3.5, 0.3), xy=(-5, 0), arrowprops=props, fontsize=14, ha=\"center\")\n", "plt.annotate('Linear', xytext=(2, 0.2), xy=(0, 0.5), arrowprops=props, fontsize=14, ha=\"center\")\n", "plt.grid(True)\n", "plt.title(\"Sigmoid activation function\", fontsize=14)\n", "plt.axis([-5, 5, -0.2, 1.2])\n", "\n", "save_fig(\"sigmoid_saturation_plot\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Xavier and He Initialization" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['Constant',\n", " 'GlorotNormal',\n", " 'GlorotUniform',\n", " 'HeNormal',\n", " 'HeUniform',\n", " 'Identity',\n", " 'Initializer',\n", " 'LecunNormal',\n", " 'LecunUniform',\n", " 'Ones',\n", " 'Orthogonal',\n", " 'RandomNormal',\n", " 'RandomUniform',\n", " 'TruncatedNormal',\n", " 'VarianceScaling',\n", " 'Zeros',\n", " 'constant',\n", " 'deserialize',\n", " 'get',\n", " 'glorot_normal',\n", " 'glorot_uniform',\n", " 'he_normal',\n", " 'he_uniform',\n", " 'identity',\n", " 'lecun_normal',\n", " 'lecun_uniform',\n", " 'ones',\n", " 'orthogonal',\n", " 'random_normal',\n", " 'random_uniform',\n", " 'serialize',\n", " 'truncated_normal',\n", " 'variance_scaling',\n", " 'zeros']" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[name for name in dir(keras.initializers) if not name.startswith(\"_\")]" ] }, { "cell_type": "code", "execution_count": 
5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "keras.layers.Dense(10, activation=\"relu\", kernel_initializer=\"he_normal\")" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "init = keras.initializers.VarianceScaling(scale=2., mode='fan_avg',\n", " distribution='uniform')\n", "keras.layers.Dense(10, activation=\"relu\", kernel_initializer=init)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Nonsaturating Activation Functions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Leaky ReLU" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "def leaky_relu(z, alpha=0.01):\n", " return np.maximum(alpha*z, z)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Saving figure leaky_relu_plot\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.plot(z, leaky_relu(z, 0.05), \"b-\", linewidth=2)\n", "plt.plot([-5, 5], [0, 0], 'k-')\n", "plt.plot([0, 0], [-0.5, 4.2], 'k-')\n", "plt.grid(True)\n", "props = dict(facecolor='black', shrink=0.1)\n", "plt.annotate('Leak', xytext=(-3.5, 0.5), xy=(-5, -0.2), arrowprops=props, fontsize=14, ha=\"center\")\n", "plt.title(\"Leaky ReLU activation function\", fontsize=14)\n", "plt.axis([-5, 5, -0.5, 4.2])\n", "\n", "save_fig(\"leaky_relu_plot\")\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['deserialize',\n", " 'elu',\n", " 'exponential',\n", " 'gelu',\n", " 'get',\n", " 'hard_sigmoid',\n", " 'linear',\n", " 'relu',\n", " 'selu',\n", " 'serialize',\n", " 'sigmoid',\n", " 'softmax',\n", " 'softplus',\n", " 'softsign',\n", " 'swish',\n", " 'tanh']" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[m for m in dir(keras.activations) if not m.startswith(\"_\")]" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['LeakyReLU', 'PReLU', 'ReLU', 'ThresholdedReLU']" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[m for m in dir(keras.layers) if \"relu\" in m.lower()]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's train a neural network on Fashion MNIST using the Leaky ReLU:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "(X_train_full, y_train_full), (X_test, y_test) = keras.datasets.fashion_mnist.load_data()\n", "X_train_full = X_train_full / 255.0\n", "X_test = X_test / 255.0\n", "X_valid, X_train = X_train_full[:5000], X_train_full[5000:]\n", "y_valid, y_train = y_train_full[:5000], y_train_full[5000:]" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], 
"source": [ "tf.random.set_seed(42)\n", "np.random.seed(42)\n", "\n", "model = keras.models.Sequential([\n", " keras.layers.Flatten(input_shape=[28, 28]),\n", " keras.layers.Dense(300, kernel_initializer=\"he_normal\"),\n", " keras.layers.LeakyReLU(),\n", " keras.layers.Dense(100, kernel_initializer=\"he_normal\"),\n", " keras.layers.LeakyReLU(),\n", " keras.layers.Dense(10, activation=\"softmax\")\n", "])" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "model.compile(loss=\"sparse_categorical_crossentropy\",\n", " optimizer=keras.optimizers.SGD(learning_rate=1e-3),\n", " metrics=[\"accuracy\"])" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/10\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 1.6314 - accuracy: 0.5054 - val_loss: 0.8886 - val_accuracy: 0.7160\n", "Epoch 2/10\n", "1719/1719 [==============================] - 2s 892us/step - loss: 0.8416 - accuracy: 0.7247 - val_loss: 0.7130 - val_accuracy: 0.7656\n", "Epoch 3/10\n", "1719/1719 [==============================] - 2s 879us/step - loss: 0.7053 - accuracy: 0.7637 - val_loss: 0.6427 - val_accuracy: 0.7898\n", "Epoch 4/10\n", "1719/1719 [==============================] - 2s 883us/step - loss: 0.6325 - accuracy: 0.7908 - val_loss: 0.5900 - val_accuracy: 0.8066\n", "Epoch 5/10\n", "1719/1719 [==============================] - 2s 887us/step - loss: 0.5992 - accuracy: 0.8021 - val_loss: 0.5582 - val_accuracy: 0.8200\n", "Epoch 6/10\n", "1719/1719 [==============================] - 2s 881us/step - loss: 0.5624 - accuracy: 0.8142 - val_loss: 0.5350 - val_accuracy: 0.8238\n", "Epoch 7/10\n", "1719/1719 [==============================] - 2s 892us/step - loss: 0.5379 - accuracy: 0.8217 - val_loss: 0.5157 - val_accuracy: 0.8304\n", "Epoch 8/10\n", "1719/1719 [==============================] - 2s 895us/step - loss: 0.5152 - accuracy: 
0.8295 - val_loss: 0.5078 - val_accuracy: 0.8284\n", "Epoch 9/10\n", "1719/1719 [==============================] - 2s 911us/step - loss: 0.5100 - accuracy: 0.8268 - val_loss: 0.4895 - val_accuracy: 0.8390\n", "Epoch 10/10\n", "1719/1719 [==============================] - 2s 897us/step - loss: 0.4918 - accuracy: 0.8340 - val_loss: 0.4817 - val_accuracy: 0.8396\n" ] } ], "source": [ "history = model.fit(X_train, y_train, epochs=10,\n", " validation_data=(X_valid, y_valid))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's try PReLU:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "tf.random.set_seed(42)\n", "np.random.seed(42)\n", "\n", "model = keras.models.Sequential([\n", " keras.layers.Flatten(input_shape=[28, 28]),\n", " keras.layers.Dense(300, kernel_initializer=\"he_normal\"),\n", " keras.layers.PReLU(),\n", " keras.layers.Dense(100, kernel_initializer=\"he_normal\"),\n", " keras.layers.PReLU(),\n", " keras.layers.Dense(10, activation=\"softmax\")\n", "])" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "model.compile(loss=\"sparse_categorical_crossentropy\",\n", " optimizer=keras.optimizers.SGD(learning_rate=1e-3),\n", " metrics=[\"accuracy\"])" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/10\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 1.6969 - accuracy: 0.4974 - val_loss: 0.9255 - val_accuracy: 0.7186\n", "Epoch 2/10\n", "1719/1719 [==============================] - 2s 990us/step - loss: 0.8706 - accuracy: 0.7247 - val_loss: 0.7305 - val_accuracy: 0.7630\n", "Epoch 3/10\n", "1719/1719 [==============================] - 2s 980us/step - loss: 0.7211 - accuracy: 0.7621 - val_loss: 0.6564 - val_accuracy: 0.7882\n", "Epoch 4/10\n", "1719/1719 [==============================] - 2s 985us/step - loss: 0.6447 - accuracy: 0.7879 - 
val_loss: 0.6003 - val_accuracy: 0.8048\n", "Epoch 5/10\n", "1719/1719 [==============================] - 2s 967us/step - loss: 0.6077 - accuracy: 0.8004 - val_loss: 0.5656 - val_accuracy: 0.8182\n", "Epoch 6/10\n", "1719/1719 [==============================] - 2s 984us/step - loss: 0.5692 - accuracy: 0.8118 - val_loss: 0.5406 - val_accuracy: 0.8236\n", "Epoch 7/10\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.5428 - accuracy: 0.8194 - val_loss: 0.5196 - val_accuracy: 0.8314\n", "Epoch 8/10\n", "1719/1719 [==============================] - 2s 983us/step - loss: 0.5193 - accuracy: 0.8284 - val_loss: 0.5113 - val_accuracy: 0.8316\n", "Epoch 9/10\n", "1719/1719 [==============================] - 2s 992us/step - loss: 0.5128 - accuracy: 0.8272 - val_loss: 0.4916 - val_accuracy: 0.8378\n", "Epoch 10/10\n", "1719/1719 [==============================] - 2s 988us/step - loss: 0.4941 - accuracy: 0.8314 - val_loss: 0.4826 - val_accuracy: 0.8398\n" ] } ], "source": [ "history = model.fit(X_train, y_train, epochs=10,\n", " validation_data=(X_valid, y_valid))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### ELU" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "def elu(z, alpha=1):\n", " return np.where(z < 0, alpha * (np.exp(z) - 1), z)" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Saving figure elu_plot\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.plot(z, elu(z), \"b-\", linewidth=2)\n", "plt.plot([-5, 5], [0, 0], 'k-')\n", "plt.plot([-5, 5], [-1, -1], 'k--')\n", "plt.plot([0, 0], [-2.2, 3.2], 'k-')\n", "plt.grid(True)\n", "plt.title(r\"ELU activation function ($\\alpha=1$)\", fontsize=14)\n", "plt.axis([-5, 5, -2.2, 3.2])\n", "\n", "save_fig(\"elu_plot\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Implementing ELU in TensorFlow is trivial; just specify the activation function when building each layer:" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "keras.layers.Dense(10, activation=\"elu\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### SELU" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This activation function was proposed in this [great paper](https://arxiv.org/pdf/1706.02515.pdf) by Günter Klambauer, Thomas Unterthiner, Andreas Mayr and Sepp Hochreiter, published in June 2017. During training, a neural network composed exclusively of a stack of dense layers using the SELU activation function and LeCun initialization will self-normalize: the output of each layer will tend to preserve the same mean and variance during training, which solves the vanishing/exploding gradients problem. As a result, SELU often significantly outperforms other activation functions for such neural nets, so it is well worth trying. Unfortunately, the self-normalizing property of the SELU activation function is easily broken: you cannot use ℓ1 or ℓ2 regularization, regular dropout, max-norm, skip connections or other non-sequential topologies (so recurrent neural networks won't self-normalize). However, in practice it works quite well with sequential CNNs. 
If you break self-normalization, SELU will not necessarily outperform other activation functions." ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "from scipy.special import erfc\n", "\n", "# alpha and scale to self normalize with mean 0 and standard deviation 1\n", "# (see equation 14 in the paper):\n", "alpha_0_1 = -np.sqrt(2 / np.pi) / (erfc(1/np.sqrt(2)) * np.exp(1/2) - 1)\n", "scale_0_1 = (1 - erfc(1 / np.sqrt(2)) * np.sqrt(np.e)) * np.sqrt(2 * np.pi) * (2 * erfc(np.sqrt(2))*np.e**2 + np.pi*erfc(1/np.sqrt(2))**2*np.e - 2*(2+np.pi)*erfc(1/np.sqrt(2))*np.sqrt(np.e)+np.pi+2)**(-1/2)" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [], "source": [ "def selu(z, scale=scale_0_1, alpha=alpha_0_1):\n", " return scale * elu(z, alpha)" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Saving figure selu_plot\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.plot(z, selu(z), \"b-\", linewidth=2)\n", "plt.plot([-5, 5], [0, 0], 'k-')\n", "plt.plot([-5, 5], [-1.758, -1.758], 'k--')\n", "plt.plot([0, 0], [-2.2, 3.2], 'k-')\n", "plt.grid(True)\n", "plt.title(\"SELU activation function\", fontsize=14)\n", "plt.axis([-5, 5, -2.2, 3.2])\n", "\n", "save_fig(\"selu_plot\")\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By default, the SELU hyperparameters (`scale` and `alpha`) are tuned so that the mean output of each neuron remains close to 0 and the standard deviation remains close to 1 (assuming the inputs are also standardized with mean 0 and standard deviation 1). Using this activation function, even a 1,000-layer deep neural network preserves roughly mean 0 and standard deviation 1 across all layers, avoiding the exploding/vanishing gradients problem:" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Layer 0: mean -0.00, std deviation 1.00\n", "Layer 100: mean 0.02, std deviation 0.96\n", "Layer 200: mean 0.01, std deviation 0.90\n", "Layer 300: mean -0.02, std deviation 0.92\n", "Layer 400: mean 0.05, std deviation 0.89\n", "Layer 500: mean 0.01, std deviation 0.93\n", "Layer 600: mean 0.02, std deviation 0.92\n", "Layer 700: mean -0.02, std deviation 0.90\n", "Layer 800: mean 0.05, std deviation 0.83\n", "Layer 900: mean 0.02, std deviation 1.00\n" ] } ], "source": [ "np.random.seed(42)\n", "Z = np.random.normal(size=(500, 100))  # standardized inputs\n", "for layer in range(1000):\n", "    W = np.random.normal(size=(100, 100), scale=np.sqrt(1 / 100))  # LeCun initialization\n", "    Z = selu(np.dot(Z, W))\n", "    means = np.mean(Z, axis=0).mean()\n", "    stds = np.std(Z, axis=0).mean()\n", "    if layer % 100 == 0:\n", "        print(\"Layer {}: mean {:.2f}, std deviation {:.2f}\".format(layer, means, stds))" ] 
}, { "cell_type": "markdown", "metadata": {}, "source": [ "Using SELU is easy:" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "keras.layers.Dense(10, activation=\"selu\",\n", " kernel_initializer=\"lecun_normal\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's create a neural net for Fashion MNIST with 100 hidden layers, using the SELU activation function:" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "np.random.seed(42)\n", "tf.random.set_seed(42)" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "model = keras.models.Sequential()\n", "model.add(keras.layers.Flatten(input_shape=[28, 28]))\n", "model.add(keras.layers.Dense(300, activation=\"selu\",\n", " kernel_initializer=\"lecun_normal\"))\n", "for layer in range(99):\n", " model.add(keras.layers.Dense(100, activation=\"selu\",\n", " kernel_initializer=\"lecun_normal\"))\n", "model.add(keras.layers.Dense(10, activation=\"softmax\"))" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [], "source": [ "model.compile(loss=\"sparse_categorical_crossentropy\",\n", " optimizer=keras.optimizers.SGD(learning_rate=1e-3),\n", " metrics=[\"accuracy\"])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's train it. 
Do not forget to scale the inputs to mean 0 and standard deviation 1:" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [], "source": [ "pixel_means = X_train.mean(axis=0, keepdims=True)\n", "pixel_stds = X_train.std(axis=0, keepdims=True)\n", "X_train_scaled = (X_train - pixel_means) / pixel_stds\n", "X_valid_scaled = (X_valid - pixel_means) / pixel_stds\n", "X_test_scaled = (X_test - pixel_means) / pixel_stds" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/5\n", "1719/1719 [==============================] - 12s 6ms/step - loss: 1.3556 - accuracy: 0.4808 - val_loss: 0.7711 - val_accuracy: 0.6858\n", "Epoch 2/5\n", "1719/1719 [==============================] - 9s 5ms/step - loss: 0.7537 - accuracy: 0.7235 - val_loss: 0.7534 - val_accuracy: 0.7384\n", "Epoch 3/5\n", "1719/1719 [==============================] - 9s 5ms/step - loss: 0.7451 - accuracy: 0.7357 - val_loss: 0.5943 - val_accuracy: 0.7834\n", "Epoch 4/5\n", "1719/1719 [==============================] - 9s 5ms/step - loss: 0.5699 - accuracy: 0.7906 - val_loss: 0.5434 - val_accuracy: 0.8066\n", "Epoch 5/5\n", "1719/1719 [==============================] - 9s 5ms/step - loss: 0.5569 - accuracy: 0.8051 - val_loss: 0.4907 - val_accuracy: 0.8218\n" ] } ], "source": [ "history = model.fit(X_train_scaled, y_train, epochs=5,\n", " validation_data=(X_valid_scaled, y_valid))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now look at what happens if we try to use the ReLU activation function instead:" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [], "source": [ "np.random.seed(42)\n", "tf.random.set_seed(42)" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [], "source": [ "model = keras.models.Sequential()\n", "model.add(keras.layers.Flatten(input_shape=[28, 28]))\n", "model.add(keras.layers.Dense(300, 
activation=\"relu\", kernel_initializer=\"he_normal\"))\n", "for layer in range(99):\n", "    model.add(keras.layers.Dense(100, activation=\"relu\", kernel_initializer=\"he_normal\"))\n", "model.add(keras.layers.Dense(10, activation=\"softmax\"))" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [], "source": [ "model.compile(loss=\"sparse_categorical_crossentropy\",\n", "              optimizer=keras.optimizers.SGD(learning_rate=1e-3),\n", "              metrics=[\"accuracy\"])" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/5\n", "1719/1719 [==============================] - 11s 5ms/step - loss: 2.0460 - accuracy: 0.1919 - val_loss: 1.5971 - val_accuracy: 0.3048\n", "Epoch 2/5\n", "1719/1719 [==============================] - 8s 5ms/step - loss: 1.2654 - accuracy: 0.4591 - val_loss: 0.9156 - val_accuracy: 0.6372\n", "Epoch 3/5\n", "1719/1719 [==============================] - 8s 5ms/step - loss: 0.9312 - accuracy: 0.6169 - val_loss: 0.8928 - val_accuracy: 0.6246\n", "Epoch 4/5\n", "1719/1719 [==============================] - 8s 5ms/step - loss: 0.8188 - accuracy: 0.6710 - val_loss: 0.6914 - val_accuracy: 0.7396\n", "Epoch 5/5\n", "1719/1719 [==============================] - 8s 5ms/step - loss: 0.7288 - accuracy: 0.7152 - val_loss: 0.6638 - val_accuracy: 0.7380\n" ] } ], "source": [ "history = model.fit(X_train_scaled, y_train, epochs=5,\n", "                    validation_data=(X_valid_scaled, y_valid))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Not great at all: we suffered from the vanishing/exploding gradients problem." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Batch Normalization" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [], "source": [ "model = keras.models.Sequential([\n", " keras.layers.Flatten(input_shape=[28, 28]),\n", " keras.layers.BatchNormalization(),\n", " keras.layers.Dense(300, activation=\"relu\"),\n", " keras.layers.BatchNormalization(),\n", " keras.layers.Dense(100, activation=\"relu\"),\n", " keras.layers.BatchNormalization(),\n", " keras.layers.Dense(10, activation=\"softmax\")\n", "])" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Model: \"sequential_4\"\n", "_________________________________________________________________\n", "Layer (type) Output Shape Param # \n", "=================================================================\n", "flatten_4 (Flatten) (None, 784) 0 \n", "_________________________________________________________________\n", "batch_normalization (BatchNo (None, 784) 3136 \n", "_________________________________________________________________\n", "dense_212 (Dense) (None, 300) 235500 \n", "_________________________________________________________________\n", "batch_normalization_1 (Batch (None, 300) 1200 \n", "_________________________________________________________________\n", "dense_213 (Dense) (None, 100) 30100 \n", "_________________________________________________________________\n", "batch_normalization_2 (Batch (None, 100) 400 \n", "_________________________________________________________________\n", "dense_214 (Dense) (None, 10) 1010 \n", "=================================================================\n", "Total params: 271,346\n", "Trainable params: 268,978\n", "Non-trainable params: 2,368\n", "_________________________________________________________________\n" ] } ], "source": [ "model.summary()" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { 
"data": { "text/plain": [ "[('batch_normalization/gamma:0', True),\n", " ('batch_normalization/beta:0', True),\n", " ('batch_normalization/moving_mean:0', False),\n", " ('batch_normalization/moving_variance:0', False)]" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bn1 = model.layers[1]\n", "[(var.name, var.trainable) for var in bn1.variables]" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [], "source": [ "#bn1.updates #deprecated" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [], "source": [ "model.compile(loss=\"sparse_categorical_crossentropy\",\n", " optimizer=keras.optimizers.SGD(learning_rate=1e-3),\n", " metrics=[\"accuracy\"])" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/10\n", "1719/1719 [==============================] - 3s 1ms/step - loss: 1.2287 - accuracy: 0.5993 - val_loss: 0.5526 - val_accuracy: 0.8230\n", "Epoch 2/10\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.5996 - accuracy: 0.7959 - val_loss: 0.4725 - val_accuracy: 0.8468\n", "Epoch 3/10\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.5312 - accuracy: 0.8168 - val_loss: 0.4375 - val_accuracy: 0.8558\n", "Epoch 4/10\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.4884 - accuracy: 0.8294 - val_loss: 0.4153 - val_accuracy: 0.8596\n", "Epoch 5/10\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.4717 - accuracy: 0.8343 - val_loss: 0.3997 - val_accuracy: 0.8640\n", "Epoch 6/10\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.4420 - accuracy: 0.8461 - val_loss: 0.3867 - val_accuracy: 0.8694\n", "Epoch 7/10\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.4285 - accuracy: 0.8496 - val_loss: 0.3763 - val_accuracy: 0.8710\n", "Epoch 8/10\n", "1719/1719 
[==============================] - 2s 1ms/step - loss: 0.4086 - accuracy: 0.8552 - val_loss: 0.3711 - val_accuracy: 0.8740\n", "Epoch 9/10\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.4079 - accuracy: 0.8566 - val_loss: 0.3631 - val_accuracy: 0.8752\n", "Epoch 10/10\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.3903 - accuracy: 0.8617 - val_loss: 0.3573 - val_accuracy: 0.8750\n" ] } ], "source": [ "history = model.fit(X_train, y_train, epochs=10,\n", "                    validation_data=(X_valid, y_valid))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Sometimes applying BN before the activation function works better (there's a debate on this topic). Moreover, the layer before a `BatchNormalization` layer does not need bias terms: the `BatchNormalization` layer has its own offset parameters, so the extra bias terms would be a waste of parameters. You can set `use_bias=False` when creating those layers:" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [], "source": [ "model = keras.models.Sequential([\n", "    keras.layers.Flatten(input_shape=[28, 28]),\n", "    keras.layers.BatchNormalization(),\n", "    keras.layers.Dense(300, use_bias=False),\n", "    keras.layers.BatchNormalization(),\n", "    keras.layers.Activation(\"relu\"),\n", "    keras.layers.Dense(100, use_bias=False),\n", "    keras.layers.BatchNormalization(),\n", "    keras.layers.Activation(\"relu\"),\n", "    keras.layers.Dense(10, activation=\"softmax\")\n", "])" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [], "source": [ "model.compile(loss=\"sparse_categorical_crossentropy\",\n", "              optimizer=keras.optimizers.SGD(learning_rate=1e-3),\n", "              metrics=[\"accuracy\"])" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/10\n", "1719/1719 [==============================] - 3s 1ms/step - loss: 1.3677 - accuracy: 0.5604 - val_loss: 0.6767 - 
val_accuracy: 0.7812\n", "Epoch 2/10\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.7136 - accuracy: 0.7702 - val_loss: 0.5566 - val_accuracy: 0.8184\n", "Epoch 3/10\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.6123 - accuracy: 0.7990 - val_loss: 0.5007 - val_accuracy: 0.8360\n", "Epoch 4/10\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.5547 - accuracy: 0.8148 - val_loss: 0.4666 - val_accuracy: 0.8448\n", "Epoch 5/10\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.5255 - accuracy: 0.8230 - val_loss: 0.4434 - val_accuracy: 0.8534\n", "Epoch 6/10\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.4947 - accuracy: 0.8328 - val_loss: 0.4263 - val_accuracy: 0.8550\n", "Epoch 7/10\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.4736 - accuracy: 0.8385 - val_loss: 0.4130 - val_accuracy: 0.8566\n", "Epoch 8/10\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.4550 - accuracy: 0.8446 - val_loss: 0.4035 - val_accuracy: 0.8612\n", "Epoch 9/10\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.4495 - accuracy: 0.8440 - val_loss: 0.3943 - val_accuracy: 0.8638\n", "Epoch 10/10\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.4333 - accuracy: 0.8494 - val_loss: 0.3875 - val_accuracy: 0.8660\n" ] } ], "source": [ "history = model.fit(X_train, y_train, epochs=10,\n", " validation_data=(X_valid, y_valid))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Gradient Clipping" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "All Keras optimizers accept `clipnorm` or `clipvalue` arguments:" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [], "source": [ "optimizer = keras.optimizers.SGD(clipvalue=1.0)" ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [], "source": [ "optimizer = 
keras.optimizers.SGD(clipnorm=1.0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Reusing Pretrained Layers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Reusing a Keras model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's split the Fashion MNIST training set in two:\n", "* `X_train_A`: all images of all items except for sandals and shirts (classes 5 and 6).\n", "* `X_train_B`: a much smaller training set of just the first 200 images of sandals or shirts.\n", "\n", "The validation set and the test set are also split this way, but without restricting the number of images.\n", "\n", "We will train a model on set A (a classification task with 8 classes), and try to reuse it to tackle set B (binary classification). We hope to transfer a little bit of knowledge from task A to task B, since classes in set A (sneakers, ankle boots, coats, t-shirts, etc.) are somewhat similar to classes in set B (sandals and shirts). However, since we are using `Dense` layers, only patterns that occur at the same location can be reused (in contrast, convolutional layers will transfer much better, since learned patterns can be detected anywhere on the image, as we will see in the CNN chapter)." 
] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [], "source": [ "def split_dataset(X, y):\n", " y_5_or_6 = (y == 5) | (y == 6) # sandals or shirts\n", " y_A = y[~y_5_or_6]\n", " y_A[y_A > 6] -= 2 # class indices 7, 8, 9 should be moved to 5, 6, 7\n", " y_B = (y[y_5_or_6] == 6).astype(np.float32) # binary classification task: is it a shirt (class 6)?\n", " return ((X[~y_5_or_6], y_A),\n", " (X[y_5_or_6], y_B))\n", "\n", "(X_train_A, y_train_A), (X_train_B, y_train_B) = split_dataset(X_train, y_train)\n", "(X_valid_A, y_valid_A), (X_valid_B, y_valid_B) = split_dataset(X_valid, y_valid)\n", "(X_test_A, y_test_A), (X_test_B, y_test_B) = split_dataset(X_test, y_test)\n", "X_train_B = X_train_B[:200]\n", "y_train_B = y_train_B[:200]" ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(43986, 28, 28)" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X_train_A.shape" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(200, 28, 28)" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X_train_B.shape" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([4, 0, 5, 7, 7, 7, 4, 4, 3, 4, 0, 1, 6, 3, 4, 3, 2, 6, 5, 3, 4, 5,\n", " 1, 3, 4, 2, 0, 6, 7, 1], dtype=uint8)" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "y_train_A[:30]" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([1., 1., 0., 0., 0., 0., 1., 1., 1., 0., 0., 1., 1., 0., 0., 0., 0.,\n", " 0., 0., 1., 1., 0., 0., 1., 1., 0., 1., 1., 1., 1.], dtype=float32)" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "y_train_B[:30]" ] }, { "cell_type": "code", "execution_count": 51, 
"metadata": {}, "outputs": [], "source": [ "tf.random.set_seed(42)\n", "np.random.seed(42)" ] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [], "source": [ "model_A = keras.models.Sequential()\n", "model_A.add(keras.layers.Flatten(input_shape=[28, 28]))\n", "for n_hidden in (300, 100, 50, 50, 50):\n", " model_A.add(keras.layers.Dense(n_hidden, activation=\"selu\"))\n", "model_A.add(keras.layers.Dense(8, activation=\"softmax\"))" ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [], "source": [ "model_A.compile(loss=\"sparse_categorical_crossentropy\",\n", " optimizer=keras.optimizers.SGD(learning_rate=1e-3),\n", " metrics=[\"accuracy\"])" ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/20\n", "1375/1375 [==============================] - 2s 1ms/step - loss: 0.9249 - accuracy: 0.6994 - val_loss: 0.3896 - val_accuracy: 0.8662\n", "Epoch 2/20\n", "1375/1375 [==============================] - 1s 1ms/step - loss: 0.3651 - accuracy: 0.8745 - val_loss: 0.3288 - val_accuracy: 0.8827\n", "Epoch 3/20\n", "1375/1375 [==============================] - 1s 989us/step - loss: 0.3182 - accuracy: 0.8897 - val_loss: 0.3013 - val_accuracy: 0.8991\n", "Epoch 4/20\n", "1375/1375 [==============================] - 1s 1ms/step - loss: 0.3048 - accuracy: 0.8954 - val_loss: 0.2896 - val_accuracy: 0.9021\n", "Epoch 5/20\n", "1375/1375 [==============================] - 1s 1ms/step - loss: 0.2804 - accuracy: 0.9029 - val_loss: 0.2773 - val_accuracy: 0.9061\n", "Epoch 6/20\n", "1375/1375 [==============================] - 1s 1ms/step - loss: 0.2701 - accuracy: 0.9075 - val_loss: 0.2735 - val_accuracy: 0.9066\n", "Epoch 7/20\n", "1375/1375 [==============================] - 1s 1ms/step - loss: 0.2627 - accuracy: 0.9093 - val_loss: 0.2721 - val_accuracy: 0.9081\n", "Epoch 8/20\n", "1375/1375 [==============================] - 1s 997us/step - 
loss: 0.2609 - accuracy: 0.9122 - val_loss: 0.2589 - val_accuracy: 0.9141\n", "Epoch 9/20\n", "1375/1375 [==============================] - 1s 1ms/step - loss: 0.2558 - accuracy: 0.9110 - val_loss: 0.2562 - val_accuracy: 0.9136\n", "Epoch 10/20\n", "1375/1375 [==============================] - 1s 997us/step - loss: 0.2512 - accuracy: 0.9138 - val_loss: 0.2544 - val_accuracy: 0.9160\n", "Epoch 11/20\n", "1375/1375 [==============================] - 1s 1000us/step - loss: 0.2431 - accuracy: 0.9170 - val_loss: 0.2495 - val_accuracy: 0.9153\n", "Epoch 12/20\n", "1375/1375 [==============================] - 1s 995us/step - loss: 0.2422 - accuracy: 0.9168 - val_loss: 0.2515 - val_accuracy: 0.9126\n", "Epoch 13/20\n", "1375/1375 [==============================] - 1s 1ms/step - loss: 0.2360 - accuracy: 0.9181 - val_loss: 0.2446 - val_accuracy: 0.9160\n", "Epoch 14/20\n", "1375/1375 [==============================] - 1s 1ms/step - loss: 0.2266 - accuracy: 0.9232 - val_loss: 0.2415 - val_accuracy: 0.9178\n", "Epoch 15/20\n", "1375/1375 [==============================] - 1s 988us/step - loss: 0.2225 - accuracy: 0.9239 - val_loss: 0.2447 - val_accuracy: 0.9195\n", "Epoch 16/20\n", "1375/1375 [==============================] - 1s 995us/step - loss: 0.2261 - accuracy: 0.9216 - val_loss: 0.2384 - val_accuracy: 0.9198\n", "Epoch 17/20\n", "1375/1375 [==============================] - 1s 1ms/step - loss: 0.2191 - accuracy: 0.9251 - val_loss: 0.2412 - val_accuracy: 0.9175\n", "Epoch 18/20\n", "1375/1375 [==============================] - 1s 991us/step - loss: 0.2171 - accuracy: 0.9254 - val_loss: 0.2429 - val_accuracy: 0.9158\n", "Epoch 19/20\n", "1375/1375 [==============================] - 1s 992us/step - loss: 0.2180 - accuracy: 0.9252 - val_loss: 0.2330 - val_accuracy: 0.9205\n", "Epoch 20/20\n", "1375/1375 [==============================] - 1s 994us/step - loss: 0.2112 - accuracy: 0.9274 - val_loss: 0.2333 - val_accuracy: 0.9200\n" ] } ], "source": [ "history = 
model_A.fit(X_train_A, y_train_A, epochs=20,\n", " validation_data=(X_valid_A, y_valid_A))" ] }, { "cell_type": "code", "execution_count": 55, "metadata": {}, "outputs": [], "source": [ "model_A.save(\"my_model_A.h5\")" ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [], "source": [ "model_B = keras.models.Sequential()\n", "model_B.add(keras.layers.Flatten(input_shape=[28, 28]))\n", "for n_hidden in (300, 100, 50, 50, 50):\n", " model_B.add(keras.layers.Dense(n_hidden, activation=\"selu\"))\n", "model_B.add(keras.layers.Dense(1, activation=\"sigmoid\"))" ] }, { "cell_type": "code", "execution_count": 57, "metadata": {}, "outputs": [], "source": [ "model_B.compile(loss=\"binary_crossentropy\",\n", " optimizer=keras.optimizers.SGD(learning_rate=1e-3),\n", " metrics=[\"accuracy\"])" ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/20\n", "7/7 [==============================] - 0s 29ms/step - loss: 1.0360 - accuracy: 0.4975 - val_loss: 0.6314 - val_accuracy: 0.6004\n", "Epoch 2/20\n", "7/7 [==============================] - 0s 9ms/step - loss: 0.5883 - accuracy: 0.6971 - val_loss: 0.4784 - val_accuracy: 0.8529\n", "Epoch 3/20\n", "7/7 [==============================] - 0s 10ms/step - loss: 0.4380 - accuracy: 0.8854 - val_loss: 0.4102 - val_accuracy: 0.8945\n", "Epoch 4/20\n", "7/7 [==============================] - 0s 10ms/step - loss: 0.4021 - accuracy: 0.8712 - val_loss: 0.3647 - val_accuracy: 0.9178\n", "Epoch 5/20\n", "7/7 [==============================] - 0s 11ms/step - loss: 0.3361 - accuracy: 0.9348 - val_loss: 0.3300 - val_accuracy: 0.9320\n", "Epoch 6/20\n", "7/7 [==============================] - 0s 11ms/step - loss: 0.3113 - accuracy: 0.9233 - val_loss: 0.3019 - val_accuracy: 0.9402\n", "Epoch 7/20\n", "7/7 [==============================] - 0s 11ms/step - loss: 0.2817 - accuracy: 0.9299 - val_loss: 0.2804 - val_accuracy: 0.9422\n", 
"Epoch 8/20\n", "7/7 [==============================] - 0s 11ms/step - loss: 0.2632 - accuracy: 0.9379 - val_loss: 0.2606 - val_accuracy: 0.9473\n", "Epoch 9/20\n", "7/7 [==============================] - 0s 10ms/step - loss: 0.2373 - accuracy: 0.9481 - val_loss: 0.2428 - val_accuracy: 0.9523\n", "Epoch 10/20\n", "7/7 [==============================] - 0s 11ms/step - loss: 0.2229 - accuracy: 0.9657 - val_loss: 0.2281 - val_accuracy: 0.9544\n", "Epoch 11/20\n", "7/7 [==============================] - 0s 11ms/step - loss: 0.2155 - accuracy: 0.9590 - val_loss: 0.2150 - val_accuracy: 0.9584\n", "Epoch 12/20\n", "7/7 [==============================] - 0s 11ms/step - loss: 0.1834 - accuracy: 0.9738 - val_loss: 0.2036 - val_accuracy: 0.9584\n", "Epoch 13/20\n", "7/7 [==============================] - 0s 10ms/step - loss: 0.1671 - accuracy: 0.9828 - val_loss: 0.1931 - val_accuracy: 0.9615\n", "Epoch 14/20\n", "7/7 [==============================] - 0s 10ms/step - loss: 0.1527 - accuracy: 0.9915 - val_loss: 0.1838 - val_accuracy: 0.9635\n", "Epoch 15/20\n", "7/7 [==============================] - 0s 10ms/step - loss: 0.1595 - accuracy: 0.9904 - val_loss: 0.1746 - val_accuracy: 0.9686\n", "Epoch 16/20\n", "7/7 [==============================] - 0s 11ms/step - loss: 0.1473 - accuracy: 0.9937 - val_loss: 0.1674 - val_accuracy: 0.9686\n", "Epoch 17/20\n", "7/7 [==============================] - 0s 10ms/step - loss: 0.1412 - accuracy: 0.9944 - val_loss: 0.1604 - val_accuracy: 0.9706\n", "Epoch 18/20\n", "7/7 [==============================] - 0s 11ms/step - loss: 0.1242 - accuracy: 0.9931 - val_loss: 0.1539 - val_accuracy: 0.9706\n", "Epoch 19/20\n", "7/7 [==============================] - 0s 11ms/step - loss: 0.1224 - accuracy: 0.9931 - val_loss: 0.1482 - val_accuracy: 0.9716\n", "Epoch 20/20\n", "7/7 [==============================] - 0s 10ms/step - loss: 0.1096 - accuracy: 0.9912 - val_loss: 0.1431 - val_accuracy: 0.9716\n" ] } ], "source": [ "history = model_B.fit(X_train_B, 
y_train_B, epochs=20,\n", " validation_data=(X_valid_B, y_valid_B))" ] }, { "cell_type": "code", "execution_count": 59, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Model: \"sequential_7\"\n", "_________________________________________________________________\n", "Layer (type) Output Shape Param # \n", "=================================================================\n", "flatten_5 (Flatten) (None, 784) 0 \n", "_________________________________________________________________\n", "dense_28 (Dense) (None, 300) 235500 \n", "_________________________________________________________________\n", "dense_29 (Dense) (None, 100) 30100 \n", "_________________________________________________________________\n", "dense_30 (Dense) (None, 50) 5050 \n", "_________________________________________________________________\n", "dense_31 (Dense) (None, 50) 2550 \n", "_________________________________________________________________\n", "dense_32 (Dense) (None, 50) 2550 \n", "_________________________________________________________________\n", "dense_33 (Dense) (None, 1) 51 \n", "=================================================================\n", "Total params: 275,801\n", "Trainable params: 275,801\n", "Non-trainable params: 0\n", "_________________________________________________________________\n" ] } ], "source": [ "model_B.summary()" ] }, { "cell_type": "code", "execution_count": 60, "metadata": {}, "outputs": [], "source": [ "model_A = keras.models.load_model(\"my_model_A.h5\")\n", "model_B_on_A = keras.models.Sequential(model_A.layers[:-1])\n", "model_B_on_A.add(keras.layers.Dense(1, activation=\"sigmoid\"))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that `model_B_on_A` and `model_A` actually share layers now, so when we train one, it will update both models. 
If we want to avoid that, we need to build `model_B_on_A` on top of a *clone* of `model_A`:" ] }, { "cell_type": "code", "execution_count": 61, "metadata": {}, "outputs": [], "source": [ "model_A_clone = keras.models.clone_model(model_A)\n", "model_A_clone.set_weights(model_A.get_weights())\n", "model_B_on_A = keras.models.Sequential(model_A_clone.layers[:-1])\n", "model_B_on_A.add(keras.layers.Dense(1, activation=\"sigmoid\"))" ] }, { "cell_type": "code", "execution_count": 62, "metadata": {}, "outputs": [], "source": [ "for layer in model_B_on_A.layers[:-1]:\n", " layer.trainable = False\n", "\n", "model_B_on_A.compile(loss=\"binary_crossentropy\",\n", " optimizer=keras.optimizers.SGD(learning_rate=1e-3),\n", " metrics=[\"accuracy\"])" ] }, { "cell_type": "code", "execution_count": 63, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/4\n", "7/7 [==============================] - 0s 29ms/step - loss: 0.2575 - accuracy: 0.9487 - val_loss: 0.2797 - val_accuracy: 0.9270\n", "Epoch 2/4\n", "7/7 [==============================] - 0s 9ms/step - loss: 0.2566 - accuracy: 0.9371 - val_loss: 0.2701 - val_accuracy: 0.9300\n", "Epoch 3/4\n", "7/7 [==============================] - 0s 9ms/step - loss: 0.2473 - accuracy: 0.9332 - val_loss: 0.2613 - val_accuracy: 0.9341\n", "Epoch 4/4\n", "7/7 [==============================] - 0s 10ms/step - loss: 0.2450 - accuracy: 0.9463 - val_loss: 0.2531 - val_accuracy: 0.9391\n", "Epoch 1/16\n", "7/7 [==============================] - 1s 29ms/step - loss: 0.2106 - accuracy: 0.9524 - val_loss: 0.2045 - val_accuracy: 0.9615\n", "Epoch 2/16\n", "7/7 [==============================] - 0s 9ms/step - loss: 0.1738 - accuracy: 0.9526 - val_loss: 0.1719 - val_accuracy: 0.9706\n", "Epoch 3/16\n", "7/7 [==============================] - 0s 9ms/step - loss: 0.1451 - accuracy: 0.9660 - val_loss: 0.1491 - val_accuracy: 0.9807\n", "Epoch 4/16\n", "7/7 [==============================] - 0s 9ms/step - loss: 
0.1242 - accuracy: 0.9717 - val_loss: 0.1325 - val_accuracy: 0.9817\n", "Epoch 5/16\n", "7/7 [==============================] - 0s 11ms/step - loss: 0.1078 - accuracy: 0.9855 - val_loss: 0.1200 - val_accuracy: 0.9848\n", "Epoch 6/16\n", "7/7 [==============================] - 0s 11ms/step - loss: 0.1075 - accuracy: 0.9931 - val_loss: 0.1101 - val_accuracy: 0.9858\n", "Epoch 7/16\n", "7/7 [==============================] - 0s 11ms/step - loss: 0.0893 - accuracy: 0.9950 - val_loss: 0.1020 - val_accuracy: 0.9858\n", "Epoch 8/16\n", "7/7 [==============================] - 0s 10ms/step - loss: 0.0815 - accuracy: 0.9950 - val_loss: 0.0953 - val_accuracy: 0.9868\n", "Epoch 9/16\n", "7/7 [==============================] - 0s 10ms/step - loss: 0.0640 - accuracy: 0.9973 - val_loss: 0.0892 - val_accuracy: 0.9868\n", "Epoch 10/16\n", "7/7 [==============================] - 0s 10ms/step - loss: 0.0641 - accuracy: 0.9931 - val_loss: 0.0844 - val_accuracy: 0.9878\n", "Epoch 11/16\n", "7/7 [==============================] - 0s 11ms/step - loss: 0.0609 - accuracy: 0.9931 - val_loss: 0.0800 - val_accuracy: 0.9888\n", "Epoch 12/16\n", "7/7 [==============================] - 0s 11ms/step - loss: 0.0641 - accuracy: 1.0000 - val_loss: 0.0762 - val_accuracy: 0.9888\n", "Epoch 13/16\n", "7/7 [==============================] - 0s 10ms/step - loss: 0.0478 - accuracy: 1.0000 - val_loss: 0.0728 - val_accuracy: 0.9888\n", "Epoch 14/16\n", "7/7 [==============================] - 0s 10ms/step - loss: 0.0444 - accuracy: 1.0000 - val_loss: 0.0700 - val_accuracy: 0.9878\n", "Epoch 15/16\n", "7/7 [==============================] - 0s 11ms/step - loss: 0.0490 - accuracy: 1.0000 - val_loss: 0.0675 - val_accuracy: 0.9878\n", "Epoch 16/16\n", "7/7 [==============================] - 0s 11ms/step - loss: 0.0434 - accuracy: 1.0000 - val_loss: 0.0652 - val_accuracy: 0.9878\n" ] } ], "source": [ "history = model_B_on_A.fit(X_train_B, y_train_B, epochs=4,\n", " validation_data=(X_valid_B, y_valid_B))\n", 
"\n", "for layer in model_B_on_A.layers[:-1]:\n", " layer.trainable = True\n", "\n", "model_B_on_A.compile(loss=\"binary_crossentropy\",\n", " optimizer=keras.optimizers.SGD(learning_rate=1e-3),\n", " metrics=[\"accuracy\"])\n", "history = model_B_on_A.fit(X_train_B, y_train_B, epochs=16,\n", " validation_data=(X_valid_B, y_valid_B))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So, what's the final verdict?" ] }, { "cell_type": "code", "execution_count": 64, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "63/63 [==============================] - 0s 714us/step - loss: 0.1408 - accuracy: 0.9705\n" ] }, { "data": { "text/plain": [ "[0.1408407837152481, 0.9704999923706055]" ] }, "execution_count": 64, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model_B.evaluate(X_test_B, y_test_B)" ] }, { "cell_type": "code", "execution_count": 65, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "63/63 [==============================] - 0s 751us/step - loss: 0.0562 - accuracy: 0.9940\n" ] }, { "data": { "text/plain": [ "[0.0561506561934948, 0.9940000176429749]" ] }, "execution_count": 65, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model_B_on_A.evaluate(X_test_B, y_test_B)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Great! We got quite a bit of transfer: the error rate dropped by a factor of 4.9!" 
] }, { "cell_type": "code", "execution_count": 66, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "4.916666666666718" ] }, "execution_count": 66, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(100 - 97.05) / (100 - 99.40)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Faster Optimizers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Momentum optimization" ] }, { "cell_type": "code", "execution_count": 67, "metadata": {}, "outputs": [], "source": [ "optimizer = keras.optimizers.SGD(learning_rate=0.001, momentum=0.9)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Nesterov Accelerated Gradient" ] }, { "cell_type": "code", "execution_count": 68, "metadata": {}, "outputs": [], "source": [ "optimizer = keras.optimizers.SGD(learning_rate=0.001, momentum=0.9, nesterov=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## AdaGrad" ] }, { "cell_type": "code", "execution_count": 69, "metadata": {}, "outputs": [], "source": [ "optimizer = keras.optimizers.Adagrad(learning_rate=0.001)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## RMSProp" ] }, { "cell_type": "code", "execution_count": 70, "metadata": {}, "outputs": [], "source": [ "optimizer = keras.optimizers.RMSprop(learning_rate=0.001, rho=0.9)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Adam Optimization" ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, "outputs": [], "source": [ "optimizer = keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Adamax Optimization" ] }, { "cell_type": "code", "execution_count": 72, "metadata": {}, "outputs": [], "source": [ "optimizer = keras.optimizers.Adamax(learning_rate=0.001, beta_1=0.9, beta_2=0.999)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Nadam Optimization" ] }, { "cell_type": "code", "execution_count": 73, "metadata": {}, 
"outputs": [], "source": [ "optimizer = keras.optimizers.Nadam(learning_rate=0.001, beta_1=0.9, beta_2=0.999)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Learning Rate Scheduling" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Power Scheduling" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```lr = lr0 / (1 + steps / s)**c```\n", "* Keras uses `c=1` and `s = 1 / decay`" ] }, { "cell_type": "code", "execution_count": 74, "metadata": {}, "outputs": [], "source": [ "optimizer = keras.optimizers.SGD(learning_rate=0.01, decay=1e-4)" ] }, { "cell_type": "code", "execution_count": 75, "metadata": {}, "outputs": [], "source": [ "model = keras.models.Sequential([\n", " keras.layers.Flatten(input_shape=[28, 28]),\n", " keras.layers.Dense(300, activation=\"selu\", kernel_initializer=\"lecun_normal\"),\n", " keras.layers.Dense(100, activation=\"selu\", kernel_initializer=\"lecun_normal\"),\n", " keras.layers.Dense(10, activation=\"softmax\")\n", "])\n", "model.compile(loss=\"sparse_categorical_crossentropy\", optimizer=optimizer, metrics=[\"accuracy\"])" ] }, { "cell_type": "code", "execution_count": 76, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.5980 - accuracy: 0.7933 - val_loss: 0.4031 - val_accuracy: 0.8598\n", "Epoch 2/25\n", "1719/1719 [==============================] - 2s 954us/step - loss: 0.3829 - accuracy: 0.8636 - val_loss: 0.3714 - val_accuracy: 0.8720\n", "Epoch 3/25\n", "1719/1719 [==============================] - 2s 943us/step - loss: 0.3491 - accuracy: 0.8771 - val_loss: 0.3746 - val_accuracy: 0.8738\n", "Epoch 4/25\n", "1719/1719 [==============================] - 2s 954us/step - loss: 0.3277 - accuracy: 0.8814 - val_loss: 0.3502 - val_accuracy: 0.8798\n", "Epoch 5/25\n", "1719/1719 [==============================] - 2s 934us/step - loss: 0.3172 - accuracy: 0.8856 - val_loss: 0.3453 - 
val_accuracy: 0.8780\n", "Epoch 6/25\n", "1719/1719 [==============================] - 2s 919us/step - loss: 0.2922 - accuracy: 0.8940 - val_loss: 0.3419 - val_accuracy: 0.8820\n", "Epoch 7/25\n", "1719/1719 [==============================] - 2s 921us/step - loss: 0.2870 - accuracy: 0.8973 - val_loss: 0.3362 - val_accuracy: 0.8872\n", "Epoch 8/25\n", "1719/1719 [==============================] - 2s 925us/step - loss: 0.2720 - accuracy: 0.9032 - val_loss: 0.3415 - val_accuracy: 0.8830\n", "Epoch 9/25\n", "1719/1719 [==============================] - 2s 929us/step - loss: 0.2730 - accuracy: 0.9004 - val_loss: 0.3297 - val_accuracy: 0.8864\n", "Epoch 10/25\n", "1719/1719 [==============================] - 2s 928us/step - loss: 0.2585 - accuracy: 0.9068 - val_loss: 0.3269 - val_accuracy: 0.8888\n", "Epoch 11/25\n", "1719/1719 [==============================] - 2s 932us/step - loss: 0.2529 - accuracy: 0.9100 - val_loss: 0.3280 - val_accuracy: 0.8878\n", "Epoch 12/25\n", "1719/1719 [==============================] - 2s 954us/step - loss: 0.2485 - accuracy: 0.9101 - val_loss: 0.3343 - val_accuracy: 0.8822\n", "Epoch 13/25\n", "1719/1719 [==============================] - 2s 964us/step - loss: 0.2420 - accuracy: 0.9148 - val_loss: 0.3266 - val_accuracy: 0.8890\n", "Epoch 14/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2373 - accuracy: 0.9144 - val_loss: 0.3299 - val_accuracy: 0.8890\n", "Epoch 15/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2363 - accuracy: 0.9154 - val_loss: 0.3255 - val_accuracy: 0.8874\n", "Epoch 16/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2309 - accuracy: 0.9181 - val_loss: 0.3217 - val_accuracy: 0.8910\n", "Epoch 17/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2235 - accuracy: 0.9211 - val_loss: 0.3248 - val_accuracy: 0.8914\n", "Epoch 18/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2247 - accuracy: 
0.9194 - val_loss: 0.3202 - val_accuracy: 0.8934\n", "Epoch 19/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2235 - accuracy: 0.9218 - val_loss: 0.3243 - val_accuracy: 0.8906\n", "Epoch 20/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2227 - accuracy: 0.9225 - val_loss: 0.3224 - val_accuracy: 0.8900\n", "Epoch 21/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2193 - accuracy: 0.9230 - val_loss: 0.3221 - val_accuracy: 0.8912\n", "Epoch 22/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2163 - accuracy: 0.9227 - val_loss: 0.3195 - val_accuracy: 0.8948\n", "Epoch 23/25\n", "1719/1719 [==============================] - 2s 997us/step - loss: 0.2127 - accuracy: 0.9252 - val_loss: 0.3208 - val_accuracy: 0.8908\n", "Epoch 24/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2076 - accuracy: 0.9273 - val_loss: 0.3226 - val_accuracy: 0.8902\n", "Epoch 25/25\n", "1719/1719 [==============================] - 2s 999us/step - loss: 0.2104 - accuracy: 0.9250 - val_loss: 0.3225 - val_accuracy: 0.8924\n" ] } ], "source": [ "n_epochs = 25\n", "history = model.fit(X_train_scaled, y_train, epochs=n_epochs,\n", " validation_data=(X_valid_scaled, y_valid))" ] }, { "cell_type": "code", "execution_count": 77, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "import math\n", "\n", "learning_rate = 0.01\n", "decay = 1e-4\n", "batch_size = 32\n", "n_steps_per_epoch = math.ceil(len(X_train) / batch_size)\n", "epochs = np.arange(n_epochs)\n", "lrs = learning_rate / (1 + decay * epochs * n_steps_per_epoch)\n", "\n", "plt.plot(epochs, lrs, \"o-\")\n", "plt.axis([0, n_epochs - 1, 0, 0.01])\n", "plt.xlabel(\"Epoch\")\n", "plt.ylabel(\"Learning Rate\")\n", "plt.title(\"Power Scheduling\", fontsize=14)\n", "plt.grid(True)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exponential Scheduling" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```lr = lr0 * 0.1**(epoch / s)```" ] }, { "cell_type": "code", "execution_count": 78, "metadata": {}, "outputs": [], "source": [ "def exponential_decay_fn(epoch):\n", " return 0.01 * 0.1**(epoch / 20)" ] }, { "cell_type": "code", "execution_count": 79, "metadata": {}, "outputs": [], "source": [ "def exponential_decay(lr0, s):\n", " def exponential_decay_fn(epoch):\n", " return lr0 * 0.1**(epoch / s)\n", " return exponential_decay_fn\n", "\n", "exponential_decay_fn = exponential_decay(lr0=0.01, s=20)" ] }, { "cell_type": "code", "execution_count": 80, "metadata": {}, "outputs": [], "source": [ "model = keras.models.Sequential([\n", " keras.layers.Flatten(input_shape=[28, 28]),\n", " keras.layers.Dense(300, activation=\"selu\", kernel_initializer=\"lecun_normal\"),\n", " keras.layers.Dense(100, activation=\"selu\", kernel_initializer=\"lecun_normal\"),\n", " keras.layers.Dense(10, activation=\"softmax\")\n", "])\n", "model.compile(loss=\"sparse_categorical_crossentropy\", optimizer=\"nadam\", metrics=[\"accuracy\"])\n", "n_epochs = 25" ] }, { "cell_type": "code", "execution_count": 81, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 
1.1122 - accuracy: 0.7363 - val_loss: 0.8947 - val_accuracy: 0.7496\n", "Epoch 2/25\n", "1719/1719 [==============================] - 3s 2ms/step - loss: 0.7354 - accuracy: 0.7825 - val_loss: 0.6059 - val_accuracy: 0.8122\n", "Epoch 3/25\n", "1719/1719 [==============================] - 3s 2ms/step - loss: 0.5973 - accuracy: 0.8175 - val_loss: 0.8195 - val_accuracy: 0.7754\n", "Epoch 4/25\n", "1719/1719 [==============================] - 3s 2ms/step - loss: 0.6040 - accuracy: 0.8148 - val_loss: 0.6135 - val_accuracy: 0.8398\n", "Epoch 5/25\n", "1719/1719 [==============================] - 3s 2ms/step - loss: 0.5462 - accuracy: 0.8323 - val_loss: 0.5075 - val_accuracy: 0.8490\n", "Epoch 6/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.4479 - accuracy: 0.8555 - val_loss: 0.4538 - val_accuracy: 0.8502\n", "Epoch 7/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.4225 - accuracy: 0.8622 - val_loss: 0.4792 - val_accuracy: 0.8524\n", "Epoch 8/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.3873 - accuracy: 0.8678 - val_loss: 0.5517 - val_accuracy: 0.8448\n", "Epoch 9/25\n", "1719/1719 [==============================] - 3s 2ms/step - loss: 0.3635 - accuracy: 0.8767 - val_loss: 0.5312 - val_accuracy: 0.8600\n", "Epoch 10/25\n", "1719/1719 [==============================] - 3s 2ms/step - loss: 0.3353 - accuracy: 0.8840 - val_loss: 0.4671 - val_accuracy: 0.8660\n", "Epoch 11/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.3108 - accuracy: 0.8927 - val_loss: 0.4885 - val_accuracy: 0.8670\n", "Epoch 12/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.2895 - accuracy: 0.8987 - val_loss: 0.4698 - val_accuracy: 0.8636\n", "Epoch 13/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.2660 - accuracy: 0.9071 - val_loss: 0.4558 - val_accuracy: 0.8820\n", "Epoch 14/25\n", "1719/1719 [==============================] - 4s 2ms/step 
- loss: 0.2442 - accuracy: 0.9153 - val_loss: 0.4325 - val_accuracy: 0.8774\n", "Epoch 15/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.2375 - accuracy: 0.9177 - val_loss: 0.4703 - val_accuracy: 0.8800\n", "Epoch 16/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.2196 - accuracy: 0.9231 - val_loss: 0.4657 - val_accuracy: 0.8870\n", "Epoch 17/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.2013 - accuracy: 0.9312 - val_loss: 0.5023 - val_accuracy: 0.8760\n", "Epoch 18/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.1938 - accuracy: 0.9331 - val_loss: 0.4782 - val_accuracy: 0.8856\n", "Epoch 19/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.1774 - accuracy: 0.9394 - val_loss: 0.4815 - val_accuracy: 0.8898\n", "Epoch 20/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.1703 - accuracy: 0.9418 - val_loss: 0.4674 - val_accuracy: 0.8902\n", "Epoch 21/25\n", "1719/1719 [==============================] - 3s 2ms/step - loss: 0.1611 - accuracy: 0.9462 - val_loss: 0.5116 - val_accuracy: 0.8930\n", "Epoch 22/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.1530 - accuracy: 0.9481 - val_loss: 0.5326 - val_accuracy: 0.8934\n", "Epoch 23/25\n", "1719/1719 [==============================] - 3s 2ms/step - loss: 0.1436 - accuracy: 0.9519 - val_loss: 0.5297 - val_accuracy: 0.8902\n", "Epoch 24/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.1326 - accuracy: 0.9560 - val_loss: 0.5526 - val_accuracy: 0.8930\n", "Epoch 25/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.1308 - accuracy: 0.9560 - val_loss: 0.5699 - val_accuracy: 0.8928\n" ] } ], "source": [ "lr_scheduler = keras.callbacks.LearningRateScheduler(exponential_decay_fn)\n", "history = model.fit(X_train_scaled, y_train, epochs=n_epochs,\n", " validation_data=(X_valid_scaled, y_valid),\n", 
" callbacks=[lr_scheduler])" ] }, { "cell_type": "code", "execution_count": 82, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.plot(history.epoch, history.history[\"lr\"], \"o-\")\n", "plt.axis([0, n_epochs - 1, 0, 0.011])\n", "plt.xlabel(\"Epoch\")\n", "plt.ylabel(\"Learning Rate\")\n", "plt.title(\"Exponential Scheduling\", fontsize=14)\n", "plt.grid(True)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The schedule function can take the current learning rate as a second argument:" ] }, { "cell_type": "code", "execution_count": 83, "metadata": {}, "outputs": [], "source": [ "def exponential_decay_fn(epoch, lr):\n", " return lr * 0.1**(1 / 20)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you want to update the learning rate at each iteration rather than at each epoch, you must write your own callback class:" ] }, { "cell_type": "code", "execution_count": 84, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/25\n", "1719/1719 [==============================] - 5s 3ms/step - loss: 1.1153 - accuracy: 0.7390 - val_loss: 0.9588 - val_accuracy: 0.7338\n", "Epoch 2/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.6929 - accuracy: 0.7934 - val_loss: 0.5328 - val_accuracy: 0.8318\n", "Epoch 3/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.6317 - accuracy: 0.8097 - val_loss: 0.7656 - val_accuracy: 0.8278\n", "Epoch 4/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.5827 - accuracy: 0.8258 - val_loss: 0.5585 - val_accuracy: 0.8382\n", "Epoch 5/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.5041 - accuracy: 0.8407 - val_loss: 0.5367 - val_accuracy: 0.8574\n", "Epoch 6/25\n", "1719/1719 [==============================] - 4s 3ms/step - loss: 0.4595 - accuracy: 0.8588 - val_loss: 0.6000 - val_accuracy: 0.8516\n", "Epoch 7/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.4490 - 
accuracy: 0.8644 - val_loss: 0.4605 - val_accuracy: 0.8648\n", "Epoch 8/25\n", "1719/1719 [==============================] - 4s 3ms/step - loss: 0.3925 - accuracy: 0.8783 - val_loss: 0.5076 - val_accuracy: 0.8616\n", "Epoch 9/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.4085 - accuracy: 0.8797 - val_loss: 0.4577 - val_accuracy: 0.8650\n", "Epoch 10/25\n", "1719/1719 [==============================] - 4s 3ms/step - loss: 0.3440 - accuracy: 0.8927 - val_loss: 0.5309 - val_accuracy: 0.8762\n", "Epoch 11/25\n", "1719/1719 [==============================] - 4s 3ms/step - loss: 0.3267 - accuracy: 0.8948 - val_loss: 0.4652 - val_accuracy: 0.8792\n", "Epoch 12/25\n", "1719/1719 [==============================] - 4s 3ms/step - loss: 0.3046 - accuracy: 0.9033 - val_loss: 0.4863 - val_accuracy: 0.8692\n", "Epoch 13/25\n", "1719/1719 [==============================] - 5s 3ms/step - loss: 0.2811 - accuracy: 0.9087 - val_loss: 0.4726 - val_accuracy: 0.8770\n", "Epoch 14/25\n", "1719/1719 [==============================] - 5s 3ms/step - loss: 0.2684 - accuracy: 0.9145 - val_loss: 0.4526 - val_accuracy: 0.8760\n", "Epoch 15/25\n", "1719/1719 [==============================] - 5s 3ms/step - loss: 0.2478 - accuracy: 0.9209 - val_loss: 0.4926 - val_accuracy: 0.8838\n", "Epoch 16/25\n", "1719/1719 [==============================] - 5s 3ms/step - loss: 0.2315 - accuracy: 0.9253 - val_loss: 0.4686 - val_accuracy: 0.8840\n", "Epoch 17/25\n", "1719/1719 [==============================] - 5s 3ms/step - loss: 0.2164 - accuracy: 0.9318 - val_loss: 0.4845 - val_accuracy: 0.8858\n", "Epoch 18/25\n", "1719/1719 [==============================] - 4s 3ms/step - loss: 0.2093 - accuracy: 0.9346 - val_loss: 0.4923 - val_accuracy: 0.8834\n", "Epoch 19/25\n", "1719/1719 [==============================] - 5s 3ms/step - loss: 0.1929 - accuracy: 0.9396 - val_loss: 0.4779 - val_accuracy: 0.8880\n", "Epoch 20/25\n", "1719/1719 [==============================] - 5s 3ms/step - 
loss: 0.1852 - accuracy: 0.9439 - val_loss: 0.4886 - val_accuracy: 0.8868\n", "Epoch 21/25\n", "1719/1719 [==============================] - 5s 3ms/step - loss: 0.1740 - accuracy: 0.9470 - val_loss: 0.5097 - val_accuracy: 0.8852\n", "Epoch 22/25\n", "1719/1719 [==============================] - 5s 3ms/step - loss: 0.1668 - accuracy: 0.9474 - val_loss: 0.5161 - val_accuracy: 0.8898\n", "Epoch 23/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.1571 - accuracy: 0.9530 - val_loss: 0.5381 - val_accuracy: 0.8886\n", "Epoch 24/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.1444 - accuracy: 0.9575 - val_loss: 0.5415 - val_accuracy: 0.8910\n", "Epoch 25/25\n", "1719/1719 [==============================] - 5s 3ms/step - loss: 0.1447 - accuracy: 0.9569 - val_loss: 0.5833 - val_accuracy: 0.8880\n" ] } ], "source": [ "K = keras.backend\n", "\n", "class ExponentialDecay(keras.callbacks.Callback):\n", "    def __init__(self, s=40000):\n", "        super().__init__()\n", "        self.s = s\n", "\n", "    def on_batch_begin(self, batch, logs=None):\n", "        # Note: the `batch` argument is reset at each epoch\n", "        lr = K.get_value(self.model.optimizer.learning_rate)\n", "        K.set_value(self.model.optimizer.learning_rate, lr * 0.1**(1 / self.s))\n", "\n", "    def on_epoch_end(self, epoch, logs=None):\n", "        logs = logs or {}\n", "        logs['lr'] = K.get_value(self.model.optimizer.learning_rate)\n", "\n", "model = keras.models.Sequential([\n", "    keras.layers.Flatten(input_shape=[28, 28]),\n", "    keras.layers.Dense(300, activation=\"selu\", kernel_initializer=\"lecun_normal\"),\n", "    keras.layers.Dense(100, activation=\"selu\", kernel_initializer=\"lecun_normal\"),\n", "    keras.layers.Dense(10, activation=\"softmax\")\n", "])\n", "lr0 = 0.01\n", "optimizer = keras.optimizers.Nadam(learning_rate=lr0)\n", "model.compile(loss=\"sparse_categorical_crossentropy\", optimizer=optimizer, metrics=[\"accuracy\"])\n", "n_epochs = 25\n", "\n", "s = 20 * len(X_train) // 32 # 
number of steps in 20 epochs (batch size = 32)\n", "exp_decay = ExponentialDecay(s)\n", "history = model.fit(X_train_scaled, y_train, epochs=n_epochs,\n", "                    validation_data=(X_valid_scaled, y_valid),\n", "                    callbacks=[exp_decay])" ] }, { "cell_type": "code", "execution_count": 85, "metadata": {}, "outputs": [], "source": [ "n_steps = n_epochs * len(X_train) // 32\n", "steps = np.arange(n_steps)\n", "lrs = lr0 * 0.1**(steps / s)" ] }, { "cell_type": "code", "execution_count": 86, "metadata": { "scrolled": true }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.plot(steps, lrs, \"-\", linewidth=2)\n", "plt.axis([0, n_steps - 1, 0, lr0 * 1.1])\n", "plt.xlabel(\"Batch\")\n", "plt.ylabel(\"Learning Rate\")\n", "plt.title(\"Exponential Scheduling (per batch)\", fontsize=14)\n", "plt.grid(True)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Piecewise Constant Scheduling" ] }, { "cell_type": "code", "execution_count": 87, "metadata": {}, "outputs": [], "source": [ "def piecewise_constant_fn(epoch):\n", "    if epoch < 5:\n", "        return 0.01\n", "    elif epoch < 15:\n", "        return 0.005\n", "    else:\n", "        return 0.001" ] }, { "cell_type": "code", "execution_count": 88, "metadata": {}, "outputs": [], "source": [ "def piecewise_constant(boundaries, values):\n", "    boundaries = np.array([0] + boundaries)\n", "    values = np.array(values)\n", "    def piecewise_constant_fn(epoch):\n", "        # argmax finds the first boundary above `epoch`; if there is none it\n", "        # returns 0, so the index becomes -1 and selects the last value\n", "        return values[np.argmax(boundaries > epoch) - 1]\n", "    return piecewise_constant_fn\n", "\n", "piecewise_constant_fn = piecewise_constant([5, 15], [0.01, 0.005, 0.001])" ] }, { "cell_type": "code", "execution_count": 89, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 1.1511 - accuracy: 0.7326 - val_loss: 0.8456 - val_accuracy: 0.7410\n", "Epoch 2/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.7371 - accuracy: 0.7786 - val_loss: 0.6796 - val_accuracy: 0.8092\n", "Epoch 3/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.8055 - accuracy: 0.7700 - val_loss: 1.7429 - val_accuracy: 0.4514\n", "Epoch 4/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 1.0351 - accuracy: 0.6826 - val_loss: 0.9870 - val_accuracy: 0.6928\n", "Epoch 5/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.9185 - accuracy: 0.7098 - val_loss: 
0.8727 - val_accuracy: 0.6932\n", "Epoch 6/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.6905 - accuracy: 0.7481 - val_loss: 0.6694 - val_accuracy: 0.7696\n", "Epoch 7/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.6115 - accuracy: 0.7713 - val_loss: 0.6956 - val_accuracy: 0.7306\n", "Epoch 8/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.5791 - accuracy: 0.7793 - val_loss: 0.6659 - val_accuracy: 0.7738\n", "Epoch 9/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.5622 - accuracy: 0.7881 - val_loss: 0.7363 - val_accuracy: 0.7850\n", "Epoch 10/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.5253 - accuracy: 0.8470 - val_loss: 0.5484 - val_accuracy: 0.8578\n", "Epoch 11/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.4401 - accuracy: 0.8694 - val_loss: 0.6724 - val_accuracy: 0.8602\n", "Epoch 12/25\n", "1719/1719 [==============================] - 4s 3ms/step - loss: 0.4334 - accuracy: 0.8732 - val_loss: 0.5551 - val_accuracy: 0.8504\n", "Epoch 13/25\n", "1719/1719 [==============================] - 5s 3ms/step - loss: 0.4179 - accuracy: 0.8771 - val_loss: 0.6685 - val_accuracy: 0.8554\n", "Epoch 14/25\n", "1719/1719 [==============================] - 5s 3ms/step - loss: 0.4300 - accuracy: 0.8775 - val_loss: 0.5340 - val_accuracy: 0.8584\n", "Epoch 15/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.4069 - accuracy: 0.8777 - val_loss: 0.6519 - val_accuracy: 0.8478\n", "Epoch 16/25\n", "1719/1719 [==============================] - 5s 3ms/step - loss: 0.3349 - accuracy: 0.8953 - val_loss: 0.4801 - val_accuracy: 0.8778\n", "Epoch 17/25\n", "1719/1719 [==============================] - 4s 3ms/step - loss: 0.2695 - accuracy: 0.9109 - val_loss: 0.4880 - val_accuracy: 0.8786\n", "Epoch 18/25\n", "1719/1719 [==============================] - 5s 3ms/step - loss: 0.2568 - accuracy: 0.9136 
- val_loss: 0.4726 - val_accuracy: 0.8822\n", "Epoch 19/25\n", "1719/1719 [==============================] - 5s 3ms/step - loss: 0.2436 - accuracy: 0.9203 - val_loss: 0.4792 - val_accuracy: 0.8842\n", "Epoch 20/25\n", "1719/1719 [==============================] - 4s 3ms/step - loss: 0.2421 - accuracy: 0.9212 - val_loss: 0.5088 - val_accuracy: 0.8838\n", "Epoch 21/25\n", "1719/1719 [==============================] - 4s 3ms/step - loss: 0.2288 - accuracy: 0.9246 - val_loss: 0.5083 - val_accuracy: 0.8830\n", "Epoch 22/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.2215 - accuracy: 0.9270 - val_loss: 0.5217 - val_accuracy: 0.8846\n", "Epoch 23/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.2106 - accuracy: 0.9297 - val_loss: 0.5297 - val_accuracy: 0.8834\n", "Epoch 24/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.2002 - accuracy: 0.9334 - val_loss: 0.5597 - val_accuracy: 0.8864\n", "Epoch 25/25\n", "1719/1719 [==============================] - 4s 2ms/step - loss: 0.2005 - accuracy: 0.9350 - val_loss: 0.5533 - val_accuracy: 0.8868\n" ] } ], "source": [ "lr_scheduler = keras.callbacks.LearningRateScheduler(piecewise_constant_fn)\n", "\n", "model = keras.models.Sequential([\n", "    keras.layers.Flatten(input_shape=[28, 28]),\n", "    keras.layers.Dense(300, activation=\"selu\", kernel_initializer=\"lecun_normal\"),\n", "    keras.layers.Dense(100, activation=\"selu\", kernel_initializer=\"lecun_normal\"),\n", "    keras.layers.Dense(10, activation=\"softmax\")\n", "])\n", "model.compile(loss=\"sparse_categorical_crossentropy\", optimizer=\"nadam\", metrics=[\"accuracy\"])\n", "n_epochs = 25\n", "history = model.fit(X_train_scaled, y_train, epochs=n_epochs,\n", "                    validation_data=(X_valid_scaled, y_valid),\n", "                    callbacks=[lr_scheduler])" ] }, { "cell_type": "code", "execution_count": 90, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.plot(history.epoch, [piecewise_constant_fn(epoch) for epoch in history.epoch], \"o-\")\n", "plt.axis([0, n_epochs - 1, 0, 0.011])\n", "plt.xlabel(\"Epoch\")\n", "plt.ylabel(\"Learning Rate\")\n", "plt.title(\"Piecewise Constant Scheduling\", fontsize=14)\n", "plt.grid(True)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Performance Scheduling" ] }, { "cell_type": "code", "execution_count": 91, "metadata": {}, "outputs": [], "source": [ "tf.random.set_seed(42)\n", "np.random.seed(42)" ] }, { "cell_type": "code", "execution_count": 92, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.7116 - accuracy: 0.7769 - val_loss: 0.4869 - val_accuracy: 0.8478\n", "Epoch 2/25\n", "1719/1719 [==============================] - 2s 947us/step - loss: 0.4912 - accuracy: 0.8390 - val_loss: 0.5958 - val_accuracy: 0.8270\n", "Epoch 3/25\n", "1719/1719 [==============================] - 2s 987us/step - loss: 0.5222 - accuracy: 0.8379 - val_loss: 0.4869 - val_accuracy: 0.8584\n", "Epoch 4/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.5061 - accuracy: 0.8467 - val_loss: 0.4588 - val_accuracy: 0.8548\n", "Epoch 5/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.5216 - accuracy: 0.8469 - val_loss: 0.6096 - val_accuracy: 0.8300\n", "Epoch 6/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.4984 - accuracy: 0.8546 - val_loss: 0.5359 - val_accuracy: 0.8498\n", "Epoch 7/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.5104 - accuracy: 0.8579 - val_loss: 0.5457 - val_accuracy: 0.8522\n", "Epoch 8/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.5375 - accuracy: 0.8538 - val_loss: 0.6445 - val_accuracy: 0.8218\n", 
"Epoch 9/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.5333 - accuracy: 0.8522 - val_loss: 0.5472 - val_accuracy: 0.8560\n", "Epoch 10/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.3280 - accuracy: 0.8902 - val_loss: 0.3826 - val_accuracy: 0.8876\n", "Epoch 11/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2410 - accuracy: 0.9135 - val_loss: 0.4025 - val_accuracy: 0.8876\n", "Epoch 12/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2266 - accuracy: 0.9180 - val_loss: 0.4540 - val_accuracy: 0.8694\n", "Epoch 13/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2129 - accuracy: 0.9221 - val_loss: 0.4310 - val_accuracy: 0.8866\n", "Epoch 14/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.1959 - accuracy: 0.9270 - val_loss: 0.4406 - val_accuracy: 0.8814\n", "Epoch 15/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.1975 - accuracy: 0.9277 - val_loss: 0.4341 - val_accuracy: 0.8840\n", "Epoch 16/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.1409 - accuracy: 0.9464 - val_loss: 0.4220 - val_accuracy: 0.8932\n", "Epoch 17/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.1181 - accuracy: 0.9542 - val_loss: 0.4409 - val_accuracy: 0.8948\n", "Epoch 18/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.1124 - accuracy: 0.9560 - val_loss: 0.4480 - val_accuracy: 0.8898\n", "Epoch 19/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.1070 - accuracy: 0.9579 - val_loss: 0.4610 - val_accuracy: 0.8932\n", "Epoch 20/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.1016 - accuracy: 0.9606 - val_loss: 0.4845 - val_accuracy: 0.8918\n", "Epoch 21/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.0848 - accuracy: 0.9686 - val_loss: 0.4829 - 
val_accuracy: 0.8934\n", "Epoch 22/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.0792 - accuracy: 0.9700 - val_loss: 0.4906 - val_accuracy: 0.8952\n", "Epoch 23/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.0751 - accuracy: 0.9720 - val_loss: 0.4951 - val_accuracy: 0.8950\n", "Epoch 24/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.0687 - accuracy: 0.9739 - val_loss: 0.5109 - val_accuracy: 0.8948\n", "Epoch 25/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.0683 - accuracy: 0.9752 - val_loss: 0.5241 - val_accuracy: 0.8936\n" ] } ], "source": [ "lr_scheduler = keras.callbacks.ReduceLROnPlateau(factor=0.5, patience=5)\n", "\n", "model = keras.models.Sequential([\n", "    keras.layers.Flatten(input_shape=[28, 28]),\n", "    keras.layers.Dense(300, activation=\"selu\", kernel_initializer=\"lecun_normal\"),\n", "    keras.layers.Dense(100, activation=\"selu\", kernel_initializer=\"lecun_normal\"),\n", "    keras.layers.Dense(10, activation=\"softmax\")\n", "])\n", "optimizer = keras.optimizers.SGD(learning_rate=0.02, momentum=0.9)\n", "model.compile(loss=\"sparse_categorical_crossentropy\", optimizer=optimizer, metrics=[\"accuracy\"])\n", "n_epochs = 25\n", "history = model.fit(X_train_scaled, y_train, epochs=n_epochs,\n", "                    validation_data=(X_valid_scaled, y_valid),\n", "                    callbacks=[lr_scheduler])" ] }, { "cell_type": "code", "execution_count": 93, "metadata": {}, "outputs": [ { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAdMAAAEeCAYAAADRiP/HAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAABeyElEQVR4nO2deXxU1fXAvycJhE1URFEgJKKCAlVwrxuobbWL1aptVWzVVqm4VG2rtT+1omhbrbW1dasWV3Cvu9W6Je67ghJUFCVUCbsiYSec3x/nDXmZzPIms2U538/nfTJz373n3fcY5sy59yyiqjiO4ziO03pKij0Bx3Ecx2nvuDJ1HMdxnCxxZeo4juM4WeLK1HEcx3GyxJWp4ziO42SJK1PHcRzHyRJXpk6nQ0QaROT4Ys+joyEis0XkN8Weh+MUA1emTptERG4REQ2OdSIyR0SuE5FNiz23XCAiY4J765vk/ITQ/a8XkbkiMkVEKgo912A+x4fmoyJSLyL3iMjWWcpsyOU8HadYuDJ12jJPA1sBVcCJwCHAtcWcUIH5ELv/gcCPga8B9xRxPiuC+fQHjgFGAg+LSGkR5+Q4bQJXpk5bZrWqzlPVz1T1SeBu4FvhDiJygojMEJFVIjJTRM4SkZLQ+W1FpCY4/6GIfC9ufFVgae0a164icmToff/AMlwsIitEZKqI7B86f4iIvBVc51MRuVREumZ5/+uC+5+rqi8ANwJ7ikjvVINE5HAReU9EVovI/0TkPBGR0PnZInK+iPxTRL4Skc9E5OwI89FgPvWqWg1cBIwAtk0yj1+JyLsislxEPheRf4nIJsG5McDNQM+QtTshONdVRC4L5rVcRN4QkYNCcktFZFLwnFeKyEcick7cv/stIvJo3HwmiMj0CPfpOBlTVuwJOE4URGQwcDCwNtR2EnAxcDrwFvbFfmPQ5+rgy/UB4Avg60AP4CqgPMNr9wSeAxYAPwA+B3YKnT8ImAKcATwPDAKuD66Tkz1EEdkSOBxoDI5k/XYB7gUuCea0G/BP4CvgH6GuZwEXAn8Gvg38XUReVNVXMpjWyuBvlyTn1wNnAp8AlcH1/wH8BHg5OPcHYJugf2zJ9+ag7RjgM+A7wCMispuqTsOMgM+BHwELgd2BG4DFwKQM5u84uUNV/fCjzR3ALcA67At2JaDBcVaozxzgJ3HjzgRmBK+/hSmeQaHz+wRyjg/eVwXvd42To8CRweuTgGVA3yRzfR64IK7tsGDukmTMmOAayWROCObegC2vxu7/qjTPbQrwbAJZn4XezwbujOvzEXB+CrnHAw2h9wOBV4D/AV1Dcn+TQsbBwGqgJJHMoG0bTAkPimt/ELg2hew/AU/HfX4eTfAcphf7s+1HxzzcMnXaMs8D44DumELbBvg7gIhsDlQA/xSR60JjyoDYkuYOwOeqOid0/jXsyzoTRgHvquqiJOd3AXYXkd+G2kqCeW8J1Gd4vRizMKusHDgUOAL4vzRjdgAei2t7EbhQRHqr6ldB27txfeYCW6SR3TNwGBLMyn8bOFxV1yTqLCIHAL8L5rQxUAp0xZ7J3CTX2DmQPyO0Mg32DJ4NyT4Z20evxJ5zF6AuzfwdJ2+4MnXaMitU9ePg9S9FpBq4ALMwYvtjJ2NLhomQJO1hYoo1vKcYv2yZTk4Jtn94b4JzCyPMIRlrQvdfKyLbAddgFl0yBLNgExFuX5vgXDofihWY09F6YL6qLk86CZFKTKnfCPweW4LdGbgTU6jJKAnmsluCOa4MZP8Y+Bu2hP4ytoR9KrYEH2M9Lf/dki1HO07WuDJ12hMXAY+LyA2qOldEPge2UdXbkvSfAQwQkQpV/V/QtjvNlUZM2W0VahsZJ+dt4FgR6ZvEOn0b2D6k+PLFROBDEfmHqr6VpM8MbCk7zD7YMu+yLK+vGdzjrpjSPEtVGwHinb+ANZi1GuYdTAluqebklIh9gNdU9epYg4hsE9dnIS3/HePfO07OcG9ep92gqjVALXB
+0DQBOCfw4B0qIiNE5Kci8rvg/NPAB8BtIjJSRL4O/BXbi43JXAm8CvxWRIaLyF7AFXGXvgNzPnpQRPYVka1F5Pshb96LgWNE5OJgDtuLyJEicnmE2xoRzC18JPx/qaqfAA9jSjUZfwFGB56rQ0RkLPBrIMpccslH2PfLmcHzOhrbzw4zG+gmIt8Ukb4i0kNVZ2L7vrcEz3CwiOwqIr8RkcODcTOBnUXk2yKynYhcAIyOk/0sMEpEfibm0X0OsHee7tVx3AHJj7Z5kMCBJGg/BnNiqQzeH41Zhqswr90XgaNC/YdgnrirsS/472NOPceH+uwAvIQtY74H7EvIASnoMxALzfky6PcOMCZ0/lvAC8G5r4A3gdNS3N8YmpyK4o9eJHGWAfYK+uyVQvbhwX2swRyEziPkCEUCRyGgBrg6hczjiXMWStCnmVzgl5jX7UrgGcz7VoGqUJ/rgEVB+4SgrUtw/58E9zAP+xGxS3C+K+a1+0Xw7zEJW0qeHTefCdh+9VIsPvkPiZ6pH37k4hDVZNsrjuM4juNEwZd5HcdxHCdLXJk6juM4Tpa4MnUcx3GcLHFl6jiO4zhZ4nGmESkpKdHu3bsXexptjvXr11NS4r/J4vHn0hJ/Jonp6M9lxYoVqqod9wYDXJlGpGvXrixfnjThS6elpqaGMWPGFHsabQ5/Li3xZ5KYjv5cRGRl+l7tnw7/a8FxHMdx8o0rU8dxHKe4iPRB5AFEliNSh8gxKfoORuRRRJYhsohwpjGRGkRWIdIQHB8WYvrgytRxHMcpPtdg2a76AWOB6xAZ3qKXSFfgKSxd5JZYZrLJcb1OQ7VXcAzN66xDuDJ1HMdxiodIT6y84AWoNqD6IpY+8icJeh8PzEX1SlSXo7oK1fhygkXBlanjOI6TN/pCGSJvho5xcV2GAI1YkYMY04CWlinsCcxG5PFgibcGka/F9fljcO4lRMbk7EbSUFBlKkIfER4QYbkIdSIkXRcX4SwR5omwVISbRCgP2stFmBSMXybCOyJ8O27sgSJ8IMIKEapFqAydExEuE2FxcFwukr7u5erVpVRVwZQp0e51yhSoqoKSEjr0uAMOGF2w6+02sJ7nZDS7VczL+/Ucx8kNi2AdqruGjhviuvTCihGEWQpslEDcQOAo4O9Af6xm7kPB8i/Ab4HBwADgBuARWpbnyw+FzKoPeifo3aC9QPcBXQo6PEG/g0Dngw4H3RS0BvRPwbmeoBNAq0BLQL8Hugy0KjjfN5D7Q9BuoH8GfTUk+xegH4IOBB0AOgP05PRz76Gg2qOH6uTJmpLJk60fNB0+Lvtx1zBe11GiV3NKXq+XC6qrq/N/kXaGP5PEdPTnAizXVN+vMEphRVzbrxUeSdD3IYXq0HtRWKqwUxLZTyicnvL6OToKFmcqQmxdfIQqDcCLIhvWxc+N634cMEmV2mDsRKzG4bmqLMdKK8V4VIRPgV2wElCHA7Wq3BuMnQAsEmF7VT4IZP9Flc+C838BTgKuj3IfK1bAqafChyl8xP7+d+vn43I3rveKen7OvyhlPSdwMxNXXMCpp27Zquuddx6MHZt8nOM4BWUmthS8HaofBW07YbWL43mXzOrSKqRfecwFBSvBJsIo4GVVuofafgOMVuWQuL7TgD+ocnfwvi+wEOiryuK4vv2AOmCkKh+IcBXQVZXxoT7TgQtV+bcIS4FvqfJacG5XoFq15ZKCCOOAYH2/5y4QS9qgSIp/HnukiTr4uNaOu4mfcQK3ALCKrkziRE7j6lZdT0R59tnnkg/MAQ0NDfTq1Suv12hv+DNJTEd/Lvvvv/8KVe2ZspPIXZjiOxEYCfwH2AvV2rh+Q7Fawt8HqrGauadhNYl7AHtg9YvXAT/Glnp3RjX/ITKFMH9NYeu+oPPi2k4CrUnQdxbowaH3XYJluqq4fl1Anwb9Z6htUmxJONT2EujxwetG0O1D57YLZEvq+ffYsFRYWZlkPSOgslKbLS36uOzG7Tpgrq6ia7N
By+muuw6sz8v1ckFHX7prDf5MEtPRnwvplnlVUeij8KDCcoU5CscE7YMUGhQGhfoervCxwlcKNQrDg/bNFd5QWKbwpcKrCt9Me+0cHYV0QGoAese19QaWRegbe72hrwglwO1YbNJpGVwnkewG+7dIT48ecOmlqftceqn183G5GTdl+4mU0tisrYRGJg+dmJfrOY5TYFSXoHoYqj1RHYTqHUH7HCxedE6o7/2obotqb1THELNeVReiuhuqG6G6Cap7ovpUAe+hYJZpT9A1oNuF2m6LtyKD9jtALw29PyBs1YIK6M2g1aDd48aOA30p7rorYtYo6MugJ4XO/4yQg1Ly+ffQysroziuTJ5sFJKIdfNz6/F9v5MjEJubIkZGu1717k0VaCOcj1Y5vbbQGfyaJ6ejPhSiWaQc4Cnsx9C7Mo7cn6N4k9+Y9GHQe6DDMm/fZsNIFvR70VdBeCcZuHsg9AvPmvYzm3rwng76PefL2B60lgjdveXl58k9LJ6ZgXwQ77KDar599ZB95JKOhZ56p2rOn6vr1eZpbAjr6F2Rr8GeSmI7+XDqLMi100oZTgO7AAuBOYLwqtSIMEqFBhEEAqjwBXI5tMNcFx4UAQczoL7BN6nnBuAYRxgZjF2Jew5cCX2Ab0keF5vBP4BHgPWA6Fqf0z3zetJMlK1aYu+8xx4AITJ2a0fCqKli+HBYvTtvVcRynVRS0BJsqS4DDErTPwQJ3w21XAlcm6FtHGldnVZ4Gtk9yToFzgsNpD0yfDuvXw777wqOPwjvvZDS8stL+1tVB3755mJ/jOJ0eTyfotH1ilujIkXa0wjIFmD07ZzNyHMdphitTp+0zdSr07m1acdQo+OQTWBqffSw5rkwdx8k3rkydts/UqWaRitjfWFtENtnEdHFdXe6n5jiOA65MnbbO+vXw7rtNSnTUKPvbiqVet0wdx8kXrkydts2sWeaKG1OmW25pRyuckFyZOo6TL1yZOm2bsPNRjJEjM1amVVW2zKuFSUXtOE4nw5Wp07aZOhXKymDYsKa2UaNgxgxYvTqymKoq+Oor+PLLXE/QcRzHlanT1pk61RRpeXlT26hRsG4d1Caq0JSYcKyp4zhOrnFl6rRtpk6FnXZq3hZb8s1gqdfDYxzHySeuTJ22y4IFMHdu8/1SgG22gV69MvLodWXqOE4+cWXqtF2mTbO/8cq0pMSs1Qws0z59oGdPX+Z1HCc/uDJ12i4xyzN+mRds33TaNItDjYCIx5o6jpM/XJk6bZepU6GiAjbbrOW5UaOgocHiUCNSWemWqeM4+cGVqdN2iaURTEQrnZDcMnUcJx+4MnXaJitXwgcfJFemw4db/GmGTkhffGHxpo7jOLnElanTNqmttf3QZMq0vNwUagaWqceaOo6TL1yZOm2TRGkE48kwraCHxziOky9cmTpQXw+jR8O8ecWeSRPhGqbJGDUK5s+PPO+YZerK1HGcXFNQZSpCHxEeEGG5CHUiHJOi71kizBNhqQg3iVAeOneaCG+KsFqEW+LGjRWhIXSsEEFF2CU4P0GEtXF9BuftptsD//d/8OKLMHFisWfSRCzzUUmKj2isHFtE63SLLaBbN1/mdRwn9xTaMr0GWAP0A8YC14kwPL6TCAcB5wIHAlXAYOCiUJe5wCXATfFjVZmiSq/YAZwCfAK8Hep2d7iPKp/k5O7aI/X1cOuttj95881twzpdv95iSFMt8UJT/GlEZeqxpo7j5IuCKVMRegJHABeo0qDKi8DDwE8SdD8OmKRKrSpfABOB42MnVblflQeBxREufRxwmypefCsRF13UVJessbFtWKeffGIxpImSNYTZeGMYPDgjj16PNXUcJx+UFfBaQ4BGVWaG2qYBoxP0HQ48FNevnwibqUZSoACIUAnsB/ws7tQhIiwB6oGrVbkuyfhxwDiAsjKhpqYm6qXbBV0XL2bPm25q+kW1Zg2Nkybx2oEHsqZPn0gyGhoacv5cNn/uOYYDb65bR0Ma2cM
HDKDnyy/zesQ5dO06hI8+6ktNzctZzzMV+Xgu7R1/Jonx59JBUNWCHKD7gs6LazsJtCZB31mgB4fedzHzSavi+l0CekuKa14QLx90GGh/0FLQvUDrQY9ON//y8nLtcIwfr1pWpho8XAXVrl1VTzklsojq6urcz+u881RLS1VXrkzfd+JEm/dXX0US/Yc/WPeGhiznmIa8PJd2jj+TxHT05wIs1wLpmWIehdwzbQB6x7X1BpZF6Bt7nahvKn4K3BpuUGWGKnNVaVTlZeAq4MgM5XYMXnnF6oKGWbMGXs6v1ZaWqVNhhx3MWygdMSekWFL8NMScg32p13GcXFJIZToTKBNhu1DbTkCiCs+1wblwv/kZLvHuDfQH7kvTVQGJKrdD8c47MGGCvf7BD6B/f3P+ySB2My+kSiMYT4ZpBT3W1HHaICJ9EHkAkeWI1CGSNNIDkcGIPIrIMkQWIXJ5q+TkmIIpU1WWA/cDF4vQM1B2hwK3J+h+G/BzEYaJsClwPjSFwIhQJkI3oBQoFaGbSIv93+OAf6s2t2ZFOFSETUUQEXYHfknz/dnORV0dbLUVHHSQ1Q79+OPizmfRIvj88+jKtH9/2HzzyE5IHmvqOG2SFpEeiLSI9ECkK/AU8CywJTAQmJyxnDxQ6NCYU4DuwALgTmC8KrUiDAriPQcBqPIEcDlQDdQFx4UhOecDK7HwmWOD1+fHTgaK9kfELfEGHAV8jC0Z3wZcppqwX+dg9mwz18aMsffFdoRIVsM0GSK21BvRMt1yS+ja1Zd5HafNILIh0gPVBlRTRXocD8xF9UpUl6O6CtV3WyEn5xRUmaqyRJXDVOmpyiBV7gja56jFe84J9b1SlX6q9FblBFVWh85NUEXijgmh86tU2USVZxLM4WhVNguut70qf8/bDbfFzELx1NWZuTZkiGmaYivTVDVMkzFyJEyfbvu9aSgpsdt1y9RxCkNfKEPkzdAxLq7LEKAR1fhIj0QW5Z7AbEQeD5Z4axD5Wivk5BxPJ5hPJk5se5mFwjQ2wv/+Z9pFxKzTmpqmuNNiMHUqDBwIfftGHzNqFKxdC++/H6m7x5o6TuFYBOtQ3TV03BDXpRewNK5tKbBRAnEDsdXFv2M+MY8BDwXLv5nIyTmuTPNFfT38619tK7NQPPX1poRiXjn771/8fdNMnI9iZJhW0LMgOU6bIpNIj5XAi6g+juoa4ApgM2CHDOXkHFem+WLiRFNU0HYyC8UTM89iXjnF3jddtcqsy0yV6bbbQo8ekZVpZaXlx1+5MvMpOo6Tc2ZiS8FRIj3ehaTZ7DKRk3NcmeaD+nqzRmOsWdM2rdN4ZbrddubZW11dnPnU1toPj0z2SwFKS21MRI/emCE+Z07Kbo7jFALVDZEeiPREJFWkx2RgT0S+gUgpcCawCHg/Qzk5x5VpPpg40ZRCmLZoncbWOmPKtNj7plFqmCZj5Egbv3592q4ea+o4bY4WkR6o1iIyCJEGRAYBoPohFsFxPfAFpiy/Hyz5JpdTAFyZ5oNXXmla4o3RFjILxVNXZ44+PXs2tY0ZY5b1Rx8Vfj5Tp0KvXpa8PlNGjYKvvoqkIT3W1HHaGKpLUD0M1Z6oDkL1jqB9Dqq9UJ0T6ns/qtui2hvVMc2UZTI5BcCVaT545x04/3xbfuzWDc480yy9YmcWiicWFhOmmPumUWqYJiMDJ6T+/aGszD16HcfJHa5M80VtrTnGDBtmr9sisYQNYWL7poVWplFrmCZjxAj78RJBmZaWQkWFW6aO4+QOV6b5orbWvuBHjGibylTVPHDiLdNi7ZvOng3LlrVemXbrZsnxM3BCcsvUcZxc4co0H6xaZbGaw4fbMXcufPFFsWfVnIULLTYk3jKF4uybZuN8FCODtIIea+o4Ti5xZZoPPvjAli1jlim0Pes03pM3zP77299CLvVOnWrrr8OzyPw1cqT9cFmwIG3Xykrrunp12q6O4zhpcWWaD6ZPt78xyzTc1laIjzENs+225qV
TyHjTqVNh++2he/fWy4g5IUVY6o0Z5P/7X+sv5ziOE8OVaT6orYUuXcyZZ9AgC/doT5ZpMfZNW5NGMJ5YsocIS70eHuM4Ti5xZZoPpk+HoUNNoYqYddoWLdONN4ZNNkl8fswYy9g0c2bi87lk8WIzEbNVpn36mJaMoEw9cYPjOEkR6ZLpEFem+SDmyRujLXr0JooxDVPIeNNMa5imYtSoSMu8AwfaFq179DpOJ0fkl4gcEXo/CViJyIeIDI0qxpVprmlogE8/be5IM3y4ec9GcIwpGIliTMPE9k0LoUxbU8M0GSNHmjXd0JCyW1kZDBjglqnjOPwSWAiAyH7Aj4BjgKnAX6IKcWWaa2I1NeOVKbQd61Q1vWVayH3TqVNNcW++efayRo2y+b73XtquHmvqOA4wAJgdvD4EuBfVe4AJWDHySLgyzTWxvdH4Zd7wuWLz5ZeWICGVMoXC7ZvmwvkoRgZpBT3W1HEc4Csg9kv+m8Azweu1QLeoQiIrUxG+LcKjIswQoSJoO1GEA6PK6BTU1lo2nnCy9q22MkeftmKZxjRIqmVeKMy+6erVrathmoyBA80RKaJH7+eft6xJ4DhOp+JJ4MZgr3Rb4PGgfTjwaVQhkZSpCGOBe4CPgK2BmKdTKXBO1IuJ0EeEB0RYLkKdCMek6HuWCPNEWCrCTSKUh86dJsKbIqwW4Za4cVUiqAgNoeOC0HkR4TIRFgfH5SJI1HtIS22tpbUrLW02qTblhJQqxjTMttvaxmI+401nzIB163KnTEUiOyFVVVlujc8+y82lHcdpl5wKvAT0BY5EdUnQvjNWxi0SUS3Tc4CTVDkLWBdqfxUYGfViwDXAGqAfMBa4ToQWKW9EOAg4FzgQqAIGAxeFuswFLgFuSnGtTVTpFRzhQqLjgMOwCuw7At8DfpHBPaRm+vTEWXxi4THFqBMaT0yZprNMC7Fvmos0gvGMGmV7pmlMTo81dRwH1a9QPR3VQ1F9ItR+Iap/iComqjLdDnglQXsD0DuKABF6AkcAF6jSoMqLwMPATxJ0Pw6YpEqtKl8AE4HjYydVuV+VB4HFEecfL/svqnymyueYt9bxqYdEZOlSM3PC+6UxRoywvcr6+pxcKitmz4YePWCzzdL3HTMG5s+HDz/Mz1ymTrV6qttskzuZI0fa8vEHH6TsFvst4U5IjtOJERnWLARG5JuITEbkd4iUphjZjLKI/eYCQ4D4r539gFkRZQwBGlUJe7NMA0Yn6DsceCiuXz8RNlONrEDrRFDgKeBsVRaFZE+Lk50wIawI4zBLlrIyoSbN3mHv6dPZGXhv/XoWx/XdZO1aRgLTpkzhi912i3gL+WH4W2/RY/PNeeO559L27d69O3sAM2+4gbnf/36L8w0NDWmfSypG1tQgVVW88/zzrZYRT4+1a9kdeP/OO5m/OPnHZe1aQWQ/amrqqKqanbPrQ/bPpSPizyQx/lyKziTgKuBDRAZiuqcGW/7tDfwukhRVTXuAngP6PujeoMtAR4MeB7oQ9NSIMvYFnRfXdhJoTYK+s0APDr3vYuuMWhXX7xLQW+LaeoHuCloG2g/0PtD/hs43gm4fer9dIFtSzb+8vFzTcsMNqqD66actzy1YYOeuvDK9nHyz886q3/52tL7r16sOGKD64x8nPF1dXd36eaxfr9q7t+opp7ReRiLWrlXt1k31rLPSdh0wQPW443J7edUsn0sHxZ9JYjr6cwGWawQdUbQDvlQYErw+S6E6eL2/wuyociJZpqpcLsLGmJXXDagGVgNXqHJNNOWfcEm4N7AsQt/Y60R94+faALwZvJ0vwmlAvQi9VfkqiewGe3ZZMn26LVkOGtTy3Oab29EWwmNmz4bdd4/WN7Zv+vTT9ptDcuerxezZ8NVXud0vBcvIsOOOkcNjfJnXcTo1pZgvD5ifzn+C17Mw/55IRA6NUeU8zNtpdyyQdXPVJi/ZCMwEykTYLtS2E5DIxbU2OBfuNz+DJd4wMSUZ0wKJZOf
Gzba2FoYNg5Ikj7UtePQ2NMCSJek9ecPka980H85HMWIevWkcpyor3QHJcTo504HxiOyLKdOYE9IA2LA9mJaooTE3ibCRKitUeVOV11VpEKGnSEqP2g2oshy4H7g4GLc3cChwe4LutwE/F2GYCJsC50NTCIwIZSJ0w35RlIrQTcSsbBH2EGGoCCUibAb8HahRZWlI9q9EGCBCf+DXYdlZMX16YuejGMOHmzLNl2dsFKJ68obJV7zp1Kn2wyPVM2stI0eaw1cas7OqynLsr1uXspvjOB2X3wInYfukd6IaS5/2feD1qEKiWqbHAYkKTXYHfhr1YsApwZgFWPzOeFVqRRgUxIMOAlDlCeBybDm5LjguDMk5H1iJhc8cG7w+Pzg3GPtlsQz7xbEaODo09p/AI8B7wfnHgrbsWLzYrLdUxa2HDzfLcM6crC/XalKVXkvGNtvkJ9506lSrrpNNDdNkRKxtWlUFjY1WKNxxnE6I6vNYBqS+qP4sdOafwPioYlLumYrQB1seFWBTkWYxpqXAd4H50efMEizGM759DtArru1K4MokciZgeRMTnbuTFIG2wd7oOWSQbCISseXbVFZW7FxtbWbKLJdETdgQRgT23x+efDK3+6ZTp8Lee+dGVjxf+5pZve+8A4cdlrRbONY00Va34zidANVGRFYiMgLbGpyF6uxMRKSzTBdhVqQCM7DM+rFjHvAv4NoMp90xiTkWpbNMw32LwezZ0LUrbLllZuPGjLGqN2liNyOzZIlZ6PnYLwWLox06NK0TkseaOk4nR6QMkT8DX2Chku8BXyByeSZ1TdN58+6PWaXPYgkXloTOrQHqVPEFMjBrc+ONbTk0GZtuatVRiumEVFdnJlgyJ6lkhPdNd9gh+3m8+679zZcyBVvqfeGFlF1i1qg7ITlOp+VybCvwZODFoG1f4I+YwfmbKEJSKlNVngMQYWvgf6qsb+1sOzyxNILplkBjaQWLRbrSa8kYPNiSyNfUwPjI2wjJyWUN02SMGgV33GH72UmyPXXrZka6K1PH6bQcA/wM1f+E2mYhshBbfY2kTCOZJ6rUqbJehP4i7CnCfuEj87l3MFTN2ozilTpihFVJaWzM/7wSka4oeDJynad36lSrptMvchhX5sSs3ghLvb7M6zidlo1JnMlvFrBJVCFRQ2P6i1ADfIZl16/BPG1jR+dm/nyzflLtl8YYPhxWroRPI1f2yR2rVtlcW+v8lMt901zWME1GTH4aj16PNXWcIiPSB5EHEFmOSB0iiSuKiRyPSCMiDaFjTOh8DSKrQueiBMdPA36ZoP0MYGrUW4i6cfY3oBEYBqzA1pN/CLwPHBz1Yh2WKJ68McIevYUmFpLTGssUchdvumaNlV7LtzLt29eWpiNYpnPmWDk2x3GKQouKYogks05eQbVX6KiJO39a6NzQRALiOAc4DpGZiNyKyC2BEj4WODvqDURVpqOB36ryAebZu1CV+7Fg14kpR3YGonjyxhg2rPmYQtKaGNMwsX3TbONNZ8yw8mj5VqYQqbZpVZVNpy0U9HGcTofIhopiqDagmqqiWO6xONMhwL1YiGbv4PXQYC6RiKpMu9OUVmkJsEXwegZWE7RzU1trDi5bbJG+70YbmTIrhmXamhjTMLnaN81nGsF4Ro2yPep994V58xJ28bqmjlNUhgCNqMZXFEtmnYxCZFFgSV6ASLwj7R+D8y81WwJOhepcVM9D9QhUD0f1fKALIvdEvYmoyvQDYPvg9VTgZBEqsRI1n0e9WIcl5nwUNZlBLK1goamrg9LS1OE76dh/f1i40BRUa3npJQvN6dmz9TKiMnKkKf6XXoKJiRdRPNbUcfJHXyhD5M3QMS6uSy/YkO41xlJgowTingdGYAbdEVhIS3gp9rdYFrwBwA3AI4i0tljyJsE1IhFVmV4FxKL8Lwa+BXyCpQf8vwwm1/FQbQqLicrw4ebEU+iEsLNn2zJtWdQytgnIxb7pY4/ZBuUfIhexbz2xHw6
qcPPNCa1TjzV1nPyxCNahumvouCGuS/SKYqqfoPopquuDHLoXA0eGzr+G6jJUV6N6K+Yw+51c3k8yoobGTFG1ZPCqvA1UAbsBg1S5N2+zaw98/rmVEcskWfuIEeaE8/HH+ZtXIlobYxpm662hoqJ1ynTdOrjiiqbNySTKLafcFKrD0NiY0Drt2dOq47kydZyiMBOzXqNUFItHaaoI1przOSPDNDhGUD3mbWC5COfmeE7ti0ycj2IUK61gXV3rPXljtGbfVBX+/W/7EXH22U3L4UmUW86or4dbb216v2ZNUgXusaaOUyRUN1QUQ6QnIskriol8G5F+wevtgQuAh4L3myByECLdghSBY4H9gP8W4jbSKlMR+orwXRG+JUJp0NZFhDOB2UTMDtFhie19ZqJMd9jBFEoh903XrjUrOhcJ9seMib5v+uyzsMcecOSRtrTbtWuTEk6h3HLCxIkt412SKHCPNXWcotKiohiqtYgMCuJFY2UoDgTeRWQ5VsT7fiC2X9QFuATLHb8IOB04DNXEsaYiD6c8bHszMimVqQh7AR9hJcseB14SYXvgXeA0LCymc9famD7d8tElSVeXkB49LMykkJbpZ5+ZYsmVMoXUS71vvQXf+hYceKApy5tvhgMOaNkvn9bpK6+Ywg6zZg28/HKLrjHL1GNNHacIqC5B9TBUe6I6CNU7gvY5QbzonOD9b1DtF/QbjOrvUV0bnFuI6m6oboTqJqjuiepTKa66OM3xKVb/OhLpPFEmYibyJcDPgDOBR7FN39uDcmadm6hpBOMZMaKwlmlrioInI7xvGoubjTFzJlxwAdxzj/3AuPJKy+XbrRtcdVVk5ZYTYskazj4b/vEPyzyVxOO6shJWr7YET5kW1HEcpx2iekIuxaVb5t0JmKjKdKz4tgK/U+U2V6SYGVNbm9kSb4zhw03xrF6d+3klItuEDWFi+6bPPMPIM84wy3PuXDj5ZFOujz1mCnXWLDjrLFOkYMpNteWRJkNR1lRU2HNetChpl9hvDF/qdRynNaRTpn2w9WdUWYGlEszzN187oq4OVqxonTIdMcKWOGfOTN83F9TVmRKsqMiNvDFjYMkSNn7vPfjOd2DbbWHSJFOos2bBxRdbSbq2QOye//e/pF081tRxnGyI4s27qQh9RNgMs0x7B+83HHmeY9sltufZmmXemAIu1FLv7NlWpaW8PDfygvlLzLI8+GD48EO4+ur8VoJpDRGUqWdBchwnG6Io0xmYdboAy1TxRvA+5jG1MOrFAuX7gAjLRagTIXFlAOt7lgjzRFgqwk0ilIfOnSbCmyKsFrH419C5PUV4SoQlIiwU4V4RtgqdnyDCWhEaQsfgqPfQjJgijN83jMLQoZaNqFBOSLmIMQ1zyy1N+49dupiiHty6x5h3Bg60vymU6UYbQZ8+rkwdx2kd6ZTp/sABoSPZ+6i0qAwg0jL/oggHAedibtBVWHqoi0Jd5mJOUTfFjwU2xdJIVQGVWBaNm+P63K1Kr9DxSQb30ERtrVk9rVnOLC+H7bYrnGWaixjTGPX1pkxjIS5r1xYmAUNr2WILU/iffZaym8eaOo7TWlJ686ryXK4uJEKsMsAIVRqAF0U2VAaIT/xwHDBJ1TJgiDARmBLrF1SsQYRdgYFxc3487rpXQ+7uoxmZphGMZ/hwmDYtd/NJRmOj1Rj70Y9yIy9V/OY11+TmGrmkpMSs0xSWKZjhnotSrY7jtDNEegAjsZy/zY1M1fujiMgiSWvGDAEaVYmvDDA6Qd/hxLJaNPXrJ8JmqizO8Lr70TIt1SEiLAHqgatVuS7RQBHGAeMAysqEmnBcZWMj+9XW8tnQoXzSyjy1Vb16UTlrFi/897+sz9VeZgLKFy7k6+vWMXP1auZmW4sU2OWpp9goQYjLsief5K0cyM8HIzfaCKZPZ2qK+ZWWbsMnn/SnuvqFyDULktHQ0ND88+L4M0mCP5ciI/INLFFEomQBCpasKC2qWpADdF/QeXFtJ4H
WJOg7C/Tg0PsuQRxFVVy/S0BvSXHNHUGXgO4bahsG2h+0FHQv0HrQo9PNv7y8XJvx4YcW2HHzzdpq7rnHZLz1VutlROHFF+06jz+ec9HV1dU5l5kXjjlGtaoqZZe//c0e04IF2V+u3TyXAuLPJDEd/bkAy7VAeqZVB9Qq3KLQPxs5rcrN20qiVwZo2Tf2OlHfhIiwLZa16QxVXoi1qzJDlbmqNKryMpYy6shkcpIS2+tsjSdvjNjYfO+b5jLGtL1SUWHpFFOkOPJYU8fplFQBE1Gdm42QQirTmUCZCFEqA9QG58L95kdd4g1qrT6NJZxomSy5Oa2rKhDzwt1hh4yHbmDbbc0xJt8evTGvmkGdOPNjRYU5Si1YkLSLx5o6To6pr2codCv2NNLwEjA0WyEFU6aqbKgMIEJPEZJXBrB8iD8XYZgIm2LZl26JnRShTIRu2Fp2qQjdRGz/V4QBwLPANapcHy9YhENF2FQEEWF34Jc035+NRm2thYJkU+C6SxfYfvv8W6Z1dVZjrBDFuNsqHmvqOIVn4kR6FdZoaw3XA1cgciIieyCyc7MjIpEckEQShqCAWXWrgI+xcJN0ZvIpWDjLAiyR8HhVakUYhMWzDlNljipPiHA5UI1VEvg3cGFIzvlx74/FQmcmACdioTQXijT1UaVX8PKoYA7lwGfAZaqE6nRFJFtP3hjDh8Orr2YvJxWzZ3fuJV5oijX97DPYbbeEXTbZxKKcXJk6Tg6oq4Mbbyz2LKJwX/A3vmg5ZOCAFNWbd3NgX2A9EFuTHIEtj74FHI5ZnPuqMjWZEFWWAIclaJ8DG5RdrO1K4MokciZgijPRuYtoHpMaf/7oZOcis3atpQE85JCsRTFiBNx1FzQ0QK9e6fu3hro6+NrX8iO7vRDBMgX7zeHLvI6TJR9+CPvsA+vWFXsmUdg6F0Kimt8vYc48A1XZT5X9sPjO/wBPYskRHgP+kotJtXk++sgUajbORzFi1u2MGdnLSoRq7rMftUf69rWE+2mUaVWVW6aO02pUzRodNSplYYk2hWpdyiMiUZXpGcDFasnug+uzArgUOEuVNcBlWNBrxyfmMJSrZd6wzFyzYAGsWuXKVCRS4oZYFiT1mkiOkxmLF8MRR8C4cZabs2vXYs8oOiI7InIbIm8i8gYityKS0XJeVGXaC5ry24bYkqbl2a8obBKI4lFba1l1tt8+e1mDB5vFlC8npFzWMW3vDByYNqVgZSUsWwZffFGgOTlOR+CZZ2DHHeHRR+GKK2wlKD6xS1tF5PvA20AFtgL7BDAIeBuRyHt5UZXpA8AkEX4oQpUIlSL8EJiEeegC7A4UqJ5YkZk+3cJauuXA47u01MJr8mWZxpRpZ7dMwfZNI1im4Eu9jhOJNWvgnHPgm9+E3r3h9dfh17+GqVM31Ct+i6YVzTbKJcClqO6P6gXBsT/wx+BcJKIq05OB/wKTgVnAJ8HrJzAPXYD3gZOiXrhdU1ubm/3SGCNG5M8y9YQNTcQSNzQ2Ju0Se0zuhOQ4afjgA9hzT/jzn62O8VtvwciRxZ5VaxhC4hDN28kg/jSSMlVlhSonY8XCRwE7A31UGR/Ej6LK1FSevB2GVavMASkX+6Uxhg+3L/kvv8ydzBh1dU0xH52digpTpCmq27hl6jhpUIV//hN23tlWeh56CK69Fnr0KPbMWssCYJcE7bsA86MKySiYVpXlqryryrSYEu10fPihpaTLtWUK+bFO3ZO3iXCsaRL69LEIJVemjhOivh5Gj7bvqB/8wCzRffaBd9+F73+/2LPLlhuBfyJyHiL7IzIGkfOxZA6JYk8TEjVpQzfMo/dAEpSoUWXHqBds9+TSkzdG2KN3771zJxdMK7TVot2FJhxrusceCbuIeKyp47Rg4kR44QXYfXeLHb3ySjjjDHPEbP9cguWD/zUwMWibiyUG+ntUIVG9b68FfgDcC7yMZYXonNTWQlmZFfbOFYMGmTmUa8s0FmN6QCb12zswERM3eKy
p44T47DOLHVWFlSvhySfhG98o9qxyh1W2+SvwV0Q2CtoiF1WJEVWZHgb8UJWnM71Ah6O2FoYOzW0MVUkJDBuWe2X6xRcW5+HLvMamm0L37pGU6UsvFWZKjtOmaWiAffdtymTUpQs88EDHUqZhWqFEY0S10VcAqb+BOgu5yskbz4gRuQ+P8bCY5oiYdRoh1vTLL/PjD+Y47YbPP4evf735Ms2aNXDzzSmd+NoFIu8ismnw+r3gfeIjIlGV6eXAr0TafPb//LJ8OXz6aW6dj2IMH27ZihYuzJ1MT9jQkgxiTX3f1Om0vPOO7Y9+8IFta4VpbLQ91PbNv4HVodepjkhEXeb9Jpbo/mARZgBrwydVaffuXJF4/33bN8iHZRqTWVsLY8bkRqbHmLakogKeeipll3Cs6U47pezqOB2PRx6Bo4821/ZttrEIhjBr1sDLLxdnbrlC9aLQ6wm5EBnV0lyEZUF6FpiHlU8LH52D2J5mPizTfITH1NVZDdPNNsudzPbOwIHm5p+imoXHmjqdElX429/g0EMtK9trr5llGmQyana8806xZ5s7RJ5FZJME7b0ReTaqmEiWqSonRJ9ZB2b6dCgvt19ruaZ/f0uskMt901iMqUjuZLZ3KiosTri+vsm7N47NNzc/JVemTqdh3ToLdbn2Wjj8cLj99sImYRDpg6Wn/RZmvP0O1TsS9Ds+6Lcy1Po9VGsyktOcMUAij9Ju2IpsJDpHYvpcUVtrv9hKI9WKzQyR3KcV9KLgLQmHxyRRph5r6nQqvvoKfvxjeOIJy7P7xz8WI370GmAN0A+rPvYYItNQTfSF+Aqq+2QtR2Tn0LsdEVkSel8KHAR8HvUGkipTEd4FRqvyhQjvkSK2tNMkbZg+HfbbL3/yhw+He++1ZZRcWJN1dZY702nCY00dp4m6Ovje92w598Yb4cQTCz8HkZ7AEcAIVBuAFxF5GPgJcG4e5byJ6TXF6nLHsxI4PerlU1mmYW+n+6IK7KiUgH0B58P5KMaIEXDDDeZ2vlWiincZsGwZLFnilmk8sZSCaZRpZSW88UYB5uM4xeL11y0V4KpVZpUeeGBeLtMXyhB5M9R0A6rhNH1DgEZUw1XHpgGjk4gchcgiYAmWjP6PqK5rhZytAcEKt+wOhEMp1gALUE1eFSOOpLa8KhfFioEHr5MeUS8mQh8RHhBhuQh1IhyTou9ZIswTYakIN4lQHjp3mghvirBahFsSjD1QhA9EWCFCtQiVoXMiwmUiLA6Oy0VIawZ2W7/eXuTD+ShG2KM3WzzGNDEbb2zZptLEmn7xhdU6LikxK3XKlGjip0yx/gccMLpV4zK5XmvGFGOc04aI5di98Ub726MHvPJK3hQpwCJYh+quoSM+320vYGlc21JgowTingdGYGltjwCOBs5uhRxQrUN1NqolqL4ZvI8d9Zko0kCeFuwAvRP0btBeoPuALgUdnqDfQaDzQYeDbgpaA/qn0PnDQQ8DvQ70lrixfQO5PwTtBvpn0FdD538B+iHoQNABoDNAT0439y3LysyPbdYszRvz59s1/vrX7GU9+qjJeuWV7GWloLq6Oq/y88IOO6gefnjS05Mnq3bt2tx9sUcPa0/F5MnWrxDjCnmtbMaFaZeflQJQ0Ody8smqIvYPuNdeqgsW5P2SwHJN9f0KoxRWxLX9WuGRlOOs31EKb+VATpnCXoG8nzY7Iuq3qInu+wCXkjzRfe8IMjasZ6vSALwoQrL17OOASarUBmMnAlNi/VStILkIuwID48YeDtSqcm/QZwKwSITtVfkgkP0XVT4Lzv8Fq8N6far5d1O1X3H5TICwxRZWod4t0/ySJnHDeedZKF2YFStg/Hh4++3kYm+80foVYlwhr5Vq3Hnnwdixycc5bYDGRivW/eijto2kak6UU6aY63rxmYktBW+H6kdB205AlC9ChQ0ri62TI7I98AhNy76N2BboWmyr87YoNxHVm3cSVsf0BiybfmsS3Q8BGlWJsp49HHg
orl8/ETZTTRvXOjzoD1jZOBFmBe0fxJ8PXifcCBVhHDAOYIgoX1VU8Pbzz6e5fHbsNHAgJS+/zDs1NVnJGfz88wzs0oXn33+/ZdB1DmloaKAmy7kWmqGlpfSZNYtXksx7zpzRkGDlf9ky5brrkq/8rFxZWrBxhbxWqnFz5ig1Nc8lHRemPX5WCkFrnkvXxYsZdvHFzLjwQtb06dPsnKxbx0YzZ7LxtGlsMm0aG0+fTtlyq5gZ0zzrRag/80w+OvPMnNxDVqguR+R+4GJETsS8cA8F9mrRV+TbwNuozg+U4AVYAZbM5DTnb8BbQf95wd+NgeuA8zO4j0jLs1+B7hHV3E0iY1/QeXFtJ4HWJOg7C/Tg0PsuwdJSVVy/SxIs804KLwkHbS+BHh+8bgTdPnRuu0C2pJr/TqB6/PHR1jWy4dRTVTfaSHX9+uzk/OhHqtttl5s5paBdLt1deKEtda1enfB0ZWXz5czYUVmZWmwhx7WHOcbTLj8rBaBVz2X8eNWSEtVTTlFduVL1uedUJ05U/cY3mq/H77CDLe1ec41qeXnzf7zu3VXr63N+P/GQbpnXlln7KDyosFxhjsIxQfsghQaFQcH7KxTmB/0+UbhYoUtaOamvvVhhRPB6qcLQ4PVohXfTjg+OqMFEC7B6b9nQAC2Wg3sDibL0x/eNvY6S0T/ddRLJbrBnl5wyyK8nb4zhw80TN423aVq8KHhyKirs62Tu3ISnL720Zbx6jx7WnopCjmsPc3TyxNy5cNNNlnzk+uvNqW70aPj97y23989/DvfdB/Pnw4wZcN11FtancV9xbSnHruoSVA9DtSeqg4glWlCdg2ovVOcE73+Dar+g32BUf4/q2rRyUiNYMRcwj94BwevPgG2j3kJUZXoecLEIvaIKTsBMoEyEcCHQZOvZtcG5cL/5EZZ4W4wN9mq3CV0nkexom5T9+0fqlhUxb+Hvfje7ygyzZ3uC+2SkiTUdO9a2lmLJoyor7X26vcHm47SV46JdLzdzbN04sLHXX+/7pQVlxQpTojvtBKuDqEVVSyTz0EOwaJHtjf7973DEEeaDEeOVV1o6AnSEHLu5YTpNOuF14LeIjAYuAj6OLCWK+Qr6Hugy0BWg74O+Gz6imsGgd2EevT1B9ya5N+/BoPNAh2HevM/S3Ju3DPPU/SPo7cHrsuDc5oHcI4L2y2juzXtycA8DQPuD1hLBm3cXUP3JT7Jb74jC4sW2BCNiSzitYcUKkzFxYm7nloB2uXRXW2vPZ8qUvF2iXT6XiNx9tz2+11/PbFxHfibZkPa5zJypetZZqpts0vTdUITl2tZClGXeYh5wkMLhwevBCjMU1issUBgTVU5Uy/Q+4ArgMuAuWlmiBjgF6I4tG98JjFelVoRBIjSIMMgUPE9gZd+qgbrguDAk53wsO8W5wLHB6/ODsQsxr+FLgS+APYCjQmP/iXluvYf9InksaEvPffflv45f+Bdna+sGzrEVEV/mTULMMk0Ta+okJpYEzH2JQsTiN3P1/bBunVmbBx0EQ4bAP/5hrw891Ap0h2lLy7XtEdX/onp/8PoTVIcBfYF+xHL+RpOT1prsAno5aGXRf0EU8dgFLPiwtdZiVGKOBaDapUvrrvff/9r455/P/fziaLfWxsYbq552Wt7Et9vnEpGhQ1W/+93MxnToZxJ2CMqEuXP1ix13bLIs581TveQS1YoK+z88cKCtMMXOjxypCT3BRo7M7f3kENq6ZZqjI21ojCprRRgPXNsapd+hiFWZv+AC2HLL3Muvrzf5sWxLa9e27noeY5qeCEXCneSMHg133WVGUT7qPrQr6uvhX/9qcghavtz8FbbYwo5+/Zpeb7JJ87zbEyey8XvvWaBv9+62+rV2rWUkuuoqOOSQ5sW5O1Lps2IiUg0RQzxVD4jSLWqc6ZPAAcBNEft3XGJLKtdck3vZEyc2KdIYa9Zkfr3Zs+0brhAOU+2VgQNdmWbBmDHmkDRtGuy8c9ruHZtx40w
Bgv3/veuupu2aeLp0aVKsG28ML7yAqMKDD8JGG8Epp8DJJ8P22xds+p2UcK3LUmAsFmP6WtC2O7AVMDmqwKjK9BngDyLsiAW3Lg+f1CAjUacgnx5wiTzuGhsz35yqqzPLqyzqP28npKIidcofJyWjR9vfmppOrkzfeMMyC4UpKbEfamVlFp6yYEHzI9b2+uv2/xvsx+9RR1lxbif/qDZVgxH5K3ArcAa2LB1r/xuJMpUkIeq37dXB318mmham2Ts008vLrbpCPolfwqmrs1CZAQNsZyRqWTaPMU1PRYV9oa1ebQXfnYzo3x+23Raeew5+9atiz6ZIrFoF3/lOy/bGRqsJes01ybdn6uth8ODmYyZPhosvzs8WkpOKnwJfb6ZIjWuBV4EzogiJ5M2rSkmKo8Mr0qJRWQmXXw5PPWV7p1HxouDpcY/erBk9Gl54oeXORKdA1ZZkFy1qeS7K6lWiLR33yi0WAnwtQXuitqQUvJy6kyG/+IV9a/3qV/B5hKLva9dahhRP2JCaiHVNneSMHm2l6t57r9gzKQI33GA/cM8/P5FvbXpHIU+i0Ja4CfgXIuciMiY4zgVuBCJbMZE31YLKMQcDg4Cu4XOqXBxVjpMhJSXmKbjjjuaY8PDDqZd7P/vMfvG6ZZoat0yzJrZv+txzlpSn0/Daa3D66XDwwTBhQutkhJRtTU0NY8aMycnUnFZxDpb74AzgD0FbPfAn4C9RhUSyTEXYE/gIS9wwEfgZlmLwN8CRkafstI5tt7UkqI8+Cnfembrv7Nn215VpatKkFHTSM2gQbL11J0veMH++peobONBKmHX6uKAOgOp6VC9HdQCwCbAJqgOCtsgFwqMu8/4Zqyc6AFiFhckMAt7EsiI5+eaXv4Q997RfxPPnJ+8XizH1Zd7U9OgBffq4Ms2S0aPh+ec7yb7punXmcbt4Mfz73/b5cToWql+h+lVrhkZVpjsCV6uiWOHUclXmA78FJrTmwk6GlJZakuuGBlOoyairs2XgmOXlJGfgQF/mzZLRo023zJhR7JkUgHPPNTP8hhtg1Khiz8bJBpF3Edk0eP1e8D7xEZGoe6bhnfL5QCXwPlbOzDMDFIoddoALL4TzzrNfxkcc0bLP7NkWt9C1a8tzTnM8C1LWxLb6nnuuqeBRh+Tuu+Evf4FTT4Wf/KTYs3Gy599ALLPGfbkQGFWZvg3shpVRqwEuEaEflmQ+suZ2csDZZ1vKsVNPtW+yzTZrft5jTKNTUQGvvlrsWbRrqqps77Smxj6SHZLaWqsRutdecOWVxZ6NkwtUL0r4OgsyqWcaq6R8PlZA9R/ApsC4XEzEiUiXLrbcu3gxnHVWy/OuTKNTUWHPccWK9H2dpMT2TVuEvHcEli6FH/zAUv3de6+v+DhJiWSZqvJm6PVC4Nt5m5GTnpEj4Xe/swDvH//YComDBX3PmWNtTnpisaaffw7bbZe6r5OU0aPh9tvhgw9sJ6LDsH49/PSn8Omn8Oyznuu6IyHyHtET3e8YpVtGyVtF2BXYBnhUleUi9ARWq7IuEzlODjjvPLj/fkvqUFtrSbPr683j0C3TaITDY1yZtprwvmmHUqZ//KPFdV91Fey7b7Fn4+SWnOyThomkTIP90YexfVMFtgM+Aa7EQmUi5S50ckh5uS33fv3rto96ww0eY5opHmuaEwYPtvTRNTWWV6RD8N//WunDY45J7T3vtE9ytE8aJuqe6V+x8jSbAeENpnuBb+V6Uk5Edt8dfv1ruPFGePppjzHNlAED7K8r06wQsaXe557rAPum9fWwxx62VTJihP1IjVpgwunURFWmBwLnqfJFXPssLHmDUywuusiWKE86ydKcgTtJRKV7d+jb12NNc8Do0TBvHnz0UbFnkiW//72VRlu5Eh54AHr2LPaMnEIgcgIiTyLyASKfNDsiElWZdqd5rGmMzbFl3kiI0EeEB0RYLkKdCMek6HuWCPNEWCrCTSKUR5EjwlgRGkLHChFUhF2C8xN
EWBvXZ3CiObQLune35d66uqYC4n+JnE7S8VjTnBDO09tueest+78UwxVp50DkbCwH71tAFfAgVjy8D5YEPxJRlenzwPGh9ypCKZYB6ZmoFwOuwZRyP6yy+XUiDI/vJMJBwLmYRVwFDAbCa9xJ5agyRZVesQM4BdvfDVeCvjvcR5XIvz7aJPvsAyec0JTT7eabzUxw0uPKNCcMGWJlONttnt5777U40nBeRC+H1lk4CRiH6u+AtcDVqH4fU7CRHVCiKtNzgJNEeAooDy4yA9gb+F0UAYHn7xHABao0qPIi5tSUKJ3IccAkVWqDpeWJBMo8QzkxWbcFqRA7LuF9Ha+LGJ2BA12Z5oB2u2/61Vdw/PHwox9Z+cIYa9b4j9LOw0Dg9eD1SqB38PpOTNdEImqc6QwRvgaMx1IwdcOcj65RpT7itYYAjarMDLVNA0Yn6DsceCiuXz8RNsP2aCPJEaES2A+rchPmEBGWYGV2rlblukQTFmEcQVKKsjKhpo3+7O66eDF73H57U5X2NWtonDSJ1w48kDV5Tsbd0NDQZp9LFAatXcvgL7/khccfp7F795zJbc1z6bp4McMuvpgZF16Y93+3fLDVVv35/PMh3HHHqwwY0HL3p619VnrX1rLDpZfSbf58lg0ZQq9PPqFkXVOU3/q1a6k/+WQ+OvPMvM6jrT2XTsg8oC8wB6gDvg5MBbYlaiwqgKq2+gCtBL0nYt99QefFtZ0EWpOg7yzQg0PvuwRVd6sylHNBfDvoMND+oKWge4HWgx6dbv7l5eXaZhk/XrVr1+blibt2VT3llLxfurq6Ou/XyCuTJ9vzev/9nIpt1XMZP161pKQg/275oLbWHuWkSYnPt5nPytq1qhdeqFpaqlpVpfrii6ojRyYq8W3teabNPJc8ASzXLPRM3g/4l8KE4PXJCisVqhWWKtwYVU7UZd5kbEJ0M7iBJvM5Rm9gWYS+sdfLMpTzU+DWcIMqM1SZq0qjKi8DV9Hea7K+8ootS4VZswZefrk482lPtJVY0/fftzCM9evb7fLiDjvA5pu38X3TWbMsAcNFF1kM6dSpsPfeVqw7kToNFfF2OhgiBwavxgGXAKB6Pbal+B6WRveUqOKyVaaZMBMoEyGcamYnoDZB39rgXLjffFUWR5Ujwt5YRZt0mS4UaN+BZP5F0HpiKQWLrUxPPNH2uqHd7nmH903bHKpwyy2WivODD+Cuu+C22yxzmFN8RPog8gAiyxGpQyRppEdozLOIKCJlobYaRFYh0hAcH6aQ8FQQ+vI7YIsNrap3o/pLVK9GdW3S0XEUTJmqshy4H7hYhJ6BsjsUuD1B99uAn4swTIRNseT6t2Qo5zjg36rNLVYRDhVhUxFEhN2BX9J8f9bpTMQSNxQz1rS+3lYXYrRj55fRoy09dCwZV5tgyRJzMDrhBNhlF3j3Xc9f3fZoEaGBSItIjw2IjCW5z89pqPYKjqEprjkc0yWnA3WIPIbIYYiUphiTlEJapmAmc3dgAeYpNV6VWhEGBfGegwBUeQK4HKjGNoTrgAvTyYmdFKEb8CPilngDjgI+xpaFbwMuU03Yz+kMlJdDv37FtUzPOqulC+yaNe3SOm0z8ab19TaZe++FHXeEhx6CP/0JnnmmaWnfaRuIbIjQQLUB1dQRGiIbY/rgnKyuq/o+qr/BvHl/jK1S3gt8jshliKRSxC1I6c0rwsNpxsfvXaZElSXAYQna5wC94tquxHL/RpYTOr8K289NdO7oqPN1OgnFjjWtrm7Z1tgITzxR+LlkyfDhVmK3pgaOO66IE7nwQqsL9/zzMHSoJazfeeciTqjz0hfKEHkz1HQDqjeE3g8BGlGNEukB8AfgOswLNxF/RORPwIfAeajWpJyg6jrMQr0fkf7YnukJwG8QeQnV/VKOD0gXGrM4wvlPo1zIcdosAwfCzJnp++WD9esti9W3vw3/+Y+1rVhhTjGffmr5+dpRRZuSEthvvyJaplOnwt//bsvkAKWl9lw
Ht98kZ+2dRbAO1V1TdOkFLI1rWwps1KKnyK5YfoMzMIsynt9iORDWYKuQjyAyEtVZkSarOheRa7GVywnBtSKRUpmqckJUQY7TbqmosHqVxeCllywV5B/+0NTWo4flhd11VzjsMHj1VStO3U4YPdqm/7//FWhFdckSuOMOSwX4zjum0UtK7IdKaaml14yl2nTaItEiNERKgGuBM1Bdl7AAgeproXe3InI08B3gH2lnIfINLCfBYVia3DuBf0W7hcLvmTpO26OiwjLhfPVV4a89ebLlgD300ObtVVVwzz3w4YdWoDqc5q6NU5B908ZGePJJOOoo2GorK5MmApdeaoUeYs+rHTtzdSJmYkvB6SI9egO7AncjMg94I2j/DJFkBWdTR2uIDELkQkQ+BZ7EIkDGAf1RPRXVyCERrkwdp1ixpqtXm8L8wQ8SJ1U/4AC44gp48EFTErkk5qCTByXzta/BJpvkKN40fp6ffGKVXbbeGg46yBTqL35hFulbb5lXdvwPj3YaatRpUN0QoYFIT0SSRWgsxZTdyOD4TtC+C/AaIpsgchAi3RApCzx+9wP+m/C6Ik9hedt/AdwFDEF1DKqTUY1cwCVGpHSCjtOhCceaDk/ujZ9z/vMf+PJLOPbY5H3OOMOUxIUXWozkIYfk5toTJ8KLL9rfHC+BlpbmcN80Ns9jjzWP52efNQv0W9+yHxrf/z5069bU3xOYtFdOwSq0LMB8ccajWovIIGwPdBiqcwg7HYnE/uHnB8u+G2PJF7YHGoEPgMNQTRZruhI4HHgM1cZsb8CVqePELNNCx5pOnmxhOQcemLyPiGVGmjHDFMrrr5t3ajbU18OkSWbB/etfcO65Od/cHD3aHGjnzoX+/bOY57/+ZfOMhbRMnGhuwsnm64lK2ieqiSM0TIH2atFu52YTXsJVXQjslsE1v5/JFNPhy7yO07+/Ka1CLvN+8QU8+igcfTSUpflN2727efSUl5tDUjZ7u2+/DXvs0WS9rVlj3sLjx1sYSY72ZnOyb3rssU2VXMrK4Hvfg/PP9zhRp03iytRxunQxJ5ZCKtN//9sUWaol3jCDBtn+6kcfwU9+krnSe/99+OEPLQNQ/H2uWwe33moasLISzj7blG58IokMGDkSevfOYt/02mube1ivW2fpAN2RyGmjuDJ1HCh8XdPJk2H77TNLJDBmDPz1r7Z+evHF0cZ8+qnV6xwxAv77Xwu36dq1eZ/SUlPqd9wBo0bB3/5mSneHHSwhfDgGN6LjUmmp5ZNvlWV6771w6qm0CH1wRyKnDePK1HHAlg4LtWc6Z45pmbFjWyqMdJx2mu0ZXnSRpchLxty5ppCGDoW774Zf/co8YdetS+yg88YbtuT88MMwf77t0261lV1n6FBTrn/5C/z2t02OS2kYPdoiezIyJh980Kq59OiROMWiOxI5bRRXpo4DTSkFs1jajMwdd9jfY9IXxmiBCFx/Pey2m1mTzz/f3FJcvBjOOQe22cYU4oknWtmxP/8Z+vaNVmGoTx846SRLc/i//5kSLSmB3/wGbr/dlpgnTUqrJWP7ps8/H/HeHn3UEtLvuqtZwF4JyWlHuDJ1HDBluny5harkE1VTSHvv3foUd926wf33m/V2yCFmKZ5/vlmRW29tyu9HPzKz8Nprs3Cnxarq/OpXTZZraVBQY/Vqu3a8lRti552hV6+I+6ZPPAFHHAE77WSve2eU9ttxio4rU8eBwtU1nTatKcwlGwYONAv1q6+aLMUJEyz+8r33zKEol/lo6+vNo7gxFI735ptWkeWNNxIOKSuDffaJsG/61FPmpTx8uCVh8BqjTjvElanjQOFiTSdPNu/hH/4we1lPPdVkKYrAkUfCfffBsGHZy45n4sSWHsRlZfa89tzTlpZXrmwxbPRo++2wcGESudXVlnhh6FC7n003zf3cHacAuDJ1HChMSsHGRrjzTvjOd6xOWTbU11vO2ZilqAqPPZa/0JFEmYXWrTPr9+c/tz3ZnXa
yJecQKfdNn3/eYke32Qaefjr7Z+I4RcSVqeOAea6WluZXmdbUmJft2LHZy0pkKeYzdCSZ49K775qj09NPW4KF/fazpPMNDYD5EvXokWDf9OWX7UfFoEGW3WjzzfMzb8cpEK5MHQdMkeY7ccPkyeZY873vZS+rreWgPfBA26s9/XTL9fu1r8HTT9OlC+y1V9y+6WuvwcEHm2PUs89aSkXHaee4MnWcGPmMNV2xwrIeHXmkpQfMlighLoWmVy+46ipbvu3aFb75TTjxRA7a40sWvlfPiFPPNE/dgw6CLbaw/dKttirefB0nh3iie8eJUVGRP2X0yCOwbFn2XrztgX32galTzbv4iis4vc/jDGMnNpvxrtVtjVmkAwYUe6aOkzMKapmK0EeEB0RYLkKdCEmj1kU4S4R5IiwV4SYRyqPIEaFKBBWhIXRcEDovIlwmwuLguFwkRfFYp/MQSymYj8QNkyeb/JhHTkene3e47DJ49VVWdtmI7/A4grJ+zVoeP+Ee2ytNw5QpViO9pMT+TpkS7dLtZZzTsSi0ZXoNsAbohxV3fUyEaarNK6qLcBBwLnAAMBd4ALgoaIsqZxNV1iWYwzis1M9OWBX2WIHY67O/PaddU1EBq1bBkiW59SxdtMiWN3/1K/vG7URMmbkbKxeM5ng+poxG1lLGnItv4Z/9duPII5OPu+8+OOuspmibujpLytTQQJsfN26cvc6Fn5nTjlDVghygPUHXgA4Jtd0O+qcEfe8A/UPo/YGg86LIAa0KNpDKkszjZdBxofc/B3013fzLy8vVaUl1dXWxp5A77rvPdh7feSdrUc2eyzXXmNx3381abntj1wFzdQXdmu3sLqe79qM+wYZvxzkqK6M/ow71fygBwHItkJ4p5lFIy3QI0KhKqAQF04BE617DgYfi+vUTYTNgUEQ5dSIbLM+zVVkUkj0tbuzwRBMWYRxmyVJWJtS0up5Ux6WhoaHDPJeNFixgF+C9//yHxVmmFQw/l1HXXkvp4MG8uXhxFjXJ2icnfH43QvMQnhIauYCL+fD0s5KO+8c/toWEuy/K6ad/3ObHzZmj1NREK5nTkf4PdWoKpbVB941Zl6G2k0BrEvSdBXpw6H2X4BdfVTo5oL1AdwUtA+0Heh/of0N9G0G3D73fLpAtqebvlmliOtSv6s8/N7PimmuyFrXhuXz8scm87LKsZbZHpncZmdB0m95lZMpxlZWts/jay7gwHer/UALoJJZpITdwGoD47NW9gWUR+sZeL0snR5UGVd5UZZ0q84HTgG+JbBiTSHaDKnnwOnHaFf36NaXIyxVTpliqv6OPzp3MdsTUm9+hZw9FaDp69lCm3pzaa/rSSy3ZQ5gePay9rY/r2jX9OKfjUUhlOhMoE2G7UNtO0Nz5KKA2OBfuN1+VxRnKATYoydhaTCLZycY6nYnSUgvXyFXiBlXz4h0zpildYSdj7FhLkFRZCSJKZaW9T+ec03wc7WZcly7mu3bUUanHOR2QQprBoHeB3ok5Ee0NuhR0eIJ+B4POAx0Guinos4QclVLJAd0DdChoCehmoHeDVofGngz6PugA0P6gtaAnp5u7L/MmpsMtUe2zj+ro0VmLqa6uVn3tNVvzmzQpa3kdgQ73WUnAPffYP/mdd0Yf09GfC77MmxdOAboDC4A7gfGq1IowKIgHHWQKnieAy4FqoC44LkwnJzg3GHgCW/adDqwGwmts/wQeAd4Lzj8WtDlOU6xpLpg8GcrLrU6n0yk44ggr2pModbLTsSlonKkqS7AYz/j2OUCvuLYrgSszkROcuxNTsMnmoMA5weE4zamosMLbqrZu10pk3Tq46y4roO31OTsNJSVwwQW2Rf7vf+em0p7TPuhcEeSOk46KCksYn7QAZzQ2festk9EZ0gc6zfjhD2H77d067Wy4MnWcMDmqa9rvqaegTx/49rdzMCmnPVFaCuefb0V0Hnyw2LNxCoUrU8cJM3Cg/c1GmS5bRt8XX4Qf/cjiJJx
Ox1FHwZAhcPHFtmPgdHxcmTpOmJhlmk2s6S23ULp6tVulnZjSUjjvPJg2DR5+uNizcQqBK1PHCbP55mZNZmOZXn65BTc/8USuZuW0Q445BrbZxq3TzoIrU8cJU1KSXXjMc8/BZ59ZhpBbboF583I4Oac9UVZm1unbb8NjjxV7Nm0ckT6IPIDIckTqEElanjM05llEFJGyUFvmcnKEK1PHiae1yrSuDr773ab3jY3m0ul0Wo49Frbe2q3TCITLao4FrkMkYQESAETGkji0MzM5OcSVqePEU1GR+Z7p/Pmw//6wfHlT25o1cPPNbp12Yrp0gf/7P3jjDV/1T4pIT+AI4AJUG1B9EXgY+EmS/htjSXzOiWvPTE6OcWXqOPFUVMDnn0cPEvzySzj4YJgzx749w7h12un56U9h0CC46KLOaZ32hTJE3gwd4+K6DAEaUY0vq5nMovwDcB0Q/ys1Uzk5xZWp48RTUQFr15q1mY4VKyzLUW2tZTtfu7b5+TVr4OWX8zNPp13QtatZp6+9Bk89VezZFJ5FsA7VXUPHDXFdegFL49qWAhu1ECayK7A38I8El4ouJw+4MnWceKLGmq5ZA0ceCS+9ZHl4Z83aUNKyprq6qbzlO6nLjTkdn+OPt49VZ7VO0xCtPKdICXAtcAaq61otJ0+4MnWceKLEmq5fb9+Qjz8O119vCRocJwnl5fC739kiRXV1sWfT5piJLQWnK6vZG9gVuBuRecAbQftniOybgZy84MrUceJJl1JQFU4/He68E/70JxgXvwXkOC352c+gf3+zTp0QqsuB+4GLEemJyN7AocDtcT2XAv2BkcHxnaB9F+C1DOTkBVemjhPPZptBt27Jlenvfw/XXgvnnAO//W1h5+a0W7p1s4/L889bOLLTjBZlNVGtRWQQIg2IDAoKh87bcECsGsV8VNeklFMAXJk6TjwiyWNNr7wSLrkETjzRrFLHyYCTToItt3TrtAWqS1A9DNWeqA5C9Y6gfQ6qvVCdk2DMbFSl2f5pMjkFwJWp4yQiUazpzTfDr39tTkfXX59VvVOnc9K9uy1oVFfDCy8UezZOLnFl6jiJqKhobpk+8IBZo9/8pnnulpYWb25Ou+YXv4AttrCsSE7HwZWp4ySiogLmzrWkC888YzW1dt8d7r/fXDMdp5X06AFnnw1PP+0hyB2JgipTEfqI8IAIy0WoEyFpEmIRzhJhnghLRbhJhPIockTYU4SnRFgiwkIR7hVhq9D5CSKsFaEhdAzO31077ZKBA02R7rijJWUYMsSylffqVeyZOR2A8ePto3TggXDAAaOpqoIpU6KNnTIFqqqsJkN7GGc/Hzo+hbZMWyQhFmmZ6kmEg4BzgQOBKmAwEN6yTyVnU+CGYFwlFrB7c9wl7lalV+j4JCd353QcYuExM2ZY+Y8nn4Q+fYo7J6fD8OCDsHo1rFoFqkJdnUVYpVNUU6ZYv7o6i9BqD+M6C6IFSschQk/gC2CEKjODttuBz1U5N67vHcBsVf4veH8gMEWVLTORE5zbGXhO1VJKiTAB2FaVYzOZf7du3XTVqlUZ3XNnoKamhjFjxhR7GrnnmWfgG9+w1926waefmhtmRDrsc8kCfyZNVFUlVjRlZbYIkoyZM2Fdgtw/bXtcT1SXd3hvvUQlbPLFEKAxpgADpgGjE/QdDjwU16+fCJsBgzKQA7AfLTNgHCLCEqAeuFqV6xINFGEcMA6grEyoqalJconOS0NDQ4d8LttddRX9S0qQ9etZv24d9SefzEdnnhl5fEd9Ltngz6SJOXNGAy31y7p1yuabL2w5IGDGjM3b9bgOjcXB5v8A3Rd0XlzbSaA1CfrOAj049L5LkOi0KkM5O4IuAd031DYMtD9oKeheoPWgR6ebf3l5uTotqa6uLvYUcs/cuardusUy69rRvbtqfX1kER3yuWSJP5MmKiubf7xiR2VlRxzXQ7VAeqaYRyH3TDNJQhzfN/Z6WVQ5ImwLPA6cocqGiC5VZqg
yV5VGVV4GrgKOzPBenI7MxIkty695KTUnh1x6aUu3nB49rL2jjuvoFFKZzgTKRIiShLg2OBfuN1+VxVHkiFAJPA1MVE2bl1HpdOsRTkpeecUqwoTxUmpODhk7Fm64war2iSiVlfZ+7NhMxtEuxnUWCuaABCDCXZjyOhFLVPwfYC/V5gpVhIOBW4ADsH3NfwOva+BglEqOCAOA54HrVflzgjkcGpz/EtgNeAD4P1VuTTV3d0BKjDuVJMafS0v8mSSmoz8XEVmhqj2LPY98U+jQmBZJiAMFOCiI9xwEoMoTwOVANVAXHBemkxOcOxELpbkwHEsaGnsU8DG2LHwbcFk6Reo4juM4qSikNy+qLAEOS9A+B6uSHm67ErgyEznBuYtoHpMaf/7oyBN2HMdxnAh4OkHHcRzHyRJXpo7jOI6TJa5MHcdxHCdLCurN254RkfXAymLPow1SBiRIONbp8efSEn8mienoz6W7qnZ4w62gDkjtnLdVdddiT6KtISJv+nNpiT+XlvgzSYw/l45Bh/+14DiO4zj5xpWp4ziO42SJK9Po3FDsCbRR/Lkkxp9LS/yZJMafSwfAHZAcx3EcJ0vcMnUcx3GcLHFl6jiO4zhZ4srUcRzHcbLElWkaRKSPiDwgIstFpE5Ejin2nNoCIlIjIqtEpCE4Piz2nAqNiJwmIm+KyGoRuSXu3IEi8oGIrBCRahHpNJUdkz0XEakSEQ19ZhpE5IIiTrVgiEi5iEwKvkOWicg7IvLt0PlO+3npKLgyTc81wBqgHzAWuE5Ehhd3Sm2G01S1V3AMLfZkisBc4BLgpnCjiPQF7gcuAPoAbwJ3F3x2xSPhcwmxSehzM7GA8yomZcD/gNHAxthn457gB0Zn/7x0CDwDUgpEpCdwBDBCVRuAF0XkYeAnYIXKnc6Lqt4PICK7AgNDpw4HalX13uD8BGCRiGyvqh8UfKIFJsVz6bSo6nJgQqjpURH5FNgF2IxO/HnpKLhlmpohQKOqzgy1TQPcMjX+KCKLROQlERlT7Mm0IYZjnxNgwxfpLPxzE6NORD4TkZsDq6zTISL9sO+XWvzz0iFwZZqaXsDSuLalwEZFmEtb47fAYGAAFnT+iIhsU9wptRn8c5OYRcBuQCVmkW0ETCnqjIqAiHTB7vvWwPL0z0sHwJVpahqA3nFtvYFlRZhLm0JVX1PVZaq6WlVvBV4CvlPsebUR/HOTAFVtUNU3VXWdqs4HTgO+JSLxz6rDIiIlwO2YH8ZpQbN/XjoArkxTMxMoE5HtQm07YUszTnMUkGJPoo1Qi31OgA1779vgn5t4YunXOsXnRkQEmIQ5Mx6hqmuDU/556QC4Mk1BsHdxP3CxiPQUkb2BQ7Fflp0WEdlERA4SkW4iUiYiY4H9gP8We26FJLj3bkApUBp7HsADwAgROSI4/3vg3c7iTJLsuYjIHiIyVERKRGQz4O9AjarGL3F2VK4DdgAOUdVwbeRO/XnpKLgyTc8pQHdgAXAnMF5VO/svxi5Y6MNCbB/sdOAwVe1ssabnYwXjzwWODV6fr6oLMS/wS4EvgD2Ao4o1ySKQ8Llge+xPYMuX04HVwNFFmmNBCeJGfwGMBOaF4mzH+uelY+CJ7h3HcRwnS9wydRzHcZwscWXqOI7jOFniytRxHMdxssSVqeM4juNkiStTx3Ecx8kSV6aO4ziOkyWuTB2nkxLUFj2y2PNwnI6AK1PHKQIickugzOKPV4s9N8dxMsfrmTpO8Xgaq40bZk0xJuI4Tna4Zeo4xWO1qs6LO5bAhiXY00TkMRFZISJ1InJseLCIfE1EnhaRlSKyJLB2N47rc5yIvCciq0VkvojcEjeHPiJyr4gsF5FP4q/hOE40XJk6TtvlIuBhLJ/rDcBtIrIrgIj0wPLcNgC7Az8A9gJuig0WkV8A/wRuBnbESuTF55X+PfAQVrXkbuCmII+s4zgZ4Ll5HacIBBbiscCquFPXqOpvRUSBf6nqSaExTwPzVPVYETkJuAI
YqKrLgvNjgGpgO1X9WEQ+Ayar6rlJ5qDAn1T1d8H7MuArYJyqTs7d3TpOx8f3TB2neDwPjItr+zL0+pW4c68A3w1e74CV6QoXkH4ZWA8ME5GvgAHAM2nm8G7shaquE5GFwBaRZu84zgZcmTpO8Vihqh+3cqzQVFw7nkwKta+Ne6/49o/jZIz/p3GctsueCd6/H7yeAewkIhuFzu+F/Z9+X1XnA58DB+Z9lo7juGXqOEWkXES2jGtrDIpFAxwuIm8ANcCRmGLcIzg3BXNQuk1Efg9sijkb3R+ydi8F/ioi84HHgB7Agar6l3zdkON0VlyZOk7x+AZQH9f2OTAweD0BOAL4O7AQOEFV3wBQ1RUichDwN+B1zJHpIeCMmCBVvU5E1gC/Bi4DlgD/ydO9OE6nxr15HacNEnja/lBV7yv2XBzHSY/vmTqO4zhOlrgydRzHcZws8WVex3Ecx8kSt0wdx3EcJ0tcmTqO4zhOlrgydRzHcZwscWXqOI7jOFniytRxHMdxsuT/Aaqq0VYt2x00AAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.plot(history.epoch, history.history[\"lr\"], \"bo-\")\n", "plt.xlabel(\"Epoch\")\n", "plt.ylabel(\"Learning Rate\", color='b')\n", "plt.tick_params('y', colors='b')\n", "plt.gca().set_xlim(0, n_epochs - 1)\n", "plt.grid(True)\n", "\n", "ax2 = plt.gca().twinx()\n", "ax2.plot(history.epoch, history.history[\"val_loss\"], \"r^-\")\n", "ax2.set_ylabel('Validation Loss', color='r')\n", "ax2.tick_params('y', colors='r')\n", "\n", "plt.title(\"Reduce LR on Plateau\", fontsize=14)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### tf.keras schedulers" ] }, { "cell_type": "code", "execution_count": 94, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.5995 - accuracy: 0.7923 - val_loss: 0.4095 - val_accuracy: 0.8606\n", "Epoch 2/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.3890 - accuracy: 0.8613 - val_loss: 0.3738 - val_accuracy: 0.8692\n", "Epoch 3/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.3530 - accuracy: 0.8772 - val_loss: 0.3735 - val_accuracy: 0.8692\n", "Epoch 4/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.3296 - accuracy: 0.8813 - val_loss: 0.3494 - val_accuracy: 0.8798\n", "Epoch 5/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.3178 - accuracy: 0.8867 - val_loss: 0.3430 - val_accuracy: 0.8794\n", "Epoch 6/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2930 - accuracy: 0.8951 - val_loss: 0.3414 - val_accuracy: 0.8826\n", "Epoch 7/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2854 - accuracy: 0.8985 - val_loss: 0.3354 - val_accuracy: 0.8810\n", "Epoch 8/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2714 - accuracy: 
0.9039 - val_loss: 0.3364 - val_accuracy: 0.8824\n", "Epoch 9/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2714 - accuracy: 0.9047 - val_loss: 0.3265 - val_accuracy: 0.8846\n", "Epoch 10/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2570 - accuracy: 0.9084 - val_loss: 0.3238 - val_accuracy: 0.8854\n", "Epoch 11/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2502 - accuracy: 0.9117 - val_loss: 0.3250 - val_accuracy: 0.8862\n", "Epoch 12/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2453 - accuracy: 0.9145 - val_loss: 0.3299 - val_accuracy: 0.8830\n", "Epoch 13/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2408 - accuracy: 0.9154 - val_loss: 0.3219 - val_accuracy: 0.8870\n", "Epoch 14/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2380 - accuracy: 0.9154 - val_loss: 0.3221 - val_accuracy: 0.8860\n", "Epoch 15/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2378 - accuracy: 0.9166 - val_loss: 0.3208 - val_accuracy: 0.8864\n", "Epoch 16/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2318 - accuracy: 0.9191 - val_loss: 0.3184 - val_accuracy: 0.8892\n", "Epoch 17/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2266 - accuracy: 0.9212 - val_loss: 0.3197 - val_accuracy: 0.8906\n", "Epoch 18/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2284 - accuracy: 0.9185 - val_loss: 0.3169 - val_accuracy: 0.8906\n", "Epoch 19/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2286 - accuracy: 0.9205 - val_loss: 0.3197 - val_accuracy: 0.8884\n", "Epoch 20/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2288 - accuracy: 0.9211 - val_loss: 0.3169 - val_accuracy: 0.8906\n", "Epoch 21/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 
0.2265 - accuracy: 0.9212 - val_loss: 0.3179 - val_accuracy: 0.8904\n", "Epoch 22/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2258 - accuracy: 0.9205 - val_loss: 0.3163 - val_accuracy: 0.8914\n", "Epoch 23/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2224 - accuracy: 0.9226 - val_loss: 0.3170 - val_accuracy: 0.8904\n", "Epoch 24/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2182 - accuracy: 0.9244 - val_loss: 0.3165 - val_accuracy: 0.8898\n", "Epoch 25/25\n", "1719/1719 [==============================] - 2s 1ms/step - loss: 0.2224 - accuracy: 0.9229 - val_loss: 0.3164 - val_accuracy: 0.8904\n" ] } ], "source": [ "model = keras.models.Sequential([\n", "    keras.layers.Flatten(input_shape=[28, 28]),\n", "    keras.layers.Dense(300, activation=\"selu\", kernel_initializer=\"lecun_normal\"),\n", "    keras.layers.Dense(100, activation=\"selu\", kernel_initializer=\"lecun_normal\"),\n", "    keras.layers.Dense(10, activation=\"softmax\")\n", "])\n", "s = 20 * len(X_train) // 32  # number of steps in 20 epochs (batch size = 32)\n", "# divide the learning rate by 10 every s steps (smoothly, since staircase=False by default)\n", "learning_rate = keras.optimizers.schedules.ExponentialDecay(\n", "    initial_learning_rate=0.01, decay_steps=s, decay_rate=0.1)\n", "optimizer = keras.optimizers.SGD(learning_rate)\n", "model.compile(loss=\"sparse_categorical_crossentropy\", optimizer=optimizer, metrics=[\"accuracy\"])\n", "n_epochs = 25\n", "history = model.fit(X_train_scaled, y_train, epochs=n_epochs,\n", "                    validation_data=(X_valid_scaled, y_valid))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For piecewise constant scheduling, use the `PiecewiseConstantDecay` schedule instead:" ] }, { "cell_type": "code", "execution_count": 95, "metadata": {}, "outputs": [], "source": [ "learning_rate = keras.optimizers.schedules.PiecewiseConstantDecay(\n", "    boundaries=[5. * n_steps_per_epoch, 15. 
* n_steps_per_epoch],\n", "    values=[0.01, 0.005, 0.001])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1Cycle scheduling" ] }, { "cell_type": "code", "execution_count": 96, "metadata": {}, "outputs": [], "source": [ "import math\n", "\n", "K = keras.backend\n", "\n", "# Callback that multiplies the learning rate by `factor` after each batch,\n", "# recording the rate and the loss at every step\n", "class ExponentialLearningRate(keras.callbacks.Callback):\n", "    def __init__(self, factor):\n", "        self.factor = factor\n", "        self.rates = []\n", "        self.losses = []\n", "    def on_batch_end(self, batch, logs=None):\n", "        self.rates.append(K.get_value(self.model.optimizer.learning_rate))\n", "        self.losses.append(logs[\"loss\"])\n", "        K.set_value(self.model.optimizer.learning_rate, self.model.optimizer.learning_rate * self.factor)\n", "\n", "# Train while growing the learning rate exponentially from min_rate to max_rate,\n", "# then restore the model's initial weights and learning rate\n", "def find_learning_rate(model, X, y, epochs=1, batch_size=32, min_rate=10**-5, max_rate=10):\n", "    init_weights = model.get_weights()\n", "    iterations = math.ceil(len(X) / batch_size) * epochs\n", "    factor = np.exp(np.log(max_rate / min_rate) / iterations)\n", "    init_lr = K.get_value(model.optimizer.learning_rate)\n", "    K.set_value(model.optimizer.learning_rate, min_rate)\n", "    exp_lr = ExponentialLearningRate(factor)\n", "    history = model.fit(X, y, epochs=epochs, batch_size=batch_size,\n", "                        callbacks=[exp_lr])\n", "    K.set_value(model.optimizer.learning_rate, init_lr)\n", "    model.set_weights(init_weights)\n", "    return exp_lr.rates, exp_lr.losses\n", "\n", "# Plot the loss against the learning rate (log scale), with a line at the minimum loss\n", "def plot_lr_vs_loss(rates, losses):\n", "    plt.plot(rates, losses)\n", "    plt.gca().set_xscale('log')\n", "    plt.hlines(min(losses), min(rates), max(rates))\n", "    plt.axis([min(rates), max(rates), min(losses), (losses[0] + min(losses)) / 2])\n", "    plt.xlabel(\"Learning rate\")\n", "    plt.ylabel(\"Loss\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Warning**: In the `on_batch_end()` method, `logs[\"loss\"]` used to contain the batch loss, but in TensorFlow 2.2.0 it was replaced with the mean loss (since the start of the epoch). 
This explains why the graph below is much smoother than in the book (if you are using TF 2.2 or above). It also means that there is a lag between the moment the batch loss starts exploding and the moment the explosion becomes clear in the graph. So you should choose a slightly smaller learning rate than you would have chosen with the \"noisy\" graph. Alternatively, you can tweak the `ExponentialLearningRate` callback above so it computes the batch loss (based on the current mean loss and the previous mean loss):\n", "\n", "```python\n", "class ExponentialLearningRate(keras.callbacks.Callback):\n", "    def __init__(self, factor):\n", "        self.factor = factor\n", "        self.rates = []\n", "        self.losses = []\n", "    def on_epoch_begin(self, epoch, logs=None):\n", "        self.prev_loss = 0\n", "    def on_batch_end(self, batch, logs=None):\n", "        batch_loss = logs[\"loss\"] * (batch + 1) - self.prev_loss * batch\n", "        self.prev_loss = logs[\"loss\"]\n", "        self.rates.append(K.get_value(self.model.optimizer.learning_rate))\n", "        self.losses.append(batch_loss)\n", "        K.set_value(self.model.optimizer.learning_rate, self.model.optimizer.learning_rate * self.factor)\n", "```" ] }, { "cell_type": "code", "execution_count": 97, "metadata": {}, "outputs": [], "source": [ "tf.random.set_seed(42)\n", "np.random.seed(42)\n", "\n", "model = keras.models.Sequential([\n", "    keras.layers.Flatten(input_shape=[28, 28]),\n", "    keras.layers.Dense(300, activation=\"selu\", kernel_initializer=\"lecun_normal\"),\n", "    keras.layers.Dense(100, activation=\"selu\", kernel_initializer=\"lecun_normal\"),\n", "    keras.layers.Dense(10, activation=\"softmax\")\n", "])\n", "model.compile(loss=\"sparse_categorical_crossentropy\",\n", "              optimizer=keras.optimizers.SGD(learning_rate=1e-3),\n", "              metrics=[\"accuracy\"])" ] }, { "cell_type": "code", "execution_count": 98, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "430/430 [==============================] - 1s 2ms/step - loss: nan - 
accuracy: 0.3120\n" ] }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "batch_size = 128\n", "rates, losses = find_learning_rate(model, X_train_scaled, y_train, epochs=1, batch_size=batch_size)\n", "plot_lr_vs_loss(rates, losses)" ] }, { "cell_type": "code", "execution_count": 99, "metadata": {}, "outputs": [], "source": [ "class OneCycleScheduler(keras.callbacks.Callback):\n", "    def __init__(self, iterations, max_rate, start_rate=None,\n", "                 last_iterations=None, last_rate=None):\n", "        self.iterations = iterations\n", "        self.max_rate = max_rate\n", "        self.start_rate = start_rate or max_rate / 10\n", "        self.last_iterations = last_iterations or iterations // 10 + 1\n", "        self.half_iteration = (iterations - self.last_iterations) // 2\n", "        self.last_rate = last_rate or self.start_rate / 1000\n", "        self.iteration = 0\n", "    def _interpolate(self, iter1, iter2, rate1, rate2):\n", "        return ((rate2 - rate1) * (self.iteration - iter1)\n", "                / (iter2 - iter1) + rate1)\n", "    def on_batch_begin(self, batch, logs):\n", "        if self.iteration < self.half_iteration:\n", "            rate = self._interpolate(0, self.half_iteration, self.start_rate, self.max_rate)\n", "        elif self.iteration < 2 * self.half_iteration:\n", "            rate = self._interpolate(self.half_iteration, 2 * self.half_iteration,\n", "                                     self.max_rate, self.start_rate)\n", "        else:\n", "            rate = self._interpolate(2 * self.half_iteration, self.iterations,\n", "                                     self.start_rate, self.last_rate)\n", "        self.iteration += 1\n", "        K.set_value(self.model.optimizer.learning_rate, rate)" ] }, { "cell_type": "code", "execution_count": 100, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/25\n", "430/430 [==============================] - 1s 2ms/step - loss: 0.6572 - accuracy: 0.7740 - val_loss: 0.4872 - val_accuracy: 0.8338\n", "Epoch 2/25\n", "430/430 [==============================] - 1s 2ms/step - loss: 0.4580 - accuracy: 0.8397 - val_loss: 0.4274 - val_accuracy: 0.8520\n", 
"Epoch 3/25\n", "430/430 [==============================] - 1s 2ms/step - loss: 0.4121 - accuracy: 0.8545 - val_loss: 0.4116 - val_accuracy: 0.8588\n", "Epoch 4/25\n", "430/430 [==============================] - 1s 2ms/step - loss: 0.3837 - accuracy: 0.8642 - val_loss: 0.3868 - val_accuracy: 0.8688\n", "Epoch 5/25\n", "430/430 [==============================] - 1s 2ms/step - loss: 0.3639 - accuracy: 0.8719 - val_loss: 0.3766 - val_accuracy: 0.8688\n", "Epoch 6/25\n", "430/430 [==============================] - 1s 2ms/step - loss: 0.3456 - accuracy: 0.8775 - val_loss: 0.3739 - val_accuracy: 0.8706\n", "Epoch 7/25\n", "430/430 [==============================] - 1s 2ms/step - loss: 0.3330 - accuracy: 0.8811 - val_loss: 0.3635 - val_accuracy: 0.8708\n", "Epoch 8/25\n", "430/430 [==============================] - 1s 2ms/step - loss: 0.3184 - accuracy: 0.8861 - val_loss: 0.3959 - val_accuracy: 0.8610\n", "Epoch 9/25\n", "430/430 [==============================] - 1s 2ms/step - loss: 0.3065 - accuracy: 0.8890 - val_loss: 0.3475 - val_accuracy: 0.8770\n", "Epoch 10/25\n", "430/430 [==============================] - 1s 2ms/step - loss: 0.2943 - accuracy: 0.8927 - val_loss: 0.3392 - val_accuracy: 0.8806\n", "Epoch 11/25\n", "430/430 [==============================] - 1s 2ms/step - loss: 0.2838 - accuracy: 0.8963 - val_loss: 0.3467 - val_accuracy: 0.8800\n", "Epoch 12/25\n", "430/430 [==============================] - 1s 2ms/step - loss: 0.2707 - accuracy: 0.9024 - val_loss: 0.3646 - val_accuracy: 0.8696\n", "Epoch 13/25\n", "430/430 [==============================] - 1s 2ms/step - loss: 0.2536 - accuracy: 0.9079 - val_loss: 0.3350 - val_accuracy: 0.8842\n", "Epoch 14/25\n", "430/430 [==============================] - 1s 2ms/step - loss: 0.2405 - accuracy: 0.9135 - val_loss: 0.3465 - val_accuracy: 0.8794\n", "Epoch 15/25\n", "430/430 [==============================] - 1s 2ms/step - loss: 0.2279 - accuracy: 0.9185 - val_loss: 0.3257 - val_accuracy: 0.8830\n", "Epoch 16/25\n", 
"430/430 [==============================] - 1s 2ms/step - loss: 0.2159 - accuracy: 0.9232 - val_loss: 0.3294 - val_accuracy: 0.8824\n", "Epoch 17/25\n", "430/430 [==============================] - 1s 2ms/step - loss: 0.2062 - accuracy: 0.9263 - val_loss: 0.3333 - val_accuracy: 0.8882\n", "Epoch 18/25\n", "430/430 [==============================] - 1s 2ms/step - loss: 0.1978 - accuracy: 0.9301 - val_loss: 0.3235 - val_accuracy: 0.8898\n", "Epoch 19/25\n", "430/430 [==============================] - 1s 2ms/step - loss: 0.1892 - accuracy: 0.9337 - val_loss: 0.3233 - val_accuracy: 0.8906\n", "Epoch 20/25\n", "430/430 [==============================] - 1s 2ms/step - loss: 0.1821 - accuracy: 0.9365 - val_loss: 0.3224 - val_accuracy: 0.8928\n", "Epoch 21/25\n", "430/430 [==============================] - 1s 2ms/step - loss: 0.1752 - accuracy: 0.9400 - val_loss: 0.3220 - val_accuracy: 0.8908\n", "Epoch 22/25\n", "430/430 [==============================] - 1s 2ms/step - loss: 0.1700 - accuracy: 0.9416 - val_loss: 0.3180 - val_accuracy: 0.8962\n", "Epoch 23/25\n", "430/430 [==============================] - 1s 2ms/step - loss: 0.1655 - accuracy: 0.9438 - val_loss: 0.3187 - val_accuracy: 0.8940\n", "Epoch 24/25\n", "430/430 [==============================] - 1s 2ms/step - loss: 0.1627 - accuracy: 0.9454 - val_loss: 0.3177 - val_accuracy: 0.8932\n", "Epoch 25/25\n", "430/430 [==============================] - 1s 2ms/step - loss: 0.1610 - accuracy: 0.9462 - val_loss: 0.3170 - val_accuracy: 0.8934\n" ] } ], "source": [ "n_epochs = 25\n", "onecycle = OneCycleScheduler(math.ceil(len(X_train) / batch_size) * n_epochs, max_rate=0.05)\n", "history = model.fit(X_train_scaled, y_train, epochs=n_epochs, batch_size=batch_size,\n", " validation_data=(X_valid_scaled, y_valid),\n", " callbacks=[onecycle])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Avoiding Overfitting Through Regularization" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## $\\ell_1$ and 
$\\ell_2$ regularization" ] }, { "cell_type": "code", "execution_count": 101, "metadata": {}, "outputs": [], "source": [ "layer = keras.layers.Dense(100, activation=\"elu\",\n", " kernel_initializer=\"he_normal\",\n", " kernel_regularizer=keras.regularizers.l2(0.01))\n", "# or l1(0.1) for ℓ1 regularization with a factor of 0.1\n", "# or l1_l2(0.1, 0.01) for both ℓ1 and ℓ2 regularization, with factors 0.1 and 0.01 respectively" ] }, { "cell_type": "code", "execution_count": 102, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/2\n", "1719/1719 [==============================] - 6s 3ms/step - loss: 3.2189 - accuracy: 0.7967 - val_loss: 0.7169 - val_accuracy: 0.8340\n", "Epoch 2/2\n", "1719/1719 [==============================] - 5s 3ms/step - loss: 0.7280 - accuracy: 0.8247 - val_loss: 0.6850 - val_accuracy: 0.8376\n" ] } ], "source": [ "model = keras.models.Sequential([\n", " keras.layers.Flatten(input_shape=[28, 28]),\n", " keras.layers.Dense(300, activation=\"elu\",\n", " kernel_initializer=\"he_normal\",\n", " kernel_regularizer=keras.regularizers.l2(0.01)),\n", " keras.layers.Dense(100, activation=\"elu\",\n", " kernel_initializer=\"he_normal\",\n", " kernel_regularizer=keras.regularizers.l2(0.01)),\n", " keras.layers.Dense(10, activation=\"softmax\",\n", " kernel_regularizer=keras.regularizers.l2(0.01))\n", "])\n", "model.compile(loss=\"sparse_categorical_crossentropy\", optimizer=\"nadam\", metrics=[\"accuracy\"])\n", "n_epochs = 2\n", "history = model.fit(X_train_scaled, y_train, epochs=n_epochs,\n", " validation_data=(X_valid_scaled, y_valid))" ] }, { "cell_type": "code", "execution_count": 103, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/2\n", "1719/1719 [==============================] - 6s 3ms/step - loss: 3.2911 - accuracy: 0.7924 - val_loss: 0.7218 - val_accuracy: 0.8310\n", "Epoch 2/2\n", "1719/1719 [==============================] - 5s 3ms/step - loss: 0.7282 
- accuracy: 0.8245 - val_loss: 0.6826 - val_accuracy: 0.8382\n" ] } ], "source": [ "from functools import partial\n", "\n", "RegularizedDense = partial(keras.layers.Dense,\n", " activation=\"elu\",\n", " kernel_initializer=\"he_normal\",\n", " kernel_regularizer=keras.regularizers.l2(0.01))\n", "\n", "model = keras.models.Sequential([\n", " keras.layers.Flatten(input_shape=[28, 28]),\n", " RegularizedDense(300),\n", " RegularizedDense(100),\n", " RegularizedDense(10, activation=\"softmax\")\n", "])\n", "model.compile(loss=\"sparse_categorical_crossentropy\", optimizer=\"nadam\", metrics=[\"accuracy\"])\n", "n_epochs = 2\n", "history = model.fit(X_train_scaled, y_train, epochs=n_epochs,\n", " validation_data=(X_valid_scaled, y_valid))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Dropout" ] }, { "cell_type": "code", "execution_count": 104, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/2\n", "1719/1719 [==============================] - 6s 3ms/step - loss: 0.7611 - accuracy: 0.7576 - val_loss: 0.3730 - val_accuracy: 0.8644\n", "Epoch 2/2\n", "1719/1719 [==============================] - 5s 3ms/step - loss: 0.4306 - accuracy: 0.8401 - val_loss: 0.3395 - val_accuracy: 0.8722\n" ] } ], "source": [ "model = keras.models.Sequential([\n", " keras.layers.Flatten(input_shape=[28, 28]),\n", " keras.layers.Dropout(rate=0.2),\n", " keras.layers.Dense(300, activation=\"elu\", kernel_initializer=\"he_normal\"),\n", " keras.layers.Dropout(rate=0.2),\n", " keras.layers.Dense(100, activation=\"elu\", kernel_initializer=\"he_normal\"),\n", " keras.layers.Dropout(rate=0.2),\n", " keras.layers.Dense(10, activation=\"softmax\")\n", "])\n", "model.compile(loss=\"sparse_categorical_crossentropy\", optimizer=\"nadam\", metrics=[\"accuracy\"])\n", "n_epochs = 2\n", "history = model.fit(X_train_scaled, y_train, epochs=n_epochs,\n", " validation_data=(X_valid_scaled, y_valid))" ] }, { "cell_type": "markdown", "metadata": {}, 
"source": [ "## Alpha Dropout" ] }, { "cell_type": "code", "execution_count": 105, "metadata": {}, "outputs": [], "source": [ "tf.random.set_seed(42)\n", "np.random.seed(42)" ] }, { "cell_type": "code", "execution_count": 106, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/20\n", "1719/1719 [==============================] - 3s 2ms/step - loss: 0.8023 - accuracy: 0.7146 - val_loss: 0.5781 - val_accuracy: 0.8442\n", "Epoch 2/20\n", "1719/1719 [==============================] - 3s 1ms/step - loss: 0.5663 - accuracy: 0.7905 - val_loss: 0.5182 - val_accuracy: 0.8520\n", "Epoch 3/20\n", "1719/1719 [==============================] - 3s 2ms/step - loss: 0.5264 - accuracy: 0.8054 - val_loss: 0.4874 - val_accuracy: 0.8600\n", "Epoch 4/20\n", "1719/1719 [==============================] - 3s 2ms/step - loss: 0.5126 - accuracy: 0.8092 - val_loss: 0.4890 - val_accuracy: 0.8598\n", "Epoch 5/20\n", "1719/1719 [==============================] - 3s 2ms/step - loss: 0.5071 - accuracy: 0.8133 - val_loss: 0.4266 - val_accuracy: 0.8696\n", "Epoch 6/20\n", "1719/1719 [==============================] - 3s 1ms/step - loss: 0.4793 - accuracy: 0.8198 - val_loss: 0.4585 - val_accuracy: 0.8640\n", "Epoch 7/20\n", "1719/1719 [==============================] - 3s 1ms/step - loss: 0.4724 - accuracy: 0.8262 - val_loss: 0.4740 - val_accuracy: 0.8612\n", "Epoch 8/20\n", "1719/1719 [==============================] - 3s 2ms/step - loss: 0.4570 - accuracy: 0.8297 - val_loss: 0.4295 - val_accuracy: 0.8656\n", "Epoch 9/20\n", "1719/1719 [==============================] - 3s 2ms/step - loss: 0.4632 - accuracy: 0.8286 - val_loss: 0.4357 - val_accuracy: 0.8736\n", "Epoch 10/20\n", "1719/1719 [==============================] - 3s 2ms/step - loss: 0.4552 - accuracy: 0.8340 - val_loss: 0.4366 - val_accuracy: 0.8674\n", "Epoch 11/20\n", "1719/1719 [==============================] - 3s 1ms/step - loss: 0.4461 - accuracy: 0.8346 - val_loss: 0.4278 - val_accuracy: 
0.8684\n", "Epoch 12/20\n", "1719/1719 [==============================] - 3s 2ms/step - loss: 0.4419 - accuracy: 0.8351 - val_loss: 0.5086 - val_accuracy: 0.8558\n", "Epoch 13/20\n", "1719/1719 [==============================] - 3s 2ms/step - loss: 0.4329 - accuracy: 0.8385 - val_loss: 0.4280 - val_accuracy: 0.8728\n", "Epoch 14/20\n", "1719/1719 [==============================] - 3s 2ms/step - loss: 0.4305 - accuracy: 0.8399 - val_loss: 0.4460 - val_accuracy: 0.8628\n", "Epoch 15/20\n", "1719/1719 [==============================] - 3s 2ms/step - loss: 0.4315 - accuracy: 0.8397 - val_loss: 0.4361 - val_accuracy: 0.8706\n", "Epoch 16/20\n", "1719/1719 [==============================] - 3s 2ms/step - loss: 0.4251 - accuracy: 0.8403 - val_loss: 0.4280 - val_accuracy: 0.8758\n", "Epoch 17/20\n", "1719/1719 [==============================] - 3s 2ms/step - loss: 0.4207 - accuracy: 0.8427 - val_loss: 0.5336 - val_accuracy: 0.8584\n", "Epoch 18/20\n", "1719/1719 [==============================] - 3s 2ms/step - loss: 0.4365 - accuracy: 0.8387 - val_loss: 0.4769 - val_accuracy: 0.8736\n", "Epoch 19/20\n", "1719/1719 [==============================] - 3s 1ms/step - loss: 0.4262 - accuracy: 0.8409 - val_loss: 0.4636 - val_accuracy: 0.8706\n", "Epoch 20/20\n", "1719/1719 [==============================] - 3s 1ms/step - loss: 0.4189 - accuracy: 0.8421 - val_loss: 0.4388 - val_accuracy: 0.8760\n" ] } ], "source": [ "model = keras.models.Sequential([\n", " keras.layers.Flatten(input_shape=[28, 28]),\n", " keras.layers.AlphaDropout(rate=0.2),\n", " keras.layers.Dense(300, activation=\"selu\", kernel_initializer=\"lecun_normal\"),\n", " keras.layers.AlphaDropout(rate=0.2),\n", " keras.layers.Dense(100, activation=\"selu\", kernel_initializer=\"lecun_normal\"),\n", " keras.layers.AlphaDropout(rate=0.2),\n", " keras.layers.Dense(10, activation=\"softmax\")\n", "])\n", "optimizer = keras.optimizers.SGD(learning_rate=0.01, momentum=0.9, nesterov=True)\n", 
"model.compile(loss=\"sparse_categorical_crossentropy\", optimizer=optimizer, metrics=[\"accuracy\"])\n", "n_epochs = 20\n", "history = model.fit(X_train_scaled, y_train, epochs=n_epochs,\n", " validation_data=(X_valid_scaled, y_valid))" ] }, { "cell_type": "code", "execution_count": 107, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "313/313 [==============================] - 0s 834us/step - loss: 0.4723 - accuracy: 0.8639\n" ] }, { "data": { "text/plain": [ "[0.47229740023612976, 0.8639000058174133]" ] }, "execution_count": 107, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model.evaluate(X_test_scaled, y_test)" ] }, { "cell_type": "code", "execution_count": 108, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1719/1719 [==============================] - 1s 782us/step - loss: 0.3501 - accuracy: 0.8840\n" ] }, { "data": { "text/plain": [ "[0.3501231074333191, 0.8840363621711731]" ] }, "execution_count": 108, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model.evaluate(X_train_scaled, y_train)" ] }, { "cell_type": "code", "execution_count": 109, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1719/1719 [==============================] - 2s 1ms/step - loss: 0.4225 - accuracy: 0.8432\n" ] } ], "source": [ "history = model.fit(X_train_scaled, y_train)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## MC Dropout" ] }, { "cell_type": "code", "execution_count": 110, "metadata": {}, "outputs": [], "source": [ "tf.random.set_seed(42)\n", "np.random.seed(42)" ] }, { "cell_type": "code", "execution_count": 111, "metadata": {}, "outputs": [], "source": [ "y_probas = np.stack([model(X_test_scaled, training=True)\n", " for sample in range(100)])\n", "y_proba = y_probas.mean(axis=0)\n", "y_std = y_probas.std(axis=0)" ] }, { "cell_type": "code", "execution_count": 112, "metadata": {}, "outputs": [ { "data": { 
"text/plain": [ "array([[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.04, 0. , 0.96]],\n", " dtype=float32)" ] }, "execution_count": 112, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.round(model.predict(X_test_scaled[:1]), 2)" ] }, { "cell_type": "code", "execution_count": 113, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[[0. , 0. , 0. , 0. , 0. , 0.01, 0. , 0.57, 0. , 0.42]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.03, 0. , 0.93, 0. , 0.05]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.01, 0. , 0. , 0. , 0.99]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.08, 0. , 0.92]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.46, 0. , 0.53]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.75, 0. , 0.24]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.06, 0. , 0.43, 0. , 0.51]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.13, 0. , 0.28, 0. , 0.59]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.12, 0. , 0.06, 0. , 0.81]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.01, 0. , 0.02, 0. , 0.98]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.01, 0. , 0.07, 0. , 0.92]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.08, 0. , 0.28, 0. , 0.64]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.01, 0. , 0.29, 0. , 0.7 ]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.26, 0. , 0.22, 0. , 0.52]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.03, 0. , 0.25, 0. , 0.72]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.3 , 0. , 0.06, 0. , 0.65]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.04, 0. , 0. , 0. , 0.96]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.02, 0. , 0.81, 0. , 0.17]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.01, 0. , 0.25, 0. , 0.75]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.04, 0. , 0.03, 0. , 0.93]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.84, 0.04, 0.01, 0. , 0.1 ]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.2 , 0. , 0.1 , 0. , 0.7 ]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 1. ]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.01, 0. , 0.28, 0. 
, 0.72]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.12, 0. , 0.12, 0. , 0.76]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.04, 0. , 0.75, 0. , 0.21]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.02, 0. , 0.11, 0. , 0.87]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.28, 0. , 0.39, 0. , 0.34]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.02, 0. , 0.04, 0. , 0.95]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.18, 0. , 0.51, 0.01, 0.3 ]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.01, 0. , 0.01, 0. , 0.99]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 1. ]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.59, 0. , 0.01, 0. , 0.39]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.12, 0. , 0.8 , 0. , 0.08]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.05, 0. , 0.27, 0. , 0.67]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.07, 0. , 0.77, 0. , 0.16]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.01, 0. , 0.99]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.01, 0. , 0.18, 0. , 0.81]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.04, 0. , 0.32, 0. , 0.64]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.75, 0. , 0.04, 0. , 0.21]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.01, 0. , 0.24, 0. , 0.75]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.04, 0. , 0.96]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.44, 0. , 0.38, 0. , 0.17]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.02, 0. , 0.98]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.01, 0. , 0.02, 0. , 0.96]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.01, 0. , 0.99]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.06, 0. , 0.12, 0. , 0.82]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.02, 0. , 0.97]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.28, 0. , 0.71]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.02, 0. , 0.17, 0. , 0.82]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.23, 0. , 0.27, 0. , 0.5 ]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.06, 0. , 0.78, 0. , 0.16]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.01, 0. 
, 0.99]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.03, 0. , 0.03, 0. , 0.94]],\n", "\n", " [[0. , 0. , 0.01, 0. , 0.02, 0.21, 0.01, 0.01, 0. , 0.75]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.01, 0. , 0.06, 0. , 0.93]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.4 , 0. , 0.6 ]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.12, 0. , 0.18, 0.01, 0.69]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.02, 0. , 0.15, 0. , 0.83]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.24, 0. , 0.24, 0. , 0.52]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.11, 0. , 0.01, 0. , 0.88]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.2 , 0. , 0.8 ]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.05, 0. , 0.04, 0. , 0.91]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.02, 0. , 0.21, 0. , 0.77]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.26, 0. , 0.67, 0. , 0.07]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.27, 0. , 0.55, 0. , 0.19]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.09, 0. , 0.47, 0. , 0.43]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.01, 0. , 0.99]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.07, 0. , 0.71, 0. , 0.22]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.1 , 0. , 0.63, 0.01, 0.26]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 1. ]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.03, 0. , 0.16, 0. , 0.81]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.04, 0. , 0.15, 0. , 0.8 ]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.01, 0. , 0.24, 0. , 0.75]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.06, 0. , 0.13, 0. , 0.81]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.42, 0. , 0.01, 0. , 0.58]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.02, 0. , 0.15, 0. , 0.83]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.01, 0. , 0.79, 0. , 0.19]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.21, 0. , 0.79]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.06, 0. , 0.93]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.01, 0. , 0.12, 0. , 0.87]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.04, 0. , 0.06, 0. 
, 0.9 ]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 0. , 1. ]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.01, 0. , 0.01, 0. , 0.98]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.05, 0. , 0.08, 0. , 0.87]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.03, 0. , 0.51, 0. , 0.46]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.96, 0. , 0.02, 0. , 0.02]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.38, 0. , 0.05, 0. , 0.57]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.01, 0. , 0.07, 0. , 0.92]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.02, 0. , 0.98]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.05, 0. , 0.06, 0. , 0.89]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.01, 0. , 0.05, 0. , 0.94]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.37, 0. , 0.37, 0. , 0.26]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.37, 0. , 0.19, 0. , 0.44]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.02, 0. , 0.11, 0. , 0.87]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.01, 0. , 0.01, 0. , 0.98]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.01, 0. , 0.41, 0. , 0.59]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0. , 0. , 0.12, 0. , 0.87]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.08, 0. , 0.21, 0. , 0.71]],\n", "\n", " [[0. , 0. , 0. , 0. , 0. , 0.12, 0. , 0.13, 0. , 0.75]]],\n", " dtype=float32)" ] }, "execution_count": 113, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.round(y_probas[:, :1], 2)" ] }, { "cell_type": "code", "execution_count": 114, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[0. , 0. , 0. , 0. , 0. , 0.1 , 0. , 0.22, 0. , 0.68]],\n", " dtype=float32)" ] }, "execution_count": 114, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.round(y_proba[:1], 2)" ] }, { "cell_type": "code", "execution_count": 115, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[0. , 0. , 0. , 0. , 0. , 0.18, 0. , 0.24, 0. 
, 0.29]],\n", " dtype=float32)" ] }, "execution_count": 115, "metadata": {}, "output_type": "execute_result" } ], "source": [ "y_std = y_probas.std(axis=0)\n", "np.round(y_std[:1], 2)" ] }, { "cell_type": "code", "execution_count": 116, "metadata": {}, "outputs": [], "source": [ "y_pred = np.argmax(y_proba, axis=1)" ] }, { "cell_type": "code", "execution_count": 117, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.8661" ] }, "execution_count": 117, "metadata": {}, "output_type": "execute_result" } ], "source": [ "accuracy = np.sum(y_pred == y_test) / len(y_test)\n", "accuracy" ] }, { "cell_type": "code", "execution_count": 118, "metadata": {}, "outputs": [], "source": [ "# Dropout variants that stay active at inference time, as MC Dropout requires\n", "class MCDropout(keras.layers.Dropout):\n", "    def call(self, inputs):\n", "        return super().call(inputs, training=True)\n", "\n", "class MCAlphaDropout(keras.layers.AlphaDropout):\n", "    def call(self, inputs):\n", "        return super().call(inputs, training=True)" ] }, { "cell_type": "code", "execution_count": 119, "metadata": {}, "outputs": [], "source": [ "tf.random.set_seed(42)\n", "np.random.seed(42)" ] }, { "cell_type": "code", "execution_count": 120, "metadata": {}, "outputs": [], "source": [ "mc_model = keras.models.Sequential([\n", "    MCAlphaDropout(layer.rate) if isinstance(layer, keras.layers.AlphaDropout) else layer\n", "    for layer in model.layers\n", "])" ] }, { "cell_type": "code", "execution_count": 121, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Model: \"sequential_20\"\n", "_________________________________________________________________\n", "Layer (type) Output Shape Param # \n", "=================================================================\n", "flatten_18 (Flatten) (None, 784) 0 \n", "_________________________________________________________________\n", "mc_alpha_dropout (MCAlphaDro (None, 784) 0 \n", "_________________________________________________________________\n", "dense_262 (Dense) (None, 300) 235500 \n", 
"_________________________________________________________________\n", "mc_alpha_dropout_1 (MCAlphaD (None, 300) 0 \n", "_________________________________________________________________\n", "dense_263 (Dense) (None, 100) 30100 \n", "_________________________________________________________________\n", "mc_alpha_dropout_2 (MCAlphaD (None, 100) 0 \n", "_________________________________________________________________\n", "dense_264 (Dense) (None, 10) 1010 \n", "=================================================================\n", "Total params: 266,610\n", "Trainable params: 266,610\n", "Non-trainable params: 0\n", "_________________________________________________________________\n" ] } ], "source": [ "mc_model.summary()" ] }, { "cell_type": "code", "execution_count": 122, "metadata": {}, "outputs": [], "source": [ "optimizer = keras.optimizers.SGD(learning_rate=0.01, momentum=0.9, nesterov=True)\n", "mc_model.compile(loss=\"sparse_categorical_crossentropy\", optimizer=optimizer, metrics=[\"accuracy\"])" ] }, { "cell_type": "code", "execution_count": 123, "metadata": {}, "outputs": [], "source": [ "mc_model.set_weights(model.get_weights())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can use the model with MC Dropout:" ] }, { "cell_type": "code", "execution_count": 124, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[0. , 0. , 0. , 0. , 0. , 0.14, 0. 
, 0.25, 0.01, 0.61]],\n", " dtype=float32)" ] }, "execution_count": 124, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.round(np.mean([mc_model.predict(X_test_scaled[:1]) for sample in range(100)], axis=0), 2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Max norm" ] }, { "cell_type": "code", "execution_count": 125, "metadata": {}, "outputs": [], "source": [ "layer = keras.layers.Dense(100, activation=\"selu\", kernel_initializer=\"lecun_normal\",\n", " kernel_constraint=keras.constraints.max_norm(1.))" ] }, { "cell_type": "code", "execution_count": 126, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/2\n", "1719/1719 [==============================] - 5s 3ms/step - loss: 0.5763 - accuracy: 0.8020 - val_loss: 0.3674 - val_accuracy: 0.8674\n", "Epoch 2/2\n", "1719/1719 [==============================] - 5s 3ms/step - loss: 0.3545 - accuracy: 0.8709 - val_loss: 0.3714 - val_accuracy: 0.8662\n" ] } ], "source": [ "MaxNormDense = partial(keras.layers.Dense,\n", " activation=\"selu\", kernel_initializer=\"lecun_normal\",\n", " kernel_constraint=keras.constraints.max_norm(1.))\n", "\n", "model = keras.models.Sequential([\n", " keras.layers.Flatten(input_shape=[28, 28]),\n", " MaxNormDense(300),\n", " MaxNormDense(100),\n", " keras.layers.Dense(10, activation=\"softmax\")\n", "])\n", "model.compile(loss=\"sparse_categorical_crossentropy\", optimizer=\"nadam\", metrics=[\"accuracy\"])\n", "n_epochs = 2\n", "history = model.fit(X_train_scaled, y_train, epochs=n_epochs,\n", " validation_data=(X_valid_scaled, y_valid))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Exercises" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. to 7." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "See appendix A." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 8. 
Deep Learning on CIFAR10" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### a.\n", "*Exercise: Build a DNN with 20 hidden layers of 100 neurons each (that's too many, but it's the point of this exercise). Use He initialization and the ELU activation function.*" ] }, { "cell_type": "code", "execution_count": 127, "metadata": {}, "outputs": [], "source": [ "keras.backend.clear_session()\n", "tf.random.set_seed(42)\n", "np.random.seed(42)\n", "\n", "model = keras.models.Sequential()\n", "model.add(keras.layers.Flatten(input_shape=[32, 32, 3]))\n", "for _ in range(20):\n", " model.add(keras.layers.Dense(100,\n", " activation=\"elu\",\n", " kernel_initializer=\"he_normal\"))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### b.\n", "*Exercise: Using Nadam optimization and early stopping, train the network on the CIFAR10 dataset. You can load it with `keras.datasets.cifar10.load_data()`. The dataset is composed of 60,000 32 × 32–pixel color images (50,000 for training, 10,000 for testing) with 10 classes, so you'll need a softmax output layer with 10 neurons. Remember to search for the right learning rate each time you change the model's architecture or hyperparameters.*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's add the output layer to the model:" ] }, { "cell_type": "code", "execution_count": 128, "metadata": {}, "outputs": [], "source": [ "model.add(keras.layers.Dense(10, activation=\"softmax\"))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's use a Nadam optimizer with a learning rate of 5e-5. I tried learning rates 1e-5, 3e-5, 1e-4, 3e-4, 1e-3, 3e-3 and 1e-2, and I compared their learning curves for 10 epochs each (using the TensorBoard callback, below). The learning rates 3e-5 and 1e-4 were pretty good, so I tried 5e-5, which turned out to be slightly better." 
] }, { "cell_type": "code", "execution_count": 129, "metadata": {}, "outputs": [], "source": [ "optimizer = keras.optimizers.Nadam(learning_rate=5e-5)\n", "model.compile(loss=\"sparse_categorical_crossentropy\",\n", " optimizer=optimizer,\n", " metrics=[\"accuracy\"])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's load the CIFAR10 dataset. We also want to use early stopping, so we need a validation set. Let's use the first 5,000 images of the original training set as the validation set:" ] }, { "cell_type": "code", "execution_count": 130, "metadata": {}, "outputs": [], "source": [ "(X_train_full, y_train_full), (X_test, y_test) = keras.datasets.cifar10.load_data()\n", "\n", "X_train = X_train_full[5000:]\n", "y_train = y_train_full[5000:]\n", "X_valid = X_train_full[:5000]\n", "y_valid = y_train_full[:5000]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can create the callbacks we need and train the model:" ] }, { "cell_type": "code", "execution_count": 131, "metadata": {}, "outputs": [], "source": [ "early_stopping_cb = keras.callbacks.EarlyStopping(patience=20)\n", "model_checkpoint_cb = keras.callbacks.ModelCheckpoint(\"my_cifar10_model.h5\", save_best_only=True)\n", "run_index = 1 # increment every time you train the model\n", "run_logdir = os.path.join(os.curdir, \"my_cifar10_logs\", \"run_{:03d}\".format(run_index))\n", "tensorboard_cb = keras.callbacks.TensorBoard(run_logdir)\n", "callbacks = [early_stopping_cb, model_checkpoint_cb, tensorboard_cb]" ] }, { "cell_type": "code", "execution_count": 132, "metadata": {}, "outputs": [], "source": [ "%tensorboard 
--logdir=./my_cifar10_logs --port=6006" ] }, { "cell_type": "code", "execution_count": 133, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/100\n", "1407/1407 [==============================] - 9s 5ms/step - loss: 9.4191 - accuracy: 0.1388 - val_loss: 2.2328 - val_accuracy: 0.2040\n", "Epoch 2/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 2.1097 - accuracy: 0.2317 - val_loss: 2.0485 - val_accuracy: 0.2402\n", "Epoch 3/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.9667 - accuracy: 0.2844 - val_loss: 1.9681 - val_accuracy: 0.2964\n", "Epoch 4/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.8740 - accuracy: 0.3149 - val_loss: 1.9178 - val_accuracy: 0.3254\n", "Epoch 5/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.8064 - accuracy: 0.3423 - val_loss: 1.8256 - val_accuracy: 0.3384\n", "Epoch 6/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.7525 - accuracy: 0.3595 - val_loss: 1.7430 - val_accuracy: 0.3692\n", "Epoch 7/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.7116 - accuracy: 0.3819 - val_loss: 1.7199 - val_accuracy: 0.3824\n", "Epoch 8/100\n", "1407/1407 [==============================] - 8s 5ms/step - loss: 1.6782 - accuracy: 0.3935 - val_loss: 1.6746 - val_accuracy: 0.3972\n", "Epoch 9/100\n", "1407/1407 [==============================] - 8s 5ms/step - loss: 1.6517 - accuracy: 0.4025 - val_loss: 1.6622 - val_accuracy: 0.4004\n", "Epoch 10/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.6140 - accuracy: 0.4194 - val_loss: 1.7065 - val_accuracy: 0.3840\n", "Epoch 11/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.5884 - accuracy: 0.4301 - val_loss: 1.6736 - val_accuracy: 0.3914\n", "Epoch 12/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.5640 - accuracy: 0.4378 - 
val_loss: 1.6220 - val_accuracy: 0.4224\n", "Epoch 13/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.5437 - accuracy: 0.4448 - val_loss: 1.6332 - val_accuracy: 0.4144\n", "Epoch 14/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.5214 - accuracy: 0.4555 - val_loss: 1.5785 - val_accuracy: 0.4326\n", "Epoch 15/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.5117 - accuracy: 0.4564 - val_loss: 1.6267 - val_accuracy: 0.4164\n", "Epoch 16/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.4972 - accuracy: 0.4622 - val_loss: 1.5846 - val_accuracy: 0.4316\n", "Epoch 17/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.4888 - accuracy: 0.4661 - val_loss: 1.5549 - val_accuracy: 0.4420\n", "Epoch 18/100\n", "<<24 more lines>>\n", "1407/1407 [==============================] - 8s 5ms/step - loss: 1.3362 - accuracy: 0.5212 - val_loss: 1.6025 - val_accuracy: 0.4500\n", "Epoch 31/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.3360 - accuracy: 0.5207 - val_loss: 1.5175 - val_accuracy: 0.4602\n", "Epoch 32/100\n", "1407/1407 [==============================] - 8s 5ms/step - loss: 1.3031 - accuracy: 0.5302 - val_loss: 1.5397 - val_accuracy: 0.4572\n", "Epoch 33/100\n", "1407/1407 [==============================] - 8s 5ms/step - loss: 1.3082 - accuracy: 0.5308 - val_loss: 1.4997 - val_accuracy: 0.4776\n", "Epoch 34/100\n", "1407/1407 [==============================] - 8s 5ms/step - loss: 1.2882 - accuracy: 0.5338 - val_loss: 1.5482 - val_accuracy: 0.4620\n", "Epoch 35/100\n", "1407/1407 [==============================] - 8s 5ms/step - loss: 1.2889 - accuracy: 0.5355 - val_loss: 1.5474 - val_accuracy: 0.4604\n", "Epoch 36/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.2761 - accuracy: 0.5410 - val_loss: 1.5434 - val_accuracy: 0.4658\n", "Epoch 37/100\n", "1407/1407 
[==============================] - 7s 5ms/step - loss: 1.2658 - accuracy: 0.5481 - val_loss: 1.5502 - val_accuracy: 0.4706\n", "Epoch 38/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.2554 - accuracy: 0.5489 - val_loss: 1.5527 - val_accuracy: 0.4624\n", "Epoch 39/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.2504 - accuracy: 0.5471 - val_loss: 1.5482 - val_accuracy: 0.4602\n", "Epoch 40/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.2516 - accuracy: 0.5545 - val_loss: 1.5881 - val_accuracy: 0.4574\n", "Epoch 41/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.2401 - accuracy: 0.5566 - val_loss: 1.5403 - val_accuracy: 0.4670\n", "Epoch 42/100\n", "1407/1407 [==============================] - 8s 5ms/step - loss: 1.2305 - accuracy: 0.5570 - val_loss: 1.5343 - val_accuracy: 0.4790\n", "Epoch 43/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.2228 - accuracy: 0.5615 - val_loss: 1.5344 - val_accuracy: 0.4708\n", "Epoch 44/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.2161 - accuracy: 0.5619 - val_loss: 1.5782 - val_accuracy: 0.4526\n", "Epoch 45/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.2124 - accuracy: 0.5641 - val_loss: 1.5182 - val_accuracy: 0.4794\n", "Epoch 46/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.1870 - accuracy: 0.5766 - val_loss: 1.5435 - val_accuracy: 0.4650\n", "Epoch 47/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.1925 - accuracy: 0.5701 - val_loss: 1.5532 - val_accuracy: 0.4686\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 133, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model.fit(X_train, y_train, epochs=100,\n", " validation_data=(X_valid, y_valid),\n", " callbacks=callbacks)" ] }, { "cell_type": "code", "execution_count": 134, "metadata": {}, 
"outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "157/157 [==============================] - 0s 1ms/step - loss: 1.4960 - accuracy: 0.4762\n" ] }, { "data": { "text/plain": [ "[1.4960416555404663, 0.47620001435279846]" ] }, "execution_count": 134, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model = keras.models.load_model(\"my_cifar10_model.h5\")\n", "model.evaluate(X_valid, y_valid)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The model with the lowest validation loss gets about 47.6% accuracy on the validation set. It took 27 epochs to reach the lowest validation loss, with roughly 8 seconds per epoch on my laptop (without a GPU). Let's see if we can improve performance using Batch Normalization." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### c.\n", "*Exercise: Now try adding Batch Normalization and compare the learning curves: Is it converging faster than before? Does it produce a better model? How does it affect training speed?*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The code below is very similar to the code above, with a few changes:\n", "\n", "* I added a BN layer after every Dense layer (before the activation function), except for the output layer. I also added a BN layer before the first hidden layer.\n", "* I changed the learning rate to 5e-4. I experimented with 1e-5, 3e-5, 5e-5, 1e-4, 3e-4, 5e-4, 1e-3 and 3e-3, and I chose the one with the best validation performance after 20 epochs.\n", "* I renamed the run directories to run_bn_* and the model file name to my_cifar10_bn_model.h5." 
] }, { "cell_type": "code", "execution_count": 135, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/100\n", "1407/1407 [==============================] - 19s 9ms/step - loss: 1.9765 - accuracy: 0.2968 - val_loss: 1.6602 - val_accuracy: 0.4042\n", "Epoch 2/100\n", "1407/1407 [==============================] - 11s 8ms/step - loss: 1.6787 - accuracy: 0.4056 - val_loss: 1.5887 - val_accuracy: 0.4304\n", "Epoch 3/100\n", "1407/1407 [==============================] - 11s 8ms/step - loss: 1.6097 - accuracy: 0.4274 - val_loss: 1.5781 - val_accuracy: 0.4326\n", "Epoch 4/100\n", "1407/1407 [==============================] - 11s 8ms/step - loss: 1.5574 - accuracy: 0.4486 - val_loss: 1.5064 - val_accuracy: 0.4676\n", "Epoch 5/100\n", "1407/1407 [==============================] - 11s 8ms/step - loss: 1.5075 - accuracy: 0.4642 - val_loss: 1.4412 - val_accuracy: 0.4844\n", "Epoch 6/100\n", "1407/1407 [==============================] - 11s 8ms/step - loss: 1.4664 - accuracy: 0.4787 - val_loss: 1.4179 - val_accuracy: 0.4984\n", "Epoch 7/100\n", "1407/1407 [==============================] - 11s 8ms/step - loss: 1.4334 - accuracy: 0.4932 - val_loss: 1.4277 - val_accuracy: 0.4906\n", "Epoch 8/100\n", "1407/1407 [==============================] - 12s 8ms/step - loss: 1.4054 - accuracy: 0.5038 - val_loss: 1.3843 - val_accuracy: 0.5130\n", "Epoch 9/100\n", "1407/1407 [==============================] - 12s 8ms/step - loss: 1.3816 - accuracy: 0.5106 - val_loss: 1.3691 - val_accuracy: 0.5108\n", "Epoch 10/100\n", "1407/1407 [==============================] - 12s 8ms/step - loss: 1.3547 - accuracy: 0.5206 - val_loss: 1.3552 - val_accuracy: 0.5226\n", "Epoch 11/100\n", "1407/1407 [==============================] - 12s 9ms/step - loss: 1.3244 - accuracy: 0.5371 - val_loss: 1.3678 - val_accuracy: 0.5142\n", "Epoch 12/100\n", "1407/1407 [==============================] - 12s 8ms/step - loss: 1.3078 - accuracy: 0.5393 - val_loss: 1.3844 - val_accuracy: 
0.5080\n", "Epoch 13/100\n", "1407/1407 [==============================] - 12s 9ms/step - loss: 1.2889 - accuracy: 0.5431 - val_loss: 1.3566 - val_accuracy: 0.5164\n", "Epoch 14/100\n", "1407/1407 [==============================] - 12s 9ms/step - loss: 1.2607 - accuracy: 0.5559 - val_loss: 1.3626 - val_accuracy: 0.5248\n", "Epoch 15/100\n", "1407/1407 [==============================] - 12s 8ms/step - loss: 1.2580 - accuracy: 0.5587 - val_loss: 1.3616 - val_accuracy: 0.5276\n", "Epoch 16/100\n", "1407/1407 [==============================] - 12s 8ms/step - loss: 1.2441 - accuracy: 0.5586 - val_loss: 1.3350 - val_accuracy: 0.5286\n", "Epoch 17/100\n", "1407/1407 [==============================] - 12s 8ms/step - loss: 1.2241 - accuracy: 0.5676 - val_loss: 1.3370 - val_accuracy: 0.5408\n", "Epoch 18/100\n", "<<29 more lines>>\n", "Epoch 33/100\n", "1407/1407 [==============================] - 12s 8ms/step - loss: 1.0336 - accuracy: 0.6369 - val_loss: 1.3682 - val_accuracy: 0.5450\n", "Epoch 34/100\n", "1407/1407 [==============================] - 11s 8ms/step - loss: 1.0228 - accuracy: 0.6388 - val_loss: 1.3348 - val_accuracy: 0.5458\n", "Epoch 35/100\n", "1407/1407 [==============================] - 12s 8ms/step - loss: 1.0205 - accuracy: 0.6407 - val_loss: 1.3490 - val_accuracy: 0.5440\n", "Epoch 36/100\n", "1407/1407 [==============================] - 12s 9ms/step - loss: 1.0008 - accuracy: 0.6489 - val_loss: 1.3568 - val_accuracy: 0.5408\n", "Epoch 37/100\n", "1407/1407 [==============================] - 12s 9ms/step - loss: 0.9785 - accuracy: 0.6543 - val_loss: 1.3628 - val_accuracy: 0.5396\n", "Epoch 38/100\n", "1407/1407 [==============================] - 12s 9ms/step - loss: 0.9832 - accuracy: 0.6592 - val_loss: 1.3617 - val_accuracy: 0.5482\n", "Epoch 39/100\n", "1407/1407 [==============================] - 12s 8ms/step - loss: 0.9707 - accuracy: 0.6581 - val_loss: 1.3767 - val_accuracy: 0.5446\n", "Epoch 40/100\n", "1407/1407 [==============================] - 
12s 9ms/step - loss: 0.9590 - accuracy: 0.6651 - val_loss: 1.4200 - val_accuracy: 0.5314\n", "Epoch 41/100\n", "1407/1407 [==============================] - 12s 9ms/step - loss: 0.9548 - accuracy: 0.6668 - val_loss: 1.3692 - val_accuracy: 0.5450\n", "Epoch 42/100\n", "1407/1407 [==============================] - 12s 9ms/step - loss: 0.9480 - accuracy: 0.6667 - val_loss: 1.3841 - val_accuracy: 0.5310\n", "Epoch 43/100\n", "1407/1407 [==============================] - 12s 9ms/step - loss: 0.9411 - accuracy: 0.6716 - val_loss: 1.4036 - val_accuracy: 0.5382\n", "Epoch 44/100\n", "1407/1407 [==============================] - 12s 9ms/step - loss: 0.9383 - accuracy: 0.6708 - val_loss: 1.4114 - val_accuracy: 0.5236\n", "Epoch 45/100\n", "1407/1407 [==============================] - 12s 9ms/step - loss: 0.9258 - accuracy: 0.6769 - val_loss: 1.4224 - val_accuracy: 0.5324\n", "Epoch 46/100\n", "1407/1407 [==============================] - 12s 9ms/step - loss: 0.9072 - accuracy: 0.6836 - val_loss: 1.3875 - val_accuracy: 0.5442\n", "Epoch 47/100\n", "1407/1407 [==============================] - 12s 9ms/step - loss: 0.8996 - accuracy: 0.6850 - val_loss: 1.4449 - val_accuracy: 0.5280\n", "Epoch 48/100\n", "1407/1407 [==============================] - 13s 9ms/step - loss: 0.9050 - accuracy: 0.6835 - val_loss: 1.4167 - val_accuracy: 0.5338\n", "Epoch 49/100\n", "1407/1407 [==============================] - 12s 9ms/step - loss: 0.8934 - accuracy: 0.6880 - val_loss: 1.4260 - val_accuracy: 0.5294\n", "157/157 [==============================] - 1s 2ms/step - loss: 1.3344 - accuracy: 0.5398\n" ] }, { "data": { "text/plain": [ "[1.3343921899795532, 0.5397999882698059]" ] }, "execution_count": 135, "metadata": {}, "output_type": "execute_result" } ], "source": [ "keras.backend.clear_session()\n", "tf.random.set_seed(42)\n", "np.random.seed(42)\n", "\n", "model = keras.models.Sequential()\n", "model.add(keras.layers.Flatten(input_shape=[32, 32, 3]))\n", 
"model.add(keras.layers.BatchNormalization())\n", "for _ in range(20):\n", " model.add(keras.layers.Dense(100, kernel_initializer=\"he_normal\"))\n", " model.add(keras.layers.BatchNormalization())\n", " model.add(keras.layers.Activation(\"elu\"))\n", "model.add(keras.layers.Dense(10, activation=\"softmax\"))\n", "\n", "optimizer = keras.optimizers.Nadam(learning_rate=5e-4)\n", "model.compile(loss=\"sparse_categorical_crossentropy\",\n", " optimizer=optimizer,\n", " metrics=[\"accuracy\"])\n", "\n", "early_stopping_cb = keras.callbacks.EarlyStopping(patience=20)\n", "model_checkpoint_cb = keras.callbacks.ModelCheckpoint(\"my_cifar10_bn_model.h5\", save_best_only=True)\n", "run_index = 1 # increment every time you train the model\n", "run_logdir = os.path.join(os.curdir, \"my_cifar10_logs\", \"run_bn_{:03d}\".format(run_index))\n", "tensorboard_cb = keras.callbacks.TensorBoard(run_logdir)\n", "callbacks = [early_stopping_cb, model_checkpoint_cb, tensorboard_cb]\n", "\n", "model.fit(X_train, y_train, epochs=100,\n", " validation_data=(X_valid, y_valid),\n", " callbacks=callbacks)\n", "\n", "model = keras.models.load_model(\"my_cifar10_bn_model.h5\")\n", "model.evaluate(X_valid, y_valid)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* *Is the model converging faster than before?* Much faster! The previous model took 27 epochs to reach the lowest validation loss, while the new model achieved that same loss in just 5 epochs and continued to make progress until the 16th epoch. The BN layers stabilized training and allowed us to use a much larger learning rate, so convergence was faster.\n", "* *Does BN produce a better model?* Yes! The final model is also much better, with 54.0% accuracy instead of 47.6%. 
It's still not a very good model, but at least it's much better than before (a Convolutional Neural Network would do much better, but that's a different topic, see chapter 14).\n", "* *How does BN affect training speed?* Although the model converged much faster, each epoch took about 12s instead of 8s, because of the extra computations required by the BN layers. But overall, the wall time needed to match the previous model's best validation loss was much shorter: about 5 × 12s = 60s instead of 27 × 8s = 216s." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### d.\n", "*Exercise: Try replacing Batch Normalization with SELU, and make the necessary adjustments to ensure the network self-normalizes (i.e., standardize the input features, use LeCun normal initialization, make sure the DNN contains only a sequence of dense layers, etc.).*" ] }, { "cell_type": "code", "execution_count": 136, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/100\n", "1407/1407 [==============================] - 10s 5ms/step - loss: 2.0622 - accuracy: 0.2631 - val_loss: 1.7878 - val_accuracy: 0.3552\n", "Epoch 2/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.7328 - accuracy: 0.3830 - val_loss: 1.7028 - val_accuracy: 0.3828\n", "Epoch 3/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.6342 - accuracy: 0.4279 - val_loss: 1.6692 - val_accuracy: 0.4022\n", "Epoch 4/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.5524 - accuracy: 0.4538 - val_loss: 1.6350 - val_accuracy: 0.4300\n", "Epoch 5/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.4979 - accuracy: 0.4756 - val_loss: 1.5773 - val_accuracy: 0.4356\n", "Epoch 6/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.4428 - accuracy: 0.4902 - val_loss: 1.5529 - val_accuracy: 0.4630\n", "Epoch 7/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.3966 - accuracy: 0.5126 - val_loss: 1.5290 - 
val_accuracy: 0.4682\n", "Epoch 8/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.3549 - accuracy: 0.5232 - val_loss: 1.4633 - val_accuracy: 0.4792\n", "Epoch 9/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.3162 - accuracy: 0.5444 - val_loss: 1.4787 - val_accuracy: 0.4776\n", "Epoch 10/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.2825 - accuracy: 0.5534 - val_loss: 1.4794 - val_accuracy: 0.4934\n", "Epoch 11/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.2529 - accuracy: 0.5682 - val_loss: 1.5529 - val_accuracy: 0.4982\n", "Epoch 12/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.2256 - accuracy: 0.5784 - val_loss: 1.4942 - val_accuracy: 0.4902\n", "Epoch 13/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.2049 - accuracy: 0.5823 - val_loss: 1.4868 - val_accuracy: 0.5024\n", "Epoch 14/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.1627 - accuracy: 0.6012 - val_loss: 1.4839 - val_accuracy: 0.5082\n", "Epoch 15/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.1543 - accuracy: 0.6034 - val_loss: 1.5097 - val_accuracy: 0.4968\n", "Epoch 16/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.1200 - accuracy: 0.6135 - val_loss: 1.5001 - val_accuracy: 0.5120\n", "Epoch 17/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.1028 - accuracy: 0.6199 - val_loss: 1.4856 - val_accuracy: 0.5056\n", "Epoch 18/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.0863 - accuracy: 0.6265 - val_loss: 1.5116 - val_accuracy: 0.4966\n", "Epoch 19/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.0715 - accuracy: 0.6345 - val_loss: 1.5787 - val_accuracy: 0.5070\n", "Epoch 20/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.0342 - accuracy: 
0.6453 - val_loss: 1.4987 - val_accuracy: 0.5144\n", "Epoch 21/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.0169 - accuracy: 0.6531 - val_loss: 1.6292 - val_accuracy: 0.4462\n", "Epoch 22/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.1346 - accuracy: 0.6074 - val_loss: 1.5280 - val_accuracy: 0.5136\n", "Epoch 23/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 0.9820 - accuracy: 0.6678 - val_loss: 1.5392 - val_accuracy: 0.5040\n", "Epoch 24/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 0.9701 - accuracy: 0.6679 - val_loss: 1.5505 - val_accuracy: 0.5170\n", "Epoch 25/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.3604 - accuracy: 0.6753 - val_loss: 1.5468 - val_accuracy: 0.4992\n", "Epoch 26/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.0177 - accuracy: 0.6510 - val_loss: 1.5474 - val_accuracy: 0.5020\n", "Epoch 27/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 0.9425 - accuracy: 0.6798 - val_loss: 1.5545 - val_accuracy: 0.5076\n", "Epoch 28/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 0.9005 - accuracy: 0.6902 - val_loss: 1.5659 - val_accuracy: 0.5138\n", "157/157 [==============================] - 0s 1ms/step - loss: 1.4633 - accuracy: 0.4792\n" ] }, { "data": { "text/plain": [ "[1.4633383750915527, 0.47920000553131104]" ] }, "execution_count": 136, "metadata": {}, "output_type": "execute_result" } ], "source": [ "keras.backend.clear_session()\n", "tf.random.set_seed(42)\n", "np.random.seed(42)\n", "\n", "model = keras.models.Sequential()\n", "model.add(keras.layers.Flatten(input_shape=[32, 32, 3]))\n", "for _ in range(20):\n", " model.add(keras.layers.Dense(100,\n", " kernel_initializer=\"lecun_normal\",\n", " activation=\"selu\"))\n", "model.add(keras.layers.Dense(10, activation=\"softmax\"))\n", "\n", "optimizer = 
keras.optimizers.Nadam(learning_rate=7e-4)\n", "model.compile(loss=\"sparse_categorical_crossentropy\",\n", " optimizer=optimizer,\n", " metrics=[\"accuracy\"])\n", "\n", "early_stopping_cb = keras.callbacks.EarlyStopping(patience=20)\n", "model_checkpoint_cb = keras.callbacks.ModelCheckpoint(\"my_cifar10_selu_model.h5\", save_best_only=True)\n", "run_index = 1 # increment every time you train the model\n", "run_logdir = os.path.join(os.curdir, \"my_cifar10_logs\", \"run_selu_{:03d}\".format(run_index))\n", "tensorboard_cb = keras.callbacks.TensorBoard(run_logdir)\n", "callbacks = [early_stopping_cb, model_checkpoint_cb, tensorboard_cb]\n", "\n", "X_means = X_train.mean(axis=0)\n", "X_stds = X_train.std(axis=0)\n", "X_train_scaled = (X_train - X_means) / X_stds\n", "X_valid_scaled = (X_valid - X_means) / X_stds\n", "X_test_scaled = (X_test - X_means) / X_stds\n", "\n", "model.fit(X_train_scaled, y_train, epochs=100,\n", " validation_data=(X_valid_scaled, y_valid),\n", " callbacks=callbacks)\n", "\n", "model = keras.models.load_model(\"my_cifar10_selu_model.h5\")\n", "model.evaluate(X_valid_scaled, y_valid)" ] }, { "cell_type": "code", "execution_count": 137, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "157/157 [==============================] - 0s 1ms/step - loss: 1.4633 - accuracy: 0.4792\n" ] }, { "data": { "text/plain": [ "[1.4633383750915527, 0.47920000553131104]" ] }, "execution_count": 137, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model = keras.models.load_model(\"my_cifar10_selu_model.h5\")\n", "model.evaluate(X_valid_scaled, y_valid)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We get 47.9% accuracy, which is not much better than the original model (47.6%), and not as good as the model using batch normalization (54.0%). However, convergence was almost as fast as with the BN model, plus each epoch took only 7 seconds. So it's by far the fastest model to train so far." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### e.\n", "*Exercise: Try regularizing the model with alpha dropout. Then, without retraining your model, see if you can achieve better accuracy using MC Dropout.*" ] }, { "cell_type": "code", "execution_count": 138, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/100\n", "1407/1407 [==============================] - 9s 5ms/step - loss: 2.0583 - accuracy: 0.2742 - val_loss: 1.7429 - val_accuracy: 0.3858\n", "Epoch 2/100\n", "1407/1407 [==============================] - 6s 5ms/step - loss: 1.6852 - accuracy: 0.4008 - val_loss: 1.7055 - val_accuracy: 0.3792\n", "Epoch 3/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.5963 - accuracy: 0.4413 - val_loss: 1.7401 - val_accuracy: 0.4072\n", "Epoch 4/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.5231 - accuracy: 0.4634 - val_loss: 1.5728 - val_accuracy: 0.4584\n", "Epoch 5/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.4619 - accuracy: 0.4887 - val_loss: 1.5448 - val_accuracy: 0.4702\n", "Epoch 6/100\n", "1407/1407 [==============================] - 6s 5ms/step - loss: 1.4074 - accuracy: 0.5061 - val_loss: 1.5678 - val_accuracy: 0.4664\n", "Epoch 7/100\n", "1407/1407 [==============================] - 6s 5ms/step - loss: 1.3718 - accuracy: 0.5222 - val_loss: 1.5764 - val_accuracy: 0.4824\n", "Epoch 8/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.3220 - accuracy: 0.5387 - val_loss: 1.4805 - val_accuracy: 0.4890\n", "Epoch 9/100\n", "1407/1407 [==============================] - 6s 5ms/step - loss: 1.2908 - accuracy: 0.5487 - val_loss: 1.5521 - val_accuracy: 0.4638\n", "Epoch 10/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.2537 - accuracy: 0.5607 - val_loss: 1.5281 - val_accuracy: 0.4924\n", "Epoch 11/100\n", "1407/1407 [==============================] - 6s 5ms/step - loss: 
1.2215 - accuracy: 0.5782 - val_loss: 1.5147 - val_accuracy: 0.5046\n", "Epoch 12/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.1910 - accuracy: 0.5831 - val_loss: 1.5248 - val_accuracy: 0.5002\n", "Epoch 13/100\n", "1407/1407 [==============================] - 6s 5ms/step - loss: 1.1659 - accuracy: 0.5982 - val_loss: 1.5620 - val_accuracy: 0.5066\n", "Epoch 14/100\n", "1407/1407 [==============================] - 6s 5ms/step - loss: 1.1282 - accuracy: 0.6120 - val_loss: 1.5440 - val_accuracy: 0.5180\n", "Epoch 15/100\n", "1407/1407 [==============================] - 6s 5ms/step - loss: 1.1127 - accuracy: 0.6133 - val_loss: 1.5782 - val_accuracy: 0.5146\n", "Epoch 16/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.0917 - accuracy: 0.6266 - val_loss: 1.6182 - val_accuracy: 0.5182\n", "Epoch 17/100\n", "1407/1407 [==============================] - 6s 5ms/step - loss: 1.0620 - accuracy: 0.6331 - val_loss: 1.6285 - val_accuracy: 0.5126\n", "Epoch 18/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.0433 - accuracy: 0.6413 - val_loss: 1.6299 - val_accuracy: 0.5158\n", "Epoch 19/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 1.0087 - accuracy: 0.6549 - val_loss: 1.7172 - val_accuracy: 0.5062\n", "Epoch 20/100\n", "1407/1407 [==============================] - 6s 5ms/step - loss: 0.9950 - accuracy: 0.6571 - val_loss: 1.6524 - val_accuracy: 0.5098\n", "Epoch 21/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 0.9848 - accuracy: 0.6652 - val_loss: 1.7686 - val_accuracy: 0.5038\n", "Epoch 22/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 0.9597 - accuracy: 0.6744 - val_loss: 1.6177 - val_accuracy: 0.5084\n", "Epoch 23/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 0.9399 - accuracy: 0.6790 - val_loss: 1.7095 - val_accuracy: 0.5082\n", "Epoch 24/100\n", "1407/1407 
[==============================] - 7s 5ms/step - loss: 0.9148 - accuracy: 0.6884 - val_loss: 1.7160 - val_accuracy: 0.5150\n", "Epoch 25/100\n", "1407/1407 [==============================] - 6s 5ms/step - loss: 0.9023 - accuracy: 0.6949 - val_loss: 1.7017 - val_accuracy: 0.5152\n", "Epoch 26/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 0.8732 - accuracy: 0.7031 - val_loss: 1.7274 - val_accuracy: 0.5088\n", "Epoch 27/100\n", "1407/1407 [==============================] - 6s 5ms/step - loss: 0.8542 - accuracy: 0.7091 - val_loss: 1.7648 - val_accuracy: 0.5166\n", "Epoch 28/100\n", "1407/1407 [==============================] - 7s 5ms/step - loss: 0.8499 - accuracy: 0.7118 - val_loss: 1.7973 - val_accuracy: 0.5000\n", "157/157 [==============================] - 0s 1ms/step - loss: 1.4805 - accuracy: 0.4890\n" ] }, { "data": { "text/plain": [ "[1.4804893732070923, 0.48899999260902405]" ] }, "execution_count": 138, "metadata": {}, "output_type": "execute_result" } ], "source": [ "keras.backend.clear_session()\n", "tf.random.set_seed(42)\n", "np.random.seed(42)\n", "\n", "model = keras.models.Sequential()\n", "model.add(keras.layers.Flatten(input_shape=[32, 32, 3]))\n", "for _ in range(20):\n", " model.add(keras.layers.Dense(100,\n", " kernel_initializer=\"lecun_normal\",\n", " activation=\"selu\"))\n", "\n", "model.add(keras.layers.AlphaDropout(rate=0.1))\n", "model.add(keras.layers.Dense(10, activation=\"softmax\"))\n", "\n", "optimizer = keras.optimizers.Nadam(learning_rate=5e-4)\n", "model.compile(loss=\"sparse_categorical_crossentropy\",\n", " optimizer=optimizer,\n", " metrics=[\"accuracy\"])\n", "\n", "early_stopping_cb = keras.callbacks.EarlyStopping(patience=20)\n", "model_checkpoint_cb = keras.callbacks.ModelCheckpoint(\"my_cifar10_alpha_dropout_model.h5\", save_best_only=True)\n", "run_index = 1 # increment every time you train the model\n", "run_logdir = os.path.join(os.curdir, \"my_cifar10_logs\", 
\"run_alpha_dropout_{:03d}\".format(run_index))\n", "tensorboard_cb = keras.callbacks.TensorBoard(run_logdir)\n", "callbacks = [early_stopping_cb, model_checkpoint_cb, tensorboard_cb]\n", "\n", "X_means = X_train.mean(axis=0)\n", "X_stds = X_train.std(axis=0)\n", "X_train_scaled = (X_train - X_means) / X_stds\n", "X_valid_scaled = (X_valid - X_means) / X_stds\n", "X_test_scaled = (X_test - X_means) / X_stds\n", "\n", "model.fit(X_train_scaled, y_train, epochs=100,\n", " validation_data=(X_valid_scaled, y_valid),\n", " callbacks=callbacks)\n", "\n", "model = keras.models.load_model(\"my_cifar10_alpha_dropout_model.h5\")\n", "model.evaluate(X_valid_scaled, y_valid)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The model reaches 48.9% accuracy on the validation set. That's very slightly better than without dropout (47.6%). With an extensive hyperparameter search, it might be possible to do better (I tried dropout rates of 5%, 10%, 20% and 40%, and learning rates 1e-4, 3e-4, 5e-4, and 1e-3), but probably not much better in this case." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's use MC Dropout now. 
We will need the `MCAlphaDropout` class we used earlier, so let's just copy it here for convenience:" ] }, { "cell_type": "code", "execution_count": 139, "metadata": {}, "outputs": [], "source": [ "class MCAlphaDropout(keras.layers.AlphaDropout):\n", "    def call(self, inputs):\n", "        return super().call(inputs, training=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's create a new model, identical to the one we just trained (with the same weights), but with `MCAlphaDropout` layers instead of `AlphaDropout` layers:" ] }, { "cell_type": "code", "execution_count": 140, "metadata": {}, "outputs": [], "source": [ "mc_model = keras.models.Sequential([\n", "    MCAlphaDropout(layer.rate) if isinstance(layer, keras.layers.AlphaDropout) else layer\n", "    for layer in model.layers\n", "])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then let's add a couple of utility functions. The first will run the model many times (10 by default) and return the mean predicted class probabilities. 
The second will use these mean probabilities to predict the most likely class for each instance:" ] }, { "cell_type": "code", "execution_count": 141, "metadata": {}, "outputs": [], "source": [ "def mc_dropout_predict_probas(mc_model, X, n_samples=10):\n", " Y_probas = [mc_model.predict(X) for sample in range(n_samples)]\n", " return np.mean(Y_probas, axis=0)\n", "\n", "def mc_dropout_predict_classes(mc_model, X, n_samples=10):\n", " Y_probas = mc_dropout_predict_probas(mc_model, X, n_samples)\n", " return np.argmax(Y_probas, axis=1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's make predictions for all the instances in the validation set, and compute the accuracy:" ] }, { "cell_type": "code", "execution_count": 142, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.4892" ] }, "execution_count": 142, "metadata": {}, "output_type": "execute_result" } ], "source": [ "keras.backend.clear_session()\n", "tf.random.set_seed(42)\n", "np.random.seed(42)\n", "\n", "y_pred = mc_dropout_predict_classes(mc_model, X_valid_scaled)\n", "accuracy = np.mean(y_pred == y_valid[:, 0])\n", "accuracy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We get no accuracy improvement in this case (we're still at 48.9% accuracy).\n", "\n", "So the best model we got in this exercise is the Batch Normalization model." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### f.\n", "*Exercise: Retrain your model using 1cycle scheduling and see if it improves training speed and model accuracy.*" ] }, { "cell_type": "code", "execution_count": 143, "metadata": {}, "outputs": [], "source": [ "keras.backend.clear_session()\n", "tf.random.set_seed(42)\n", "np.random.seed(42)\n", "\n", "model = keras.models.Sequential()\n", "model.add(keras.layers.Flatten(input_shape=[32, 32, 3]))\n", "for _ in range(20):\n", " model.add(keras.layers.Dense(100,\n", " kernel_initializer=\"lecun_normal\",\n", " activation=\"selu\"))\n", "\n", "model.add(keras.layers.AlphaDropout(rate=0.1))\n", "model.add(keras.layers.Dense(10, activation=\"softmax\"))\n", "\n", "optimizer = keras.optimizers.SGD(learning_rate=1e-3)\n", "model.compile(loss=\"sparse_categorical_crossentropy\",\n", " optimizer=optimizer,\n", " metrics=[\"accuracy\"])" ] }, { "cell_type": "code", "execution_count": 144, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "352/352 [==============================] - 2s 6ms/step - loss: nan - accuracy: 0.1255\n" ] }, { "data": { "text/plain": [ "(9.999999747378752e-06,\n", " 9.999868392944336,\n", " 2.6167492866516113,\n", " 3.9354368618556435)" ] }, "execution_count": 144, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "batch_size = 128\n", "rates, losses = find_learning_rate(model, X_train_scaled, y_train, epochs=1, batch_size=batch_size)\n", "plot_lr_vs_loss(rates, losses)\n", "plt.axis([min(rates), max(rates), min(losses), (losses[0] + min(losses)) / 1.4])" ] }, { "cell_type": "code", "execution_count": 145, "metadata": {}, "outputs": [], "source": [ "keras.backend.clear_session()\n", "tf.random.set_seed(42)\n", "np.random.seed(42)\n", "\n", "model = keras.models.Sequential()\n", "model.add(keras.layers.Flatten(input_shape=[32, 32, 3]))\n", "for _ in range(20):\n", " model.add(keras.layers.Dense(100,\n", " kernel_initializer=\"lecun_normal\",\n", " activation=\"selu\"))\n", "\n", "model.add(keras.layers.AlphaDropout(rate=0.1))\n", "model.add(keras.layers.Dense(10, activation=\"softmax\"))\n", "\n", "optimizer = keras.optimizers.SGD(learning_rate=1e-2)\n", "model.compile(loss=\"sparse_categorical_crossentropy\",\n", " optimizer=optimizer,\n", " metrics=[\"accuracy\"])" ] }, { "cell_type": "code", "execution_count": 146, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/15\n", "352/352 [==============================] - 3s 6ms/step - loss: 2.2298 - accuracy: 0.2349 - val_loss: 1.7841 - val_accuracy: 0.3834\n", "Epoch 2/15\n", "352/352 [==============================] - 2s 6ms/step - loss: 1.7928 - accuracy: 0.3689 - val_loss: 1.6806 - val_accuracy: 0.4086\n", "Epoch 3/15\n", "352/352 [==============================] - 2s 6ms/step - loss: 1.6475 - accuracy: 0.4190 - val_loss: 1.6378 - val_accuracy: 0.4350\n", "Epoch 4/15\n", "352/352 [==============================] - 2s 6ms/step - loss: 1.5428 - accuracy: 0.4543 - val_loss: 1.6266 - val_accuracy: 0.4390\n", "Epoch 5/15\n", "352/352 [==============================] - 2s 6ms/step - loss: 1.4865 - accuracy: 0.4769 - val_loss: 1.6158 - val_accuracy: 0.4384\n", "Epoch 6/15\n", "352/352 
[==============================] - 2s 6ms/step - loss: 1.4339 - accuracy: 0.4866 - val_loss: 1.5850 - val_accuracy: 0.4412\n", "Epoch 7/15\n", "352/352 [==============================] - 2s 6ms/step - loss: 1.4042 - accuracy: 0.5056 - val_loss: 1.6146 - val_accuracy: 0.4384\n", "Epoch 8/15\n", "352/352 [==============================] - 2s 6ms/step - loss: 1.3437 - accuracy: 0.5229 - val_loss: 1.5299 - val_accuracy: 0.4846\n", "Epoch 9/15\n", "352/352 [==============================] - 2s 5ms/step - loss: 1.2721 - accuracy: 0.5459 - val_loss: 1.5145 - val_accuracy: 0.4874\n", "Epoch 10/15\n", "352/352 [==============================] - 2s 6ms/step - loss: 1.1942 - accuracy: 0.5698 - val_loss: 1.4958 - val_accuracy: 0.5040\n", "Epoch 11/15\n", "352/352 [==============================] - 2s 6ms/step - loss: 1.1211 - accuracy: 0.6033 - val_loss: 1.5406 - val_accuracy: 0.4984\n", "Epoch 12/15\n", "352/352 [==============================] - 2s 6ms/step - loss: 1.0673 - accuracy: 0.6161 - val_loss: 1.5284 - val_accuracy: 0.5144\n", "Epoch 13/15\n", "352/352 [==============================] - 2s 6ms/step - loss: 0.9927 - accuracy: 0.6435 - val_loss: 1.5449 - val_accuracy: 0.5140\n", "Epoch 14/15\n", "352/352 [==============================] - 2s 6ms/step - loss: 0.9205 - accuracy: 0.6703 - val_loss: 1.5652 - val_accuracy: 0.5224\n", "Epoch 15/15\n", "352/352 [==============================] - 2s 6ms/step - loss: 0.8936 - accuracy: 0.6801 - val_loss: 1.5912 - val_accuracy: 0.5198\n" ] } ], "source": [ "n_epochs = 15\n", "onecycle = OneCycleScheduler(math.ceil(len(X_train_scaled) / batch_size) * n_epochs, max_rate=0.05)\n", "history = model.fit(X_train_scaled, y_train, epochs=n_epochs, batch_size=batch_size,\n", " validation_data=(X_valid_scaled, y_valid),\n", " callbacks=[onecycle])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One cycle allowed us to train the model in just 15 epochs, each taking only 2 seconds (thanks to the larger batch size). 
This is several times faster than the fastest model we trained so far. Moreover, we improved the model's performance (from 47.6% for the original model to 52.0%). The batch normalized model reaches a slightly better performance (54%), but it's much slower to train." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.12" }, "nav_menu": { "height": "360px", "width": "416px" }, "toc": { "navigate_menu": true, "number_sections": true, "sideBar": true, "threshold": 6, "toc_cell": false, "toc_section_display": "block", "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 4 }