{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# High Level APIs\n", "> In the final chapter, you'll use high-level APIs in TensorFlow 2.0 to train a sign language letter classifier. You will use both the sequential and functional Keras APIs to train, validate, make predictions with, and evaluate models. You will also learn how to use the Estimators API to streamline the model definition and training process, and to avoid errors. This is a summary of the lecture \"Introduction to TensorFlow in Python\", via DataCamp.\n", "\n", "- toc: true \n", "- badges: true\n", "- comments: true\n", "- author: Chanseok Kang\n", "- categories: [Python, Datacamp, Tensorflow-Keras, Deep_Learning]\n", "- image: images/estimators.png" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'2.2.0'" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import tensorflow as tf\n", "import pandas as pd\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "\n", "plt.rcParams['figure.figsize'] = (8, 8)\n", "tf.__version__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Defining neural networks with Keras" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The sequential model in Keras\n", "In chapter 3, we used components of the Keras API in TensorFlow to define a neural network, but we stopped short of using its full capabilities to streamline model definition and training. In this exercise, you will use the Keras sequential model API to define a neural network that can be used to classify images of sign language letters. You will also use the `.summary()` method to print the model's architecture, including the shape and number of parameters associated with each layer.\n", "\n", "Note that the images were reshaped from (28, 28) to (784,), so that they could be used as inputs to a dense layer." 
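, "\n", "\n", "A dense layer with n inputs and m units stores n * m weights plus m biases, which is what the parameter column of `.summary()` counts. A minimal sketch in plain Python for the 784-16-8-4 stack of dense layers used in this exercise (the helper name `dense_params` is hypothetical, not part of Keras):\n", "
```python
def dense_params(n_inputs, n_units):
    # One weight per (input, unit) pair plus one bias per unit
    return n_inputs * n_units + n_units


# Layer widths: 784 flattened pixels, then 16, 8 and 4 units
layer_sizes = [784, 16, 8, 4]
counts = [dense_params(n_in, n_out)
          for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

for n_out, count in zip(layer_sizes[1:], counts):
    print(f'Dense({n_out}): {count} parameters')
print('Total:', sum(counts))
```
", "\n", "If a count reported by `.summary()` differs from this arithmetic, the layer sizes or the input shape are probably not what you intended."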
] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Model: \"sequential\"\n", "_________________________________________________________________\n", "Layer (type) Output Shape Param # \n", "=================================================================\n", "dense (Dense) (None, 16) 12560 \n", "_________________________________________________________________\n", "dense_1 (Dense) (None, 8) 136 \n", "_________________________________________________________________\n", "dense_2 (Dense) (None, 4) 36 \n", "=================================================================\n", "Total params: 12,732\n", "Trainable params: 12,732\n", "Non-trainable params: 0\n", "_________________________________________________________________\n", "None\n" ] } ], "source": [ "# Define a Keras sequential model\n", "model = tf.keras.Sequential()\n", "\n", "# Define the first dense layer\n", "model.add(tf.keras.layers.Dense(16, activation='relu', input_shape=(784,)))\n", "\n", "# Define the second dense layer\n", "model.add(tf.keras.layers.Dense(8, activation='relu', ))\n", "\n", "# Define the output layer\n", "model.add(tf.keras.layers.Dense(4, activation='softmax'))\n", "\n", "# Print the model architecture\n", "print(model.summary())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that we've defined a model, but we haven't compiled it. The compilation step in keras allows us to set the optimizer, loss function, and other useful training parameters in a single line of code. Furthermore, the `.summary()` method allows us to view the model's architecture." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Compiling a sequential model\n", "In this exercise, you will work towards classifying letters from the Sign Language MNIST dataset; however, you will adopt a different network architecture than what you used in the previous exercise. 
There will be fewer layers, but more nodes. You will also apply dropout to prevent overfitting. Finally, you will compile the model to use the adam optimizer and the `categorical_crossentropy` loss. You will also use a method in keras to summarize your model's architecture. " ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Model: \"sequential_1\"\n", "_________________________________________________________________\n", "Layer (type) Output Shape Param # \n", "=================================================================\n", "dense_3 (Dense) (None, 16) 12560 \n", "_________________________________________________________________\n", "dropout (Dropout) (None, 16) 0 \n", "_________________________________________________________________\n", "dense_4 (Dense) (None, 4) 68 \n", "=================================================================\n", "Total params: 12,628\n", "Trainable params: 12,628\n", "Non-trainable params: 0\n", "_________________________________________________________________\n", "None\n" ] } ], "source": [ "model = tf.keras.Sequential()\n", "\n", "# Define the first dense layer\n", "model.add(tf.keras.layers.Dense(16, activation='sigmoid', input_shape=(784,)))\n", "\n", "# Apply dropout to the first layer's output\n", "model.add(tf.keras.layers.Dropout(0.25))\n", "\n", "# Define the output layer\n", "model.add(tf.keras.layers.Dense(4, activation='softmax'))\n", "\n", "# Compile the model\n", "model.compile(optimizer='adam', loss='categorical_crossentropy')\n", "\n", "# Print a model summary\n", "print(model.summary())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Defining a multiple input model\n", "In some cases, the sequential API will not be sufficiently flexible to accommodate your desired model architecture and you will need to use the functional API instead. 
If, for instance, you want to train two models with different architectures jointly, you will need to use the functional API to do this. In this exercise, we will see how to do this. We will also use the `.summary()` method to examine the joint model's architecture." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "m1_inputs = tf.keras.Input(shape=(784,))\n", "m2_inputs = tf.keras.Input(shape=(784,))" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Model: \"model\"\n", "__________________________________________________________________________________________________\n", "Layer (type) Output Shape Param # Connected to \n", "==================================================================================================\n", "input_1 (InputLayer) [(None, 784)] 0 \n", "__________________________________________________________________________________________________\n", "input_2 (InputLayer) [(None, 784)] 0 \n", "__________________________________________________________________________________________________\n", "dense_5 (Dense) (None, 12) 9420 input_1[0][0] \n", "__________________________________________________________________________________________________\n", "dense_7 (Dense) (None, 12) 9420 input_2[0][0] \n", "__________________________________________________________________________________________________\n", "dense_6 (Dense) (None, 4) 52 dense_5[0][0] \n", "__________________________________________________________________________________________________\n", "dense_8 (Dense) (None, 4) 52 dense_7[0][0] \n", "__________________________________________________________________________________________________\n", "add (Add) (None, 4) 0 dense_6[0][0] \n", " dense_8[0][0] \n", "==================================================================================================\n", "Total params: 18,944\n", "Trainable params: 
18,944\n", "Non-trainable params: 0\n", "__________________________________________________________________________________________________\n", "None\n" ] } ], "source": [ "# For model 1, pass the input layer to layer 1 and layer 1 to layer 2\n", "m1_layer1 = tf.keras.layers.Dense(12, activation='sigmoid')(m1_inputs)\n", "m1_layer2 = tf.keras.layers.Dense(4, activation='softmax')(m1_layer1)\n", "\n", "# For model 2, pass the input layer to layer 1 and layer 1 to layer 2\n", "m2_layer1 = tf.keras.layers.Dense(12, activation='relu')(m2_inputs)\n", "m2_layer2 = tf.keras.layers.Dense(4, activation='softmax')(m2_layer1)\n", "\n", "# Merge model outputs and define a functional model\n", "merged = tf.keras.layers.add([m1_layer2, m2_layer2])\n", "model = tf.keras.Model(inputs=[m1_inputs, m2_inputs], outputs=merged)\n", "\n", "# Print a model summary\n", "print(model.summary())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that the `.summary()` method yields a new column: `Connected to`. This column tells you how layers connect to each other within the network. We can see that `dense_7`, for instance, is connected to the `input_2` layer. We can also see that the `add` layer, which merged the two models, is connected to both `dense_6` and `dense_8`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Training and validation with Keras\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Training with Keras\n", "In this exercise, we return to our sign language letter classification problem. We have 2000 images of four letters--A, B, C, and D--and we want to classify them with a high level of accuracy. We will complete all parts of the problem, including the model definition, compilation, and training." 
] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "df = pd.read_csv('./dataset/slmnist.csv', header=None)\n", "X = df.iloc[:, 1:]\n", "y = df.iloc[:, 0]" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "sign_language_features = (X - X.mean()) / (X.max() - X.min()).to_numpy()\n", "sign_language_labels = pd.get_dummies(y).astype(np.float32).to_numpy()" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/5\n", "63/63 [==============================] - 0s 1ms/step - loss: 1.2644\n", "Epoch 2/5\n", "63/63 [==============================] - 0s 1ms/step - loss: 1.0355\n", "Epoch 3/5\n", "63/63 [==============================] - 0s 988us/step - loss: 0.8604\n", "Epoch 4/5\n", "63/63 [==============================] - 0s 960us/step - loss: 0.7198\n", "Epoch 5/5\n", "63/63 [==============================] - 0s 911us/step - loss: 0.5997\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Define a sequential model\n", "model = tf.keras.Sequential()\n", "\n", "# Define a hidden layer\n", "model.add(tf.keras.layers.Dense(16, activation='relu', input_shape=(784, )))\n", "\n", "# Define the output layer\n", "model.add(tf.keras.layers.Dense(4, activation='softmax'))\n", "\n", "# Compile the model\n", "model.compile(optimizer='SGD', loss='categorical_crossentropy')\n", "\n", "# Complete the fitting operation\n", "model.fit(sign_language_features, sign_language_labels, epochs=5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You probably noticed that your only measure of performance improvement was the value of the loss function in the training sample, which is not particularly informative." 
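, "\n", "\n", "The `63/63` counter in the training log is the number of batches per epoch. A rough sketch of that arithmetic, assuming the Keras default batch size of 32 and that `validation_split` holds out the given fraction of samples before batching (the helper `steps_per_epoch` is a hypothetical name):\n", "
```python
import math


def steps_per_epoch(n_samples, validation_split=0.0, batch_size=32):
    # Samples left for training after the validation fraction is held out
    n_train = math.floor(n_samples * (1.0 - validation_split))
    # The last batch may be smaller than batch_size, so round up
    return math.ceil(n_train / batch_size)


print(steps_per_epoch(2000))                        # all 2000 images
print(steps_per_epoch(2000, validation_split=0.1))  # 1800 training images
print(steps_per_epoch(2000, validation_split=0.5))  # 1000 training images
```
", "\n", "These values line up with the `63/63`, `57/57` and `32/32` progress counters that appear in the training logs of this chapter."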
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Metrics and validation with Keras\n", "We trained a model to predict sign language letters in the previous exercise, but it is unclear how successful we were in doing so. In this exercise, we will try to improve upon the interpretability of our results. Since we did not use a validation split, we only observed performance improvements within the training set; however, it is unclear how much of that was due to overfitting. Furthermore, since we did not supply a metric, we only saw decreases in the loss function, which do not have any clear interpretation." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/10\n", "57/57 [==============================] - 0s 3ms/step - loss: 0.9454 - accuracy: 0.7244 - val_loss: 0.5956 - val_accuracy: 0.8550\n", "Epoch 2/10\n", "57/57 [==============================] - 0s 2ms/step - loss: 0.4391 - accuracy: 0.9439 - val_loss: 0.3207 - val_accuracy: 0.9800\n", "Epoch 3/10\n", "57/57 [==============================] - 0s 2ms/step - loss: 0.2457 - accuracy: 0.9889 - val_loss: 0.1868 - val_accuracy: 0.9850\n", "Epoch 4/10\n", "57/57 [==============================] - 0s 2ms/step - loss: 0.1399 - accuracy: 0.9939 - val_loss: 0.1073 - val_accuracy: 0.9950\n", "Epoch 5/10\n", "57/57 [==============================] - 0s 2ms/step - loss: 0.0796 - accuracy: 0.9983 - val_loss: 0.0614 - val_accuracy: 1.0000\n", "Epoch 6/10\n", "57/57 [==============================] - 0s 2ms/step - loss: 0.0446 - accuracy: 0.9989 - val_loss: 0.0346 - val_accuracy: 1.0000\n", "Epoch 7/10\n", "57/57 [==============================] - 0s 2ms/step - loss: 0.0243 - accuracy: 1.0000 - val_loss: 0.0191 - val_accuracy: 1.0000\n", "Epoch 8/10\n", "57/57 [==============================] - 0s 2ms/step - loss: 0.0134 - accuracy: 1.0000 - val_loss: 0.0109 - val_accuracy: 1.0000\n", "Epoch 9/10\n", "57/57 
[==============================] - 0s 2ms/step - loss: 0.0073 - accuracy: 1.0000 - val_loss: 0.0059 - val_accuracy: 1.0000\n", "Epoch 10/10\n", "57/57 [==============================] - 0s 2ms/step - loss: 0.0041 - accuracy: 1.0000 - val_loss: 0.0033 - val_accuracy: 1.0000\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Define sequential model\n", "model = tf.keras.Sequential()\n", "\n", "# Define the first layer\n", "model.add(tf.keras.layers.Dense(32, activation='sigmoid', input_shape=(784,)))\n", "\n", "# Define the output layer with a softmax activation\n", "model.add(tf.keras.layers.Dense(4, activation='softmax'))\n", "\n", "# Set the optimizer, loss function, and metrics\n", "model.compile(optimizer='RMSprop', loss='categorical_crossentropy', metrics=['accuracy'])\n", "\n", "# Add the number of epochs and the validation split\n", "model.fit(sign_language_features, sign_language_labels, epochs=10, validation_split=0.1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With the Keras API, you only needed 14 lines of code to define, compile, train, and validate a model. You may have noticed that your model performed quite well. In just 10 epochs, we achieved a classification accuracy of over 90% in the validation sample; in the run recorded above, validation accuracy reached 100% by the fifth epoch." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Overfitting detection\n", "In this exercise, we'll work with a small subset of the examples from the original sign language letters dataset. A small sample, coupled with a heavily-parameterized model, will generally lead to overfitting. This means that your model will simply memorize the class of each example, rather than identifying features that generalize to many examples.\n", "\n", "You will detect overfitting by checking whether the validation sample loss is substantially higher than the training sample loss and whether it increases with further training. 
With a small sample and a high learning rate, the model will struggle to converge on an optimum. You will set a low learning rate for the optimizer, which will make it easier to identify overfitting." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/50\n", "32/32 [==============================] - 0s 5ms/step - loss: 0.3319 - accuracy: 0.8980 - val_loss: 0.0688 - val_accuracy: 0.9760\n", "Epoch 2/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 0.0194 - accuracy: 0.9980 - val_loss: 0.0228 - val_accuracy: 0.9920\n", "Epoch 3/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 0.0071 - accuracy: 0.9990 - val_loss: 0.0180 - val_accuracy: 0.9940\n", "Epoch 4/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 0.0038 - accuracy: 1.0000 - val_loss: 0.0071 - val_accuracy: 1.0000\n", "Epoch 5/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 0.0024 - accuracy: 1.0000 - val_loss: 0.0050 - val_accuracy: 1.0000\n", "Epoch 6/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 0.0018 - accuracy: 1.0000 - val_loss: 0.0051 - val_accuracy: 1.0000\n", "Epoch 7/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 0.0013 - accuracy: 1.0000 - val_loss: 0.0043 - val_accuracy: 1.0000\n", "Epoch 8/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 0.0011 - accuracy: 1.0000 - val_loss: 0.0035 - val_accuracy: 1.0000\n", "Epoch 9/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 8.9295e-04 - accuracy: 1.0000 - val_loss: 0.0035 - val_accuracy: 1.0000\n", "Epoch 10/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 7.4761e-04 - accuracy: 1.0000 - val_loss: 0.0027 - val_accuracy: 1.0000\n", "Epoch 11/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 6.3920e-04 - accuracy: 1.0000 - val_loss: 0.0027 - val_accuracy: 1.0000\n", 
"Epoch 12/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 5.5781e-04 - accuracy: 1.0000 - val_loss: 0.0021 - val_accuracy: 1.0000\n", "Epoch 13/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 4.8971e-04 - accuracy: 1.0000 - val_loss: 0.0023 - val_accuracy: 1.0000\n", "Epoch 14/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 4.2039e-04 - accuracy: 1.0000 - val_loss: 0.0020 - val_accuracy: 1.0000\n", "Epoch 15/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 3.7444e-04 - accuracy: 1.0000 - val_loss: 0.0019 - val_accuracy: 1.0000\n", "Epoch 16/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 3.3658e-04 - accuracy: 1.0000 - val_loss: 0.0018 - val_accuracy: 1.0000\n", "Epoch 17/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 3.0254e-04 - accuracy: 1.0000 - val_loss: 0.0016 - val_accuracy: 1.0000\n", "Epoch 18/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 2.7138e-04 - accuracy: 1.0000 - val_loss: 0.0015 - val_accuracy: 1.0000\n", "Epoch 19/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 2.4630e-04 - accuracy: 1.0000 - val_loss: 0.0014 - val_accuracy: 1.0000\n", "Epoch 20/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 2.2582e-04 - accuracy: 1.0000 - val_loss: 0.0013 - val_accuracy: 1.0000\n", "Epoch 21/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 2.0794e-04 - accuracy: 1.0000 - val_loss: 0.0013 - val_accuracy: 1.0000\n", "Epoch 22/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 1.9037e-04 - accuracy: 1.0000 - val_loss: 0.0013 - val_accuracy: 1.0000\n", "Epoch 23/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 1.7535e-04 - accuracy: 1.0000 - val_loss: 0.0012 - val_accuracy: 1.0000\n", "Epoch 24/50\n", "32/32 [==============================] - 0s 4ms/step - loss: 1.6198e-04 - accuracy: 1.0000 - val_loss: 0.0011 - 
val_accuracy: 1.0000\n", "Epoch 25/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 1.5424e-04 - accuracy: 1.0000 - val_loss: 0.0010 - val_accuracy: 1.0000\n", "Epoch 26/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 1.3962e-04 - accuracy: 1.0000 - val_loss: 9.8751e-04 - val_accuracy: 1.0000\n", "Epoch 27/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 1.3078e-04 - accuracy: 1.0000 - val_loss: 9.8295e-04 - val_accuracy: 1.0000\n", "Epoch 28/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 1.2235e-04 - accuracy: 1.0000 - val_loss: 9.3358e-04 - val_accuracy: 1.0000\n", "Epoch 29/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 1.1414e-04 - accuracy: 1.0000 - val_loss: 8.8058e-04 - val_accuracy: 1.0000\n", "Epoch 30/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 1.0747e-04 - accuracy: 1.0000 - val_loss: 8.6045e-04 - val_accuracy: 1.0000\n", "Epoch 31/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 1.0094e-04 - accuracy: 1.0000 - val_loss: 8.2212e-04 - val_accuracy: 1.0000\n", "Epoch 32/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 9.5395e-05 - accuracy: 1.0000 - val_loss: 7.8644e-04 - val_accuracy: 1.0000\n", "Epoch 33/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 9.0415e-05 - accuracy: 1.0000 - val_loss: 7.5970e-04 - val_accuracy: 1.0000\n", "Epoch 34/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 8.5597e-05 - accuracy: 1.0000 - val_loss: 7.3851e-04 - val_accuracy: 1.0000\n", "Epoch 35/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 8.0712e-05 - accuracy: 1.0000 - val_loss: 6.9588e-04 - val_accuracy: 1.0000\n", "Epoch 36/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 7.6454e-05 - accuracy: 1.0000 - val_loss: 6.7830e-04 - val_accuracy: 1.0000\n", "Epoch 37/50\n", "32/32 [==============================] - 0s 3ms/step 
- loss: 7.2831e-05 - accuracy: 1.0000 - val_loss: 6.5712e-04 - val_accuracy: 1.0000\n", "Epoch 38/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 6.9023e-05 - accuracy: 1.0000 - val_loss: 6.2887e-04 - val_accuracy: 1.0000\n", "Epoch 39/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 6.5904e-05 - accuracy: 1.0000 - val_loss: 6.3692e-04 - val_accuracy: 1.0000\n", "Epoch 40/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 6.2966e-05 - accuracy: 1.0000 - val_loss: 5.9707e-04 - val_accuracy: 1.0000\n", "Epoch 41/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 5.9689e-05 - accuracy: 1.0000 - val_loss: 5.8867e-04 - val_accuracy: 1.0000\n", "Epoch 42/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 5.7342e-05 - accuracy: 1.0000 - val_loss: 5.6061e-04 - val_accuracy: 1.0000\n", "Epoch 43/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 5.4794e-05 - accuracy: 1.0000 - val_loss: 5.5977e-04 - val_accuracy: 1.0000\n", "Epoch 44/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 5.2118e-05 - accuracy: 1.0000 - val_loss: 5.1977e-04 - val_accuracy: 1.0000\n", "Epoch 45/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 4.9796e-05 - accuracy: 1.0000 - val_loss: 5.2937e-04 - val_accuracy: 1.0000\n", "Epoch 46/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 4.7737e-05 - accuracy: 1.0000 - val_loss: 5.2163e-04 - val_accuracy: 1.0000\n", "Epoch 47/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 4.5816e-05 - accuracy: 1.0000 - val_loss: 5.1073e-04 - val_accuracy: 1.0000\n", "Epoch 48/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 4.3799e-05 - accuracy: 1.0000 - val_loss: 4.8537e-04 - val_accuracy: 1.0000\n", "Epoch 49/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 4.2239e-05 - accuracy: 1.0000 - val_loss: 4.6707e-04 - val_accuracy: 1.0000\n", 
"Epoch 50/50\n", "32/32 [==============================] - 0s 3ms/step - loss: 4.0358e-05 - accuracy: 1.0000 - val_loss: 4.6815e-04 - val_accuracy: 1.0000\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Define sequential model\n", "model = tf.keras.Sequential()\n", "\n", "# Define the first layer\n", "model.add(tf.keras.layers.Dense(1024, activation='relu', input_shape=(784, )))\n", "\n", "# Add activation function to classifier\n", "model.add(tf.keras.layers.Dense(4, activation='softmax'))\n", "\n", "# Finish the model compilation\n", "model.compile(optimizer=tf.keras.optimizers.Adam(lr=0.001),\n", " loss='categorical_crossentropy', metrics=['accuracy'])\n", "\n", "# Complete the model fit operation\n", "model.fit(sign_language_features, sign_language_labels, epochs=50, validation_split=0.5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Evaluating models\n", "Two models have been trained and are available: `large_model`, which has many parameters; and `small_model`, which has fewer parameters. Both models have been trained using `train_features` and `train_labels`, which are available to you. A separate test set, which consists of `test_features` and `test_labels`, is also available.\n", "\n", "Your goal is to evaluate relative model performance and also determine whether either model exhibits signs of overfitting. You will do this by evaluating `large_model` and `small_model` on both the train and test sets. For each model, you can do this by applying the `.evaluate(x, y)` method to compute the loss for features `x` and labels `y`. You will then compare the four losses generated." 
] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "small_model = tf.keras.Sequential()\n", "\n", "small_model.add(tf.keras.layers.Dense(8, activation='relu', input_shape=(784,)))\n", "small_model.add(tf.keras.layers.Dense(4, activation='softmax'))\n", "\n", "small_model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01), \n", " loss='categorical_crossentropy', \n", " metrics=['accuracy'])" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "large_model = tf.keras.Sequential()\n", "\n", "large_model.add(tf.keras.layers.Dense(64, activation='sigmoid', input_shape=(784,)))\n", "large_model.add(tf.keras.layers.Dense(4, activation='softmax'))\n", "\n", "large_model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001, \n", " beta_1=0.9, beta_2=0.999),\n", " loss='categorical_crossentropy', metrics=['accuracy'])" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "from sklearn.model_selection import train_test_split\n", "\n", "train_features, test_features, train_labels, test_labels = train_test_split(sign_language_features, \n", " sign_language_labels,\n", " test_size=0.5)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "small_model.fit(train_features, train_labels, epochs=30, verbose=False)\n", "large_model.fit(train_features, train_labels, epochs=30, verbose=False)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "32/32 [==============================] - 0s 1ms/step - loss: 0.1836 - accuracy: 0.9860\n", "32/32 [==============================] - 0s 973us/step - loss: 0.1819 - accuracy: 0.9920\n", "32/32 [==============================] - 0s 933us/step - loss: 0.0085 - accuracy: 1.0000\n", 
"32/32 [==============================] - 0s 1ms/step - loss: 0.0091 - accuracy: 1.0000\n", "\n", " Small - Train: [0.18358397483825684, 0.9860000014305115], Test: [0.18193349242210388, 0.9919999837875366]\n", "Large - Train: [0.008485781960189342, 1.0], Test: [0.009079609997570515, 1.0]\n" ] } ], "source": [ "# Evaluate the small model using the train data\n", "small_train = small_model.evaluate(train_features, train_labels)\n", "\n", "# Evaluate the small model using the test data\n", "small_test = small_model.evaluate(test_features, test_labels)\n", "\n", "# Evaluate the large model using the train data\n", "large_train = large_model.evaluate(train_features, train_labels)\n", "\n", "# Evaluate the large model using the test data\n", "large_test = large_model.evaluate(test_features, test_labels)\n", "\n", "# Print [loss, accuracy] for each model on both sets\n", "print('\\n Small - Train: {}, Test: {}'.format(small_train, small_test))\n", "print('Large - Train: {}, Test: {}'.format(large_train, large_test))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Training models with the Estimators API\n", "- Estimators API\n", "![estimators](image/estimators.png)\n", "    - High-level submodule\n", "    - Less flexible\n", "    - Faster deployment\n", "    - Many premade models\n", "- Model specification and training\n", "    1. Define feature columns\n", "    2. Load and transform data\n", "    3. Define an estimator\n", "    4. Apply train operation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Preparing to train with Estimators\n", "For this exercise, we'll return to the King County housing transaction dataset from chapter 2. We will again develop and train a machine learning model to predict house prices; however, this time, we'll do it using the `estimator` API.\n", "\n", "Rather than completing everything in one step, we'll break this procedure down into parts. We'll begin by defining the feature columns and loading the data. 
In the next exercise, we'll define and train a premade `estimator`. " ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr><th></th><th>id</th><th>date</th><th>price</th><th>bedrooms</th><th>bathrooms</th><th>sqft_living</th><th>sqft_lot</th><th>floors</th><th>waterfront</th><th>view</th><th>...</th><th>grade</th><th>sqft_above</th><th>sqft_basement</th><th>yr_built</th><th>yr_renovated</th><th>zipcode</th><th>lat</th><th>long</th><th>sqft_living15</th><th>sqft_lot15</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>0</th><td>7129300520</td><td>20141013T000000</td><td>221900.0</td><td>3</td><td>1.00</td><td>1180</td><td>5650</td><td>1.0</td><td>0</td><td>0</td><td>...</td><td>7</td><td>1180</td><td>0</td><td>1955</td><td>0</td><td>98178</td><td>47.5112</td><td>-122.257</td><td>1340</td><td>5650</td></tr>\n",
" <tr><th>1</th><td>6414100192</td><td>20141209T000000</td><td>538000.0</td><td>3</td><td>2.25</td><td>2570</td><td>7242</td><td>2.0</td><td>0</td><td>0</td><td>...</td><td>7</td><td>2170</td><td>400</td><td>1951</td><td>1991</td><td>98125</td><td>47.7210</td><td>-122.319</td><td>1690</td><td>7639</td></tr>\n",
" <tr><th>2</th><td>5631500400</td><td>20150225T000000</td><td>180000.0</td><td>2</td><td>1.00</td><td>770</td><td>10000</td><td>1.0</td><td>0</td><td>0</td><td>...</td><td>6</td><td>770</td><td>0</td><td>1933</td><td>0</td><td>98028</td><td>47.7379</td><td>-122.233</td><td>2720</td><td>8062</td></tr>\n",
" <tr><th>3</th><td>2487200875</td><td>20141209T000000</td><td>604000.0</td><td>4</td><td>3.00</td><td>1960</td><td>5000</td><td>1.0</td><td>0</td><td>0</td><td>...</td><td>7</td><td>1050</td><td>910</td><td>1965</td><td>0</td><td>98136</td><td>47.5208</td><td>-122.393</td><td>1360</td><td>5000</td></tr>\n",
" <tr><th>4</th><td>1954400510</td><td>20150218T000000</td><td>510000.0</td><td>3</td><td>2.00</td><td>1680</td><td>8080</td><td>1.0</td><td>0</td><td>0</td><td>...</td><td>8</td><td>1680</td><td>0</td><td>1987</td><td>0</td><td>98074</td><td>47.6168</td><td>-122.045</td><td>1800</td><td>7503</td></tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows × 21 columns</p>\n",
"</div>
" ], "text/plain": [ " id date price bedrooms bathrooms sqft_living \\\n", "0 7129300520 20141013T000000 221900.0 3 1.00 1180 \n", "1 6414100192 20141209T000000 538000.0 3 2.25 2570 \n", "2 5631500400 20150225T000000 180000.0 2 1.00 770 \n", "3 2487200875 20141209T000000 604000.0 4 3.00 1960 \n", "4 1954400510 20150218T000000 510000.0 3 2.00 1680 \n", "\n", " sqft_lot floors waterfront view ... grade sqft_above sqft_basement \\\n", "0 5650 1.0 0 0 ... 7 1180 0 \n", "1 7242 2.0 0 0 ... 7 2170 400 \n", "2 10000 1.0 0 0 ... 6 770 0 \n", "3 5000 1.0 0 0 ... 7 1050 910 \n", "4 8080 1.0 0 0 ... 8 1680 0 \n", "\n", " yr_built yr_renovated zipcode lat long sqft_living15 \\\n", "0 1955 0 98178 47.5112 -122.257 1340 \n", "1 1951 1991 98125 47.7210 -122.319 1690 \n", "2 1933 0 98028 47.7379 -122.233 2720 \n", "3 1965 0 98136 47.5208 -122.393 1360 \n", "4 1987 0 98074 47.6168 -122.045 1800 \n", "\n", " sqft_lot15 \n", "0 5650 \n", "1 7639 \n", "2 8062 \n", "3 5000 \n", "4 7503 \n", "\n", "[5 rows x 21 columns]" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "housing = pd.read_csv('./dataset/kc_house_data.csv')\n", "housing.head()" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [], "source": [ "# Define feature columns for bedrooms and bathrooms\n", "bedrooms = tf.feature_column.numeric_column(\"bedrooms\")\n", "bathrooms = tf.feature_column.numeric_column(\"bathrooms\")\n", "\n", "# Define the list of feature columns\n", "feature_list = [bedrooms, bathrooms]\n", "\n", "def input_fn():\n", " # Define the labels\n", " labels = np.array(housing['price'])\n", " \n", " # Define the features\n", " features = {'bedrooms': np.array(housing['bedrooms']),\n", " 'bathrooms': np.array(housing['bathrooms'])}\n", " \n", " return features, labels" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Defining Estimators\n", "In the previous exercise, you defined a list of feature columns, `feature_list`, 
and a data input function, `input_fn()`. In this exercise, you will build on that work by defining estimators that train on that input data: first a `DNNRegressor` with two hidden layers of two nodes each, then a `LinearRegressor`." ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:Using default config.\n", "WARNING:tensorflow:Using temporary folder as model directory: /tmp/tmpr1koyqq7\n", "INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmpr1koyqq7', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true\n", "graph_options {\n", " rewrite_options {\n", " meta_optimizer_iterations: ONE\n", " }\n", "}\n", ", '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}\n", "INFO:tensorflow:Calling model_fn.\n", "WARNING:tensorflow:Layer dnn is casting an input tensor from dtype float64 to the layer's dtype of float32, which is new behavior in TensorFlow 2. The layer has dtype float32 because it's dtype defaults to floatx.\n", "\n", "If you intended to run this layer in float32, you can safely ignore this warning. If in doubt, this warning is likely only an issue if you are porting a TensorFlow 1.X model to TensorFlow 2.\n", "\n", "To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`. To change just this layer, pass dtype='float64' to the layer constructor. 
If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.\n", "\n", "INFO:tensorflow:Done calling model_fn.\n", "INFO:tensorflow:Create CheckpointSaverHook.\n", "INFO:tensorflow:Graph was finalized.\n", "INFO:tensorflow:Running local_init_op.\n", "INFO:tensorflow:Done running local_init_op.\n", "INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...\n", "INFO:tensorflow:Saving checkpoints for 0 into /tmp/tmpr1koyqq7/model.ckpt.\n", "INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...\n", "INFO:tensorflow:loss = 426467560000.0, step = 0\n", "INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 1...\n", "INFO:tensorflow:Saving checkpoints for 1 into /tmp/tmpr1koyqq7/model.ckpt.\n", "INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 1...\n", "INFO:tensorflow:Loss for final step: 426467560000.0.\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Define the model and set the number of steps\n", "model = tf.estimator.DNNRegressor(feature_columns=feature_list, hidden_units=[2,2])\n", "model.train(input_fn, steps=1)" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "INFO:tensorflow:Using default config.\n", "WARNING:tensorflow:Using temporary folder as model directory: /tmp/tmp9go_x8af\n", "INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmp9go_x8af', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true\n", "graph_options {\n", " rewrite_options {\n", " meta_optimizer_iterations: ONE\n", " }\n", "}\n", ", '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, 
'_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}\n", "INFO:tensorflow:Calling model_fn.\n", "WARNING:tensorflow:Layer linear/linear_model is casting an input tensor from dtype float64 to the layer's dtype of float32, which is new behavior in TensorFlow 2. The layer has dtype float32 because it's dtype defaults to floatx.\n", "\n", "If you intended to run this layer in float32, you can safely ignore this warning. If in doubt, this warning is likely only an issue if you are porting a TensorFlow 1.X model to TensorFlow 2.\n", "\n", "To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`. To change just this layer, pass dtype='float64' to the layer constructor. 
If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.\n", "\n", "WARNING:tensorflow:From /home/chanseok/anaconda3/lib/python3.7/site-packages/tensorflow/python/feature_column/feature_column_v2.py:540: Layer.add_variable (from tensorflow.python.keras.engine.base_layer_v1) is deprecated and will be removed in a future version.\n", "Instructions for updating:\n", "Please use `layer.add_weight` method instead.\n", "INFO:tensorflow:Done calling model_fn.\n", "INFO:tensorflow:Create CheckpointSaverHook.\n", "INFO:tensorflow:Graph was finalized.\n", "INFO:tensorflow:Running local_init_op.\n", "INFO:tensorflow:Done running local_init_op.\n", "INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...\n", "INFO:tensorflow:Saving checkpoints for 0 into /tmp/tmp9go_x8af/model.ckpt.\n", "INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...\n", "INFO:tensorflow:loss = 426471360000.0, step = 0\n", "INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 2...\n", "INFO:tensorflow:Saving checkpoints for 2 into /tmp/tmp9go_x8af/model.ckpt.\n", "INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 2...\n", "INFO:tensorflow:Loss for final step: 426469820000.0.\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Define the model and set the number of steps\n", "model = tf.estimator.LinearRegressor(feature_columns=feature_list)\n", "model.train(input_fn, steps=2)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.6" } }, "nbformat": 4, "nbformat_minor": 4 }