{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Fine-tuning Keras models\n", "> Learn how to optimize your deep learning models in Keras. Start by learning how to validate your models, then understand the concept of model capacity, and finally, experiment with wider and deeper networks. This is a summary of the lecture \"Introduction to Deep Learning in Python\", via DataCamp.\n", "\n", "- toc: true \n", "- badges: true\n", "- comments: true\n", "- author: Chanseok Kang\n", "- categories: [Python, Datacamp, Tensorflow-Keras, Deep_Learning]\n", "- image: images/of.png" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "import tensorflow as tf\n", "import matplotlib.pyplot as plt\n", "\n", "plt.rcParams['figure.figsize'] = (8, 8)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Understanding model optimization\n", "- Why optimization is hard\n", "    - Simultaneously optimizing 1000s of parameters with complex relationships\n", "    - Updates may not improve the model meaningfully\n", "    - Updates can be too small (if the learning rate is low) or too large (if the learning rate is high)\n", "- Vanishing gradients\n", "    - Occur when many layers have very small slopes (e.g. due to being on the flat part of the tanh curve)\n", "    - In deep networks, the updates computed by backprop can be close to 0" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Changing optimization parameters\n", "It's time to get your hands dirty with optimization. You'll now try optimizing a model at a very low learning rate, a very high learning rate, and a \"just right\" learning rate. You'll want to look at the results after running this exercise, remembering that a low value for the loss function is good.\n", "\n", "For these exercises, we've pre-loaded the predictors and target values from your previous classification models (predicting who would survive on the Titanic). 
You'll want the optimization to start from scratch every time you change the learning rate, to give a fair comparison of how each learning rate did in your results." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n",
"    <tr><th></th><th>survived</th><th>pclass</th><th>age</th><th>sibsp</th><th>parch</th><th>fare</th><th>male</th><th>age_was_missing</th><th>embarked_from_cherbourg</th><th>embarked_from_queenstown</th><th>embarked_from_southampton</th></tr>\n",
"  </thead>\n", "  <tbody>\n",
"    <tr><th>0</th><td>0</td><td>3</td><td>22.0</td><td>1</td><td>0</td><td>7.2500</td><td>1</td><td>False</td><td>0</td><td>0</td><td>1</td></tr>\n",
"    <tr><th>1</th><td>1</td><td>1</td><td>38.0</td><td>1</td><td>0</td><td>71.2833</td><td>0</td><td>False</td><td>1</td><td>0</td><td>0</td></tr>\n",
"    <tr><th>2</th><td>1</td><td>3</td><td>26.0</td><td>0</td><td>0</td><td>7.9250</td><td>0</td><td>False</td><td>0</td><td>0</td><td>1</td></tr>\n",
"    <tr><th>3</th><td>1</td><td>1</td><td>35.0</td><td>1</td><td>0</td><td>53.1000</td><td>0</td><td>False</td><td>0</td><td>0</td><td>1</td></tr>\n",
"    <tr><th>4</th><td>0</td><td>3</td><td>35.0</td><td>0</td><td>0</td><td>8.0500</td><td>1</td><td>False</td><td>0</td><td>0</td><td>1</td></tr>\n",
"  </tbody>\n", "</table>\n", "</div>
" ], "text/plain": [ " survived pclass age sibsp parch fare male age_was_missing \\\n", "0 0 3 22.0 1 0 7.2500 1 False \n", "1 1 1 38.0 1 0 71.2833 0 False \n", "2 1 3 26.0 0 0 7.9250 0 False \n", "3 1 1 35.0 1 0 53.1000 0 False \n", "4 0 3 35.0 0 0 8.0500 1 False \n", "\n", " embarked_from_cherbourg embarked_from_queenstown \\\n", "0 0 0 \n", "1 1 0 \n", "2 0 0 \n", "3 0 0 \n", "4 0 0 \n", "\n", " embarked_from_southampton \n", "0 1 \n", "1 0 \n", "2 1 \n", "3 1 \n", "4 1 " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_csv('./dataset/titanic_all_numeric.csv')\n", "df.head()" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "from tensorflow.keras.utils import to_categorical\n", "\n", "predictors = df.iloc[:, 1:].astype(np.float32).to_numpy()\n", "target = to_categorical(df.iloc[:, 0].astype(np.float32).to_numpy())" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "input_shape = (10, )\n", "\n", "def get_new_model(input_shape = input_shape):\n", " model = tf.keras.Sequential()\n", " model.add(tf.keras.layers.Dense(100, activation='relu', input_shape = input_shape))\n", " model.add(tf.keras.layers.Dense(100, activation='relu'))\n", " model.add(tf.keras.layers.Dense(2, activation='softmax'))\n", " return model" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "\n", "Testing model with learning rate: 0.000001\n", "\n", "Epoch 1/10\n", "28/28 [==============================] - 0s 1ms/step - loss: 2.1949\n", "Epoch 2/10\n", "28/28 [==============================] - 0s 1ms/step - loss: 2.1499\n", "Epoch 3/10\n", "28/28 [==============================] - 0s 1ms/step - loss: 2.1049\n", "Epoch 4/10\n", "28/28 [==============================] - 0s 997us/step - loss: 2.0602\n", "Epoch 5/10\n", "28/28 [==============================] - 0s 
954us/step - loss: 2.0157\n", "Epoch 6/10\n", "28/28 [==============================] - 0s 1ms/step - loss: 1.9715\n", "Epoch 7/10\n", "28/28 [==============================] - 0s 1ms/step - loss: 1.9275\n", "Epoch 8/10\n", "28/28 [==============================] - 0s 988us/step - loss: 1.8839\n", "Epoch 9/10\n", "28/28 [==============================] - 0s 1ms/step - loss: 1.8403\n", "Epoch 10/10\n", "28/28 [==============================] - 0s 1ms/step - loss: 1.7973\n", "\n", "\n", "Testing model with learning rate: 0.010000\n", "\n", "Epoch 1/10\n", "28/28 [==============================] - 0s 1ms/step - loss: 1.8178\n", "Epoch 2/10\n", "28/28 [==============================] - 0s 1ms/step - loss: 0.8062\n", "Epoch 3/10\n", "28/28 [==============================] - 0s 1ms/step - loss: 0.6392\n", "Epoch 4/10\n", "28/28 [==============================] - 0s 992us/step - loss: 0.6361\n", "Epoch 5/10\n", "28/28 [==============================] - 0s 1ms/step - loss: 0.6157\n", "Epoch 6/10\n", "28/28 [==============================] - 0s 992us/step - loss: 0.5927\n", "Epoch 7/10\n", "28/28 [==============================] - 0s 932us/step - loss: 0.5925\n", "Epoch 8/10\n", "28/28 [==============================] - 0s 957us/step - loss: 0.5918\n", "Epoch 9/10\n", "28/28 [==============================] - 0s 959us/step - loss: 0.5884\n", "Epoch 10/10\n", "28/28 [==============================] - 0s 1ms/step - loss: 0.5908\n", "\n", "\n", "Testing model with learning rate: 1.000000\n", "\n", "Epoch 1/10\n", "28/28 [==============================] - 0s 1ms/step - loss: 26042796.0000\n", "Epoch 2/10\n", "28/28 [==============================] - 0s 1ms/step - loss: 0.6709\n", "Epoch 3/10\n", "28/28 [==============================] - 0s 1ms/step - loss: 0.6674\n", "Epoch 4/10\n", "28/28 [==============================] - 0s 977us/step - loss: 0.6734\n", "Epoch 5/10\n", "28/28 [==============================] - 0s 993us/step - loss: 0.6778\n", "Epoch 6/10\n", "28/28 
[==============================] - 0s 998us/step - loss: 0.6676\n", "Epoch 7/10\n", "28/28 [==============================] - 0s 1ms/step - loss: 0.6733\n", "Epoch 8/10\n", "28/28 [==============================] - 0s 949us/step - loss: 0.6736\n", "Epoch 9/10\n", "28/28 [==============================] - 0s 966us/step - loss: 0.6697\n", "Epoch 10/10\n", "28/28 [==============================] - 0s 1ms/step - loss: 0.6697\n" ] } ], "source": [ "# Create list of learning rates: lr_to_test\n", "lr_to_test = [0.000001, 0.01, 1]\n", "\n", "# Loop over learning rates\n", "for lr in lr_to_test:\n", "    print('\\n\\nTesting model with learning rate: %f\\n' % lr)\n", "    \n", "    # Build new model to test, unaffected by previous models\n", "    model = get_new_model()\n", "    \n", "    # Create SGD optimizer with specified learning rate: my_optimizer\n", "    # (`learning_rate` is the current argument name; `lr` is a deprecated alias)\n", "    my_optimizer = tf.keras.optimizers.SGD(learning_rate=lr)\n", "    \n", "    # Compile the model\n", "    model.compile(optimizer=my_optimizer, loss='categorical_crossentropy')\n", "    \n", "    # Fit the model\n", "    model.fit(predictors, target, epochs=10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Model validation\n", "- Validation in deep learning\n", "    - Commonly use validation split rather than cross-validation\n", "    - Deep learning is widely used on large datasets\n", "    - A single validation score is based on a large amount of data, and is reliable\n", "- Experimentation\n", "    - Experiment with different architectures\n", "        - More layers\n", "        - Fewer layers\n", "        - Layers with more nodes\n", "        - Layers with fewer nodes\n", "    - Creating a great model requires experimentation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Evaluating model accuracy on validation dataset\n", "Now it's your turn to monitor model accuracy with a validation data set. A model definition has been provided as `model`. Your job is to add the code to compile it and then fit it. You'll check the validation score in each epoch." 
] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/10\n", "20/20 [==============================] - 0s 7ms/step - loss: 0.8035 - accuracy: 0.6308 - val_loss: 0.5820 - val_accuracy: 0.7090\n", "Epoch 2/10\n", "20/20 [==============================] - 0s 2ms/step - loss: 0.6391 - accuracy: 0.6934 - val_loss: 0.5344 - val_accuracy: 0.7239\n", "Epoch 3/10\n", "20/20 [==============================] - 0s 2ms/step - loss: 0.7710 - accuracy: 0.6372 - val_loss: 0.7069 - val_accuracy: 0.7239\n", "Epoch 4/10\n", "20/20 [==============================] - 0s 2ms/step - loss: 0.7163 - accuracy: 0.6533 - val_loss: 0.6188 - val_accuracy: 0.6754\n", "Epoch 5/10\n", "20/20 [==============================] - 0s 3ms/step - loss: 0.6758 - accuracy: 0.6822 - val_loss: 0.5743 - val_accuracy: 0.7239\n", "Epoch 6/10\n", "20/20 [==============================] - 0s 2ms/step - loss: 0.6334 - accuracy: 0.6758 - val_loss: 0.5045 - val_accuracy: 0.7463\n", "Epoch 7/10\n", "20/20 [==============================] - 0s 2ms/step - loss: 0.6088 - accuracy: 0.6870 - val_loss: 0.6298 - val_accuracy: 0.6530\n", "Epoch 8/10\n", "20/20 [==============================] - 0s 2ms/step - loss: 0.6201 - accuracy: 0.6693 - val_loss: 0.5187 - val_accuracy: 0.7537\n", "Epoch 9/10\n", "20/20 [==============================] - 0s 2ms/step - loss: 0.5964 - accuracy: 0.7127 - val_loss: 0.5387 - val_accuracy: 0.7388\n", "Epoch 10/10\n", "20/20 [==============================] - 0s 2ms/step - loss: 0.6187 - accuracy: 0.6709 - val_loss: 0.4768 - val_accuracy: 0.7687\n" ] } ], "source": [ "# Save the number of columns in predictors: n_cols\n", "n_cols = predictors.shape[1]\n", "input_shape = (n_cols, )\n", "\n", "# Specify the model\n", "model = tf.keras.Sequential()\n", "model.add(tf.keras.layers.Dense(100, activation='relu', input_shape=input_shape))\n", "model.add(tf.keras.layers.Dense(100, activation='relu'))\n", 
"model.add(tf.keras.layers.Dense(2, activation='softmax'))\n", "\n", "# Compile the model\n", "model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])\n", "\n", "# Fit the model\n", "hist = model.fit(predictors, target, epochs=10, validation_split=0.3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Early stopping: Optimizing the optimization\n", "Now that you know how to monitor your model performance throughout optimization, you can use early stopping to stop optimization when it isn't helping any more. Since the optimization stops automatically when it isn't helping, you can also set a high value for epochs in your call to `.fit()`." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/30\n", "20/20 [==============================] - 0s 6ms/step - loss: 0.6940 - accuracy: 0.6469 - val_loss: 0.5595 - val_accuracy: 0.7201\n", "Epoch 2/30\n", "20/20 [==============================] - 0s 2ms/step - loss: 0.6713 - accuracy: 0.6549 - val_loss: 0.5452 - val_accuracy: 0.7276\n", "Epoch 3/30\n", "20/20 [==============================] - 0s 2ms/step - loss: 0.6777 - accuracy: 0.6613 - val_loss: 0.5755 - val_accuracy: 0.7351\n", "Epoch 4/30\n", "20/20 [==============================] - 0s 2ms/step - loss: 0.6433 - accuracy: 0.6629 - val_loss: 0.5584 - val_accuracy: 0.7500\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from tensorflow.keras.callbacks import EarlyStopping\n", "\n", "# Save the number of columns in predictors: n_cols\n", "n_cols = predictors.shape[1]\n", "input_shape = (n_cols, )\n", "\n", "# Specify the model\n", "model = tf.keras.Sequential()\n", "model.add(tf.keras.layers.Dense(100, activation='relu', input_shape=input_shape))\n", "model.add(tf.keras.layers.Dense(100, activation='relu'))\n", "model.add(tf.keras.layers.Dense(2, 
activation='softmax'))\n", "\n", "# Compile the model\n", "model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])\n", "\n", "# Define early_stopping_monitor\n", "early_stopping_monitor = EarlyStopping(patience=2)\n", "\n", "# Fit the model\n", "model.fit(predictors, target, epochs=30, validation_split=0.3,\n", " callbacks=[early_stopping_monitor])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Because optimization stops automatically once it is no longer helping, it is fine to set the maximum number of epochs to 30 rather than the 10 used in the earlier exercises. Here, training stopped after 4 epochs: the validation loss was lowest at epoch 2 (0.5452), and with `patience=2` training ends after two consecutive epochs without improvement." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Experimenting with wider networks\n", "Now you know everything you need to begin experimenting with different models!\n", "\n", "A model called `model_1` has been pre-loaded. This is a relatively small network, with only 10 units in each hidden layer.\n", "\n", "In this exercise you'll create a new model called `model_2` which is similar to `model_1`, except it has 100 units in each hidden layer.\n", "\n", "After you create `model_2`, both models will be fitted, and a graph showing each model's validation loss at each epoch will be shown. We added the argument `verbose=False` in the fitting commands to print out fewer updates, since you will look at these graphically instead of as text.\n", "\n", "Because you are fitting two models, it will take a moment to see the outputs after you hit run, so be patient." 
] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "model_1 = tf.keras.Sequential()\n", "model_1.add(tf.keras.layers.Dense(10, activation='relu', input_shape=input_shape))\n", "model_1.add(tf.keras.layers.Dense(10, activation='relu'))\n", "model_1.add(tf.keras.layers.Dense(2, activation='softmax'))\n", "model_1.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Model: \"sequential_25\"\n", "_________________________________________________________________\n", "Layer (type) Output Shape Param # \n", "=================================================================\n", "dense_73 (Dense) (None, 10) 110 \n", "_________________________________________________________________\n", "dense_74 (Dense) (None, 10) 110 \n", "_________________________________________________________________\n", "dense_75 (Dense) (None, 2) 22 \n", "=================================================================\n", "Total params: 242\n", "Trainable params: 242\n", "Non-trainable params: 0\n", "_________________________________________________________________\n" ] } ], "source": [ "model_1.summary()" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Define early_stopping_monitor\n", "early_stopping_monitor = EarlyStopping(patience=2)\n", "\n", "# Create the new model: model_2\n", "model_2 = tf.keras.Sequential()\n", "\n", "# Add the first and second hidden layers\n", "model_2.add(tf.keras.layers.Dense(100, activation='relu', input_shape=input_shape))\n", "model_2.add(tf.keras.layers.Dense(100, activation='relu'))\n", "\n", "# Add the output layer\n", "model_2.add(tf.keras.layers.Dense(2, activation='softmax'))\n", "\n", "# Compile model_2\n", "model_2.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])\n", "\n", "# Fit model_1\n", "model_1_training = model_1.fit(predictors, target, epochs=15, validation_split=0.2,\n", " callbacks=[early_stopping_monitor], verbose=False)\n", "\n", "# Fit model_2\n", "model_2_training = model_2.fit(predictors, target, epochs=15, validation_split=0.2,\n", " callbacks=[early_stopping_monitor], verbose=False)\n", "\n", "# Create the plot\n", "plt.plot(model_1_training.history['val_loss'], 'r', model_2_training.history['val_loss'], 'b');\n", "plt.xlabel('Epochs')\n", "plt.ylabel('Validation score');" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Adding layers to a network\n", "You've seen how to experiment with wider networks. In this exercise, you'll try a deeper network (more hidden layers).\n", "\n", "Once again, you have a baseline model called `model_1` as a starting point. It has 1 hidden layer, with 50 units. You can see a summary of that model's structure printed out. You will create a similar network with 3 hidden layers (still keeping 50 units in each layer).\n", "\n", "This will again take a moment to fit both models, so you'll need to wait a few seconds to see the results after you run your code." 
] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [], "source": [ "model_1 = tf.keras.Sequential()\n", "model_1.add(tf.keras.layers.Dense(50, activation='relu', input_shape=input_shape))\n", "model_1.add(tf.keras.layers.Dense(2, activation='softmax'))\n", "model_1.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Model: \"sequential_32\"\n", "_________________________________________________________________\n", "Layer (type) Output Shape Param # \n", "=================================================================\n", "dense_97 (Dense) (None, 50) 550 \n", "_________________________________________________________________\n", "dense_98 (Dense) (None, 2) 102 \n", "=================================================================\n", "Total params: 652\n", "Trainable params: 652\n", "Non-trainable params: 0\n", "_________________________________________________________________\n" ] } ], "source": [ "model_1.summary()" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Create the new model: model_2\n", "model_2 = tf.keras.Sequential()\n", "\n", "# Add the first, second, and third hidden layers\n", "model_2.add(tf.keras.layers.Dense(50, activation='relu', input_shape=input_shape))\n", "model_2.add(tf.keras.layers.Dense(50, activation='relu'))\n", "model_2.add(tf.keras.layers.Dense(50, activation='relu'))\n", "\n", "# Add the output layer\n", "model_2.add(tf.keras.layers.Dense(2, activation='softmax'))\n", "\n", "# Compile model_2\n", "model_2.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])\n", "\n", "# Fit model 1\n", "model_1_training = model_1.fit(predictors, target, epochs=20, validation_split=0.4, callbacks=[early_stopping_monitor], verbose=False)\n", "\n", "# Fit model 2\n", "model_2_training = model_2.fit(predictors, target, epochs=20, validation_split=0.4, callbacks=[early_stopping_monitor], verbose=False)\n", "\n", "# Create the plot\n", "plt.plot(model_1_training.history['val_loss'], 'r', model_2_training.history['val_loss'], 'b');\n", "plt.xlabel('Epochs');\n", "plt.ylabel('Validation score');" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Thinking about model capacity\n", "- Overfitting\n", "![of](image/of.png)\n", "- Workflow for optimizing model capacity\n", " - Start with a small network\n", " - Gradually increase capacity\n", " - Keep increasing capacity until validation score is no longer improving" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Stepping up to images" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Building your own digit recognition model\n", "You've reached the final exercise of the course - you now know everything you need to build an accurate model to recognize handwritten digits!\n", "\n", "To add an extra challenge, we've loaded only 2500 images, rather than 60000 which you will see in some published results. 
Deep learning models perform better with more data; however, they also take longer to train, especially as they become more complex.\n", "\n", "If you have a computer with a CUDA-compatible GPU, you can take advantage of it to reduce computation time. If you don't have a GPU, no problem! You can set up a deep learning environment in the cloud that can run your models on a GPU. Here is a [blog post](https://www.datacamp.com/community/tutorials/deep-learning-jupyter-aws) by Dan that explains how to do this - check it out after completing this exercise! It is a great next step as you continue your deep learning journey." ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "  <thead>\n",
"    <tr><th></th><th>0</th><th>1</th><th>2</th><th>3</th><th>4</th><th>5</th><th>6</th><th>7</th><th>8</th><th>9</th><th>...</th><th>775</th><th>776</th><th>777</th><th>778</th><th>779</th><th>780</th><th>781</th><th>782</th><th>783</th><th>784</th></tr>\n",
"  </thead>\n", "  <tbody>\n",
"    <tr><th>0</th><td>5</td><td>0</td><td>0.1</td><td>0.2</td><td>0.3</td><td>0.4</td><td>0.5</td><td>0.6</td><td>0.7</td><td>0.8</td><td>...</td><td>0.608</td><td>0.609</td><td>0.61</td><td>0.611</td><td>0.612</td><td>0.613</td><td>0.614</td><td>0.615</td><td>0.616</td><td>0.617</td></tr>\n",
"    <tr><th>1</th><td>4</td><td>0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>...</td><td>0.000</td><td>0.000</td><td>0.00</td><td>0.000</td><td>0.000</td><td>0.000</td><td>0.000</td><td>0.000</td><td>0.000</td><td>0.000</td></tr>\n",
"    <tr><th>2</th><td>3</td><td>0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>...</td><td>0.000</td><td>0.000</td><td>0.00</td><td>0.000</td><td>0.000</td><td>0.000</td><td>0.000</td><td>0.000</td><td>0.000</td><td>0.000</td></tr>\n",
"    <tr><th>3</th><td>0</td><td>0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>...</td><td>0.000</td><td>0.000</td><td>0.00</td><td>0.000</td><td>0.000</td><td>0.000</td><td>0.000</td><td>0.000</td><td>0.000</td><td>0.000</td></tr>\n",
"    <tr><th>4</th><td>2</td><td>0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>0.0</td><td>...</td><td>0.000</td><td>0.000</td><td>0.00</td><td>0.000</td><td>0.000</td><td>0.000</td><td>0.000</td><td>0.000</td><td>0.000</td><td>0.000</td></tr>\n",
"  </tbody>\n", "</table>\n", "<p>5 rows × 785 columns</p>\n", "</div>
" ], "text/plain": [ " 0 1 2 3 4 5 6 7 8 9 ... 775 776 777 \\\n", "0 5 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 ... 0.608 0.609 0.61 \n", "1 4 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.000 0.000 0.00 \n", "2 3 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.000 0.000 0.00 \n", "3 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.000 0.000 0.00 \n", "4 2 0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.000 0.000 0.00 \n", "\n", " 778 779 780 781 782 783 784 \n", "0 0.611 0.612 0.613 0.614 0.615 0.616 0.617 \n", "1 0.000 0.000 0.000 0.000 0.000 0.000 0.000 \n", "2 0.000 0.000 0.000 0.000 0.000 0.000 0.000 \n", "3 0.000 0.000 0.000 0.000 0.000 0.000 0.000 \n", "4 0.000 0.000 0.000 0.000 0.000 0.000 0.000 \n", "\n", "[5 rows x 785 columns]" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mnist = pd.read_csv('./dataset/mnist.csv', header=None)\n", "mnist.head()" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [], "source": [ "X = mnist.iloc[:, 1:].astype(np.float32).to_numpy()\n", "y = to_categorical(mnist.iloc[:, 0])" ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/50\n", "44/44 [==============================] - 0s 3ms/step - loss: 18.8407 - accuracy: 0.4229 - val_loss: 7.5732 - val_accuracy: 0.5391\n", "Epoch 2/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 4.0665 - accuracy: 0.6879 - val_loss: 4.7740 - val_accuracy: 0.6456\n", "Epoch 3/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 2.1160 - accuracy: 0.7729 - val_loss: 4.4106 - val_accuracy: 0.6689\n", "Epoch 4/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 1.2134 - accuracy: 0.8307 - val_loss: 3.7577 - val_accuracy: 0.6839\n", "Epoch 5/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 0.9683 - accuracy: 0.8614 - val_loss: 3.5194 - val_accuracy: 0.7238\n", "Epoch 6/50\n", "44/44 
[==============================] - 0s 2ms/step - loss: 0.5479 - accuracy: 0.9114 - val_loss: 3.1736 - val_accuracy: 0.7288\n", "Epoch 7/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 0.3776 - accuracy: 0.9221 - val_loss: 3.3420 - val_accuracy: 0.7255\n", "Epoch 8/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 0.2887 - accuracy: 0.9329 - val_loss: 3.0673 - val_accuracy: 0.7338\n", "Epoch 9/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 0.1504 - accuracy: 0.9664 - val_loss: 3.0627 - val_accuracy: 0.7371\n", "Epoch 10/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 0.0817 - accuracy: 0.9800 - val_loss: 3.0174 - val_accuracy: 0.7438\n", "Epoch 11/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 0.0606 - accuracy: 0.9821 - val_loss: 3.0648 - val_accuracy: 0.7388\n", "Epoch 12/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 0.0328 - accuracy: 0.9879 - val_loss: 3.0618 - val_accuracy: 0.7537\n", "Epoch 13/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 0.0253 - accuracy: 0.9914 - val_loss: 2.9957 - val_accuracy: 0.7687\n", "Epoch 14/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 0.0132 - accuracy: 0.9964 - val_loss: 3.0766 - val_accuracy: 0.7521\n", "Epoch 15/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 0.0045 - accuracy: 0.9993 - val_loss: 3.0787 - val_accuracy: 0.7471\n", "Epoch 16/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 0.0019 - accuracy: 1.0000 - val_loss: 3.0682 - val_accuracy: 0.7537\n", "Epoch 17/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 0.0017 - accuracy: 1.0000 - val_loss: 3.0551 - val_accuracy: 0.7554\n", "Epoch 18/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 9.4450e-04 - accuracy: 1.0000 - val_loss: 3.0612 - val_accuracy: 0.7504\n", "Epoch 19/50\n", "44/44 [==============================] - 
0s 2ms/step - loss: 7.6430e-04 - accuracy: 1.0000 - val_loss: 3.0624 - val_accuracy: 0.7504\n", "Epoch 20/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 6.7846e-04 - accuracy: 1.0000 - val_loss: 3.0623 - val_accuracy: 0.7521\n", "Epoch 21/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 6.2465e-04 - accuracy: 1.0000 - val_loss: 3.0610 - val_accuracy: 0.7537\n", "Epoch 22/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 5.8118e-04 - accuracy: 1.0000 - val_loss: 3.0643 - val_accuracy: 0.7504\n", "Epoch 23/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 5.3415e-04 - accuracy: 1.0000 - val_loss: 3.0627 - val_accuracy: 0.7537\n", "Epoch 24/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 5.0316e-04 - accuracy: 1.0000 - val_loss: 3.0637 - val_accuracy: 0.7537\n", "Epoch 25/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 4.6953e-04 - accuracy: 1.0000 - val_loss: 3.0648 - val_accuracy: 0.7537\n", "Epoch 26/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 4.3986e-04 - accuracy: 1.0000 - val_loss: 3.0635 - val_accuracy: 0.7537\n", "Epoch 27/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 4.1727e-04 - accuracy: 1.0000 - val_loss: 3.0630 - val_accuracy: 0.7537\n", "Epoch 28/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 3.9517e-04 - accuracy: 1.0000 - val_loss: 3.0641 - val_accuracy: 0.7537\n", "Epoch 29/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 3.7410e-04 - accuracy: 1.0000 - val_loss: 3.0674 - val_accuracy: 0.7537\n", "Epoch 30/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 3.5648e-04 - accuracy: 1.0000 - val_loss: 3.0671 - val_accuracy: 0.7554\n", "Epoch 31/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 3.3969e-04 - accuracy: 1.0000 - val_loss: 3.0688 - val_accuracy: 0.7554\n", "Epoch 32/50\n", "44/44 
[==============================] - 0s 2ms/step - loss: 3.2434e-04 - accuracy: 1.0000 - val_loss: 3.0653 - val_accuracy: 0.7554\n", "Epoch 33/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 3.1208e-04 - accuracy: 1.0000 - val_loss: 3.0667 - val_accuracy: 0.7554\n", "Epoch 34/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 2.9868e-04 - accuracy: 1.0000 - val_loss: 3.0664 - val_accuracy: 0.7554\n", "Epoch 35/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 2.8480e-04 - accuracy: 1.0000 - val_loss: 3.0667 - val_accuracy: 0.7554\n", "Epoch 36/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 2.7411e-04 - accuracy: 1.0000 - val_loss: 3.0662 - val_accuracy: 0.7554\n", "Epoch 37/50\n", "44/44 [==============================] - 0s 6ms/step - loss: 2.6219e-04 - accuracy: 1.0000 - val_loss: 3.0684 - val_accuracy: 0.7537\n", "Epoch 38/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 2.5434e-04 - accuracy: 1.0000 - val_loss: 3.0676 - val_accuracy: 0.7554\n", "Epoch 39/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 2.4130e-04 - accuracy: 1.0000 - val_loss: 3.0685 - val_accuracy: 0.7537\n", "Epoch 40/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 2.3264e-04 - accuracy: 1.0000 - val_loss: 3.0675 - val_accuracy: 0.7554\n", "Epoch 41/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 2.2436e-04 - accuracy: 1.0000 - val_loss: 3.0682 - val_accuracy: 0.7554\n", "Epoch 42/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 2.1692e-04 - accuracy: 1.0000 - val_loss: 3.0698 - val_accuracy: 0.7537\n", "Epoch 43/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 2.0934e-04 - accuracy: 1.0000 - val_loss: 3.0683 - val_accuracy: 0.7554\n", "Epoch 44/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 2.0208e-04 - accuracy: 1.0000 - val_loss: 3.0703 - val_accuracy: 0.7554\n", "Epoch 
45/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 1.9523e-04 - accuracy: 1.0000 - val_loss: 3.0680 - val_accuracy: 0.7554\n", "Epoch 46/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 1.8736e-04 - accuracy: 1.0000 - val_loss: 3.0668 - val_accuracy: 0.7571\n", "Epoch 47/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 1.7953e-04 - accuracy: 1.0000 - val_loss: 3.0687 - val_accuracy: 0.7554\n", "Epoch 48/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 1.7335e-04 - accuracy: 1.0000 - val_loss: 3.0697 - val_accuracy: 0.7554\n", "Epoch 49/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 1.6714e-04 - accuracy: 1.0000 - val_loss: 3.0696 - val_accuracy: 0.7571\n", "Epoch 50/50\n", "44/44 [==============================] - 0s 2ms/step - loss: 1.6202e-04 - accuracy: 1.0000 - val_loss: 3.0690 - val_accuracy: 0.7571\n" ] } ], "source": [ "# Create the model: model\n", "model = tf.keras.Sequential()\n", "\n", "# Add the first hidden layer\n", "model.add(tf.keras.layers.Dense(50, activation='relu', input_shape=(X.shape[1], )))\n", "\n", "# Add the second hidden layer\n", "model.add(tf.keras.layers.Dense(50, activation='relu'))\n", "\n", "# Add the output layer\n", "model.add(tf.keras.layers.Dense(10, activation='softmax'))\n", "\n", "# Compile the model\n", "model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])\n", "\n", "# Fit the model\n", "model.fit(X, y, validation_split=0.3, epochs=50);" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.6" } }, "nbformat": 4, "nbformat_minor": 4 }