{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# How to simply use keras\n", "* Reference\n", " + https://www.tensorflow.org/guide/keras" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setup" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1.12.0\n", "2.1.6-tf\n" ] } ], "source": [ "from __future__ import absolute_import, division, print_function\n", "import numpy as np\n", "import tensorflow as tf\n", "\n", "keras = tf.keras\n", "\n", "print(tf.__version__)\n", "print(tf.keras.__version__)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Build a simple model\n", "https://www.tensorflow.org/guide/keras#build_a_simple_model\n", "### Sequential model\n", "In Keras, you assemble layers to build models. A model is (usually) a graph of layers. The most common type of model is a stack of layers: the `tf.keras.Sequential` model." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "## To build a simple, fully-connected network (i.e. 
multi-layer perceptron)\n", "# If you specify the input shape, the model is built incrementally as you add layers.\n", "# With the delayed-build pattern used here (no input shape specified),\n", "# the model has no weights until the first call\n", "# to a training/evaluation method (since it isn't built yet).\n", "\n", "model = keras.Sequential()\n", "model.add(keras.layers.Dense(units = 64, activation = 'relu')) \n", "model.add(keras.layers.Dense(units = 64, activation = 'relu'))\n", "model.add(keras.layers.Dense(units = 10, activation = 'softmax'))" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[]\n", "[]\n" ] } ], "source": [ "# No input shape was specified, so the model is not built yet:\n", "# both the global-variable collection and the graph's operation list\n", "# are still empty at this point.\n", "print(tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES))\n", "print(tf.get_default_graph().get_operations())\n", "del model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Configure the layers\n", "There are many `tf.keras.layers` available, with some common constructor parameters:\n", "\n", "* `activation`: Set the activation function for the layer. This parameter is specified by the name of a built-in function or as a callable object. By default, no activation is applied.\n", "* `kernel_initializer` and `bias_initializer`: The initialization schemes that create the layer's weights (kernel and bias). This parameter is a name or a callable object. The kernel defaults to the `\"Glorot uniform\"` initializer and the bias to zeros.\n", "* `kernel_regularizer` and `bias_regularizer`: The regularization schemes applied to the layer's weights (kernel and bias), such as L1 or L2 regularization. By default, no regularization is applied. 
\n", " \n", "The following instantiates `tf.keras.layers.Dense` layers using constructor arguments:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "<tensorflow.python.keras.layers.core.Dense at 0x7ff1e70874a8>" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "keras.backend.clear_session()\n", "tf.reset_default_graph()\n", "# Create a sigmoid layer:\n", "keras.layers.Dense(64, activation='sigmoid')\n", "# Or:\n", "keras.layers.Dense(64, activation=tf.sigmoid)\n", "\n", "# A linear layer with L1 regularization of factor 0.01 applied to the kernel matrix:\n", "keras.layers.Dense(64, kernel_regularizer=tf.keras.regularizers.l1(0.01))\n", "\n", "# A linear layer with L2 regularization of factor 0.01 applied to the bias vector:\n", "keras.layers.Dense(64, bias_regularizer=tf.keras.regularizers.l2(0.01))\n", "\n", "# A linear layer with a kernel initialized to a random orthogonal matrix:\n", "keras.layers.Dense(64, kernel_initializer='orthogonal')\n", "\n", "# A linear layer with a bias vector initialized to 2.0s:\n", "keras.layers.Dense(64, bias_initializer=tf.keras.initializers.constant(2.0))" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[]\n", "[]\n" ] } ], "source": [ "print(tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES))\n", "print(tf.get_default_graph().get_operations())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Train and evaluate\n", "https://www.tensorflow.org/guide/keras?hl=ko#train_and_evaluate" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Set up training\n", "After the model is constructed, configure its learning process by calling the `compile` method. `tf.keras.Model.compile` (and likewise `tf.keras.Sequential.compile`) takes three important arguments: \n", " \n", "* `optimizer`: This object specifies the training procedure. 
Pass it optimizer instances from the `tf.train` module, such as `tf.train.AdamOptimizer`, `tf.train.RMSPropOptimizer`, or `tf.train.GradientDescentOptimizer`.\n", "* `loss`: The function to minimize during optimization. Common choices include mean square error (`mse`), `categorical_crossentropy`, and `binary_crossentropy`. Loss functions are specified by name or by passing a callable object from the `tf.keras.losses` module.\n", "* `metrics`: Used to monitor training. These are string names or callables from the `tf.keras.metrics` module." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[<tf.Variable 'dense/kernel:0' shape=(32, 64) dtype=float32>, <tf.Variable 'dense/bias:0' shape=(64,) dtype=float32>, <tf.Variable 'dense_1/kernel:0' shape=(64, 64) dtype=float32>, <tf.Variable 'dense_1/bias:0' shape=(64,) dtype=float32>, <tf.Variable 'dense_2/kernel:0' shape=(64, 10) dtype=float32>, <tf.Variable 'dense_2/bias:0' shape=(10,) dtype=float32>]\n", "[<tf.Operation 'dense_input' type=Placeholder>, <tf.Operation 'dense/kernel/Initializer/random_uniform/shape' type=Const>, <tf.Operation 'dense/kernel/Initializer/random_uniform/min' type=Const>, <tf.Operation 'dense/kernel/Initializer/random_uniform/max' type=Const>, <tf.Operation 'dense/kernel/Initializer/random_uniform/RandomUniform' type=RandomUniform>, <tf.Operation 'dense/kernel/Initializer/random_uniform/sub' type=Sub>, <tf.Operation 'dense/kernel/Initializer/random_uniform/mul' type=Mul>, <tf.Operation 'dense/kernel/Initializer/random_uniform' type=Add>, <tf.Operation 'dense/kernel' type=VarHandleOp>, <tf.Operation 'dense/kernel/IsInitialized/VarIsInitializedOp' type=VarIsInitializedOp>, <tf.Operation 'dense/kernel/Assign' type=AssignVariableOp>, <tf.Operation 'dense/kernel/Read/ReadVariableOp' type=ReadVariableOp>, <tf.Operation 'dense/bias/Initializer/zeros' type=Const>, <tf.Operation 'dense/bias' type=VarHandleOp>, <tf.Operation 
'dense/bias/IsInitialized/VarIsInitializedOp' type=VarIsInitializedOp>, <tf.Operation 'dense/bias/Assign' type=AssignVariableOp>, <tf.Operation 'dense/bias/Read/ReadVariableOp' type=ReadVariableOp>, <tf.Operation 'dense/MatMul/ReadVariableOp' type=ReadVariableOp>, <tf.Operation 'dense/MatMul' type=MatMul>, <tf.Operation 'dense/BiasAdd/ReadVariableOp' type=ReadVariableOp>, <tf.Operation 'dense/BiasAdd' type=BiasAdd>, <tf.Operation 'dense/Relu' type=Relu>, <tf.Operation 'dense_1/kernel/Initializer/random_uniform/shape' type=Const>, <tf.Operation 'dense_1/kernel/Initializer/random_uniform/min' type=Const>, <tf.Operation 'dense_1/kernel/Initializer/random_uniform/max' type=Const>, <tf.Operation 'dense_1/kernel/Initializer/random_uniform/RandomUniform' type=RandomUniform>, <tf.Operation 'dense_1/kernel/Initializer/random_uniform/sub' type=Sub>, <tf.Operation 'dense_1/kernel/Initializer/random_uniform/mul' type=Mul>, <tf.Operation 'dense_1/kernel/Initializer/random_uniform' type=Add>, <tf.Operation 'dense_1/kernel' type=VarHandleOp>, <tf.Operation 'dense_1/kernel/IsInitialized/VarIsInitializedOp' type=VarIsInitializedOp>, <tf.Operation 'dense_1/kernel/Assign' type=AssignVariableOp>, <tf.Operation 'dense_1/kernel/Read/ReadVariableOp' type=ReadVariableOp>, <tf.Operation 'dense_1/bias/Initializer/zeros' type=Const>, <tf.Operation 'dense_1/bias' type=VarHandleOp>, <tf.Operation 'dense_1/bias/IsInitialized/VarIsInitializedOp' type=VarIsInitializedOp>, <tf.Operation 'dense_1/bias/Assign' type=AssignVariableOp>, <tf.Operation 'dense_1/bias/Read/ReadVariableOp' type=ReadVariableOp>, <tf.Operation 'dense_1/MatMul/ReadVariableOp' type=ReadVariableOp>, <tf.Operation 'dense_1/MatMul' type=MatMul>, <tf.Operation 'dense_1/BiasAdd/ReadVariableOp' type=ReadVariableOp>, <tf.Operation 'dense_1/BiasAdd' type=BiasAdd>, <tf.Operation 'dense_1/Relu' type=Relu>, <tf.Operation 'dense_2/kernel/Initializer/random_uniform/shape' type=Const>, <tf.Operation 
'dense_2/kernel/Initializer/random_uniform/min' type=Const>, <tf.Operation 'dense_2/kernel/Initializer/random_uniform/max' type=Const>, <tf.Operation 'dense_2/kernel/Initializer/random_uniform/RandomUniform' type=RandomUniform>, <tf.Operation 'dense_2/kernel/Initializer/random_uniform/sub' type=Sub>, <tf.Operation 'dense_2/kernel/Initializer/random_uniform/mul' type=Mul>, <tf.Operation 'dense_2/kernel/Initializer/random_uniform' type=Add>, <tf.Operation 'dense_2/kernel' type=VarHandleOp>, <tf.Operation 'dense_2/kernel/IsInitialized/VarIsInitializedOp' type=VarIsInitializedOp>, <tf.Operation 'dense_2/kernel/Assign' type=AssignVariableOp>, <tf.Operation 'dense_2/kernel/Read/ReadVariableOp' type=ReadVariableOp>, <tf.Operation 'dense_2/bias/Initializer/zeros' type=Const>, <tf.Operation 'dense_2/bias' type=VarHandleOp>, <tf.Operation 'dense_2/bias/IsInitialized/VarIsInitializedOp' type=VarIsInitializedOp>, <tf.Operation 'dense_2/bias/Assign' type=AssignVariableOp>, <tf.Operation 'dense_2/bias/Read/ReadVariableOp' type=ReadVariableOp>, <tf.Operation 'dense_2/MatMul/ReadVariableOp' type=ReadVariableOp>, <tf.Operation 'dense_2/MatMul' type=MatMul>, <tf.Operation 'dense_2/BiasAdd/ReadVariableOp' type=ReadVariableOp>, <tf.Operation 'dense_2/BiasAdd' type=BiasAdd>, <tf.Operation 'dense_2/Softmax' type=Softmax>]\n" ] } ], "source": [ "keras.backend.clear_session()\n", "tf.reset_default_graph()\n", "\n", "model = keras.Sequential()\n", "model.add(keras.layers.Dense(units=64, activation='relu', input_shape = (32,)))\n", "model.add(keras.layers.Dense(units=64, activation='relu'))\n", "model.add(keras.layers.Dense(units=10, activation='softmax'))\n", "\n", "print(tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES))\n", "print(tf.get_default_graph().get_operations())" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "# Compile\n", "model.compile(optimizer=tf.train.AdamOptimizer(0.001),\n", " loss='categorical_crossentropy',\n", " 
metrics=['accuracy'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following shows a few examples of configuring a model for training:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "# Configure a model for mean-squared error regression.\n", "model.compile(optimizer=tf.train.AdamOptimizer(0.01),\n", " loss='mse', # mean squared error\n", " metrics=['mae']) # mean absolute error\n", "\n", "# Configure a model for categorical classification.\n", "model.compile(optimizer=tf.train.RMSPropOptimizer(0.01),\n", " loss=keras.losses.categorical_crossentropy,\n", " metrics=[keras.metrics.categorical_accuracy])\n", "\n", "del model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Input NumPy data" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "float32 int32\n" ] } ], "source": [ "keras.backend.clear_session()\n", "tf.reset_default_graph()\n", "\n", "# Numpy dataset\n", "tr_data = np.random.random((1000, 32)).astype(np.float32)\n", "tr_label = np.random.randint(low=0, high=10, size = 1000).astype(np.int32)\n", "\n", "val_data = np.random.random((100, 32)).astype(np.float32)\n", "val_label = np.random.randint(low=0, high=10, size = 100).astype(np.int32)\n", "\n", "tst_data = np.random.random((100, 32)).astype(np.float32)\n", "tst_label = np.random.randint(low=0, high=10, size = 100).astype(np.int32)\n", "\n", "print(tr_data.dtype, tr_label.dtype)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "# Create a model\n", "model = keras.Sequential()\n", "model.add(keras.layers.Dense(units=64, activation='relu'))\n", "model.add(keras.layers.Dense(units=64, activation='relu'))\n", "model.add(keras.layers.Dense(units=10, activation='softmax'))" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", 
"text": [ "Train on 1000 samples, validate on 100 samples\n", "Epoch 1/5\n", "1000/1000 [==============================] - 1s 692us/step - loss: 2.3259 - acc: 0.1010 - val_loss: 2.3399 - val_acc: 0.0800\n", "Epoch 2/5\n", "1000/1000 [==============================] - 0s 48us/step - loss: 2.3171 - acc: 0.1030 - val_loss: 2.3322 - val_acc: 0.0800\n", "Epoch 3/5\n", "1000/1000 [==============================] - 0s 40us/step - loss: 2.3123 - acc: 0.1080 - val_loss: 2.3265 - val_acc: 0.0700\n", "Epoch 4/5\n", "1000/1000 [==============================] - 0s 43us/step - loss: 2.3083 - acc: 0.1140 - val_loss: 2.3239 - val_acc: 0.0600\n", "Epoch 5/5\n", "1000/1000 [==============================] - 0s 46us/step - loss: 2.3055 - acc: 0.1150 - val_loss: 2.3202 - val_acc: 0.0600\n" ] }, { "data": { "text/plain": [ "<tensorflow.python.keras.callbacks.History at 0x7ff1e94e6128>" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "model.compile(optimizer=tf.train.GradientDescentOptimizer(.01), \n", " loss=keras.losses.sparse_categorical_crossentropy,\n", " metrics=['accuracy'])\n", "\n", "model.fit(x=tr_data, y=tr_label, epochs=5, batch_size=32, validation_data=(val_data, val_label))" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['loss', 'acc']\n", "100/100 [==============================] - 0s 34us/step\n", "[2.2931285667419434, 0.1]\n", "(100, 10)\n" ] } ], "source": [ "# Evaluate and predict\n", "print(model.metrics_names)\n", "print(model.evaluate(x=tst_data, y=tst_label))\n", "print(model.predict(x=tst_data).shape)\n", "\n", "del model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Input tf.data datasets\n", "Pass a `tf.data.Dataset` instance to the `fit`, `evaluate`, `predict` method.\n", "\n", "* issue in tf 1.12 (but this issue is resolved in tf-nightly) \n", "When passing `tf.data.Dataset` instance to `model.fit` method 
of a model built with `tf.keras.Sequential`, `tf.keras.Model`,\n", "or a subclass of `tf.keras.Model`, setting the `metrics` argument to `'accuracy'` in `model.compile` raises a `TypeError`." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[]\n", "[]\n" ] } ], "source": [ "keras.backend.clear_session() # very important!\n", "tf.reset_default_graph()\n", "\n", "print(tf.get_default_graph().get_operations())\n", "print(tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES))" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(tf.float32, tf.int32)\n" ] } ], "source": [ "# tf.data.Dataset instance\n", "tr_data = np.random.random((1000, 32)).astype(np.float32)\n", "tr_label = np.random.randint(low=0, high=10, size = 1000).astype(np.int32)\n", "tr_dataset = tf.data.Dataset.from_tensor_slices((tr_data, tr_label))\n", "tr_dataset = tr_dataset.batch(batch_size=32)\n", "tr_dataset = tr_dataset.repeat()\n", "\n", "val_data = np.random.random((100, 32)).astype(np.float32)\n", "val_label = np.random.randint(low=0, high=10, size = 100).astype(np.int32)\n", "val_dataset = tf.data.Dataset.from_tensor_slices((val_data, val_label))\n", "val_dataset = val_dataset.batch(batch_size=100).repeat()\n", "\n", "tst_data = np.random.random((100, 32)).astype(np.float32)\n", "tst_label = np.random.randint(low=0, high=10, size = 100).astype(np.int32)\n", "tst_dataset = tf.data.Dataset.from_tensor_slices((tst_data, tst_label))\n", "tst_dataset = tst_dataset.batch(batch_size=100)\n", "\n", "print(tr_dataset.output_types)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/5\n", "31/31 [==============================] - 0s 3ms/step - loss: 2.3375 - val_loss: 2.3473\n", "Epoch 2/5\n", "31/31 [==============================] - 0s 1ms/step - 
loss: 2.3215 - val_loss: 2.3414\n", "Epoch 3/5\n", "31/31 [==============================] - 0s 1ms/step - loss: 2.3140 - val_loss: 2.3372\n", "Epoch 4/5\n", "31/31 [==============================] - 0s 1ms/step - loss: 2.3098 - val_loss: 2.3354\n", "Epoch 5/5\n", "31/31 [==============================] - 0s 1ms/step - loss: 2.3070 - val_loss: 2.3337\n" ] }, { "data": { "text/plain": [ "<tensorflow.python.keras.callbacks.History at 0x7ff1e4109d30>" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Training\n", "model = keras.Sequential()\n", "model.add(keras.layers.Dense(units=64, activation='relu'))\n", "model.add(keras.layers.Dense(units=64, activation='relu'))\n", "model.add(keras.layers.Dense(units=10, activation='softmax'))\n", "model.compile(optimizer=tf.train.GradientDescentOptimizer(.01), \n", " loss=keras.losses.sparse_categorical_crossentropy)\n", "\n", "model.fit(tr_dataset, epochs = 5, steps_per_epoch = 1000 // 32,\n", " validation_data = val_dataset, validation_steps = 1)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['loss']\n", "1/1 [==============================] - 0s 653us/step\n", "2.3561947345733643\n", "(32, 10)\n" ] } ], "source": [ "# Evaluate and predict\n", "print(model.metrics_names)\n", "print(model.evaluate(tst_dataset, steps = 1))\n", "print(model.predict(tst_dataset, steps = 1).shape)\n", "\n", "del model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Build advanced models\n", "https://www.tensorflow.org/guide/keras?hl=ko#build_advanced_models" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Functional API\n", "The `tf.keras.Sequential` model is a simple stack of layers that cannot represent arbitrary models. 
Use the Keras functional API to build complex model topologies such as:\n", "\n", "* Multi-input models,\n", "* Multi-output models,\n", "* Models with shared layers (the same layer called several times),\n", "* Models with non-sequential data flows (e.g. residual connections). \n", "\n", "Building a model with the functional API works like this: \n", "\n", "1. A layer instance is callable and returns a tensor.\n", "2. Input tensors and output tensors are used to define a `tf.keras.Model` instance.\n", "3. This model is trained just like the `Sequential` model. \n", "\n", "The following example uses the functional API to build a simple, fully-connected network:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "# Clear\n", "keras.backend.clear_session()\n", "tf.reset_default_graph()" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(tf.float32, tf.int32)\n", "Tensor(\"input_1:0\", shape=(?, 32), dtype=float32) <class 'tensorflow.python.framework.ops.Tensor'>\n", "[<tf.Variable 'dense/kernel:0' shape=(32, 64) dtype=float32>, <tf.Variable 'dense/bias:0' shape=(64,) dtype=float32>, <tf.Variable 'dense_1/kernel:0' shape=(64, 64) dtype=float32>, <tf.Variable 'dense_1/bias:0' shape=(64,) dtype=float32>, <tf.Variable 'dense_2/kernel:0' shape=(64, 10) dtype=float32>, <tf.Variable 'dense_2/bias:0' shape=(10,) dtype=float32>]\n", "[<tf.Operation 'tensors/component_0' type=Const>, <tf.Operation 'tensors/component_1' type=Const>, <tf.Operation 'batch_size' type=Const>, <tf.Operation 'drop_remainder' type=Const>, <tf.Operation 'count' type=Const>, <tf.Operation 'input_1' type=Placeholder>, <tf.Operation 'dense/kernel/Initializer/random_uniform/shape' type=Const>, <tf.Operation 'dense/kernel/Initializer/random_uniform/min' type=Const>, <tf.Operation 'dense/kernel/Initializer/random_uniform/max' type=Const>, <tf.Operation 
'dense/kernel/Initializer/random_uniform/RandomUniform' type=RandomUniform>, <tf.Operation 'dense/kernel/Initializer/random_uniform/sub' type=Sub>, <tf.Operation 'dense/kernel/Initializer/random_uniform/mul' type=Mul>, <tf.Operation 'dense/kernel/Initializer/random_uniform' type=Add>, <tf.Operation 'dense/kernel' type=VarHandleOp>, <tf.Operation 'dense/kernel/IsInitialized/VarIsInitializedOp' type=VarIsInitializedOp>, <tf.Operation 'dense/kernel/Assign' type=AssignVariableOp>, <tf.Operation 'dense/kernel/Read/ReadVariableOp' type=ReadVariableOp>, <tf.Operation 'dense/bias/Initializer/zeros' type=Const>, <tf.Operation 'dense/bias' type=VarHandleOp>, <tf.Operation 'dense/bias/IsInitialized/VarIsInitializedOp' type=VarIsInitializedOp>, <tf.Operation 'dense/bias/Assign' type=AssignVariableOp>, <tf.Operation 'dense/bias/Read/ReadVariableOp' type=ReadVariableOp>, <tf.Operation 'dense/MatMul/ReadVariableOp' type=ReadVariableOp>, <tf.Operation 'dense/MatMul' type=MatMul>, <tf.Operation 'dense/BiasAdd/ReadVariableOp' type=ReadVariableOp>, <tf.Operation 'dense/BiasAdd' type=BiasAdd>, <tf.Operation 'dense/Relu' type=Relu>, <tf.Operation 'dense_1/kernel/Initializer/random_uniform/shape' type=Const>, <tf.Operation 'dense_1/kernel/Initializer/random_uniform/min' type=Const>, <tf.Operation 'dense_1/kernel/Initializer/random_uniform/max' type=Const>, <tf.Operation 'dense_1/kernel/Initializer/random_uniform/RandomUniform' type=RandomUniform>, <tf.Operation 'dense_1/kernel/Initializer/random_uniform/sub' type=Sub>, <tf.Operation 'dense_1/kernel/Initializer/random_uniform/mul' type=Mul>, <tf.Operation 'dense_1/kernel/Initializer/random_uniform' type=Add>, <tf.Operation 'dense_1/kernel' type=VarHandleOp>, <tf.Operation 'dense_1/kernel/IsInitialized/VarIsInitializedOp' type=VarIsInitializedOp>, <tf.Operation 'dense_1/kernel/Assign' type=AssignVariableOp>, <tf.Operation 'dense_1/kernel/Read/ReadVariableOp' type=ReadVariableOp>, <tf.Operation 'dense_1/bias/Initializer/zeros' type=Const>, 
<tf.Operation 'dense_1/bias' type=VarHandleOp>, <tf.Operation 'dense_1/bias/IsInitialized/VarIsInitializedOp' type=VarIsInitializedOp>, <tf.Operation 'dense_1/bias/Assign' type=AssignVariableOp>, <tf.Operation 'dense_1/bias/Read/ReadVariableOp' type=ReadVariableOp>, <tf.Operation 'dense_1/MatMul/ReadVariableOp' type=ReadVariableOp>, <tf.Operation 'dense_1/MatMul' type=MatMul>, <tf.Operation 'dense_1/BiasAdd/ReadVariableOp' type=ReadVariableOp>, <tf.Operation 'dense_1/BiasAdd' type=BiasAdd>, <tf.Operation 'dense_1/Relu' type=Relu>, <tf.Operation 'dense_2/kernel/Initializer/random_uniform/shape' type=Const>, <tf.Operation 'dense_2/kernel/Initializer/random_uniform/min' type=Const>, <tf.Operation 'dense_2/kernel/Initializer/random_uniform/max' type=Const>, <tf.Operation 'dense_2/kernel/Initializer/random_uniform/RandomUniform' type=RandomUniform>, <tf.Operation 'dense_2/kernel/Initializer/random_uniform/sub' type=Sub>, <tf.Operation 'dense_2/kernel/Initializer/random_uniform/mul' type=Mul>, <tf.Operation 'dense_2/kernel/Initializer/random_uniform' type=Add>, <tf.Operation 'dense_2/kernel' type=VarHandleOp>, <tf.Operation 'dense_2/kernel/IsInitialized/VarIsInitializedOp' type=VarIsInitializedOp>, <tf.Operation 'dense_2/kernel/Assign' type=AssignVariableOp>, <tf.Operation 'dense_2/kernel/Read/ReadVariableOp' type=ReadVariableOp>, <tf.Operation 'dense_2/bias/Initializer/zeros' type=Const>, <tf.Operation 'dense_2/bias' type=VarHandleOp>, <tf.Operation 'dense_2/bias/IsInitialized/VarIsInitializedOp' type=VarIsInitializedOp>, <tf.Operation 'dense_2/bias/Assign' type=AssignVariableOp>, <tf.Operation 'dense_2/bias/Read/ReadVariableOp' type=ReadVariableOp>, <tf.Operation 'dense_2/MatMul/ReadVariableOp' type=ReadVariableOp>, <tf.Operation 'dense_2/MatMul' type=MatMul>, <tf.Operation 'dense_2/BiasAdd/ReadVariableOp' type=ReadVariableOp>, <tf.Operation 'dense_2/BiasAdd' type=BiasAdd>, <tf.Operation 'dense_2/Softmax' type=Softmax>]\n", "Epoch 1/5\n", "31/31 
[==============================] - 0s 5ms/step - loss: 2.3502\n", "Epoch 2/5\n", "31/31 [==============================] - 0s 2ms/step - loss: 2.3298\n", "Epoch 3/5\n", "31/31 [==============================] - 0s 1ms/step - loss: 2.3133\n", "Epoch 4/5\n", "31/31 [==============================] - 0s 1ms/step - loss: 2.3034\n", "Epoch 5/5\n", "31/31 [==============================] - 0s 1ms/step - loss: 2.2969\n" ] } ], "source": [ "data = np.random.random((1000, 32)).astype(np.float32)\n", "label = np.random.randint(low=0, high=10, size = 1000).astype(np.int32)\n", "dataset = tf.data.Dataset.from_tensor_slices((data, label))\n", "dataset = dataset.batch(batch_size=32).repeat()\n", "print(dataset.output_types)\n", "\n", "inputs = tf.keras.Input(shape=(32,)) # Returns a placeholder tensor\n", "print(inputs, type(inputs))\n", "\n", "# A layer instance is callable on a tensor, and returns a tensor.\n", "x = keras.layers.Dense(64, activation='relu')(inputs)\n", "x = keras.layers.Dense(64, activation='relu')(x)\n", "predictions = keras.layers.Dense(10, activation='softmax')(x)\n", "\n", "# Instantiate the model given inputs and outputs\n", "model = keras.Model(inputs = inputs, outputs = predictions)\n", "\n", "print(tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES))\n", "print(tf.get_default_graph().get_operations())\n", "\n", "# The compile step specifies the training configuration.\n", "model.compile(optimizer=tf.train.RMSPropOptimizer(.001),\n", " loss=keras.losses.sparse_categorical_crossentropy)\n", "# Trains for 5 epochs\n", "model.fit(dataset, epochs=5, steps_per_epoch = 1000//32)\n", "\n", "del model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Model subclassing\n", "Build a fully-customizable model by subclassing `tf.keras.Model` and defining your own forward pass. Create layers in the `__init__` method and set them as attributes of the class instance. 
Define the forward pass in the `call` method.\n", "\n", "Model subclassing is particularly useful when eager execution is enabled since the forward pass can be written imperatively." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[]\n", "[]\n" ] } ], "source": [ "# Clear\n", "keras.backend.clear_session()\n", "tf.reset_default_graph()\n", "\n", "print(tf.get_default_graph().get_operations())\n", "print(tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES))" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "# Subclassing tf.keras.Model\n", "class MLP(keras.Model):\n", "    def __init__(self, hidden_dim, num_classes):\n", "        super(MLP, self).__init__()\n", "        # Define your layers here.\n", "        self.hidden_layer = keras.layers.Dense(units = hidden_dim, activation='relu')\n", "        self.output_layer = keras.layers.Dense(units = num_classes, activation='softmax')\n", "\n", "    def call(self, inputs):\n", "        # Forward pass: hidden layer, then softmax output layer.\n", "        hidden = self.hidden_layer(inputs)\n", "        score = self.output_layer(hidden)\n", "        return score\n", "\n", "# Instantiate the MLP class\n", "mlp = MLP(hidden_dim=100, num_classes=10)\n", "\n", "# The compile step specifies the training configuration.\n", "mlp.compile(optimizer=tf.train.RMSPropOptimizer(.001),\n", " loss=keras.losses.sparse_categorical_crossentropy)" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(tf.float32, tf.int32)\n" ] } ], "source": [ "# tf.data.Dataset instance\n", "tr_data = np.random.random((1000, 32)).astype(np.float32)\n", "tr_label = np.random.randint(low=0, high=10, size = 1000).astype(np.int32)\n", "tr_dataset = tf.data.Dataset.from_tensor_slices((tr_data, tr_label))\n", "tr_dataset = tr_dataset.batch(batch_size=32)\n", "tr_dataset = tr_dataset.repeat()\n", "\n", "val_data = np.random.random((100, 32)).astype(np.float32)\n", "val_label = 
np.random.randint(low=0, high=10, size = 100).astype(np.int32)\n", "val_dataset = tf.data.Dataset.from_tensor_slices((val_data, val_label))\n", "val_dataset = val_dataset.batch(batch_size=100).repeat()\n", "\n", "tst_data = np.random.random((100, 32)).astype(np.float32)\n", "tst_label = np.random.randint(low=0, high=10, size = 100).astype(np.int32)\n", "tst_dataset = tf.data.Dataset.from_tensor_slices((tst_data, tst_label))\n", "tst_dataset = tst_dataset.batch(batch_size=100)\n", "\n", "print(tr_dataset.output_types)" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/5\n", "31/31 [==============================] - 0s 5ms/step - loss: 2.3889 - val_loss: 2.2315\n", "Epoch 2/5\n", "31/31 [==============================] - 0s 1ms/step - loss: 2.3525 - val_loss: 2.2768\n", "Epoch 3/5\n", "31/31 [==============================] - 0s 1ms/step - loss: 2.3126 - val_loss: 2.3388\n", "Epoch 4/5\n", "31/31 [==============================] - 0s 1ms/step - loss: 2.2994 - val_loss: 2.3531\n", "Epoch 5/5\n", "31/31 [==============================] - 0s 1ms/step - loss: 2.2890 - val_loss: 2.3579\n" ] } ], "source": [ "# Trains for 5 epochs\n", "mlp.fit(tr_dataset, epochs=5, steps_per_epoch=1000//32, validation_data = val_dataset, validation_steps=1)\n", "\n", "del mlp" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Custom layers\n", "Reading https://www.tensorflow.org/guide/keras?hl=ko#custom_layers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Callbacks\n", "https://www.tensorflow.org/guide/keras?hl=ko#callbacks \n", "\n", "A callback is an object passed to a model to customize and extend its behavior during training. 
You can write your own custom callback, or use the built-in `tf.keras.callbacks` that include:\n", "\n", "* `tf.keras.callbacks.ModelCheckpoint`: Save checkpoints of your model at regular intervals.\n", "* `tf.keras.callbacks.LearningRateScheduler`: Dynamically change the learning rate.\n", "* `tf.keras.callbacks.EarlyStopping`: Interrupt training when validation performance has stopped improving.\n", "* `tf.keras.callbacks.TensorBoard`: Monitor the model's behavior using TensorBoard. \n", "\n", "To use a `tf.keras.callbacks.Callback`, pass it to the model's `fit` method:" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[]\n", "[]\n" ] } ], "source": [ "# Clear\n", "keras.backend.clear_session()\n", "tf.reset_default_graph()\n", "\n", "print(tf.get_default_graph().get_operations())\n", "print(tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES))" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(tf.float32, tf.int32)\n" ] } ], "source": [ "# tf.data.Dataset instance\n", "tr_data = np.random.random((1000, 32)).astype(np.float32)\n", "tr_label = np.random.randint(low=0, high=10, size = 1000).astype(np.int32)\n", "tr_dataset = tf.data.Dataset.from_tensor_slices((tr_data, tr_label))\n", "tr_dataset = tr_dataset.batch(batch_size=32)\n", "tr_dataset = tr_dataset.repeat()\n", "\n", "val_data = np.random.random((100, 32)).astype(np.float32)\n", "val_label = np.random.randint(low=0, high=10, size = 100).astype(np.int32)\n", "val_dataset = tf.data.Dataset.from_tensor_slices((val_data, val_label))\n", "val_dataset = val_dataset.batch(batch_size=100).repeat()\n", "\n", "tst_data = np.random.random((100, 32)).astype(np.float32)\n", "tst_label = np.random.randint(low=0, high=10, size = 100).astype(np.int32)\n", "tst_dataset = tf.data.Dataset.from_tensor_slices((tst_data, tst_label))\n", "tst_dataset = 
tst_dataset.batch(batch_size=100)\n", "\n", "print(tr_dataset.output_types)" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/5\n", "31/31 [==============================] - 0s 3ms/step - loss: 2.3456 - val_loss: 2.3992\n", "Epoch 2/5\n", "31/31 [==============================] - 0s 1ms/step - loss: 2.3280 - val_loss: 2.3774\n", "Epoch 3/5\n", "31/31 [==============================] - 0s 1ms/step - loss: 2.3194 - val_loss: 2.3638\n", "Epoch 4/5\n", "31/31 [==============================] - 0s 1ms/step - loss: 2.3141 - val_loss: 2.3551\n", "Epoch 5/5\n", "31/31 [==============================] - 0s 1ms/step - loss: 2.3105 - val_loss: 2.3495\n" ] } ], "source": [ "# Create the list of callback objects\n", "callbacks = [\n", " # Interrupt training if `val_loss` has not improved for 2 consecutive epochs\n", " keras.callbacks.EarlyStopping(patience=2, monitor='val_loss'),\n", " # Write TensorBoard logs to `./logs` directory\n", " keras.callbacks.TensorBoard(log_dir='./logs')\n", "]\n", "\n", "# Training\n", "model = keras.Sequential()\n", "model.add(keras.layers.Dense(units=64, activation='relu'))\n", "model.add(keras.layers.Dense(units=64, activation='relu'))\n", "model.add(keras.layers.Dense(units=10, activation='softmax'))\n", "model.compile(optimizer=tf.train.GradientDescentOptimizer(.01),\n", " loss=keras.losses.sparse_categorical_crossentropy)\n", "\n", "# Callbacks are passed to `fit`, not to `compile`\n", "model.fit(tr_dataset, epochs = 5, steps_per_epoch = 1000 // 32,\n", " validation_data = val_dataset, validation_steps = 1,\n", " callbacks = callbacks)\n", "\n", "del model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Save and restore\n", "https://www.tensorflow.org/guide/keras?hl=ko#save_and_restore" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Weights only\n", "Save the weights of a model with `tf.keras.Model.save_weights` and restore them with `tf.keras.Model.load_weights`:" ] }, { "cell_type": "code", "execution_count": 26, 
"metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[]\n", "[]\n" ] } ], "source": [ "# Clear\n", "keras.backend.clear_session()\n", "tf.reset_default_graph()\n", "\n", "print(tf.get_default_graph().get_operations())\n", "print(tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES))" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "# Subclassing tf.keras.Model\n", "class MLP(keras.Model):\n", " def __init__(self, hidden_dim, num_classes):\n", " super(MLP, self).__init__()\n", " # Define your layers here.\n", " self.hidden_layer = keras.layers.Dense(units = hidden_dim, activation='relu')\n", " self.output_layer = keras.layers.Dense(units = num_classes, activation='softmax')\n", " \n", " def call(self, inputs):\n", " hidden = self.hidden_layer(inputs)\n", " score = self.output_layer(hidden)\n", " return score\n", " \n", "# Instantiate the MLP class\n", "mlp = MLP(hidden_dim=100, num_classes=10)\n", "\n", "# The compile step specifies the training configuration.\n", "mlp.compile(optimizer=tf.train.GradientDescentOptimizer(.001),\n", " loss=keras.losses.sparse_categorical_crossentropy)" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(tf.float32, tf.int32)\n", "(tf.float32, tf.int32)\n" ] } ], "source": [ "# tf.data.Dataset instance\n", "tr_data = np.random.random((1000, 32)).astype(np.float32)\n", "tr_label = np.random.randint(low=0, high=10, size = 1000).astype(np.int32)\n", "tr_dataset = tf.data.Dataset.from_tensor_slices((tr_data, tr_label))\n", "tr_dataset = tr_dataset.batch(batch_size=100)\n", "tr_dataset = tr_dataset.repeat()\n", "\n", "val_data = np.random.random((100, 32)).astype(np.float32)\n", "val_label = np.random.randint(low=0, high=10, size = 100).astype(np.int32)\n", "val_dataset = tf.data.Dataset.from_tensor_slices((val_data, val_label))\n", "val_dataset = 
val_dataset.batch(batch_size=100).repeat()\n", "\n", "tst_data = np.ones((100,32), dtype=np.float32)\n", "tst_label = np.ones((100,), dtype=np.int32)\n", "tst_dataset = tf.data.Dataset.from_tensor_slices((tst_data, tst_label))\n", "tst_dataset = tst_dataset.batch(batch_size=100)\n", "\n", "print(tr_dataset.output_types)\n", "print(tst_dataset.output_types)" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Train on 1000 samples, validate on 100 samples\n", "Epoch 1/5\n", "1000/1000 [==============================] - 0s 112us/step - loss: 2.3423 - val_loss: 2.3729\n", "Epoch 2/5\n", "1000/1000 [==============================] - 0s 15us/step - loss: 2.3414 - val_loss: 2.3719\n", "Epoch 3/5\n", "1000/1000 [==============================] - 0s 15us/step - loss: 2.3404 - val_loss: 2.3710\n", "Epoch 4/5\n", "1000/1000 [==============================] - 0s 18us/step - loss: 2.3395 - val_loss: 2.3701\n", "Epoch 5/5\n", "1000/1000 [==============================] - 0s 14us/step - loss: 2.3387 - val_loss: 2.3692\n", "100/100 [==============================] - 0s 34us/step\n", "1.9046856212615966\n" ] } ], "source": [ "# Trains for 5 epochs\n", "mlp.fit(x=tr_data, y=tr_label, epochs=5, batch_size=100,\n", " validation_data=(val_data, val_label))\n", "# mlp.fit(tr_dataset, epochs=5, steps_per_epoch=1000//100,\n", "# validation_data=val_dataset, validation_steps=1)\n", "mlp.save_weights('../graphs/lecture05/keras/mlp')\n", "y_before = np.argmax(mlp.predict(x=tst_data), axis = -1)\n", "print(mlp.evaluate(x=tst_data, y=tst_label))\n", "# with keras.backend.get_session() as sess:\n", "# before = sess.run(mlp.variables)\n", "del mlp" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[]\n", "[]\n" ] } ], "source": [ "# Clear\n", "keras.backend.clear_session()\n", "tf.reset_default_graph()\n", "\n", 
"print(tf.get_default_graph().get_operations())\n", "print(tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES))" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [], "source": [ "# Restore\n", "## Instantiate the MLP class\n", "tst_model = MLP(hidden_dim=100, num_classes=10)\n", "tst_model.compile(optimizer=tf.train.GradientDescentOptimizer(.001),\n", " loss=keras.losses.sparse_categorical_crossentropy)" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [], "source": [ "# tst_model.build(input_shape=tf.TensorShape(([None,32])))" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "<tensorflow.python.training.checkpointable.util.CheckpointLoadStatus at 0x7ff19c15fc18>" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tst_model.load_weights('../graphs/lecture05/keras/mlp')" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [], "source": [ "tst_data = np.ones((100,32), dtype=np.float32)\n", "tst_label = np.ones((100,), dtype=np.int32)\n", "tst_dataset = tf.data.Dataset.from_tensor_slices((tst_data, tst_label))\n", "tst_dataset = tst_dataset.batch(batch_size=100)" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1/1 [==============================] - 0s 10ms/step\n", "1.904685616493225\n" ] } ], "source": [ "y_after = np.argmax(tst_model.predict(tst_dataset, steps = 1), axis = -1)\n", "print(tst_model.evaluate(tst_dataset, steps = 1))" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1.0" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# equal\n", "np.mean(y_before == y_after)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Configuration only\n", "Reading 
https://www.tensorflow.org/guide/keras?hl=ko#configuration_only" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Entire model\n", "Reading https://www.tensorflow.org/guide/keras?hl=ko#entire_model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Eager execution\n", "Reading https://www.tensorflow.org/guide/keras?hl=ko#eager_execution" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Distribution\n", "Reading https://www.tensorflow.org/guide/keras?hl=ko#distribution" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.8" } }, "nbformat": 4, "nbformat_minor": 2 }