{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "MhoQ0WE77laV"
      },
      "source": [
        "##### Copyright 2020 The TensorFlow Authors."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "cellView": "form",
        "id": "_ckMIh7O7s6D"
      },
      "outputs": [],
      "source": [
        "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
        "# you may not use this file except in compliance with the License.\n",
        "# You may obtain a copy of the License at\n",
        "#\n",
        "# https://www.apache.org/licenses/LICENSE-2.0\n",
        "#\n",
        "# Unless required by applicable law or agreed to in writing, software\n",
        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
        "# See the License for the specific language governing permissions and\n",
        "# limitations under the License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "jYysdyb-CaWM"
      },
      "source": [
        "# Train and serve a TensorFlow model with TensorFlow Serving"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "E6FwTNtl3S4v"
      },
      "source": [
        "**Warning: This notebook is designed to be run in a Google Colab only**.  It installs packages on the system and requires root access.  If you want to run it in a local Jupyter notebook, please proceed with caution.\n",
        "\n",
        "Note: You can run this example right now in a Jupyter-style notebook, no setup required!  Just click \"Run in Google Colab\"\n",
        "\n",
        "\u003cdiv class=\"devsite-table-wrapper\"\u003e\u003ctable class=\"tfo-notebook-buttons\" align=\"left\"\u003e\n",
        "\u003ctr\u003e\u003ctd\u003e\u003ca target=\"_blank\" href=\"https://www.tensorflow.org/tfx/tutorials/serving/rest_simple\"\u003e\n",
        "\u003cimg src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" /\u003eView on TensorFlow.org\u003c/a\u003e\u003c/td\u003e\n",
        "\u003ctd\u003e\u003ca target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/tfx/blob/master/docs/tutorials/serving/rest_simple.ipynb\"\u003e\n",
        "\u003cimg src=\"https://www.tensorflow.org/images/colab_logo_32px.png\"\u003eRun in Google Colab\u003c/a\u003e\u003c/td\u003e\n",
        "\u003ctd\u003e\u003ca target=\"_blank\" href=\"https://github.com/tensorflow/tfx/blob/master/docs/tutorials/serving/rest_simple.ipynb\"\u003e\n",
        "\u003cimg width=32px src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\"\u003eView source on GitHub\u003c/a\u003e\u003c/td\u003e\n",
        "\u003ctd\u003e\u003ca href=\"https://storage.googleapis.com/tensorflow_docs/tfx/docs/tutorials/serving/rest_simple.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/download_logo_32px.png\" /\u003eDownload notebook\u003c/a\u003e\u003c/td\u003e\n",
        "\u003c/tr\u003e\u003c/table\u003e\u003c/div\u003e"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "FbVhjPpzn6BM"
      },
      "source": [
        "This guide trains a neural network model to classify [images of clothing, like sneakers and shirts](https://github.com/zalandoresearch/fashion-mnist), saves the trained model, and then serves it with [TensorFlow Serving](https://www.tensorflow.org/tfx/guide/serving).  The focus is on TensorFlow Serving, rather than the modeling and training in TensorFlow, so for a complete example which focuses on the modeling and training see the [Basic Classification example](https://github.com/tensorflow/docs/blob/master/site/en/r1/tutorials/keras/basic_classification.ipynb).\n",
        "\n",
        "This guide uses [tf.keras](https://github.com/tensorflow/docs/blob/master/site/en/r1/guide/keras.ipynb), a high-level API to build and train models in TensorFlow."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "FWkuJabJSKGB"
      },
      "outputs": [],
      "source": [
        "import sys\n",
        "\n",
        "# Confirm that we're using Python 3\n",
        "assert sys.version_info.major == 3, 'Oops, not running Python 3. Use Runtime \u003e Change runtime type'"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "dzLKpmZICaWN"
      },
      "outputs": [],
      "source": [
        "# TensorFlow and tf.keras\n",
        "print(\"Installing dependencies for Colab environment\")\n",
        "!pip install -Uq grpcio==1.26.0\n",
        "\n",
        "import tensorflow as tf\n",
        "from tensorflow import keras\n",
        "\n",
        "# Helper libraries\n",
        "import numpy as np\n",
        "import matplotlib.pyplot as plt\n",
        "import os\n",
        "import subprocess\n",
        "\n",
        "print('TensorFlow version: {}'.format(tf.__version__))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "5jAk1ZXqTJqN"
      },
      "source": [
        "## Create your model"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "yR0EdgrLCaWR"
      },
      "source": [
        "### Import the Fashion MNIST dataset\n",
        "\n",
        "This guide uses the [Fashion MNIST](https://github.com/zalandoresearch/fashion-mnist) dataset which contains 70,000 grayscale images in 10 categories. The images show individual articles of clothing at low resolution (28 by 28 pixels), as seen here:\n",
        "\n",
        "\u003ctable\u003e\n",
        "  \u003ctr\u003e\u003ctd\u003e\n",
        "    \u003cimg src=\"https://tensorflow.org/images/fashion-mnist-sprite.png\"\n",
        "         alt=\"Fashion MNIST sprite\"  width=\"600\"\u003e\n",
        "  \u003c/td\u003e\u003c/tr\u003e\n",
        "  \u003ctr\u003e\u003ctd align=\"center\"\u003e\n",
        "    \u003cb\u003eFigure 1.\u003c/b\u003e \u003ca href=\"https://github.com/zalandoresearch/fashion-mnist\"\u003eFashion-MNIST samples\u003c/a\u003e (by Zalando, MIT License).\u003cbr/\u003e\u0026nbsp;\n",
        "  \u003c/td\u003e\u003c/tr\u003e\n",
        "\u003c/table\u003e\n",
        "\n",
        "Fashion MNIST is intended as a drop-in replacement for the classic [MNIST](http://yann.lecun.com/exdb/mnist/) dataset—often used as the \"Hello, World\" of machine learning programs for computer vision. You can access the Fashion MNIST directly from TensorFlow, just import and load the data.\n",
        "\n",
        "Note: Although these are really images, they are loaded as NumPy arrays and not binary image objects."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "7MqDQO0KCaWS"
      },
      "outputs": [],
      "source": [
        "fashion_mnist = keras.datasets.fashion_mnist\n",
        "(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()\n",
        "\n",
        "# scale the values to 0.0 to 1.0\n",
        "train_images = train_images / 255.0\n",
        "test_images = test_images / 255.0\n",
        "\n",
        "# reshape for feeding into the model\n",
        "train_images = train_images.reshape(train_images.shape[0], 28, 28, 1)\n",
        "test_images = test_images.reshape(test_images.shape[0], 28, 28, 1)\n",
        "\n",
        "class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',\n",
        "               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']\n",
        "\n",
        "print('\\ntrain_images.shape: {}, of {}'.format(train_images.shape, train_images.dtype))\n",
        "print('test_images.shape: {}, of {}'.format(test_images.shape, test_images.dtype))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "PDu7OX8Nf5PY"
      },
      "source": [
        "### Train and evaluate your model\n",
        "\n",
        "Let's use the simplest possible CNN, since we're not focused on the modeling part."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "LTNN0ANGgA36"
      },
      "outputs": [],
      "source": [
        "model = keras.Sequential([\n",
        "  keras.layers.Conv2D(input_shape=(28,28,1), filters=8, kernel_size=3, \n",
        "                      strides=2, activation='relu', name='Conv1'),\n",
        "  keras.layers.Flatten(),\n",
        "  keras.layers.Dense(10, name='Dense')\n",
        "])\n",
        "model.summary()\n",
        "\n",
        "testing = False\n",
        "epochs = 5\n",
        "\n",
        "model.compile(optimizer='adam', \n",
        "              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),\n",
        "              metrics=[keras.metrics.SparseCategoricalAccuracy()])\n",
        "model.fit(train_images, train_labels, epochs=epochs)\n",
        "\n",
        "test_loss, test_acc = model.evaluate(test_images, test_labels)\n",
        "print('\\nTest accuracy: {}'.format(test_acc))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "AwGPItyphqXT"
      },
      "source": [
        "## Save your model\n",
        "\n",
        "To load our trained model into TensorFlow Serving we first need to save it in [SavedModel](https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/saved_model) format.  This will create a protobuf file in a well-defined directory hierarchy, and will include a version number.  [TensorFlow Serving](https://www.tensorflow.org/tfx/guide/serving) allows us to select which version of a model, or \"servable\" we want to use when we make inference requests.  Each version will be exported to a different sub-directory under the given path."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "0w5Rq8SsgWE6"
      },
      "outputs": [],
      "source": [
        "# Fetch the Keras session and save the model\n",
        "# The signature definition is defined by the input and output tensors,\n",
        "# and stored with the default serving key\n",
        "import tempfile\n",
        "\n",
        "MODEL_DIR = tempfile.gettempdir()\n",
        "version = 1\n",
        "export_path = os.path.join(MODEL_DIR, str(version))\n",
        "print('export_path = {}\\n'.format(export_path))\n",
        "\n",
        "tf.keras.models.save_model(\n",
        "    model,\n",
        "    export_path,\n",
        "    overwrite=True,\n",
        "    include_optimizer=True,\n",
        "    save_format=None,\n",
        "    signatures=None,\n",
        "    options=None\n",
        ")\n",
        "\n",
        "print('\\nSaved model:')\n",
        "!ls -l {export_path}"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "FM7B_RuDYoIj"
      },
      "source": [
        "## Examine your saved model\n",
        "\n",
        "We'll use the command line utility `saved_model_cli` to look at the [MetaGraphDefs](https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/MetaGraphDef) (the models) and [SignatureDefs](../signature_defs) (the methods you can call) in our SavedModel.  See [this discussion of the SavedModel CLI](https://github.com/tensorflow/docs/blob/master/site/en/r1/guide/saved_model.md#cli-to-inspect-and-execute-savedmodel) in the TensorFlow Guide."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "LU4GDF_aYtfQ"
      },
      "outputs": [],
      "source": [
        "!saved_model_cli show --dir {export_path} --all"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "lSPWuegUb7Eo"
      },
      "source": [
        "That tells us a lot about our model!  In this case we just trained our model, so we already know the inputs and outputs, but if we didn't this would be important information.  It doesn't tell us everything, like the fact that this is grayscale image data for example, but it's a great start."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "DBgsyhytS6KD"
      },
      "source": [
        "## Serve your model with TensorFlow Serving\n",
        "\n",
        "**Warning: If you are running this NOT on a Google Colab,** following cells\n",
        "will install packages on the system with root access. If you want to run it in\n",
        "a local Jupyter notebook, please proceed with caution.\n",
        "\n",
        "### Add TensorFlow Serving distribution URI as a package source:\n",
        "\n",
        "We're preparing to install TensorFlow Serving using [Aptitude](https://wiki.debian.org/Aptitude) since this Colab runs in a Debian environment.  We'll add the `tensorflow-model-server` package to the list of packages that Aptitude knows about.  Note that we're running as root.\n",
        "\n",
        "Note: This example is running TensorFlow Serving natively, but [you can also run it in a Docker container](https://www.tensorflow.org/tfx/serving/docker), which is one of the easiest ways to get started using TensorFlow Serving."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "v2hF_ChoOrEd"
      },
      "outputs": [],
      "source": [
        "import sys\n",
        "# We need sudo prefix if not on a Google Colab.\n",
        "if 'google.colab' not in sys.modules:\n",
        "  SUDO_IF_NEEDED = 'sudo'\n",
        "else:\n",
        "  SUDO_IF_NEEDED = ''"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "EWg9X2QHlbGS"
      },
      "outputs": [],
      "source": [
        "# This is the same as you would do from your command line, but without the [arch=amd64], and no sudo\n",
        "# You would instead do:\n",
        "# echo \"deb [arch=amd64] http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal\" | sudo tee /etc/apt/sources.list.d/tensorflow-serving.list \u0026\u0026 \\\n",
        "# curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | sudo apt-key add -\n",
        "\n",
        "!echo \"deb http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal\" | {SUDO_IF_NEEDED} tee /etc/apt/sources.list.d/tensorflow-serving.list \u0026\u0026 \\\n",
        "curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | {SUDO_IF_NEEDED} apt-key add -\n",
        "!{SUDO_IF_NEEDED} apt update"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "W1ZVp_VOU7Wu"
      },
      "source": [
        "### Install TensorFlow Serving\n",
        "\n",
        "This is all you need - one command line!"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "ygwa9AgRloYy"
      },
      "outputs": [],
      "source": [
        "!{SUDO_IF_NEEDED} apt-get install tensorflow-model-server"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "k5NrYdQeVm52"
      },
      "source": [
        "### Start running TensorFlow Serving\n",
        "\n",
        "This is where we start running TensorFlow Serving and load our model.  After it loads we can start making inference requests using REST.  There are some important parameters:\n",
        "\n",
        "* `rest_api_port`: The port that you'll use for REST requests.\n",
        "* `model_name`: You'll use this in the URL of REST requests.  It can be anything.\n",
        "* `model_base_path`: This is the path to the directory where you've saved your model.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "aUgp3vUdU5GS"
      },
      "outputs": [],
      "source": [
        "os.environ[\"MODEL_DIR\"] = MODEL_DIR"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "kJDhHNJVnaLN"
      },
      "outputs": [],
      "source": [
        "%%bash --bg \n",
        "nohup tensorflow_model_server \\\n",
        "  --rest_api_port=8501 \\\n",
        "  --model_name=fashion_model \\\n",
        "  --model_base_path=\"${MODEL_DIR}\" \u003eserver.log 2\u003e\u00261\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "IxbeiOCUUs2z"
      },
      "outputs": [],
      "source": [
        "!tail server.log"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "vwg1JKaGXWAg"
      },
      "source": [
        "## Make a request to your model in TensorFlow Serving\n",
        "\n",
        "First, let's take a look at a random example from our test data."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Luqm_Jyff9iR"
      },
      "outputs": [],
      "source": [
        "def show(idx, title):\n",
        "  plt.figure()\n",
        "  plt.imshow(test_images[idx].reshape(28,28))\n",
        "  plt.axis('off')\n",
        "  plt.title('\\n\\n{}'.format(title), fontdict={'size': 16})\n",
        "\n",
        "import random\n",
        "rando = random.randint(0,len(test_images)-1)\n",
        "show(rando, 'An Example Image: {}'.format(class_names[test_labels[rando]]))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "TKnEHeTrbh3L"
      },
      "source": [
        "Ok, that looks interesting.  How hard is that for you to recognize? Now let's create the JSON object for a batch of  three inference requests, and see how well our model recognizes things:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "2dsD7KQG1m-R"
      },
      "outputs": [],
      "source": [
        "import json\n",
        "data = json.dumps({\"signature_name\": \"serving_default\", \"instances\": test_images[0:3].tolist()})\n",
        "print('Data: {} ... {}'.format(data[:50], data[len(data)-52:]))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ReQd4QESIwXN"
      },
      "source": [
        "### Make REST requests"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "iT3J-lHrhOYQ"
      },
      "source": [
        "#### Newest version of the servable\n",
        "\n",
        "We'll send a predict request as a POST to our server's REST endpoint, and pass it three examples.  We'll ask our server to give us the latest version of our servable by not specifying a particular version."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "vGvFyuIzW6n6"
      },
      "outputs": [],
      "source": [
        "# docs_infra: no_execute\n",
        "!pip install -q requests\n",
        "\n",
        "import requests\n",
        "headers = {\"content-type\": \"application/json\"}\n",
        "json_response = requests.post('http://localhost:8501/v1/models/fashion_model:predict', data=data, headers=headers)\n",
        "predictions = json.loads(json_response.text)['predictions']\n",
        "\n",
        "show(0, 'The model thought this was a {} (class {}), and it was actually a {} (class {})'.format(\n",
        "  class_names[np.argmax(predictions[0])], np.argmax(predictions[0]), class_names[test_labels[0]], test_labels[0]))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "YJH8LtM4XELp"
      },
      "source": [
        "#### A particular version of the servable\n",
        "\n",
        "Now let's specify a particular version of our servable.  Since we only have one, let's select version 1.  We'll also look at all three results."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "zRftRxeR1tZx"
      },
      "outputs": [],
      "source": [
        "# docs_infra: no_execute\n",
        "headers = {\"content-type\": \"application/json\"}\n",
        "json_response = requests.post('http://localhost:8501/v1/models/fashion_model/versions/1:predict', data=data, headers=headers)\n",
        "predictions = json.loads(json_response.text)['predictions']\n",
        "\n",
        "for i in range(0,3):\n",
        "  show(i, 'The model thought this was a {} (class {}), and it was actually a {} (class {})'.format(\n",
        "    class_names[np.argmax(predictions[i])], np.argmax(predictions[i]), class_names[test_labels[i]], test_labels[i]))"
      ]
    }
  ],
  "metadata": {
    "accelerator": "GPU",
    "colab": {
      "collapsed_sections": [],
      "name": "rest_simple.ipynb",
      "provenance": [],
      "toc_visible": true
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}