{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "accelerator": "GPU", "colab": { "name": "TFServing_dummy_model.ipynb", "provenance": [], "collapsed_sections": [], "toc_visible": true, "include_colab_link": true }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" } }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "view-in-github", "colab_type": "text" }, "source": [ "\"Open" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "sXdXzdwpo45D" }, "source": [ "# Getting Started with TensorFlow Serving\n", "\n", "In this notebook you will serve your first TensorFlow model with TensorFlow Serving. We will start by building a very simple model to infer a simple number relationship:\n", "\n", "$$\n", "y = 2x - 1\n", "$$\n", "\n", "between a few pairs of numbers. After training our model, we will serve it with TensorFlow Serving, and then we will make inference requests." ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "Vo0JfluI1Vzw" }, "source": [ "Note: This notebook is designed to be run in Google Colab if you want to run it locally or on a Jupyter notebook you would need to make minor changes and remove the Colab specific code" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "V5MGykVsYQXq" }, "source": [ "## Setup" ] }, { "cell_type": "markdown", "metadata": { "id": "XaTZKCai1E0O", "colab_type": "text" }, "source": [ "We will start by importing `tensorflow`" ] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "xcV03BQ0o45G", "colab": {} }, "source": [ "try:\n", " %tensorflow_version 2.x\n", "except:\n", " pass" ], "execution_count": 0, "outputs": [] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "dzLKpmZICaWN", "outputId": "0467319e-cbac-4481-d644-432d9923c6b4", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "source": [ "import os\n", "import json\n", "import tempfile\n", "import requests\n", "import numpy as np\n", "\n", "import tensorflow as tf\n", "\n", "print(\"\\u2022 Using TensorFlow Version:\", tf.__version__)" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "• Using TensorFlow Version: 2.2.0\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "jd-XtLMispqY" }, "source": [ "## Add TensorFlow Serving Distribution URI as a Package Source\n", "\n", "We will install TensorFlow Serving using [Aptitude](https://wiki.debian.org/Aptitude) (the default Debian package manager) since Google's Colab runs in a Debian environment. \n", "\n", "Before we can install TensorFlow Serving, we need to add the `tensorflow-model-server` package to the list of packages that Aptitude knows about. Note that we're running as root.\n", "\n", "**Note**: This notebook is running TensorFlow Serving natively, but [you can also run it in a Docker container](https://www.tensorflow.org/tfx/serving/docker), which is one of the easiest ways to get started using TensorFlow Serving. The Docker Engine is available for a variety of Linux platforms, Windows, and Mac." 
] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "BbU7MZtcZboG", "outputId": "f5dcab92-716d-495e-9a52-788ca43699d3", "colab": { "base_uri": "https://localhost:8080/", "height": 547 } }, "source": [ "# This is the same as you would do from your command line, but without the [arch=amd64], and no sudo\n", "# You would instead do:\n", "# echo \"deb [arch=amd64] http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal\" | sudo tee /etc/apt/sources.list.d/tensorflow-serving.list && \\\n", "# curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | sudo apt-key add -\n", "\n", "!echo \"deb http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal\" | tee /etc/apt/sources.list.d/tensorflow-serving.list && \\\n", "curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | apt-key add -\n", "!apt update" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "deb http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal\n", " % Total % Received % Xferd Average Speed Time Time Time Current\n", " Dload Upload Total Spent Left Speed\n", "\r 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\r100 2943 100 2943 0 0 5181 0 --:--:-- --:--:-- --:--:-- 5181\n", "OK\n", "Hit:1 http://ppa.launchpad.net/graphics-drivers/ppa/ubuntu bionic InRelease\n", "Hit:2 http://archive.ubuntu.com/ubuntu bionic InRelease\n", "Get:3 http://archive.ubuntu.com/ubuntu bionic-updates InRelease [88.7 kB]\n", "Get:4 http://ppa.launchpad.net/marutter/c2d4u3.5/ubuntu bionic InRelease [15.4 kB]\n", "Get:5 https://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/ InRelease [3,626 B]\n", "Get:6 http://archive.ubuntu.com/ubuntu bionic-backports InRelease [74.6 kB]\n", "Get:7 http://storage.googleapis.com/tensorflow-serving-apt stable InRelease [3,012 B]\n", "Get:8 http://security.ubuntu.com/ubuntu bionic-security InRelease [88.7 kB]\n", "Ign:9 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 InRelease\n", "Ign:10 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 InRelease\n", "Hit:11 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 Release\n", "Hit:12 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 Release\n", "Get:13 http://ppa.launchpad.net/marutter/c2d4u3.5/ubuntu bionic/main Sources [1,816 kB]\n", "Get:14 http://ppa.launchpad.net/marutter/c2d4u3.5/ubuntu bionic/main amd64 Packages [876 kB]\n", "Get:15 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 Packages [1,376 kB]\n", "Get:16 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages [1,207 kB]\n", "Get:17 http://storage.googleapis.com/tensorflow-serving-apt stable/tensorflow-model-server amd64 Packages [354 B]\n", "Get:19 http://storage.googleapis.com/tensorflow-serving-apt stable/tensorflow-model-server-universal amd64 Packages [361 B]\n", "Get:21 http://security.ubuntu.com/ubuntu bionic-security/main amd64 Packages [912 kB]\n", "Get:22 http://security.ubuntu.com/ubuntu bionic-security/universe amd64 Packages [846 kB]\n", "Fetched 7,307 kB in 2s (2,975 kB/s)\n", "Reading package lists... Done\n", "Building dependency tree \n", "Reading state information... Done\n", "39 packages can be upgraded. 
, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "ZT6BgcLFtN8E" }, "source": [ "## Install TensorFlow Serving\n", "\n", "Now that the package lists have been updated, we can use the `apt-get` command to install the TensorFlow model server." ] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "YoHTRDi1Zf_Z", "outputId": "81687ccc-5a0d-4237-86b7-d8a208a024d0", "colab": { "base_uri": "https://localhost:8080/", "height": 292 } }, "source": [ "!apt-get install tensorflow-model-server" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "Reading package lists... Done\n", "Building dependency tree\n", "Reading state information... Done\n", "The following NEW packages will be installed:\n", " tensorflow-model-server\n", "0 upgraded, 1 newly installed, 0 to remove and 39 not upgraded.\n", "Need to get 175 MB of archives.\n", "After this operation, 0 B of additional disk space will be used.\n", "Get:1 http://storage.googleapis.com/tensorflow-serving-apt stable/tensorflow-model-server amd64 tensorflow-model-server all 2.1.0 [175 MB]\n", "Fetched 175 MB in 3s (55.7 MB/s)\n", "Selecting previously unselected package tensorflow-model-server.\n", "(Reading database ... 144433 files and directories currently installed.)\n", "Preparing to unpack .../tensorflow-model-server_2.1.0_all.deb ...\n", "Unpacking tensorflow-model-server (2.1.0) ...\n", "Setting up tensorflow-model-server (2.1.0) ...\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "k5u6UVdJ2K0W" }, "source": [ "## Create Dataset\n", "\n", "Now, we will create a simple dataset that expresses the relationship:\n", "\n", "$$\n", "y = 2x - 1\n", "$$\n", "\n", "between inputs (`xs`) and outputs (`ys`)."
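, "\n\nAs a quick sanity check (a small sketch duplicating the arrays the next cell defines), every output is indeed twice its input minus one:\n\n```python\nimport numpy as np\n\nxs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)\nys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=float)\n\n# Verify that y = 2x - 1 holds for every pair.\nassert np.allclose(ys, 2 * xs - 1)\n```"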
] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "3qqsNxy83Imw", "colab": {} }, "source": [ "xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)\n", "ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=float)" ], "execution_count": 0, "outputs": [] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "c6frUt2r3NYJ" }, "source": [ "## Build and Train the Model\n", "\n", "We'll use the simplest possible model for this example. Since we are going to train our model for `500` epochs, in order to avoid clutter on the screen, we will use the argument `verbose=0` in the `fit` method. The Verbosity mode can be:\n", "\n", "* `0` : silent.\n", "\n", "* `1` : progress bar.\n", "\n", "* `2` : one line per epoch.\n", "\n", "As a side note, we should mention that since the progress bar is not particularly useful when logged to a file, `verbose=2` is recommended when not running interactively (eg, in a production environment)." ] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "9952f7iAaT9F", "outputId": "1d9e11dc-d510-486b-c8c0-3dbcef6280ed", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "source": [ "model = tf.keras.Sequential([tf.keras.layers.Dense(units=1, input_shape=[1])])\n", "\n", "model.compile(optimizer='sgd',\n", " loss='mean_squared_error')\n", "\n", "history = model.fit(xs, ys, epochs=500, verbose=0)\n", "\n", "print(\"Finished training the model\")" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "Finished training the model\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "ONjN9e34vPwC" }, "source": [ "## Test the Model\n", "\n", "Now that the model is trained, we can test it. If we give it the value `10`, we should get a value very close to `19`." ] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "cnRch66BvNFF", "outputId": "80c59d83-ea85-484d-b692-304b64d7b2d6", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "source": [ "print(model.predict([[5.0]]))" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "[[8.994236]]\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "IQ67rKM9367n" }, "source": [ "## Save the Model\n", "\n", "To load the trained model into TensorFlow Serving we first need to save it in the [SavedModel](https://www.tensorflow.org/guide/saved_model) format. This will create a protobuf file in a well-defined directory hierarchy, and will include a version number. [TensorFlow Serving](https://www.tensorflow.org/tfx/serving/serving_config) allows us to select which version of a model, or \"servable\" we want to use when we make inference requests. Each version will be exported to a different sub-directory under the given path." 
] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "D9c2eOEHpjGS", "outputId": "6837102d-99f3-4071-9f1a-441e8c3b9aae", "colab": { "base_uri": "https://localhost:8080/", "height": 207 } }, "source": [ "MODEL_DIR = tempfile.gettempdir()\n", "\n", "version = 1\n", "\n", "export_path = os.path.join(MODEL_DIR, str(version))\n", "\n", "if os.path.isdir(export_path):\n", " print('\\nAlready saved a model, cleaning up\\n')\n", " !rm -r {export_path}\n", "\n", "model.save(export_path, save_format=\"tf\")\n", "\n", "print('\\nexport_path = {}'.format(export_path))\n", "!ls -l {export_path}" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/resource_variable_ops.py:1817: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.\n", "Instructions for updating:\n", "If using Keras pass *_constraint arguments to layers.\n", "INFO:tensorflow:Assets written to: /tmp/1/assets\n", "\n", "export_path = /tmp/1\n", "total 48\n", "drwxr-xr-x 2 root root 4096 May 16 05:12 assets\n", "-rw-r--r-- 1 root root 39128 May 16 05:12 saved_model.pb\n", "drwxr-xr-x 2 root root 4096 May 16 05:12 variables\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "E-ARQOR87Mt2" }, "source": [ "## Examine Your Saved Model\n", "\n", "We'll use the command line utility `saved_model_cli` to look at the `MetaGraphDefs` and `SignatureDefs` in our SavedModel. The signature definition is defined by the input and output tensors, and stored with the default serving key." ] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "7M0VJORSN2w9", "outputId": "fc3c2e40-7270-4ec0-9edd-780002f9e49e", "colab": { "base_uri": "https://localhost:8080/", "height": 1000 } }, "source": [ "!saved_model_cli show --dir {export_path} --all" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "\n", "MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:\n", "\n", "signature_def['__saved_model_init_op']:\n", " The given SavedModel SignatureDef contains the following input(s):\n", " The given SavedModel SignatureDef contains the following output(s):\n", " outputs['__saved_model_init_op'] tensor_info:\n", " dtype: DT_INVALID\n", " shape: unknown_rank\n", " name: NoOp\n", " Method name is: \n", "\n", "signature_def['serving_default']:\n", " The given SavedModel SignatureDef contains the following input(s):\n", " inputs['dense_input'] tensor_info:\n", " dtype: DT_FLOAT\n", " shape: (-1, 1)\n", " name: serving_default_dense_input:0\n", " The given SavedModel SignatureDef contains the following output(s):\n", " outputs['dense'] tensor_info:\n", " dtype: DT_FLOAT\n", " shape: (-1, 1)\n", " name: StatefulPartitionedCall:0\n", " Method name is: tensorflow/serving/predict\n", "WARNING: Logging before flag parsing goes to stderr.\n", "W0516 05:12:34.321393 140293225809792 deprecation.py:506] From /usr/local/lib/python2.7/dist-packages/tensorflow_core/python/ops/resource_variable_ops.py:1786: calling __init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.\n", "Instructions for updating:\n", "If using Keras pass *_constraint arguments to layers.\n", "\n", "Defined Functions:\n", " Function Name: '__call__'\n", " Option #1\n", " Callable with:\n", " Argument #1\n", " 
" dense_input: TensorSpec(shape=(None, 1), dtype=tf.float32, name=u'dense_input')\n", " Argument #2\n", " DType: bool\n", " Value: True\n", " Argument #3\n", " DType: NoneType\n", " Value: None\n", " Option #2\n", " Callable with:\n", " Argument #1\n", " inputs: TensorSpec(shape=(None, 1), dtype=tf.float32, name=u'inputs')\n", " Argument #2\n", " DType: bool\n", " Value: True\n", " Argument #3\n", " DType: NoneType\n", " Value: None\n", " Option #3\n", " Callable with:\n", " Argument #1\n", " inputs: TensorSpec(shape=(None, 1), dtype=tf.float32, name=u'inputs')\n", " Argument #2\n", " DType: bool\n", " Value: False\n", " Argument #3\n", " DType: NoneType\n", " Value: None\n", " Option #4\n", " Callable with:\n", " Argument #1\n", " dense_input: TensorSpec(shape=(None, 1), dtype=tf.float32, name=u'dense_input')\n", " Argument #2\n", " DType: bool\n", " Value: False\n", " Argument #3\n", " DType: NoneType\n", " Value: None\n", "\n", " Function Name: '_default_save_signature'\n", " Option #1\n", " Callable with:\n", " Argument #1\n", " dense_input: TensorSpec(shape=(None, 1), dtype=tf.float32, name=u'dense_input')\n", "\n", " Function Name: 'call_and_return_all_conditional_losses'\n", " Option #1\n", " Callable with:\n", " Argument #1\n", " inputs: TensorSpec(shape=(None, 1), dtype=tf.float32, name=u'inputs')\n", " Argument #2\n", " DType: bool\n", " Value: False\n", " Argument #3\n", " DType: NoneType\n", " Value: None\n", " Option #2\n", " Callable with:\n", " Argument #1\n", " inputs: TensorSpec(shape=(None, 1), dtype=tf.float32, name=u'inputs')\n", " Argument #2\n", " DType: bool\n", " Value: True\n", " Argument #3\n", " DType: NoneType\n", " Value: None\n", " Option #3\n", " Callable with:\n", " Argument #1\n", " dense_input: TensorSpec(shape=(None, 1), dtype=tf.float32, name=u'dense_input')\n", " Argument #2\n", " DType: bool\n", " Value: False\n", " Argument #3\n", " DType: NoneType\n", " Value: None\n", " Option #4\n", " Callable with:\n", " Argument #1\n", " dense_input: TensorSpec(shape=(None, 1), dtype=tf.float32, name=u'dense_input')\n", " Argument #2\n", " DType: bool\n", " Value: True\n", " Argument #3\n", " DType: NoneType\n", " Value: None\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "hS3dODJgAB87" }, "source": [ "## Run the TensorFlow Model Server\n", "\n", "We will now launch the TensorFlow model server with a bash script. We will use the `--bg` option of the `%%bash` magic so that the script runs in the background.\n", "\n", "Our script will start running TensorFlow Serving and will load our model. Here are the parameters we will use:\n", "\n", "* `rest_api_port`: The port that you'll use for requests.\n", "\n", "\n", "* `model_name`: You'll use this in the URL of your requests. It can be anything.\n", "\n", "\n", "* `model_base_path`: This is the path to the directory where you've saved your model.\n", "\n", "Also, because the variable that points to the directory containing the model is in Python, we need a way to tell the bash script where to find the model. To do this, we will write the value of the Python variable to an environment variable using the `os.environ` dictionary."
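, "\n\nOnce the server has been launched in the cells below, one optional way to confirm that the servable loaded is to poll TensorFlow Serving's model status endpoint (`GET /v1/models/<model_name>`). A minimal sketch, assuming the port `8501` and model name `test` used below:\n\n```python\nimport time\nimport requests\n\nfor _ in range(10):\n    try:\n        status = requests.get('http://localhost:8501/v1/models/test').json()\n        # Each loaded version reports a state such as AVAILABLE.\n        if status['model_version_status'][0]['state'] == 'AVAILABLE':\n            print('Model is ready to serve.')\n            break\n    except requests.exceptions.ConnectionError:\n        pass  # The server may not be accepting connections yet.\n    time.sleep(1)\n```"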
] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "nhXu-01eOFGE", "colab": {} }, "source": [ "os.environ[\"MODEL_DIR\"] = MODEL_DIR" ], "execution_count": 0, "outputs": [] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "kJDhHNJVnaLN", "outputId": "5e28c222-8e6f-4ba6-9cdb-8894af88191c", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "source": [ "%%bash --bg \n", "nohup tensorflow_model_server \\\n", " --rest_api_port=8501 \\\n", " --model_name=test \\\n", " --model_base_path=\"${MODEL_DIR}\" >server.log 2>&1" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "Starting job # 0 in a separate thread.\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "jWBzpit--6hS" }, "source": [ "Now we can take a look at the server log." ] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "F_PudlFqdtfl", "outputId": "f972f6cc-0a86-4da9-f4ce-0830ab93506e", "colab": { "base_uri": "https://localhost:8080/", "height": 207 } }, "source": [ "!tail server.log" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "2020-05-16 05:12:35.568881: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA\n", "2020-05-16 05:12:35.582063: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:203] Restoring SavedModel bundle.\n", "2020-05-16 05:12:35.592041: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:152] Running initialization op on SavedModel bundle at path: /tmp/1\n", "2020-05-16 05:12:35.594441: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:333] SavedModel load for tags { serve }; Status: success: OK. Took 26368 microseconds.\n", "2020-05-16 05:12:35.594760: I tensorflow_serving/servables/tensorflow/saved_model_warmup.cc:105] No warmup data file found at /tmp/1/assets.extra/tf_serving_warmup_requests\n", "2020-05-16 05:12:35.594866: I tensorflow_serving/core/loader_harness.cc:87] Successfully loaded servable version {name: test version: 1}\n", "2020-05-16 05:12:35.595857: I tensorflow_serving/model_servers/server.cc:358] Running gRPC ModelServer at 0.0.0.0:8500 ...\n", "[warn] getaddrinfo: address family for nodename not supported\n", "2020-05-16 05:12:35.596419: I tensorflow_serving/model_servers/server.cc:378] Exporting HTTP/REST API at:localhost:8501 ...\n", "[evhttp_server.cc : 238] NET_LOG: Entering the event loop ...\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "gD_dr7baFJ-s" }, "source": [ "## Create JSON Object with Test Data\n", "\n", "We are now ready to construct a JSON object with some data so that we can make a couple of inferences. We will use $x=9$ and $x=10$ as our test data." 
] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "FwxEEnOei38-", "outputId": "68037c8f-0ba1-49b3-eb30-ce3df8e075cb", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "source": [ "xs = np.array([[9.0], [10.0]])\n", "data = json.dumps({\"signature_name\": \"serving_default\", \"instances\": xs.tolist()})\n", "print(data)" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "{\"signature_name\": \"serving_default\", \"instances\": [[9.0], [10.0]]}\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "DZ0KtEF7Fjer" }, "source": [ "## Make Inference Request\n", "\n", "Finally, we can make the inference request and get the inferences back. We'll send a predict request as a POST to our server's REST endpoint, and pass it our test data. We'll ask our server to give us the latest version of our model by not specifying a particular version. The response will be a JSON payload containing the predictions." ] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "vGvFyuIzW6n6", "outputId": "f4bfb17d-5451-4645-83a1-53a413145f6d", "colab": { "base_uri": "https://localhost:8080/", "height": 85 } }, "source": [ "headers = {\"content-type\": \"application/json\"}\n", "json_response = requests.post('http://localhost:8501/v1/models/test:predict', data=data, headers=headers)\n", "\n", "print(json_response.text)" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "{\n", " \"predictions\": [[16.9821], [18.9790649]\n", " ]\n", "}\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "5DKLw7PwI928" }, "source": [ "We can also look at the predictions directly by loading the value for the `predictions` key." ] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "F-x87o_DqfOL", "outputId": "1d381f2a-8c46-4a90-e7bd-c5950ba722a9", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "source": [ "predictions = json.loads(json_response.text)['predictions']\n", "print(predictions)" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "[[16.9821], [18.9790649]]\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "id": "qfzkY3DR5uJm", "colab_type": "text" }, "source": [ "You just saw how you could serve a dummy model with TF Model server let's now see a real model in action" ] } ] }