{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "accelerator": "GPU", "colab": { "name": "TFServing_dummy_model.ipynb", "provenance": [], "collapsed_sections": [], "toc_visible": true, "include_colab_link": true }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" } }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "view-in-github", "colab_type": "text" }, "source": [ "\"Open" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "sXdXzdwpo45D" }, "source": [ "# Getting Started with TensorFlow Serving\n", "\n", "In this notebook you will serve your first TensorFlow model with TensorFlow Serving. We will start by building a very simple model to infer a simple number relationship:\n", "\n", "$$\n", "y = 2x - 1\n", "$$\n", "\n", "between a few pairs of numbers. After training our model, we will serve it with TensorFlow Serving, and then we will make inference requests." ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "Vo0JfluI1Vzw" }, "source": [ "Note: This notebook is designed to be run in Google Colab if you want to run it locally or on a Jupyter notebook you would need to make minor changes and remove the Colab specific code" ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "V5MGykVsYQXq" }, "source": [ "## Setup" ] }, { "cell_type": "markdown", "metadata": { "id": "XaTZKCai1E0O", "colab_type": "text" }, "source": [ "We will start by importing `tensorflow`" ] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "xcV03BQ0o45G", "colab": {} }, "source": [ "try:\n", " %tensorflow_version 2.x\n", "except:\n", " pass" ], "execution_count": 0, "outputs": [] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "dzLKpmZICaWN", "outputId": "0467319e-cbac-4481-d644-432d9923c6b4", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "source": [ "import os\n", "import json\n", "import tempfile\n", "import requests\n", "import numpy as np\n", "\n", "import tensorflow as tf\n", "\n", "print(\"\\u2022 Using TensorFlow Version:\", tf.__version__)" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "• Using TensorFlow Version: 2.2.0\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "jd-XtLMispqY" }, "source": [ "## Add TensorFlow Serving Distribution URI as a Package Source\n", "\n", "We will install TensorFlow Serving using [Aptitude](https://wiki.debian.org/Aptitude) (the default Debian package manager) since Google's Colab runs in a Debian environment. \n", "\n", "Before we can install TensorFlow Serving, we need to add the `tensorflow-model-server` package to the list of packages that Aptitude knows about. Note that we're running as root.\n", "\n", "**Note**: This notebook is running TensorFlow Serving natively, but [you can also run it in a Docker container](https://www.tensorflow.org/tfx/serving/docker), which is one of the easiest ways to get started using TensorFlow Serving. The Docker Engine is available for a variety of Linux platforms, Windows, and Mac." 
] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "BbU7MZtcZboG", "outputId": "f5dcab92-716d-495e-9a52-788ca43699d3", "colab": { "base_uri": "https://localhost:8080/", "height": 547 } }, "source": [ "# This is the same as you would do from your command line, but without the [arch=amd64], and no sudo\n", "# You would instead do:\n", "# echo \"deb [arch=amd64] http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal\" | sudo tee /etc/apt/sources.list.d/tensorflow-serving.list && \\\n", "# curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | sudo apt-key add -\n", "\n", "!echo \"deb http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal\" | tee /etc/apt/sources.list.d/tensorflow-serving.list && \\\n", "curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | apt-key add -\n", "!apt update" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "deb http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal\n", " % Total % Received % Xferd Average Speed Time Time Time Current\n", " Dload Upload Total Spent Left Speed\n", "\r 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0\r100 2943 100 2943 0 0 5181 0 --:--:-- --:--:-- --:--:-- 5181\n", "OK\n", "Hit:1 http://ppa.launchpad.net/graphics-drivers/ppa/ubuntu bionic InRelease\n", "Hit:2 http://archive.ubuntu.com/ubuntu bionic InRelease\n", "Get:3 http://archive.ubuntu.com/ubuntu bionic-updates InRelease [88.7 kB]\n", "Get:4 http://ppa.launchpad.net/marutter/c2d4u3.5/ubuntu bionic InRelease [15.4 kB]\n", "Get:5 https://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/ InRelease [3,626 B]\n", "Get:6 http://archive.ubuntu.com/ubuntu bionic-backports InRelease [74.6 kB]\n", "Get:7 http://storage.googleapis.com/tensorflow-serving-apt stable InRelease [3,012 B]\n", "Get:8 http://security.ubuntu.com/ubuntu bionic-security InRelease [88.7 kB]\n", "Ign:9 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 InRelease\n", "Ign:10 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 InRelease\n", "Hit:11 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 Release\n", "Hit:12 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64 Release\n", "Get:13 http://ppa.launchpad.net/marutter/c2d4u3.5/ubuntu bionic/main Sources [1,816 kB]\n", "Get:14 http://ppa.launchpad.net/marutter/c2d4u3.5/ubuntu bionic/main amd64 Packages [876 kB]\n", "Get:15 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 Packages [1,376 kB]\n", "Get:16 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages [1,207 kB]\n", "Get:17 http://storage.googleapis.com/tensorflow-serving-apt stable/tensorflow-model-server amd64 Packages [354 B]\n", "Get:19 http://storage.googleapis.com/tensorflow-serving-apt stable/tensorflow-model-server-universal amd64 Packages [361 B]\n", "Get:21 http://security.ubuntu.com/ubuntu bionic-security/main amd64 Packages [912 kB]\n", "Get:22 http://security.ubuntu.com/ubuntu bionic-security/universe amd64 Packages [846 kB]\n", "Fetched 7,307 kB in 2s (2,975 kB/s)\n", "Reading package lists... Done\n", "Building dependency tree \n", "Reading state information... Done\n", "39 packages can be upgraded. 
, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "ZT6BgcLFtN8E" }, "source": [ "## Install TensorFlow Serving\n", "\n", "Now that the package lists have been updated, we can use the `apt-get` command to install the TensorFlow model server." ] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "YoHTRDi1Zf_Z", "outputId": "81687ccc-5a0d-4237-86b7-d8a208a024d0", "colab": { "base_uri": "https://localhost:8080/", "height": 292 } }, "source": [ "!apt-get install tensorflow-model-server" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "Reading package lists... Done\n", "Building dependency tree\n", "Reading state information... Done\n", "The following NEW packages will be installed:\n", " tensorflow-model-server\n", "0 upgraded, 1 newly installed, 0 to remove and 39 not upgraded.\n", "Need to get 175 MB of archives.\n", "After this operation, 0 B of additional disk space will be used.\n", "Get:1 http://storage.googleapis.com/tensorflow-serving-apt stable/tensorflow-model-server amd64 tensorflow-model-server all 2.1.0 [175 MB]\n", "Fetched 175 MB in 3s (55.7 MB/s)\n", "Selecting previously unselected package tensorflow-model-server.\n", "(Reading database ... 144433 files and directories currently installed.)\n", "Preparing to unpack .../tensorflow-model-server_2.1.0_all.deb ...\n", "Unpacking tensorflow-model-server (2.1.0) ...\n", "Setting up tensorflow-model-server (2.1.0) ...\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "k5u6UVdJ2K0W" }, "source": [ "## Create Dataset\n", "\n", "Now, we will create a simple dataset that expresses the relationship:\n", "\n", "$$\n", "y = 2x - 1\n", "$$\n", "\n", "between inputs (`xs`) and outputs (`ys`)."
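, "\n\nAs a quick sanity check (a small sketch duplicating the arrays the next cell defines), every output is indeed twice its input minus one:\n\n```python\nimport numpy as np\n\nxs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)\nys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=float)\n\n# Verify that y = 2x - 1 holds for every pair.\nassert np.allclose(ys, 2 * xs - 1)\n```"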
] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "3qqsNxy83Imw", "colab": {} }, "source": [ "xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)\n", "ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=float)" ], "execution_count": 0, "outputs": [] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "c6frUt2r3NYJ" }, "source": [ "## Build and Train the Model\n", "\n", "We'll use the simplest possible model for this example. Since we are going to train our model for `500` epochs, in order to avoid clutter on the screen, we will use the argument `verbose=0` in the `fit` method. The Verbosity mode can be:\n", "\n", "* `0` : silent.\n", "\n", "* `1` : progress bar.\n", "\n", "* `2` : one line per epoch.\n", "\n", "As a side note, we should mention that since the progress bar is not particularly useful when logged to a file, `verbose=2` is recommended when not running interactively (eg, in a production environment)." ] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "9952f7iAaT9F", "outputId": "1d9e11dc-d510-486b-c8c0-3dbcef6280ed", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "source": [ "model = tf.keras.Sequential([tf.keras.layers.Dense(units=1, input_shape=[1])])\n", "\n", "model.compile(optimizer='sgd',\n", " loss='mean_squared_error')\n", "\n", "history = model.fit(xs, ys, epochs=500, verbose=0)\n", "\n", "print(\"Finished training the model\")" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "Finished training the model\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "ONjN9e34vPwC" }, "source": [ "## Test the Model\n", "\n", "Now that the model is trained, we can test it. If we give it the value `10`, we should get a value very close to `19`." ] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "cnRch66BvNFF", "outputId": "80c59d83-ea85-484d-b692-304b64d7b2d6", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "source": [ "print(model.predict([[5.0]]))" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "[[8.994236]]\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "IQ67rKM9367n" }, "source": [ "## Save the Model\n", "\n", "To load the trained model into TensorFlow Serving we first need to save it in the [SavedModel](https://www.tensorflow.org/guide/saved_model) format. This will create a protobuf file in a well-defined directory hierarchy, and will include a version number. [TensorFlow Serving](https://www.tensorflow.org/tfx/serving/serving_config) allows us to select which version of a model, or \"servable\" we want to use when we make inference requests. Each version will be exported to a different sub-directory under the given path." 
] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "D9c2eOEHpjGS", "outputId": "6837102d-99f3-4071-9f1a-441e8c3b9aae", "colab": { "base_uri": "https://localhost:8080/", "height": 207 } }, "source": [ "MODEL_DIR = tempfile.gettempdir()\n", "\n", "version = 1\n", "\n", "export_path = os.path.join(MODEL_DIR, str(version))\n", "\n", "if os.path.isdir(export_path):\n", " print('\\nAlready saved a model, cleaning up\\n')\n", " !rm -r {export_path}\n", "\n", "model.save(export_path, save_format=\"tf\")\n", "\n", "print('\\nexport_path = {}'.format(export_path))\n", "!ls -l {export_path}" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/resource_variable_ops.py:1817: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.\n", "Instructions for updating:\n", "If using Keras pass *_constraint arguments to layers.\n", "INFO:tensorflow:Assets written to: /tmp/1/assets\n", "\n", "export_path = /tmp/1\n", "total 48\n", "drwxr-xr-x 2 root root 4096 May 16 05:12 assets\n", "-rw-r--r-- 1 root root 39128 May 16 05:12 saved_model.pb\n", "drwxr-xr-x 2 root root 4096 May 16 05:12 variables\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "E-ARQOR87Mt2" }, "source": [ "## Examine Your Saved Model\n", "\n", "We'll use the command line utility `saved_model_cli` to look at the `MetaGraphDefs` and `SignatureDefs` in our SavedModel. The signature definition is defined by the input and output tensors, and stored with the default serving key." ] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "7M0VJORSN2w9", "outputId": "fc3c2e40-7270-4ec0-9edd-780002f9e49e", "colab": { "base_uri": "https://localhost:8080/", "height": 1000 } }, "source": [ "!saved_model_cli show --dir {export_path} --all" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "\n", "MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:\n", "\n", "signature_def['__saved_model_init_op']:\n", " The given SavedModel SignatureDef contains the following input(s):\n", " The given SavedModel SignatureDef contains the following output(s):\n", " outputs['__saved_model_init_op'] tensor_info:\n", " dtype: DT_INVALID\n", " shape: unknown_rank\n", " name: NoOp\n", " Method name is: \n", "\n", "signature_def['serving_default']:\n", " The given SavedModel SignatureDef contains the following input(s):\n", " inputs['dense_input'] tensor_info:\n", " dtype: DT_FLOAT\n", " shape: (-1, 1)\n", " name: serving_default_dense_input:0\n", " The given SavedModel SignatureDef contains the following output(s):\n", " outputs['dense'] tensor_info:\n", " dtype: DT_FLOAT\n", " shape: (-1, 1)\n", " name: StatefulPartitionedCall:0\n", " Method name is: tensorflow/serving/predict\n", "WARNING: Logging before flag parsing goes to stderr.\n", "W0516 05:12:34.321393 140293225809792 deprecation.py:506] From /usr/local/lib/python2.7/dist-packages/tensorflow_core/python/ops/resource_variable_ops.py:1786: calling __init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.\n", "Instructions for updating:\n", "If using Keras pass *_constraint arguments to layers.\n", "\n", "Defined Functions:\n", " Function Name: '__call__'\n", " Option #1\n", " Callable with:\n", " Argument #1\n", " 
" dense_input: TensorSpec(shape=(None, 1), dtype=tf.float32, name=u'dense_input')\n", " Argument #2\n", " DType: bool\n", " Value: True\n", " Argument #3\n", " DType: NoneType\n", " Value: None\n", " Option #2\n", " Callable with:\n", " Argument #1\n", " inputs: TensorSpec(shape=(None, 1), dtype=tf.float32, name=u'inputs')\n", " Argument #2\n", " DType: bool\n", " Value: True\n", " Argument #3\n", " DType: NoneType\n", " Value: None\n", " Option #3\n", " Callable with:\n", " Argument #1\n", " inputs: TensorSpec(shape=(None, 1), dtype=tf.float32, name=u'inputs')\n", " Argument #2\n", " DType: bool\n", " Value: False\n", " Argument #3\n", " DType: NoneType\n", " Value: None\n", " Option #4\n", " Callable with:\n", " Argument #1\n", " dense_input: TensorSpec(shape=(None, 1), dtype=tf.float32, name=u'dense_input')\n", " Argument #2\n", " DType: bool\n", " Value: False\n", " Argument #3\n", " DType: NoneType\n", " Value: None\n", "\n", " Function Name: '_default_save_signature'\n", " Option #1\n", " Callable with:\n", " Argument #1\n", " dense_input: TensorSpec(shape=(None, 1), dtype=tf.float32, name=u'dense_input')\n", "\n", " Function Name: 'call_and_return_all_conditional_losses'\n", " Option #1\n", " Callable with:\n", " Argument #1\n", " inputs: TensorSpec(shape=(None, 1), dtype=tf.float32, name=u'inputs')\n", " Argument #2\n", " DType: bool\n", " Value: False\n", " Argument #3\n", " DType: NoneType\n", " Value: None\n", " Option #2\n", " Callable with:\n", " Argument #1\n", " inputs: TensorSpec(shape=(None, 1), dtype=tf.float32, name=u'inputs')\n", " Argument #2\n", " DType: bool\n", " Value: True\n", " Argument #3\n", " DType: NoneType\n", " Value: None\n", " Option #3\n", " Callable with:\n", " Argument #1\n", " dense_input: TensorSpec(shape=(None, 1), dtype=tf.float32, name=u'dense_input')\n", " Argument #2\n", " DType: bool\n", " Value: False\n", " Argument #3\n", " DType: NoneType\n", " Value: None\n", " Option #4\n", " Callable with:\n", " Argument #1\n", " dense_input: TensorSpec(shape=(None, 1), dtype=tf.float32, name=u'dense_input')\n", " Argument #2\n", " DType: bool\n", " Value: True\n", " Argument #3\n", " DType: NoneType\n", " Value: None\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "hS3dODJgAB87" }, "source": [ "## Run the TensorFlow Model Server\n", "\n", "We will now launch the TensorFlow model server with a bash script. We will use the `--bg` option of the `%%bash` magic so that the script runs in the background.\n", "\n", "Our script will start running TensorFlow Serving and will load our model. Here are the parameters we will use:\n", "\n", "* `rest_api_port`: The port that you'll use for requests.\n", "\n", "\n", "* `model_name`: You'll use this in the URL of your requests. It can be anything.\n", "\n", "\n", "* `model_base_path`: This is the path to the directory where you've saved your model.\n", "\n", "Also, because the variable that points to the directory containing the model is in Python, we need a way to tell the bash script where to find the model. To do this, we will write the value of the Python variable to an environment variable using the `os.environ` dictionary."
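, "\n\nOnce the server has been launched in the cells below, one optional way to confirm that the servable loaded is to poll TensorFlow Serving's model status endpoint (`GET /v1/models/<model_name>`). A minimal sketch, assuming the port `8501` and model name `test` used below:\n\n```python\nimport time\nimport requests\n\nfor _ in range(10):\n    try:\n        status = requests.get('http://localhost:8501/v1/models/test').json()\n        # Each loaded version reports a state such as AVAILABLE.\n        if status['model_version_status'][0]['state'] == 'AVAILABLE':\n            print('Model is ready to serve.')\n            break\n    except requests.exceptions.ConnectionError:\n        pass  # The server may not be accepting connections yet.\n    time.sleep(1)\n```"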
] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "nhXu-01eOFGE", "colab": {} }, "source": [ "os.environ[\"MODEL_DIR\"] = MODEL_DIR" ], "execution_count": 0, "outputs": [] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "kJDhHNJVnaLN", "outputId": "5e28c222-8e6f-4ba6-9cdb-8894af88191c", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "source": [ "%%bash --bg \n", "nohup tensorflow_model_server \\\n", " --rest_api_port=8501 \\\n", " --model_name=test \\\n", " --model_base_path=\"${MODEL_DIR}\" >server.log 2>&1" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "Starting job # 0 in a separate thread.\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "jWBzpit--6hS" }, "source": [ "Now we can take a look at the server log." ] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "F_PudlFqdtfl", "outputId": "f972f6cc-0a86-4da9-f4ce-0830ab93506e", "colab": { "base_uri": "https://localhost:8080/", "height": 207 } }, "source": [ "!tail server.log" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "2020-05-16 05:12:35.568881: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA\n", "2020-05-16 05:12:35.582063: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:203] Restoring SavedModel bundle.\n", "2020-05-16 05:12:35.592041: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:152] Running initialization op on SavedModel bundle at path: /tmp/1\n", "2020-05-16 05:12:35.594441: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:333] SavedModel load for tags { serve }; Status: success: OK. Took 26368 microseconds.\n", "2020-05-16 05:12:35.594760: I tensorflow_serving/servables/tensorflow/saved_model_warmup.cc:105] No warmup data file found at /tmp/1/assets.extra/tf_serving_warmup_requests\n", "2020-05-16 05:12:35.594866: I tensorflow_serving/core/loader_harness.cc:87] Successfully loaded servable version {name: test version: 1}\n", "2020-05-16 05:12:35.595857: I tensorflow_serving/model_servers/server.cc:358] Running gRPC ModelServer at 0.0.0.0:8500 ...\n", "[warn] getaddrinfo: address family for nodename not supported\n", "2020-05-16 05:12:35.596419: I tensorflow_serving/model_servers/server.cc:378] Exporting HTTP/REST API at:localhost:8501 ...\n", "[evhttp_server.cc : 238] NET_LOG: Entering the event loop ...\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "gD_dr7baFJ-s" }, "source": [ "## Create JSON Object with Test Data\n", "\n", "We are now ready to construct a JSON object with some data so that we can make a couple of inferences. We will use $x=9$ and $x=10$ as our test data." 
] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "FwxEEnOei38-", "outputId": "68037c8f-0ba1-49b3-eb30-ce3df8e075cb", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "source": [ "xs = np.array([[9.0], [10.0]])\n", "data = json.dumps({\"signature_name\": \"serving_default\", \"instances\": xs.tolist()})\n", "print(data)" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "{\"signature_name\": \"serving_default\", \"instances\": [[9.0], [10.0]]}\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "DZ0KtEF7Fjer" }, "source": [ "## Make Inference Request\n", "\n", "Finally, we can make the inference request and get the inferences back. We'll send a predict request as a POST to our server's REST endpoint, and pass it our test data. We'll ask our server to give us the latest version of our model by not specifying a particular version. The response will be a JSON payload containing the predictions." ] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "vGvFyuIzW6n6", "outputId": "f4bfb17d-5451-4645-83a1-53a413145f6d", "colab": { "base_uri": "https://localhost:8080/", "height": 85 } }, "source": [ "headers = {\"content-type\": \"application/json\"}\n", "json_response = requests.post('http://localhost:8501/v1/models/test:predict', data=data, headers=headers)\n", "\n", "print(json_response.text)" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "{\n", " \"predictions\": [[16.9821], [18.9790649]\n", " ]\n", "}\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "colab_type": "text", "id": "5DKLw7PwI928" }, "source": [ "We can also look at the predictions directly by loading the value for the `predictions` key." ] }, { "cell_type": "code", "metadata": { "colab_type": "code", "id": "F-x87o_DqfOL", "outputId": "1d381f2a-8c46-4a90-e7bd-c5950ba722a9", "colab": { "base_uri": "https://localhost:8080/", "height": 34 } }, "source": [ "predictions = json.loads(json_response.text)['predictions']\n", "print(predictions)" ], "execution_count": 0, "outputs": [ { "output_type": "stream", "text": [ "[[16.9821], [18.9790649]]\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "id": "qfzkY3DR5uJm", "colab_type": "text" }, "source": [ "You just saw how you could serve a dummy model with TF Model server let's now see a real model in action" ] } ] }