{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", " \n", "## [mlcourse.ai](https://mlcourse.ai) – Open Machine Learning Course \n", "\n", "\n", "### Tutorial\n", "# Deploying your Machine Learning Model\n", "\n", "
Author: Maxim Klyuchnikov
\n", "
" ] }, { "cell_type": "markdown", "metadata": { "toc": true }, "source": [ "

Table of Contents

\n", "
\n", " \n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 1. Introduction" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So, you have your brand new shiny ML model and you're impatient to let the world see and use it :)\n", "
\n", "The model was developed in a Jupyter Notebook with Python, but how do you make it available to other people and external systems?\n", "
\n", "That is the question I hope this tutorial will answer for you.\n", "
\n", "Here's the application, containing the model, that we are going to develop and deploy: https://mlcourse-tutorial-deployment.herokuapp.com/" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1.1. Who this tutorial is for\n", "* novice machine learning practitioners who are looking for an easy way to deploy their models\n", "* software engineers who are not familiar with the specifics of deploying machine learning models\n", "* anyone who just wants to bring a proof-of-concept model online to demonstrate and evaluate it" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1.2. What we're gonna use\n", "* **scikit-learn** to build the machine learning model\n", "* **Joblib** to persist the model\n", "* **Flask** as a web microframework and local development server\n", "* **Git** as a way to deliver our model and application\n", "* **Heroku** as a simple PaaS hosting for our application\n", "* a pinch of **HTML/CSS/JavaScript**, to make it look nice" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1.3. Why is it important to know how to do it?\n", "Let's start from the root of the problem - there is still no strict definition of what a Data Scientist is. If you ask ten people with the Data Scientist title about their job, you'll probably get ten very different answers, ranging from doing Excel calculations to developing the math models behind [AlphaZero](https://en.wikipedia.org/wiki/AlphaZero).\n", "\n", "Given the vague definition of the Data Scientist (by Data Scientists themselves), it's no wonder that many business people don't understand it either :)\n", "\n", "In an ideal world, creating the math model, implementing it as a machine learning model, making it available for production, and handling deployment and maintenance would be done by different people with different job titles; in practice, that is still uncommon. 
So you should be prepared for the fact that _making the model available for use might be a part of your job_.\n", "\n", "Aside from that, there are other reasons to learn it:\n", "\n", "* there is a whole world outside of the Jupyter Notebook. Like it or not, as a good Data Scientist, you have to understand how your model will be used by external systems and what kinds of problems arise when applications are integrated with each other;\n", "* for a curious person it's always nice to see what is going on under the hood of the services which provide ready-to-use ML models through APIs. We will build our solution using pretty basic tools, so you should get an understanding of how things work on a relatively low level;\n", "* the ability to deploy your own model is an important skill to add to your Data Scientist toolbox. It's not as hard as some of the machine learning algorithms, but it will set you apart from your colleagues or other job candidates. There is a fancy buzzword for this skill: **productionizing** the model - we'll make a tiny step in this direction;\n", "* don't let your model feel alone, let others talk to it :)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1.4. Disclaimer\n", "While the approach presented here is suitable for showing a proof-of-concept or an early version of your model, it might not be sufficient for heavy models or very intensive workloads. See the Missing Parts section for details about what needs to be considered for more complex and production-ready scenarios." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 2. Building our Model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2.1. 
Usual Machine Learning workflow\n", "Let's see what we usually have in our simple ML workflow and get an idea of which parts need to be deployed:\n", "\n", "\n", "In most cases we have to take the **Feature Processing**, **Persisted Model** and **Make Predictions** parts with us.\n", "\n", "In cases when our ML model supports [online training](https://en.wikipedia.org/wiki/Online_machine_learning), we can also take the **Model Training** part. Essentially, online training allows us to update the model with new data only, without the need to re-train it on the whole dataset, which is especially useful for large datasets and complex models. For the sake of simplicity, we will take the offline model.\n", "\n", "What is a **Persisted Model**? After training, the most important part of the model is the weights it has calculated. Along with the weights it's always nice to have some metadata describing the algorithm which will use these weights, the feature processing pipeline, the model version, etc. - all these things have to be unloaded from the computer memory and put somewhere (disk / database / cloud). This data is called the persisted model and it can be loaded back into memory for future use." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2.2. The Model\n", "Since we're mostly interested in the stuff around our model, let's take the most _Hello World_'ish dataset in machine learning: [The Iris Dataset](https://archive.ics.uci.edu/ml/datasets/iris). 
Thankfully, scikit-learn already has it included, so we don't even need to download anything.\n", "\n", "Let's import the dependencies first:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import warnings\n", "\n", "import joblib\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import pandas as pd\n", "import seaborn as sns\n", "from pandas.plotting import parallel_coordinates\n", "from sklearn.datasets import load_iris\n", "from sklearn.linear_model import LogisticRegression\n", "from sklearn.metrics import classification_report, confusion_matrix\n", "from sklearn.model_selection import train_test_split\n", "from sklearn.pipeline import FeatureUnion, Pipeline\n", "from sklearn.preprocessing import PolynomialFeatures, StandardScaler\n", "\n", "warnings.filterwarnings(\"ignore\")\n", "\n", "RANDOM_SEED = 17 # because it's the most popular random number between 1 and 20" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now load the dataset and take a look at its properties:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dataset = load_iris()\n", "print(\"Feature names:\", dataset.feature_names)\n", "print(\"Iris names:\", dataset.target_names)\n", "print(\"Number of instances:\", dataset.data.shape[0])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Feature names in this format are not convenient for future use, so let's convert them:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "feature_names = list(\n", "    map(lambda x: x.replace(\" (cm)\", \"\").replace(\" \", \"_\"), dataset.feature_names)\n", ")\n", "feature_names" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Load the dataset into a DataFrame:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "target = np.array([dataset.target_names[x] for x in 
dataset.target]).reshape(-1, 1)\n", "df_full = pd.DataFrame(\n", "    np.concatenate([dataset.data, target], axis=1), columns=feature_names + [\"target\"]\n", ")\n", "df_full[feature_names] = df_full[feature_names].astype(float)\n", "\n", "df_full.sample(5, random_state=RANDOM_SEED)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_full.describe()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Of course we can't have a Data Science-related tutorial without a cool chart, so here's one:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.figure(figsize=(15, 7))\n", "parallel_coordinates(\n", "    df_full, \"target\", colormap=plt.get_cmap(\"cool\")\n", ") # see - it's cool as promised\n", "plt.title(\"Iris species features with their values\")\n", "plt.ylabel(\"cm\");\n", "# the semicolon suppresses the output of the last line; the plot is shown anyway, even without plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, let's split our data into train and test sets, with a holdout set of 40%:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_train, df_test, y_train, y_test = train_test_split(\n", "    df_full.drop(\"target\", axis=1),\n", "    df_full.target,\n", "    test_size=0.4,\n", "    random_state=RANDOM_SEED, # fixed seed so that the split is reproducible\n", ")\n", "df_train.shape, y_train.shape, df_test.shape, y_test.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Given the toy nature of the dataset we're limited in feature engineering, but to have our **Feature Processing** step in place, let's add polynomial features to our data and also scale the original features.\n", "\n", "Finally, as an estimator we will use plain LogisticRegression.\n", "\n", "To have all of these steps in one place, let's put them into a pipeline - it will make our life easier later." 
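] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
As a rough sketch of what such a feature step produces (an illustration on a made-up matrix, not part of the tutorial's own code): scikit-learn's FeatureUnion runs its transformers side by side on the same input and concatenates their outputs column-wise. The toy array X and the variable names below are assumptions chosen for illustration only.

```python
import numpy as np
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# made-up data: 5 rows, 4 columns, standing in for the four iris measurements
X = np.arange(20, dtype=float).reshape(5, 4)

union = FeatureUnion([('poly', PolynomialFeatures()), ('scaler', StandardScaler())])
X_union = union.fit_transform(X)

# each transformer applied on its own, for comparison
X_poly = PolynomialFeatures().fit_transform(X)
X_scaled = StandardScaler().fit_transform(X)

# the union output is the two individual outputs placed next to each other
print(X_poly.shape, X_scaled.shape, X_union.shape)
print(np.allclose(X_union, np.hstack([X_poly, X_scaled])))
```

With four input columns, the default PolynomialFeatures yields 15 columns and StandardScaler yields 4, so an estimator placed after such a union sees 19 columns. In this arrangement the polynomial columns are passed on unscaled.
"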
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pipeline = Pipeline(\n", "    [\n", "        (\n", "            \"features\",\n", "            FeatureUnion(\n", "                [(\"poly\", PolynomialFeatures()), (\"scaler\", StandardScaler())]\n", "            ),\n", "        ),\n", "        (\"logreg\", LogisticRegression(random_state=RANDOM_SEED)),\n", "    ]\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that with its default settings PolynomialFeatures will produce 15 features (the bias term, the four original features and their ten degree-2 combinations):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\", \".join(PolynomialFeatures().fit(df_train).get_feature_names())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Fit the model and draw a confusion matrix:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pipeline.fit(df_train, y_train)\n", "pred = pipeline.predict(df_test)\n", "plt.figure(figsize=(7, 5))\n", "sns.heatmap(\n", "    confusion_matrix(y_test, pred), # true labels first: rows are true classes, columns are predictions\n", "    annot=True,\n", "    xticklabels=dataset.target_names,\n", "    yticklabels=dataset.target_names,\n", ");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And yay, we've got pretty good results :)\n", "\n", "Let's make some predictions based on the cool chart above, as if we had taken real measurements. Pick the values in such a way that we get the _setosa_ class:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pipeline.predict(\n", "    pd.DataFrame(\n", "        [\n", "            {\n", "                \"sepal_length\": 4.0,\n", "                \"sepal_width\": 5.0,\n", "                \"petal_length\": 1.0,\n", "                \"petal_width\": 0.5,\n", "            }\n", "        ]\n", "    )\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Hmmm, it's not _setosa_, so what happened? 
Let's check how the DataFrame is constructed:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pd.DataFrame(\n", "    [{\"sepal_length\": 4.0, \"sepal_width\": 5.0, \"petal_length\": 1.0, \"petal_width\": 0.5}]\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here is the problem: when a DataFrame is initialized from a dictionary, the order of its columns is not guaranteed to match the order of the keys in the dictionary passed (older pandas versions sort the columns alphabetically). Note that we do not actually need a DataFrame here - we could pass a plain list without column names - but a DataFrame is better for readability and our convenience.\n", "\n", "The workaround is to pass ```columns=dictionary.keys()``` to the DataFrame constructor or just select the DataFrame columns in the right order before doing the predictions:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_predict = pd.DataFrame(\n", "    [{\"sepal_length\": 4.0, \"sepal_width\": 5.0, \"petal_length\": 1.0, \"petal_width\": 0.5}]\n", ")\n", "pipeline.predict(df_predict[feature_names])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Okay, now we're good, but let's remember this behaviour so that we can work around it later as well." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2.3. Persisting The Model\n", "After training, the model has to be persisted. Basically, this is done by serializing the model state from memory to a file on disk in some specific format. 
It's common to use [pickle](https://docs.python.org/3/library/pickle.html) binary serialization in Python for that, but for [performance](https://scikit-learn.org/stable/modules/model_persistence.html) reasons we will use [joblib](https://joblib.readthedocs.io/en/latest/persistence.html) (which still uses pickle internally).\n", "\n", "Note that both pickle and joblib have problems with security - you have to be sure that the file you're trying to deserialize was not replaced by an attacker, because it's possible to force Python to execute malicious code during deserialization. You can see how it works in more detail [here](https://rushter.com/blog/pickle-serialization-internals/).\n", "\n", "What exactly has to be persisted in our case? Obviously, the following things need to be saved:\n", "* the model itself, which is the ```LogisticRegression``` estimator in our case - it contains the weights and the logic for applying these weights;\n", "* the feature processing steps, which are represented by ```PolynomialFeatures``` and ```StandardScaler``` - saving them will allow us to process data for predictions in exactly the same manner as we did for training.\n", "\n", "Not so obvious, but still nice to have:\n", "* the model version - for our own convenience and to be able to tell different models apart;\n", "* incoming feature names or any other info which describes the input and/or output data.\n", "\n", "### 2.3.1. scikit-learn Pipelines\n", "Before we proceed, let's pretend that we have our ```PolynomialFeatures``` and ```StandardScaler``` as separate objects. 
In this case we would need to save them and ```LogisticRegression``` as separate entities, which introduces too much hassle, especially if our model eventually grows and other feature processing steps are added.\n", "\n", "So, there are many good reasons to use scikit-learn pipelines and here is another one: we can persist just our ```Pipeline``` object and have all of the steps stored as a single entity in one file.\n", "\n", "### 2.3.2. Python and library versions\n", "Also, when serializing your model you have to be aware that the serialized object might not load in a different version of Python or with a different version of the library whose classes were serialized. That's why it's always good to pin the Python version and all of the related library versions and keep them along with the model. We will do it in this tutorial.\n", "\n", "### 2.3.3. Wrapping all of the model data\n", "Let's create a class which will contain a model and all additional metadata:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "from sklearn.pipeline import Pipeline\n", "\n", "\n", "class IrisModel:\n", "    def __init__(\n", "        self, pipeline: Pipeline, version=\"unknown\", input_features=[], class_names=[]\n", "    ):\n", "        self.pipeline = pipeline\n", "\n", "        self.version = version\n", "        self.input_features = input_features\n", "        self.class_names = class_names\n", "\n", "    def predict(self, data: pd.DataFrame) -> np.ndarray:\n", "        data = data[\n", "            self.input_features\n", "        ] # this is the workaround for the problem with DataFrame fields order, see above\n", "        return self.pipeline.predict(data)\n", "\n", "    # pretty-print our class in Jupyter notebook and when we're converting it to string\n", "    def __repr__(self):\n", "        return (\n", "            f\"{self.__class__.__name__}(\"\n", "            f\"{self.pipeline!r}, \"\n", "            f\"version={self.version!r}, \"\n", "            f\"input_features={self.input_features!r}, \"\n", "            
f\"class_names={self.class_names!r})\"\n", "        )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Put the content of the cell above into a file named **iris_model.py** and save it to the same directory from which you're running this notebook. Saving this class to a file (which actually becomes a Python module) is necessary to work around a problem with persisting classes which are not part of any module.\n", "\n", "Now create an instance of this class, fill it with our data and serialize it to a file:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from iris_model import IrisModel\n", "\n", "iris_model = IrisModel(\n", "    pipeline,\n", "    version=\"0.1\",\n", "    input_features=df_train.columns.values,\n", "    class_names=dataset.target_names,\n", ")\n", "\n", "joblib.dump(iris_model, \"iris-model-v%s.jl\" % iris_model.version)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 3. Wrapping The Model in Web Application" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we're ready to build our web application which will serve our model to the outside world. Let's take [Flask](http://flask.pocoo.org/) as a web framework because of its simplicity, small CPU and memory footprint and easy portability.\n", "\n", "This web application will expose a basic Web API which will be available for browsers and any other 3rd-party system to use over the HTTP protocol. 
So it will have its own URL which you can call from the browser or programmatically using any programming language, not necessarily Python.\n", "\n", "Also, nowadays it's common to use [JSON](https://en.wikipedia.org/wiki/JSON) as the default format for most Web APIs, so we will stick to it as well.\n", "\n", "First, install Flask itself:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install -U flask" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Import the necessary dependencies and initialize the Flask application:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from flask import Flask, jsonify, make_response, request\n", "from flask.testing import FlaskClient\n", "from werkzeug import Request\n", "\n", "app = Flask(__name__)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Load the persisted model wrapper into a variable:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "IRIS_MODEL = joblib.load(\"iris-model-v0.1.jl\")\n", "IRIS_MODEL" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Try to predict with the same values that we used for the pipeline before:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_predict = pd.DataFrame(\n", "    [{\"sepal_length\": 4.0, \"sepal_width\": 5.0, \"petal_length\": 1.0, \"petal_width\": 0.5}]\n", ")\n", "IRIS_MODEL.predict(df_predict[feature_names])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3.1. 
Create an endpoint for predictions\n", "Flask, as a typical web framework, allows us to expose _endpoints_ with their specific URLs, _routes_, and bind these routes to our custom methods which will perform the actions we need and return their results.\n", "\n", "Let's create the ```/predict``` endpoint:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "app = Flask(__name__)\n", "\n", "\n", "@app.route(\n", "    \"/predict\", methods=[\"POST\"]\n", ") # Flask decorator to mark the method which will be called when the application receives\n", "# an HTTP request to /predict URL using the POST method\n", "def predict():\n", "    # here we expect an HTTP request body with the following data in JSON format:\n", "    # [ {\"sepal_length\": 4.0, \"sepal_width\": 5.0, \"petal_length\": 1.0, \"petal_width\": 0.5},\n", "    # {\"sepal_length\": 3.0, \"sepal_width\": 3.0, \"petal_length\": 2.0, \"petal_width\": 4} ]\n", "    data = request.get_json() # get the request body as a Python list of dictionaries\n", "\n", "    df_predict = pd.DataFrame(data)\n", "    predictions = IRIS_MODEL.predict(df_predict)\n", "    predictions = predictions.tolist()\n", "\n", "    return jsonify(\n", "        {\n", "            \"version\": IRIS_MODEL.version, # return model version to calling side\n", "            \"status\": \"success\", # indicate that the call went well\n", "            \"predictions\": predictions, # actual predictions, [\"setosa\", \"virginica\"]\n", "        }\n", "    )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you see, our endpoint is represented by the ```/predict``` route which is bound to the ```predict()``` method.\n", "* this method is called by Flask after it has processed the incoming HTTP request and found which method is responsible for further processing;\n", "* we take the JSON from the body of the HTTP request, convert it to a list of Python dictionaries, feed them into a DataFrame and make predictions;\n", "* finally, this method converts the predictions to JSON format and returns 
them to Flask;\n", "* Flask forms the HTTP response and returns it to the calling side (browser or another application).\n", "\n", "As a rule of thumb, it's better to make your prediction endpoints work with multiple predictions at once, for the following reasons:\n", "* performance - multiple predictions in one call are generally done faster;\n", "* convenience - clients of your API will be able to work with just one endpoint by sending both single and multiple prediction requests to it.\n", "\n", "## 3.2. Incoming data validation\n", "As you see, the incoming format of the data is pretty complex and there are many ways to send something incorrect or even malicious. That's why we need some sort of data validation to be sure that we can actually process the input data without any problems.\n", "\n", "But since it's a very broad topic we will skip it. Anyway, take the time to learn one of the following ways to validate JSON in Python: [Cerberus](http://docs.python-cerberus.org/en/stable/), [jsonschema](https://python-jsonschema.readthedocs.io/en/latest/).\n", "\n", "## 3.3. 
Testing the API call\n", "That's another broad topic, actually :) In most cases when your application grows, you'd have to use unit tests ([unittest](https://docs.python.org/3/library/unittest.html), [pytest](https://docs.pytest.org/en/latest/)) to check if it's behaving as expected, especially when it's being refactored or actively developed.\n", "\n", "Here we will do some basic checks using the ```FlaskClient``` and asserts, which still work inside the Jupyter notebook and will allow you to debug your endpoint:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# prepare some test data\n", "test_objects = df_train.sample(3, random_state=RANDOM_SEED)\n", "\n", "expected_results = IRIS_MODEL.predict(test_objects).tolist()\n", "\n", "data_to_send = [row.to_dict() for i, row in test_objects.iterrows()]\n", "\n", "data_to_send, \"---\", expected_results" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# normal successful request\n", "with app.test_client() as client: # client is a FlaskClient object\n", "    response = client.post(\"/predict\", json=data_to_send)\n", "\n", "    assert response.status_code == 200\n", "    assert response.get_json()[\"status\"] == \"success\"\n", "    assert response.get_json()[\"predictions\"] == expected_results\n", "\"done\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3.4. Running web application" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Okay, now we have all components in place to be able to run our web app and make our first real call to it.\n", "\n", "### 3.4.1. 
Preparing file structure\n", "We can't stay inside the Python notebook anymore, so we need to put our code in separate files according to the Flask-compatible structure:\n", "```\n", "+---static\n", "|       irises.png\n", "|       main.css\n", "+---templates\n", "|       index.html # we'll have a '/' route for this page - it will be the start page of our site\n", "|                  # this is not needed for the model to work\n", "|                  # it's just a simple UI to play with our Flask endpoint from the browser\n", "|   iris-model-v0.1.jl # our previously saved model, just copy it here\n", "|   iris_app.py\n", "|   iris_model.py # use the same file which you should already have (see \"Wrapping all of the model data\" section)\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### static/irises.png\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### static/main.css\n", "```css\n", "html,\n", "body {\n", "    height: 100%;\n", "}\n", "\n", "body {\n", "    display: -ms-flexbox;\n", "    display: flex;\n", "    -ms-flex-align: center;\n", "    align-items: center;\n", "    padding-top: 40px;\n", "    padding-bottom: 40px;\n", "}\n", "\n", ".form-predict {\n", "    width: 100%;\n", "    max-width: 700px;\n", "    padding: 15px;\n", "    margin: 0 auto auto;\n", "}\n", "\n", ".form-predict .form-control {\n", "    position: relative;\n", "    box-sizing: border-box;\n", "    height: auto;\n", "    padding: 10px;\n", "    font-size: 16px;\n", "}\n", "\n", ".form-predict .form-control:focus {\n", "    z-index: 2;\n", "}\n", "\n", ".result span {\n", "    color: green;\n", "}\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### templates/index.html\n", "\n", "```html\n", "\n", "\n", "\n", " \n", " mlcourse.ai tutorial demo application\n", " \n", " \n", "\n", "\n", "
\n", "

Iris Classification

\n", "\n", " \"\"\n", "\n", "
\n", "\n", "

Enter measured characteristics

\n", "\n", "
\n", " \n", "
\n", " \n", "
\n", " \n", "
\n", "\n", "
\n", " \n", "
\n", " \n", "
\n", " \n", "
\n", "\n", "
\n", " \n", "
\n", " \n", "
\n", " \n", "
\n", "\n", "
\n", " \n", "
\n", " \n", "
\n", " \n", "
\n", "\n", " \n", "\n", "
\n", "\n", "

\n", "
\n", "\n", "\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### iris_app.py\n", "\n", "```python\n", "import joblib\n", "import pandas as pd\n", "from flask import Flask, request, jsonify, render_template\n", "\n", "app = Flask(__name__)\n", "\n", "IRIS_MODEL = joblib.load('iris-model-v0.1.jl')\n", "\n", "\n", "@app.route('/')\n", "def index():\n", "    return render_template('index.html')\n", "\n", "\n", "@app.route('/predict', methods=['POST']) # Flask decorator to mark the method which will be called when the application receives\n", "# an HTTP request to /predict URL using the POST method\n", "def predict():\n", "    # here we expect an HTTP request body with the following data in JSON format:\n", "    # [ {\"sepal_length\": 4.0, \"sepal_width\": 5.0, \"petal_length\": 1.0, \"petal_width\": 0.5},\n", "    # {\"sepal_length\": 3.0, \"sepal_width\": 3.0, \"petal_length\": 2.0, \"petal_width\": 4} ]\n", "    data = request.get_json() # get the request body as a Python list of dictionaries\n", "\n", "    df_predict = pd.DataFrame(data)\n", "    predictions = IRIS_MODEL.predict(df_predict)\n", "    predictions = predictions.tolist()\n", "\n", "    return jsonify({\n", "        'version': IRIS_MODEL.version, # return model version to calling side\n", "        'status': 'success', # indicate that the call went well\n", "        'predictions': predictions # actual predictions, [\"setosa\", \"virginica\"]\n", "    })\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.4.2. Run Flask development server\n", "In addition to its web framework capabilities, Flask includes its own development server which is installed along with the Python package. 
This server is not supposed to be used in production because of its stability and scaling problems, but it works pretty well during application development.\n", "\n", "First, open the command prompt in the folder where you put all of the files above.\n", "\n", "To run Flask on our local machine, we need to pass some settings to it using environment variables. Here is how to do it on Windows:\n", "> set FLASK_APP=iris_app.py && set FLASK_ENV=development && flask run\n", "\n", "```\n", "* Serving Flask app \"iris_app.py \"\n", "* Environment: development\n", "* Debug mode: off\n", "* Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.4.3. Try it with browser\n", "\n", "Open the URL which Flask gave you and you should see something like this:\n", "\n", "\n", "Feel free to play with it and make your own predictions. Refer to the cool chart from the model building section to input meaningful data." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.4.4. Try it with Python\n", "\n", "As mentioned previously, you don't have to use a browser to call your web application endpoint. So let's do it with Python.\n", "\n", "Assuming that the Flask development server is running, execute the following code:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import requests\n", "\n", "response = requests.post(\n", "    \"http://127.0.0.1:5000/predict\",\n", "    json=[\n", "        {\n", "            \"sepal_length\": 5.0,\n", "            \"sepal_width\": 3.5,\n", "            \"petal_length\": 1.3,\n", "            \"petal_width\": 0.3,\n", "        }\n", "    ],\n", ")\n", "\n", "response.json()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.4.5. Flask tricks\n", "When you change the application files while the Flask server is running, you have to restart it since it does not load the changes from the disk automatically. 
Fortunately, there is an option to force Flask to watch for file changes (it enables debugging, but automatic reload is a nice side effect during development):\n", "\n", "> set FLASK_APP=iris_app.py && set FLASK_ENV=development **&& set FLASK_DEBUG=1** && flask run\n", "\n", "\n", "If you run Flask with the host parameter, you should be able to show your application to a colleague if you are on the same LAN and your port ```5000``` is open:\n", "\n", "> set FLASK_APP=iris_app.py && set FLASK_ENV=development && flask run **-h 0.0.0.0**\n", "\n", "Now you can proudly hold this badge :) \n", "\n", "Which is, yeah, definitely awesome, but we'd like to build solutions which work somewhere else, right? So let's go on." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 4. Deployment to Heroku" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[Heroku](https://www.heroku.com/) is a Platform as a Service (PaaS) cloud provider which, by its PaaS nature, allows you to skip the complex procedures of application deployment while still providing good customization support.\n", "\n", "What is also good about Heroku is that it allows you to run one instance of your web application for free, which makes it a platform of choice for proofs of concept or small applications.\n", "\n", "## 4.1. Setting up\n", "\n", "First, do these things:\n", "* [create an account](https://signup.heroku.com/) on Heroku\n", "* [download](https://devcenter.heroku.com/articles/heroku-cli) and install the Heroku CLI, which allows you to work with the platform straight from the command line\n", "\n", "Once the Heroku CLI is installed, introduce your installation to Heroku by running\n", "> heroku login\n", "\n", "which, depending on your platform, opens a browser to perform the login or asks for your Heroku login and password in the command line." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4.2. 
Prepare our app for Heroku\n", "When Heroku receives your application files, it needs to recognize what type of application you are deploying and how to run it. Often this is done automatically by analyzing the application files, but it's better to state explicitly what we need Heroku to do.\n", "\n", "Let's create a couple of files in the app root directory which will tell Heroku how to work with our application.\n", "\n", "```\n", "| Procfile\n", "| requirements.txt\n", "```\n", "\n", "### 4.2.1. Procfile\n", "This is the file which Heroku uses to get the commands that start your application and perform some post-start actions.\n", "Here is such a file for our application:\n", "```\n", "web: gunicorn iris_app:app --log-file -\n", "```\n", "Yes, just a single line. It tells Heroku the following:\n", "* our application must be run as a ```web``` application;\n", "* use ```gunicorn``` as the web server (remember that the Flask server is for development only, so we switched to [gunicorn](https://gunicorn.org/)) and pass it the path to our application object within the specific module (```iris_app:app```);\n", "* ```--log-file -``` is an option for gunicorn to write its log straight to the standard error stream (stderr) instead of a file - Heroku then captures it and exposes it as part of its own logging mechanism.\n", "\n", "### 4.2.2. requirements.txt\n", "This file is not Heroku-specific - it's a common file used in the Python ecosystem to keep the list of dependencies required for an application or library to run. 
Heroku is able to recognize this file and will install the requested dependencies while deploying your application.\n", "\n", "Here it is:\n", "```\n", "pandas==0.22.0\n", "numpy==1.15.4\n", "scikit-learn==0.20.1\n", "joblib==0.13.0\n", "Flask==1.0.2\n", "gunicorn==19.9.0\n", "```\n", "Note that I've added ```gunicorn``` manually - it does not have to be installed on a local machine, but we need it to be installed on Heroku.\n", "You can see and save the libraries that you have in your Python installation (or, which is way better, in your [virtualenv](https://virtualenv.pypa.io/en/latest/)) by running the following command:\n", "> pip freeze > requirements.txt\n", "\n", "However, it will also save dependencies that your application does not use, as well as sub-dependencies. This may do no harm, but it's better to go through the generated file and remove the dependencies that your application does not need." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4.3. Initialize local Git repository\n", "Heroku allows you to deploy your application by doing a simple git push. To start using Git, we need to run the following commands in the directory with our web application:\n", "\n", "Initialize an empty Git repository:\n", "> git init\n", "\n", "Add all of the existing files to Git:\n", "> git add .\n", "\n", "Commit the added files to the newly created local Git repository:\n", "> git commit -m \"iris app initial commit\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4.4. 
Create Heroku application\n", "Run the following command in the app root directory:\n", "> heroku create\n", "\n", "It will create a weirdly named Heroku application for you and add a remote Git repository (hosted on Heroku) to your app's Git configuration:\n", "\n", "> git remote -v\n", "\n", "```\n", "heroku https://git.heroku.com/blooming-eyrie-32543.git (fetch)\n", "heroku https://git.heroku.com/blooming-eyrie-32543.git (push)\n", "```\n", "\n", "Don't worry that your application has a name like ```blooming-eyrie-32543``` - you can [rename](https://devcenter.heroku.com/articles/renaming-apps) it later :) The name is generated intentionally, because Heroku apps must have unique names and we don't want to spend time looking for a name which is not already used by someone else. Just for reference, it's possible to use your [custom domain](https://devcenter.heroku.com/articles/custom-domains) on Heroku, if needed.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4.5. Pushing your application\n", "Phew, now we're ready to go with the actual deployment, which is a simple git push in the case of Heroku:\n", "\n", "> git push heroku master\n", "\n", "```\n", "Counting objects: 12, done.\n", "Delta compression using up to 8 threads.\n", "Compressing objects: 100% (10/10), done.\n", "Writing objects: 100% (12/12), 281.51 KiB | 15.64 MiB/s, done.\n", "Total 12 (delta 0), reused 0 (delta 0)\n", "remote: Compressing source files... done.\n", "remote: Building source:\n", "remote:\n", "remote: -----> Python app detected\n", "remote: -----> Installing python-3.6.7\n", "remote: -----> Installing pip\n", "remote: -----> Installing SQLite3\n", "remote: -----> Installing requirements with pip\n", "... 
skipped long log of dependencies installation\n", "remote:\n", "remote: -----> Discovering process types\n", "remote: Procfile declares types -> web\n", "remote:\n", "remote: -----> Compressing...\n", "remote: Done: 125.5M\n", "remote: -----> Launching...\n", "remote: Released v3\n", "remote: https://blooming-eyrie-32543.herokuapp.com/ deployed to Heroku\n", "remote:\n", "remote: Verifying deploy... done.\n", "To https://git.heroku.com/blooming-eyrie-32543.git\n", " * [new branch] master -> master\n", "```\n", "\n", "Open the URL provided at the end of the output log - you should see the site up and running.\n", "\n", "If for some reason an error is displayed, it's worth taking a look at the Heroku logs (```--tail``` streams them live):\n", "\n", "> heroku logs --tail\n", "\n", "```\n", "2018-12-13T13:40:49.557538+00:00 heroku[web.1]: Starting process with command `gunicorn app:app --log-file -`\n", "2018-12-13T13:40:49.000000+00:00 app[api]: Build succeeded\n", "2018-12-13T13:40:51.776548+00:00 heroku[web.1]: State changed from starting to crashed\n", "2018-12-13T13:40:51.757263+00:00 heroku[web.1]: Process exited with status 3\n", "2018-12-13T13:40:51.621873+00:00 app[web.1]: [2018-12-13 13:40:51 +0000] [4] [INFO] Starting gunicorn 19.9.0\n", "2018-12-13T13:40:51.622501+00:00 app[web.1]: [2018-12-13 13:40:51 +0000] [4] [INFO] Listening at: http://0.0.0.0:10953 (4)\n", "2018-12-13T13:40:51.622615+00:00 app[web.1]: [2018-12-13 13:40:51 +0000] [4] [INFO] Using worker: sync\n", "2018-12-13T13:40:51.626310+00:00 app[web.1]: [2018-12-13 13:40:51 +0000] [10] [INFO] Booting worker with pid: 10\n", "2018-12-13T13:40:51.630939+00:00 app[web.1]: [2018-12-13 13:40:51 +0000] [10] [ERROR] Exception in worker process\n", "2018-12-13T13:40:51.630943+00:00 app[web.1]: Traceback (most recent call last):\n", "... 
skipped stack trace\n", "2018-12-13T13:40:51.630990+00:00 app[web.1]: __import__(module)\n", "2018-12-13T13:40:51.630991+00:00 app[web.1]: ModuleNotFoundError: No module named 'app'\n", "2018-12-13T13:40:51.634161+00:00 app[web.1]: [2018-12-13 13:40:51 +0000] [10] [INFO] Worker exiting (pid: 10)\n", "2018-12-13T13:40:51.668563+00:00 app[web.1]: [2018-12-13 13:40:51 +0000] [4] [INFO] Shutting down: Master\n", "2018-12-13T13:40:51.668996+00:00 app[web.1]: [2018-12-13 13:40:51 +0000] [4] [INFO] Reason: Worker failed to boot.\n", "```\n", "Here, for example, an incorrect Procfile was provided (```app:app``` instead of ```iris_app:app```) and gunicorn was unable to find the application.\n", "\n", "In addition to errors, this command will also display information about all incoming requests (Heroku's router logs them), so you can see whether your requests actually reach the application and spot errors, if any.\n", "\n", "Hopefully, everything went fine and now you are able to use your model from a browser and by calling the web app endpoints from any other programming language!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 5. Missing Parts\n", "Here you can find some thoughts and directions about what could be done better in this tutorial and what could be potential areas of interest for further learning.\n", "\n", "This section is totally subjective, so your mileage may vary." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 5.1. Versioning\n", "Each time you re-train your model or change its internals, it should get a new version.\n", "\n", "The API built around the model must have a version too, and that version must be updated whenever the set of input fields or the output data changes, so that existing clients do not break. Version + endpoints + format of the input and output data form a **contract** between the service and its clients, and it should not be violated freely. 
Older clients, if they are not under your control, should still be able to talk to your old API, which must be deployed separately from the new one.\n", "\n", "It's also good to use an appropriate Git branching structure, like [GitFlow](https://www.atlassian.com/git/tutorials/comparing-workflows/gitflow-workflow), or specialized machine learning version control systems, like [DVC](https://dvc.org)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 5.2. Dependency management\n", "Many ecosystems built on shared libraries have their own problems when different applications need different versions of the same library: [DLL Hell](https://en.wikipedia.org/wiki/DLL_Hell), [JAR Hell](https://dzone.com/articles/what-is-jar-hell), [Python Dependency Hell](https://medium.com/knerd/the-nine-circles-of-python-dependency-hell-481d53e3e025).\n", "\n", "When it comes to portability between different computers, Python versions and environments, it's necessary to pin your project dependencies so that you get exactly the same libraries that you had during development.\n", "\n", "We've touched on this area a bit by creating ```requirements.txt```, but you should learn how to use [virtualenv](https://virtualenv.pypa.io/en/latest/), if you haven't yet.\n", "\n", "As an extreme form of dependency management, it's possible to use Docker as a replacement for virtualenv - it also abstracts you from the target operating system, since you can pin the OS in the Docker container along with the Python libraries." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 5.3. 
Containerizing your application\n", "As mentioned above, Docker provides a nice abstraction of the environment, since you can pin the OS, system libraries, installed software and Python libraries inside the Docker container.\n", "\n", "Later, this Docker container can be delivered to production in exactly the same state as you had during development.\n", "In addition, nowadays there are many cloud providers that allow you to deploy Docker containers as applications.\n", "\n", "Actually, containerizing ML applications is a pretty big topic which deserves another tutorial." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 5.4. Scalability\n", "It's always worth knowing how much hardware your model needs to work properly without slowing down its clients.\n", "\n", "The first step in this direction is to measure the performance of your final application by sending a bunch of queries to it ([ApacheBench](https://en.wikipedia.org/wiki/ApacheBench), [JMeter](https://en.wikipedia.org/wiki/Apache_JMeter)).\n", "\n", "Since our application is stateless (meaning that it does not produce anything that has to be stored between requests from the same client), we can scale it by simply adding more instances on our hosting service, Heroku.\n", "\n", "In addition to that, we can [tune gunicorn to process multiple queries simultaneously](https://devcenter.heroku.com/articles/optimizing-dyno-usage#python) by spawning more workers, to utilize the resources of a single instance more fully." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 5.5. Persisting input data\n", "Sometimes there is a need to collect the real data which your model receives after being deployed. 
Usually that data is kept along with the predictions which your model made.\n", "For example, it could be useful in the following situations:\n", "* you need to check that the data sent by clients still comes from the same distribution which you had at training time - so after collecting real data you can analyze it, label it, add it to the original data and re-train your model;\n", "* it was found that your model gives strange results on some input data - having logs with this data and your model's predictions will allow you to debug.\n", "\n", "The obvious choice for persistence is a database, which is pretty easy to use from Python.\n", "\n", "Be aware that choosing SQLite (which is just a file on the local disk) while hosting your model on a PaaS service (like Heroku) might get you in trouble, because in most cases the local file system there is ephemeral and can be wiped out at any time. So it's better to choose a database server which is hosted on another machine - PaaS services usually have it as an [option](https://devcenter.heroku.com/categories/data-management)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 5.6. Heavy workloads and models\n", "The model which we used in the tutorial is a toy one and it will not suffer much even under high load. But for some scenarios the tools which we used might not be enough to keep up. In such cases, different technologies created specifically for big data and high load must be considered.\n", "\n", "As a Machine Learning practitioner, take a look at [Spark MLlib](https://spark.apache.org/mllib/) and [H2O](https://www.h2o.ai/products/h2o/)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 5.7. Other deployment options\n", "There are some other deployment options which are not covered in this tutorial, but the core idea is still the same. 
Take time to learn them or create another tutorial :)\n", "* Docker containers (see above about their pros);\n", "* serverless deployments using [AWS Lambda](https://aws.amazon.com/lambda/) or [Azure Functions](https://docs.microsoft.com/en-us/azure/azure-functions/) - they allow you to run your model on demand, usually using much less resources and money. It's another great way to deploy proofs of concept and small models to production;\n", "* self-hosted, when you want to have total control over your hardware and software or you have to use your models inside the organization's network. This way requires a lot more effort and it's better to have an experienced DevOps engineer on the team for help." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.5" }, "toc": { "base_numbering": 1, "nav_menu": { "height": "729px", "width": "404px" }, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": true, "toc_position": {}, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 2 }