# BentoML Example: SKLearn to ONNX model


**BentoML makes moving trained ML models to production easy:**

* Package models trained with **any ML framework** and reproduce them for model serving in production
* **Deploy anywhere** for online API serving or offline batch serving
* High-Performance API model server with *adaptive micro-batching* support
* Central hub for managing models and deployment process via Web UI and APIs
* Modular and flexible design making it *adaptable to your infrastructure*

BentoML is a framework for serving, managing, and deploying machine learning models. It aims to bridge the gap between Data Science and DevOps, and to enable teams to deliver prediction services in a fast, repeatable, and scalable way.

Before reading this example project, be sure to check out the [Getting started guide](https://github.com/bentoml/BentoML/blob/master/guides/quick-start/bentoml-quick-start-guide.ipynb) to learn about the basic concepts in BentoML.

This example notebook demonstrates how to train a scikit-learn model, convert it to ONNX, and package the ONNX model with BentoML.


In [1]:
!pip install -q bentoml "skl2onnx==1.7.0" "onnx==1.7.0" "onnxmltools==1.7.0"


In [1]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
iris = load_iris()
X, y = iris.data, iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y)

from sklearn.linear_model import LogisticRegression
clr = LogisticRegression()
clr.fit(X_train, y_train)

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression


LogisticRegression()

In [2]:
clr.predict([[5.1, 3.5, 1.4, 0.2]])

array([0])
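The `array([0])` result above is a class index, not a species name. As a minimal sketch, the index can be mapped back to a label. The helper `decode_labels` and the constant `IRIS_CLASS_NAMES` are hypothetical names introduced here for illustration; the name order follows scikit-learn's `iris.target_names`. The same helper also accepts float indices such as the `0.0` values returned by the BentoML service later in this notebook.

```python
import json

# Class names in the order used by scikit-learn's iris dataset (iris.target_names).
IRIS_CLASS_NAMES = ["setosa", "versicolor", "virginica"]


def decode_labels(raw_predictions):
    """Map numeric class indices (ints or floats such as 0.0) to species names."""
    return [IRIS_CLASS_NAMES[int(label)] for label in raw_predictions]


print(decode_labels([0]))  # ['setosa']
print(decode_labels(json.loads("[0.0, 2.0]")))  # ['setosa', 'virginica']
```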

In [3]:
%%writefile iris_classifier_onnx.py

import numpy
import bentoml
from bentoml.adapters import DataframeInput
from bentoml.frameworks.onnx import OnnxModelArtifact

@bentoml.artifacts([OnnxModelArtifact('model')])
@bentoml.env(infer_pip_packages=True)
class IrisClassifierOnnx(bentoml.BentoService):
    
    @bentoml.api(input=DataframeInput(), batch=True)
    def predict(self, df):
        input_data = df.to_numpy().astype(numpy.float32)
        input_name = self.artifacts.model.get_inputs()[0].name
        output_name = self.artifacts.model.get_outputs()[0].name
        outputs = numpy.zeros(input_data.shape[0])
        for i in range(input_data.shape[0]):
            outputs[i] = self.artifacts.model.run([output_name], {input_name: input_data[i: i + 1]})[0]
        return outputs

Writing iris_classifier_onnx.py
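The `predict` method above passes `input_data[i: i + 1]` to the model rather than `input_data[i]`. A likely reason is that the model input is declared as `FloatTensorType([None, 4])` in the conversion cell, which describes a 2-D batch of rows. The sketch below uses only NumPy and made-up sample rows (an assumption for illustration) to show how slicing and indexing differ in the shape they produce.

```python
import numpy

# Two iris-like rows (illustrative values), cast to float32 as in predict().
input_data = numpy.array(
    [[5.1, 3.5, 1.4, 0.2], [6.2, 2.9, 4.3, 1.3]]
).astype(numpy.float32)

single_row_2d = input_data[0:1]  # slicing keeps the batch axis
single_row_1d = input_data[0]    # indexing drops the batch axis

print(single_row_2d.shape)  # (1, 4)
print(single_row_1d.shape)  # (4,)
```

Only the sliced form matches a `[None, 4]` input shape, which is why the loop slices instead of indexing.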


In [4]:
from iris_classifier_onnx import IrisClassifierOnnx
from skl2onnx.common.data_types import FloatTensorType
import onnxmltools



initial_type = [('float_input', FloatTensorType([None, 4]))]
onnx_model = onnxmltools.convert_sklearn(clr, initial_types=initial_type)

svc = IrisClassifierOnnx()
svc.pack('model', onnx_model)

saved_path = svc.save()

[2020-09-22 14:32:58,902] INFO - Using default docker base image: `None` specified inBentoML config file or env var. User must make sure that the docker base image either has Python 3.7 or conda installed.
[2020-09-22 14:33:01,032] INFO - Detected non-PyPI-released BentoML installed, copying local BentoML modulefiles to target saved bundle path..


[2020-09-22 14:33:06,941] INFO - BentoService bundle 'IrisClassifierOnnx:20200922143300_5EB8CF' saved to: /Users/bozhaoyu/bentoml/repository/IrisClassifierOnnx/20200922143300_5EB8CF


## REST API Model Serving


To start a REST API model server with the BentoService saved above, use the `bentoml serve` command:

In [2]:
!bentoml serve IrisClassifierOnnx:latest --enable-microbatch

[2020-08-04 09:58:22,230] INFO - Getting latest version IrisClassifierOnnx:20200804094903_8746D7
[2020-08-04 09:58:22,230] INFO - Starting BentoML API server in development mode..
[2020-08-04 09:58:24,466] INFO - Micro batch enabled for API `predict`
[2020-08-04 09:58:24,466] INFO - Your system nofile limit is 10000, which means each instance of microbatch service is able to hold this number of connections at same time. You can increase the number of file descriptors for the server process, or launch more microbatch instances to accept more concurrent connection.
[2020-08-04 09:58:24,475] INFO - Running micro batch service on :5000
 * Serving Flask app "IrisClassifierOnnx" (lazy loading)
 * Environment: production
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:40091/ (Press CTRL+C to quit)
[2020-08-04 09:58:27,265] INFO - Initializing onnxruntime InferenceSession from onnx file:'/home/bentoml/bentoml/repository/

If you are running this notebook from Google Colab, you can start the dev server with the `--run-with-ngrok` option to gain access to the API through a public endpoint managed by [ngrok](https://ngrok.com/):

In [None]:
!bentoml serve IrisClassifierOnnx:latest --run-with-ngrok

#### Send prediction request to the REST API server

```
curl -X POST \
  http://localhost:5000/predict \
  -H 'Content-Type: application/json' \
  -d '[[5.1, 3.5, 1.4, 0.2]]'
```
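The same request can be sent from Python. The following is a minimal sketch using only the standard library; the helper names `build_predict_request` and `predict` are hypothetical, and the URL assumes the local server address used in the curl example above.

```python
import json
import urllib.request

# Assumed endpoint: the local server address used in the curl example.
PREDICT_URL = "http://localhost:5000/predict"


def build_predict_request(rows, url=PREDICT_URL):
    """Build a POST request carrying `rows` as a JSON list of feature lists."""
    body = json.dumps(rows).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(rows, url=PREDICT_URL, timeout=10):
    """Send the request and decode the JSON response (needs a running server)."""
    request = build_predict_request(rows, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not contact the server.
request = build_predict_request([[5.1, 3.5, 1.4, 0.2]])
```

With the server running, `predict([[5.1, 3.5, 1.4, 0.2]])` sends the request and returns the decoded response.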



## Containerize model server with Docker


One common way of distributing this model API server for production deployment is via Docker containers, and BentoML provides a convenient way to do that.

Note that Docker is **not available in Google Colab**. You will need to download and run this notebook locally to try out containerization with Docker.

If you already have Docker configured, simply run the following command to produce a Docker image serving the IrisClassifierOnnx prediction service created above:

In [6]:
!bentoml containerize IrisClassifierOnnx:latest

[2020-09-22 14:38:45,682] INFO - Getting latest version IrisClassifierOnnx:20200922143300_5EB8CF
Found Bento: /Users/bozhaoyu/bentoml/repository/IrisClassifierOnnx/20200922143300_5EB8CF
Tag not specified, using tag parsed from BentoService: 'irisclassifieronnx:20200922143300_5EB8CF'
Building Docker image irisclassifieronnx:20200922143300_5EB8CF from IrisClassifierOnnx:latest

Collecting pytz>=2011k
  Downloading pytz-2020.1-py2.py3-none-any.whl (510 kB)
Collecting onnx>=1.2.3
  Downloading onnx-1.7.0-cp37-cp37m-manylinux1_x86_64.whl (7.4 MB)
Installing collected packages: numpy, pytz, pandas, onnx, onnxruntime
  Attempting uninstall: numpy
    Found existing installation: numpy 1.19.2
    Uninstalling numpy-1.19.2:
      Successfully uninstalled numpy-1.19.2
Successfully installed numpy-1.18.4 onnx-1.7.0 onnxruntime-1.3.0 pandas-0.24.2 pytz-2020.1
 ---> 866c51f54760
Step 8/15 : COPY . /bento
 ---> 6e11b086aa71
Step 9/15 : RUN if [ -d /bento/bundled_pip_dependencies ]; then pip install -U bundled_pip_dependencies/* ;fi
 ---> Running in 675f3fde37a6
Processing ./bundled_pip_dependencies/BentoML-0.9.0rc0+3.gcebf2015.tar.gz
  Installing build dependencies: started
  Installing build dependenci

Building wheels for collected packages: BentoML
  Building wheel for BentoML (PEP 517): started
  Building wheel for BentoML (PEP 517): finished with status 'done'
  Created wheel for BentoML: filename=BentoML-0.9.0rc0+3.gcebf2015-py3-none-any.whl size=3064091 sha256=b065620707c341cb307a75ee66e4c24eafeb59170d1bc3c344f040dcad703f83
  Stored in directory: /root/.cache/pip/wheels/a0/45/41/62152db705af4ff47e7a3d6abf6247986eef4aa1b94a58d3b9
Successfully built BentoML
Installing collected packages: BentoML
  Attempting uninstall: BentoML
    Found existing installation: BentoML 0.9.0rc0
    Uninstalling BentoML-0.9.0rc0:
      Successfully uninstalled BentoML-0.9.0rc0
Successfully installed BentoML-0.9.0rc0+3.gcebf2015
 ---> 9d4f82cb1019
Step 10/15 : ENV PORT 5000
 ---> Running in 676f60e8ba7c
 ---> 572602250521
Step 11/15 : EXPOSE $PORT
 -

In [7]:
!docker run --rm -p 5000:5000 irisclassifieronnx:20200922143300_5EB8CF

[2020-09-22 21:40:47,259] INFO - Starting BentoML API server in production mode..
[2020-09-22 21:40:47,615] INFO - get_gunicorn_num_of_workers: 3, calculated by cpu count
[2020-09-22 21:40:47 +0000] [1] [INFO] Starting gunicorn 20.0.4
[2020-09-22 21:40:47 +0000] [1] [INFO] Listening at: http://0.0.0.0:5000 (1)
[2020-09-22 21:40:47 +0000] [1] [INFO] Using worker: sync
[2020-09-22 21:40:47 +0000] [11] [INFO] Booting worker with pid: 11
[2020-09-22 21:40:47 +0000] [12] [INFO] Booting worker with pid: 12
[2020-09-22 21:40:47 +0000] [13] [INFO] Booting worker with pid: 13
^C
[2020-09-22 21:40:51 +0000] [1] [INFO] Handling signal: int
[2020-09-22 21:40:51 +0000] [11] [INFO] Worker exiting (pid: 11)
[2020-09-22 21:40:51 +0000] [13] [INFO] Worker exiting (pid: 13)
[2020-09-22 21:40:51 +0000] [12] [INFO] Worker exiting (pid: 12)


## Load saved BentoService

`bentoml.load` is the API for loading a BentoML packaged model in Python:

In [None]:
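import pandas as pd
from bentoml import load

# Load the BentoService bundle saved earlier in this notebook.
svc = load(saved_path)

# predict() calls df.to_numpy(), so pass a DataFrame rather than a plain list.
print(svc.predict(pd.DataFrame([[5.1, 3.5, 1.4, 0.2]])))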

## Launch inference job from CLI

The BentoML CLI supports loading and running a packaged model directly from the command line. With the `DataframeInput` adapter, the command can read input DataFrame data from a CLI argument or from local CSV or JSON files:

In [9]:
!bentoml run IrisClassifierOnnx:latest predict --input '[[5.1, 3.5, 1.4, 0.2], [5.1, 3.5, 1.4, 0.2]]'

[2020-09-22 14:41:28,131] INFO - Getting latest version IrisClassifierOnnx:20200922143300_5EB8CF
[2020-09-22 14:41:28,584] INFO - Using default docker base image: `None` specified inBentoML config file or env var. User must make sure that the docker base image either has Python 3.7 or conda installed.
[2020-09-22 14:41:29,074] INFO - Initializing onnxruntime InferenceSession from onnx file:'/Users/bozhaoyu/bentoml/repository/IrisClassifierOnnx/20200922143300_5EB8CF/IrisClassifierOnnx/artifacts/model.onnx'
[2020-09-22 14:41:33,065] INFO - {'service_name': 'IrisClassifierOnnx', 'service_version': '20200922143300_5EB8CF', 'api': 'predict', 'task': {'data': {}, 'task_id': '1c20a573-fe40-44b2-b28b-7dd907927996', 'batch': 2, 'cli_args': ('--input', '[[5.1, 3.5, 1.4, 0.2], [5.1, 3.5, 1.4, 0.2]]')}, 'result': {'data': '[0.0, 0.0]', 'http_status': 200, 'http_headers': (('Content-Type', 'application/json'),)}, 'request_id': '1c20a573-fe40-44b2-b28b-7dd907927996'}
[0.0, 0.0]
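When the rows come from Python data rather than a hand-typed string, the `--input` argument can be generated. The sketch below is a hypothetical helper (`build_run_command` is not part of BentoML) that serializes rows to JSON and quotes them for the shell, using only the `bentoml run ... --input` form shown above.

```python
import json
import shlex


def build_run_command(rows, bento="IrisClassifierOnnx:latest", api="predict"):
    """Return a shell-safe `bentoml run` command string for the given rows."""
    payload = json.dumps(rows)
    args = ["bentoml", "run", bento, api, "--input", payload]
    # shlex.quote adds single quotes only where the shell needs them.
    return " ".join(shlex.quote(arg) for arg in args)


command = build_run_command([[5.1, 3.5, 1.4, 0.2], [5.1, 3.5, 1.4, 0.2]])
print(command)
```

The printed command matches the one run in the cell above and can be pasted into a terminal or a notebook `!` cell.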


# Deployment Options

If you are on a small team with limited engineering or DevOps resources, try out automated deployment with the BentoML CLI, which currently supports AWS Lambda, AWS SageMaker, and Azure Functions:
- [AWS Lambda Deployment Guide](https://docs.bentoml.org/en/latest/deployment/aws_lambda.html)
- [AWS SageMaker Deployment Guide](https://docs.bentoml.org/en/latest/deployment/aws_sagemaker.html)
- [Azure Functions Deployment Guide](https://docs.bentoml.org/en/latest/deployment/azure_functions.html)

If the cloud platform you are working with is not on the list above, try out these step-by-step guides on manually deploying a BentoML packaged model to cloud platforms:
- [AWS ECS Deployment](https://docs.bentoml.org/en/latest/deployment/aws_ecs.html)
- [Google Cloud Run Deployment](https://docs.bentoml.org/en/latest/deployment/google_cloud_run.html)
- [Azure container instance Deployment](https://docs.bentoml.org/en/latest/deployment/azure_container_instance.html)
- [Heroku Deployment](https://docs.bentoml.org/en/latest/deployment/heroku.html)

Lastly, if you have a DevOps or ML Engineering team that operates a Kubernetes or OpenShift cluster, use the following guides as references for implementing your deployment strategy:
- [Kubernetes Deployment](https://docs.bentoml.org/en/latest/deployment/kubernetes.html)
- [Knative Deployment](https://docs.bentoml.org/en/latest/deployment/knative.html)
- [Kubeflow Deployment](https://docs.bentoml.org/en/latest/deployment/kubeflow.html)
- [KFServing Deployment](https://docs.bentoml.org/en/latest/deployment/kfserving.html)
- [Clipper.ai Deployment Guide](https://docs.bentoml.org/en/latest/deployment/clipper.html)