{"cells":[{"cell_type":"markdown","metadata":{"collapsed":true,"id":"75feab07-55ef-4436-ab6c-48d5c2d133e4"},"source":["<img src=\"https://github.com/pmservice/ai-openscale-tutorials/raw/master/notebooks/images/banner.png\" align=\"left\" alt=\"banner\">"]},{"cell_type":"markdown","metadata":{"id":"b18c0eec7d1c46f6b8fe4c779c478b7b"},"source":["# Monitor Custom Machine Learning engine with Watson OpenScale"]},{"attachments":{},"cell_type":"markdown","metadata":{"id":"fc3ebd630e524b3e812e2d17d603e5c9"},"source":["This notebook will configure an OpenScale data mart subscription for a Custom ML Provider deployment. We will then configure and execute the fairness, explainability, quality and drift monitors."]},{"attachments":{},"cell_type":"markdown","metadata":{"id":"7438bf7721dd4f3b8f2f8a7fa8559c0b"},"source":["## Custom Machine Learning Provider Setup\n","The following code can be used to start a gunicorn/Flask application hosted in a VM so that it is accessible from the CPD system.\n","This code does the following:\n","* It wraps a Watson Machine Learning model that is deployed to a space.\n","* So the hosting application URL should contain the SPACE ID and the DEPLOYMENT ID. 
These IDs are then used to talk to the target WML model/deployment.\n","* That said, this setup is only for the purposes of this tutorial; you can define your Custom ML provider endpoint in any fashion you want, as long as it wraps your own custom ML engine.\n","* The scoring request and response payload should conform to the schema described at: https://dataplatform.cloud.ibm.com/docs/content/wsj/model/wos-frameworks-custom.html\n","* To start the application using the code below, make sure you install the following Python packages in your VM:\n","\n","```\n","python -m pip install gunicorn\n","python -m pip install flask\n","python -m pip install numpy\n","python -m pip install pandas\n","python -m pip install requests\n","python -m pip install joblib==0.11\n","python -m pip install scipy==0.19.1\n","python -m pip install --user numpy scipy matplotlib ipython jupyter pandas sympy nose\n","python -m pip install ibm_watson_machine_learning\n","```\n","-----------------\n","\n","\n","\n","```\n","from flask import Flask, request, abort, jsonify\n","from ibm_watson_machine_learning import APIClient\n","from ibm_cloud_sdk_core.authenticators import IAMAuthenticator\n","import os\n","\n","app = Flask(__name__)\n","\n","# implement two APIs here: https://aiopenscale-custom-deployement-spec.mybluemix.net/#/Deployments/post_v1_deployments__deployment_id__online\n","# - /v1/deployments:\n","#       Lists all deployments\n","#\n","# - /v1/deployments/{deployment_id}/online:\n","#       Makes an online prediction\n","\n","@app.route('/spaces/<space_id>/v1/deployments/<deployment_id>/online', methods=['POST'])\n","def wml_online(space_id, deployment_id):\n","    if not request.json:\n","        print(\"not json - reject\")\n","        abort(400)\n","\n","    if 'APIKEY' not in os.environ:\n","        print(\"no APIKEY, system error\")\n","        abort(500)\n","\n","    payload_scoring = {\n","        \"input_data\": [\n","            request.json\n","        ]\n","    
}\n","\n","    wml_client = APIClient(wml_credentials={\n","        \"url\": \"https://us-south.ml.cloud.ibm.com\",\n","        'apikey': os.environ['APIKEY']}\n","    )\n","\n","    wml_client.set.default_space(space_id)\n","\n","    scoring_response = wml_client.deployments.score(\n","        deployment_id, payload_scoring)\n","\n","    return jsonify(scoring_response[\"predictions\"][0])\n","\n","# Fill in the deployment ID and asset GUID of the wrapped WML deployment\n","deployment_id=''\n","asset_guid=''\n","@app.route('/spaces/<space_id>/v1/deployments', methods=['GET'])\n","def deployments(space_id):\n","    # This API endpoint is optional.\n","    # It should list all the deployed models in your custom environment.\n","\n","    # If deploying this app on IBM Code Engine, the hostname can be determined by:\n","    # hostname = 'https://' + os.environ['CE_APP'] + '.' + os.environ['CE_SUBDOMAIN'] + '.' + os.environ['CE_DOMAIN']\n","\n","    # If deploying on a VM, change the hostname to the VM's IP that the OpenScale service instance can access\n","    hostname = \"http://yyy.ddd.sss\"\n","    return {\n","        \"count\": 1,\n","        \"resources\": [\n","            {\n","                \"metadata\": {\n","                    \"guid\": deployment_id,\n","                    \"created_at\": \"2022-10-14T17:31:58.350Z\",\n","                    \"modified_at\": \"2022-10-14T17:31:58.350Z\"\n","                },\n","                \"entity\": {\n","                    \"name\": \"openscale-german-credit\",\n","                    \"description\": \"custom ml engine\",\n","                    \"scoring_url\": hostname + \"/spaces/\" + space_id + \"/v1/deployments/\" + deployment_id + \"/online\",\n","                    \"asset\": {\n","                        \"guid\": asset_guid,\n","                        \"url\": hostname + \"/spaces/\" + space_id + \"/v1/deployments/\" + deployment_id + \"/online\",\n","                        \"name\": \"openscale-german-credit\"\n","                    },\n","                    \"asset_properties\": {\n","          
              \"problem_type\": \"binary\",\n","                        \"predicted_target_field\": \"prediction\",\n","                        \"input_data_type\": \"structured\",\n","                    }\n","                }\n","            }\n","        ]\n","    }\n","\n","if __name__ == '__main__':\n","    app.run(debug=True, port=5001,host=\"0.0.0.0\")\n","```\n","-----------------"]},{"cell_type":"markdown","metadata":{},"source":["# Steps\n","\n","1. [Setup](#setup)\n","1. [Load and explore data](#load)\n","1. [Configure OpenScale](#configure)\n","1. [Score the model so we can configure monitors ](#score)\n","1. [Fairness configuration](#Fairness)\n","1. [Explainability configuration](#explain)\n","1. [Quality monitoring and feedback logging](#quality)\n","1. [Drift configuration](#drift)"]},{"cell_type":"markdown","metadata":{"id":"262ffb75c64b4523bfda24ea95d4b8f5"},"source":["# 1.0 Setup <a name=\"setup\"></a>"]},{"cell_type":"markdown","metadata":{"id":"3b7c0a42ac1e43139057e5d78abfe075"},"source":["## 1.1 Package installation"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"827d92b4b8e1491e81d23153816f555c"},"outputs":[],"source":["import warnings\n","warnings.filterwarnings('ignore')"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"dc441429a8ad4aada3fa29d81e98e5cf","scrolled":false},"outputs":[],"source":["# IF you are not using IBM Watson Studio then install the below packages\n","!pip install --upgrade pandas==1.3.4 --no-cache | tail -n 1\n","!pip install --upgrade requests==2.28.1 --no-cache | tail -n 1\n","!pip install numpy==1.23.1 --no-cache | tail -n 1\n","!pip install scikit-learn==1.0.2 --no-cache | tail -n 1\n","!pip install SciPy==1.8.1 --no-cache | tail -n 1\n","!pip install --upgrade ibm-watson-machine-learning==1.0.264 --user | tail -n 1\n","!pip install --upgrade ibm-watson-openscale==3.0.27 --no-cache | tail -n 1\n","!pip install --upgrade ibm_wos_utils==4.6.1.3 --no-cache | tail -n 1\n","!pip install 
--upgrade ibm-cos-sdk"]},{"cell_type":"markdown","metadata":{"id":"44f0e2adc53c4f04ba9647b031d88cb9"},"source":["## Action: restart the kernel!"]},{"cell_type":"markdown","metadata":{"id":"f8cdd3fa70d94b9c8e3bea9fc4c65e9f"},"source":["## Configure credentials"]},{"cell_type":"markdown","metadata":{},"source":["\n","### Provide your IBM Cloud API key\n","Since we are using Watson OpenScale in the public cloud, we need an IBM Cloud API key to access the Watson OpenScale service.\n","\n","1. Generate an IBM Cloud API key on the [API Keys page in the IBM Cloud console](https://cloud.ibm.com/iam/apikeys).\n","2. Click **Create an IBM Cloud API key**. Provide a key name, and click **Create**, then copy the created key and paste it below. As a best practice, download the API key in addition to copying the key.\n","\n"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["#masked\n","ibmcloud_api_key = 'your apikey'"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["### REST API to support custom ML provider\n","If you run the Flask application in a VM, the scoring URL should look like:\n","`http://<VM hostname>:5001/spaces/<space id>/v1/deployments/<deployment id>/online`\n","\n","For Code Engine, it may look like:\n","`http://application-06.wqfpjpzgx8v.us-east.codeengine.appdomain.cloud/spaces/<space_id>/v1/deployments/<deployment_id>/online`\n","\n","The space id and deployment id can be found in the WML model/deployment."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"93f35ebd184946f7a659713d7ca13b0f"},"outputs":[],"source":["# masked\n","# set the URL below to your scoring API, the online REST endpoint that supports the custom ML provider\n","CUSTOM_ML_PROVIDER_SCORING_URL = 'http://<your host ip>:5001/spaces/<space id>/v1/deployments/<deployment id>/online'\n","scoring_url = 
CUSTOM_ML_PROVIDER_SCORING_URL"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["IAM_URL=\"https://iam.ng.bluemix.net/oidc/token\""]},{"cell_type":"markdown","metadata":{},"source":["### Create IBM Cloud Object Storage service credential\n","A service credential provides the necessary information to connect an application to Object Storage, packaged in a JSON document.\n","1. Log in to the IBM Cloud console and navigate to your instance of Object Storage.\n","2. In the side navigation, click **Service Credentials**.\n","3. Click **New credential** and select the **Manager** role.\n","4. Click **Add** to generate the service credential."]},{"cell_type":"markdown","metadata":{},"source":["### Cloud object storage details\n","\n","In the next cells, you will need to paste some credentials from the new service credential JSON document."]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# masked\n","\n","COS_API_KEY_ID = \"your service credential apikey\" # apikey. 
eg \"W00YixxxxxxxxxxMB-odB-2ySfTrFBIQQWanc--P3byk\"\n","COS_ENDPOINT = \"https://s3.us-south.cloud-object-storage.appdomain.cloud\" # endpoint; current list available at https://control.cloud-object-storage.cloud.ibm.com/v2/endpoints\n","COS_RESOURCE_INSTANCE_ID=\"your resource instance id\" # resource_instance_id"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["import os.path, uuid\n","import ibm_boto3\n","from ibm_botocore.client import Config, ClientError\n","\n","# Create resource\n","cos = ibm_boto3.resource(\"s3\",\n","    ibm_api_key_id=COS_API_KEY_ID,\n","    ibm_service_instance_id=COS_RESOURCE_INSTANCE_ID,\n","    config=Config(signature_version=\"oauth\"),\n","    endpoint_url=COS_ENDPOINT\n",")\n","\n","def create_bucket(bucket_name):\n","    print(\"Creating new bucket: {0}\".format(bucket_name))\n","    try:\n","        cos.create_bucket(Bucket=bucket_name)\n","        print(\"Bucket: {0} created!\".format(bucket_name))\n","    except ClientError as be:\n","        print(\"CLIENT ERROR: {0}\\n\".format(be))\n","    except Exception as e:\n","        print(\"Unable to create bucket: {0}\".format(e))\n","\n","        \n","def upload_file(bucket_name, item_name):\n","    print(\"Creating new item: {0}\".format(item_name))\n","    try:\n","        cos.Object(bucket_name,item_name).upload_file(item_name)\n","        print(\"Item: {0} created!\".format(item_name))\n","    except ClientError as be:\n","        print(\"CLIENT ERROR: {0}\\n\".format(be))\n","    except Exception as e:\n","        print(\"Unable to create text file: {0}\".format(e))\n","\n","def get_bucket():\n","    file_name = \"bucket.txt\"\n","    bucket_name = \"\"\n","    if os.path.exists(file_name) is False:\n","        with open(file_name, \"w\") as f:\n","            f.write(str(uuid.uuid4()))\n","    with open(file_name, 'r') as f:\n","        bucket_name=f.readline()\n","        create_bucket(bucket_name)\n","    return 
bucket_name"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# Create a new bucket with a unique name\n","BUCKET_NAME=get_bucket()\n","COS_RESOURCE_CRN = COS_RESOURCE_INSTANCE_ID + \"bucket:\" + BUCKET_NAME"]},{"cell_type":"markdown","metadata":{"id":"056e2d289da0487c83e16983cde997ec"},"source":["# 2.0  Load and explore data <a name=\"load\"></a>"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"a74cc3aca27d406d88a25ab74a373157"},"outputs":[],"source":["FILE_NAME = 'german_credit_data_biased_training.csv'\n","!rm -f $FILE_NAME\n","!wget https://raw.githubusercontent.com/pmservice/ai-openscale-tutorials/master/assets/historical_data/german_credit_risk/wml/$FILE_NAME"]},{"attachments":{},"cell_type":"markdown","metadata":{},"source":["Upload the training data to the object storage. The data will be used to set up a subscription later."]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["\n","upload_file(bucket_name=BUCKET_NAME,item_name=FILE_NAME)"]},{"cell_type":"markdown","metadata":{"id":"ee69bf1da4a04865a6db14f13c46908d"},"source":["## 2.2 Construct the scoring payload"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["label_column=\"Risk\"\n","model_type = \"binary\""]},{"cell_type":"code","execution_count":null,"metadata":{"id":"852c1f20b4e2424dac75c61399d2164b"},"outputs":[],"source":["import pandas as pd\n","import requests\n","\n","df = pd.read_csv(\"german_credit_data_biased_training.csv\")\n","df.head()"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"3f50af2cb0f04203998c93f6caade4fb"},"outputs":[],"source":["cols_to_remove = [label_column]\n","def get_scoring_payload(no_of_records_to_score = 1):\n","\n","    for col in cols_to_remove:\n","        if col in df.columns:\n","            del df[col] \n","\n","    fields = df.columns.tolist()\n","    values = df[fields].values.tolist()\n","\n","    payload_scoring ={\"fields\": fields, 
\"values\": values[:no_of_records_to_score]}  \n","    return payload_scoring"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"042a3a8f7d5948a49be97970e96c2bab"},"outputs":[],"source":["#debug\n","payload_scoring = get_scoring_payload(1)\n","payload_scoring"]},{"cell_type":"markdown","metadata":{"id":"90f239341ee04e13acb5455607cd4f25"},"source":["## 2.3 Method to perform scoring"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"c71c1586d3d442ca95539dd853b7dafc"},"outputs":[],"source":["def custom_ml_scoring():\n","    header = {\"Content-Type\": \"application/json\"}\n","    print(scoring_url)\n","    scoring_response = requests.post(scoring_url, json=payload_scoring, headers=header)\n","    jsonify_scoring_response = scoring_response.json()\n","    return jsonify_scoring_response"]},{"cell_type":"markdown","metadata":{"id":"be29706f90b64587882f53c707269b81"},"source":["## 2.4 Method to perform payload logging"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"ed80603cfe4d44148752f0d02205b268"},"outputs":[],"source":["import uuid\n","scoring_id = None"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"89e174c4423346468545f47b39f9795f"},"outputs":[],"source":["from ibm_watson_openscale.supporting_classes.payload_record import PayloadRecord\n","def payload_logging(payload_scoring, scoring_response):\n","    scoring_id = str(uuid.uuid4())\n","    records_list=[]\n","    \n","    #manual PL logging for custom ml provider\n","    pl_record = PayloadRecord(scoring_id=scoring_id, request=payload_scoring, response=scoring_response, response_time=int(460))\n","    records_list.append(pl_record)\n","    wos_client.data_sets.store_records(data_set_id = payload_data_set_id, request_body=records_list)\n","    \n","    time.sleep(10)\n","    pl_records_count = wos_client.data_sets.get_records_count(payload_data_set_id)\n","    print(\"Number of records in the payload logging table: {}\".format(pl_records_count))\n","    
return scoring_id"]},{"cell_type":"markdown","metadata":{"id":"bd59a8e3d0bb431eb3be9252b1f3772c"},"source":["## 2.5 Score the model and print the scoring response\n","### Sample Scoring"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"9b1853a125a340e99c622b9f71167c9c"},"outputs":[],"source":["custom_ml_scoring()"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["# match the class label above\n","class_label='prediction'"]},{"cell_type":"markdown","metadata":{"id":"7056440604434d4f8b9429befee9bdce"},"source":["# 3.0 Configure OpenScale <a name=\"configure\"></a>\n","\n","The notebook will now import the necessary libraries and set up a Python OpenScale client."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"e577b95129004af19aeccbb4ae4a7862"},"outputs":[],"source":["from ibm_cloud_sdk_core.authenticators import IAMAuthenticator\n","from ibm_watson_openscale import *\n","from ibm_watson_openscale.supporting_classes.enums import *\n","from ibm_watson_openscale.supporting_classes import *\n","from ibm_watson_openscale.base_classes.watson_open_scale_v2 import ScoringEndpointRequest\n","\n","import json\n","import requests\n","import base64\n","import time"]},{"cell_type":"markdown","metadata":{"id":"3043544c667e49d38cd70855d8ccb5ad"},"source":["## 3.1 Get an instance of the OpenScale SDK client"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"d3d5b4df0a9f4d598a6cbd595700ae53"},"outputs":[],"source":["authenticator = IAMAuthenticator(apikey=ibmcloud_api_key)\n","wos_client = APIClient(authenticator=authenticator)\n","wos_client.version"]},{"cell_type":"markdown","metadata":{"id":"5dbc59b8112144b8a06e6c41a8301f0a"},"source":["## Set up Database\n","\n","Watson OpenScale uses a database to store payload logs and calculated metrics. If database credentials were not supplied above, the notebook will use the free, internal lite database. 
If database credentials were supplied, that database will be used unless a data mart already exists in the OpenScale instance.\n","\n","Prior instances of the model will be removed from OpenScale monitoring."]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["DB_CREDENTIALS = None\n","SCHEMA_NAME = None"]},{"cell_type":"code","execution_count":null,"metadata":{},"outputs":[],"source":["data_marts = wos_client.data_marts.list().result.data_marts\n","if len(data_marts) == 0:\n","    if DB_CREDENTIALS is not None:\n","        if SCHEMA_NAME is None:\n","            raise ValueError(\"Please specify the SCHEMA_NAME and rerun the cell\")\n","\n","        print('Setting up external datamart')\n","        added_data_mart_result = wos_client.data_marts.add(\n","                background_mode=False,\n","                name=\"WOS Data Mart\",\n","                description=\"Data Mart created by Industry Accelerator\",\n","                database_configuration=DatabaseConfigurationRequest(\n","                  database_type=DatabaseType.POSTGRESQL,\n","                    credentials=PrimaryStorageCredentialsLong(\n","                        hostname=DB_CREDENTIALS['hostname'],\n","                        username=DB_CREDENTIALS['username'],\n","                        password=DB_CREDENTIALS['password'],\n","                        db=DB_CREDENTIALS['database'],\n","                        port=DB_CREDENTIALS['port'],\n","                        ssl=True,\n","                        sslmode=DB_CREDENTIALS['sslmode'],\n","                        certificate_base64=DB_CREDENTIALS['certificate_base64']\n","                    ),\n","                    location=LocationSchemaName(\n","                        schema_name= SCHEMA_NAME\n","                    )\n","                )\n","             ).result\n","    else:\n","        print('Setting up internal datamart')\n","        added_data_mart_result = wos_client.data_marts.add(\n","              
  background_mode=False,\n","                name=\"WOS Data Mart\",\n","                description=\"Data Mart created by WOS tutorial notebook\", \n","                internal_database = True).result\n","        \n","    data_mart_id = added_data_mart_result.metadata.id\n","    \n","else:\n","    data_mart_id=data_marts[0].metadata.id\n","    print('Using existing datamart {}'.format(data_mart_id))\n","    "]},{"cell_type":"code","execution_count":null,"metadata":{"id":"ca368ed3793443da944e24b8b5e0d281"},"outputs":[],"source":["wos_client.data_marts.show()"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"2966db92a2994a958136635144f05572"},"outputs":[],"source":["data_marts = wos_client.data_marts.list().result.data_marts\n","if len(data_marts) == 0:\n","    raise Exception(\"Missing data mart.\")\n","data_mart_id=data_marts[0].metadata.id\n","print('Using existing datamart {}'.format(data_mart_id))"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"5862cdb82a534375b7e290cee338774e"},"outputs":[],"source":["data_mart_details = wos_client.data_marts.list().result.data_marts[0]\n","data_mart_details.to_dict()"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"b61551bb00fa430b8595748c93a2dfb9"},"outputs":[],"source":["wos_client.service_providers.show()"]},{"cell_type":"markdown","metadata":{"id":"b1ecf806072240cd89af4324cb844744"},"source":["## 3.3 Remove existing service provider connected with used WML instance.\n","\n","Watson OpenScale allows multiple service providers for the same engine instance. 
To avoid accumulating multiple service providers for the same instance in this tutorial notebook, the following code deletes any existing service provider(s) with the same name and then adds a new one."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"34534f784ac845438b54c6947d227ccb"},"outputs":[],"source":["SERVICE_PROVIDER_NAME = \"Custom ML Provider Demo - All Monitors\"\n","SERVICE_PROVIDER_DESCRIPTION = \"Added by tutorial WOS notebook to showcase monitoring Fairness, Quality, Drift and Explainability against a Custom ML provider.\""]},{"cell_type":"code","execution_count":null,"metadata":{"id":"ebd41ed59d0a44b3837152bdd67e10fc"},"outputs":[],"source":["service_providers = wos_client.service_providers.list().result.service_providers\n","for service_provider in service_providers:\n","    service_instance_name = service_provider.entity.name\n","    if service_instance_name == SERVICE_PROVIDER_NAME:\n","        service_provider_id = service_provider.metadata.id\n","        wos_client.service_providers.delete(service_provider_id)\n","        print(\"Deleted existing service_provider for WML instance: {}\".format(service_provider_id))"]},{"cell_type":"markdown","metadata":{"id":"48cf5770290744c28274669a5af0103b"},"source":["## 3.4 Add service provider\n","\n","Watson OpenScale needs to be bound to the Watson Machine Learning instance to capture payload data into and out of the model.\n","Note: You can bind more than one engine instance if needed by calling the `wos_client.service_providers.add` method. 
You can then refer to a particular service provider using its service_provider_id."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"2d260d321cad475682aea0857c9b1054"},"outputs":[],"source":["request_headers = {\"Content-Type\": \"application/json\"}\n","MLCredentials = {}\n","added_service_provider_result = wos_client.service_providers.add(\n","        name=SERVICE_PROVIDER_NAME,\n","        description=SERVICE_PROVIDER_DESCRIPTION,\n","        service_type=ServiceTypes.CUSTOM_MACHINE_LEARNING,\n","        request_headers=request_headers,\n","        operational_space_id = \"production\",\n","        credentials=MLCredentials,\n","        background_mode=False\n","    ).result\n","service_provider_id = added_service_provider_result.metadata.id"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"eae202864f724fb9880ed31fadf75ac0"},"outputs":[],"source":["print(wos_client.service_providers.get(service_provider_id).result)"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"fba89f425d944c439b6950e285e0b6c0"},"outputs":[],"source":["print('Data Mart ID : ' + data_mart_id)\n","print('Service Provider ID : ' + service_provider_id)"]},{"cell_type":"markdown","metadata":{"id":"09f4993d2caa48269b9832c7e9b145d6"},"source":["## 3.5 Subscriptions"]},{"cell_type":"markdown","metadata":{"id":"2e59906593c44bda97f79a9c35e728f6"},"source":["Remove existing credit risk subscriptions\n","\n","This code removes previous subscriptions to the model to refresh the monitors with the new model and new data."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"6b97d781684b401095188860bd0a3c96"},"outputs":[],"source":["wos_client.subscriptions.show()"]},{"cell_type":"markdown","metadata":{"id":"b0d085c8a875458b834992abd3577219"},"source":["## 3.6 Remove the existing subscription"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"3084501d8ecc482282eaa59a7c4327f9"},"outputs":[],"source":["SUBSCRIPTION_NAME = \"Custom ML Subscription - 
All Monitors\""]},{"cell_type":"code","execution_count":null,"metadata":{"id":"f5ef621a07cb4e468fca6cc62d22595e"},"outputs":[],"source":["subscriptions = wos_client.subscriptions.list().result.subscriptions\n","for subscription in subscriptions:\n","    if subscription.entity.asset.name == \"[asset] \" + SUBSCRIPTION_NAME:\n","        sub_model_id = subscription.metadata.id\n","        wos_client.subscriptions.delete(subscription.metadata.id)\n","        print('Deleted existing subscription for model', sub_model_id)"]},{"cell_type":"markdown","metadata":{"id":"d117477d751b4d6380d2feceac14450e"},"source":["This code creates the model subscription in OpenScale using the Python client API. Note that we need to provide the model unique identifier, and some information about the model itself."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"5172e076895e4a8f8844452a9a538ff0"},"outputs":[],"source":["feature_columns=[\"CheckingStatus\",\"LoanDuration\",\"CreditHistory\",\"LoanPurpose\",\"LoanAmount\",\"ExistingSavings\",\"EmploymentDuration\",\"InstallmentPercent\",\"Sex\",\"OthersOnLoan\",\"CurrentResidenceDuration\",\"OwnsProperty\",\"Age\",\"InstallmentPlans\",\"Housing\",\"ExistingCreditsCount\",\"Job\",\"Dependents\",\"Telephone\",\"ForeignWorker\"]\n","cat_features=[\"CheckingStatus\",\"CreditHistory\",\"LoanPurpose\",\"ExistingSavings\",\"EmploymentDuration\",\"Sex\",\"OthersOnLoan\",\"OwnsProperty\",\"InstallmentPlans\",\"Housing\",\"Job\",\"Telephone\",\"ForeignWorker\"]"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"e2df727978ef4fbe8062c30a63944b6d"},"outputs":[],"source":["import uuid\n","asset_id = str(uuid.uuid4())\n","asset_name = '[asset] ' + SUBSCRIPTION_NAME\n","url = scoring_url\n","\n","asset_deployment_id = str(uuid.uuid4())\n","asset_deployment_name = asset_name\n","asset_deployment_scoring_url = scoring_url\n","\n","scoring_endpoint_url = scoring_url\n","scoring_request_headers = {\n","        \"Content-Type\": 
\"application/json\",\n","    }"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"1de20abba6d84c6c8ba84cd3ed3bc96e"},"outputs":[],"source":["\n","subscription_details = wos_client.subscriptions.add(\n","        data_mart_id=data_mart_id,\n","        service_provider_id=service_provider_id,\n","        asset=Asset(\n","            asset_id=asset_id,\n","            name=asset_name,\n","            url=scoring_endpoint_url,\n","            asset_type=AssetTypes.MODEL,\n","            input_data_type=InputDataType.STRUCTURED,\n","            problem_type=ProblemType.BINARY_CLASSIFICATION\n","        ),\n","        deployment=AssetDeploymentRequest(\n","            deployment_id=asset_deployment_id,\n","            name=asset_deployment_name,\n","            deployment_type= DeploymentTypes.ONLINE,\n","            scoring_endpoint=ScoringEndpointRequest(\n","                url=scoring_endpoint_url,\n","                request_headers=scoring_request_headers\n","            )\n","        ),\n","        asset_properties=AssetPropertiesRequest(\n","            label_column=label_column,\n","            probability_fields=[\"probability\"],\n","            prediction_field=class_label,\n","            feature_fields = feature_columns,\n","            categorical_fields = cat_features,\n","            training_data_reference=TrainingDataReference(type=\"cos\",\n","                                                          location=COSTrainingDataReferenceLocation(bucket = BUCKET_NAME,\n","                                                                                                    file_name = FILE_NAME),\n","                                                          connection=COSTrainingDataReferenceConnection.from_dict({\n","                                                                        \"resource_instance_id\": COS_RESOURCE_CRN,\n","                                                                        \"url\": COS_ENDPOINT,\n","              
                                                          \"api_key\": COS_API_KEY_ID,\n","                                                                        \"iam_url\": IAM_URL}))\n","        )\n","    ).result\n","subscription_id = subscription_details.metadata.id"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"4ebafebf55ca4fd482b5cdb3c4d411bf"},"outputs":[],"source":["print('Subscription ID: ' + subscription_id)"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"9dee41fc8ae2407483f7d08f7ecd9663"},"outputs":[],"source":["time.sleep(5)\n","payload_data_set_id = None\n","payload_data_set_id = wos_client.data_sets.list(type=DataSetTypes.PAYLOAD_LOGGING, \n","                                                target_target_id=subscription_id, \n","                                                target_target_type=TargetTypes.SUBSCRIPTION).result.data_sets[0].metadata.id\n","if payload_data_set_id is None:\n","    print(\"Payload data set not found. Please check subscription status.\")\n","else:\n","    print(\"Payload data set id:\", payload_data_set_id)"]},{"cell_type":"markdown","metadata":{"id":"ff19b2c4f74c4cddb732adca319b7a72"},"source":["### Before the payload logging"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"5ba479997dca4e2899796ba7c050fed5"},"outputs":[],"source":["wos_client.subscriptions.get(subscription_id).result.to_dict()"]},{"cell_type":"markdown","metadata":{"id":"2d75c57ab8914d70acdc3ed523675b1b"},"source":["# 4.0 Score the model so we can configure monitors <a name=\"score\"></a>\n","\n","Now that the WML service has been bound and the subscription has been created, we need to send a request to the model before we configure OpenScale. 
This allows OpenScale to create a payload log in the datamart with the correct schema, so it can capture data coming into and out of the model."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"a934eaf8340f4bd4b43bbdae75b78fc9"},"outputs":[],"source":["no_of_records_to_score = 100"]},{"cell_type":"markdown","metadata":{"id":"b43d456e976949069a6e46a1fc37dfdc"},"source":["### Construct the scoring payload"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"a5299ea291c846299adc2cd576429608"},"outputs":[],"source":["payload_scoring = get_scoring_payload(no_of_records_to_score)\n","payload_scoring"]},{"cell_type":"markdown","metadata":{"id":"a32a41e273d44e398b223639ee79215c"},"source":["### Perform the scoring against the Custom ML Provider"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"f2313593ada1476c8a8097412ccb624f"},"outputs":[],"source":["scoring_response = custom_ml_scoring()\n","scoring_response"]},{"cell_type":"markdown","metadata":{"id":"e5aadcc1ea024713bbbe5737ca477f9e"},"source":["### Perform payload logging by passing the scoring payload and scoring response"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"14ef8cf50f5748589f34ab3c15fe5d2f"},"outputs":[],"source":["scoring_id = payload_logging(payload_scoring, scoring_response)"]},{"cell_type":"markdown","metadata":{"id":"71256d791a85409ebe16b2bf47074b70"},"source":["### The scoring id, which would be later used for explanation of the randomly picked transactions"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"0521631eb6bc4cf285c58ae798c0cfdc"},"outputs":[],"source":["print('scoring_id: ' + str(scoring_id))"]},{"cell_type":"markdown","metadata":{"id":"1b7feec4ae804004829333126c78f49f"},"source":["# 5.0 Fairness configuration <a name=\"Fairness\"></a>"]},{"cell_type":"markdown","metadata":{"id":"70a9089343c54117a3efe9d66de7c963"},"source":["The code below configures fairness monitoring for our model. 
It turns on monitoring for two features, Sex and Age. In each case, we must specify:\n","\n","* Which model feature to monitor.\n","* One or more majority groups, which are values of that feature that we expect to receive a higher percentage of favourable outcomes.\n","* One or more minority groups, which are values of that feature that we expect to receive a higher percentage of unfavourable outcomes.\n","* The threshold below which OpenScale should display an alert for the fairness measurement (in this case, 80% overall, with a feature-specific value of 95% for Sex and Age).\n","* Which outcomes from the model are favourable outcomes, and which are unfavourable.\n","* The number of records OpenScale will use to calculate the fairness score. In this case, OpenScale's fairness monitor will run hourly, but will not calculate a new fairness rating until at least 100 records have been added.\n","\n","Finally, to calculate fairness, OpenScale must perform some calculations on the training data, so we provide the dataframe containing the data."]},{"cell_type":"markdown","metadata":{"id":"ce78948191184cd98a4c2145791595bd"},"source":["## Create Fairness Monitor Instance"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"1d513bda1f814014883078af4e8ebf48"},"outputs":[],"source":["target = Target(\n","    target_type=TargetTypes.SUBSCRIPTION,\n","    target_id=subscription_id\n","\n",")\n","parameters = {\n","    \"class_label\": \"Risk\",\n","    \"features\": [\n","        {\"feature\": \"Sex\",\n","         \"majority\": ['male'],\n","         \"minority\": ['female']\n","         },\n","        {\"feature\": \"Age\",\n","         \"majority\": [[26, 75]],\n","         \"minority\": [[18, 25]]\n","         }\n","    ],\n","    \"favourable_class\": [\"No Risk\"],\n","    \"unfavourable_class\": [\"Risk\"],\n","    \"min_records\": 100\n","}\n","thresholds = [{\n","    \"metric_id\": \"fairness_value\",\n","    \"specific_values\": [{\n","            \"applies_to\": [{\n","                \"key\": 
\"feature\",\n","                \"type\": \"tag\",\n","                \"value\": \"Age\"\n","            }],\n","            \"value\": 95\n","        },\n","        {\n","            \"applies_to\": [{\n","                \"key\": \"feature\",\n","                \"type\": \"tag\",\n","                \"value\": \"Sex\"\n","            }],\n","            \"value\": 95\n","        }\n","    ],\n","    \"type\": \"lower_limit\",\n","    \"value\": 80.0\n","}]\n","\n","fairness_monitor_details = wos_client.monitor_instances.create(\n","    data_mart_id=data_mart_id,\n","    background_mode=False,\n","    monitor_definition_id=wos_client.monitor_definitions.MONITORS.FAIRNESS.ID,\n","    target=target,\n","    parameters=parameters,\n","    thresholds=thresholds).result"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"03546eac65224455bab6cf2b03ea25b9"},"outputs":[],"source":["fairness_monitor_instance_id = fairness_monitor_details.metadata.id"]},{"cell_type":"markdown","metadata":{"id":"16b061046d6f41e28d099b5b87787047"},"source":["### Get Fairness Monitor Instance"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"51afb3e06284403a89713b5057346290"},"outputs":[],"source":["wos_client.monitor_instances.show()"]},{"cell_type":"markdown","metadata":{"id":"56aa0450e8814fb7b4f11936bf887693"},"source":["### Get run details\n","In case of production subscription, initial monitoring run is triggered internally. 
Checking its status"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"521237eca73d4164a85d4eff503807ef"},"outputs":[],"source":["runs = wos_client.monitor_instances.list_runs(fairness_monitor_instance_id, limit=1).result.to_dict()\n","fairness_monitoring_run_id = runs[\"runs\"][0][\"metadata\"][\"id\"]\n","run_status = None\n","while(run_status not in [\"finished\", \"error\"]):\n","    run_details = wos_client.monitor_instances.get_run_details(fairness_monitor_instance_id, fairness_monitoring_run_id).result.to_dict()\n","    run_status = run_details[\"entity\"][\"status\"][\"state\"]\n","    print('run_status: ', run_status)\n","    if run_status in [\"finished\", \"error\"]:\n","        break\n","    time.sleep(10)"]},{"cell_type":"markdown","metadata":{"id":"a2defd166d3d4ecc9dbb21dc687b6f99"},"source":["### Fairness run output"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"c720b45fef0e4eadbb0e1185b56a2ec4"},"outputs":[],"source":["wos_client.monitor_instances.get_run_details(fairness_monitor_instance_id, fairness_monitoring_run_id).result.to_dict()"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"f6942554cdd84c8394e3e3168cbdd572"},"outputs":[],"source":["wos_client.monitor_instances.show_metrics(monitor_instance_id=fairness_monitor_instance_id)"]},{"cell_type":"markdown","metadata":{"id":"053fcec298ea4fb4bed192eb21672585"},"source":["# 6.0 Configure Explainability <a name=\"explain\"></a>\n","We provide OpenScale with the training data to enable and configure the explainability features."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"884370b4c6f7432783928bbb296e5b3c"},"outputs":[],"source":["target = Target(\n","    target_type=TargetTypes.SUBSCRIPTION,\n","    target_id=subscription_id\n",")\n","parameters = {\n","    \"enabled\": True\n","}\n","explain_monitor_details = wos_client.monitor_instances.create(\n","    data_mart_id=data_mart_id,\n","    background_mode=False,\n","    
monitor_definition_id=wos_client.monitor_definitions.MONITORS.EXPLAINABILITY.ID,\n","    target=target,\n","    parameters=parameters\n",").result\n","\n","explain_monitor_details.metadata.id"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"e03a4335fce946eda759c038df97cb1b"},"outputs":[],"source":["scoring_ids = []\n","sample_size = 2\n","import random\n","for i in range(0, sample_size):\n","    n = random.randint(1,100)\n","    scoring_ids.append(scoring_id + '-' + str(n))\n","print(\"Running explanations on scoring IDs: {}\".format(scoring_ids))"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"a5c91ed0f5114985830a0e72084fd950"},"outputs":[],"source":["explanation_types = [\"lime\", \"contrastive\"]\n","result = wos_client.monitor_instances.explanation_tasks(scoring_ids=scoring_ids, explanation_types=explanation_types, subscription_id=subscription_id).result\n","print(result)"]},{"cell_type":"markdown","metadata":{"id":"803d5d32f30941979eea07f690714fe0"},"source":["### Explanation tasks"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"3dbbc996ee204f21a62fd9bf2f1d8de0"},"outputs":[],"source":["explanation_task_ids=result.metadata.explanation_task_ids\n","explanation_task_ids"]},{"cell_type":"markdown","metadata":{"id":"751895d521cb443f84e8eb1f3483a6ee"},"source":["### Wait for the explanation tasks to complete - all of them"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"df48920698a74c37843464bc662eae2b"},"outputs":[],"source":["import time\n","def finish_explanation_tasks():\n","    finished_explanations = []\n","    finished_explanation_task_ids = []\n","    \n","    # Check for the explanation task status for finished status. \n","    # If it is in-progress state, then sleep for some time and check again. 
\n","    # Repeat this a few times so that all tasks have a chance to reach a final state.\n","    for i in range(0, 5):\n","        print('iteration ' + str(i))\n","        \n","        # Check the status of every explanation task that has not completed yet\n","        for explanation_task_id in explanation_task_ids:\n","            if explanation_task_id not in finished_explanation_task_ids:\n","                result = wos_client.monitor_instances.get_explanation_tasks(explanation_task_id=explanation_task_id, subscription_id=subscription_id).result\n","                print(explanation_task_id + ' : ' + result.entity.status.state)\n","                if result.entity.status.state in ('finished', 'error'):\n","                    finished_explanation_task_ids.append(explanation_task_id)\n","                    finished_explanations.append(result)\n","\n","        # If at least one explanation task has not completed yet, sleep for some time\n","        # and then check the remaining tasks again.\n","        if len(finished_explanation_task_ids) != len(explanation_task_ids):\n","            print('sleeping for some time..')\n","            time.sleep(10)\n","        else:\n","            break\n","\n","    return finished_explanations"]},{"cell_type":"markdown","metadata":{"id":"34ac1cd4fcb843268919b87a2c84e236"},"source":["### You may have to run the cell below multiple times until all explanation tasks have either finished or failed."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"c395cf23ff24424db500205ee9704677"},"outputs":[],"source":["finished_explanations = 
finish_explanation_tasks()"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"9333438de9dd43188886694e01f8f4ea"},"outputs":[],"source":["len(finished_explanations)"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"1592fa5ec3e24a21a5d81e50d65d2baf"},"outputs":[],"source":["def construct_explanation_features_map(feature_name, feature_weight):\n","    if feature_name in explanation_features_map:\n","        explanation_features_map[feature_name].append(feature_weight)\n","    else:\n","        explanation_features_map[feature_name] = [feature_weight]"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"4d22a34683d2474cafdc5b9f3d982472"},"outputs":[],"source":["explanation_features_map = {}\n","for result in finished_explanations:\n","    print('\\n>>>>>>>>>>>>>>>>>>>>>>\\n')\n","    print('explanation task: ' + str(result.metadata.explanation_task_id) + ', perturbed:' + str(result.entity.perturbed))\n","    if result.entity.explanations is not None:\n","        explanations = result.entity.explanations\n","        for explanation in explanations:\n","            if 'predictions' in explanation:\n","                predictions = explanation['predictions']\n","                for prediction in predictions:\n","                    predicted_value = prediction['value']\n","                    probability = prediction['probability']\n","                    print('prediction : ' + str(predicted_value) + ', probability : ' + str(probability))\n","                    if 'explanation_features' in prediction:\n","                        explanation_features = prediction['explanation_features']\n","                        for explanation_feature in explanation_features:\n","                            feature_name = explanation_feature['feature_name']\n","                            feature_weight = explanation_feature['weight']\n","                            if (feature_weight >= 0 ):\n","                                
feature_weight_percent = round(feature_weight * 100, 2)\n","                                print(str(feature_name) + ' : ' + str(feature_weight_percent))\n","                                task_feature_weight_map = {}\n","                                task_feature_weight_map[result.metadata.explanation_task_id] = feature_weight_percent\n","                                construct_explanation_features_map(feature_name, feature_weight_percent)\n","        print('\\n>>>>>>>>>>>>>>>>>>>>>>\\n')\n","explanation_features_map"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"b384c69b031247a78206dfb64f1f08df"},"outputs":[],"source":["import matplotlib.pyplot as plt\n","for key in explanation_features_map.keys():\n","    #plot_graph(key, explanation_features_map[key])\n","    values = explanation_features_map[key]\n","    plt.title(key)\n","    plt.ylabel('Weight')\n","    plt.bar(range(len(values)), values)\n","    plt.show()"]},{"cell_type":"markdown","metadata":{"id":"ec40b353d4694553a6402f8fd0cbe0b9"},"source":["# 7.0 Quality monitoring and feedback logging <a name=\"quality\"></a>"]},{"cell_type":"markdown","metadata":{"id":"d5ea1298d23c466a8d4dd526f2ceb9d3"},"source":["## 7.1 Enable quality monitoring"]},{"cell_type":"markdown","metadata":{"id":"7a58927065b041e185af41ba4b37bb97"},"source":["The code below can optionally wait ten seconds to allow the payload logging table to be set up before it begins enabling monitors (the `time.sleep(10)` call is commented out by default). It then turns on the quality (accuracy) monitor and sets an alert threshold of 80%. OpenScale will show an alert on the dashboard if the model accuracy measurement (area under the curve, in the case of a binary classifier) falls below this threshold.\n","\n","The `min_feedback_data_size` parameter specifies the minimum number of feedback records OpenScale needs before it calculates a new measurement. 
The quality monitor runs hourly, but the accuracy reading in the dashboard will not change until an additional 90 feedback records have been added, via the user interface, the Python client, or the supplied feedback endpoint."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"4490c55606904e228110ab51b3564e7d"},"outputs":[],"source":["import time\n","\n","#time.sleep(10)\n","target = Target(\n","        target_type=TargetTypes.SUBSCRIPTION,\n","        target_id=subscription_id\n",")\n","parameters = {\n","    \"min_feedback_data_size\": 90\n","}\n","thresholds = [\n","                {\n","                    \"metric_id\": \"area_under_roc\",\n","                    \"type\": \"lower_limit\",\n","                    \"value\": 0.80\n","                }\n","            ]\n","quality_monitor_details = wos_client.monitor_instances.create(\n","    data_mart_id=data_mart_id,\n","    background_mode=False,\n","    monitor_definition_id=wos_client.monitor_definitions.MONITORS.QUALITY.ID,\n","    target=target,\n","    parameters=parameters,\n","    thresholds=thresholds\n",").result"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"8aab8286132144d98d0201e4315e67ee"},"outputs":[],"source":["quality_monitor_instance_id = quality_monitor_details.metadata.id\n","quality_monitor_instance_id"]},{"cell_type":"markdown","metadata":{"id":"e6408ea6ef764a7591c98b2411ee1629"},"source":["## 7.2 Feedback logging"]},{"cell_type":"markdown","metadata":{"id":"e31f0d931ede4cd780b237e4633ef3c3"},"source":["The code below downloads and stores enough feedback data to meet the minimum threshold so that OpenScale can calculate a new accuracy measurement. It then kicks off the accuracy monitor. 
The monitors run hourly, or can be initiated via the Python API, the REST API, or the graphical user interface."]},{"cell_type":"code","execution_count":null,"metadata":{"id":"6b9bd3632fdf471989c51be3e1168618"},"outputs":[],"source":["!rm -f additional_feedback_data_v2.json\n","!wget https://raw.githubusercontent.com/IBM/watson-openscale-samples/main/IBM%20Cloud/WML/assets/data/credit_risk/additional_feedback_data_v2.json"]},{"cell_type":"markdown","metadata":{"id":"d8de92b2cd0740008e4a4759aa84e67e"},"source":["## 7.3 Get feedback logging dataset ID"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"3b53b5fa000b426d8b23e9f29ba11306"},"outputs":[],"source":["feedback_dataset_id = None\n","feedback_dataset = wos_client.data_sets.list(type=DataSetTypes.FEEDBACK, \n","                                                target_target_id=subscription_id, \n","                                                target_target_type=TargetTypes.SUBSCRIPTION).result\n","# Guard against an empty list so that the check below can report a missing data set\n","if feedback_dataset.data_sets:\n","    feedback_dataset_id = feedback_dataset.data_sets[0].metadata.id\n","if feedback_dataset_id is None:\n","    print(\"Feedback data set not found. 
Please check quality monitor status.\")"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"4199682e0b0d452c8c90b95a6778e775"},"outputs":[],"source":["with open('additional_feedback_data_v2.json') as feedback_file:\n","    additional_feedback_data = json.load(feedback_file)"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"11e24d67d93946bf839d96656f82479a"},"outputs":[],"source":["wos_client.data_sets.store_records(feedback_dataset_id, request_body=additional_feedback_data, background_mode=False)"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"86c64cbcb3f449b2b82df44cfb410692"},"outputs":[],"source":["wos_client.data_sets.get_records_count(data_set_id=feedback_dataset_id)"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"c36d23ee80d744098829b8303f52cf05"},"outputs":[],"source":["run_details = wos_client.monitor_instances.run(monitor_instance_id=quality_monitor_instance_id, background_mode=False).result"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"fb90e7a52420427c8414ea1dfe83746a"},"outputs":[],"source":["wos_client.monitor_instances.show_metrics(monitor_instance_id=quality_monitor_instance_id)"]},{"cell_type":"markdown","metadata":{"id":"9e84949cb7be4078839dfc38075658f4"},"source":["# 8.0 Drift configuration <a name=\"drift\"></a>"]},{"cell_type":"markdown","metadata":{"id":"d527f082a1964412b7a616f87f4e2570"},"source":["## 8.1 Drift detection model generation\n","\n","Please update the score function that is used to generate the drift detection model. Generating the model might take some time, depending on the size of the training dataset. The score function should output two arrays:\n","1. Array of model predictions\n","2. Array of probabilities\n","\n","- The user is expected to make sure that the data type of the selected \"class label\" column and the prediction column are the same. 
For example, if the class label is numeric, the prediction array should also be numeric.\n","\n","- Each entry of the probability array should contain the probabilities of all the unique class labels.\n","  For example, if model_type=multiclass and the unique class labels are A, B, C, D, each entry in the probability array should be an array of size 4, e.g. [ [50,30,10,10], [40,20,30,10], ... ]\n","  \n","**Note:**\n","- *The user is expected to add a \"score\" method, which should output the prediction column array and the probability column array.*\n","- *The data type of the label column and the prediction column should be the same. The user needs to make sure that the label column and the prediction column array have the same unique class labels.*\n","- **Please update the score function below with the help of the templates documented [here](https://github.com/IBM-Watson/aios-data-distribution/blob/master/Score%20function%20templates%20for%20drift%20detection.md)**"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"7181830b5181442e97f0da2258e6416a"},"outputs":[],"source":["import pandas as pd\n","\n","df = pd.read_csv(\"german_credit_data_biased_training.csv\")\n","df.head()"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"aa93a4cdde0b4e9886a9c52b742608e0"},"outputs":[],"source":["def score(training_data_frame):\n","    # The data type of the label column and the prediction column should be the same.\n","    # The label column and the prediction column array should have the same unique class labels.\n","    prediction_column_name = class_label\n","    probability_column_name = \"probability\"\n","    \n","    feature_columns = list(training_data_frame.columns)\n","    training_data_rows = training_data_frame[feature_columns].values.tolist()\n","    \n","    payload_scoring_records = {\n","          \"fields\": feature_columns,\n","          \"values\": [x for x in training_data_rows]\n","      }\n","    \n","    header = {\"Content-Type\": \"application/json\", 
\"x\":\"y\"}\n","    scoring_response_raw = requests.post(scoring_url, json=payload_scoring_records, headers=header, verify=False)\n","    scoring_response = scoring_response_raw.json()\n","\n","    probability_array = None\n","    prediction_vector = None\n","    \n","    prob_col_index = list(scoring_response.get('fields')).index(probability_column_name)\n","    predict_col_index = list(scoring_response.get('fields')).index(prediction_column_name)\n","\n","    if prob_col_index < 0 or predict_col_index < 0:\n","      raise Exception(\"Missing prediction/probability column in the scoring response\")\n","\n","    import numpy as np\n","    probability_array = np.array([value[prob_col_index] for value in scoring_response.get('values')])\n","    prediction_vector = np.array([value[predict_col_index] for value in scoring_response.get('values')])\n","\n","    return probability_array, prediction_vector"]},{"cell_type":"markdown","metadata":{"id":"e21b310647ce4c5983305a987459aa14"},"source":["### Define the drift detection input"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"24f28862e1194aca8db15f5bb9a771a9"},"outputs":[],"source":["drift_detection_input = {\n","    \"feature_columns\": feature_columns,\n","    \"categorical_columns\": cat_features,\n","    \"label_column\": label_column,\n","    \"problem_type\": model_type\n","}\n","print(drift_detection_input)"]},{"cell_type":"markdown","metadata":{"id":"fe51e8ebb82b412d921fcf890b60a7f8"},"source":["### Generate drift detection model"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"ba265b3dc4864f68b3e2a231a9a5dcbc"},"outputs":[],"source":["!rm drift_detection_model.tar.gz"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"b84b1274b60e45cca14e3cc58b7dd57f"},"outputs":[],"source":["from ibm_wos_utils.drift.drift_trainer import DriftTrainer\n","drift_trainer = DriftTrainer(df,drift_detection_input)\n","if model_type != \"regression\":\n","    #Note: batch_size can be 
customized by user as per the training data size\n","    drift_trainer.generate_drift_detection_model(score,batch_size=df.shape[0])\n","\n","#Note: Two column constraints are not computed beyond two_column_learner_limit(default set to 200)\n","#User can adjust the value depending on the requirement\n","drift_trainer.learn_constraints(two_column_learner_limit=200)\n","drift_trainer.create_archive()"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"c57fe4ed41994c32a353f2f2702e9357"},"outputs":[],"source":["!ls -al"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"419f7cb1266f4085804506ceb54da21d"},"outputs":[],"source":["filename = 'drift_detection_model.tar.gz'"]},{"cell_type":"markdown","metadata":{"id":"6f60edd7ce2c4d698734d21bbb9d0c39"},"source":["### Upload the drift detection model to OpenScale subscription"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"ba95377401c041be8b9b41076b82f2e9"},"outputs":[],"source":["wos_client.monitor_instances.upload_drift_model(\n","        model_path=filename,\n","        archive_name=filename,\n","        data_mart_id=data_mart_id,\n","        subscription_id=subscription_id,\n","        enable_data_drift=True,\n","        enable_model_drift=True\n","     )"]},{"cell_type":"markdown","metadata":{"id":"6256b1a6ed3b40858ea69e70d21d30a1"},"source":["### Delete the existing drift monitor instance for the subscription"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"cff7eca6fd6147f289050252f2d4b0c8"},"outputs":[],"source":["monitor_instances = wos_client.monitor_instances.list().result.monitor_instances\n","for monitor_instance in monitor_instances:\n","    monitor_def_id=monitor_instance.entity.monitor_definition_id\n","    if monitor_def_id == \"drift\" and monitor_instance.entity.target.target_id == subscription_id:\n","        wos_client.monitor_instances.delete(monitor_instance.metadata.id)\n","        print('Deleted existing drift monitor instance with id: ', 
monitor_instance.metadata.id)"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"459150da617b4f738d98f88518aaa1fb"},"outputs":[],"source":["target = Target(\n","    target_type=TargetTypes.SUBSCRIPTION,\n","    target_id=subscription_id\n","\n",")\n","parameters = {\n","    \"min_samples\": 100,\n","    \"drift_threshold\": 0.1,\n","    \"train_drift_model\": False,\n","    \"enable_model_drift\": True,\n","    \"enable_data_drift\": True\n","}\n","\n","drift_monitor_details = wos_client.monitor_instances.create(\n","    data_mart_id=data_mart_id,\n","    background_mode=False,\n","    monitor_definition_id=wos_client.monitor_definitions.MONITORS.DRIFT.ID,\n","    target=target,\n","    parameters=parameters\n",").result\n","\n","drift_monitor_instance_id = drift_monitor_details.metadata.id\n","drift_monitor_instance_id"]},{"cell_type":"markdown","metadata":{"id":"48aa770ab264479286e56fb1ba8057f4"},"source":["### Drift run"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"312c02c782a94f999e24232ea454edc2"},"outputs":[],"source":["drift_run_details = wos_client.monitor_instances.run(monitor_instance_id=drift_monitor_instance_id, background_mode=False)"]},{"cell_type":"code","execution_count":null,"metadata":{"id":"6c6db5901c564d45ba8d55a705d2166c"},"outputs":[],"source":["time.sleep(5)\n","wos_client.monitor_instances.show_metrics(monitor_instance_id=drift_monitor_instance_id)"]},{"cell_type":"markdown","metadata":{"id":"929139e3589e479f81166847d09d28d8"},"source":["## Summary\n","\n","As part of this notebook, we have performed the following:\n","* Created a subscription to a custom ML endpoint.\n","* Scored the custom ML provider with 100 records.\n","* Using the scoring payload and the scoring response, we called the DataSets SDK method to store the payload logging records in the data mart. 
While doing so, we have set the scoring_id attribute.\n","* Configured the fairness monitor, executed it, and viewed the fairness metrics output.\n","* Configured the explainability monitor.\n","* Randomly selected 2 transactions (`sample_size`) for which we want to get the prediction explanation.\n","* Submitted explainability tasks for the selected scoring ids, and waited for their completion.\n","* In the end, we composed a map of each feature and its weights across transactions, and plotted it.\n","* For example:\n","```\n","{'ForeignWorker': [33.29, 5.23],\n"," 'OthersOnLoan': [15.96, 19.97, 12.76],\n"," 'OwnsProperty': [15.43, 3.92, 4.44, 10.36],\n"," 'Dependents': [9.06],\n"," 'InstallmentPercent': [9.05],\n"," 'CurrentResidenceDuration': [8.74, 13.15, 12.1, 10.83],\n"," 'Sex': [2.96, 12.76],\n"," 'InstallmentPlans': [2.4, 5.67, 6.57],\n"," 'Age': [2.28, 8.6, 11.26],\n"," 'Job': [0.84],\n"," 'LoanDuration': [15.02, 10.87, 18.91, 12.72],\n"," 'EmploymentDuration': [14.02, 14.05, 12.1],\n"," 'LoanAmount': [9.28, 12.42, 7.85],\n"," 'Housing': [4.35],\n"," 'CreditHistory': [6.5]}\n"," ```\n","\n","The above map can be read as follows:\n","* LoanDuration, CurrentResidenceDuration and OwnsProperty are the features that contribute most often across transactions to their respective predictions. Their weights for each prediction are listed in the map.\n","* The features that contribute least often are CreditHistory, Housing, Job, InstallmentPercent and Dependents; their weights are listed in the map as well.\n","\n","* Configured the quality monitor, uploaded feedback data, and then ran the quality monitor.\n","* For drift monitoring purposes, we created the drift detection model and uploaded it to the OpenScale subscription.\n","* Executed the drift monitor.\n","\n","Thank you 
for working on this tutorial notebook."]}],"metadata":{"kernelspec":{"display_name":"Python 3","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0]"},"varInspector":{"cols":{"lenName":16,"lenType":16,"lenVar":40},"kernels_config":{"python":{"delete_cmd_postfix":"","delete_cmd_prefix":"del ","library":"var_list.py","varRefreshCmd":"print(var_dic_list())"},"r":{"delete_cmd_postfix":") ","delete_cmd_prefix":"rm(","library":"var_list.r","varRefreshCmd":"cat(var_dic_list()) "}},"types_to_exclude":["module","function","builtin_function_or_method","instance","_Feature"],"window_display":false},"vscode":{"interpreter":{"hash":"31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6"}}},"nbformat":4,"nbformat_minor":1}