{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Working with SageMaker Machine Learning engine" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This notebook works correctly with kernel `Python 3.7.x`. It shows how to log the payload for a model deployed on Amazon SageMaker using the Watson OpenScale Python SDK." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Contents\n", "- [1.0 Setup](#setup)\n", "- [2.0 Binding machine learning engine](#binding)\n", "- [3.0 Subscriptions](#subscriptions)\n", "- [4.0 Performance monitor, scoring and payload logging](#perf)\n", "- [5.0 Quality monitor and feedback logging](#quality)\n", "- [6.0 Fairness, Drift monitoring and explanations](#fairness)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## 1.0 Setup" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Requirements installation" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!pip install sagemaker --no-cache | tail -n 1\n", "!pip install --upgrade ibm-watson-openscale --no-cache | tail -n 1\n", "!pip install --upgrade boto3 --no-cache | tail -n 1\n", "!pip install -U pandas==1.2.5 | tail -n 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## **Action:** Restart the kernel." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Sample model creation using [Amazon SageMaker](https://aws.amazon.com/sagemaker/)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- You should have already run the [CreditModelSagemakerLinearLearner.ipynb notebook](https://raw.githubusercontent.com/IBM/monitor-sagemaker-ml-with-watson-openscale/master/notebooks/CreditModelSagemakerLinearLearner.ipynb) to create the SageMaker model." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.1 Authentication" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### ACTION: Get IBM Cloud OpenScale `apikey`\n", "\n", "How to install the IBM Cloud (Bluemix) CLI: [instructions](https://console.bluemix.net/docs/cli/reference/ibmcloud/download_cli.html#install_use)\n", "\n", "- Switch to the resource group containing the OpenScale instance\n", "\n", "```bash\n", "ibmcloud target -g \n", "```\n", "\n", "Create an API key using the IBM Cloud CLI:\n", "```bash\n", "ibmcloud login --sso\n", "ibmcloud iam api-key-create 'my_key'\n", "```\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.1.1 Add the IBM Cloud apikey as *CLOUD_API_KEY*" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "CLOUD_API_KEY = '******'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.2 Leave the *DB_CREDENTIALS* as `None` unless you have a custom setup with an external database." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "DB_CREDENTIALS=None\n", "#DB_CREDENTIALS= {\"hostname\":\"\",\"username\":\"\",\"password\":\"\",\"database\":\"\",\"port\":\"\",\"ssl\":True,\"sslmode\":\"\",\"certificate_base64\":\"\"}" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "SCHEMA_NAME = 'data_mart_for_aws_sagemaker'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.3 [Follow the README instructions](https://github.com/IBM/monitor-sagemaker-ml-with-watson-openscale#create-cos-bucket-and-get-credentials) to get the values used in the following 2 cells" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "IAM_URL=\"https://iam.ng.bluemix.net/oidc/token\"\n", "COS_RESOURCE_CRN=\"\"\n", "COS_API_KEY_ID = \"\"\n", "COS_ENDPOINT = \"https://\" # Current list available at https://control.cloud-object-storage.cloud.ibm.com/v2/endpoints" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 1.3.1 Add the BUCKET_NAME" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "BUCKET_NAME = \"******\" #example: \"--credit-risk-training-data\"\n", "training_data_file_name=\"credit_risk_training_recoded.csv\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!rm -f credit_risk_training_recoded.csv\n", "!wget \"https://raw.githubusercontent.com/IBM/monitor-sagemaker-ml-with-watson-openscale/master/data/credit_risk_training_recoded.csv\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Store training data in COS for OpenScale reference" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import ibm_boto3\n", "from ibm_botocore.client import Config, ClientError\n", "\n", "cos_client = ibm_boto3.resource(\"s3\",\n", " ibm_api_key_id=COS_API_KEY_ID,\n", " 
ibm_service_instance_id=COS_RESOURCE_CRN,\n", " ibm_auth_endpoint=\"https://iam.bluemix.net/oidc/token\",\n", " config=Config(signature_version=\"oauth\"),\n", " endpoint_url=COS_ENDPOINT\n", ")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "with open(training_data_file_name, \"rb\") as file_data:\n", " cos_client.Object(BUCKET_NAME, training_data_file_name).upload_fileobj(\n", " Fileobj=file_data\n", " )" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Initiate OpenScale client" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from ibm_cloud_sdk_core.authenticators import IAMAuthenticator,CloudPakForDataAuthenticator\n", "\n", "from ibm_watson_openscale import *\n", "from ibm_watson_openscale.supporting_classes.enums import *\n", "from ibm_watson_openscale.supporting_classes import *\n", "\n", "authenticator = IAMAuthenticator(apikey=CLOUD_API_KEY)\n", "wos_client = APIClient(authenticator=authenticator,service_url=\"https://aiopenscale.cloud.ibm.com\")\n", "wos_client.version" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create schema for data mart." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "wos_client.data_marts.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data_marts = wos_client.data_marts.list().result.data_marts\n", "if len(data_marts) == 0:\n", " if DB_CREDENTIALS is not None:\n", " if SCHEMA_NAME is None: \n", " print(\"Please specify the SCHEMA_NAME and rerun the cell\")\n", "\n", " print('Setting up external datamart')\n", " added_data_mart_result = wos_client.data_marts.add(\n", " background_mode=False,\n", " name=\"WOS Data Mart\",\n", " description=\"Data Mart created by WOS tutorial notebook\",\n", " database_configuration=DatabaseConfigurationRequest(\n", " database_type=DatabaseType.POSTGRESQL, # For DB2 use DatabaseType.DB2\n", " credentials=PrimaryStorageCredentialsLong(\n", " hostname=DB_CREDENTIALS['hostname'],\n", " username=DB_CREDENTIALS['username'],\n", " password=DB_CREDENTIALS['password'],\n", " db=DB_CREDENTIALS['database'],\n", " port=DB_CREDENTIALS['port'],\n", " ssl=True,\n", " sslmode=DB_CREDENTIALS['sslmode'],\n", " certificate_base64=DB_CREDENTIALS['certificate_base64']\n", " ),\n", " location=LocationSchemaName(\n", " schema_name= SCHEMA_NAME\n", " )\n", " )\n", " ).result\n", " else:\n", " print('Setting up internal datamart')\n", " added_data_mart_result = wos_client.data_marts.add(\n", " background_mode=False,\n", " name=\"WOS Data Mart\",\n", " description=\"Data Mart created by WOS tutorial notebook\", \n", " internal_database = True).result\n", " \n", " data_mart_id = added_data_mart_result.metadata.id\n", " \n", "else:\n", " data_mart_id=data_marts[0].metadata.id\n", " print('Using existing datamart {}'.format(data_mart_id))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "wos_client.data_marts.get(data_mart_id).result.to_dict()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## 2.0 Bind 
machine learning engines" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Bind `SageMaker` machine learning engine\n", "\n", "Provide credentials using following fields you obtained in the step to [Get AWS keys](https://github.com/IBM/monitor-sagemaker-ml-with-watson-openscale#get-aws-keys)\n", "- `access_key_id`\n", "- `secret_access_key`\n", "- `region`" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "SAGEMAKER_ENGINE_CREDENTIALS = {\n", " 'access_key_id': '', \n", " 'secret_access_key': '', \n", " 'region': ''}" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "SERVICE_PROVIDER_NAME = \"AWS Machine Learning\"\n", "SERVICE_PROVIDER_DESCRIPTION = \"Added by AWS tutorial WOS notebook.\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "service_providers = wos_client.service_providers.list().result.service_providers\n", "for service_provider in service_providers:\n", " service_instance_name = service_provider.entity.name\n", " if service_instance_name == SERVICE_PROVIDER_NAME:\n", " service_provider_id = service_provider.metadata.id\n", " wos_client.service_providers.delete(service_provider_id)\n", " print(\"Deleted existing service_provider for WML instance: {}\".format(service_provider_id))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "added_service_provider_result=wos_client.service_providers.add(\n", " name=SERVICE_PROVIDER_NAME,\n", " description=\"AWS Service Provider\",\n", " service_type=ServiceTypes.AMAZON_SAGEMAKER,\n", " credentials=SageMakerCredentials(\n", " access_key_id=SAGEMAKER_ENGINE_CREDENTIALS['access_key_id'],\n", " secret_access_key=SAGEMAKER_ENGINE_CREDENTIALS['secret_access_key'],\n", " region=SAGEMAKER_ENGINE_CREDENTIALS['region']\n", " ),\n", " background_mode=False\n", " ).result\n", "\n", "\n", "\n", "service_provider_id = 
added_service_provider_result.metadata.id\n", "print(\"Service Provider id \", service_provider_id)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "wos_client.service_providers.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "asset_deployment_details = wos_client.service_providers.list_assets(data_mart_id=data_mart_id, service_provider_id=service_provider_id).result\n", "asset_deployment_details" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# This works as-is only if you have a single deployment named Credit-risk-endpoint-scoring-*\n", "# If you have more than one, the first match is used; set deployment_id explicitly if that is not the one you want\n", "\n", "deployment_id = None\n", "for resource in asset_deployment_details['resources']:\n", "    if 'Credit-risk-endpoint-scoring' in resource['entity']['name']:\n", "        deployment_id = resource['metadata']['guid']\n", "        print(deployment_id)\n", "        break\n", "\n", "if deployment_id is None:\n", "    print(\"deployment_id not found. Set the deployment_id explicitly following instructions below.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2.1 Get *deployment_id* explicitly if instructed above\n", "\n", "In the notebook [CreditModelSagemakerLinearLearner.ipynb](https://raw.githubusercontent.com/IBM/monitor-sagemaker-ml-with-watson-openscale/master/notebooks/CreditModelSagemakerLinearLearner.ipynb), in the cell *4.3 Create online scoring endpoint*, you created a *scoring_endpoint*:\n", "\n", "*scoring_endpoint = 'Credit-risk-endpoint-scoring-' + time_suffix*\n", "\n", "Use that scoring_endpoint name to find the entry in the *asset_deployment_details* output above whose ```['entity']['name']``` matches it, and copy that entry's ```['metadata']['guid']```." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# deployment_id = \"********\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for model_asset_details in asset_deployment_details['resources']:\n", " if model_asset_details['metadata']['guid']==deployment_id:\n", " break\n", "model_asset_details" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## 3.0 Subscriptions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Add subscriptions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "List available deployments.\n", "\n", "**Note:** Depending on number of assets it may take some time." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "aws_asset = Asset(\n", " asset_id=model_asset_details['entity']['asset']['asset_id'],\n", " name=model_asset_details['entity']['asset']['name'],\n", " url=model_asset_details['entity']['asset']['url'],\n", " asset_type=model_asset_details['entity']['asset']['asset_type'] if 'asset_type' in model_asset_details['entity']['asset'] else 'model',\n", " problem_type=ProblemType.BINARY_CLASSIFICATION,\n", " input_data_type=InputDataType.STRUCTURED,\n", " )" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from ibm_watson_openscale.base_classes.watson_open_scale_v2 import ScoringEndpointRequest\n", "deployment_scoring_endpoint = model_asset_details['entity']['scoring_endpoint']\n", "scoring_endpoint = ScoringEndpointRequest(url = model_asset_details['entity']['scoring_endpoint']['url'] )" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "deployment = AssetDeploymentRequest(\n", " deployment_id=model_asset_details['metadata']['guid'],\n", " url=model_asset_details['metadata']['url'],\n", " name=model_asset_details['entity']['name'],\n", " 
deployment_type=model_asset_details['entity']['type'],\n", " scoring_endpoint = scoring_endpoint\n", " )" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "training_data_reference = TrainingDataReference(type='cos',\n", " location=COSTrainingDataReferenceLocation(bucket = BUCKET_NAME,\n", " file_name = training_data_file_name),\n", " connection=COSTrainingDataReferenceConnection(\n", " resource_instance_id= COS_RESOURCE_CRN,\n", " url= COS_ENDPOINT,\n", " api_key= COS_API_KEY_ID,\n", " iam_url=IAM_URL)\n", " )" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "feature_columns = ['CheckingStatus_0_to_200', 'CheckingStatus_greater_200', 'CheckingStatus_less_0', 'CheckingStatus_no_checking', 'CreditHistory_all_credits_paid_back', 'CreditHistory_credits_paid_to_date', 'CreditHistory_no_credits', 'CreditHistory_outstanding_credit', 'CreditHistory_prior_payments_delayed', 'LoanPurpose_appliances', 'LoanPurpose_business', 'LoanPurpose_car_new', 'LoanPurpose_car_used', 'LoanPurpose_education', 'LoanPurpose_furniture', 'LoanPurpose_other', 'LoanPurpose_radio_tv', 'LoanPurpose_repairs', 'LoanPurpose_retraining', 'LoanPurpose_vacation', 'ExistingSavings_100_to_500', 'ExistingSavings_500_to_1000', 'ExistingSavings_greater_1000', 'ExistingSavings_less_100', 'ExistingSavings_unknown', 'EmploymentDuration_1_to_4', 'EmploymentDuration_4_to_7', 'EmploymentDuration_greater_7', 'EmploymentDuration_less_1', 'EmploymentDuration_unemployed', 'Sex_female', 'Sex_male', 'OthersOnLoan_co-applicant', 'OthersOnLoan_guarantor', 'OthersOnLoan_none', 'OwnsProperty_car_other', 'OwnsProperty_real_estate', 'OwnsProperty_savings_insurance', 'OwnsProperty_unknown', 'InstallmentPlans_bank', 'InstallmentPlans_none', 'InstallmentPlans_stores', 'Housing_free', 'Housing_own', 'Housing_rent', 'Job_management_self-employed', 'Job_skilled', 'Job_unemployed', 'Job_unskilled', 'Telephone_none', 'Telephone_yes', 
'ForeignWorker_no', 'ForeignWorker_yes', 'LoanDuration', 'LoanAmount', 'InstallmentPercent', 'CurrentResidenceDuration', 'Age', 'ExistingCreditsCount', 'Dependents']\n", "categorical_columns = ['CheckingStatus_0_to_200', 'CheckingStatus_greater_200', 'CheckingStatus_less_0', 'CheckingStatus_no_checking', 'CreditHistory_all_credits_paid_back', 'CreditHistory_credits_paid_to_date', 'CreditHistory_no_credits', 'CreditHistory_outstanding_credit', 'CreditHistory_prior_payments_delayed', 'LoanPurpose_appliances', 'LoanPurpose_business', 'LoanPurpose_car_new', 'LoanPurpose_car_used', 'LoanPurpose_education', 'LoanPurpose_furniture', 'LoanPurpose_other', 'LoanPurpose_radio_tv', 'LoanPurpose_repairs', 'LoanPurpose_retraining', 'LoanPurpose_vacation', 'ExistingSavings_100_to_500', 'ExistingSavings_500_to_1000', 'ExistingSavings_greater_1000', 'ExistingSavings_less_100', 'ExistingSavings_unknown', 'EmploymentDuration_1_to_4', 'EmploymentDuration_4_to_7', 'EmploymentDuration_greater_7', 'EmploymentDuration_less_1', 'EmploymentDuration_unemployed', 'Sex_female', 'Sex_male', 'OthersOnLoan_co-applicant', 'OthersOnLoan_guarantor', 'OthersOnLoan_none', 'OwnsProperty_car_other', 'OwnsProperty_real_estate', 'OwnsProperty_savings_insurance', 'OwnsProperty_unknown', 'InstallmentPlans_bank', 'InstallmentPlans_none', 'InstallmentPlans_stores', 'Housing_free', 'Housing_own', 'Housing_rent', 'Job_management_self-employed', 'Job_skilled', 'Job_unemployed', 'Job_unskilled', 'Telephone_none', 'Telephone_yes', 'ForeignWorker_no', 'ForeignWorker_yes']" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "asset_properties = AssetPropertiesRequest(\n", " label_column=\"Risk\",\n", " prediction_field='predicted_label',\n", " probability_fields=['score'],\n", " training_data_reference=training_data_reference,\n", " training_data_schema=None,\n", " input_data_schema=None,\n", " output_data_schema=None,\n", " feature_fields=feature_columns,\n", " 
categorical_fields=categorical_columns\n", " )" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "subscription_details = wos_client.subscriptions.add(\n", " data_mart_id=data_mart_id,\n", " service_provider_id=service_provider_id,\n", " asset=aws_asset,\n", " deployment=deployment,\n", " asset_properties=asset_properties,\n", " background_mode=False\n", ").result\n", "subscription_id = subscription_details.metadata.id\n", "subscription_id" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### List subscriptions" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "wos_client.subscriptions.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## 4.0 Performance metrics, scoring and payload logging" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Score the credit risk model and measure response time" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import requests\n", "import time\n", "import json\n", "import boto3" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "subscription_details=wos_client.subscriptions.get(subscription_id).result.to_dict()\n", "subscription_details" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "endpoint_name = subscription_details['entity']['deployment']['name']\n", "\n", "payload = \"0,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,1,0,0,1,0,0,1,0,0,1,0,0,1,0,0,1,0,0,1,0,0,1,12,4152,2,3,29,2,1\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "runtime = boto3.client('sagemaker-runtime',\n", " region_name=SAGEMAKER_ENGINE_CREDENTIALS['region'],\n", " aws_access_key_id=SAGEMAKER_ENGINE_CREDENTIALS['access_key_id'],\n", " aws_secret_access_key=SAGEMAKER_ENGINE_CREDENTIALS['secret_access_key'])\n", 
"\n", "start_time = time.time()\n", "response = runtime.invoke_endpoint(EndpointName=endpoint_name, ContentType='text/csv', Body=payload)\n", "response_time = int((time.time() - start_time)*1000)\n", "result = json.loads(response['Body'].read().decode())\n", "\n", "print(json.dumps(result, indent=2))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Store the request and response in payload logging table" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Transform the model's input and output to the format compatible with OpenScale standard." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import time\n", "\n", "time.sleep(5)\n", "payload_data_set_id = None\n", "payload_data_set_id = wos_client.data_sets.list(type=DataSetTypes.PAYLOAD_LOGGING, \n", " target_target_id=subscription_id, \n", " target_target_type=TargetTypes.SUBSCRIPTION).result.data_sets[0].metadata.id\n", "if payload_data_set_id is None:\n", " print(\"Payload data set not found. Please check subscription status.\")\n", "else:\n", " print(\"Payload data set id: \", payload_data_set_id)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "values = [float(s) for s in payload.split(',')]\n", "\n", "request_data = {'fields': feature_columns, \n", " 'values': values}\n", "\n", "response_data = {'fields': list(result['predictions'][0]),\n", " 'values': [list(x.values()) for x in result['predictions']]}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Store the payload using Python SDK" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Hint:** You can embed payload logging code into your custom deployment so it is logged automatically each time you score the model." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import uuid\n", "from ibm_watson_openscale.supporting_classes.payload_record import PayloadRecord\n", "\n", "print(\"Performing explicit payload logging.....\")\n", "wos_client.data_sets.store_records(data_set_id=payload_data_set_id, background_mode=False,request_body=[PayloadRecord(\n", "    scoring_id=str(uuid.uuid4()),\n", "    request=request_data,\n", "    response=response_data,\n", "    response_time=response_time  # measured in milliseconds when the model was scored above\n", ")])\n", "time.sleep(5)\n", "pl_records_count = wos_client.data_sets.get_records_count(payload_data_set_id)\n", "print(\"Number of records in the payload logging table: {}\".format(pl_records_count))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "wos_client.data_sets.show_records(data_set_id=payload_data_set_id)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## 5.0 Feedback logging & quality (accuracy) monitoring" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Enable quality monitoring" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You need to provide the monitoring `thresholds` and `min_feedback_data_size` (the minimum number of feedback records)." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import time\n", "\n", "time.sleep(10)\n", "target = Target(\n", " target_type=TargetTypes.SUBSCRIPTION,\n", " target_id=subscription_id\n", ")\n", "parameters = {\n", " \"min_feedback_data_size\": 10\n", "}\n", "thresholds = [\n", " {\n", " \"metric_id\": \"area_under_roc\",\n", " \"type\": \"lower_limit\",\n", " \"value\": .80\n", " }\n", " ]\n", "quality_monitor_details = wos_client.monitor_instances.create(\n", " data_mart_id=data_mart_id,\n", " background_mode=False,\n", " monitor_definition_id=wos_client.monitor_definitions.MONITORS.QUALITY.ID,\n", " target=target,\n", " parameters=parameters,\n", " thresholds=thresholds\n", ").result" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "quality_monitor_instance_id = quality_monitor_details.metadata.id\n", "quality_monitor_instance_id" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Feedback records logging" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Feedback records are used to evaluate your model. The predicted values are compared to real values (feedback records)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Look up the ID of the feedback data set using the method below. Its schema is printed after the feedback records are stored." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "feedback_dataset = wos_client.data_sets.list(type=DataSetTypes.FEEDBACK, \n", "    target_target_id=subscription_id, \n", "    target_target_type=TargetTypes.SUBSCRIPTION).result\n", "# Guard against an empty list so the check below is actually reachable\n", "feedback_dataset_id = feedback_dataset.data_sets[0].metadata.id if feedback_dataset.data_sets else None\n", "if feedback_dataset_id is None:\n", "    print(\"Feedback data set not found. Please check quality monitor status.\")\n", "feedback_dataset_id" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The feedback records can be sent to the feedback table using the code below." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import requests\n", "import pandas as pd\n", "import numpy as np\n", "import time\n", "\n", "time.sleep(10) # Give the data set enough time to be created\n", "\n", "# np.float was removed in NumPy 1.24; the builtin float is equivalent\n", "data = pd.read_csv('https://raw.githubusercontent.com/IBM/watson-openscale-samples/main/IBM%20Cloud/AWS%20Sagemaker/assets/data/credit_risk_aws/credit_risk_feedback_recoded.csv',header=0,dtype=float)\n", "feedback_columns = data.columns.tolist()\n", "feedback_records = data.values.tolist()\n", "\n", "payload_scoring = [{\"fields\": feedback_columns, \"values\": feedback_records}]\n", "wos_client.data_sets.store_records(feedback_dataset_id, request_body=payload_scoring, background_mode=False)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "wos_client.data_sets.print_records_schema(data_set_id=feedback_dataset_id)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "wos_client.data_sets.get_records_count(data_set_id=feedback_dataset_id)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "run_details = wos_client.monitor_instances.run(monitor_instance_id=quality_monitor_instance_id, background_mode=False).result" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "time.sleep(5)\n", "wos_client.monitor_instances.show_metrics(monitor_instance_id=quality_monitor_instance_id)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## 5.1 Get the logged data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Payload logging" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Print schema of payload_logging table" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "wos_client.data_sets.print_records_schema(data_set_id=payload_data_set_id)" ] }, { 
"cell_type": "markdown", "metadata": {}, "source": [ "\n", "## 6.0 Fairness, Drift monitoring and explanations" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Get payload data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "scoring_data_filename='credit_risk_scoring_recoded.csv'\n", "scoring_data_filename_json='credit_risk_scoring_recoded.json'" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!rm credit_risk_scoring_recoded.json\n", "!rm credit_risk_scoring_recoded.csv\n", "!wget \"https://raw.githubusercontent.com/IBM/watson-openscale-samples/main/IBM%20Cloud/AWS%20Sagemaker/assets/data/credit_risk_aws/credit_risk_scoring_recoded.json\"\n", "!wget \"https://raw.githubusercontent.com/IBM/watson-openscale-samples/main/IBM%20Cloud/AWS%20Sagemaker/assets/data/credit_risk_aws/credit_risk_scoring_recoded.csv\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sm_runtime = boto3.client('sagemaker-runtime',\n", " region_name=SAGEMAKER_ENGINE_CREDENTIALS['region'],\n", " aws_access_key_id=SAGEMAKER_ENGINE_CREDENTIALS['access_key_id'],\n", " aws_secret_access_key=SAGEMAKER_ENGINE_CREDENTIALS['secret_access_key'])\n", "\n", "with open(scoring_data_filename) as f_payload:\n", " scoring_response = sm_runtime.invoke_endpoint(EndpointName = endpoint_name,\n", " ContentType = 'text/csv',\n", " Body = f_payload.read().encode())\n", " \n", " result = json.loads(scoring_response['Body'].read().decode())\n", " print(json.dumps(result, indent=2))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "f = open(scoring_data_filename_json,\"r\")\n", "payload_values = json.load(f)\n", "request_data = {'fields': feature_columns, \n", " 'values': payload_values}\n", "\n", "response_data = {'fields': list(result['predictions'][0]),\n", " 'values': [list(x.values()) for x in 
result['predictions']]}\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import uuid\n", "from ibm_watson_openscale.supporting_classes.payload_record import PayloadRecord\n", "\n", "print(\"Performing explicit payload logging.....\")\n", "wos_client.data_sets.store_records(data_set_id=payload_data_set_id, request_body=[PayloadRecord(\n", " scoring_id=str(uuid.uuid4()),\n", " request=request_data,\n", " response=response_data,\n", " response_time=460\n", ")])\n", "time.sleep(5)\n", "pl_records_count = wos_client.data_sets.get_records_count(payload_data_set_id)\n", "print(\"Number of records in the payload logging table: {}\".format(pl_records_count))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "wos_client.data_sets.show_records(payload_data_set_id)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Enable and run fairness monitoring" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "target = Target(\n", " target_type=TargetTypes.SUBSCRIPTION,\n", " target_id=subscription_id\n", "\n", ")\n", "parameters = {\n", " \"features\": [\n", " {\"feature\": \"Sex_female\",\n", " \"majority\": [[0,0]],\n", " \"minority\": [[1,1]],\n", " \"threshold\": 0.95\n", " },\n", " {\"feature\": \"Age\",\n", " \"majority\": [[26, 75]],\n", " \"minority\": [[18, 25]],\n", " \"threshold\": 0.95\n", " }\n", " ],\n", " \"favourable_class\": [0],\n", " \"unfavourable_class\": [1],\n", " \"min_records\": 30\n", "}\n", "\n", "fairness_monitor_details = wos_client.monitor_instances.create(\n", " data_mart_id=data_mart_id,\n", " background_mode=False,\n", " monitor_definition_id=wos_client.monitor_definitions.MONITORS.FAIRNESS.ID,\n", " target=target,\n", " parameters=parameters).result\n", "fairness_monitor_instance_id =fairness_monitor_details.metadata.id\n", "fairness_monitor_instance_id" ] }, { "cell_type": "markdown", 
"metadata": {}, "source": [ "### Run fairness monitor" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "time.sleep(20)\n", "#Note: When you create fairness monitor, initial run is also created\n", "wos_client.monitor_instances.show_metrics(monitor_instance_id=fairness_monitor_instance_id)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Enable and run Drift monitoring" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Drift requires a trained model to be uploaded manually for AWS. You can train, create and download a drift detection model using template given ( check for Drift detection model generation) [here](https://github.com/IBM-Watson/aios-data-distribution/blob/master/training_statistics_notebook.ipynb)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!rm -rf creditrisk_aws_drift_detection_model.tar.gz\n", "!wget -O creditrisk_aws_drift_detection_model.tar.gz https://github.com/IBM/watson-openscale-samples/blob/main/IBM%20Cloud/AWS%20Sagemaker/assets/models/credit_risk/aws_creditrisk_drift_detection_model.tar.gz?raw=true " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "wos_client.monitor_instances.upload_drift_model(\n", " model_path='creditrisk_aws_drift_detection_model.tar.gz',\n", " data_mart_id=data_mart_id,\n", " subscription_id=subscription_id\n", " )" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "monitor_instances = wos_client.monitor_instances.list().result.monitor_instances\n", "for monitor_instance in monitor_instances:\n", " monitor_def_id=monitor_instance.entity.monitor_definition_id\n", " if monitor_def_id == \"drift\" and monitor_instance.entity.target.target_id == subscription_id:\n", " wos_client.monitor_instances.delete(monitor_instance.metadata.id)\n", " print('Deleted existing drift monitor instance with id: ', 
monitor_instance.metadata.id)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "target = Target(\n", " target_type=TargetTypes.SUBSCRIPTION,\n", " target_id=subscription_id\n", "\n", ")\n", "parameters = {\n", " \"min_samples\": 30,\n", " \"drift_threshold\": 0.1,\n", " \"train_drift_model\": False,\n", " \"enable_model_drift\": True,\n", " \"enable_data_drift\": True\n", "}\n", "\n", "drift_monitor_details = wos_client.monitor_instances.create(\n", " data_mart_id=data_mart_id,\n", " background_mode=False,\n", " monitor_definition_id=wos_client.monitor_definitions.MONITORS.DRIFT.ID,\n", " target=target,\n", " parameters=parameters\n", ").result\n", "\n", "drift_monitor_instance_id = drift_monitor_details.metadata.id\n", "drift_monitor_instance_id" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "drift_run_details = wos_client.monitor_instances.run(monitor_instance_id=drift_monitor_instance_id, background_mode=False)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "time.sleep(5)\n", "wos_client.monitor_instances.show_metrics(monitor_instance_id=drift_monitor_instance_id)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Enable Explainability and run explanation on sample record" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "target = Target(\n", " target_type=TargetTypes.SUBSCRIPTION,\n", " target_id=subscription_id\n", ")\n", "parameters = {\n", " \"enabled\": True\n", "}\n", "explainability_details = wos_client.monitor_instances.create(\n", " data_mart_id=data_mart_id,\n", " background_mode=False,\n", " monitor_definition_id=wos_client.monitor_definitions.MONITORS.EXPLAINABILITY.ID,\n", " target=target,\n", " parameters=parameters\n", ").result" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Getting a `transaction_id` to run explanation on" ] 
}, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "explainability_monitor_id = explainability_details.metadata.id\n", "explainability_monitor_id" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "wos_client.data_sets.show_records(data_set_id=payload_data_set_id)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "payload_data = wos_client.data_sets.get_list_of_records(limit=5,data_set_id=payload_data_set_id,output_type='pandas').result\n", "scoring_ids=payload_data['scoring_id'].tolist()\n", "print(\"Running explanations on scoring IDs: {}\".format(scoring_ids))\n", "explanation_types = [\"lime\", \"contrastive\"]\n", "result = wos_client.monitor_instances.explanation_tasks(scoring_ids=scoring_ids, explanation_types=explanation_types).result\n", "print(result)\n", "explanation_task_id=result.to_dict()['metadata']['explanation_task_ids'][0]\n", "explanation_task_id" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "wos_client.monitor_instances.get_explanation_tasks(explanation_task_id=explanation_task_id).result.to_dict()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Congratulations\n", "\n", "You have finished the tutorial for IBM Watson OpenScale and Amazon SageMaker. You can now view the [OpenScale Dashboard](https://aiopenscale.cloud.ibm.com/). Click the tile for the German Credit AWS model to see the fairness, accuracy, and performance monitors. Click the time series graph to get detailed information on transactions during a specific time window." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3.8", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.10" }, "varInspector": { "cols": { "lenName": 16, "lenType": 16, "lenVar": 40 }, "kernels_config": { "python": { "delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())" }, "r": { "delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) " } }, "types_to_exclude": [ "module", "function", "builtin_function_or_method", "instance", "_Feature" ], "window_display": false } }, "nbformat": 4, "nbformat_minor": 1 }