{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
">### 🚩 *Create a free WhyLabs account to get more value out of whylogs!*\n",
">*Did you know you can store, visualize, and monitor whylogs profiles with the [WhyLabs Observability Platform](https://whylabs.ai/whylogs-free-signup?utm_source=whylogs-Github&utm_medium=whylogs-example&utm_campaign=Writing_Regression_Performance_Metrics_to_WhyLabs)? Sign up for a [free WhyLabs account](https://whylabs.ai/whylogs-free-signup?utm_source=whylogs-Github&utm_medium=whylogs-example&utm_campaign=Writing_Regression_Performance_Metrics_to_WhyLabs) to leverage the power of whylogs and WhyLabs together!*"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Monitoring Regression Model Performance Metrics\n",
"\n",
"[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/whylabs/whylogs/blob/mainline/python/examples/integrations/writers/Writing_Regression_Performance_Metrics_to_WhyLabs.ipynb)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"In this tutorial, we'll show how to log the performance metrics of your ML model with whylogs and how to send them to your dashboard on the WhyLabs platform. We'll follow a regression use case, where we predict the temperature at a given location based on meteorological features.\n",
"\n",
"We will:\n",
"\n",
"- Download Weather Data for 7 days\n",
"- Log daily input features with whylogs\n",
"- Log daily regression performance metrics with whylogs\n",
"- Write logged profiles to WhyLabs' dashboard\n",
"- Show performance summary at WhyLabs\n",
"- __Advanced__: Monitor segmented performance metrics"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Installing whylogs\n",
"\n",
"First, let's install __whylogs__. We'll also use the __datasets__ module to download the example data, so let's install the __datasets__ extra as well:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# Note: you may need to restart the kernel to use updated packages.\n",
"%pip install 'whylogs[datasets]'"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## 🌤️ The Data - Weather Forecast Dataset"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"In this example, we will use the Weather Forecast Dataset, available through whylogs' __Datasets__ module.\n",
"\n",
"This dataset contains several meteorological features at a particular place (defined by latitude and longitude features) and time. The task is to predict the temperature based on the input features.\n",
"\n",
"The original data is described in [Shifts: A Dataset of Real Distributional Shift Across Multiple Large-Scale Tasks](https://arxiv.org/pdf/2107.07455.pdf), by **Malinin, Andrey, et al.**, and was further transformed to compose the current dataset. You can find more information about the resulting dataset and how to use it at https://whylogs.readthedocs.io/en/latest/datasets/weather.html."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Downloading the data into daily batches"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's download 7 daily batches of data, covering the last 7 days. We can use the __datasets__ module directly for that."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"from whylogs.datasets import Weather\n",
"from datetime import datetime, timezone, timedelta\n",
"dataset = Weather()\n",
"\n",
"start_timestamp = datetime.now(timezone.utc) - timedelta(days=6)\n",
"dataset.set_parameters(inference_start_timestamp=start_timestamp)\n",
"\n",
"daily_batches = dataset.get_inference_data(number_batches=7)\n",
"\n",
"# get_inference_data returns an iterator, so convert it to a list\n",
"daily_batches = list(daily_batches)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Since in this example we're mainly concerned with regression metrics, let's select a subset of the available features, for simplicity.\n",
"\n",
"__meta_climate__, __meta_latitude__, __meta_longitude__ will be our input features, while __prediction_temperature__ is the predicted feature given by a trained ML model and the __temperature__ feature is our target.\n",
"\n",
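"As a rough reference for what regression performance means for these two columns, the sketch below computes two common error metrics, mean absolute error (MAE) and root mean squared error (RMSE), in plain Python. The helper name `regression_errors` is hypothetical and not part of whylogs, the sample pairs are a handful of __prediction_temperature__ / __temperature__ values from the first day's batch, and the metrics whylogs actually logs may differ from this sketch.\n",
"\n",
"```python\n",
"import math\n",
"\n",
"\n",
"def regression_errors(predictions, targets):\n",
"    # Hypothetical helper: returns (MAE, RMSE) for paired predictions and targets.\n",
"    if len(predictions) != len(targets) or not predictions:\n",
"        raise ValueError('predictions and targets must be non-empty and equal in length')\n",
"    residuals = [t - p for p, t in zip(predictions, targets)]\n",
"    mae = sum(abs(r) for r in residuals) / len(residuals)\n",
"    rmse = math.sqrt(sum(r * r for r in residuals) / len(residuals))\n",
"    return mae, rmse\n",
"\n",
"\n",
"# A few prediction_temperature / temperature pairs from the first day's batch\n",
"predictions = [9.163181, 26.220221, 13.178478, 23.255124, 27.851674]\n",
"targets = [10.0, 27.0, 15.0, 25.0, 32.0]\n",
"\n",
"mae, rmse = regression_errors(predictions, targets)\n",
"print(f'MAE:  {mae:.3f}')\n",
"print(f'RMSE: {rmse:.3f}')\n",
"```\n",
"\n",
"RMSE penalizes large individual errors more heavily than MAE, so a gap between the two hints at a few poorly predicted rows.\n",
"\n",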
"Let's take a look at the data for the first day:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"| date | meta_climate | meta_latitude | meta_longitude | prediction_temperature | temperature |\n",
"|---|---|---|---|---|---|\n",
"| 2022-12-12 00:00:00+00:00 | mild temperate | 38.891300 | -6.821330 | 9.163181 | 10.0 |\n",
"| 2022-12-12 00:00:00+00:00 | tropical | 12.216667 | 109.216667 | 26.220221 | 27.0 |\n",
"| 2022-12-12 00:00:00+00:00 | dry | 37.991699 | -101.746002 | 13.178478 | 15.0 |\n",
"| 2022-12-12 00:00:00+00:00 | mild temperate | -23.333599 | -51.130100 | 23.255124 | 25.0 |\n",
"| 2022-12-12 00:00:00+00:00 | mild temperate | -23.479445 | -52.012222 | 27.851674 | 32.0 |\n",
"