{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
">### 🚩 *Create a free WhyLabs account to get more value out of whylogs!*\n",
">*Did you know you can store, visualize, and monitor whylogs profiles with the [WhyLabs Observability Platform](https://whylabs.ai/whylogs-free-signup?utm_source=whylogs-Github&utm_medium=whylogs-example&utm_campaign=Metric_Constraints)? Sign up for a [free WhyLabs account](https://whylabs.ai/whylogs-free-signup?utm_source=whylogs-Github&utm_medium=whylogs-example&utm_campaign=Metric_Constraints) to leverage the power of whylogs and WhyLabs together!*"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Data Validation with Metric Constraints"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/whylabs/whylogs/blob/mainline/python/examples/advanced/Metric_Constraints.ipynb)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> This is an example for whylogs versions 1.0.0 and above. If you're interested in constraints for versions <1.0.0, please see these examples: [Constraints Suite](https://github.com/whylabs/whylogs/blob/maintenance/0.7.x/examples/Constraints_Suite.ipynb), [Constraints-Distributional Measures](https://github.com/whylabs/whylogs/blob/maintenance/0.7.x/examples/Constraints_Distributional_Measures.ipynb), and [Creating Customized Constraints](https://github.com/whylabs/whylogs/blob/maintenance/0.7.x/examples/Creating_Customized_Constraints.ipynb)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Note: you may need to restart the kernel to use updated packages.\n",
"%pip install 'whylogs[viz]'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Starting with basic pandas DataFrame logging, consider the following input. We will generate a whylogs profile view from it."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import whylogs as why\n",
"\n",
"data = {\n",
" \"animal\": [\"cat\", \"hawk\", \"snake\", \"cat\", \"mosquito\"],\n",
" \"legs\": [4, 2, 0, 4, 6],\n",
" \"weight\": [4.3, 1.8, 1.3, 4.1, 5.5e-6],\n",
"}\n",
"\n",
"results = why.log(pd.DataFrame(data))\n",
"profile_view = results.view()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The profile view can be displayed as a pandas DataFrame in which the columns are metric/component paths."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<table border=\"1\" class=\"dataframe\">\n",
"<thead>\n",
"<tr><th></th><th>cardinality/est</th><th>cardinality/lower_1</th><th>cardinality/upper_1</th><th>counts/inf</th><th>counts/n</th><th>counts/nan</th><th>counts/null</th><th>distribution/max</th><th>distribution/mean</th><th>distribution/median</th><th>...</th><th>distribution/stddev</th><th>frequent_items/frequent_strings</th><th>type</th><th>types/boolean</th><th>types/fractional</th><th>types/integral</th><th>types/object</th><th>types/string</th><th>ints/max</th><th>ints/min</th></tr>\n",
"<tr><th>column</th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th><th></th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>animal</th><td>4.0</td><td>4.0</td><td>4.00020</td><td>0</td><td>5</td><td>0</td><td>0</td><td>NaN</td><td>0.000000</td><td>NaN</td><td>...</td><td>0.000000</td><td>[FrequentItem(value='cat', est=2, upper=2, low...</td><td>SummaryType.COLUMN</td><td>0</td><td>0</td><td>0</td><td>0</td><td>5</td><td>NaN</td><td>NaN</td></tr>\n",
"<tr><th>legs</th><td>4.0</td><td>4.0</td><td>4.00020</td><td>0</td><td>5</td><td>0</td><td>0</td><td>6.0</td><td>3.200000</td><td>4.0</td><td>...</td><td>2.280351</td><td>[FrequentItem(value='4', est=2, upper=2, lower...</td><td>SummaryType.COLUMN</td><td>0</td><td>0</td><td>5</td><td>0</td><td>0</td><td>6.0</td><td>0.0</td></tr>\n",
"<tr><th>weight</th><td>5.0</td><td>5.0</td><td>5.00025</td><td>0</td><td>5</td><td>0</td><td>0</td><td>4.3</td><td>2.300001</td><td>1.8</td><td>...</td><td>1.856069</td><td>NaN</td><td>SummaryType.COLUMN</td><td>0</td><td>5</td><td>0</td><td>0</td><td>0</td><td>NaN</td><td>NaN</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>3 rows × 30 columns</p>\n",
"