{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Example Tool Usage - Regression Problems\n",
    "----\n",
    "\n",
    "# About\n",
    "This notebook contains simple toy examples to help you get started with fairMLHealth tool usage. The same content is mirrored in the repository's main [README](../../../README.md)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Example Setup"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from fairmlhealth import report, measure, stat_utils\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn.linear_model import LinearRegression, TweedieRegressor"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "# First, we'll create a semi-randomized dataframe with specific columns for our attributes of interest\n",
    "rng = np.random.RandomState(506)\n",
    "N = 240\n",
    "X = pd.DataFrame({'col1': rng.randint(1, 4, N),\n",
    "                  'col2': rng.randint(1, 75, N),\n",
    "                  'col3': rng.randint(0, 2, N),\n",
    "                  'gender': [0, 1] * (N // 2),\n",
    "                  'ethnicity': [1, 1, 0, 0] * (N // 4),\n",
    "                  'other': [1, 0, 0, 0, 1, 0, 0, 1] * (N // 8)\n",
    "                 })\n",
    "\n",
    "# Second, we'll create a randomized target variable\n",
    "y = pd.Series((X['col3']+X['gender']).values + rng.uniform(0, 6, N), name='Example_Target')\n",
    "\n",
    "# Third, we'll split the data and use it to train two generic models\n",
    "splits = train_test_split(X, y, test_size=0.5, random_state=42)\n",
    "X_train, X_test, y_train, y_test = splits\n",
    "\n",
    "model_1 = LinearRegression().fit(X_train, y_train)\n",
    "model_2 = TweedieRegressor().fit(X_train, y_train)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>col1</th>\n",
       "      <th>col2</th>\n",
       "      <th>col3</th>\n",
       "      <th>gender</th>\n",
       "      <th>ethnicity</th>\n",
       "      <th>other</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>15</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>3</td>\n",
       "      <td>51</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>1</td>\n",
       "      <td>30</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2</td>\n",
       "      <td>28</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>1</td>\n",
       "      <td>72</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   col1  col2  col3  gender  ethnicity  other\n",
       "0     1    15     0       0          1      1\n",
       "1     3    51     1       1          1      0\n",
       "2     1    30     1       0          0      0\n",
       "3     2    28     1       1          0      0\n",
       "4     1    72     0       0          1      1"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/plain": [
       "0    1.700759\n",
       "1    2.312593\n",
       "2    6.117705\n",
       "3    3.481302\n",
       "4    1.051515\n",
       "Name: Example_Target, dtype: float64"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "display(X.head(), y.head())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Generalized Reports\n",
    "fairMLHealth has tools to create generalized reports of model bias and performance.\n",
    "\n",
    "The primary reporting tool is the **compare** function, which generates side-by-side comparisons for any number of models, for either binary classification or regression problems. Model performance metrics such as accuracy and precision (or MAE and R-squared for regression problems) are also provided to facilitate comparison.\n",
    "\n",
    "A flagging protocol is applied by default to highlight any cells with values that are out of range. This can be turned off by passing ***flag_oor=False*** to report.compare().\n",
    "\n",
    "Below is an example applying the function to a regression model. Note that the \"fair\" range used to evaluate regression metrics requires judgment on the part of the user. Default ranges have been set to [0.8, 1.2] for ratios, 10% of the available target range for *Mean Prediction Difference*, and 10% of the available MAE range for *MAE Difference*. If these defaults do not meet your needs, disable the flags as described above. More information is available in our [Evaluating Fairness Documentation](./docs/resources/Evaluating_Fairness.md#regression_ranges)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style  type=\"text/css\" >\n",
       "#T_a68cb8b4_0c4c_11ec_a02b_acde48001122row1_col0,#T_a68cb8b4_0c4c_11ec_a02b_acde48001122row2_col0,#T_a68cb8b4_0c4c_11ec_a02b_acde48001122row3_col0{\n",
       "            background-color: #c2bae3;\n",
       "        }</style><table id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122\" ><thead>    <tr>        <th class=\"blank\" ></th>        <th class=\"blank level0\" ></th>        <th class=\"col_heading level0 col0\" >model 1</th>    </tr>    <tr>        <th class=\"index_name level0\" >Metric</th>        <th class=\"index_name level1\" >Measure</th>        <th class=\"blank\" ></th>    </tr></thead><tbody>\n",
       "                <tr>\n",
       "                        <th id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122level0_row0\" class=\"row_heading level0 row0\" rowspan=4>Group Fairness</th>\n",
       "                        <th id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122level1_row0\" class=\"row_heading level1 row0\" >MAE Difference</th>\n",
       "                        <td id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122row0_col0\" class=\"data row0 col0\" >0.3878</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122level1_row1\" class=\"row_heading level1 row1\" >MAE Ratio</th>\n",
       "                        <td id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122row1_col0\" class=\"data row1 col0\" >1.2864</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122level1_row2\" class=\"row_heading level1 row2\" >Mean Prediction Difference</th>\n",
       "                        <td id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122row2_col0\" class=\"data row2 col0\" >-1.0663</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122level1_row3\" class=\"row_heading level1 row3\" >Mean Prediction Ratio</th>\n",
       "                        <td id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122row3_col0\" class=\"data row3 col0\" >0.7721</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                        <th id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122level0_row4\" class=\"row_heading level0 row4\" rowspan=2>Individual Fairness</th>\n",
       "                        <th id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122level1_row4\" class=\"row_heading level1 row4\" >Between-Group Gen. Entropy Error</th>\n",
       "                        <td id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122row4_col0\" class=\"data row4 col0\" >0.0000</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122level1_row5\" class=\"row_heading level1 row5\" >Consistency Score</th>\n",
       "                        <td id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122row5_col0\" class=\"data row5 col0\" >0.3652</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                        <th id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122level0_row6\" class=\"row_heading level0 row6\" rowspan=9>Model Performance</th>\n",
       "                        <th id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122level1_row6\" class=\"row_heading level1 row6\" >MAE</th>\n",
       "                        <td id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122row6_col0\" class=\"data row6 col0\" >1.5547</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122level1_row7\" class=\"row_heading level1 row7\" >MSE</th>\n",
       "                        <td id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122row7_col0\" class=\"data row7 col0\" >3.3753</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122level1_row8\" class=\"row_heading level1 row8\" >Mean Error</th>\n",
       "                        <td id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122row8_col0\" class=\"data row8 col0\" >-0.1224</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122level1_row9\" class=\"row_heading level1 row9\" >Mean Example_Target</th>\n",
       "                        <td id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122row9_col0\" class=\"data row9 col0\" >4.2513</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122level1_row10\" class=\"row_heading level1 row10\" >Mean Prediction</th>\n",
       "                        <td id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122row10_col0\" class=\"data row10 col0\" >4.1290</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122level1_row11\" class=\"row_heading level1 row11\" >Rsqrd</th>\n",
       "                        <td id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122row11_col0\" class=\"data row11 col0\" >0.1326</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122level1_row12\" class=\"row_heading level1 row12\" >Std. Dev. Error</th>\n",
       "                        <td id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122row12_col0\" class=\"data row12 col0\" >1.8408</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122level1_row13\" class=\"row_heading level1 row13\" >Std. Dev. Example_Target</th>\n",
       "                        <td id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122row13_col0\" class=\"data row13 col0\" >1.9809</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122level1_row14\" class=\"row_heading level1 row14\" >Std. Dev. Prediction</th>\n",
       "                        <td id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122row14_col0\" class=\"data row14 col0\" >0.9631</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                        <th id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122level0_row15\" class=\"row_heading level0 row15\" >Data Metrics</th>\n",
       "                        <th id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122level1_row15\" class=\"row_heading level1 row15\" >Prevalence of Privileged Class (%)</th>\n",
       "                        <td id=\"T_a68cb8b4_0c4c_11ec_a02b_acde48001122row15_col0\" class=\"data row15 col0\" >48.0000</td>\n",
       "            </tr>\n",
       "    </tbody></table>"
      ],
      "text/plain": [
       "<pandas.io.formats.style.Styler at 0x7faaa8917f60>"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Generate a measure report\n",
    "report.compare(X_test, y_test, X_test['gender'], model_1, pred_type=\"regression\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Returned type: <class 'pandas.io.formats.style.Styler'>\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<style  type=\"text/css\" >\n",
       "#T_a699aa06_0c4c_11ec_8aa6_acde48001122row1_col0,#T_a699aa06_0c4c_11ec_8aa6_acde48001122row2_col0,#T_a699aa06_0c4c_11ec_8aa6_acde48001122row3_col0{\n",
       "            background-color: #c2bae3;\n",
       "        }</style><table id=\"T_a699aa06_0c4c_11ec_8aa6_acde48001122\" ><thead>    <tr>        <th class=\"blank\" ></th>        <th class=\"blank level0\" ></th>        <th class=\"col_heading level0 col0\" >model 1</th>    </tr>    <tr>        <th class=\"index_name level0\" >Metric</th>        <th class=\"index_name level1\" >Measure</th>        <th class=\"blank\" ></th>    </tr></thead><tbody>\n",
       "                <tr>\n",
       "                        <th id=\"T_a699aa06_0c4c_11ec_8aa6_acde48001122level0_row0\" class=\"row_heading level0 row0\" rowspan=4>Group Fairness</th>\n",
       "                        <th id=\"T_a699aa06_0c4c_11ec_8aa6_acde48001122level1_row0\" class=\"row_heading level1 row0\" >MAE Difference</th>\n",
       "                        <td id=\"T_a699aa06_0c4c_11ec_8aa6_acde48001122row0_col0\" class=\"data row0 col0\" >0.3878</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a699aa06_0c4c_11ec_8aa6_acde48001122level1_row1\" class=\"row_heading level1 row1\" >MAE Ratio</th>\n",
       "                        <td id=\"T_a699aa06_0c4c_11ec_8aa6_acde48001122row1_col0\" class=\"data row1 col0\" >1.2864</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a699aa06_0c4c_11ec_8aa6_acde48001122level1_row2\" class=\"row_heading level1 row2\" >Mean Prediction Difference</th>\n",
       "                        <td id=\"T_a699aa06_0c4c_11ec_8aa6_acde48001122row2_col0\" class=\"data row2 col0\" >-1.0663</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a699aa06_0c4c_11ec_8aa6_acde48001122level1_row3\" class=\"row_heading level1 row3\" >Mean Prediction Ratio</th>\n",
       "                        <td id=\"T_a699aa06_0c4c_11ec_8aa6_acde48001122row3_col0\" class=\"data row3 col0\" >0.7721</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                        <th id=\"T_a699aa06_0c4c_11ec_8aa6_acde48001122level0_row4\" class=\"row_heading level0 row4\" rowspan=2>Individual Fairness</th>\n",
       "                        <th id=\"T_a699aa06_0c4c_11ec_8aa6_acde48001122level1_row4\" class=\"row_heading level1 row4\" >Between-Group Gen. Entropy Error</th>\n",
       "                        <td id=\"T_a699aa06_0c4c_11ec_8aa6_acde48001122row4_col0\" class=\"data row4 col0\" >0.0000</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a699aa06_0c4c_11ec_8aa6_acde48001122level1_row5\" class=\"row_heading level1 row5\" >Consistency Score</th>\n",
       "                        <td id=\"T_a699aa06_0c4c_11ec_8aa6_acde48001122row5_col0\" class=\"data row5 col0\" >0.3652</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                        <th id=\"T_a699aa06_0c4c_11ec_8aa6_acde48001122level0_row6\" class=\"row_heading level0 row6\" >Data Metrics</th>\n",
       "                        <th id=\"T_a699aa06_0c4c_11ec_8aa6_acde48001122level1_row6\" class=\"row_heading level1 row6\" >Prevalence of Privileged Class (%)</th>\n",
       "                        <td id=\"T_a699aa06_0c4c_11ec_8aa6_acde48001122row6_col0\" class=\"data row6 col0\" >48.0000</td>\n",
       "            </tr>\n",
       "    </tbody></table>"
      ],
      "text/plain": [
       "<pandas.io.formats.style.Styler at 0x7faaa9709320>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Display the same report without performance measures\n",
    "bias_report = report.compare(test_data=X_test, \n",
    "                             targets=y_test, \n",
    "                             protected_attr=X_test['gender'], \n",
    "                             models=model_1, \n",
    "                             pred_type=\"regression\", \n",
    "                             skip_performance=True)\n",
    "print(\"Returned type:\", type(bias_report))\n",
    "display(bias_report)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Alternative Return Types\n",
    "\n",
    "By default, the **compare** function returns a flagged comparison as a pandas Styler (pandas.io.formats.style.Styler). When flags are disabled, the default return type is a pandas DataFrame. Outputs can also be returned as embedded HTML, with or without flags, by specifying *output_type=\"html\"*."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Returned type: <class 'pandas.core.frame.DataFrame'>\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>model 1</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Metric</th>\n",
       "      <th>Measure</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th rowspan=\"2\" valign=\"top\">Group Fairness</th>\n",
       "      <th>MAE Difference</th>\n",
       "      <td>0.3878</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>MAE Ratio</th>\n",
       "      <td>1.2864</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                               model 1\n",
       "Metric         Measure                \n",
       "Group Fairness MAE Difference   0.3878\n",
       "               MAE Ratio        1.2864"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# With flags disabled, the report is returned as a pandas DataFrame\n",
    "df = report.compare(test_data=X_test, \n",
    "                   targets=y_test, \n",
    "                   protected_attr=X_test['gender'], \n",
    "                   models=model_1, \n",
    "                   pred_type=\"regression\",\n",
    "                   flag_oor=False)\n",
    "\n",
    "print(\"Returned type:\", type(df))\n",
    "display(df.head(2))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Returned type: <class 'str'>\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<style  type=\"text/css\" >\n",
       "#T_a6b0ca58_0c4c_11ec_bfdb_acde48001122row1_col0,#T_a6b0ca58_0c4c_11ec_bfdb_acde48001122row2_col0,#T_a6b0ca58_0c4c_11ec_bfdb_acde48001122row3_col0{\n",
       "            background-color: #c2bae3;\n",
       "        }</style><table id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122\" ><thead>    <tr>        <th class=\"blank\" ></th>        <th class=\"blank level0\" ></th>        <th class=\"col_heading level0 col0\" >model 1</th>    </tr>    <tr>        <th class=\"index_name level0\" >Metric</th>        <th class=\"index_name level1\" >Measure</th>        <th class=\"blank\" ></th>    </tr></thead><tbody>\n",
       "                <tr>\n",
       "                        <th id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122level0_row0\" class=\"row_heading level0 row0\" rowspan=4>Group Fairness</th>\n",
       "                        <th id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122level1_row0\" class=\"row_heading level1 row0\" >MAE Difference</th>\n",
       "                        <td id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122row0_col0\" class=\"data row0 col0\" >0.3878</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122level1_row1\" class=\"row_heading level1 row1\" >MAE Ratio</th>\n",
       "                        <td id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122row1_col0\" class=\"data row1 col0\" >1.2864</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122level1_row2\" class=\"row_heading level1 row2\" >Mean Prediction Difference</th>\n",
       "                        <td id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122row2_col0\" class=\"data row2 col0\" >-1.0663</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122level1_row3\" class=\"row_heading level1 row3\" >Mean Prediction Ratio</th>\n",
       "                        <td id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122row3_col0\" class=\"data row3 col0\" >0.7721</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                        <th id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122level0_row4\" class=\"row_heading level0 row4\" rowspan=2>Individual Fairness</th>\n",
       "                        <th id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122level1_row4\" class=\"row_heading level1 row4\" >Between-Group Gen. Entropy Error</th>\n",
       "                        <td id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122row4_col0\" class=\"data row4 col0\" >0.0000</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122level1_row5\" class=\"row_heading level1 row5\" >Consistency Score</th>\n",
       "                        <td id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122row5_col0\" class=\"data row5 col0\" >0.3652</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                        <th id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122level0_row6\" class=\"row_heading level0 row6\" rowspan=9>Model Performance</th>\n",
       "                        <th id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122level1_row6\" class=\"row_heading level1 row6\" >MAE</th>\n",
       "                        <td id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122row6_col0\" class=\"data row6 col0\" >1.5547</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122level1_row7\" class=\"row_heading level1 row7\" >MSE</th>\n",
       "                        <td id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122row7_col0\" class=\"data row7 col0\" >3.3753</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122level1_row8\" class=\"row_heading level1 row8\" >Mean Error</th>\n",
       "                        <td id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122row8_col0\" class=\"data row8 col0\" >-0.1224</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122level1_row9\" class=\"row_heading level1 row9\" >Mean Example_Target</th>\n",
       "                        <td id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122row9_col0\" class=\"data row9 col0\" >4.2513</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122level1_row10\" class=\"row_heading level1 row10\" >Mean Prediction</th>\n",
       "                        <td id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122row10_col0\" class=\"data row10 col0\" >4.1290</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122level1_row11\" class=\"row_heading level1 row11\" >Rsqrd</th>\n",
       "                        <td id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122row11_col0\" class=\"data row11 col0\" >0.1326</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122level1_row12\" class=\"row_heading level1 row12\" >Std. Dev. Error</th>\n",
       "                        <td id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122row12_col0\" class=\"data row12 col0\" >1.8408</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122level1_row13\" class=\"row_heading level1 row13\" >Std. Dev. Example_Target</th>\n",
       "                        <td id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122row13_col0\" class=\"data row13 col0\" >1.9809</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122level1_row14\" class=\"row_heading level1 row14\" >Std. Dev. Prediction</th>\n",
       "                        <td id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122row14_col0\" class=\"data row14 col0\" >0.9631</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                        <th id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122level0_row15\" class=\"row_heading level0 row15\" >Data Metrics</th>\n",
       "                        <th id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122level1_row15\" class=\"row_heading level1 row15\" >Prevalence of Privileged Class (%)</th>\n",
       "                        <td id=\"T_a6b0ca58_0c4c_11ec_bfdb_acde48001122row15_col0\" class=\"data row15 col0\" >48.0000</td>\n",
       "            </tr>\n",
       "    </tbody></table>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Comparisons can also be returned as embedded HTML\n",
    "from IPython.display import HTML\n",
    "html_output = report.compare(test_data=X_test, \n",
    "                             targets=y_test, \n",
    "                             protected_attr=X_test['gender'], \n",
    "                             models=model_1, \n",
    "                             pred_type=\"regression\", \n",
    "                             output_type=\"html\")\n",
    "print(\"Returned type:\", type(html_output))\n",
    "HTML(html_output)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Comparing Results for Multiple Models\n",
    "\n",
    "The **compare** tool can also be used to compare different models or different protected attributes. Protected attributes are measured separately and cannot yet be combined with the **compare** tool, although they can be grouped as cohorts in the stratified tables [as shown below](#cohort).\n",
    "\n",
    "Here is an example output comparing the two test models defined above. Missing values have been added for metrics requiring prediction probabilities, which the second model does not have (note the warning below)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style  type=\"text/css\" >\n",
       "#T_a6c56be6_0c4c_11ec_abcc_acde48001122row0_col0,#T_a6c56be6_0c4c_11ec_abcc_acde48001122row1_col0,#T_a6c56be6_0c4c_11ec_abcc_acde48001122row1_col1,#T_a6c56be6_0c4c_11ec_abcc_acde48001122row2_col0,#T_a6c56be6_0c4c_11ec_abcc_acde48001122row3_col0{\n",
       "            background-color: #c2bae3;\n",
       "        }</style><table id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122\" ><thead>    <tr>        <th class=\"blank\" ></th>        <th class=\"blank level0\" ></th>        <th class=\"col_heading level0 col0\" >model 1</th>        <th class=\"col_heading level0 col1\" >model 2</th>    </tr>    <tr>        <th class=\"index_name level0\" >Metric</th>        <th class=\"index_name level1\" >Measure</th>        <th class=\"blank\" ></th>        <th class=\"blank\" ></th>    </tr></thead><tbody>\n",
       "                <tr>\n",
       "                        <th id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122level0_row0\" class=\"row_heading level0 row0\" rowspan=4>Group Fairness</th>\n",
       "                        <th id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122level1_row0\" class=\"row_heading level1 row0\" >MAE Difference</th>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row0_col0\" class=\"data row0 col0\" >0.3878</td>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row0_col1\" class=\"data row0 col1\" >0.3357</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122level1_row1\" class=\"row_heading level1 row1\" >MAE Ratio</th>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row1_col0\" class=\"data row1 col0\" >1.2864</td>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row1_col1\" class=\"data row1 col1\" >1.2271</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122level1_row2\" class=\"row_heading level1 row2\" >Mean Prediction Difference</th>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row2_col0\" class=\"data row2 col0\" >-1.0663</td>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row2_col1\" class=\"data row2 col1\" >-0.2019</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122level1_row3\" class=\"row_heading level1 row3\" >Mean Prediction Ratio</th>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row3_col0\" class=\"data row3 col0\" >0.7721</td>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row3_col1\" class=\"data row3 col1\" >0.9523</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                        <th id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122level0_row4\" class=\"row_heading level0 row4\" rowspan=2>Individual Fairness</th>\n",
       "                        <th id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122level1_row4\" class=\"row_heading level1 row4\" >Between-Group Gen. Entropy Error</th>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row4_col0\" class=\"data row4 col0\" >0.0000</td>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row4_col1\" class=\"data row4 col1\" >0.0000</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122level1_row5\" class=\"row_heading level1 row5\" >Consistency Score</th>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row5_col0\" class=\"data row5 col0\" >0.3652</td>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row5_col1\" class=\"data row5 col1\" >0.8737</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                        <th id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122level0_row6\" class=\"row_heading level0 row6\" rowspan=9>Model Performance</th>\n",
       "                        <th id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122level1_row6\" class=\"row_heading level1 row6\" >MAE</th>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row6_col0\" class=\"data row6 col0\" >1.5547</td>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row6_col1\" class=\"data row6 col1\" >1.6516</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122level1_row7\" class=\"row_heading level1 row7\" >MSE</th>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row7_col0\" class=\"data row7 col0\" >3.3753</td>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row7_col1\" class=\"data row7 col1\" >3.7409</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122level1_row8\" class=\"row_heading level1 row8\" >Mean Error</th>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row8_col0\" class=\"data row8 col0\" >-0.1224</td>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row8_col1\" class=\"data row8 col1\" >-0.1204</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122level1_row9\" class=\"row_heading level1 row9\" >Mean Example_Target</th>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row9_col0\" class=\"data row9 col0\" >4.2513</td>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row9_col1\" class=\"data row9 col1\" >4.2513</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122level1_row10\" class=\"row_heading level1 row10\" >Mean Prediction</th>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row10_col0\" class=\"data row10 col0\" >4.1290</td>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row10_col1\" class=\"data row10 col1\" >4.1310</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122level1_row11\" class=\"row_heading level1 row11\" >Rsqrd</th>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row11_col0\" class=\"data row11 col0\" >0.1326</td>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row11_col1\" class=\"data row11 col1\" >0.0386</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122level1_row12\" class=\"row_heading level1 row12\" >Std. Dev. Error</th>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row12_col0\" class=\"data row12 col0\" >1.8408</td>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row12_col1\" class=\"data row12 col1\" >1.9385</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122level1_row13\" class=\"row_heading level1 row13\" >Std. Dev. Example_Target</th>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row13_col0\" class=\"data row13 col0\" >1.9809</td>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row13_col1\" class=\"data row13 col1\" >1.9809</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122level1_row14\" class=\"row_heading level1 row14\" >Std. Dev. Prediction</th>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row14_col0\" class=\"data row14 col0\" >0.9631</td>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row14_col1\" class=\"data row14 col1\" >0.2086</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                        <th id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122level0_row15\" class=\"row_heading level0 row15\" >Data Metrics</th>\n",
       "                        <th id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122level1_row15\" class=\"row_heading level1 row15\" >Prevalence of Privileged Class (%)</th>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row15_col0\" class=\"data row15 col0\" >48.0000</td>\n",
       "                        <td id=\"T_a6c56be6_0c4c_11ec_abcc_acde48001122row15_col1\" class=\"data row15 col1\" >48.0000</td>\n",
       "            </tr>\n",
       "    </tbody></table>"
      ],
      "text/plain": [
       "<pandas.io.formats.style.Styler at 0x7faaa97095f8>"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Generate a pandas dataframe of measures\n",
    "report.compare(X_test,  \n",
    "               y_test,  \n",
    "               X_test['gender'], \n",
    "               {'model 1':model_1, 'model 2':model_2}, \n",
    "               pred_type=\"regression\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Detailed Analyses\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Significance Testing\n",
    "\n",
    "It is generally recommended to test whether any differences in model outcomes for protected attributes are the effect of sampling error in our test. FairMLHealth comes with a bootstrapping utility and supporting functions that can be used in statistical testing. The bootstrapping utility accepts any function that returns a p-value, and it returns True if the p-value is less than some alpha for a threshold number of randomly sampled trials (False otherwise). While the selection of proper statistical tests is beyond the scope of this notebook, two examples using the bootstrap_significance tool with a built-in Kruskal-Wallis test function are shown below, followed by an example of a single Kruskal-Wallis test."
   ]
  },
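  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the bootstrapping idea concrete, here is a minimal, standard-library sketch of the general technique. It is not FairMLHealth's implementation: `permutation_pval` and `bootstrap_reject` are hypothetical names, a simple permutation test on the difference in means stands in for Kruskal-Wallis, and the defaults (`alpha`, `n_trials`, `n_sample`, `threshold`) are illustrative assumptions.\n",
    "\n",
    "```python\n",
    "import random\n",
    "\n",
    "\n",
    "def permutation_pval(a, b, n_perm=200, rng=None):\n",
    "    # Hypothetical stand-in test: two-sided permutation p-value for a difference in means\n",
    "    rng = rng or random.Random(0)\n",
    "    n_a = len(a)\n",
    "    observed = abs(sum(a) / n_a - sum(b) / len(b))\n",
    "    pooled = list(a) + list(b)\n",
    "    hits = 0\n",
    "    for _ in range(n_perm):\n",
    "        rng.shuffle(pooled)\n",
    "        left, right = pooled[:n_a], pooled[n_a:]\n",
    "        hits += abs(sum(left) / len(left) - sum(right) / len(right)) >= observed\n",
    "    return (hits + 1) / (n_perm + 1)\n",
    "\n",
    "\n",
    "def bootstrap_reject(pval_func, a, b, alpha=0.05, n_trials=50, n_sample=40,\n",
    "                     threshold=0.8, seed=0):\n",
    "    # True if the test rejects H0 (p < alpha) in at least `threshold` of the trials\n",
    "    rng = random.Random(seed)\n",
    "    rejections = 0\n",
    "    for _ in range(n_trials):\n",
    "        # Resample each group with replacement, then test the resampled pair\n",
    "        a_s = rng.choices(a, k=n_sample)\n",
    "        b_s = rng.choices(b, k=n_sample)\n",
    "        rejections += pval_func(a_s, b_s, rng=rng) < alpha\n",
    "    return rejections / n_trials >= threshold\n",
    "\n",
    "\n",
    "# Toy data (assumption): one group shifted by three standard deviations\n",
    "data_rng = random.Random(1)\n",
    "low = [data_rng.gauss(0, 1) for _ in range(100)]\n",
    "high = [data_rng.gauss(3, 1) for _ in range(100)]\n",
    "\n",
    "differs = bootstrap_reject(permutation_pval, low, high)\n",
    "same = bootstrap_reject(permutation_pval, low, low)\n",
    "print('Shifted groups differ?', differs)\n",
    "print('Group compared with itself differs?', same)\n",
    "```\n",
    "\n",
    "Requiring rejection in most trials, rather than in a single test, makes the decision less sensitive to any one random sample. The trade-off is extra computation and one more parameter (the threshold) to choose."
   ]
  },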
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Is the y value different for male vs female?\n",
      " True\n"
     ]
    }
   ],
   "source": [
    "# Example 1: Bootstrap test results applying Kruskal-Wallis relative to gender\n",
    "isMale = X['gender'].eq(1)\n",
    "reject_h0 = stat_utils.bootstrap_significance(func=stat_utils.kruskal_pval, \n",
    "                                              a=y.loc[isMale], \n",
    "                                              b=y.loc[~isMale])\n",
    "print(\"Is the y value different for male vs female?\\n\", reject_h0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Is the y value different for caucasian vs not-caucasian?\n",
      " False\n"
     ]
    }
   ],
   "source": [
    "# Example 2: Bootstrap test results applying Kruskal-Wallis relative to ethnicity\n",
    "isCaucasian = X['ethnicity'].eq(1)\n",
    "reject_h0 = stat_utils.bootstrap_significance(func=stat_utils.kruskal_pval, \n",
    "                                              a=y.loc[isCaucasian], \n",
    "                                              b=y.loc[~isCaucasian])\n",
    "print(\"Is the y value different for caucasian vs not-caucasian?\\n\", reject_h0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "P-Value of single K-W test: 2.981592458110808e-10\n"
     ]
    }
   ],
   "source": [
    "# Example 3: Single Kruskal-Wallis test\n",
    "pval = stat_utils.kruskal_pval(a=y.loc[X['col3'].eq(1)], \n",
    "                               b=y.loc[X['col3'].eq(0)], \n",
    "                               # If n_sample set to None, tests on full dataset rather than sample\n",
    "                               n_sample=None\n",
    "                              )\n",
    "print(\"P-Value of single K-W test:\", pval)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Stratified Tables\n",
    "FairMLHealth also provides tools for detailed analysis of model variance by way of stratified data, performance, and bias tables. Beyond evaluating fairness, these tools are intended for flexible use in any generic assessment of model bias. Tables can evaluate multiple features at once. *An important update starting in Version 1.0.0 is that all of these features are now contained in the **measure.py** module (previously named reports.py).*\n",
    "\n",
    "All tables display a summary row for \"All Features, All Values\". This summary can be turned off by passing ***add_overview=False*** to measure.data()."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Data Tables\n",
    "\n",
    "The stratified data table can be used to evaluate data against one or multiple targets. Two methods are available for identifying which features to assess, as shown in the examples below. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Feature Name</th>\n",
       "      <th>Feature Value</th>\n",
       "      <th>Obs.</th>\n",
       "      <th>Entropy</th>\n",
       "      <th>Mean Example_Target</th>\n",
       "      <th>Median Example_Target</th>\n",
       "      <th>Missing Values</th>\n",
       "      <th>Std. Dev. Example_Target</th>\n",
       "      <th>Value Prevalence</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>ALL FEATURES</td>\n",
       "      <td>ALL VALUES</td>\n",
       "      <td>120</td>\n",
       "      <td>NaN</td>\n",
       "      <td>4.2513</td>\n",
       "      <td>4.5745</td>\n",
       "      <td>0</td>\n",
       "      <td>1.9809</td>\n",
       "      <td>1.0000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>gender</td>\n",
       "      <td>0</td>\n",
       "      <td>62</td>\n",
       "      <td>0.9992</td>\n",
       "      <td>3.5410</td>\n",
       "      <td>3.7835</td>\n",
       "      <td>0</td>\n",
       "      <td>2.0357</td>\n",
       "      <td>0.5167</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>gender</td>\n",
       "      <td>1</td>\n",
       "      <td>58</td>\n",
       "      <td>0.9992</td>\n",
       "      <td>5.0106</td>\n",
       "      <td>5.0673</td>\n",
       "      <td>0</td>\n",
       "      <td>1.6192</td>\n",
       "      <td>0.4833</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   Feature Name Feature Value  Obs.  Entropy  Mean Example_Target  \\\n",
       "0  ALL FEATURES    ALL VALUES   120      NaN               4.2513   \n",
       "1        gender             0    62   0.9992               3.5410   \n",
       "2        gender             1    58   0.9992               5.0106   \n",
       "\n",
       "   Median Example_Target  Missing Values  Std. Dev. Example_Target  \\\n",
       "0                 4.5745               0                    1.9809   \n",
       "1                 3.7835               0                    2.0357   \n",
       "2                 5.0673               0                    1.6192   \n",
       "\n",
       "   Value Prevalence  \n",
       "0            1.0000  \n",
       "1            0.5167  \n",
       "2            0.4833  "
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Arguments Option 1: pass full set of data, subsetting with *features* argument\n",
    "measure.data(X_test, y_test, features=['gender'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Feature Name</th>\n",
       "      <th>Feature Value</th>\n",
       "      <th>Obs.</th>\n",
       "      <th>Entropy</th>\n",
       "      <th>Mean Example_Target</th>\n",
       "      <th>Median Example_Target</th>\n",
       "      <th>Missing Values</th>\n",
       "      <th>Std. Dev. Example_Target</th>\n",
       "      <th>Value Prevalence</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>ALL FEATURES</td>\n",
       "      <td>ALL VALUES</td>\n",
       "      <td>120</td>\n",
       "      <td>NaN</td>\n",
       "      <td>4.2513</td>\n",
       "      <td>4.5745</td>\n",
       "      <td>0</td>\n",
       "      <td>1.9809</td>\n",
       "      <td>1.0000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>gender</td>\n",
       "      <td>0</td>\n",
       "      <td>62</td>\n",
       "      <td>0.9992</td>\n",
       "      <td>3.5410</td>\n",
       "      <td>3.7835</td>\n",
       "      <td>0</td>\n",
       "      <td>2.0357</td>\n",
       "      <td>0.5167</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>gender</td>\n",
       "      <td>1</td>\n",
       "      <td>58</td>\n",
       "      <td>0.9992</td>\n",
       "      <td>5.0106</td>\n",
       "      <td>5.0673</td>\n",
       "      <td>0</td>\n",
       "      <td>1.6192</td>\n",
       "      <td>0.4833</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   Feature Name Feature Value  Obs.  Entropy  Mean Example_Target  \\\n",
       "0  ALL FEATURES    ALL VALUES   120      NaN               4.2513   \n",
       "1        gender             0    62   0.9992               3.5410   \n",
       "2        gender             1    58   0.9992               5.0106   \n",
       "\n",
       "   Median Example_Target  Missing Values  Std. Dev. Example_Target  \\\n",
       "0                 4.5745               0                    1.9809   \n",
       "1                 3.7835               0                    2.0357   \n",
       "2                 5.0673               0                    1.6192   \n",
       "\n",
       "   Value Prevalence  \n",
       "0            1.0000  \n",
       "1            0.5167  \n",
       "2            0.4833  "
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Arguments Option 2: pass the data subset of interest without using the *features* argument\n",
    "measure.data(X_test['gender'], y_test)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Feature Name</th>\n",
       "      <th>Feature Value</th>\n",
       "      <th>Obs.</th>\n",
       "      <th>Entropy</th>\n",
       "      <th>Mean col2</th>\n",
       "      <th>Mean col3</th>\n",
       "      <th>Median col2</th>\n",
       "      <th>Median col3</th>\n",
       "      <th>Missing Values</th>\n",
       "      <th>Std. Dev. col2</th>\n",
       "      <th>Std. Dev. col3</th>\n",
       "      <th>Value Prevalence</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>gender</td>\n",
       "      <td>0</td>\n",
       "      <td>62</td>\n",
       "      <td>0.9992</td>\n",
       "      <td>36.6452</td>\n",
       "      <td>0.4677</td>\n",
       "      <td>34.5</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0</td>\n",
       "      <td>22.5811</td>\n",
       "      <td>0.5030</td>\n",
       "      <td>0.5167</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>gender</td>\n",
       "      <td>1</td>\n",
       "      <td>58</td>\n",
       "      <td>0.9992</td>\n",
       "      <td>36.2241</td>\n",
       "      <td>0.6034</td>\n",
       "      <td>32.5</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0</td>\n",
       "      <td>20.9821</td>\n",
       "      <td>0.4935</td>\n",
       "      <td>0.4833</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>col1</td>\n",
       "      <td>1</td>\n",
       "      <td>51</td>\n",
       "      <td>1.5579</td>\n",
       "      <td>36.4706</td>\n",
       "      <td>0.6471</td>\n",
       "      <td>32.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0</td>\n",
       "      <td>21.2935</td>\n",
       "      <td>0.4826</td>\n",
       "      <td>0.4250</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>col1</td>\n",
       "      <td>2</td>\n",
       "      <td>33</td>\n",
       "      <td>1.5579</td>\n",
       "      <td>33.6364</td>\n",
       "      <td>0.4545</td>\n",
       "      <td>30.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0</td>\n",
       "      <td>21.4226</td>\n",
       "      <td>0.5056</td>\n",
       "      <td>0.2750</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>col1</td>\n",
       "      <td>3</td>\n",
       "      <td>36</td>\n",
       "      <td>1.5579</td>\n",
       "      <td>38.9722</td>\n",
       "      <td>0.4444</td>\n",
       "      <td>40.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0</td>\n",
       "      <td>22.9016</td>\n",
       "      <td>0.5040</td>\n",
       "      <td>0.3000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  Feature Name Feature Value  Obs.  Entropy  Mean col2  Mean col3  \\\n",
       "0       gender             0    62   0.9992    36.6452     0.4677   \n",
       "1       gender             1    58   0.9992    36.2241     0.6034   \n",
       "2         col1             1    51   1.5579    36.4706     0.6471   \n",
       "3         col1             2    33   1.5579    33.6364     0.4545   \n",
       "4         col1             3    36   1.5579    38.9722     0.4444   \n",
       "\n",
       "   Median col2  Median col3  Missing Values  Std. Dev. col2  Std. Dev. col3  \\\n",
       "0         34.5          0.0               0         22.5811          0.5030   \n",
       "1         32.5          1.0               0         20.9821          0.4935   \n",
       "2         32.0          1.0               0         21.2935          0.4826   \n",
       "3         30.0          0.0               0         21.4226          0.5056   \n",
       "4         40.0          0.0               0         22.9016          0.5040   \n",
       "\n",
       "   Value Prevalence  \n",
       "0            0.5167  \n",
       "1            0.4833  \n",
       "2            0.4250  \n",
       "3            0.2750  \n",
       "4            0.3000  "
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Display a similar report for multiple targets, dropping the summary row\n",
    "measure.data(X=X_test, # used to define rows\n",
    "             Y=X_test, # used to define columns\n",
    "             features=['gender', 'col1'], # optional subset of X\n",
    "             targets=['col2', 'col3'], # optional subset of Y\n",
    "             add_overview=False # turns off \"All Features, All Values\" row\n",
    "             )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Feature Name</th>\n",
       "      <th>Feature Value</th>\n",
       "      <th>Mean col2</th>\n",
       "      <th>Mean col3</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>gender</td>\n",
       "      <td>1</td>\n",
       "      <td>36.2241</td>\n",
       "      <td>0.6034</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>col1</td>\n",
       "      <td>1</td>\n",
       "      <td>36.4706</td>\n",
       "      <td>0.6471</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  Feature Name Feature Value  Mean col2  Mean col3\n",
       "2       gender             1    36.2241     0.6034\n",
       "3         col1             1    36.4706     0.6471"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Analytical tables are output as pandas DataFrames\n",
    "test_table = measure.data(X=X_test[['gender', 'col1']], # used to define rows\n",
    "                          Y=X_test[['col2', 'col3']], # used to define columns\n",
    "                         )\n",
    "\n",
    "test_table.loc[test_table['Feature Value'].eq(\"1\"), ['Feature Name', 'Feature Value', 'Mean col2', 'Mean col3']]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Stratified Performance Tables\n",
    "\n",
    "The stratified performance table evaluates model performance specific to each feature-value subset. These tables are compatible with both classification and regression models."
   ]
  },
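  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough illustration of what stratifying by feature value means, the sketch below groups a handful of made-up rows by feature value and computes a few of the same columns for each group. The rows and the `summarize` helper are hypothetical, and the error is taken as prediction minus target, which is the sign convention suggested by the Mean Error column in this notebook's outputs. This is not how **measure.performance()** is implemented.\n",
    "\n",
    "```python\n",
    "from collections import defaultdict\n",
    "\n",
    "# Hypothetical toy rows: (feature value, true target, model prediction)\n",
    "rows = [\n",
    "    (0, 3.0, 3.5), (0, 4.0, 3.0), (0, 2.0, 2.5),\n",
    "    (1, 5.0, 4.5), (1, 6.0, 5.5), (1, 4.0, 4.0),\n",
    "]\n",
    "\n",
    "# Collect prediction errors (prediction minus target) for each feature value\n",
    "errors = defaultdict(list)\n",
    "for value, target, prediction in rows:\n",
    "    errors[value].append(prediction - target)\n",
    "\n",
    "\n",
    "def summarize(errs):\n",
    "    # Per-group statistics, named after the columns of the performance table\n",
    "    n = len(errs)\n",
    "    return {\n",
    "        'Obs.': n,\n",
    "        'MAE': sum(abs(e) for e in errs) / n,\n",
    "        'MSE': sum(e * e for e in errs) / n,\n",
    "        'Mean Error': sum(errs) / n,\n",
    "    }\n",
    "\n",
    "\n",
    "table = {value: summarize(errs) for value, errs in sorted(errors.items())}\n",
    "for value, stats in table.items():\n",
    "    print(value, stats)\n",
    "```\n",
    "\n",
    "Comparing the rows of such a table side by side is what exposes gaps between groups, for instance a larger MAE for one feature value than for another."
   ]
  },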
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Feature Name</th>\n",
       "      <th>Feature Value</th>\n",
       "      <th>Obs.</th>\n",
       "      <th>Mean Target</th>\n",
       "      <th>Mean Prediction</th>\n",
       "      <th>MAE</th>\n",
       "      <th>MSE</th>\n",
       "      <th>Mean Error</th>\n",
       "      <th>Rsqrd</th>\n",
       "      <th>Std. Dev. Error</th>\n",
       "      <th>Std. Dev. Prediction</th>\n",
       "      <th>Std. Dev. Target</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>ALL FEATURES</td>\n",
       "      <td>ALL VALUES</td>\n",
       "      <td>120.0</td>\n",
       "      <td>4.2513</td>\n",
       "      <td>4.1290</td>\n",
       "      <td>1.5547</td>\n",
       "      <td>3.3753</td>\n",
       "      <td>-0.1224</td>\n",
       "      <td>0.1326</td>\n",
       "      <td>1.8408</td>\n",
       "      <td>0.9631</td>\n",
       "      <td>1.9809</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>gender</td>\n",
       "      <td>0</td>\n",
       "      <td>62.0</td>\n",
       "      <td>3.5410</td>\n",
       "      <td>3.6136</td>\n",
       "      <td>1.7422</td>\n",
       "      <td>3.8787</td>\n",
       "      <td>0.0725</td>\n",
       "      <td>0.0487</td>\n",
       "      <td>1.9842</td>\n",
       "      <td>0.7671</td>\n",
       "      <td>2.0357</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>gender</td>\n",
       "      <td>1</td>\n",
       "      <td>58.0</td>\n",
       "      <td>5.0106</td>\n",
       "      <td>4.6799</td>\n",
       "      <td>1.3544</td>\n",
       "      <td>2.8372</td>\n",
       "      <td>-0.3307</td>\n",
       "      <td>-0.1012</td>\n",
       "      <td>1.6660</td>\n",
       "      <td>0.8420</td>\n",
       "      <td>1.6192</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   Feature Name Feature Value   Obs.  Mean Target  Mean Prediction     MAE  \\\n",
       "0  ALL FEATURES    ALL VALUES  120.0       4.2513           4.1290  1.5547   \n",
       "1        gender             0   62.0       3.5410           3.6136  1.7422   \n",
       "2        gender             1   58.0       5.0106           4.6799  1.3544   \n",
       "\n",
       "      MSE  Mean Error   Rsqrd  Std. Dev. Error  Std. Dev. Prediction  \\\n",
       "0  3.3753     -0.1224  0.1326           1.8408                0.9631   \n",
       "1  3.8787      0.0725  0.0487           1.9842                0.7671   \n",
       "2  2.8372     -0.3307 -0.1012           1.6660                0.8420   \n",
       "\n",
       "   Std. Dev. Target  \n",
       "0            1.9809  \n",
       "1            2.0357  \n",
       "2            1.6192  "
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Performance table example\n",
    "measure.performance(X_test[['gender']], y_test, model_1.predict(X_test), \n",
    "                    pred_type=\"regression\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Stratified Bias Tables\n",
    "\n",
    "The stratified bias analysis feature applies fairness-related metrics for each feature-value pair. It assumes a given feature-value as the \"privileged\" group relative to all other possible values for the feature. For example, in the table output by the cell below, row **2** displays measures for **\"col1\"** with a value of **\"2\"**. For this row, \"2\" is considered to be the privileged group, while all other non-null values (namely \"1\" and \"3\") are considered unprivileged.\n",
    "\n",
    "Note that the *flag* function is compatible with both **measure.bias()** and **measure.summary()** (the latter is demonstrated below). However, to enable colored cells the tool returns a pandas Styler rather than a DataFrame. For this reason, *flag_oor* is False by default for these functions. Flagging can be turned on by passing *flag_oor=True* to either function. As an added feature, optional custom ranges can be passed to either **measure.bias()** or **measure.summary()** to facilitate regression evaluation as shown below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style  type=\"text/css\" >\n",
       "#T_a740b90c_0c4c_11ec_9618_acde48001122row0_col3,#T_a740b90c_0c4c_11ec_9618_acde48001122row0_col5,#T_a740b90c_0c4c_11ec_9618_acde48001122row1_col3,#T_a740b90c_0c4c_11ec_9618_acde48001122row1_col5{\n",
       "            background-color: #c2bae3;\n",
       "        }</style><table id=\"T_a740b90c_0c4c_11ec_9618_acde48001122\" ><thead>    <tr>        <th class=\"blank level0\" ></th>        <th class=\"col_heading level0 col0\" >Feature Name</th>        <th class=\"col_heading level0 col1\" >Feature Value</th>        <th class=\"col_heading level0 col2\" >MAE Difference</th>        <th class=\"col_heading level0 col3\" >MAE Ratio</th>        <th class=\"col_heading level0 col4\" >Mean Prediction Difference</th>        <th class=\"col_heading level0 col5\" >Mean Prediction Ratio</th>    </tr></thead><tbody>\n",
       "                <tr>\n",
       "                        <th id=\"T_a740b90c_0c4c_11ec_9618_acde48001122level0_row0\" class=\"row_heading level0 row0\" >0</th>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row0_col0\" class=\"data row0 col0\" >gender</td>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row0_col1\" class=\"data row0 col1\" >0</td>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row0_col2\" class=\"data row0 col2\" >-0.3878</td>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row0_col3\" class=\"data row0 col3\" >0.7774</td>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row0_col4\" class=\"data row0 col4\" >1.0663</td>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row0_col5\" class=\"data row0 col5\" >1.2951</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                        <th id=\"T_a740b90c_0c4c_11ec_9618_acde48001122level0_row1\" class=\"row_heading level0 row1\" >1</th>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row1_col0\" class=\"data row1 col0\" >gender</td>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row1_col1\" class=\"data row1 col1\" >1</td>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row1_col2\" class=\"data row1 col2\" >0.3878</td>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row1_col3\" class=\"data row1 col3\" >1.2864</td>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row1_col4\" class=\"data row1 col4\" >-1.0663</td>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row1_col5\" class=\"data row1 col5\" >0.7721</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                        <th id=\"T_a740b90c_0c4c_11ec_9618_acde48001122level0_row2\" class=\"row_heading level0 row2\" >2</th>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row2_col0\" class=\"data row2 col0\" >col1</td>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row2_col1\" class=\"data row2 col1\" >1</td>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row2_col2\" class=\"data row2 col2\" >-0.2275</td>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row2_col3\" class=\"data row2 col3\" >0.8650</td>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row2_col4\" class=\"data row2 col4\" >0.1545</td>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row2_col5\" class=\"data row2 col5\" >1.0382</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                        <th id=\"T_a740b90c_0c4c_11ec_9618_acde48001122level0_row3\" class=\"row_heading level0 row3\" >3</th>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row3_col0\" class=\"data row3 col0\" >col1</td>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row3_col1\" class=\"data row3 col1\" >2</td>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row3_col2\" class=\"data row3 col2\" >0.2495</td>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row3_col3\" class=\"data row3 col3\" >1.1816</td>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row3_col4\" class=\"data row3 col4\" >0.1337</td>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row3_col5\" class=\"data row3 col5\" >1.0332</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                        <th id=\"T_a740b90c_0c4c_11ec_9618_acde48001122level0_row4\" class=\"row_heading level0 row4\" >4</th>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row4_col0\" class=\"data row4 col0\" >col1</td>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row4_col1\" class=\"data row4 col1\" >3</td>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row4_col2\" class=\"data row4 col2\" >0.0279</td>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row4_col3\" class=\"data row4 col3\" >1.0182</td>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row4_col4\" class=\"data row4 col4\" >-0.3067</td>\n",
       "                        <td id=\"T_a740b90c_0c4c_11ec_9618_acde48001122row4_col5\" class=\"data row4 col5\" >0.9294</td>\n",
       "            </tr>\n",
       "    </tbody></table>"
      ],
      "text/plain": [
       "<pandas.io.formats.style.Styler at 0x7faaa9770a90>"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Custom \"fair\" ranges may be passed as dictionaries of tuples whose keys \n",
    "# are case-insensitive measure names\n",
    "my_ranges = {'mean prediction difference':(-2, 2)}\n",
    "\n",
    "# Note that flag_oor is set to False by default for this feature\n",
    "measure.bias(X_test[['gender', 'col1']],\n",
    "             y_test,\n",
    "             model_1.predict(X_test),\n",
    "             pred_type=\"regression\",\n",
    "             flag_oor=True,\n",
    "             custom_ranges=my_ranges)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The **measure** module also contains a summary function that works similarly to **report.compare()**. While it can only be applied to one model at a time, it accepts custom \"fair\" ranges as well as cohort groups, as [shown in the next section](#cohort)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style  type=\"text/css\" >\n",
       "#T_a74d55cc_0c4c_11ec_aed5_acde48001122row1_col0,#T_a74d55cc_0c4c_11ec_aed5_acde48001122row2_col0,#T_a74d55cc_0c4c_11ec_aed5_acde48001122row3_col0{\n",
       "            background-color: #c2bae3;\n",
       "        }</style><table id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122\" ><thead>    <tr>        <th class=\"blank\" ></th>        <th class=\"blank level0\" ></th>        <th class=\"col_heading level0 col0\" >Value</th>    </tr>    <tr>        <th class=\"index_name level0\" >Metric</th>        <th class=\"index_name level1\" >Measure</th>        <th class=\"blank\" ></th>    </tr></thead><tbody>\n",
       "                <tr>\n",
       "                        <th id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122level0_row0\" class=\"row_heading level0 row0\" rowspan=4>Group Fairness</th>\n",
       "                        <th id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122level1_row0\" class=\"row_heading level1 row0\" >MAE Difference</th>\n",
       "                        <td id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122row0_col0\" class=\"data row0 col0\" >0.3878</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122level1_row1\" class=\"row_heading level1 row1\" >MAE Ratio</th>\n",
       "                        <td id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122row1_col0\" class=\"data row1 col0\" >1.2864</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122level1_row2\" class=\"row_heading level1 row2\" >Mean Prediction Difference</th>\n",
       "                        <td id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122row2_col0\" class=\"data row2 col0\" >-1.0663</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122level1_row3\" class=\"row_heading level1 row3\" >Mean Prediction Ratio</th>\n",
       "                        <td id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122row3_col0\" class=\"data row3 col0\" >0.7721</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                        <th id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122level0_row4\" class=\"row_heading level0 row4\" rowspan=2>Individual Fairness</th>\n",
       "                        <th id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122level1_row4\" class=\"row_heading level1 row4\" >Between-Group Gen. Entropy Error</th>\n",
       "                        <td id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122row4_col0\" class=\"data row4 col0\" >0.0000</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122level1_row5\" class=\"row_heading level1 row5\" >Consistency Score</th>\n",
       "                        <td id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122row5_col0\" class=\"data row5 col0\" >0.3141</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                        <th id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122level0_row6\" class=\"row_heading level0 row6\" rowspan=9>Model Performance</th>\n",
       "                        <th id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122level1_row6\" class=\"row_heading level1 row6\" >MAE</th>\n",
       "                        <td id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122row6_col0\" class=\"data row6 col0\" >1.5547</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122level1_row7\" class=\"row_heading level1 row7\" >MSE</th>\n",
       "                        <td id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122row7_col0\" class=\"data row7 col0\" >3.3753</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122level1_row8\" class=\"row_heading level1 row8\" >Mean Error</th>\n",
       "                        <td id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122row8_col0\" class=\"data row8 col0\" >-0.1224</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122level1_row9\" class=\"row_heading level1 row9\" >Mean Example_Target</th>\n",
       "                        <td id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122row9_col0\" class=\"data row9 col0\" >4.2513</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122level1_row10\" class=\"row_heading level1 row10\" >Mean Prediction</th>\n",
       "                        <td id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122row10_col0\" class=\"data row10 col0\" >4.1290</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122level1_row11\" class=\"row_heading level1 row11\" >Rsqrd</th>\n",
       "                        <td id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122row11_col0\" class=\"data row11 col0\" >0.1326</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122level1_row12\" class=\"row_heading level1 row12\" >Std. Dev. Error</th>\n",
       "                        <td id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122row12_col0\" class=\"data row12 col0\" >1.8408</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122level1_row13\" class=\"row_heading level1 row13\" >Std. Dev. Example_Target</th>\n",
       "                        <td id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122row13_col0\" class=\"data row13 col0\" >1.9809</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                                <th id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122level1_row14\" class=\"row_heading level1 row14\" >Std. Dev. Prediction</th>\n",
       "                        <td id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122row14_col0\" class=\"data row14 col0\" >0.9631</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                        <th id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122level0_row15\" class=\"row_heading level0 row15\" >Data Metrics</th>\n",
       "                        <th id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122level1_row15\" class=\"row_heading level1 row15\" >Prevalence of Privileged Class (%)</th>\n",
       "                        <td id=\"T_a74d55cc_0c4c_11ec_aed5_acde48001122row15_col0\" class=\"data row15 col0\" >48.0000</td>\n",
       "            </tr>\n",
       "    </tbody></table>"
      ],
      "text/plain": [
       "<pandas.io.formats.style.Styler at 0x7faaa97099b0>"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Example summary output for the regression model with custom ranges\n",
    "measure.summary(X_test[['gender', 'col1']],\n",
    "                y_test,\n",
    "                model_1.predict(X_test),\n",
    "                prtc_attr=X_test['gender'],\n",
    "                pred_type=\"regression\",\n",
    "                flag_oor=True,\n",
    "                custom_ranges={'mean prediction difference': (-0.5, 2)})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## <a name=\"cohort\"></a>Analysis by Cohort\n",
    "\n",
    "Output from the table-generating functions in the **measure** module can be further grouped by cohort: the *cohort_labels* argument specifies additional labels for each observation. Cohorts may consist of either a single label or a set of labels, and may be either separate from or attached to the existing data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style  type=\"text/css\" >\n",
       "#T_a7659440_0c4c_11ec_83f1_acde48001122row0_col4,#T_a7659440_0c4c_11ec_83f1_acde48001122row0_col6,#T_a7659440_0c4c_11ec_83f1_acde48001122row1_col4,#T_a7659440_0c4c_11ec_83f1_acde48001122row1_col6,#T_a7659440_0c4c_11ec_83f1_acde48001122row2_col4,#T_a7659440_0c4c_11ec_83f1_acde48001122row2_col6,#T_a7659440_0c4c_11ec_83f1_acde48001122row3_col4,#T_a7659440_0c4c_11ec_83f1_acde48001122row3_col6,#T_a7659440_0c4c_11ec_83f1_acde48001122row4_col4,#T_a7659440_0c4c_11ec_83f1_acde48001122row4_col6,#T_a7659440_0c4c_11ec_83f1_acde48001122row5_col4,#T_a7659440_0c4c_11ec_83f1_acde48001122row5_col6{\n",
       "            background-color: #c2bae3;\n",
       "        }</style><table id=\"T_a7659440_0c4c_11ec_83f1_acde48001122\" ><thead>    <tr>        <th class=\"blank level0\" ></th>        <th class=\"col_heading level0 col0\" >True Value Group</th>        <th class=\"col_heading level0 col1\" >Feature Name</th>        <th class=\"col_heading level0 col2\" >Feature Value</th>        <th class=\"col_heading level0 col3\" >MAE Difference</th>        <th class=\"col_heading level0 col4\" >MAE Ratio</th>        <th class=\"col_heading level0 col5\" >Mean Prediction Difference</th>        <th class=\"col_heading level0 col6\" >Mean Prediction Ratio</th>    </tr></thead><tbody>\n",
       "                <tr>\n",
       "                        <th id=\"T_a7659440_0c4c_11ec_83f1_acde48001122level0_row0\" class=\"row_heading level0 row0\" >0</th>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row0_col0\" class=\"data row0 col0\" >0</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row0_col1\" class=\"data row0 col1\" >col3</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row0_col2\" class=\"data row0 col2\" >0</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row0_col3\" class=\"data row0 col3\" >0.9421</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row0_col4\" class=\"data row0 col4\" >1.5954</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row0_col5\" class=\"data row0 col5\" >1.4668</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row0_col6\" class=\"data row0 col6\" >1.4613</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                        <th id=\"T_a7659440_0c4c_11ec_83f1_acde48001122level0_row1\" class=\"row_heading level0 row1\" >1</th>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row1_col0\" class=\"data row1 col0\" >0</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row1_col1\" class=\"data row1 col1\" >col3</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row1_col2\" class=\"data row1 col2\" >1</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row1_col3\" class=\"data row1 col3\" >-0.9421</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row1_col4\" class=\"data row1 col4\" >0.6268</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row1_col5\" class=\"data row1 col5\" >-1.4668</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row1_col6\" class=\"data row1 col6\" >0.6843</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                        <th id=\"T_a7659440_0c4c_11ec_83f1_acde48001122level0_row2\" class=\"row_heading level0 row2\" >2</th>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row2_col0\" class=\"data row2 col0\" >1</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row2_col1\" class=\"data row2 col1\" >col3</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row2_col2\" class=\"data row2 col2\" >0</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row2_col3\" class=\"data row2 col3\" >-0.4956</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row2_col4\" class=\"data row2 col4\" >0.5698</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row2_col5\" class=\"data row2 col5\" >1.4232</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row2_col6\" class=\"data row2 col6\" >1.4092</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                        <th id=\"T_a7659440_0c4c_11ec_83f1_acde48001122level0_row3\" class=\"row_heading level0 row3\" >3</th>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row3_col0\" class=\"data row3 col0\" >1</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row3_col1\" class=\"data row3 col1\" >col3</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row3_col2\" class=\"data row3 col2\" >1</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row3_col3\" class=\"data row3 col3\" >0.4956</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row3_col4\" class=\"data row3 col4\" >1.7549</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row3_col5\" class=\"data row3 col5\" >-1.4232</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row3_col6\" class=\"data row3 col6\" >0.7096</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                        <th id=\"T_a7659440_0c4c_11ec_83f1_acde48001122level0_row4\" class=\"row_heading level0 row4\" >4</th>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row4_col0\" class=\"data row4 col0\" >2</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row4_col1\" class=\"data row4 col1\" >col3</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row4_col2\" class=\"data row4 col2\" >0</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row4_col3\" class=\"data row4 col3\" >-1.1291</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row4_col4\" class=\"data row4 col4\" >0.5810</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row4_col5\" class=\"data row4 col5\" >1.2833</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row4_col6\" class=\"data row4 col6\" >1.3623</td>\n",
       "            </tr>\n",
       "            <tr>\n",
       "                        <th id=\"T_a7659440_0c4c_11ec_83f1_acde48001122level0_row5\" class=\"row_heading level0 row5\" >5</th>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row5_col0\" class=\"data row5 col0\" >2</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row5_col1\" class=\"data row5 col1\" >col3</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row5_col2\" class=\"data row5 col2\" >1</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row5_col3\" class=\"data row5 col3\" >1.1291</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row5_col4\" class=\"data row5 col4\" >1.7211</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row5_col5\" class=\"data row5 col5\" >-1.2833</td>\n",
       "                        <td id=\"T_a7659440_0c4c_11ec_83f1_acde48001122row5_col6\" class=\"data row5 col6\" >0.7340</td>\n",
       "            </tr>\n",
       "    </tbody></table>"
      ],
      "text/plain": [
       "<pandas.io.formats.style.Styler at 0x7faaa9701278>"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Define cohort labels relative to the true values of the target\n",
    "cohort_labels = pd.qcut(y_test, 3, labels=False).rename('True Value Group')\n",
    "\n",
    "# Separate, Single-Level Cohorts\n",
    "measure.bias(X_test['col3'], y_test, model_1.predict(X_test), \n",
    "             pred_type=\"regression\", flag_oor=True, \n",
    "             cohort_labels=cohort_labels)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>Feature Name</th>\n",
       "      <th>Feature Value</th>\n",
       "      <th>Obs.</th>\n",
       "      <th>Entropy</th>\n",
       "      <th>Mean Example_Target</th>\n",
       "      <th>Median Example_Target</th>\n",
       "      <th>Missing Values</th>\n",
       "      <th>Std. Dev. Example_Target</th>\n",
       "      <th>Value Prevalence</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>gender</th>\n",
       "      <th>ethnicity</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th rowspan=\"6\" valign=\"top\">0</th>\n",
       "      <th>0</th>\n",
       "      <td>ALL FEATURES</td>\n",
       "      <td>ALL VALUES</td>\n",
       "      <td>29</td>\n",
       "      <td>NaN</td>\n",
       "      <td>3.9273</td>\n",
       "      <td>3.9847</td>\n",
       "      <td>0</td>\n",
       "      <td>1.9164</td>\n",
       "      <td>1.0000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>col3</td>\n",
       "      <td>0</td>\n",
       "      <td>15</td>\n",
       "      <td>0.9991</td>\n",
       "      <td>3.3797</td>\n",
       "      <td>3.7754</td>\n",
       "      <td>0</td>\n",
       "      <td>1.9532</td>\n",
       "      <td>0.5172</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>col3</td>\n",
       "      <td>1</td>\n",
       "      <td>14</td>\n",
       "      <td>0.9991</td>\n",
       "      <td>4.5141</td>\n",
       "      <td>4.4485</td>\n",
       "      <td>0</td>\n",
       "      <td>1.7564</td>\n",
       "      <td>0.4828</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>ALL FEATURES</td>\n",
       "      <td>ALL VALUES</td>\n",
       "      <td>33</td>\n",
       "      <td>NaN</td>\n",
       "      <td>3.2016</td>\n",
       "      <td>2.6024</td>\n",
       "      <td>0</td>\n",
       "      <td>2.1053</td>\n",
       "      <td>1.0000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>col3</td>\n",
       "      <td>0</td>\n",
       "      <td>18</td>\n",
       "      <td>0.9940</td>\n",
       "      <td>2.4920</td>\n",
       "      <td>1.6709</td>\n",
       "      <td>0</td>\n",
       "      <td>1.8859</td>\n",
       "      <td>0.5455</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>col3</td>\n",
       "      <td>1</td>\n",
       "      <td>15</td>\n",
       "      <td>0.9940</td>\n",
       "      <td>4.0530</td>\n",
       "      <td>5.1426</td>\n",
       "      <td>0</td>\n",
       "      <td>2.0949</td>\n",
       "      <td>0.4545</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"6\" valign=\"top\">1</th>\n",
       "      <th>0</th>\n",
       "      <td>ALL FEATURES</td>\n",
       "      <td>ALL VALUES</td>\n",
       "      <td>26</td>\n",
       "      <td>NaN</td>\n",
       "      <td>4.9544</td>\n",
       "      <td>4.7895</td>\n",
       "      <td>0</td>\n",
       "      <td>1.4701</td>\n",
       "      <td>1.0000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>col3</td>\n",
       "      <td>0</td>\n",
       "      <td>11</td>\n",
       "      <td>0.9829</td>\n",
       "      <td>4.6557</td>\n",
       "      <td>4.5711</td>\n",
       "      <td>0</td>\n",
       "      <td>1.6014</td>\n",
       "      <td>0.4231</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>col3</td>\n",
       "      <td>1</td>\n",
       "      <td>15</td>\n",
       "      <td>0.9829</td>\n",
       "      <td>5.1735</td>\n",
       "      <td>5.0948</td>\n",
       "      <td>0</td>\n",
       "      <td>1.3805</td>\n",
       "      <td>0.5769</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>ALL FEATURES</td>\n",
       "      <td>ALL VALUES</td>\n",
       "      <td>32</td>\n",
       "      <td>NaN</td>\n",
       "      <td>5.0563</td>\n",
       "      <td>5.2957</td>\n",
       "      <td>0</td>\n",
       "      <td>1.7530</td>\n",
       "      <td>1.0000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>col3</td>\n",
       "      <td>0</td>\n",
       "      <td>12</td>\n",
       "      <td>0.9544</td>\n",
       "      <td>4.2436</td>\n",
       "      <td>4.2397</td>\n",
       "      <td>0</td>\n",
       "      <td>1.7731</td>\n",
       "      <td>0.3750</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>col3</td>\n",
       "      <td>1</td>\n",
       "      <td>20</td>\n",
       "      <td>0.9544</td>\n",
       "      <td>5.5440</td>\n",
       "      <td>5.6740</td>\n",
       "      <td>0</td>\n",
       "      <td>1.5894</td>\n",
       "      <td>0.6250</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                  Feature Name Feature Value  Obs.  Entropy  \\\n",
       "gender ethnicity                                              \n",
       "0      0          ALL FEATURES    ALL VALUES    29      NaN   \n",
       "       0                  col3             0    15   0.9991   \n",
       "       0                  col3             1    14   0.9991   \n",
       "       1          ALL FEATURES    ALL VALUES    33      NaN   \n",
       "       1                  col3             0    18   0.9940   \n",
       "       1                  col3             1    15   0.9940   \n",
       "1      0          ALL FEATURES    ALL VALUES    26      NaN   \n",
       "       0                  col3             0    11   0.9829   \n",
       "       0                  col3             1    15   0.9829   \n",
       "       1          ALL FEATURES    ALL VALUES    32      NaN   \n",
       "       1                  col3             0    12   0.9544   \n",
       "       1                  col3             1    20   0.9544   \n",
       "\n",
       "                  Mean Example_Target  Median Example_Target  Missing Values  \\\n",
       "gender ethnicity                                                               \n",
       "0      0                       3.9273                 3.9847               0   \n",
       "       0                       3.3797                 3.7754               0   \n",
       "       0                       4.5141                 4.4485               0   \n",
       "       1                       3.2016                 2.6024               0   \n",
       "       1                       2.4920                 1.6709               0   \n",
       "       1                       4.0530                 5.1426               0   \n",
       "1      0                       4.9544                 4.7895               0   \n",
       "       0                       4.6557                 4.5711               0   \n",
       "       0                       5.1735                 5.0948               0   \n",
       "       1                       5.0563                 5.2957               0   \n",
       "       1                       4.2436                 4.2397               0   \n",
       "       1                       5.5440                 5.6740               0   \n",
       "\n",
       "                  Std. Dev. Example_Target  Value Prevalence  \n",
       "gender ethnicity                                              \n",
       "0      0                            1.9164            1.0000  \n",
       "       0                            1.9532            0.5172  \n",
       "       0                            1.7564            0.4828  \n",
       "       1                            2.1053            1.0000  \n",
       "       1                            1.8859            0.5455  \n",
       "       1                            2.0949            0.4545  \n",
       "1      0                            1.4701            1.0000  \n",
       "       0                            1.6014            0.4231  \n",
       "       0                            1.3805            0.5769  \n",
       "       1                            1.7530            1.0000  \n",
       "       1                            1.7731            0.3750  \n",
       "       1                            1.5894            0.6250  "
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Multi-level cohorts for the data table: one block of rows per gender/ethnicity combination\n",
    "measure.data(X=X_test[['col3']], Y=y_test, cohort_labels=X_test[['gender', 'ethnicity']])"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}