{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Interpretable Classification" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this notebook we will fit three glassbox classifiers: an explainable boosting machine (EBM), LogisticRegression, and ClassificationTree. After fitting them, we will use their glassbox nature to examine their global and local explanations.\n", "\n", "This notebook can be found in our [**_examples folder_**](https://github.com/interpretml/interpret/tree/develop/docs/interpret/python/examples) on GitHub." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# install interpret if not already installed\n", "try:\n", " import interpret\n", "except ModuleNotFoundError:\n", " !pip install --quiet interpret pandas scikit-learn" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "from sklearn.model_selection import train_test_split\n", "from interpret import show\n", "from interpret.perf import ROC\n", "\n", "from interpret import set_visualize_provider\n", "from interpret.provider import InlineProvider\n", "set_visualize_provider(InlineProvider())\n", "\n", "# Load the UCI Adult (census income) dataset; the raw file has no header row\n", "df = pd.read_csv(\n", " \"https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data\",\n", " header=None)\n", "df.columns = [\n", " \"Age\", \"WorkClass\", \"fnlwgt\", \"Education\", \"EducationNum\",\n", " \"MaritalStatus\", \"Occupation\", \"Relationship\", \"Race\", \"Gender\",\n", " \"CapitalGain\", \"CapitalLoss\", \"HoursPerWeek\", \"NativeCountry\", \"Income\"\n", "]\n", "# Features are every column except the last; the target is 1 when Income is >50K\n", "# (raw values keep a leading space, hence the space before >50K in the comparison)\n", "X = df.iloc[:, :-1]\n", "y = (df.iloc[:, -1] == \" >50K\").astype(int)\n", "\n", "# Fix the seed so the 80/20 train/test split is reproducible\n", "seed = 42\n", "np.random.seed(seed)\n", "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=seed)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "