{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Recidivism\n", "\n", "This notebook contains my analysis of data presented in \n", "\"[Machine Bias](https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing)\", Julia Angwin, Jeff Larson, Surya Mattu and Lauren Kirchner, ProPublica, May 23, 2016.\n", "\n", "I would like to thank the authors of that article for making their data and analysis freely available. They are a model of open science.\n", "\n", "\n", "Copyright 2018 Allen Downey\n", "\n", "The code and text of this notebook are under this license: [Creative Commons Attribution 4.0 International](https://creativecommons.org/licenses/by/4.0)\n", "\n", "The data are from [this repository](https://github.com/propublica/compas-analysis), which contains the data and analysis pipeline described on [this web page](https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm).\n", "\n", "The terms of use for the data [are here](https://www.propublica.org/datastore/terms). In compliance with those terms, I am not distributing the data, but there is a link below that downloads it directly." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "%config InteractiveShell.ast_node_interactivity='last_expr_or_assign'\n", "\n", "import pandas as pd\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "\n", "from sympy import symbols, Eq, solve\n", "\n", "from overthink import decorate" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Metrics\n", "\n", "In this section I start with the [data reported here](https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm) and replicate the analysis there, computing various metrics based on the [confusion matrix](https://en.wikipedia.org/wiki/Confusion_matrix): prevalence, sensitivity, specificity, false positive rate, false negative rate, positive predictive value, and negative predictive value.\n", "\n", "The following function takes an array and returns a Pandas DataFrame:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "def make_matrix(a):\n", " \"\"\"Make a confusion matrix from an array.\n", " \n", " a: array or list of lists\n", " \n", " returns: DataFrame\n", " \"\"\"\n", " a = np.asarray(a).reshape((2, 2))\n", " index = ['Survived', 'Recidivated']\n", " columns = ['Low', 'High']\n", " return pd.DataFrame(a, index=index, columns=columns)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Make the matrix for all defendants." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Low High
Survived 2681 1282
Recidivated 1216 2035
\n", "
" ], "text/plain": [ " Low High\n", "Survived 2681 1282\n", "Recidivated 1216 2035" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a = [[2681, 1282], [1216, 2035]]\n", "matrix_all = make_matrix(a)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Confirm that the total matches what's reported in the article." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "7214" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.sum(a)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compute sensitivity and specificity." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "def percent(x, y):\n", " \"\"\"Compute the percentage `x/(x+y)`.\n", " \"\"\"\n", " return x / (x+y) * 100\n", "\n", "def sens_spec(m):\n", " \"\"\"Compute sensitivity and specificity.\n", " \n", " m: confusion matrix\n", " \"\"\"\n", " tn, fp, fn, tp = m.values.flatten()\n", " sens = percent(tp, fn)\n", " spec = percent(tn, fp)\n", " return sens, spec" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compute sensitivity and specificity for all defendants." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(62.59612426945556, 67.65076961897553)" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sens, spec = sens_spec(matrix_all)\n", "sens, spec" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compute error rates." 
] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "def error_rates(m):\n", " \"\"\"Compute false positive and false negative rate.\n", " \n", " m: confusion matrix\n", " \"\"\"\n", " tn, fp, fn, tp = m.values.flatten()\n", " fpr = percent(fp, tn)\n", " fnr = percent(fn, tp)\n", " return fpr, fnr" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compute error rates for all defendants." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(32.349230381024476, 37.40387573054445)" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fpr, fnr = error_rates(matrix_all)\n", "fpr, fnr" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compute predictive value." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "def predictive_value(m):\n", " \"\"\"Compute positive and negative predictive value.\n", " \n", " m: confusion matrix\n", " \"\"\"\n", " tn, fp, fn, tp = m.values.flatten()\n", " ppv = percent(tp, fp)\n", " npv = percent(tn, fn)\n", " return ppv, npv" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compute predictive value for all defendants." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(61.350618028338864, 68.79651013600206)" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ppv, npv = predictive_value(matrix_all)\n", "ppv, npv" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compute prevalence." 
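, "\n", "Here prevalence is the percentage of defendants who recidivated, whatever their risk label. As a worked check for all defendants (a sketch using the counts from the matrix above):\n", "\n", "$$\\mathrm{prevalence} = \\frac{TP + FN}{TN + FP + FN + TP} = \\frac{2035 + 1216}{7214} \\approx 45.1\\%$$"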
] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "def prevalence(m):\n", " \"\"\"Compute prevalence, the percentage of defendants who recidivated.\n", " \n", " m: confusion matrix\n", " \"\"\"\n", " tn, fp, fn, tp = m.values.flatten()\n", " return percent(tp+fn, tn+fp)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compute prevalence for all defendants." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "45.06515109509287" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "prev = prevalence(matrix_all)\n", "prev" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compute all metrics and put them in a DataFrame." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "def compute_metrics(m, name=''):\n", " \"\"\"Compute all metrics.\n", " \n", " m: confusion matrix\n", " \n", " returns: DataFrame\n", " \"\"\"\n", " fpr, fnr = error_rates(m)\n", " ppv, npv = predictive_value(m)\n", " sens, spec = sens_spec(m)\n", " prev = prevalence(m)\n", " \n", " index = ['FP rate', 'FN rate', 'PPV', 'NPV',\n", " 'Sensitivity', 'Specificity', 'Prevalence']\n", " df = pd.DataFrame(index=index, columns=['Percent'])\n", " df.Percent = fpr, fnr, ppv, npv, sens, spec, prev\n", " df.index.name = name\n", " return df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compute metrics for all defendants." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Percent
All defendants
FP rate 32.349230
FN rate 37.403876
PPV 61.350618
NPV 68.796510
Sensitivity 62.596124
Specificity 67.650770
Prevalence 45.065151
\n", "
" ], "text/plain": [ " Percent\n", "All defendants \n", "FP rate 32.349230\n", "FN rate 37.403876\n", "PPV 61.350618\n", "NPV 68.796510\n", "Sensitivity 62.596124\n", "Specificity 67.650770\n", "Prevalence 45.065151" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "compute_metrics(matrix_all, 'All defendants')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Make the confusion matrix for black defendants." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Low High
Survived 990 805
Recidivated 532 1369
\n", "
" ], "text/plain": [ " Low High\n", "Survived 990 805\n", "Recidivated 532 1369" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a = [[990, 805], [532, 1369]]\n", "matrix_black = make_matrix(a)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compute metrics for black defendants." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Percent
Black defendants
FP rate 44.846797
FN rate 27.985271
PPV 62.971481
NPV 65.045992
Sensitivity 72.014729
Specificity 55.153203
Prevalence 51.433983
\n", "
" ], "text/plain": [ " Percent\n", "Black defendants \n", "FP rate 44.846797\n", "FN rate 27.985271\n", "PPV 62.971481\n", "NPV 65.045992\n", "Sensitivity 72.014729\n", "Specificity 55.153203\n", "Prevalence 51.433983" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "black_metrics = compute_metrics(matrix_black, 'Black defendants')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Make the confusion matrix for white defendants." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Low High
Survived 1139 349
Recidivated 461 505
\n", "
" ], "text/plain": [ " Low High\n", "Survived 1139 349\n", "Recidivated 461 505" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a = [[1139, 349], [461, 505]]\n", "matrix_white = make_matrix(a)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compute metrics for white defendants." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Percent
White defendants
FP rate 23.454301
FN rate 47.722567
PPV 59.133489
NPV 71.187500
Sensitivity 52.277433
Specificity 76.545699
Prevalence 39.364303
\n", "
" ], "text/plain": [ " Percent\n", "White defendants \n", "FP rate 23.454301\n", "FN rate 47.722567\n", "PPV 59.133489\n", "NPV 71.187500\n", "Sensitivity 52.277433\n", "Specificity 76.545699\n", "Prevalence 39.364303" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "white_metrics = compute_metrics(matrix_white, 'White defendants')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So far, all results are consistent with those reported in the article, including the headline results:\n", "\n", "1. The false positive rate for black defendants is substantially higher than for white defendants (45%, compared to 23%).\n", "\n", "2. The false negative rate for black defendants is substantially lower (28%, compared to 48%).\n", "\n", "The false positive rate is the fraction of all non-recidivists who not labeled low risk.\n", "\n", "The false negative rate is the fraction of all recidivists who were labeled low risk." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(44.84679665738162, 27.985270910047344)" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "error_rates(matrix_black)" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(23.45430107526882, 47.72256728778468)" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "error_rates(matrix_white)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The constant predictive value model\n", "\n", "An ideal test should have equal predicitive value in all groups; that is, two people in the same risk category should have the same probability of recidivism, regardless of what group they are in.\n", "\n", "An ideal test should also have the same error rates for all groups; that is, two non-recidivists should have the same probability of being classified as high 
risk. \n", "\n", "Unfortunately, these two goals are in conflict:\n", "\n", "* If you design a test to achieve equal predictive value across groups with different prevalence, you will find that error rates depend on prevalence. Specifically, false positive rates will be higher in groups with higher rates of recidivism.\n", "\n", "* If you design a test to achieve equal error rates across groups, you will find that predictive value depends on prevalence. Specifically, positive predictive value will be lower in groups with lower rates of recidivism.\n", "\n", "The next two sections demonstrate these effects.\n", "\n", "A confusion matrix contains four values, but because they are constrained to add up to a fixed total (100% of defendants), three values are enough to determine a confusion matrix.\n", "\n", "For example, if you specify prevalence, PPV, and NPV, that determines a confusion matrix, and then you can compute the error rates.\n", "\n", "Or, if you specify prevalence, FPR, and FNR, that determines a confusion matrix, and then you can compute predictive values.\n", "\n", "The following function takes prevalence, PPV, and NPV and returns a confusion matrix." ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "def constant_predictive_value(prev, ppv, npv):\n", " \"\"\"Make a confusion matrix with given metrics.\n", " \n", " prev: prevalence (percent)\n", " ppv: positive predictive value (percent)\n", " npv: negative predictive value (percent)\n", " \n", " returns: confusion matrix whose entries are fractions that sum to 1\n", " \"\"\"\n", " tn, fp, fn, tp = symbols('tn fp fn tp')\n", " eq1 = Eq(percent(tp+fn, tn+fp), prev)\n", " eq2 = Eq(percent(tp, fp), ppv)\n", " eq3 = Eq(percent(tn, fn), npv)\n", " eq4 = Eq(tn+fp+fn+tp, 1)\n", " soln = solve([eq1, eq2, eq3, eq4], [tn, fp, fn, tp])\n", " # Read the solution in a fixed order instead of relying on dict order.\n", " a = [soln[s] for s in (tn, fp, fn, tp)]\n", " return make_matrix(a)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To test it, I'll construct a confusion matrix with the actual metrics from all defendants." 
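, "\n", "As a cross-check on the symbolic solution, the same matrix can be written in closed form. This is a sketch under assumed notation that does not appear in the code: $p$ is prevalence, $h$ is the fraction of defendants labeled high risk, and all quantities are fractions rather than percentages.\n", "\n", "$$p = \\mathrm{PPV} \\cdot h + (1 - \\mathrm{NPV})(1 - h) \\quad\\Longrightarrow\\quad h = \\frac{p - (1 - \\mathrm{NPV})}{\\mathrm{PPV} + \\mathrm{NPV} - 1}$$\n", "\n", "$$\\mathrm{FPR} = \\frac{(1 - \\mathrm{PPV})\\, h}{1 - p}, \\qquad \\mathrm{FNR} = \\frac{(1 - \\mathrm{NPV})(1 - h)}{p}$$\n", "\n", "With the values for all defendants ($p \\approx 0.451$, $\\mathrm{PPV} \\approx 0.614$, $\\mathrm{NPV} \\approx 0.688$), this gives $h \\approx 0.46$ and $\\mathrm{FPR} \\approx 0.32$, in line with the observed false positive rate."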
] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Low High
Survived 0.371638480731910 0.177710008317161
Recidivated 0.168561131133906 0.282090379817023
\n", "
" ], "text/plain": [ " Low High\n", "Survived 0.371638480731910 0.177710008317161\n", "Recidivated 0.168561131133906 0.282090379817023" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ppv, npv = predictive_value(matrix_all)\n", "prev = prevalence(matrix_all)\n", "\n", "m = constant_predictive_value(prev, ppv, npv)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we use it to compute the other metrics, they are consistent with the results we got with the original data." ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Percent
FP rate 32.3492303810245
FN rate 37.4038757305443
PPV 61.3506180283389
NPV 68.7965101360021
Sensitivity 62.5961242694557
Specificity 67.6507696189755
Prevalence 45.0651510950929
\n", "
" ], "text/plain": [ " Percent\n", " \n", "FP rate 32.3492303810245\n", "FN rate 37.4038757305443\n", "PPV 61.3506180283389\n", "NPV 68.7965101360021\n", "Sensitivity 62.5961242694557\n", "Specificity 67.6507696189755\n", "Prevalence 45.0651510950929" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "compute_metrics(m)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can use this function to run the \"constant predictive value\" model, which asks what happens if we keep PPV and NPV constant, and vary prevalence. " ] }, { "cell_type": "code", "execution_count": 80, "metadata": {}, "outputs": [], "source": [ "ppv, npv = predictive_value(matrix_all)\n", "prevalences = np.linspace(32, 60, 31)\n", "\n", "fp_rates = pd.Series(index=prevalences)\n", "fn_rates = pd.Series(index=prevalences)\n", "\n", "for prev in prevalences:\n", " df = constant_predictive_value(prev, ppv, npv)\n", " fpr, fnr = error_rates(df)\n", " fp_rates[prev] = fpr\n", " fn_rates[prev] = fnr" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following figure shows the error rates we would expect from a test with equal predictive value for all groups, regardless of prevalence." ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "def plot_cpv_model():\n", " fp_rates.plot(label='false positive rate', color='C1')\n", " fn_rates.plot(label='false negative rate', color='C4')\n", " decorate(xlabel='Prevalence', ylabel='Percent',\n", " title='Expected error rates, constant predictive value',\n", " loc='upper center')\n", " \n", "plot_cpv_model()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As prevalence increases, false positive rates increase quickly. Note the vertical scale: the difference in error rates between a low-prevalence group and a high-prevalence group is dramatic!\n", "\n", "For the COMPAS test, the effect is not as extreme. The following figure shows the constant prediction model again, including data points for the white defendants (left), all defendants (middle), and black defendants (right)." ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "def plot_fpr_fnr(m):\n", " \"\"\"Plot error rates versus prevalence.\n", " \n", " m: confusion matrix\n", " \"\"\"\n", " prev = prevalence(m)\n", " fpr, fnr = error_rates(m)\n", " plt.plot(prev, fpr, 'o', color='C1')\n", " plt.plot(prev, fnr, 'o', color='C4')" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plot_fpr_fnr(matrix_all)\n", "plot_fpr_fnr(matrix_black)\n", "plot_fpr_fnr(matrix_white)\n", "\n", "plot_cpv_model()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For black defendants:\n", "\n", "* The actual false positive rate is lower that what we would expect if the test had the same predictive value for all groups.\n", "\n", "* The actual false negative rate is higher than expected.\n", "\n", "For white defendants:\n", "\n", "* The actual false positive rate is higher than what we would expect if the test had the same predictive value for all groups.\n", "\n", "* The actual false negative rate is lower than expected.\n", "\n", "Relative to the CPV model, the COMPAS test is what I will call \"tempered\", that is, less sensitive to variation in prevalence between groups.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Constant error rate model\n", "\n", "In the previous section we held predictive value constant and computed the effect on error rates. 
In this section we'll go the other way: if we hold error rates constant for all groups, what effect does that have on predictive value?\n", "\n", "The following function takes prevalence and error rates and returns a confusion matrix.\n", "\n" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "def constant_error_rates(prev, fpr, fnr):\n", " \"\"\"Make a confusion matrix with given metrics.\n", " \n", " prev: prevalence (percent)\n", " fpr: false positive rate (percent)\n", " fnr: false negative rate (percent)\n", " \n", " returns: confusion matrix whose entries are fractions that sum to 1\n", " \"\"\"\n", " tn, fp, fn, tp = symbols('tn fp fn tp')\n", " eq1 = Eq(percent(tp+fn, tn+fp), prev)\n", " eq2 = Eq(percent(fp, tn), fpr)\n", " eq3 = Eq(percent(fn, tp), fnr)\n", " eq4 = Eq(tn+fp+fn+tp, 1)\n", " soln = solve([eq1, eq2, eq3, eq4], [tn, fp, fn, tp])\n", " # Read the solution in a fixed order instead of relying on dict order.\n", " a = [soln[s] for s in (tn, fp, fn, tp)]\n", " return make_matrix(a)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Again, just to test it, we can replicate the observed confusion matrix." ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Low High
Survived 0.371638480731910 0.177710008317161
Recidivated 0.168561131133907 0.282090379817022
\n", "
" ], "text/plain": [ " Low High\n", "Survived 0.371638480731910 0.177710008317161\n", "Recidivated 0.168561131133907 0.282090379817022" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fpr, fnr = error_rates(matrix_all)\n", "prev = prevalence(matrix_all)\n", "m = constant_error_rates(prev, fpr, fnr)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And it has the right metrics" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Percent
FP rate 32.3492303810245
FN rate 37.4038757305445
PPV 61.3506180283388
NPV 68.7965101360020
Sensitivity 62.5961242694555
Specificity 67.6507696189755
Prevalence 45.0651510950929
\n", "
" ], "text/plain": [ " Percent\n", " \n", "FP rate 32.3492303810245\n", "FN rate 37.4038757305445\n", "PPV 61.3506180283388\n", "NPV 68.7965101360020\n", "Sensitivity 62.5961242694555\n", "Specificity 67.6507696189755\n", "Prevalence 45.0651510950929" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "compute_metrics(m)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can see how predictive value depends on prevalence (with error rates held constant)." ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "fpr, fnr = error_rates(matrix_all)\n", "prevalences = np.linspace(20, 70, 31)\n", "\n", "ppv_rates = pd.Series(index=prevalences)\n", "npv_rates = pd.Series(index=prevalences)\n", "\n", "for prev in prevalences:\n", " df = constant_error_rates(prev, fpr, fnr)\n", " ppv, npv = predictive_value(df)\n", " ppv_rates[prev] = ppv\n", " npv_rates[prev] = npv" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following function plots the results." ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "def plot_cer_model():\n", " ppv_rates.plot(label='Positive predictive value', color='C0')\n", " npv_rates.plot(label='Negative predictive value', color='C2')\n", " decorate(xlabel='Prevalence', ylabel='Percent',\n", " title='Expected predictive value, constant error rates',\n", " loc='upper center')\n", " \n", "plot_cer_model()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As prevalence increases, so does positive predictive value.\n", "\n", "For the COMPAS test, the effect is not as extreme. The following figure shows the constant error rate again, including data points for the white defendants (left), all defendants (middle), and black defendants (right)." ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [], "source": [ "def plot_ppv_npv(m):\n", " \"\"\"Plot predictive values versus prevalence.\n", " \n", " m: confusion matrix\n", " \"\"\"\n", " prev = prevalence(m)\n", " ppv, npv = predictive_value(m)\n", " plt.plot(prev, ppv, 'o', color='C0')\n", " plt.plot(prev, npv, 'o', color='C2')" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plot_ppv_npv(matrix_all)\n", "plot_ppv_npv(matrix_black)\n", "plot_ppv_npv(matrix_white)\n", "\n", "plot_cer_model()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Again, the test is less sensitive to differences in prevalence between groups than we would expect from the constant error rate model." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### More data, more details\n", "\n", "In this section I read the detailed dataset available from [this repository](https://github.com/propublica/compas-analysis) and run validation checks." ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [], "source": [ "# Uncomment and run this cell once to download the data.\n", "# Then comment it again so you don't download it every time you run the notebook.\n", "# !wget 'https://raw.githubusercontent.com/propublica/compas-analysis/master/compas-scores-two-years.csv'" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
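 | id | name | first | last | compas_screening_date | sex | dob | age | age_cat | race | ... | v_decile_score | v_score_text | v_screening_date | in_custody | out_custody | priors_count.1 | start | end | event | two_year_recid
0 | 1 | miguel hernandez | miguel | hernandez | 2013-08-14 | Male | 1947-04-18 | 69 | Greater than 45 | Other | ... | 1 | Low | 2013-08-14 | 2014-07-07 | 2014-07-14 | 0 | 0 | 327 | 0 | 0
1 | 3 | kevon dixon | kevon | dixon | 2013-01-27 | Male | 1982-01-22 | 34 | 25 - 45 | African-American | ... | 1 | Low | 2013-01-27 | 2013-01-26 | 2013-02-05 | 0 | 9 | 159 | 1 | 1
2 | 4 | ed philo | ed | philo | 2013-04-14 | Male | 1991-05-14 | 24 | Less than 25 | African-American | ... | 3 | Low | 2013-04-14 | 2013-06-16 | 2013-06-16 | 4 | 0 | 63 | 0 | 1
3 | 5 | marcu brown | marcu | brown | 2013-01-13 | Male | 1993-01-21 | 23 | Less than 25 | African-American | ... | 6 | Medium | 2013-01-13 | NaN | NaN | 1 | 0 | 1174 | 0 | 0
4 | 6 | bouthy pierrelouis | bouthy | pierrelouis | 2013-03-26 | Male | 1973-01-22 | 43 | 25 - 45 | Other | ... | 1 | Low | 2013-03-26 | NaN | NaN | 2 | 0 | 1102 | 0 | 0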
\n", "

5 rows × 53 columns

\n", "
" ], "text/plain": [ " id name first last compas_screening_date sex \\\n", "0 1 miguel hernandez miguel hernandez 2013-08-14 Male \n", "1 3 kevon dixon kevon dixon 2013-01-27 Male \n", "2 4 ed philo ed philo 2013-04-14 Male \n", "3 5 marcu brown marcu brown 2013-01-13 Male \n", "4 6 bouthy pierrelouis bouthy pierrelouis 2013-03-26 Male \n", "\n", " dob age age_cat race ... \\\n", "0 1947-04-18 69 Greater than 45 Other ... \n", "1 1982-01-22 34 25 - 45 African-American ... \n", "2 1991-05-14 24 Less than 25 African-American ... \n", "3 1993-01-21 23 Less than 25 African-American ... \n", "4 1973-01-22 43 25 - 45 Other ... \n", "\n", " v_decile_score v_score_text v_screening_date in_custody out_custody \\\n", "0 1 Low 2013-08-14 2014-07-07 2014-07-14 \n", "1 1 Low 2013-01-27 2013-01-26 2013-02-05 \n", "2 3 Low 2013-04-14 2013-06-16 2013-06-16 \n", "3 6 Medium 2013-01-13 NaN NaN \n", "4 1 Low 2013-03-26 NaN NaN \n", "\n", " priors_count.1 start end event two_year_recid \n", "0 0 0 327 0 0 \n", "1 0 9 159 1 1 \n", "2 4 0 63 0 1 \n", "3 1 0 1174 0 0 \n", "4 2 0 1102 0 0 \n", "\n", "[5 rows x 53 columns]" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cp = pd.read_csv(\"compas-scores-two-years.csv\")\n", "cp.head()" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(7214, 53)" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cp.shape" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "id\n", "name\n", "first\n", "last\n", "compas_screening_date\n", "sex\n", "dob\n", "age\n", "age_cat\n", "race\n", "juv_fel_count\n", "decile_score\n", "juv_misd_count\n", "juv_other_count\n", "priors_count\n", "days_b_screening_arrest\n", "c_jail_in\n", "c_jail_out\n", "c_case_number\n", "c_offense_date\n", "c_arrest_date\n", "c_days_from_compas\n", 
"c_charge_degree\n", "c_charge_desc\n", "is_recid\n", "r_case_number\n", "r_charge_degree\n", "r_days_from_arrest\n", "r_offense_date\n", "r_charge_desc\n", "r_jail_in\n", "r_jail_out\n", "violent_recid\n", "is_violent_recid\n", "vr_case_number\n", "vr_charge_degree\n", "vr_offense_date\n", "vr_charge_desc\n", "type_of_assessment\n", "decile_score.1\n", "score_text\n", "screening_date\n", "v_type_of_assessment\n", "v_decile_score\n", "v_score_text\n", "v_screening_date\n", "in_custody\n", "out_custody\n", "priors_count.1\n", "start\n", "end\n", "event\n", "two_year_recid\n" ] } ], "source": [ "for col in cp.columns:\n", " print(col)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following functions compute value counts and percentages for various variables." ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [], "source": [ "def make_dataframe(series, *columns):\n", " \"\"\"Make a Series into a DataFrame with one column.\n", " \n", " So it looks better in Jupyter.\n", " \n", " series: Series\n", " columns: column name(s)\n", " \n", " returns: DataFrame\n", " \"\"\"\n", " df = pd.DataFrame(series.values,\n", " index=series.index,\n", " columns=columns)\n", " df.index.name = series.name\n", " return df\n", "\n", "def counts(df, var):\n", " \"\"\"Compute counts for each unique value.\n", " \n", " df: DataFrame\n", " var: variable name\n", " \"\"\"\n", " series = df[var].value_counts()\n", " return make_dataframe(series, 'Count')\n", "\n", "def percentages(df, var):\n", " \"\"\"Compute percentages for each unique value.\n", " \n", " df: DataFrame\n", " var: variable name\n", " \"\"\"\n", " series = df[var].value_counts() / len(df) * 100\n", " return make_dataframe(series, 'Percentage')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Breakdown by age" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Count
age_cat
25 - 454109
Greater than 451576
Less than 251529
\n", "
" ], "text/plain": [ " Count\n", "age_cat \n", "25 - 45 4109\n", "Greater than 45 1576\n", "Less than 25 1529" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "counts(cp, 'age_cat')" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Percentage
age_cat
25 - 4556.958691
Greater than 4521.846410
Less than 2521.194899
\n", "
" ], "text/plain": [ " Percentage\n", "age_cat \n", "25 - 45 56.958691\n", "Greater than 45 21.846410\n", "Less than 25 21.194899" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "percentages(cp, 'age_cat')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Breakdown by race" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Count
race
African-American3696
Caucasian2454
Hispanic637
Other377
Asian32
Native American18
\n", "
" ], "text/plain": [ " Count\n", "race \n", "African-American 3696\n", "Caucasian 2454\n", "Hispanic 637\n", "Other 377\n", "Asian 32\n", "Native American 18" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "counts(cp, 'race')" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Percentage
race
African-American51.233712
Caucasian34.017189
Hispanic8.830053
Other5.225950
Asian0.443582
Native American0.249515
\n", "
" ], "text/plain": [ " Percentage\n", "race \n", "African-American 51.233712\n", "Caucasian 34.017189\n", "Hispanic 8.830053\n", "Other 5.225950\n", "Asian 0.443582\n", "Native American 0.249515" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "percentages(cp, 'race')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Breakdown by sex" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Count
sex
Male5819
Female1395
\n", "
" ], "text/plain": [ " Count\n", "sex \n", "Male 5819\n", "Female 1395" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "counts(cp, 'sex')" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Percentage
sex
Male80.6626
Female19.3374
\n", "
" ], "text/plain": [ " Percentage\n", "sex \n", "Male 80.6626\n", "Female 19.3374" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "percentages(cp, 'sex')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Breakdown by recidivism" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Count
two_year_recid
03963
13251
\n", "
" ], "text/plain": [ " Count\n", "two_year_recid \n", "0 3963\n", "1 3251" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "counts(cp, 'two_year_recid')" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Percentage
two_year_recid
054.934849
145.065151
\n", "
" ], "text/plain": [ " Percentage\n", "two_year_recid \n", "0 54.934849\n", "1 45.065151" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "percentages(cp, 'two_year_recid')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Breakdown by risk category" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Count
score_text
Low3897
Medium1914
High1403
\n", "
" ], "text/plain": [ " Count\n", "score_text \n", "Low 3897\n", "Medium 1914\n", "High 1403" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "counts(cp, 'score_text')" ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Percentage
score_text
Low54.019961
Medium26.531744
High19.448295
\n", "
" ], "text/plain": [ " Percentage\n", "score_text \n", "Low 54.019961\n", "Medium 26.531744\n", "High 19.448295" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "percentages(cp, 'score_text')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following function computes cross-tabulations." ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [], "source": [ "def crosstab(df, index, columns):\n", " \"\"\"Compute a cross-tabulation.\n", " \n", " df: DataFrame\n", " index: variable(s) that will label the rows\n", " columns: variable(s) that will label the columns\n", " \n", " returns: DataFrame\n", " \"\"\"\n", "\n", " xtab = df.pivot_table(index=index, \n", " columns=columns,\n", " values='id',\n", " aggfunc='count')\n", " \n", " return xtab" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Breakdown by sex and race" ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
raceAfrican-AmericanAsianCaucasianHispanicNative AmericanOther
sex
Female6522567103467
Male304430188753414310
\n", "
" ], "text/plain": [ "race African-American Asian Caucasian Hispanic Native American Other\n", "sex \n", "Female 652 2 567 103 4 67\n", "Male 3044 30 1887 534 14 310" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "xtab = crosstab(cp, 'sex', 'race')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Breakdown by age and race" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
raceAfrican-AmericanAsianCaucasianHispanicNative AmericanOther
age_cat
25 - 45219414131236712210
Greater than 4558211752143385
Less than 259207390127382
\n", "
" ], "text/plain": [ "race African-American Asian Caucasian Hispanic \\\n", "age_cat \n", "25 - 45 2194 14 1312 367 \n", "Greater than 45 582 11 752 143 \n", "Less than 25 920 7 390 127 \n", "\n", "race Native American Other \n", "age_cat \n", "25 - 45 12 210 \n", "Greater than 45 3 85 \n", "Less than 25 3 82 " ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "xtab = crosstab(cp, 'age_cat', 'race')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Breakdown by age and sex" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
sexFemaleMale
age_cat
25 - 458073302
Greater than 453001276
Less than 252881241
\n", "
" ], "text/plain": [ "sex Female Male\n", "age_cat \n", "25 - 45 807 3302\n", "Greater than 45 300 1276\n", "Less than 25 288 1241" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "xtab = crosstab(cp, 'age_cat', 'sex')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Distribution of decile scores for black defendants." ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "cp.loc[cp.race=='African-American', 'decile_score'].hist()\n", "decorate(xlabel='Decile Score',\n", " ylabel='Count',\n", " title='Black defendant decile scores',\n", " ylim=[0, 700])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Distribution of decile scores for white defendants." ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "cp.loc[cp.race=='Caucasian', 'decile_score'].hist()\n", "decorate(xlabel='Decile Score',\n", " ylabel='Count',\n", " title='White defendant decile scores',\n", " ylim=[0, 700])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Cross tabulation of decile score and race." ] }, { "cell_type": "code", "execution_count": 52, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
raceAfrican-AmericanAsianCaucasianHispanicNative AmericanOther
decile_score
1398.015.0681.0196.0NaN150.0
2393.04.0361.0113.04.066.0
3346.05.0273.086.01.036.0
4385.0NaN285.052.01.046.0
5365.01.0241.052.0NaN22.0
6384.03.0194.037.02.021.0
7400.01.0143.034.04.010.0
8359.02.0114.026.01.010.0
9380.0NaN98.020.02.08.0
10286.01.064.021.03.08.0
\n", "
" ], "text/plain": [ "race African-American Asian Caucasian Hispanic Native American \\\n", "decile_score \n", "1 398.0 15.0 681.0 196.0 NaN \n", "2 393.0 4.0 361.0 113.0 4.0 \n", "3 346.0 5.0 273.0 86.0 1.0 \n", "4 385.0 NaN 285.0 52.0 1.0 \n", "5 365.0 1.0 241.0 52.0 NaN \n", "6 384.0 3.0 194.0 37.0 2.0 \n", "7 400.0 1.0 143.0 34.0 4.0 \n", "8 359.0 2.0 114.0 26.0 1.0 \n", "9 380.0 NaN 98.0 20.0 2.0 \n", "10 286.0 1.0 64.0 21.0 3.0 \n", "\n", "race Other \n", "decile_score \n", "1 150.0 \n", "2 66.0 \n", "3 36.0 \n", "4 46.0 \n", "5 22.0 \n", "6 21.0 \n", "7 10.0 \n", "8 10.0 \n", "9 8.0 \n", "10 8.0 " ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "crosstab(cp, 'decile_score', 'race')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Cross tabulation of decile score and age group." ] }, { "cell_type": "code", "execution_count": 53, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
age_cat25 - 45Greater than 45Less than 25
decile_score
16547824
266018299
3437120190
443498237
540585191
6357104180
734077175
829857157
929254162
1023217134
\n", "
" ], "text/plain": [ "age_cat 25 - 45 Greater than 45 Less than 25\n", "decile_score \n", "1 654 782 4\n", "2 660 182 99\n", "3 437 120 190\n", "4 434 98 237\n", "5 405 85 191\n", "6 357 104 180\n", "7 340 77 175\n", "8 298 57 157\n", "9 292 54 162\n", "10 232 17 134" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "crosstab(cp, 'decile_score', 'age_cat')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here's the confusion matrix with all three score categories." ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
score_textHighLowMedium
two_year_recid
04022681880
1100112161034
\n", "
" ], "text/plain": [ "score_text High Low Medium\n", "two_year_recid \n", "0 402 2681 880\n", "1 1001 1216 1034" ] }, "execution_count": 54, "metadata": {}, "output_type": "execute_result" } ], "source": [ "crosstab(cp, 'two_year_recid', 'score_text')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To make sure I've got the data right, I'll reproduce the confusion matrices from the article." ] }, { "cell_type": "code", "execution_count": 55, "metadata": {}, "outputs": [], "source": [ "def compute_matrix(df):\n", " \"\"\"Compute a confusion matrix from data.\n", " \n", " df: DataFrame\n", " \n", " returns: confusion matrix\n", " \"\"\"\n", " high = cp.score_text.isin(['Medium', 'High']).astype(int)\n", " return crosstab(df, 'two_year_recid', high)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "All defendants." ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
score_text01
two_year_recid
026811282
112162035
\n", "
" ], "text/plain": [ "score_text 0 1\n", "two_year_recid \n", "0 2681 1282\n", "1 1216 2035" ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "compute_matrix(cp)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And the differences are 0." ] }, { "cell_type": "code", "execution_count": 57, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
LowHigh
Survived00
Recidivated00
\n", "
" ], "text/plain": [ " Low High\n", "Survived 0 0\n", "Recidivated 0 0" ] }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "matrix_all - compute_matrix(cp).values" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Black defendants." ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
score_text01
two_year_recid
0990805
15321369
\n", "
" ], "text/plain": [ "score_text 0 1\n", "two_year_recid \n", "0 990 805\n", "1 532 1369" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "black = cp[cp.race=='African-American']\n", "compute_matrix(black)" ] }, { "cell_type": "code", "execution_count": 59, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
LowHigh
Survived00
Recidivated00
\n", "
" ], "text/plain": [ " Low High\n", "Survived 0 0\n", "Recidivated 0 0" ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "matrix_black - compute_matrix(black).values" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "White defendants." ] }, { "cell_type": "code", "execution_count": 60, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
score_text01
two_year_recid
01139349
1461505
\n", "
" ], "text/plain": [ "score_text 0 1\n", "two_year_recid \n", "0 1139 349\n", "1 461 505" ] }, "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": [ "white = cp[cp.race=='Caucasian']\n", "compute_matrix(white)" ] }, { "cell_type": "code", "execution_count": 61, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
LowHigh
Survived00
Recidivated00
\n", "
" ], "text/plain": [ " Low High\n", "Survived 0 0\n", "Recidivated 0 0" ] }, "execution_count": 61, "metadata": {}, "output_type": "execute_result" } ], "source": [ "matrix_white - compute_matrix(white).values" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Calibration\n", "\n", "To check for calibration, I group defendents by decile score and compute prevalence (recidivism rate) in each group.\n", "\n", "This analysis does not take observation time into account, unlike the analysis in the original article.\n", "\n", "The following function groups defendants by decile score and computes prevalence in each group." ] }, { "cell_type": "code", "execution_count": 62, "metadata": {}, "outputs": [], "source": [ "def calibration_curve(df):\n", " \"\"\"Compute probability of recidivism by decile score.\n", " \n", " df: DataFrame\n", " \n", " returns: Series\n", " \"\"\"\n", " return df.groupby('decile_score').two_year_recid.mean()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following figure shows this calibration curve for all defendants and broken down by race." ] }, { "cell_type": "code", "execution_count": 63, "metadata": {}, "outputs": [], "source": [ "def plot_calibration(df, group_vars, title=''):\n", " calibration_curve(cp).plot(linestyle='dotted',\n", " label='All defendants', color='gray')\n", "\n", " for name, group in cp.groupby(group_vars):\n", " if len(group) > 1000:\n", " calibration_curve(group).plot(label=name)\n", "\n", " decorate(xlabel='Decile score',\n", " ylabel='Prob recidivism',\n", " title=title)" ] }, { "cell_type": "code", "execution_count": 64, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plot_calibration(cp, 'race')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The test is well calibrated. People with higher scores have higher probabilities of recidivism. In fact, we could use this curve to transform COMPAS scores into probabilities.\n", "\n", "The test is about equally calibrated for black and white defendants, although black defendants with scores 3 and 4 are more likely to recidivate than white defendants with the same scores (that apparent difference might not be statistically significant).\n", "\n", "Here's the breakdown by age group." ] }, { "cell_type": "code", "execution_count": 66, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plot_calibration(cp, 'age_cat')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The test is about equally calibrated for all age groups, which means that people with the same score have about the same probability of recidivism, regardless of what group they are in.\n", "\n", "There are only 4 people in the \"Less than 25\" group with decile score 1, which is why that data point is so out of line.\n", "\n", "Here's the breakdown by sex." ] }, { "cell_type": "code", "execution_count": 67, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plot_calibration(cp, 'sex')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here's the first case where the test does not seem well calibrated. At all decile scores, female defendants are substantially less likely to recidivate than male defendants.\n", "\n", "Or, reading this graph the other way, female defendants are given decile scores 1-2 points higher than male defendants with the same actual risk of recidivism. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Comparing reality to the CPV model\n", "\n", "In this section I'll compare actual PPV and FPR for a variety of subgroups to the values we would expect based on the CPV model; that is, a model where the predictive values are the same for all groups.\n", "\n", "The following function groups defendants by `group_vars` and returns a table with one row for each group." ] }, { "cell_type": "code", "execution_count": 68, "metadata": {}, "outputs": [], "source": [ "def make_table(df, group_vars, expected_ppv, expected_npv):\n", " \"\"\"Make a table with one line per group.\n", " \n", " df: DataFrame\n", " group_vars: string or list of string variable names\n", " expected_ppv: \n", " expected_npv:\n", " \n", " returns: table\n", " \"\"\"\n", " # make the DataFrame\n", " columns = ['count', 'prevalence',\n", " 'actual PPV', 'actual NPV', 'actual FNR',\n", " 'actual FPR', 'expected FPR', 'difference']\n", " columns = group_vars + columns \n", "\n", " table = pd.DataFrame(columns=columns)\n", "\n", " # loop through the groups\n", " grouped = df.groupby(group_vars)\n", " for i, (name, group) in enumerate(grouped):\n", " if not isinstance(name, tuple):\n", " name = name,\n", " \n", " # size of group\n", " count = len(group)\n", " \n", " # compute metrics\n", " matrix = compute_matrix(group)\n", " prev = prevalence(matrix)\n", " actual_ppv, actual_npv = predictive_value(matrix)\n", " actual_fpr, actual_fnr = 
error_rates(matrix)\n", "\n", " # generate the CPV matrix\n", " cpv = constant_predictive_value(prev, \n", " expected_ppv, expected_npv)\n", "\n", " # get the expected error rates\n", " expected_fpr, _ = error_rates(cpv * 100)\n", " \n", " # for very low and high prevalences, it might\n", " # not be possible to achieve given predictive values\n", " if expected_fpr < 0:\n", " expected_fpr = 0\n", "\n", " if expected_fpr > 100:\n", " expected_fpr = 100\n", "\n", " # difference between actual and expected\n", " diff = actual_fpr - expected_fpr\n", "\n", " # add a row to the table\n", " row = name + (count, prev,\n", " actual_ppv, actual_npv, actual_fnr,\n", " actual_fpr, expected_fpr, diff)\n", " \n", " table.loc[i] = row\n", " \n", " # sort the table by prevalence\n", " table.sort_values(by='prevalence', inplace=True)\n", " return table" ] }, { "cell_type": "code", "execution_count": 69, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(6150, 53)" ] }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "subset = cp[cp.race.isin(['African-American', 'Caucasian'])]\n", "subset.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here's the breakdown by age category." ] }, { "cell_type": "code", "execution_count": 70, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
age_catcountprevalenceactual PPVactual NPVactual FNRactual FPRexpected FPRdifference
1Greater than 45133432.90854654.94505575.36082554.44191318.3240223.2581270319166815.0658953144520
025 - 45350647.80376561.93149966.78260934.18854437.04918040.7730043881377-3.72382406026880
2Less than 25131057.40458064.66591257.61124124.06914955.91397878.8593270762673-22.9453485816437
\n", "
" ], "text/plain": [ " age_cat count prevalence actual PPV actual NPV actual FNR \\\n", "1 Greater than 45 1334 32.908546 54.945055 75.360825 54.441913 \n", "0 25 - 45 3506 47.803765 61.931499 66.782609 34.188544 \n", "2 Less than 25 1310 57.404580 64.665912 57.611241 24.069149 \n", "\n", " actual FPR expected FPR difference \n", "1 18.324022 3.25812703191668 15.0658953144520 \n", "0 37.049180 40.7730043881377 -3.72382406026880 \n", "2 55.913978 78.8593270762673 -22.9453485816437 " ] }, "execution_count": 70, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ppv, npv = predictive_value(matrix_all)\n", "group_vars = ['age_cat']\n", "table1 = make_table(subset, group_vars, ppv, npv)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Again, the actual behavior of the test is tempered, compare to the CPV model; that is, the results are less extreme than the model expects.\n", "\n", "Here's the breakdown by race." ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
racecountprevalenceactual PPVactual NPVactual FNRactual FPRexpected FPRdifference
1Caucasian245439.36430359.13348971.18750047.72256723.45430117.25447210364396.19982897162488
0African-American369651.43398362.97148165.04599227.98527144.84679753.4036054274053-8.55680877002365
\n", "
" ], "text/plain": [ " race count prevalence actual PPV actual NPV actual FNR \\\n", "1 Caucasian 2454 39.364303 59.133489 71.187500 47.722567 \n", "0 African-American 3696 51.433983 62.971481 65.045992 27.985271 \n", "\n", " actual FPR expected FPR difference \n", "1 23.454301 17.2544721036439 6.19982897162488 \n", "0 44.846797 53.4036054274053 -8.55680877002365 " ] }, "execution_count": 71, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ppv, npv = predictive_value(matrix_all)\n", "group_vars = ['race']\n", "table2 = make_table(subset, group_vars, ppv, npv)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The false positive rate for whites is higher than we would expect if predictive value were the same for all groups.\n", "\n", "The false positive rate for blacks is lower than we would expect.\n", "\n", "Here's the breakdown by sex." ] }, { "cell_type": "code", "execution_count": 72, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
sexcountprevalenceactual PPVactual NPVactual FNRactual FPRexpected FPRdifference
0Female121936.58736750.98039275.68389135.87443935.57567910.884686720384524.6909924516724
1Male493149.09754664.36968066.19318234.40727035.01992045.0678340229313-10.0479137042062
\n", "
" ], "text/plain": [ " sex count prevalence actual PPV actual NPV actual FNR actual FPR \\\n", "0 Female 1219 36.587367 50.980392 75.683891 35.874439 35.575679 \n", "1 Male 4931 49.097546 64.369680 66.193182 34.407270 35.019920 \n", "\n", " expected FPR difference \n", "0 10.8846867203845 24.6909924516724 \n", "1 45.0678340229313 -10.0479137042062 " ] }, "execution_count": 72, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ppv, npv = predictive_value(matrix_all)\n", "group_vars = ['sex']\n", "table3 = make_table(subset, group_vars, ppv, npv)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The false positive rate for women is substantially higher than what we would expect in the CPV model, which is consistent with the calibration results in the previous section.\n", "\n", "Here's the breakdown by age and race." ] }, { "cell_type": "code", "execution_count": 73, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
age_catracecountprevalenceactual PPVactual NPVactual FNRactual FPRexpected FPRdifference
3Greater than 45Caucasian75227.79255355.55555677.32283568.8995229.57642709.57643
2Greater than 45African-American58239.51890054.65587071.64179141.30434831.81818217.626278185028614.1919036331532
125 - 45Caucasian131243.14024460.00000067.24137946.99646626.80965126.9139758955996-0.104324421068796
5Less than 25Caucasian39048.97435959.07173066.66666726.70157148.74371944.64952110386664.09419748909818
025 - 45African-American219450.59252562.68540266.37458927.65765844.09594150.3106752879841-6.21473432857449
4Less than 25African-American92060.97826166.71826652.55474523.17290659.88857997.8224228659821-37.9338434787955
\n", "
" ], "text/plain": [ " age_cat race count prevalence actual PPV \\\n", "3 Greater than 45 Caucasian 752 27.792553 55.555556 \n", "2 Greater than 45 African-American 582 39.518900 54.655870 \n", "1 25 - 45 Caucasian 1312 43.140244 60.000000 \n", "5 Less than 25 Caucasian 390 48.974359 59.071730 \n", "0 25 - 45 African-American 2194 50.592525 62.685402 \n", "4 Less than 25 African-American 920 60.978261 66.718266 \n", "\n", " actual NPV actual FNR actual FPR expected FPR difference \n", "3 77.322835 68.899522 9.576427 0 9.57643 \n", "2 71.641791 41.304348 31.818182 17.6262781850286 14.1919036331532 \n", "1 67.241379 46.996466 26.809651 26.9139758955996 -0.104324421068796 \n", "5 66.666667 26.701571 48.743719 44.6495211038666 4.09419748909818 \n", "0 66.374589 27.657658 44.095941 50.3106752879841 -6.21473432857449 \n", "4 52.554745 23.172906 59.888579 97.8224228659821 -37.9338434787955 " ] }, "execution_count": 73, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ppv, npv = predictive_value(matrix_all)\n", "group_vars = ['age_cat', 'race']\n", "table4 = make_table(subset, group_vars, ppv, npv)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Breakdown by age and sex." ] }, { "cell_type": "code", "execution_count": 74, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
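<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>age_cat</th><th>sex</th><th>count</th><th>prevalence</th><th>actual PPV</th><th>actual NPV</th><th>actual FNR</th><th>actual FPR</th><th>expected FPR</th><th>difference</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>2</th><td>Greater than 45</td><td>Female</td><td>259</td><td>23.552124</td><td>41.071429</td><td>81.280788</td><td>62.295082</td><td>16.666667</td><td>0</td><td>16.6667</td></tr>\n",
"    <tr><th>3</th><td>Greater than 45</td><td>Male</td><td>1075</td><td>35.162791</td><td>57.467532</td><td>73.794003</td><td>53.174603</td><td>18.794835</td><td>7.82872060272464</td><td>10.9661144044490</td></tr>\n",
"    <tr><th>0</th><td>25 - 45</td><td>Female</td><td>704</td><td>40.056818</td><td>55.520505</td><td>72.609819</td><td>37.588652</td><td>33.412322</td><td>18.9349160741967</td><td>14.4774062006848</td></tr>\n",
"    <tr><th>4</th><td>Less than 25</td><td>Female</td><td>256</td><td>40.234375</td><td>46.276596</td><td>76.470588</td><td>15.533981</td><td>66.013072</td><td>19.3720445626131</td><td>46.6410273328117</td></tr>\n",
"    <tr><th>1</th><td>25 - 45</td><td>Male</td><td>2802</td><td>49.750178</td><td>63.319672</td><td>65.097160</td><td>33.500717</td><td>38.139205</td><td>47.3182268799581</td><td>-9.17902233450359</td></tr>\n",
"    <tr><th>5</th><td>Less than 25</td><td>Male</td><td>1054</td><td>61.574953</td><td>69.640288</td><td>54.038997</td><td>25.423729</td><td>52.098765</td><td>100</td><td>-47.9012</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"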
" ], "text/plain": [ " age_cat sex count prevalence actual PPV actual NPV \\\n", "2 Greater than 45 Female 259 23.552124 41.071429 81.280788 \n", "3 Greater than 45 Male 1075 35.162791 57.467532 73.794003 \n", "0 25 - 45 Female 704 40.056818 55.520505 72.609819 \n", "4 Less than 25 Female 256 40.234375 46.276596 76.470588 \n", "1 25 - 45 Male 2802 49.750178 63.319672 65.097160 \n", "5 Less than 25 Male 1054 61.574953 69.640288 54.038997 \n", "\n", " actual FNR actual FPR expected FPR difference \n", "2 62.295082 16.666667 0 16.6667 \n", "3 53.174603 18.794835 7.82872060272464 10.9661144044490 \n", "0 37.588652 33.412322 18.9349160741967 14.4774062006848 \n", "4 15.533981 66.013072 19.3720445626131 46.6410273328117 \n", "1 33.500717 38.139205 47.3182268799581 -9.17902233450359 \n", "5 25.423729 52.098765 100 -47.9012 " ] }, "execution_count": 74, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ppv, npv = predictive_value(matrix_all)\n", "group_vars = ['age_cat', 'sex']\n", "table5 = make_table(subset, group_vars, ppv, npv)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Breakdown by race and sex." ] }, { "cell_type": "code", "execution_count": 75, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
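<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>race</th><th>sex</th><th>count</th><th>prevalence</th><th>actual PPV</th><th>actual NPV</th><th>actual FNR</th><th>actual FPR</th><th>expected FPR</th><th>difference</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>2</th><td>Caucasian</td><td>Female</td><td>567</td><td>35.097002</td><td>50.446429</td><td>74.927114</td><td>43.216080</td><td>30.163043</td><td>7.69083250175007</td><td>22.4722109765108</td></tr>\n",
"    <tr><th>0</th><td>African-American</td><td>Female</td><td>652</td><td>37.883436</td><td>51.335312</td><td>76.507937</td><td>29.959514</td><td>40.493827</td><td>13.7867567264522</td><td>26.7070704340416</td></tr>\n",
"    <tr><th>3</th><td>Caucasian</td><td>Male</td><td>1887</td><td>40.646529</td><td>62.222222</td><td>70.167064</td><td>48.891786</td><td>21.250000</td><td>20.3968108299805</td><td>0.853189170019501</td></tr>\n",
"    <tr><th>1</th><td>African-American</td><td>Male</td><td>3044</td><td>54.336399</td><td>65.106151</td><td>62.054681</td><td>27.690447</td><td>46.115108</td><td>64.9466440776828</td><td>-18.8315361640137</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"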
" ], "text/plain": [ " race sex count prevalence actual PPV actual NPV \\\n", "2 Caucasian Female 567 35.097002 50.446429 74.927114 \n", "0 African-American Female 652 37.883436 51.335312 76.507937 \n", "3 Caucasian Male 1887 40.646529 62.222222 70.167064 \n", "1 African-American Male 3044 54.336399 65.106151 62.054681 \n", "\n", " actual FNR actual FPR expected FPR difference \n", "2 43.216080 30.163043 7.69083250175007 22.4722109765108 \n", "0 29.959514 40.493827 13.7867567264522 26.7070704340416 \n", "3 48.891786 21.250000 20.3968108299805 0.853189170019501 \n", "1 27.690447 46.115108 64.9466440776828 -18.8315361640137 " ] }, "execution_count": 75, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ppv, npv = predictive_value(matrix_all)\n", "group_vars = ['race', 'sex']\n", "table6 = make_table(subset, group_vars, ppv, npv)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Breakdown by age, race, and sex." ] }, { "cell_type": "code", "execution_count": 76, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
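<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>age_cat</th><th>race</th><th>sex</th><th>count</th><th>prevalence</th><th>actual PPV</th><th>actual NPV</th><th>actual FNR</th><th>actual FPR</th><th>expected FPR</th><th>difference</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>4</th><td>Greater than 45</td><td>African-American</td><td>Female</td><td>88</td><td>22.727273</td><td>44.827586</td><td>88.135593</td><td>35.000000</td><td>23.529412</td><td>0</td><td>23.5294</td></tr>\n",
"    <tr><th>6</th><td>Greater than 45</td><td>Caucasian</td><td>Female</td><td>171</td><td>23.976608</td><td>37.037037</td><td>78.472222</td><td>75.609756</td><td>13.076923</td><td>0</td><td>13.0769</td></tr>\n",
"    <tr><th>7</th><td>Greater than 45</td><td>Caucasian</td><td>Male</td><td>581</td><td>28.915663</td><td>61.111111</td><td>76.985743</td><td>67.261905</td><td>8.474576</td><td>0</td><td>8.47458</td></tr>\n",
"    <tr><th>10</th><td>Less than 25</td><td>Caucasian</td><td>Female</td><td>87</td><td>31.034483</td><td>38.235294</td><td>94.736842</td><td>3.703704</td><td>70.000000</td><td>0</td><td>70</td></tr>\n",
"    <tr><th>0</th><td>25 - 45</td><td>African-American</td><td>Female</td><td>395</td><td>38.227848</td><td>52.659574</td><td>74.879227</td><td>34.437086</td><td>36.475410</td><td>14.5784223461080</td><td>21.8969874899576</td></tr>\n",
"    <tr><th>2</th><td>25 - 45</td><td>Caucasian</td><td>Female</td><td>309</td><td>42.394822</td><td>59.689922</td><td>70.000000</td><td>41.221374</td><td>29.213483</td><td>24.9067389374193</td><td>4.30674420864812</td></tr>\n",
"    <tr><th>5</th><td>Greater than 45</td><td>African-American</td><td>Male</td><td>494</td><td>42.510121</td><td>55.963303</td><td>68.115942</td><td>41.904762</td><td>33.802817</td><td>25.2138089028154</td><td>8.58900799859303</td></tr>\n",
"    <tr><th>3</th><td>25 - 45</td><td>Caucasian</td><td>Male</td><td>1003</td><td>43.369890</td><td>60.107817</td><td>66.455696</td><td>48.735632</td><td>26.056338</td><td>27.5430043789732</td><td>-1.48666635080423</td></tr>\n",
"    <tr><th>8</th><td>Less than 25</td><td>African-American</td><td>Female</td><td>169</td><td>44.970414</td><td>50.833333</td><td>69.387755</td><td>19.736842</td><td>63.440860</td><td>32.0728304197014</td><td>31.3680297953524</td></tr>\n",
"    <tr><th>1</th><td>25 - 45</td><td>African-American</td><td>Male</td><td>1799</td><td>53.307393</td><td>64.409881</td><td>63.881020</td><td>26.590198</td><td>46.309524</td><td>60.6900439996716</td><td>-14.3805201901478</td></tr>\n",
"    <tr><th>11</th><td>Less than 25</td><td>Caucasian</td><td>Male</td><td>303</td><td>54.125413</td><td>67.455621</td><td>62.686567</td><td>30.487805</td><td>39.568345</td><td>64.0583101483587</td><td>-24.4899648246177</td></tr>\n",
"    <tr><th>9</th><td>Less than 25</td><td>African-American</td><td>Male</td><td>751</td><td>64.580559</td><td>70.342205</td><td>48.888889</td><td>23.711340</td><td>58.646617</td><td>100</td><td>-41.3534</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"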
" ], "text/plain": [ " age_cat race sex count prevalence actual PPV \\\n", "4 Greater than 45 African-American Female 88 22.727273 44.827586 \n", "6 Greater than 45 Caucasian Female 171 23.976608 37.037037 \n", "7 Greater than 45 Caucasian Male 581 28.915663 61.111111 \n", "10 Less than 25 Caucasian Female 87 31.034483 38.235294 \n", "0 25 - 45 African-American Female 395 38.227848 52.659574 \n", "2 25 - 45 Caucasian Female 309 42.394822 59.689922 \n", "5 Greater than 45 African-American Male 494 42.510121 55.963303 \n", "3 25 - 45 Caucasian Male 1003 43.369890 60.107817 \n", "8 Less than 25 African-American Female 169 44.970414 50.833333 \n", "1 25 - 45 African-American Male 1799 53.307393 64.409881 \n", "11 Less than 25 Caucasian Male 303 54.125413 67.455621 \n", "9 Less than 25 African-American Male 751 64.580559 70.342205 \n", "\n", " actual NPV actual FNR actual FPR expected FPR difference \n", "4 88.135593 35.000000 23.529412 0 23.5294 \n", "6 78.472222 75.609756 13.076923 0 13.0769 \n", "7 76.985743 67.261905 8.474576 0 8.47458 \n", "10 94.736842 3.703704 70.000000 0 70 \n", "0 74.879227 34.437086 36.475410 14.5784223461080 21.8969874899576 \n", "2 70.000000 41.221374 29.213483 24.9067389374193 4.30674420864812 \n", "5 68.115942 41.904762 33.802817 25.2138089028154 8.58900799859303 \n", "3 66.455696 48.735632 26.056338 27.5430043789732 -1.48666635080423 \n", "8 69.387755 19.736842 63.440860 32.0728304197014 31.3680297953524 \n", "1 63.881020 26.590198 46.309524 60.6900439996716 -14.3805201901478 \n", "11 62.686567 30.487805 39.568345 64.0583101483587 -24.4899648246177 \n", "9 48.888889 23.711340 58.646617 100 -41.3534 " ] }, "execution_count": 76, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ppv, npv = predictive_value(matrix_all)\n", "group_vars = ['age_cat', 'race', 'sex']\n", "table7 = make_table(subset, group_vars, ppv, npv)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Those are all the possible subgroups for these 
three variables.\n", "\n", "Now we can see what the results look like." ] }, { "cell_type": "code", "execution_count": 77, "metadata": {}, "outputs": [], "source": [ "tables = [table1, table2, table3, table4, table5, table6, table7];" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following function plots one data point per subgroup showing the given metric versus prevalence.\n", "\n", "Groups with 200 or fewer people are shown with lighter colors.\n" ] }, { "cell_type": "code", "execution_count": 78, "metadata": {}, "outputs": [], "source": [ "def plot_table_var(table, var, color):\n", "    \"\"\"Plot one data point per row.\n", "    \n", "    table: DataFrame\n", "    var: which metric to plot\n", "    color: string\n", "    \"\"\"\n", "    for _, row in table.iterrows():\n", "        alpha = 0.8 if row['count'] > 200 else 0.3\n", "\n", "        plt.plot(row['prevalence'], row[var],\n", "                 'o', color=color, alpha=alpha)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here's what the results look like for FPR." ] }, { "cell_type": "code", "execution_count": 81, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fp_rates.plot(label='Expected FPR, constant CPV',\n", " color='C1')\n", "plt.axhline(fpr, linestyle='dotted', \n", " label='Expected FPR, constant FPR', color='gray')\n", "\n", "for table in tables:\n", " plot_table_var(table, 'actual FPR', 'C1')\n", " \n", "decorate(xlabel='Prevalence',\n", " ylabel='Percent',\n", " title='False positive rates by subgroup')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In general, groups with higher prevalence have higher false positive rates, but the effect is less extreme than what we would expect from the CPV model.\n", "\n", "Here are the results for positive predictive value." ] }, { "cell_type": "code", "execution_count": 82, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "ppv_rates.plot(label='Expected PPV, constant FPR', color='C0')\n", "plt.axhline(ppv, linestyle='dotted', \n", " label='Expected PPV, constant PPV', color='gray')\n", "\n", "for table in tables:\n", " plot_table_var(table, 'actual PPV', 'C0')\n", " \n", "decorate(xlabel='Prevalence',\n", " ylabel='Rate',\n", " title='Positive predictive value by subgroup')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Groups with higher prevalence have higher PPV, but the effect is less extreme than we would expect from the CPV model.\n", "\n", "Here are the results for false negative rate." ] }, { "cell_type": "code", "execution_count": 83, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fn_rates.plot(label='Expected FNR, constant CPV',\n", " color='C4')\n", "plt.axhline(fnr, linestyle='dotted', \n", " label='Expected FNR, constant FNR', color='gray')\n", "\n", "for table in tables:\n", " plot_table_var(table, 'actual FNR', 'C4')\n", " \n", "decorate(xlabel='Prevalence',\n", " ylabel='Percent',\n", " title='False negative rates by subgroup')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Groups with higher prevalence have lower FNR, but the effect is less extreme than we would expect from the CPV model.\n", "\n", "Here are the results for negative predictive value." ] }, { "cell_type": "code", "execution_count": 84, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "npv_rates.plot(label='Expected NPV, constant FPR', color='C2')\n", "plt.axhline(npv, linestyle='dotted', \n", " label='Expected NPV, constant NPV', color='gray')\n", "\n", "for table in tables:\n", " plot_table_var(table, 'actual NPV', 'C2')\n", " \n", "decorate(xlabel='Prevalence',\n", " ylabel='Percent',\n", " title='Negative predictive value by subgroup')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Groups with higher prevalence have lower NPV. In this case, the effect is almost exactly what we would expect from the CPV model." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Individual FPR" ] }, { "cell_type": "code", "execution_count": 85, "metadata": {}, "outputs": [], "source": [ "from scipy.interpolate import interp1d\n", "\n", "def crossing(series, value, **options):\n", " \"\"\"Find where a function crosses a value.\n", " \n", " series: Series\n", " value: number\n", " options: passed to interp1d (default is linear interp)\n", " \n", " returns: number\n", " \"\"\"\n", " interp = interp1d(series.values, series.index, **options)\n", " return interp(value)\n", "\n", "def interpolate(series, value, **options):\n", " \"\"\"Evaluate a function at a value.\n", " \n", " series: Series\n", " value: number\n", " options: passed to interp1d (default is linear interp)\n", " \n", " returns: number\n", " \"\"\"\n", " interp = interp1d(series.index, series.values, **options)\n", " return interp(value)" ] }, { "cell_type": "code", "execution_count": 86, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "cal_all = calibration_curve(cp)\n", "cal_all.plot()\n", "decorate(ylabel='Probability of recidivism')" ] }, { "cell_type": "code", "execution_count": 87, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array(3.40971594)" ] }, "execution_count": 87, "metadata": {}, "output_type": "execute_result" } ], "source": [ "crossing(cal_all, 0.4)" ] }, { "cell_type": "code", "execution_count": 88, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array(9.01595501)" ] }, "execution_count": 88, "metadata": {}, "output_type": "execute_result" } ], "source": [ "crossing(cal_all, 0.7)" ] }, { "cell_type": "code", "execution_count": 89, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array(0.39943493)" ] }, "execution_count": 89, "metadata": {}, "output_type": "execute_result" } ], "source": [ "interpolate(cal_all, 3.4)" ] }, { "cell_type": "code", "execution_count": 90, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array(0.6988189)" ] }, "execution_count": 90, "metadata": {}, "output_type": "execute_result" } ], "source": [ "interpolate(cal_all, 9)" ] }, { "cell_type": "code", "execution_count": 91, "metadata": {}, "outputs": [], "source": [ "def make_error_dist(std_dev):\n", " \"\"\"Make a discrete Gaussian distribution.\n", " \n", " std_dev: standard deviation\n", " \n", " returns: Series that maps errors to probabilities\n", " \"\"\"\n", " errors = np.linspace(-3, 3, 21)\n", " prob_error = np.exp(-(errors/std_dev)**2)\n", " prob_error /= np.sum(prob_error)\n", " error_dist = pd.Series(prob_error, index=errors)\n", " return error_dist" ] }, { "cell_type": "code", "execution_count": 92, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "error_dist = make_error_dist(std_dev=2)\n", "error_dist.plot(label='')\n", "decorate(xlabel='Error (score)',\n", " ylabel='Probability')" ] }, { "cell_type": "code", "execution_count": 93, "metadata": {}, "outputs": [], "source": [ "def individual_fpr(actual_prob_recid, cal, thresh, std_dev):\n", " \"\"\"Compute an individual FPR.\n", " \n", " actual_prob_recid: actual probability of recidivism\n", " cal: calibration curve, map from score to prob_recid\n", " thresh: threshold between low and not low risk\n", " std_dev: standard deviation of the error function\n", " \n", " returns: individual FPR\n", " \"\"\"\n", " # look up actual_prob_recid to get correct score\n", " correct_score = crossing(cal, actual_prob_recid)\n", "\n", " # make the error distribution\n", " error_dist = make_error_dist(std_dev)\n", "\n", " # loop through possible errors\n", " total_prob = 0\n", " for error, prob_error in error_dist.iteritems():\n", " # hypothetical score\n", " score = correct_score+error\n", " score = max(score, 1)\n", " score = min(score, 10)\n", " \n", " # probability of being classified 'not low' | error\n", " prob_positive = 0 if score < thresh else 1\n", "\n", " # probability of being a false positive | error\n", " prob_fp = prob_positive * (1-actual_prob_recid)\n", " \n", " total_prob += prob_error * prob_fp\n", " return total_prob" ] }, { "cell_type": "code", "execution_count": 94, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.01623457260428886" ] }, "execution_count": 94, "metadata": {}, "output_type": "execute_result" } ], "source": [ "individual_fpr(0.3, cal_all, 4.5, 2)" ] }, { "cell_type": "code", "execution_count": 95, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.3538656315021642" ] }, "execution_count": 95, "metadata": {}, "output_type": "execute_result" } ], "source": [ "individual_fpr(0.5, cal_all, 4.5, 2)" ] }, { "cell_type": "code", "execution_count": 96, 
"metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.3" ] }, "execution_count": 96, "metadata": {}, "output_type": "execute_result" } ], "source": [ "individual_fpr(0.7, cal_all, 4.5, 2)" ] }, { "cell_type": "code", "execution_count": 97, "metadata": {}, "outputs": [], "source": [ "def compute_fpr_vs_prob_recid(cal, thresh, std_dev):\n", " \"\"\"Computes FPR as a function of probability of recidivism.\n", " \n", " cal: calibration curve, map from score to prob_recid\n", " thresh: threshold between low and not low risk\n", " std_dev: standard deviation of the error function\n", " \n", " returns: Series\n", " \"\"\"\n", " prob_recid_array = np.linspace(min(cal), max(cal), 21)\n", " prob_fpr_series = pd.Series(index=prob_recid_array)\n", " for prob_recid in prob_recid_array:\n", " fpr = individual_fpr(prob_recid, cal, thresh, std_dev)\n", " prob_fpr_series[prob_recid] = fpr\n", " return prob_fpr_series" ] }, { "cell_type": "code", "execution_count": 98, "metadata": { "scrolled": true }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "s = compute_fpr_vs_prob_recid(cal_all, thresh=4.5, std_dev=2)\n", "s.plot(label='FPR, std_dev=2')\n", "\n", "s = compute_fpr_vs_prob_recid(cal_all, thresh=4.5, std_dev=1)\n", "s.plot(label='FPR, std_dev=1')\n", "\n", "decorate(xlabel='Actual probability of recidivism',\n", " ylabel='Probability of false positive')" ] }, { "cell_type": "code", "execution_count": 99, "metadata": {}, "outputs": [], "source": [ "def individual_fpr_given_score(actual_score, cal, thresh, std_dev):\n", " \"\"\"Compute an individual FPR.\n", " \n", " actual_score: score assigned\n", " cal: calibration curve, map from score to prob_recid\n", " thresh: threshold between low and high risk\n", " std_dev: standard deviation of the error function\n", " \n", " returns: individual FPR\n", " \"\"\"\n", " # make the error distribution\n", " error_dist = make_error_dist(std_dev)\n", "\n", " # loop through possible errors\n", " total_prob = 0\n", " for error, prob_error in error_dist.iteritems():\n", " # correct score\n", " correct_score = actual_score-error\n", " correct_score = max(correct_score, 1)\n", " correct_score = min(correct_score, 10)\n", " \n", " # map from correct score to probability of recidivism.\n", " # if calibration curves are different for different\n", " # groups, this one should be group specific.\n", " correct_prob_recid = interpolate(cal, correct_score)\n", " cond_ifpr = individual_fpr(correct_prob_recid,\n", " cal, thresh, std_dev)\n", " \n", " total_prob += prob_error * cond_ifpr\n", " return total_prob" ] }, { "cell_type": "code", "execution_count": 100, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.3451680036503149" ] }, "execution_count": 100, "metadata": {}, "output_type": "execute_result" } ], "source": [ "individual_fpr_given_score(6, cal_all, thresh=4.5, std_dev=2)" ] }, { "cell_type": "code", "execution_count": 131, "metadata": {}, "outputs": [], "source": [ "def make_ifpr_series(cal, 
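thresh, std_dev):\n", "    \"\"\"Compute the individual FPR for each decile score.\n", "    \n", "    cal: calibration curve, map from score to prob_recid\n", "    thresh: threshold between low and high risk\n", "    std_dev: standard deviation of the error function\n", "    \n", "    returns: Series that maps from score to individual FPR\n", "    \"\"\"\n", "    scores = np.arange(1, 11)\n", "    t = [individual_fpr_given_score(score, cal, thresh, std_dev)\n", "         for score in scores]\n", "    \n", "    ifpr_series = pd.Series(t, scores)\n", "    return ifpr_series" ] }, { "cell_type": "code", "execution_count": 132, "metadata": { "scrolled": true }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "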
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "thresh = 4.5\n", "\n", "s = make_ifpr_series(cal_all, thresh, std_dev=2)\n", "s.plot(label='FPR, std_dev=2')\n", "\n", "s = make_ifpr_series(cal_all, thresh, std_dev=1)\n", "s.plot(label='FPR, std_dev=1')\n", "\n", "decorate(xlabel='Score',\n", " ylabel='Probability of false positive')" ] }, { "cell_type": "code", "execution_count": 133, "metadata": {}, "outputs": [], "source": [ "def assign_individual_fpr(df, cal, thresh, std_dev):\n", " \"\"\"Assign individual FPRs to defendants.\n", " \n", " df: DataFrame\n", " cal: calibration curve, map from score to prob_recid\n", " thresh: threshold between low and high risk\n", " std_dev: standard deviation of the error function\n", " \"\"\"\n", " # compute the map from score to FPR\n", " ifpr_series = make_ifpr_series(cal, thresh, std_dev)\n", " \n", " # assign FPR to each defendant\n", " df['ifpr'] = [ifpr_series[score] for score in df.decile_score]" ] }, { "cell_type": "code", "execution_count": 134, "metadata": {}, "outputs": [], "source": [ "assign_individual_fpr(cp, cal_all, thresh=4.5, std_dev=2)" ] }, { "cell_type": "code", "execution_count": 135, "metadata": {}, "outputs": [], "source": [ "def make_cdf(series):\n", " \"\"\"Make a CDF.\"\"\"\n", " counts = series.value_counts().sort_index()\n", " counts /= counts.sum()\n", " return counts.cumsum()\n", "\n", "def plot_cdf(cdf, **options):\n", " \"\"\"Plot a CDF as a step function.\"\"\"\n", " plt.step(cdf.index, cdf.values, where='post', **options)" ] }, { "cell_type": "code", "execution_count": 136, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.018482 0.199612\n", "0.050389 0.330053\n", "0.128211 0.433601\n", "0.210571 0.540200\n", "0.258458 0.593291\n", "0.282626 0.687691\n", "0.292149 0.758109\n", "0.327170 0.829082\n", "0.345168 0.917937\n", "0.353374 1.000000\n", "Name: ifpr, dtype: float64" ] }, "execution_count": 136, "metadata": {}, "output_type": "execute_result" } ], 
"source": [ "cdf_ifpr = make_cdf(cp.ifpr)" ] }, { "cell_type": "code", "execution_count": 137, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plot_cdf(cdf_ifpr, label='All')\n", "\n", "decorate(xlabel='Individual probability of false positive', \n", " ylabel='CDF',\n", " ylim=[0,1])" ] }, { "cell_type": "code", "execution_count": 138, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "black = cp[cp.race=='African-American']\n", "white = cp[cp.race=='Caucasian']\n", "\n", "thresh = 4.5\n", "std_dev = 2\n", "\n", "cal_black = calibration_curve(black)\n", "s = make_ifpr_series(cal_black, thresh, std_dev)\n", "s.plot(label='FPR, black')\n", "\n", "cal_white = calibration_curve(white)\n", "s = make_ifpr_series(cal_white, thresh, std_dev)\n", "s.plot(label='FPR, white')\n", "\n", "decorate(xlabel='Score',\n", " ylabel='Probability of false positive')" ] }, { "cell_type": "code", "execution_count": 139, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "male = cp[cp.sex=='Male']\n", "female = cp[cp.sex=='Female']\n", "\n", "thresh = 4.5\n", "std_dev = 2\n", "\n", "cal_male = calibration_curve(male)\n", "s = make_ifpr_series(cal_male, thresh, std_dev)\n", "s.plot(label='FPR, male')\n", "\n", "cal_female = calibration_curve(female)\n", "s = make_ifpr_series(cal_female, thresh, std_dev)\n", "s.plot(label='FPR, female')\n", "\n", "decorate(xlabel='Score',\n", " ylabel='Probability of false positive')" ] }, { "cell_type": "code", "execution_count": 176, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.18984802588839386" ] }, "execution_count": 176, "metadata": {}, "output_type": "execute_result" } ], "source": [ "assign_individual_fpr(cp, cal_all, thresh=4.5, std_dev=2)\n", "cp.ifpr.mean()" ] }, { "cell_type": "code", "execution_count": 177, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "race\n", "African-American 0.225117\n", "Asian 0.115755\n", "Caucasian 0.160501\n", "Hispanic 0.142154\n", "Native American 0.240612\n", "Other 0.119566\n", "Name: ifpr, dtype: float64" ] }, "execution_count": 177, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cp.groupby('race').ifpr.mean()" ] }, { "cell_type": "code", "execution_count": 178, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.182639042962266" ] }, "execution_count": 178, "metadata": {}, "output_type": "execute_result" } ], "source": [ "assign_individual_fpr(cp, cal_all, thresh=4.5, std_dev=1)\n", "cp.ifpr.mean()" ] }, { "cell_type": "code", "execution_count": 179, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "race\n", "African-American 0.223846\n", "Asian 0.098804\n", "Caucasian 0.148866\n", "Hispanic 0.125011\n", "Native American 0.239059\n", "Other 0.100288\n", "Name: ifpr, dtype: float64" ] }, "execution_count": 179, "metadata": {}, "output_type": "execute_result" } ], "source": [ 
"cp.groupby('race').ifpr.mean()" ] }, { "cell_type": "code", "execution_count": 180, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.1777100083171615" ] }, "execution_count": 180, "metadata": {}, "output_type": "execute_result" } ], "source": [ "assign_individual_fpr(cp, cal_all, thresh=4.5, std_dev=0.01)\n", "cp.ifpr.mean()" ] }, { "cell_type": "code", "execution_count": 181, "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/plain": [ "race\n", "African-American 0.220867\n", "Asian 0.097329\n", "Caucasian 0.142568\n", "Hispanic 0.119877\n", "Native American 0.228798\n", "Other 0.085460\n", "Name: ifpr, dtype: float64" ] }, "execution_count": 181, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cp.groupby('race').ifpr.mean()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Individual FNR" ] }, { "cell_type": "code", "execution_count": 141, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "cal_all = calibration_curve(cp)\n", "cal_all.plot()\n", "decorate(ylabel='Probability of recidivism')" ] }, { "cell_type": "code", "execution_count": 142, "metadata": {}, "outputs": [], "source": [ "def individual_fnr(actual_prob_recid, cal, thresh, std_dev):\n", " \"\"\"Compute an individual FNR.\n", " \n", " actual_prob_recid: actual probability of recidivism\n", " cal: calibration curve, map from score to prob_recid\n", " thresh: threshold between low and not low risk\n", " std_dev: standard deviation of the error function\n", " \n", " returns: individual FNR\n", " \"\"\"\n", " # look up actual_prob_recid to get correct score\n", " correct_score = crossing(cal, actual_prob_recid)\n", "\n", " # make the error distribution\n", " error_dist = make_error_dist(std_dev)\n", "\n", " # loop through possible errors\n", " total_prob = 0\n", " for error, prob_error in error_dist.iteritems():\n", " # hypothetical score\n", " score = correct_score+error\n", " score = max(score, 1)\n", " score = min(score, 10)\n", " \n", " # probability of being classified 'low' | error\n", " prob_negative = 0 if score >= thresh else 1\n", "\n", " # probability of being a false negative | error\n", " prob_fp = prob_negative * actual_prob_recid\n", " \n", " total_prob += prob_error * prob_fp\n", " return total_prob" ] }, { "cell_type": "code", "execution_count": 143, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.29304232602673336" ] }, "execution_count": 143, "metadata": {}, "output_type": "execute_result" } ], "source": [ "individual_fnr(0.3, cal_all, 4.5, 2)" ] }, { "cell_type": "code", "execution_count": 144, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.14613436849783573" ] }, "execution_count": 144, "metadata": {}, "output_type": "execute_result" } ], "source": [ "individual_fnr(0.5, cal_all, 4.5, 2)" ] }, { "cell_type": "code", "execution_count": 145, "metadata": {}, "outputs": [ { "data": 
{ "text/plain": [ "0.0" ] }, "execution_count": 145, "metadata": {}, "output_type": "execute_result" } ], "source": [ "individual_fnr(0.7, cal_all, 4.5, 2)" ] }, { "cell_type": "code", "execution_count": 146, "metadata": {}, "outputs": [], "source": [ "def compute_fnr_vs_prob_recid(cal, thresh, std_dev):\n", "    \"\"\"Computes FNR as a function of probability of recidivism.\n", "    \n", "    cal: calibration curve, map from score to prob_recid\n", "    thresh: threshold between low and not low risk\n", "    std_dev: standard deviation of the error function\n", "    \n", "    returns: Series\n", "    \"\"\"\n", "    prob_recid_array = np.linspace(min(cal), max(cal), 21)\n", "    prob_fnr_series = pd.Series(index=prob_recid_array, dtype=float)\n", "    for prob_recid in prob_recid_array:\n", "        fnr = individual_fnr(prob_recid, cal, thresh, std_dev)\n", "        prob_fnr_series[prob_recid] = fnr\n", "    return prob_fnr_series" ] }, { "cell_type": "code", "execution_count": 147, "metadata": { "scrolled": true }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "s = compute_fnr_vs_prob_recid(cal_all, thresh=4.5, std_dev=2)\n", "s.plot(label='FNR, std_dev=2')\n", "\n", "s = compute_fnr_vs_prob_recid(cal_all, thresh=4.5, std_dev=1)\n", "s.plot(label='FNR, std_dev=1')\n", "\n", "decorate(xlabel='Actual probability of recidivism',\n", "         ylabel='Probability of false negative')" ] }, { "cell_type": "code", "execution_count": 148, "metadata": {}, "outputs": [], "source": [ "def individual_fnr_given_score(actual_score, cal, thresh, std_dev):\n", "    \"\"\"Compute an individual FNR given the assigned score.\n", "    \n", "    actual_score: score assigned\n", "    cal: calibration curve, map from score to prob_recid\n", "    thresh: threshold between low and high risk\n", "    std_dev: standard deviation of the error function\n", "    \n", "    returns: individual FNR\n", "    \"\"\"\n", "    # make the error distribution\n", "    error_dist = make_error_dist(std_dev)\n", "\n", "    # loop through possible errors\n", "    total_prob = 0\n", "    for error, prob_error in error_dist.items():\n", "        # correct score\n", "        correct_score = actual_score-error\n", "        correct_score = max(correct_score, 1)\n", "        correct_score = min(correct_score, 10)\n", "        \n", "        # map from correct score to probability of recidivism.\n", "        # if calibration curves are different for different\n", "        # groups, this one should be group specific.\n", "        correct_prob_recid = interpolate(cal, correct_score)\n", "        cond_ifnr = individual_fnr(correct_prob_recid,\n", "                                   cal, thresh, std_dev)\n", "        \n", "        total_prob += prob_error * cond_ifnr\n", "    return total_prob" ] }, { "cell_type": "code", "execution_count": 149, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.09711411113575788" ] }, "execution_count": 149, "metadata": {}, "output_type": "execute_result" } ], "source": [ "individual_fnr_given_score(6, cal_all, thresh=4.5, std_dev=2)" ] }, { "cell_type": "code", "execution_count": 150, "metadata": {}, "outputs": [], "source": [ "def 
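make_ifnr_series(cal, thresh, std_dev):\n", "    \"\"\"Compute the individual FNR for each decile score.\n", "    \n", "    cal: calibration curve, map from score to prob_recid\n", "    thresh: threshold between low and high risk\n", "    std_dev: standard deviation of the error function\n", "    \n", "    returns: Series that maps from score to individual FNR\n", "    \"\"\"\n", "    scores = np.arange(1, 11)\n", "    t = [individual_fnr_given_score(score, cal, thresh, std_dev)\n", "         for score in scores]\n", "    \n", "    ifnr_series = pd.Series(t, scores)\n", "    return ifnr_series" ] }, { "cell_type": "code", "execution_count": 151, "metadata": { "scrolled": true }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "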
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "thresh = 4.5\n", "\n", "s = make_ifnr_series(cal_all, thresh, std_dev=2)\n", "s.plot(label='FNR, std_dev=2')\n", "\n", "s = make_ifnr_series(cal_all, thresh, std_dev=1)\n", "s.plot(label='FNR, std_dev=1')\n", "\n", "decorate(xlabel='Score',\n", "         ylabel='Probability of false negative')" ] }, { "cell_type": "code", "execution_count": 152, "metadata": {}, "outputs": [], "source": [ "def assign_individual_fnr(df, cal, thresh, std_dev):\n", "    \"\"\"Assign individual FNRs to defendants.\n", "    \n", "    df: DataFrame\n", "    cal: calibration curve, map from score to prob_recid\n", "    thresh: threshold between low and high risk\n", "    std_dev: standard deviation of the error function\n", "    \"\"\"\n", "    # compute the map from score to FNR\n", "    ifnr_series = make_ifnr_series(cal, thresh, std_dev)\n", "    \n", "    # assign FNR to each defendant\n", "    df['ifnr'] = [ifnr_series[score] for score in df.decile_score]" ] }, { "cell_type": "code", "execution_count": 153, "metadata": {}, "outputs": [], "source": [ "assign_individual_fnr(cp, cal_all, thresh=4.5, std_dev=2)" ] }, { "cell_type": "code", "execution_count": 154, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.000205 0.053091\n", "0.002586 0.123510\n", "0.016131 0.194483\n", "0.043440 0.276546\n", "0.097114 0.365401\n", "0.177649 0.459800\n", "0.233751 0.566399\n", "0.248533 0.766011\n", "0.266719 0.869559\n", "0.273906 1.000000\n", "Name: ifnr, dtype: float64" ] }, "execution_count": 154, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cdf_ifnr = make_cdf(cp.ifnr)" ] }, { "cell_type": "code", "execution_count": 155, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "plot_cdf(cdf_ifnr, label='All')\n", "\n", "decorate(xlabel='Individual probability of false negative', \n", " ylabel='CDF',\n", " ylim=[0,1])" ] }, { "cell_type": "code", "execution_count": 159, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "thresh = 4.5\n", "std_dev = 2\n", "\n", "cal_black = calibration_curve(black)\n", "s = make_ifnr_series(cal_black, thresh, std_dev)\n", "s.plot(label='FNR, black')\n", "\n", "cal_white = calibration_curve(white)\n", "s = make_ifnr_series(cal_white, thresh, std_dev)\n", "s.plot(label='FNR, white')\n", "\n", "decorate(xlabel='Score',\n", " ylabel='Probability of false negative')" ] }, { "cell_type": "code", "execution_count": 160, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "thresh = 4.5\n", "std_dev = 2\n", "\n", "cal_male = calibration_curve(male)\n", "s = make_ifnr_series(cal_male, thresh, std_dev)\n", "s.plot(label='FNR, male')\n", "\n", "cal_female = calibration_curve(female)\n", "s = make_ifnr_series(cal_female, thresh, std_dev)\n", "s.plot(label='FNR, female')\n", "\n", "decorate(xlabel='Score',\n", " ylabel='Probability of false negative')" ] }, { "cell_type": "code", "execution_count": 182, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.1681762611669047" ] }, "execution_count": 182, "metadata": {}, "output_type": "execute_result" } ], "source": [ "assign_individual_fnr(cp, cal_all, thresh=4.5, std_dev=2)\n", "cp.ifnr.mean()" ] }, { "cell_type": "code", "execution_count": 183, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "race\n", "African-American 0.139389\n", "Asian 0.209441\n", "Caucasian 0.194595\n", "Hispanic 0.203360\n", "Native American 0.110333\n", "Other 0.218244\n", "Name: ifnr, dtype: float64" ] }, "execution_count": 183, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cp.groupby('race').ifnr.mean()" ] }, { "cell_type": "code", "execution_count": 184, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.1690990899101053" ] }, "execution_count": 184, "metadata": {}, "output_type": "execute_result" } ], "source": [ "assign_individual_fnr(cp, cal_all, thresh=4.5, std_dev=1)\n", "cp.ifnr.mean()" ] }, { "cell_type": "code", "execution_count": 185, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "race\n", "African-American 0.136836\n", "Asian 0.210584\n", "Caucasian 0.198345\n", "Hispanic 0.210041\n", "Native American 0.105383\n", "Other 0.225368\n", "Name: ifnr, dtype: float64" ] }, "execution_count": 185, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cp.groupby('race').ifnr.mean()" ] }, { "cell_type": "code", "execution_count": 186, 
"metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.1685611311339046" ] }, "execution_count": 186, "metadata": {}, "output_type": "execute_result" } ], "source": [ "assign_individual_fnr(cp, cal_all, thresh=4.5, std_dev=0.01)\n", "cp.ifnr.mean()" ] }, { "cell_type": "code", "execution_count": 187, "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/plain": [ "race\n", "African-American 0.136599\n", "Asian 0.197959\n", "Caucasian 0.197450\n", "Hispanic 0.207289\n", "Native American 0.114221\n", "Other 0.228528\n", "Name: ifnr, dtype: float64" ] }, "execution_count": 187, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cp.groupby('race').ifnr.mean()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### What would it take?\n", "\n", "In this section I explore what it would take to make a test with the same false positive rate for all groups." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def fpr_thresh(df, thresh):\n", "    \"\"\"Compute the FPR when scores at or above thresh count as high risk.\"\"\"\n", "    df = df.copy()\n", "    df['high'] = df.decile_score >= thresh\n", "    matrix_all = crosstab(df, 'two_year_recid', 'high')\n", "    fpr, fnr = error_rates(matrix_all)\n", "    return fpr" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fpr_thresh(cp, 5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fpr_thresh(black, 5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fpr_thresh(white, 5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def sweep_thresh(df):\n", "    \"\"\"Compute the FPR for each threshold from 2 through 9.\"\"\"\n", "    threshes = range(2, 10)\n", "    sweep = pd.Series(index=threshes, dtype=float)\n", "    for thresh in threshes:\n", "        sweep[thresh] = fpr_thresh(df, thresh)\n", "    \n", "    return sweep" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.axhline(32.35, color='gray')\n", "sweep_thresh(cp).plot(label='All')\n", "sweep_thresh(black).plot(label='Black')\n", "sweep_thresh(white).plot(label='White')\n", "decorate(xlabel='Threshold',\n", "         ylabel='False positive rate')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def find_threshold(group, fpr):\n", "    series = sweep_thresh(group)\n", "    xs = crossing(series.dropna(), 
fpr)\n", " return xs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "all_thresh = find_threshold(cp, 32.35)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "black_thresh = find_threshold(black, 32.35)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "white_thresh = find_threshold(white, 32.35)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "interpolate(calibration_curve(cp), all_thresh)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "interpolate(calibration_curve(cp), black_thresh)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "interpolate(calibration_curve(black), black_thresh)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "interpolate(calibration_curve(cp), white_thresh)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "interpolate(calibration_curve(white), white_thresh)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", 
"metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "black_male = black[black.sex=='Male']\n", "black_male.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "black_female = black[black.sex=='Female']\n", "black_female.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "old_black_female = black_female[black_female.age_cat=='Greater than 45']\n", "old_black_female.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "old_white_female = cp[(cp.age_cat=='Greater than 45') &\n", " (cp.sex=='Female') &\n", " (cp.race=='Caucasian')]\n", "old_white_female.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "young_black_male = cp[(cp.age_cat=='Less than 25') &\n", " (cp.sex=='Male') &\n", " 
(cp.race=='African-American')]\n", "young_black_male.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fpr_thresh(cp, 5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fpr_thresh(black, 5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fpr_thresh(black_female, 5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fpr_thresh(old_black_female, 5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fpr_thresh(black_male, 5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fpr_thresh(young_black_male, 5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.axhline(32.35, color='gray')\n", "sweep_thresh(black).plot(label='Black', color='gray')\n", "sweep_thresh(black_male).plot(label='Black male')\n", "sweep_thresh(young_black_male).plot(label='Young black male')\n", "\n", "decorate(xlabel='Threshold',\n", "         ylabel='False positive rate')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.axhline(32.35, color='gray')\n", "sweep_thresh(black).plot(label='Black', color='gray')\n", "sweep_thresh(black_female).plot(label='Black female')\n", "sweep_thresh(old_black_female).plot(label='Old black female')\n", "\n", 
"decorate(xlabel='Threshold',\n", " ylabel='False positive rate')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ybm_thresh = find_threshold(young_black_male, 32.35)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "obf_thresh = find_threshold(old_black_female, 32.35)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "interpolate(calibration_curve(cp), ybm_thresh)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "interpolate(calibration_curve(cp), obf_thresh)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.5" } }, "nbformat": 4, "nbformat_minor": 2 }