{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Imprisonment by Race\n", "\n", "This notebook explores the prison populations of the 3 largest racial groups in US prisons (from 2006 to 2016): non-Hispanic white people, non-Hispanic black people, and Hispanic people. \n", "\n", "In this notebook, I show that, over the entire span of data (2006-2016), black people are the largest racial prison population both in raw numbers and as a proportion of the total same-race US population. \n", "\n", "**Average Number of Prisoners (from 2006-2016)**\n", "\n", "| Race | Average Number in Prison | \n", "| ------------- |:-------------:| \n", "| Black | 551209 | \n", "| White | 476136 | \n", "| Hispanic | 336500 | \n", "\n", "**Average Percent US Racial Population in Prison**\n", "\n", "| Race | % of US Racial Pop. in Prison | \n", "| ------------- | :------------- | \n", "| Black | 1.440 % |\n", "| White | 0.241 % |\n", "| Hispanic | 0.656 % |\n", "\n", "**Change in Percent US Racial Population in Prison (2006 to 2016)**\n", "\n", "| Race | Change in % of US Racial Pop. in Prison (percentage points) | \n", "| ------------- | :------------- | \n", "| Black | -0.406 |\n", "| White | -0.035 |\n", "| Hispanic | -0.113 |\n", "\n", "That is fairly astonishing. A randomly selected black person is (on average) nearly 6 times as likely to be in prison as a randomly selected white person. The gap between these rates is large and shows up in every year of the data, so there are clearly more questions to answer. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data Source\n", "\n", "This data is from file p16t03.csv of the Bureau of Justice Statistics [Prisoner Series](https://www.bjs.gov/index.cfm?ty=pbse&sid=40) data." 
] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "import seaborn as sns\n", "import matplotlib.pyplot as plt\n", "import matplotlib.ticker as ticker\n", "import os\n", "from IPython.core.display import display, HTML\n", "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [], "source": [ "# Importing modules from a visualization package.\n", "# from bokeh.sampledata.us_states import data as states|\n", "from bokeh.plotting import figure, show, output_notebook\n", "from bokeh.models import HoverTool, ColumnDataSource\n", "from bokeh.models import LinearColorMapper, ColorBar, BasicTicker" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# styling\n", "pd.options.display.max_columns = None\n", "display(HTML(\"\"))\n", "pd.set_option('display.float_format',lambda x: '%.3f' % x)\n", "plt.rcParams['figure.figsize'] = 10,10" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "CSV_PATH = os.path.join('data', 'prison', 'p16t03.csv')\n", "race_sex_raw = pd.read_csv(CSV_PATH, \n", " encoding='latin1',\n", " header=11, \n", " na_values=':',\n", " thousands=r',')\n", "race_sex_raw.dropna(axis=0, thresh=3, inplace=True)\n", "race_sex_raw.dropna(axis=1, thresh=3, inplace=True)\n", "race_sex_raw.dropna(axis=0, inplace=True)\n", "fix = lambda x: x.split('/')[0]\n", "race_sex_raw['Year'] = race_sex_raw['Year'].apply(fix)\n", "race_sex_raw.columns = [x.split('/')[0] for x in race_sex_raw.columns]\n", "race_sex_raw.set_index('Year', inplace=True)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Total Federal State Male Female White Black \\\n", "Year \n", "2006 1504598.0 173533.0 1331065.0 1401261.0 103337.0 507100.0 590300.0 \n", "2007 1532851.0 179204.0 1353647.0 1427088.0 105763.0 499800.0 592900.0 \n", "2008 1547742.0 182333.0 1365409.0 1441384.0 106358.0 499900.0 592800.0 \n", "2009 1553574.0 187886.0 1365688.0 1448239.0 105335.0 490000.0 584800.0 \n", "2010 1552669.0 190641.0 1362028.0 1447766.0 104903.0 484400.0 572700.0 \n", "2011 1538847.0 197050.0 1341797.0 1435141.0 103706.0 474300.0 557100.0 \n", "2012 1512430.0 196574.0 1315856.0 1411076.0 101354.0 466600.0 537800.0 \n", "2013 1520403.0 195098.0 1325305.0 1416102.0 104301.0 463900.0 529900.0 \n", "2014 1507781.0 191374.0 1316407.0 1401685.0 106096.0 461500.0 518700.0 \n", "2015 1476847.0 178688.0 1298159.0 1371879.0 104968.0 450200.0 499400.0 \n", "2016 1458173.0 171482.0 1286691.0 1352684.0 105489.0 439800.0 486900.0 \n", "\n", " Hispanic \n", "Year \n", "2006 313600.0 \n", "2007 330400.0 \n", "2008 329800.0 \n", "2009 341200.0 \n", "2010 345800.0 \n", "2011 347800.0 \n", "2012 340300.0 \n", "2013 341200.0 \n", "2014 338900.0 \n", "2015 333200.0 \n", "2016 339300.0 " ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "race_sex_raw" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Average number of black people in prison from 2006 to 2016: 551209\n", "Average number of White people in prison from 2006 to 2016: 476136\n", "Average number of Hispanic people in prison from 2006 to 2016: 336500\n" ] } ], "source": [ "print('Average number of {:>8s} people in prison from 2006 to 2016: {:6.0f}'\n", " .format('black', race_sex_raw['Black'].mean()))\n", "print('Average number of {:>8s} people in prison from 2006 to 2016: {:6.0f}'\n", " .format('White', race_sex_raw['White'].mean()))\n", "print('Average number of {:>8s} people in prison from 2006 to 
2016: {:6.0f}'\n", "      .format('Hispanic', race_sex_raw['Hispanic'].mean()))" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "with sns.axes_style(\"whitegrid\"):\n", "    fig, ax = plt.subplots(figsize=(10,7))\n", "    ax.plot(race_sex_raw['White'])\n", "    ax.plot(race_sex_raw['Black'])\n", "    ax.plot(race_sex_raw['Hispanic'])\n", "    ax.set_title('Total Imprisonment Counts by Race (table: p16t03)',fontsize=14)\n", "    ax.set_xlabel('Year', fontsize=14)\n", "    ax.set_ylabel('Number of People Imprisoned', fontsize=14)\n", "    ax.legend(fontsize=14)\n", "    ax.set_ylim([0, 1.1*max([race_sex_raw['White'].max(), \n", "                             race_sex_raw['Black'].max(),\n", "                             race_sex_raw['Hispanic'].max()])])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Looking at this plot of total imprisonment counts, we see that:\n", "* In every year of the data, there are more non-Hispanic black people in prison than people of any other race.\n", "* In every year of the data, there are more non-Hispanic white people than Hispanic people in prison.\n", "* The numbers of imprisoned black people and white people decreased steadily from 2008 to 2016, while the number of imprisoned Hispanic people was fairly flat over that time.\n", "\n", "These numbers are raw counts, so they don't account for the fact that these races make up different proportions of the entire US population. This motivates questions like:\n", "\n", "* What percent of the population of each race is in prison?\n", "\n", "To answer that, I'll have to find population data broken down by race.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Census Data\n", "\n", "The Constitution requires a full national census every 10 years that counts every resident of the United States. 
The results are used to determine how much federal money is allocated to each district for things like schools, roadways, and police, and how many congressional representatives are allocated to each state. With the help of smaller surveys, the US Census Bureau estimates populations for the years between censuses.\n", "\n", "https://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?src=bkmk" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Initial Data Format:\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ " Id Year Id.1 Sex Id.2 Hispanic Origin \\\n", "0 cen42010 April 1, 2010 Census female Female hisp Hispanic \n", "1 cen42010 April 1, 2010 Census female Female nhisp Not Hispanic \n", "2 cen42010 April 1, 2010 Census female Female tothisp Total \n", "3 cen42010 April 1, 2010 Census male Male hisp Hispanic \n", "4 cen42010 April 1, 2010 Census male Male nhisp Not Hispanic \n", "\n", " Id.3 Id2 Geography Total Race Alone - White \\\n", "0 0100000US NaN United States 24858794 21936806 \n", "1 0100000US NaN United States 132105418 100301335 \n", "2 0100000US NaN United States 156964212 122238141 \n", "3 0100000US NaN United States 25618800 22681299 \n", "4 0100000US NaN United States 126162526 97017621 \n", "\n", " Race Alone - Black or African American \\\n", "0 1191984 \n", "1 19853611 \n", "2 21045595 \n", "3 1136129 \n", "4 18068911 \n", "\n", " Race Alone - American Indian and Alaska Native Race Alone - Asian \\\n", "0 702309 249346 \n", "1 1147502 7691693 \n", "2 1849811 7941039 \n", "3 773939 248654 \n", "4 1115756 6969823 \n", "\n", " Race Alone - Native Hawaiian and Other Pacific Islander Two or More Races \n", "0 85203 693146 \n", "1 246518 2864759 \n", "2 331721 3557905 \n", "3 92206 686573 \n", "4 250698 2739717 " ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n", "Data Format after Processing:\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ " Year Hispanic Origin Total white_only_pop black_only_pop \\\n", "24 July 1, 2010 Hispanic 50754069 44855529 2343053 \n", "25 July 1, 2010 Not Hispanic 258594124 197394319 38015461 \n", "33 July 1, 2011 Hispanic 51906353 45841771 2410490 \n", "34 July 1, 2011 Not Hispanic 259757005 197519026 38393758 \n", "42 July 1, 2012 Hispanic 52993496 46759165 2480193 \n", "\n", " Two or More Races \n", "24 1392607 \n", "25 5647760 \n", "33 1445820 \n", "34 5826406 \n", "42 1498388 " ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "CSV_PATH = os.path.join('data', 'pop', 'PEP_2016_PEPSR6H_with_ann.csv')\n", "race_pop = pd.read_csv(CSV_PATH, header=[1], encoding='latin1')\n", "print('Initial Data Format:')\n", "display(race_pop.head())\n", "# Only looking for national values\n", "race_pop = race_pop[race_pop['Geography'] == 'United States']\n", "# Eliminating the actual census values\n", "race_pop = race_pop[~race_pop['Year'].str.contains('April')]\n", "# Eliminating aggregated rows\n", "race_pop = race_pop[race_pop['Hispanic Origin'] != 'Total']\n", "# race_pop = race_pop[race_pop['Sex'] != 'Both Sexes'] # for later\n", "race_pop = race_pop[~race_pop['Sex'].isin(['Male','Female'])]\n", "\n", "drop_cols = ['Id', 'Id.1', 'Id.2', 'Id2', 'Id.3', 'Geography', 'Sex',\n", " 'Race Alone - American Indian and Alaska Native', 'Race Alone - Asian',\n", " 'Race Alone - Native Hawaiian and Other Pacific Islander']\n", "race_pop.drop(drop_cols, axis=1, inplace=True)\n", "\n", "# Reducing the size of long column names\n", "col_map = {'Race Alone - Black or African American':'black_only_pop',\n", " 'Race Alone - White': 'white_only_pop'}\n", "race_pop.rename(col_map, axis=1, inplace=True)\n", "\n", "print('\\nData Format after Processing:')\n", "race_pop.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Based on the aggregation of the data I pulled, there is a rather vague 'Two or More Races' column. 
The prisoner data I've been working with does not distinguish multiracial people from single-race people; it only breaks race down into non-Hispanic white, non-Hispanic black, and Hispanic. Someone can be non-Hispanic, black, and multiracial, which would make it harder and less accurate to combine this population data with the prison data. Fortunately, the [documentation for the prison data](https://www.bjs.gov/content/pub/pdf/p16.pd) includes footnotes indicating that the white and black prison counts exclude persons of two or more races, so I exclude that column from the population data as well." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "race_pop.drop('Two or More Races', axis=1, inplace=True)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "hisp = race_pop[race_pop['Hispanic Origin'] == 'Hispanic'].copy()\n", "non_hisp = race_pop[race_pop['Hispanic Origin'] != 'Hispanic'].copy()\n", "non_hisp.drop(['Hispanic Origin', 'Total'], axis=1, inplace=True)\n", "hisp.drop(['white_only_pop','black_only_pop', 'Hispanic Origin'], axis=1, inplace=True)\n", "hisp.rename({'Total':'Hispanic_pop'}, axis=1, inplace=True)\n", "us_race_pop = hisp.merge(non_hisp, on=['Year'])\n", "fix_yr = lambda x: x.split(' ')[-1]\n", "us_race_pop['Year'] = us_race_pop['Year'].apply(fix_yr)\n", "us_race_pop.set_index('Year', inplace=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This provides population data from 2010 on, but not for the earlier years. To cover the earlier years, I need a second data set. 
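
Combining the two sources comes down to stacking two year-indexed tables that share column names. Here is a minimal sketch with made-up numbers; the frames `early` and `late` and the column `extra` are hypothetical stand-ins, not the Census data:

```python
import pandas as pd

# Toy values only; the real tables come from the Census files.
early = pd.DataFrame(
    {'white_only_pop': [100, 101], 'black_only_pop': [20, 21], 'extra': [1, 2]},
    index=pd.Index(['2008', '2009'], name='Year'))
late = pd.DataFrame(
    {'white_only_pop': [102, 103], 'black_only_pop': [22, 23]},
    index=pd.Index(['2009', '2010'], name='Year'))

# join='inner' keeps only the columns present in both tables.
combined = pd.concat([early, late], join='inner')

# If the two sources report the same year, keep the row from the later source.
combined = combined[~combined.index.duplicated(keep='last')]
print(combined)
```

The `duplicated` filter is only a precaution in this sketch; whether the real files overlap on a year depends on how they were cleaned.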
\n", "\n", "https://www.census.gov/data/tables/time-series/demo/popest/intercensal-2000-2010-national.html" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "EXCEL_PATH = os.path.join('data', 'pop', 'us-est00int-02.xls')\n", "race_pop0010 = pd.read_excel(EXCEL_PATH, header=None)\n", "race_pop0010.dropna(axis=0, thresh=6, inplace=True)\n", "race_pop0010.drop([1],axis=1, inplace=True)\n", "race_pop0010 = race_pop0010.T\n", "race_pop0010.iloc[0,0] = 'Year'\n", "race_pop0010.columns = race_pop0010.loc[0]\n", "race_pop0010.drop(0, axis=0, inplace=True)\n", "race_pop0010.dropna(axis=0, inplace=True)\n", "race_pop0010['Year'] = race_pop0010['Year'].astype(int)\n", "race_pop0010['Year'] = race_pop0010['Year'].astype(str)\n", "race_pop0010.set_index('Year', inplace=True)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "# Drop the columns I'm not interested in\n", "drop_cols = []\n", "drop_stumps = ['AIAN','Asian','NHPI', 'One Race', 'Two or']\n", "for col in race_pop0010.columns:\n", "    if any(x in col for x in drop_stumps):\n", "        drop_cols.append(col)\n", "race_pop0010.drop(drop_cols, axis=1, inplace=True)\n", "\n", "# Rename columns so this DataFrame can be combined with the other population DataFrame\n", "both0010 = race_pop0010.iloc[:,4:7].copy()\n", "name_map = {'...White' : 'white_only_pop',\n", "            '...Black' : 'black_only_pop',\n", "            '.HISPANIC' : 'Hispanic_pop'}\n", "both0010.rename(name_map, axis=1, inplace=True)\n", "both0010 = both0010.astype(int)\n", "pop_span = pd.concat([both0010, us_race_pop], join='inner')\n", "race_sex_pop = race_sex_raw.join(pop_span)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that I've got a full population data set, I can normalize the prisoner data by each group's population." 
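, "\n", "\n", "As a rough sketch of that normalization: the share of a group in prison is the prisoner count divided by the group's population, times 100. The two rows below are copied from the black prisoner counts and population estimates in this notebook; the frame name `df` is just a placeholder for illustration.

```python
import pandas as pd

# Two rows taken from this notebook's tables (2006 and 2016).
df = pd.DataFrame(
    {'Black': [590300.0, 486900.0],
     'black_only_pop': [36520961, 40229236]},
    index=pd.Index(['2006', '2016'], name='Year'))

# Percent of the group's population that is in prison.
df['Black_pct'] = df['Black'] / df['black_only_pop'] * 100
print(df['Black_pct'].round(3))
```

The same division, applied column by column, gives the `White_pct` and `Hispanic_pct` values."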
] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "race_sex_pop.loc[:,'White_pct'] = race_sex_pop.loc[:,'White']\\\n", " .divide(race_sex_pop.loc[:,'white_only_pop']) * 100\n", "race_sex_pop.loc[:,'Black_pct'] = race_sex_pop.loc[:,'Black']\\\n", " .divide(race_sex_pop.loc[:,'black_only_pop']) * 100\n", "race_sex_pop.loc[:,'Hispanic_pct'] = race_sex_pop.loc[:,'Hispanic']\\\n", " .divide(race_sex_pop.loc[:,'Hispanic_pop']) * 100" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Total Federal State Male Female White Black \\\n", "Year \n", "2006 1504598.0 173533.0 1331065.0 1401261.0 103337.0 507100.0 590300.0 \n", "2007 1532851.0 179204.0 1353647.0 1427088.0 105763.0 499800.0 592900.0 \n", "2008 1547742.0 182333.0 1365409.0 1441384.0 106358.0 499900.0 592800.0 \n", "2009 1553574.0 187886.0 1365688.0 1448239.0 105335.0 490000.0 584800.0 \n", "2010 1552669.0 190641.0 1362028.0 1447766.0 104903.0 484400.0 572700.0 \n", "2011 1538847.0 197050.0 1341797.0 1435141.0 103706.0 474300.0 557100.0 \n", "2012 1512430.0 196574.0 1315856.0 1411076.0 101354.0 466600.0 537800.0 \n", "2013 1520403.0 195098.0 1325305.0 1416102.0 104301.0 463900.0 529900.0 \n", "2014 1507781.0 191374.0 1316407.0 1401685.0 106096.0 461500.0 518700.0 \n", "2015 1476847.0 178688.0 1298159.0 1371879.0 104968.0 450200.0 499400.0 \n", "2016 1458173.0 171482.0 1286691.0 1352684.0 105489.0 439800.0 486900.0 \n", "\n", " Hispanic white_only_pop black_only_pop Hispanic_pop White_pct \\\n", "Year \n", "2006 313600.0 196832697 36520961 44606305 0.257630 \n", "2007 330400.0 197011394 36905758 46196853 0.253691 \n", "2008 329800.0 197183535 37290709 47793785 0.253520 \n", "2009 341200.0 197274549 37656592 49327489 0.248385 \n", "2010 345800.0 197394319 38015461 50754069 0.245397 \n", "2011 347800.0 197519026 38393758 51906353 0.240129 \n", "2012 340300.0 197701109 38776276 52993496 0.236013 \n", "2013 341200.0 197777454 39135988 54064149 0.234557 \n", "2014 338900.0 197902336 39507913 55189962 0.233196 \n", "2015 333200.0 197964402 39876758 56338521 0.227415 \n", "2016 339300.0 197969608 40229236 57470287 0.222155 \n", "\n", " Black_pct Hispanic_pct \n", "Year \n", "2006 1.616332 0.703040 \n", "2007 1.606524 0.715200 \n", "2008 1.589672 0.690048 \n", "2009 1.552982 0.691704 \n", "2010 1.506492 0.681325 \n", "2011 1.451017 0.670053 \n", "2012 1.386931 0.642154 \n", "2013 1.353997 0.631102 \n", "2014 1.312902 0.614061 \n", "2015 1.252359 0.591425 \n", "2016 1.210314 
0.590392 " ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "race_sex_pop" ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "with sns.axes_style(\"whitegrid\"):\n", " fig, ax = plt.subplots(figsize=(10,7))\n", " ax.plot(race_sex_pop['White_pct'])\n", " ax.plot(race_sex_pop['Black_pct'])\n", " ax.plot(race_sex_pop['Hispanic_pct'])\n", " ax.set_title('% of Total US Racial Population in Prison',fontsize=14)\n", " ax.set_xlabel('Year', fontsize=14)\n", " ax.set_ylabel('Percent of Racial Population In Prison [%]', fontsize=14)\n", " ax.legend(fontsize=14)\n", " ax.set_ylim([0, 1.1*max([race_sex_pop['White_pct'].max(), \n", " race_sex_pop['Black_pct'].max(),\n", " race_sex_pop['Hispanic_pct'].max()])])" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Average percentage of total black population in prison: 1.440 %\n", "Average percentage of total Hispanic population in prison: 0.656 %\n", "Average percentage of total white population in prison: 0.241 %\n" ] } ], "source": [ "print('Average percentage of total {:>8s} population in prison: {:0.3f} %'\n", " .format('black', race_sex_pop['Black_pct'].mean()))\n", "print('Average percentage of total {:8s} population in prison: {:0.3f} %'\n", " .format('Hispanic', race_sex_pop['Hispanic_pct'].mean()))\n", "print('Average percentage of total {:>8s} population in prison: {:0.3f} %'\n", " .format('white', race_sex_pop['White_pct'].mean()))" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Change in percentage of total black population in prison between 2006 to 2016: -0.406 %\n", "Change in percentage of total Hispanic population in prison between 2006 to 2016: -0.113 
%\n", "Change in percentage of total white population in prison between 2006 to 2016: -0.035 %\n" ] } ], "source": [ "print('Change in percentage of total {:>8s} population in prison between 2006 to 2016: {:0.3f} %'\n", " .format('black', race_sex_pop['Black_pct']['2016'] - race_sex_pop['Black_pct']['2006']))\n", "print('Change in percentage of total {:>8s} population in prison between 2006 to 2016: {:0.3f} %'\n", " .format('Hispanic', race_sex_pop['Hispanic_pct']['2016'] - race_sex_pop['Hispanic_pct']['2006']))\n", "print('Change in percentage of total {:>8s} population in prison between 2006 to 2016: {:0.3f} %'\n", " .format('white', race_sex_pop['White_pct']['2016'] - race_sex_pop['White_pct']['2006']))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That's an extremely large difference. Averaged over 2006-2016, a randomly selected black person is nearly 6 times as likely to be in prison as a randomly selected white person and more than twice as likely as a randomly selected Hispanic person. \n", "\n", "That motivates the question: Why is there such a large difference between these populations? \n", "\n", "To investigate that question, it would be useful to look at related questions, such as:\n", "* Does the criminal justice system treat different racial groups differently?\n", "* Is the average quality of education comparable for different racial populations?\n", "* Is the distribution of quality of education comparable for different racial populations?\n", "* Is the average quality of employment opportunity comparable for different racial populations?\n", "* Is the distribution of quality of employment opportunities comparable for different racial populations?\n", "\n", "To Be Continued." 
] } ], "metadata": { "kernelspec": { "display_name": "Python [conda env:py36]", "language": "python", "name": "conda-env-py36-py" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 2 }