{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"from IPython.core.display import display, HTML"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We'll pull population density for counties from [this Census ESRI API](https://covid19.census.gov/datasets/21843f238cbb46b08615fc53e19e0daf_1)."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"county_pop_df = pd.read_csv('https://opendata.arcgis.com/datasets/21843f238cbb46b08615fc53e19e0daf_1.csv'\n",
" ,usecols=['GEOID','NAME','State','B01001_001E','B01001_calc_PopDensity']\n",
" ,dtype={'GEOID':str})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"1. Filter out Puerto Rico _(but support Puerto Rican self-determination!)_,\n",
"1. sort the records by their calculated population density,\n",
"1. and then add a cumulative sum column."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"county_pop_df = county_pop_df[county_pop_df['State']!='Puerto Rico']\n",
"county_pop_df.sort_values(by='B01001_calc_PopDensity', ascending=False, inplace=True)\n",
"county_pop_df['cumcount']=county_pop_df['B01001_001E'].cumsum()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" GEOID | \n",
" NAME | \n",
" State | \n",
" B01001_001E | \n",
" B01001_calc_PopDensity | \n",
" cumcount | \n",
"
\n",
" \n",
" \n",
" \n",
" 1858 | \n",
" 36061 | \n",
" New York County | \n",
" New York | \n",
" 1632480 | \n",
" 27819.804801 | \n",
" 1632480 | \n",
"
\n",
" \n",
" 1851 | \n",
" 36047 | \n",
" Kings County | \n",
" New York | \n",
" 2600747 | \n",
" 14475.025439 | \n",
" 4233227 | \n",
"
\n",
" \n",
" 1830 | \n",
" 36005 | \n",
" Bronx County | \n",
" New York | \n",
" 1437872 | \n",
" 13202.515305 | \n",
" 5671099 | \n",
"
\n",
" \n",
" 1868 | \n",
" 36081 | \n",
" Queens County | \n",
" New York | \n",
" 2298513 | \n",
" 8158.437153 | \n",
" 7969612 | \n",
"
\n",
" \n",
" 223 | \n",
" 06075 | \n",
" San Francisco County | \n",
" California | \n",
" 870044 | \n",
" 7168.190471 | \n",
" 8839656 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" GEOID NAME State B01001_001E \\\n",
"1858 36061 New York County New York 1632480 \n",
"1851 36047 Kings County New York 2600747 \n",
"1830 36005 Bronx County New York 1437872 \n",
"1868 36081 Queens County New York 2298513 \n",
"223 06075 San Francisco County California 870044 \n",
"\n",
" B01001_calc_PopDensity cumcount \n",
"1858 27819.804801 1632480 \n",
"1851 14475.025439 4233227 \n",
"1830 13202.515305 5671099 \n",
"1868 8158.437153 7969612 \n",
"223 7168.190471 8839656 "
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"county_pop_df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It's surprisingly maddening to find a good source for county-level election data. We're going to use [this one](https://github.com/tonmcg/US_County_Level_Election_Results_08-20/), though we need to repair the FIPS codes so they're 5-character strings."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" votes_dem | \n",
" votes_gop | \n",
" total_votes | \n",
" state_abbr | \n",
" county_name | \n",
" combined_fips | \n",
" fips | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 93003 | \n",
" 130413 | \n",
" 246588 | \n",
" AK | \n",
" Alaska | \n",
" 2013 | \n",
" 02013 | \n",
"
\n",
" \n",
" 1 | \n",
" 93003 | \n",
" 130413 | \n",
" 246588 | \n",
" AK | \n",
" Alaska | \n",
" 2016 | \n",
" 02016 | \n",
"
\n",
" \n",
" 2 | \n",
" 93003 | \n",
" 130413 | \n",
" 246588 | \n",
" AK | \n",
" Alaska | \n",
" 2020 | \n",
" 02020 | \n",
"
\n",
" \n",
" 3 | \n",
" 93003 | \n",
" 130413 | \n",
" 246588 | \n",
" AK | \n",
" Alaska | \n",
" 2050 | \n",
" 02050 | \n",
"
\n",
" \n",
" 4 | \n",
" 93003 | \n",
" 130413 | \n",
" 246588 | \n",
" AK | \n",
" Alaska | \n",
" 2060 | \n",
" 02060 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" votes_dem votes_gop total_votes state_abbr county_name combined_fips \\\n",
"0 93003 130413 246588 AK Alaska 2013 \n",
"1 93003 130413 246588 AK Alaska 2016 \n",
"2 93003 130413 246588 AK Alaska 2020 \n",
"3 93003 130413 246588 AK Alaska 2050 \n",
"4 93003 130413 246588 AK Alaska 2060 \n",
"\n",
" fips \n",
"0 02013 \n",
"1 02016 \n",
"2 02020 \n",
"3 02050 \n",
"4 02060 "
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"election_results_df = pd.read_csv('https://raw.githubusercontent.com/tonmcg/US_County_Level_Election_Results_08-20/master/2016_US_County_Level_Presidential_Results.csv'\n",
" ,usecols=['votes_dem','votes_gop','total_votes','state_abbr','county_name','combined_fips']\n",
" ,dtype={'combined_fips':str,'votes_dem':'int64','votes_gop':'int64','total_votes':'int64'})\n",
"election_results_df['fips'] = election_results_df['combined_fips'].apply(lambda x: '0' + x if len(x)==4 else x)\n",
"election_results_df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"There are two Census FIPS that mismatch between the tables; we'll coerce them to matchin values."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"election_results_df.loc[election_results_df['fips']=='46113','fips'] = '46102' #Oglala Lakota County\n",
"election_results_df.loc[election_results_df['fips']=='02270','fips'] = '02158' #Kusilvak Census Area"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Join them and see if anything went wrong. There's one county that's still mismatched, but, with only 75 people, it isn't going to affect any of our election results. Out of exhaustion, we'll leave it be."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" NAME | \n",
" State | \n",
" B01001_001E | \n",
" B01001_calc_PopDensity | \n",
" cumcount | \n",
" votes_dem | \n",
" votes_gop | \n",
" total_votes | \n",
" state_abbr | \n",
" county_name | \n",
" combined_fips | \n",
"
\n",
" \n",
" GEOID | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" 15005 | \n",
" Kalawao County | \n",
" Hawaii | \n",
" 75 | \n",
" 2.414868 | \n",
" 320317763 | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" NAME State B01001_001E B01001_calc_PopDensity cumcount \\\n",
"GEOID \n",
"15005 Kalawao County Hawaii 75 2.414868 320317763 \n",
"\n",
" votes_dem votes_gop total_votes state_abbr county_name combined_fips \n",
"GEOID \n",
"15005 NaN NaN NaN NaN NaN NaN "
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"joined = county_pop_df.set_index('GEOID').join(election_results_df.set_index('fips'))\n",
"joined[pd.isna(joined['votes_dem'])]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Define a function that will run an election for an arbitrary number of district. To do so, we break the density-sorted dataframe by using `pd.cut` on the cumulative population field. Then we write a nice little HTML table to output the results."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"def run_election(number_districts):\n",
" joined['electorate'] = pd.cut(joined['cumcount'], bins=number_districts, labels=range(1,number_districts+1))\n",
" results_table = joined.groupby(by='electorate').sum()\n",
" results_table['pctdem'] = results_table['votes_dem']/results_table['total_votes']\n",
" results_table['pctgop'] = results_table['votes_gop']/results_table['total_votes']\n",
" display(HTML('Election with {} districts
'.format(number_districts)))\n",
" \n",
" html_table = ''\n",
" html_table += 'District | Total Population | Densest County | Least Dense County | Dem % | Rep % | Winner |
'\n",
" for i,r in results_table.iterrows():\n",
" most_dense = joined[joined['electorate']==i].iloc[0]\n",
" least_dense = joined[joined['electorate']==i].iloc[-1]\n",
" most_dense_printable = '{}, {}'.format(most_dense['NAME'], most_dense['State'])\n",
" least_dense_printable = '{}, {}'.format(least_dense['NAME'], least_dense['State'])\n",
" \n",
" winner = 'D' if r['pctdem'] > r['pctgop'] else 'R'\n",
"\n",
" html_table += '{} | {} | {} | {} | {} | {} | {} |
'.format(i, f\"{int(r['B01001_001E']):,}\", most_dense_printable, least_dense_printable, round(r['pctdem']*100,2),round(r['pctgop']*100,2), winner)\n",
" \n",
" html_table +='
'\n",
" display(HTML(html_table))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Run the election with 11 districts ..."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"Election with 11 districts
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"District | Total Population | Densest County | Least Dense County | Dem % | Rep % | Winner |
---|
1 | 30,773,474 | New York County, New York | Winchester city, Virginia | 70.84 | 25.38 | D |
2 | 29,255,155 | Dallas County, Texas | Tarrant County, Texas | 62.85 | 32.64 | D |
3 | 29,020,695 | Orleans Parish, Louisiana | Contra Costa County, California | 62.24 | 33.22 | D |
4 | 29,291,928 | Sacramento County, California | Davis County, Utah | 56.04 | 38.66 | D |
5 | 28,872,991 | Monroe County, New York | Dakota County, Minnesota | 50.33 | 44.35 | D |
6 | 29,423,618 | Clark County, Washington | Orange County, New York | 46.32 | 48.81 | R |
7 | 29,293,887 | Kalamazoo County, Michigan | Spalding County, Georgia | 43.71 | 50.89 | R |
8 | 29,351,626 | Shawnee County, Kansas | Jefferson County, Tennessee | 39.26 | 55.36 | R |
9 | 29,140,053 | Jackson County, Georgia | Dickson County, Tennessee | 36.62 | 58.44 | R |
10 | 29,255,578 | Bowie County, Texas | Nottoway County, Virginia | 32.86 | 62.2 | R |
11 | 29,224,025 | Cherokee County, Texas | Yukon-Koyukuk Census Area, Alaska | 33.48 | 59.72 | R |
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"run_election(11)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"How about 9?"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"Election with 9 districts
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"District | Total Population | Densest County | Least Dense County | Dem % | Rep % | Winner |
---|
1 | 37,299,101 | New York County, New York | DuPage County, Illinois | 69.36 | 26.7 | D |
2 | 34,210,307 | DeKalb County, Georgia | Clayton County, Georgia | 63.11 | 32.31 | D |
3 | 36,929,333 | Middlesex County, Massachusetts | Hillsborough County, Florida | 59.26 | 35.89 | D |
4 | 35,956,241 | Monmouth County, New Jersey | Vanderburgh County, Indiana | 51.63 | 43.23 | D |
5 | 32,504,452 | Hampden County, Massachusetts | Kalamazoo County, Michigan | 46.64 | 48.25 | R |
6 | 38,893,609 | Maricopa County, Arizona | Saratoga County, New York | 42.79 | 51.48 | R |
7 | 35,685,254 | Hinds County, Mississippi | Moore County, North Carolina | 37.47 | 57.77 | R |
8 | 35,677,509 | Lapeer County, Michigan | Allen County, Kentucky | 34.34 | 60.65 | R |
9 | 35,747,224 | Bonneville County, Idaho | Yukon-Koyukuk Census Area, Alaska | 33.46 | 59.97 | R |
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"run_election(9)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"With 17?"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"Election with 17 districts
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"District | Total Population | Densest County | Least Dense County | Dem % | Rep % | Winner |
---|
1 | 20,084,946 | New York County, New York | Cook County, Illinois | 78.83 | 17.9 | D |
2 | 19,239,594 | Union County, New Jersey | Fredericksburg city, Virginia | 60.91 | 34.62 | D |
3 | 18,684,112 | Passaic County, New Jersey | Colonial Heights city, Virginia | 63.99 | 31.35 | D |
4 | 18,614,150 | Tarrant County, Texas | Radford city, Virginia | 63.41 | 31.88 | D |
5 | 18,562,096 | Macomb County, Michigan | Salt Lake County, Utah | 57.21 | 37.4 | D |
6 | 19,226,969 | Santa Clara County, California | Hartford County, Connecticut | 57.15 | 38.61 | D |
7 | 19,387,526 | Travis County, Texas | Erie County, New York | 50.33 | 44.5 | D |
8 | 18,888,412 | Galveston County, Texas | Chatham County, Georgia | 49.77 | 44.9 | D |
9 | 18,836,123 | Hamilton County, Tennessee | Manatee County, Florida | 46.97 | 47.75 | R |
10 | 18,990,427 | Marin County, California | Douglas County, Colorado | 44.67 | 49.78 | R |
11 | 18,936,579 | Winnebago County, Wisconsin | Hampshire County, Massachusetts | 42.36 | 52.46 | R |
12 | 18,869,503 | Benton County, Arkansas | Sangamon County, Illinois | 39.8 | 54.72 | R |
13 | 18,979,897 | Windham County, Connecticut | Albemarle County, Virginia | 35.99 | 59.3 | R |
14 | 18,883,361 | Cass County, Missouri | Stokes County, North Carolina | 36.74 | 58.04 | R |
15 | 18,908,564 | Pike County, Pennsylvania | Cherokee County, Oklahoma | 33.37 | 61.7 | R |
16 | 18,909,925 | Coshocton County, Ohio | Buena Vista County, Iowa | 32.27 | 62.89 | R |
17 | 18,900,846 | Breckinridge County, Kentucky | Yukon-Koyukuk Census Area, Alaska | 33.69 | 58.97 | R |
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"run_election(17)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"Election with 21 districts
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"District | Total Population | Densest County | Least Dense County | Dem % | Rep % | Winner |
---|
1 | 14,861,227 | New York County, New York | Essex County, New Jersey | 80.64 | 16.47 | D |
2 | 15,912,247 | Cook County, Illinois | Winchester city, Virginia | 63.17 | 32.35 | D |
3 | 14,800,056 | Dallas County, Texas | Newport News city, Virginia | 61.62 | 34.28 | D |
4 | 16,320,929 | Los Angeles County, California | Westchester County, New York | 64.97 | 30.27 | D |
5 | 15,597,047 | Hennepin County, Minnesota | Macomb County, Michigan | 63.5 | 31.64 | D |
6 | 15,628,239 | Norfolk County, Massachusetts | Providence County, Rhode Island | 58.99 | 36.55 | D |
7 | 15,318,996 | Fairfield County, Connecticut | Hillsborough County, Florida | 57.05 | 37.61 | D |
8 | 15,117,251 | Monmouth County, New Jersey | Clarke County, Georgia | 52.52 | 42.3 | D |
9 | 15,533,076 | Bucks County, Pennsylvania | Washington County, Oregon | 50.7 | 44.46 | D |
10 | 15,060,023 | Hamilton County, Indiana | Genesee County, Michigan | 48.68 | 45.9 | D |
11 | 15,417,696 | Greenville County, South Carolina | Cleveland County, Oklahoma | 48.15 | 46.57 | D |
12 | 15,585,913 | Sedgwick County, Kansas | Leon County, Florida | 46.06 | 49.04 | R |
13 | 15,316,510 | Putnam County, New York | Cumberland County, Maine | 41.33 | 52.86 | R |
14 | 15,323,833 | Erie County, Pennsylvania | Saratoga County, New York | 40.9 | 53.15 | R |
15 | 15,186,794 | Hinds County, Mississippi | Tompkins County, New York | 39.57 | 55.79 | R |
16 | 15,404,525 | Clackamas County, Oregon | Boone County, Indiana | 36.02 | 59.16 | R |
17 | 15,316,483 | Belknap County, New Hampshire | Barry County, Michigan | 35.38 | 59.55 | R |
18 | 15,234,397 | Fulton County, New York | Putnam County, Indiana | 35.73 | 59.36 | R |
19 | 15,336,889 | Santa Fe County, New Mexico | Preston County, West Virginia | 32.32 | 62.59 | R |
20 | 15,320,870 | Huntingdon County, Pennsylvania | Edwards County, Illinois | 32.91 | 62.26 | R |
21 | 15,310,029 | Franklin County, Arkansas | Yukon-Koyukuk Census Area, Alaska | 33.65 | 58.71 | R |
"
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"run_election(21)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
}
},
"nbformat": 4,
"nbformat_minor": 4
}