{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Deriving \"Precinct\" level results for Wisconsin\n", "\n", "### tl;dr: Over a [few notebooks](https://github.com/epaulson/WI_Reporting_To_GEOID), we'll show how to join Wisconsin election results to ward(precinct) shapefiles\n", "\n", "> What we in Wisconsin call a ward is referred to as a precinct in some states or a voting district by the\n", "Census Bureau. Wards do not constitute election districts from which municipal officials are elected, and\n", "thus are not subject to the “one person, one vote” requirement which governs the formation of election\n", "districts. Instead, wards are intended to serve as administrative subunits that are aggregated into election\n", "districts of equal population. Cities, villages, and towns form municipal wards by combining whole\n", "census blocks...Once established, wards serve as the building blocks used by the legislature, counties, and cities in redistricting their respective election districts. -- [Wisconsin Elections Commission](https://docs.legis.wisconsin.gov/misc/lrb/redistricting_information/guidelines_2020.pdf)\n", "\n", "The [OpenElections Project](http://www.openelections.net) is compiling a set of standardized, precinct-level results for national and state level elections going back to 2000. It's a great project, and when finished the dataset will be very useful for journalists, academics, campaigns, and armchair political scientists. \n", "\n", "The quality of the data OpenElections obtains varies greatly by state, and even within a state it can be an adventure - sometimes there's a nice spreadsheet in one county, but in some extreme cases volunteers have to go to County Clerk's offices and take photographs of election results and convert it by hand. \n", "\n", "Wisconsin is on the easier end - The Wisconsin Elections Commission and its predecessor the Government Accountability Board have been good about [providing statewide election results](http://elections.wi.gov/elections-voting/results) within a few weeks of an election, and in machine-readable formats, usually Excel. They're not consistent about the formatting of those Excel files but it's at least not too bad to reason about on a year-by-year basis. \n", "\n", "Wards are the atomic base unit for most elections and all voters in a ward get the same ballot. (Well, mostly. See footnote at the bottom) Wards often have different polling places per ward, but in some cases multiple wards will vote at the same polling location. The districts built from the wards are not in a strict hierarchy: a State Assembly district made up of 50 wards might well be split by a Congressional district, with 25 wards in one district and 25 in another Congressional District. Wards are always contained in one County and in one municipality. \n", "\n", "Because wards are built from whole Census blocks, demographic information is available for each ward.\n", "\n", "Wards may get bigger or new wards may be created due to annexations and changes to municipal boundaries, but once created at the start of a new redistricting cycle, a ward is usually not deleted and they have stable ID numbers. The Legislative Technology Services Bureau publishes a map and shapefiles for the wards and gives each ward an equivient of a FIPS code that can be used for a database/GIS join. The maps are published twice a year to reflect municipal boundary changes. Election results need to be paired with the current shapefile for best accuracy - using a 2016 shapefile with the 2012 election results will be off on the boundaries of muncipalities, though most wards will still look the same. (One exception: when the entire municipality is deleted, the LTSB stops including the old wards for it)\n", "\n", "Unfortunately for Wisconsin election data users, while Wisconsin votes by ward, it does not necessarily report by ward. Only cities of greater than 35,000 people are required to report each ward individually. Smaller municipalites are permitted to combine results into [\"reporting units\"](http://elections.wi.gov/sites/default/files/publication/65/ea_wards_districts_reporting_units_annexations_f_18339.pdf) of multiple wards. All wards in a reporting unit must share the same districts, and because different elections cover districts, the reporting units may vary every election, though they're typically the same. There are usually around 3600 reporting units per election, and in the 2014 fall election there were 6,634 wards in Wisconsin. \n", "\n", "The \"reporting units\" do not get any sort of geographical identifier like a FIPS code. In the reported data collected by the Elections Commission, the reporting unit is an unstructured string. However, there is a prescribed format for [how clerks should name reporting units](http://elections.wi.gov/node/1298) that in theory contains all of the information necessary to decide what wards are included in a reporting unit. \n", "\n", "Sadly, the Reporting units are created by humans, and Wisconsin has 1,927 different clerks who are potentially creating reporting unit names, few of whom think about \"how can a computer parse this\", and not all of whom follow exactly the guidnance set out by the Elections Commission. \n", "\n", "This notebook will first cover how to normalize reporting units. Then, we'll move on to joining GeoData. To normalize, we will use a mix of code and hand-editing to create a file that users of OpenElection data can reference, and lookup which wards (and which GeoIDs) are associated with a given row of the election results data. All of the notebooks can be found in a [Github repo](https://github.com/epaulson/WI_Reporting_To_GEOID)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll start with the preliminaries - bring in Pandas and set some blogging-friendly defaults" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import pandas as pd\n", "import re\n", "import json\n", "pd.options.display.max_rows = 999\n", "pd.options.display.max_columns = 999" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "We'll start by bringing in the reporting units used for the Fall 2016 elections. Before we're done, we'll merge in reporting units from 2014 too - our end result will be a superset of multiple elections, but that's OK, if a reporting unit is named the same thing in different elections it will have the same wards." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "reportingunits2016 = pd.read_excel(\"http://elections.wi.gov/sites/default/files/page/2016_general_election_reporting_units_xlsx_79857.xlsx\")" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
CountyMuniReportingUnitCongressionalDistrictAssemblyDistrictSenateDistrict
0Adams CountyCITY OF ADAMSWard 1-4Congressional - District 3Assembly - District 41State Senate - District 14
1Adams CountyCITY OF WISCONSIN DELLSWards 5,9Congressional - District 3Assembly - District 41State Senate - District 14
2Adams CountyTOWN OF ADAMSWards 1-3Congressional - District 3Assembly - District 41State Senate - District 14
3Adams CountyTOWN OF BIG FLATSWard 1-2Congressional - District 3Assembly - District 72State Senate - District 24
4Adams CountyTOWN OF COLBURNWard 1Congressional - District 3Assembly - District 72State Senate - District 24
\n", "
" ], "text/plain": [ " County Muni ReportingUnit \\\n", "0 Adams County CITY OF ADAMS Ward 1-4 \n", "1 Adams County CITY OF WISCONSIN DELLS Wards 5,9 \n", "2 Adams County TOWN OF ADAMS Wards 1-3 \n", "3 Adams County TOWN OF BIG FLATS Ward 1-2 \n", "4 Adams County TOWN OF COLBURN Ward 1 \n", "\n", " CongressionalDistrict AssemblyDistrict \\\n", "0 Congressional - District 3 Assembly - District 41 \n", "1 Congressional - District 3 Assembly - District 41 \n", "2 Congressional - District 3 Assembly - District 41 \n", "3 Congressional - District 3 Assembly - District 72 \n", "4 Congressional - District 3 Assembly - District 72 \n", "\n", " SenateDistrict \n", "0 State Senate - District 14 \n", "1 State Senate - District 14 \n", "2 State Senate - District 14 \n", "3 State Senate - District 24 \n", "4 State Senate - District 24 " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "reportingunits2016.head()" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "3638" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(reportingunits2016)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We don't actually care about the different Districts in use here (we'll rediscover those in election results anyway.) The good clerks of Adams County above have done a nice job of following the instructions, and it looks like this might be relatively straightforward to parse with a regular expression. (You know what quote this calls for, of course)\n", "\n", "> Some people, when confronted with a problem, think \"I know, I'll use regular expressions.\" Now they have two problems. --jwz\n", "\n", "It will turn out that there are too many exceptions to reasonably parse with a regex, so we'll take the coward's way out: we'll parse a bunch with a regex, and just fix the rest by hand. \n", "\n", "First, we'll try out our regex to try to cut the problem down to size. We'll blast over each row and then make a few new columns in our DataFrame " ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def process_reporting_unit(ward):\n", " x = re.search('((?:^\\D+)(?P(\\d+)(?:\\s*)-(?:\\s*)(\\d+$)))|((?:^\\D+)(?P(\\d+$)))|((?:^\\D+)(?P(\\d+)(?:\\s*)&(?:\\s*)(\\d+$)))|((?:^\\D+)(?P(\\d+)(?:\\s*),(?:\\s*)(\\d+$)))', ward)\n", " if x is None:\n", " return {\"type\": \"unmatched\", \"data\": ward}\n", " elif x.group('single'):\n", " return {\"type\": \"single\", \"data\": x.group('single')}\n", " elif x.group('hypen'):\n", " return {\"type\": \"hypen\", \"data\": x.group('hypen')}\n", " elif x.group('amp'):\n", " return {\"type\": \"amp\", \"data\": x.group('amp')}\n", " elif x.group('comma'):\n", " return {\"type\": \"comma\", \"data\": x.group('comma')}\n", " else:\n", " raise Exception()\n", " pass" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll use this regex to add a few columns to our data, using a [handy trick from StackOverflow for returning multiple columns from a single pandas row](http://stackoverflow.com/questions/16236684/apply-pandas-function-to-column-to-create-multiple-new-columns)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": true }, "outputs": [], "source": [ "processed2016 = pd.concat([reportingunits2016,reportingunits2016.ReportingUnit.apply(lambda s: pd.Series(process_reporting_unit(s)))], axis=1)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
CountyMuniReportingUnitCongressionalDistrictAssemblyDistrictSenateDistrictdatatype
0Adams CountyCITY OF ADAMSWard 1-4Congressional - District 3Assembly - District 41State Senate - District 141-4hypen
1Adams CountyCITY OF WISCONSIN DELLSWards 5,9Congressional - District 3Assembly - District 41State Senate - District 145,9comma
2Adams CountyTOWN OF ADAMSWards 1-3Congressional - District 3Assembly - District 41State Senate - District 141-3hypen
3Adams CountyTOWN OF BIG FLATSWard 1-2Congressional - District 3Assembly - District 72State Senate - District 241-2hypen
4Adams CountyTOWN OF COLBURNWard 1Congressional - District 3Assembly - District 72State Senate - District 241single
\n", "
" ], "text/plain": [ " County Muni ReportingUnit \\\n", "0 Adams County CITY OF ADAMS Ward 1-4 \n", "1 Adams County CITY OF WISCONSIN DELLS Wards 5,9 \n", "2 Adams County TOWN OF ADAMS Wards 1-3 \n", "3 Adams County TOWN OF BIG FLATS Ward 1-2 \n", "4 Adams County TOWN OF COLBURN Ward 1 \n", "\n", " CongressionalDistrict AssemblyDistrict \\\n", "0 Congressional - District 3 Assembly - District 41 \n", "1 Congressional - District 3 Assembly - District 41 \n", "2 Congressional - District 3 Assembly - District 41 \n", "3 Congressional - District 3 Assembly - District 72 \n", "4 Congressional - District 3 Assembly - District 72 \n", "\n", " SenateDistrict data type \n", "0 State Senate - District 14 1-4 hypen \n", "1 State Senate - District 14 5,9 comma \n", "2 State Senate - District 14 1-3 hypen \n", "3 State Senate - District 24 1-2 hypen \n", "4 State Senate - District 24 1 single " ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "processed2016.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's look at a couple of examples" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
CountyMuniReportingUnitCongressionalDistrictAssemblyDistrictSenateDistrictdatatype
0Adams CountyCITY OF ADAMSWard 1-4Congressional - District 3Assembly - District 41State Senate - District 141-4hypen
2Adams CountyTOWN OF ADAMSWards 1-3Congressional - District 3Assembly - District 41State Senate - District 141-3hypen
3Adams CountyTOWN OF BIG FLATSWard 1-2Congressional - District 3Assembly - District 72State Senate - District 241-2hypen
5Adams CountyTOWN OF DELL PRAIRIEWard 1-3Congressional - District 3Assembly - District 41State Senate - District 141-3hypen
6Adams CountyTOWN OF EASTONWards 1-2Congressional - District 3Assembly - District 41State Senate - District 141-2hypen
\n", "
" ], "text/plain": [ " County Muni ReportingUnit \\\n", "0 Adams County CITY OF ADAMS Ward 1-4 \n", "2 Adams County TOWN OF ADAMS Wards 1-3 \n", "3 Adams County TOWN OF BIG FLATS Ward 1-2 \n", "5 Adams County TOWN OF DELL PRAIRIE Ward 1-3 \n", "6 Adams County TOWN OF EASTON Wards 1-2 \n", "\n", " CongressionalDistrict AssemblyDistrict \\\n", "0 Congressional - District 3 Assembly - District 41 \n", "2 Congressional - District 3 Assembly - District 41 \n", "3 Congressional - District 3 Assembly - District 72 \n", "5 Congressional - District 3 Assembly - District 41 \n", "6 Congressional - District 3 Assembly - District 41 \n", "\n", " SenateDistrict data type \n", "0 State Senate - District 14 1-4 hypen \n", "2 State Senate - District 14 1-3 hypen \n", "3 State Senate - District 24 1-2 hypen \n", "5 State Senate - District 14 1-3 hypen \n", "6 State Senate - District 14 1-2 hypen " ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "processed2016[processed2016['type']=='hypen'].head(5)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
CountyMuniReportingUnitCongressionalDistrictAssemblyDistrictSenateDistrictdatatype
1Adams CountyCITY OF WISCONSIN DELLSWards 5,9Congressional - District 3Assembly - District 41State Senate - District 145,9comma
115Brown CountyCITY OF DE PEREWards 9,18Congressional - District 8Assembly - District 2State Senate - District 19,18comma
197Brown CountyVILLAGE OF HOWARDWards 1,12Congressional - District 8Assembly - District 89State Senate - District 301,12comma
426Crawford CountyCITY OF PRAIRIE DU CHIENWards 2,7Congressional - District 3Assembly - District 96State Senate - District 322,7comma
616Dane CountyCITY OF VERONAWards 1,5Congressional - District 2Assembly - District 79State Senate - District 271,5comma
\n", "
" ], "text/plain": [ " County Muni ReportingUnit \\\n", "1 Adams County CITY OF WISCONSIN DELLS Wards 5,9 \n", "115 Brown County CITY OF DE PERE Wards 9,18 \n", "197 Brown County VILLAGE OF HOWARD Wards 1,12 \n", "426 Crawford County CITY OF PRAIRIE DU CHIEN Wards 2,7 \n", "616 Dane County CITY OF VERONA Wards 1,5 \n", "\n", " CongressionalDistrict AssemblyDistrict \\\n", "1 Congressional - District 3 Assembly - District 41 \n", "115 Congressional - District 8 Assembly - District 2 \n", "197 Congressional - District 8 Assembly - District 89 \n", "426 Congressional - District 3 Assembly - District 96 \n", "616 Congressional - District 2 Assembly - District 79 \n", "\n", " SenateDistrict data type \n", "1 State Senate - District 14 5,9 comma \n", "115 State Senate - District 1 9,18 comma \n", "197 State Senate - District 30 1,12 comma \n", "426 State Senate - District 32 2,7 comma \n", "616 State Senate - District 27 1,5 comma " ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "processed2016[processed2016['type']=='comma'].head(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And overall, 157 that are something funky. That's not too bad. Let's take a look at what a few of those look like:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "single 2198\n", "hypen 1236\n", "unmatched 157\n", "comma 47\n", "Name: type, dtype: int64" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "processed2016['type'].value_counts()" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
CountyMuniReportingUnitCongressionalDistrictAssemblyDistrictSenateDistrictdatatype
173Brown CountyTOWN OF LEDGEVIEWWards 1-3,8-10Congressional - District 8Assembly - District 88State Senate - District 30Wards 1-3,8-10unmatched
198Brown CountyVILLAGE OF HOWARDWards 2,8,11Congressional - District 8Assembly - District 4State Senate - District 2Wards 2,8,11unmatched
199Brown CountyVILLAGE OF HOWARDWards 3-4,6Congressional - District 8Assembly - District 89State Senate - District 30Wards 3-4,6unmatched
202Brown CountyVILLAGE OF HOWARDWards 9-10,18Congressional - District 8Assembly - District 4State Senate - District 2Wards 9-10,18unmatched
205Brown CountyVILLAGE OF PULASKIWards 1-3,6Congressional - District 8Assembly - District 6State Senate - District 2Wards 1-3,6unmatched
\n", "
" ], "text/plain": [ " County Muni ReportingUnit \\\n", "173 Brown County TOWN OF LEDGEVIEW Wards 1-3,8-10 \n", "198 Brown County VILLAGE OF HOWARD Wards 2,8,11 \n", "199 Brown County VILLAGE OF HOWARD Wards 3-4,6 \n", "202 Brown County VILLAGE OF HOWARD Wards 9-10,18 \n", "205 Brown County VILLAGE OF PULASKI Wards 1-3,6 \n", "\n", " CongressionalDistrict AssemblyDistrict \\\n", "173 Congressional - District 8 Assembly - District 88 \n", "198 Congressional - District 8 Assembly - District 4 \n", "199 Congressional - District 8 Assembly - District 89 \n", "202 Congressional - District 8 Assembly - District 4 \n", "205 Congressional - District 8 Assembly - District 6 \n", "\n", " SenateDistrict data type \n", "173 State Senate - District 30 Wards 1-3,8-10 unmatched \n", "198 State Senate - District 2 Wards 2,8,11 unmatched \n", "199 State Senate - District 30 Wards 3-4,6 unmatched \n", "202 State Senate - District 2 Wards 9-10,18 unmatched \n", "205 State Senate - District 2 Wards 1-3,6 unmatched " ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "processed2016[processed2016['type']=='unmatched'].head(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Yeah, let's not try to figure all those variations out. Let's just make ourselves a nice dictionary to edit. " ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\"Wards 1-3,8-10\": \"Wards \",\n", "\"Wards 2,8,11\": \"Wards \",\n", "\"Wards 3-4,6\": \"Wards \",\n", "\"Wards 9-10,18\": \"Wards \",\n", "\"Wards 1-3,6\": \"Wards \",\n", "\"Wards 7-8,10-12\": \"Wards \",\n", "\"Wards 5-6,10\": \"Wards \",\n", "\"Wards 7-9,14\": \"Wards \",\n", "\"Ward 3A\": \"Ward \",\n", "\"Ward 7A\": \"Ward \",\n", "\"Wards 1-4,6-7\": \"Wards \",\n", "\"Wards 1,9-10\": \"Wards \",\n", "\"Wards 2-3,5\": \"Wards \",\n", "\"Wards 4,6-8\": \"Wards \",\n", "\"Wards 1-3,6\": \"Wards \",\n", "\"Wards 15,18-19\": \"Wards \",\n", "\"Wards 1-4,9\": \"Wards \",\n", "\"Wards 3-4,12\": \"Wards \",\n", "\"Wards 1-2,4-5,7\": \"Wards \",\n", "\"Ward 1,3,5\": \"Ward \",\n", "\"Ward 2,4,6\": \"Ward \",\n", "\"Ward 1,3-6,15\": \"Ward \",\n", "\"Ward 7-10,12\": \"Ward \",\n", "\"Wards 14,16-17\": \"Wards \",\n", "\"Wards 1,5-6,11\": \"Wards \",\n", "\"Wards 2-4,12\": \"Wards \",\n", "\"Wards 1,3,5\": \"Wards \",\n", "\"Wards 7,12-13\": \"Wards \",\n", "\"Wards 1-4,7-11\": \"Wards \",\n", "\"Wards 1,3-7\": \"Wards \",\n", "\"Wards 1-6,22-24,29\": \"Wards \",\n", "\"Wards 7-10,18-21,25-27,30\": \"Wards \",\n", "\"Wards 11-17,28\": \"Wards \",\n", "\"Wards 1-3,7-8\": \"Wards \",\n", "\"Wards 4-6,9-12\": \"Wards \",\n", "\"Wards 13-16,20-22\": \"Wards \",\n", "\"Wards 17-19,30-32\": \"Wards \",\n", "\"Ward 9B\": \"Ward \",\n", "\"Ward 1A\": \"Ward \",\n", "\"Ward 2-3,5\": \"Ward \",\n", "\"Ward 1-5,10\": \"Ward \",\n", "\"Wards 1-3,8\": \"Wards \",\n", "\"Wards 5-6,9-10\": \"Wards \",\n", "\"Wards 1-6,8\": \"Wards \",\n", "\"Wards 3-4,22\": \"Wards \",\n", "\"Wards 17-18,21,23-26\": \"Wards \",\n", "\"Wards 19-20,27\": \"Wards \",\n", "\"Wards 12,20-21,24\": \"Wards \",\n", "\"Wards 1-2,6-7\": \"Wards \",\n", "\"Wards 1-2,4-5\": \"Wards \",\n", "\"Wards 1,3,5\": \"Wards \",\n", "\"Wards 2,4,6\": \"Wards \",\n", "\"Wards 1,3-5\": \"Wards \",\n", "\"Ward 15A\": \"Ward \",\n", "\"Ward 15B\": \"Ward \",\n", "\"Ward 22A\": \"Ward \",\n", "\"Ward 22B\": \"Ward \",\n", "\"Wards 2,8S\": \"Wards \",\n", "\"Ward 11S\": \"Ward \",\n", "\"Wards 1S,3S\": \"Wards \",\n", "\"Ward 1-2,5\": \"Ward \",\n", "\"Ward 3-4,6\": \"Ward \",\n", "\"Wards 13-18,20\": \"Wards \",\n", "\"Ward 5B\": \"Ward \",\n", "\"Wards 12-16,18\": \"Wards \",\n", "\"Wards 17,19-20\": \"Wards \",\n", "\"Wards 1A-3A\": \"Wards \",\n", "\"Wards 1B-3B\": \"Wards \",\n", "\"Wards 4-6,9-10\": \"Wards \",\n", "\"Wards 1-3,5,7-8\": \"Wards \",\n", "\"Wards 1,4-5,14\": \"Wards \",\n", "\"Wards 2,6,8,12-13\": \"Wards \",\n", "\"Wards 3,9-11\": \"Wards \",\n", "\"Wards 5,7B\": \"Wards \",\n", "\"Wards 6-7A\": \"Wards \",\n", "\"Ward 5-6,10\": \"Ward \",\n", "\"Wards 1-2,5\": \"Wards \",\n", "\"Wards 1,6-7\": \"Wards \",\n", "\"Ward 2A\": \"Ward \",\n", "\"Ward 2B\": \"Ward \",\n", "\"Wards 35,40,43\": \"Wards \",\n", "\"Wards 36,38,41\": \"Wards \",\n", "\"Wards 1-2,4\": \"Wards \",\n", "\"Ward 3S\": \"Ward \",\n", "\"Wards 9-10,12-13\": \"Wards \",\n", "\"Wards 11,14-15,17\": \"Wards \",\n", "\"Wards 16,18-19\": \"Wards \",\n", "\"Wards 1,17,20\": \"Wards \",\n", "\"Wards 2-4,11\": \"Wards \",\n", "\"Wards 9,13-14\": \"Wards \",\n", "\"Wards 10,12,15-16\": \"Wards \",\n", "\"Wards 19,21-22\": \"Wards \",\n", "\"Wards 1-4,6\": \"Wards \",\n", "\"Wards 1-3,13\": \"Wards \",\n", "\"Wards 4,6,14\": \"Wards \",\n", "\"Wards 5,7-9\": \"Wards \",\n", "\"Wards 1-2,9\": \"Wards \",\n", "\"Wards 1-4,15\": \"Wards \",\n", "\"Wards 1-2,7-9,11-14\": \"Wards \",\n", "\"Wards 3-6,10\": \"Wards \",\n", "\"Wards 1-4,6\": \"Wards \",\n", "\"Wards 6,9-10,15-17,20,23-25,28\": \"Wards \",\n", "\"Wards 11-14,21-22,26-27\": \"Wards \",\n", "\"Wards 9-10,32\": \"Wards \",\n", "\"Wards 11-14,28\": \"Wards \",\n", "\"Wards 23-24,26\": \"Wards \",\n", "\"Wards 1-2,8\": \"Wards \",\n", "\"Wards 1,8,10-11\": \"Wards \",\n", "\"Wards 2,5-7\": \"Wards \",\n", "\"Wards 3-4,9,16-17\": \"Wards \",\n", "\"Wards 1-5,7\": \"Wards \",\n", "\"Wards 1-3,13\": \"Wards \",\n", "\"Wards 1,3-4,10\": \"Wards \",\n", "\"Wards 2,5-8\": \"Wards \",\n", "\"Wards 1-2,5-6\": \"Wards \",\n", "\"Wards 1,3,5,9\": \"Wards \",\n", "\"Wards 2,4,10\": \"Wards \",\n", "\"Ward 1-3,7-9\": \"Ward \",\n", "\"Ward 4-6,10-11\": \"Ward \",\n", "\"Wards 1,7-8\": \"Wards \",\n", "\"Wards 2-3,9-11\": \"Wards \",\n", "\"Wards 1-2,4\": \"Wards \",\n", "\"Wards 3,6-7\": \"Wards \",\n", "\"Wards 5,8-9\": \"Wards \",\n", "\"Ward 1,8-11\": \"Ward \",\n", "\"Wards 7,9-11\": \"Wards \",\n", "\"Wards 3-4,8\": \"Wards \",\n", "\"Wards 1A-2,4,7\": \"Wards \",\n", "\"Ward 1B\": \"Ward \",\n", "\"Wards 3,14-15,30\": \"Wards \",\n", "\"Ward 5B\": \"Ward \",\n", "\"Wards 5A-6,8-9,23-29,31-35,38\": \"Wards \",\n", "\"Wards 10-13,21-22,36-37\": \"Wards \",\n", "\"Ward 22B\": \"Ward \",\n", "\"Ward 22A\": \"Ward \",\n", "\"Ward 23A\": \"Ward \",\n", "\"Ward 23B\": \"Ward \",\n", "\"Ward 25A\": \"Ward \",\n", "\"Ward 25B\": \"Ward \",\n", "\"Ward 28B\": \"Ward \",\n", "\"Ward 28A\": \"Ward \",\n", "\"Ward 29A\": \"Ward \",\n", "\"Ward 29B\": \"Ward \",\n", "\"Wards 1-2,7-10\": \"Wards \",\n", "\"Wards 1A,2-5\": \"Wards \",\n", "\"Ward 1B\": \"Ward \",\n", "\"Wards 1A-2\": \"Wards \",\n", "\"Ward 1B\": \"Ward \",\n", "\"Wards 1-2A\": \"Wards \",\n", "\"Wards 2B,2C\": \"Wards \",\n", "\"Wards 1-2,4,7\": \"Wards \",\n", "\"Wards 3,5-6\": \"Wards \",\n", "\"Wards 6,17,25-26\": \"Wards \",\n", "\"Wards 7,16,27\": \"Wards \",\n", "\"Wards 8,19,22-23\": \"Wards \",\n", "\"Wards 6-15,24,26-29\": \"Wards \",\n", "\"Wards 16-23,25\": \"Wards \",\n" ] } ], "source": [ "def convert(row):\n", " x = re.search('(^\\D+)', row)\n", " print(\"\\\"%s\\\": \\\"%s \\\",\" % (row, x.group(1).strip()) )\n", " \n", "junk = processed2016[processed2016['type']=='unmatched'].ReportingUnit.map(lambda s: convert(s))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's cut and paste this into a new cell and just fix it by hand. You won't see me do this, but this is just the output from above manually edited." ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": true }, "outputs": [], "source": [ "manual = {\"Wards 1-3,8-10\": \"Wards 1,2,3,8,9,10\",\n", "\"Wards 2,8,11\": \"Wards 2,8,11\",\n", "\"Wards 3-4,6\": \"Wards 3,4,6\",\n", "\"Wards 9-10,18\": \"Wards 9,10,18\",\n", "\"Wards 1-3,6\": \"Wards 1,2,3,6\",\n", "\"Wards 7-8,10-12\": \"Wards 7,8,10,11,12\",\n", "\"Wards 5-6,10\": \"Wards 5,6,10\",\n", "\"Wards 7-9,14\": \"Wards 7,8,9,14\",\n", "\"Ward 3A\": \"Ward 3A\",\n", "\"Ward 7A\": \"Ward 7A\",\n", "\"Wards 1-4,6-7\": \"Wards 1,2,3,4,6,7\",\n", "\"Wards 1,9-10\": \"Wards 1,9,10\",\n", "\"Wards 2-3,5\": \"Wards 2,3,5\",\n", "\"Wards 4,6-8\": \"Wards 4,6,7,8\",\n", "\"Wards 1-3,6\": \"Wards 1,2,3,6\",\n", "\"Wards 15,18-19\": \"Wards 15,18,19\",\n", "\"Wards 1-4,9\": \"Wards 1,2,3,4,9\",\n", "\"Wards 3-4,12\": \"Wards 3,4,12\",\n", "\"Wards 1-2,4-5,7\": \"Wards 1,2,4,5,7\",\n", "\"Ward 1,3,5\": \"Ward 1,3,5\",\n", "\"Ward 2,4,6\": \"Ward 2,4,6\",\n", "\"Ward 1,3-6,15\": \"Ward 1,3,4,5,6,15\",\n", "\"Ward 7-10,12\": \"Ward 7,8,9,10,12\",\n", "\"Wards 14,16-17\": \"Wards 14,16,17\",\n", "\"Wards 1,5-6,11\": \"Wards 1,5,6,11\",\n", "\"Wards 2-4,12\": \"Wards 2,3,4,12\",\n", "\"Wards 1,3,5\": \"Wards 1,3,5\",\n", "\"Wards 7,12-13\": \"Wards 7,12,13\",\n", "\"Wards 1-4,7-11\": \"Wards 1,2,3,4,7,8,9,10,11\",\n", "\"Wards 1,3-7\": \"Wards 1,3,4,5,6,7\",\n", "\"Wards 1-6,22-24,29\": \"Wards 1,2,3,4,5,6,22,23,24,29\",\n", "\"Wards 7-10,18-21,25-27,30\": \"Wards 7,8,9,10,18,19,20,21,25,26,27,30\",\n", "\"Wards 11-17,28\": \"Wards 11,12,13,14,15,16,17,28\",\n", "\"Wards 1-3,7-8\": \"Wards 1,2,3,7,8\",\n", "\"Wards 4-6,9-12\": \"Wards 4,5,6,9,10,11,12\",\n", "\"Wards 13-16,20-22\": \"Wards 13,14,15,16,20,21,22\",\n", "\"Wards 17-19,30-32\": \"Wards 17,18,19,30,31,32\",\n", "\"Ward 9B\": \"Ward 9B\",\n", "\"Ward 1A\": \"Ward 1A\",\n", "\"Ward 2-3,5\": \"Ward 2,3,5\",\n", "\"Ward 1-5,10\": \"Ward 1,2,3,4,5,10\",\n", "\"Wards 1-3,8\": \"Wards 1,2,3,8\",\n", "\"Wards 5-6,9-10\": \"Wards 5,6,9,10\",\n", "\"Wards 1-6,8\": \"Wards 1,2,3,4,5,6,8\",\n", "\"Wards 3-4,22\": \"Wards 3,4,22\",\n", "\"Wards 17-18,21,23-26\": \"Wards 17,18,21,23,24,25,26\",\n", "\"Wards 19-20,27\": \"Wards 19,20,27\",\n", "\"Wards 12,20-21,24\": \"Wards 12,20,21,24\",\n", "\"Wards 1-2,6-7\": \"Wards 1,2,6,7\",\n", "\"Wards 1-2,4-5\": \"Wards 1,2,4,5\",\n", "\"Wards 1,3,5\": \"Wards 1,3,5\",\n", "\"Wards 2,4,6\": \"Wards 2,4,6\",\n", "\"Wards 1,3-5\": \"Wards 1,3,4,5\",\n", "\"Ward 15A\": \"Ward 15A\",\n", "\"Ward 15B\": \"Ward 15B\",\n", "\"Ward 22A\": \"Ward 22A\",\n", "\"Ward 22B\": \"Ward 22B\",\n", "\"Wards 2,8S\": \"Wards 2,8S\",\n", "\"Ward 11S\": \"Ward 11S\",\n", "\"Wards 1S,3S\": \"Wards 1S,3S\",\n", "\"Ward 1-2,5\": \"Ward 1,2,5\",\n", "\"Ward 3-4,6\": \"Ward 2,4,6\",\n", "\"Wards 13-18,20\": \"Wards 13,14,15,16,17,18,20\",\n", "\"Ward 5B\": \"Ward 5B\",\n", "\"Wards 12-16,18\": \"Wards 12,13,14,15,16,18\",\n", "\"Wards 17,19-20\": \"Wards 17,19,20\",\n", "\"Wards 1A-3A\": \"Wards 1A,2A,3A\",\n", "\"Wards 1B-3B\": \"Wards 1B,2B,3B\",\n", "\"Wards 4-6,9-10\": \"Wards 4,5,6,9,10\",\n", "\"Wards 1-3,5,7-8\": \"Wards 1,2,3,5,7,8\",\n", "\"Wards 1,4-5,14\": \"Wards 1,4,5,14\",\n", "\"Wards 2,6,8,12-13\": \"Wards 2,6,8,12,13\",\n", "\"Wards 3,9-11\": \"Wards 3,9,10,11\",\n", "\"Wards 5,7B\": \"Wards 5,7B\",\n", "\"Wards 6-7A\": \"Wards 6,7A\",\n", "\"Ward 5-6,10\": \"Ward 5,6,10\",\n", "\"Wards 1-2,5\": \"Wards 1,2,5\",\n", "\"Wards 1,6-7\": \"Wards 1,6,7\",\n", "\"Ward 2A\": \"Ward 2A\",\n", "\"Ward 2B\": \"Ward 2B\",\n", "\"Wards 35,40,43\": \"Wards 35,40,43\",\n", "\"Wards 36,38,41\": \"Wards 36,38,41\",\n", "\"Wards 1-2,4\": \"Wards 1,2,4\",\n", "\"Ward 3S\": \"Ward 3S\",\n", "\"Wards 9-10,12-13\": \"Wards 9,10,12,13\",\n", "\"Wards 11,14-15,17\": \"Wards 11,14,15,17\",\n", "\"Wards 16,18-19\": \"Wards 16,18,19\",\n", "\"Wards 1,17,20\": \"Wards 1,17,20\",\n", "\"Wards 2-4,11\": \"Wards 2,3,4,11\",\n", "\"Wards 9,13-14\": \"Wards 9,13,14\",\n", "\"Wards 10,12,15-16\": \"Wards 10,12,15,16\",\n", "\"Wards 19,21-22\": \"Wards 19,21,22\",\n", "\"Wards 1-4,6\": \"Wards 1,2,3,4,6\",\n", "\"Wards 1-3,13\": \"Wards 1,2,3,13\",\n", "\"Wards 4,6,14\": \"Wards 4,6,14\",\n", "\"Wards 5,7-9\": \"Wards 5,7,8,9\",\n", "\"Wards 1-2,9\": \"Wards 1,2,9\",\n", "\"Wards 1-4,15\": \"Wards 1,2,3,4,15\",\n", "\"Wards 1-2,7-9,11-14\": \"Wards 1,2,7,8,9,11,12,13,14\",\n", "\"Wards 3-6,10\": \"Wards 3,4,5,6,10\",\n", "\"Wards 1-4,6\": \"Wards 1,2,3,4,6\",\n", "\"Wards 6,9-10,15-17,20,23-25,28\": \"Wards 6,9,10,15,16,17,20,23,24,25,28\",\n", "\"Wards 11-14,21-22,26-27\": \"Wards 11,12,13,14,21,22,26,27\",\n", "\"Wards 9-10,32\": \"Wards 9,10,32\",\n", "\"Wards 11-14,28\": \"Wards 11,12,13,14,28\",\n", "\"Wards 23-24,26\": \"Wards 23,24,26\",\n", "\"Wards 1-2,8\": \"Wards 1,2,8\",\n", "\"Wards 1,8,10-11\": \"Wards 1,8,10,11\",\n", "\"Wards 2,5-7\": \"Wards 2,5,6,7\",\n", "\"Wards 3-4,9,16-17\": \"Wards 3,4,9,16,17\",\n", "\"Wards 1-5,7\": \"Wards 1,2,3,4,5,7\",\n", "\"Wards 1-3,13\": \"Wards 1,2,3,13\",\n", "\"Wards 1,3-4,10\": \"Wards 1,3,4,10\",\n", "\"Wards 2,5-8\": \"Wards 2,5,6,7,8\",\n", "\"Wards 1-2,5-6\": \"Wards 1,2,5,6\",\n", "\"Wards 1,3,5,9\": \"Wards 1,3,5,9\",\n", "\"Wards 2,4,10\": \"Wards 2,4,10\",\n", "\"Ward 1-3,7-9\": \"Ward 1,2,3,7,8,9\",\n", "\"Ward 4-6,10-11\": \"Ward 4,5,6,10,11\",\n", "\"Wards 1,7-8\": \"Wards 1,7,8\",\n", "\"Wards 2-3,9-11\": \"Wards 2,3,9,10,11\",\n", "\"Wards 1-2,4\": \"Wards 1,2,4\",\n", "\"Wards 3,6-7\": \"Wards 3,6,7\",\n", "\"Wards 5,8-9\": \"Wards 5,8,9\",\n", "\"Ward 1,8-11\": \"Ward 1,8,9,10,11\",\n", "\"Wards 7,9-11\": \"Wards 7,9,10,11\",\n", "\"Wards 3-4,8\": \"Wards 3,4,8\",\n", "\"Wards 1A-2,4,7\": \"Wards 1A,2,4,7\",\n", "\"Ward 1B\": \"Ward 1B\",\n", "\"Wards 3,14-15,30\": \"Wards 3,14,15,30\",\n", "\"Ward 5B\": \"Ward 5B\",\n", "\"Wards 5A-6,8-9,23-29,31-35,38\": \"Wards 5A,6,8,9,22,23,24,25,26,27,28,29,31,32,33,34,35,38\",\n", "\"Wards 10-13,21-22,36-37\": \"Wards 10,11,12,13,21,22,36,37\",\n", "\"Ward 22B\": \"Ward 22B\",\n", "\"Ward 22A\": \"Ward 22A\",\n", "\"Ward 23A\": \"Ward 23A\",\n", "\"Ward 23B\": \"Ward 23B\",\n", "\"Ward 25A\": \"Ward 25A\",\n", "\"Ward 25B\": \"Ward 25B\",\n", "\"Ward 28B\": \"Ward 28B\",\n", "\"Ward 28A\": \"Ward 28A\",\n", "\"Ward 29A\": \"Ward 29A\",\n", "\"Ward 29B\": \"Ward 29B\",\n", "\"Wards 1-2,7-10\": \"Wards 1,2,7,8,9,10\",\n", "\"Wards 1A,2-5\": \"Wards 1A,2,3,4,5\",\n", "\"Ward 1B\": \"Ward 1B\",\n", "\"Wards 1A-2\": \"Wards 1A,2\",\n", "\"Ward 1B\": \"Ward 1B\",\n", "\"Wards 1-2A\": \"Wards 1,2A\",\n", "\"Wards 2B,2C\": \"Wards 2B,2C\",\n", "\"Wards 1-2,4,7\": \"Wards 1,2,4,7\",\n", "\"Wards 3,5-6\": \"Wards 3,5,6\",\n", "\"Wards 6,17,25-26\": \"Wards 6,17,25,26\",\n", "\"Wards 7,16,27\": \"Wards 7,16,27\",\n", "\"Wards 8,19,22-23\": \"Wards 8,19,22,23\",\n", "\"Wards 6-15,24,26-29\": \"Wards 6,7,8,9,10,11,12,13,14,15,24,26,27,28,29\",\n", "\"Wards 16-23,25\": \"Wards 16,17,18,19,20,21,22,23,25\"}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, we can just blast over the Data Frame and create a new column, merging or expanding where necessary. If we were unmatched, look in the 'wardmappings' dictionary and return what we found there. The end results will be a column named 'mapped', which has every reporting unit converted to a comma seperated list." ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def lookup(row, ward_mappings):\n", " if row['type'] == 'hypen':\n", " search = re.search('(\\d+)(?:\\s*)-(?:\\s*)(\\d+$)', row['data'])\n", " if(search):\n", " return \"Wards %s\" % (\",\".join([str(x) for x in range(int(search.group(1)), int(search.group(2))+1)]))\n", " elif row['type'] == 'comma':\n", " search = re.search('(\\d+)(?:\\s*),(?:\\s*)(\\d+$)', row['data'])\n", " if(search):\n", " return \"Wards %d,%d\" % (int(search.group(1)), int(search.group(2)))\n", " return row['ReportingUnit']\n", " elif row['type'] == 'single':\n", " search = re.search('(\\d+$)', row['data'])\n", " return \"Ward %d\" % (int(search.group(1), 10))\n", " return row['ReportingUnit']\n", " elif row['type'] == 'amp':\n", " search = re.search('(\\d+)(?:\\s*)&(?:\\s*)(\\d+$)', row['data'])\n", " if(search):\n", " return \"Wards %d,%d\" % (int(search.group(1)), int(search.group(2)))\n", " else:\n", " raise Exception()\n", " elif row['type'] == 'unmatched':\n", " return ward_mappings[row['ReportingUnit']]\n", " " ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
CountyMuniReportingUnitCongressionalDistrictAssemblyDistrictSenateDistrictdatatypemapped
0Adams CountyCITY OF ADAMSWard 1-4Congressional - District 3Assembly - District 41State Senate - District 141-4hypenWards 1,2,3,4
1Adams CountyCITY OF WISCONSIN DELLSWards 5,9Congressional - District 3Assembly - District 41State Senate - District 145,9commaWards 5,9
2Adams CountyTOWN OF ADAMSWards 1-3Congressional - District 3Assembly - District 41State Senate - District 141-3hypenWards 1,2,3
3Adams CountyTOWN OF BIG FLATSWard 1-2Congressional - District 3Assembly - District 72State Senate - District 241-2hypenWards 1,2
4Adams CountyTOWN OF COLBURNWard 1Congressional - District 3Assembly - District 72State Senate - District 241singleWard 1
\n", "
" ], "text/plain": [ " County Muni ReportingUnit \\\n", "0 Adams County CITY OF ADAMS Ward 1-4 \n", "1 Adams County CITY OF WISCONSIN DELLS Wards 5,9 \n", "2 Adams County TOWN OF ADAMS Wards 1-3 \n", "3 Adams County TOWN OF BIG FLATS Ward 1-2 \n", "4 Adams County TOWN OF COLBURN Ward 1 \n", "\n", " CongressionalDistrict AssemblyDistrict \\\n", "0 Congressional - District 3 Assembly - District 41 \n", "1 Congressional - District 3 Assembly - District 41 \n", "2 Congressional - District 3 Assembly - District 41 \n", "3 Congressional - District 3 Assembly - District 72 \n", "4 Congressional - District 3 Assembly - District 72 \n", "\n", " SenateDistrict data type mapped \n", "0 State Senate - District 14 1-4 hypen Wards 1,2,3,4 \n", "1 State Senate - District 14 5,9 comma Wards 5,9 \n", "2 State Senate - District 14 1-3 hypen Wards 1,2,3 \n", "3 State Senate - District 24 1-2 hypen Wards 1,2 \n", "4 State Senate - District 24 1 single Ward 1 " ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "processed2016['mapped'] = processed2016.apply(lookup, args=(manual,), axis=1)\n", "processed2016.head()" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
CountyMuniReportingUnitCongressionalDistrictAssemblyDistrictSenateDistrictdatatypemapped
2208Monroe CountyTOWN OF LA GRANGEWards 1A-3ACongressional - District 7Assembly - District 70State Senate - District 24Wards 1A-3AunmatchedWards 1A,2A,3A
\n", "
" ], "text/plain": [ " County Muni ReportingUnit \\\n", "2208 Monroe County TOWN OF LA GRANGE Wards 1A-3A \n", "\n", " CongressionalDistrict AssemblyDistrict \\\n", "2208 Congressional - District 7 Assembly - District 70 \n", "\n", " SenateDistrict data type mapped \n", "2208 State Senate - District 24 Wards 1A-3A unmatched Wards 1A,2A,3A " ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "processed2016[processed2016['ReportingUnit'] == 'Wards 1A-3A']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, let's add in reporting units from the OpenElection Data. We'll start with 2014 data." ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": true }, "outputs": [], "source": [ "openelex2014 = pd.read_csv(\"https://raw.githubusercontent.com/openelections/openelections-data-wi/master/2014/20141104__wi__general_ward.csv\")\n" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
countywardofficedistricttotal votespartycandidatevotes
0AdamsTown Of Adams Wards 1-3Governor/Lieutenant GovernorNaN500DEMMary Burke John Lehman233
1AdamsTown Of Adams Wards 1-3Governor/Lieutenant GovernorNaN500REPScott Walker Rebecca Kleefisch258
2AdamsTown Of Adams Wards 1-3Governor/Lieutenant GovernorNaN500INDDennis Fehr No Candidate1
3AdamsTown Of Adams Wards 1-3Governor/Lieutenant GovernorNaN500INDRobert Burke Joseph M. Brost8
4AdamsTown Of Adams Wards 1-3Governor/Lieutenant GovernorNaN500INDMary Jo Walters (Write-In)0
\n", "
" ], "text/plain": [ " county ward office district \\\n", "0 Adams Town Of Adams Wards 1-3 Governor/Lieutenant Governor NaN \n", "1 Adams Town Of Adams Wards 1-3 Governor/Lieutenant Governor NaN \n", "2 Adams Town Of Adams Wards 1-3 Governor/Lieutenant Governor NaN \n", "3 Adams Town Of Adams Wards 1-3 Governor/Lieutenant Governor NaN \n", "4 Adams Town Of Adams Wards 1-3 Governor/Lieutenant Governor NaN \n", "\n", " total votes party candidate votes \n", "0 500 DEM Mary Burke John Lehman 233 \n", "1 500 REP Scott Walker Rebecca Kleefisch 258 \n", "2 500 IND Dennis Fehr No Candidate 1 \n", "3 500 IND Robert Burke Joseph M. Brost 8 \n", "4 500 IND Mary Jo Walters (Write-In) 0 " ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "openelex2014.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The OpenElection data is similar to Election Commission reporting units data, except they've combined the Ward and City into one column. Our end goal is a generic dictionary that can work on multiple years, so to keep the data consistent we're going to split that back into two columns. \n", "\n", "Unfortunately, there are some odditities in the 2014 data, so we'll special-case those when we split data into municipalities and reporting units. " ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": false }, "outputs": [], "source": [ "openelex2014specialcases= {\"Village Of Gilman Ward\": {\"Muni\": \"Village of Gilman\", \"ReportingUnit\": \"Ward 1\"}, \"Village Of Lublin Ward\": {\"Muni\": \"Village of Lublin\", \"ReportingUnit\": \"Ward 1\"}}\n", "\n", "def split_ward_col(ward, special_cases):\n", " \n", " if ward in special_cases:\n", " mcd = special_cases[ward]['Muni']\n", " repounit = special_cases[ward]['ReportingUnit']\n", " else:\n", " x = re.match('(.+?)(?P(?:Ward).*$)', ward)\n", " if x is None:\n", " #print(ward)\n", " #some counties use Wd instead of Ward\n", " x = re.match('(.+?)(?P(?:Wd).*$)', ward)\n", " mcd = x.group(1).strip()\n", " repounit = x.group('ward')\n", " handled = process_reporting_unit(repounit)\n", " handled['Muni'] = mcd\n", " handled['ReportingUnit'] = repounit\n", " return handled\n" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": true }, "outputs": [], "source": [ "openelexprocessed = pd.concat([openelex2014,openelex2014.ward.apply(lambda s: pd.Series(split_ward_col(s, openelex2014specialcases)))], axis=1)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The OpenElections data has a row for each candidate in each reporting unit, and for now we're just looking to compute ward infomration, so we'll drop some columns and make ourselves a copy that we can de-depulicate. " ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": false }, "outputs": [], "source": [ "hack = openelexprocessed[['county', 'ward', 'Muni', 'ReportingUnit', 'data', 'type']].copy()" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": false }, "outputs": [], "source": [ "dedupedOpenElex2014 = hack.drop_duplicates().copy()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As mentioned above, each election the reporting units might be different: depending on the ballots used and what's on the election, in one year a city might have a single reporting unit of wards 1,2,3,and 4, and the next year it might have two reporting units, one with wards 1 and 2, and the other with wards 3 and 4. So, we'll see if we've already got a reporting unit from 2016, and for everything we don't already have we'll build another manual dictionary. Hopefully, between the 2016 and the 2014 data, we'll have covered most possible combinations. " ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "200" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(dedupedOpenElex2014[dedupedOpenElex2014['type']=='unmatched'])" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\"Wards 1-3, 8-10\": \"Wards \",\n", "\"Wards 2, 8 & 11\": \"Wards \",\n", "\"Ward 3, 4 & 6\": \"Ward \",\n", "\"Ward 9, 10 & 18\": \"Ward \",\n", "\"Wards 1-3 & 6\": \"Wards \",\n", "\"Wards 1, 2, 3\": \"Wards \",\n", "\"Wards 7-8, 10-12\": \"Wards \",\n", "\"Ward 1, 2, 3\": \"Ward \",\n", "\"Wards 1 - 5 & 5S\": \"Wards \",\n", "\"Wards 1 - 2 & 2 S\": \"Wards \",\n", "\"Wards 1 - 4, 6 - 7\": \"Wards \",\n", "\"Wards 1, 2 & 3\": \"Wards \",\n", "\"Ward 1, 2 & 3\": \"Ward \",\n", "\"Ward 1, 2, 3,& 4\": \"Ward \",\n", "\"Wards 1,9,10\": \"Wards \",\n", "\"Wards 2,3,5\": \"Wards \",\n", "\"Wards 4,6,7,8\": \"Wards \",\n", "\"Wards 1 - 3 & 6\": \"Wards \",\n", "\"Wards 1, 2, 4, 5, 7\": \"Wards \",\n", "\"Wards 1 And 2\": \"Wards \",\n", "\"Ward 1, 3-6, 15\": \"Ward \",\n", "\"Ward 7-10, 12\": \"Ward \",\n", "\"Ward 1, 5, 6 & 11\": \"Ward \",\n", "\"Ward 2, 3, 4 & 12\": \"Ward \",\n", "\"Wards 15, 18, 19\": \"Wards \",\n", "\"Wards 1-5, 8-9\": \"Wards \",\n", "\"Wards 6-7, 14-18\": \"Wards \",\n", "\"Wards 3 - 4, 12\": \"Wards \",\n", "\"Wards 2,3,4\": \"Wards \",\n", "\"Wards 1-4, 7-11\": \"Wards \",\n", "\"Wards 1, 3-7\": \"Wards \",\n", "\"Ward 1, 2, 3\": \"Ward \",\n", "\"Wards 1, 3 & 5\": \"Wards \",\n", "\"Wards 7, 12 & 13\": \"Wards \",\n", "\"Wards 1, 2 & 3\": \"Wards \",\n", "\"Wards 1, 2 & 3\": \"Wards \",\n", "\"Wards 1, 2 & 3\": \"Wards \",\n", "\"Wards 1, 2, 3, 4 & 5\": \"Wards \",\n", "\"Wards 1-6, 22-24, 29\": \"Wards \",\n", "\"Wards 7-10, 18-21, 25-27, 30\": \"Wards \",\n", "\"Wards 11-17, 28\": \"Wards \",\n", "\"Wards 1, 2 & 3\": \"Wards \",\n", "\"Wards 1 - 3 And 7 - 8\": \"Wards \",\n", "\"Wards 4 - 6 And 9 - 12\": \"Wards \",\n", "\"Wards 13 - 16 And 20 - 22\": \"Wards \",\n", "\"Wards 17 - 19 And 30 - 32\": \"Wards \",\n", "\"Wards 23 - 26 And 27 - 29\": \"Wards \",\n", "\"Wards 1 + 2\": \"Wards \",\n", "\"Wards 3 + 4\": \"Wards \",\n", "\"Wards 5 + 7\": \"Wards \",\n", "\"Wards 8 + 9\": \"Wards \",\n", "\"Wards 9B\": \"Wards \",\n", "\"Wards 9, 9A, 10, 11 ,12\": \"Wards \",\n", "\"Wards 1, 2, 3\": \"Wards \",\n", "\"Wards 1, 2, 3\": \"Wards \",\n", "\"Wards 1, 2, 3\": \"Wards \",\n", "\"Wards 1, 2, 3, 4, 5\": \"Wards \",\n", "\"Wards 1, 2 & 3\": \"Wards \",\n", "\"Wards 1, 2 & 3\": \"Wards \",\n", "\"Wards 1, 2 & 3\": \"Wards \",\n", "\"Wards 1-2-3\": \"Wards \",\n", "\"Wards 1-2-3\": \"Wards \",\n", "\"Wards 1-2-3-4\": \"Wards \",\n", "\"Wards 1-2-3-4\": \"Wards \",\n", "\"Wards 2,3 & 5\": \"Wards \",\n", "\"Wards 1 - 5 & 10\": \"Wards \",\n", "\"Wards 5-6, 9, 12\": \"Wards \",\n", "\"Wards 1,2,3 & 8\": \"Wards \",\n", "\"Wards 1 And 2\": \"Wards \",\n", "\"Wards 3 And 4\": \"Wards \",\n", "\"Wards 5 And 6\": \"Wards \",\n", "\"Ward 1 - 6 & 8\": \"Ward \",\n", "\"Wards 3, 4, 22\": \"Wards \",\n", "\"Wards 17 - 18, 21, 23 - 25\": \"Wards \",\n", "\"Ward 1, 2 & 3\": \"Ward \",\n", "\"Wards 1, 2, 4, 5\": \"Wards \",\n", "\"Wards 12, 20, 21, 24\": \"Wards \",\n", "\"Wards 1,2,6 & 7\": \"Wards \",\n", "\"Wards 1, 2, & 3\": \"Wards \",\n", "\"Wards 1, 2, & 3\": \"Wards \",\n", "\"Wards 1, 2, & 3\": \"Wards \",\n", "\"Wards 1, 2, & 3\": \"Wards \",\n", "\"Wards 1, 3, 5\": \"Wards \",\n", "\"Wards 2, 4, 6\": \"Wards \",\n", "\"Wards 1, 2, & 3\": \"Wards \",\n", "\"Ward 1, 3 - 5\": \"Ward \",\n", "\"Wards 1S & 3S\": \"Wards \",\n", "\"Wards - 1, 2 & 5\": \"Wards - \",\n", "\"Wards - 3, 4 & 6\": \"Wards - \",\n", "\"Wards 2, 8S\": \"Wards \",\n", "\"Ward 1 B, 2 B, 3 B\": \"Ward \",\n", "\"Ward 1 A, 2 A, 3 A\": \"Ward \",\n", "\"Wards 1, 2 & 3\": \"Wards \",\n", "\"Wards 1-16, 5A, 18\": \"Wards \",\n", "\"Wards 1-3 & 5-8\": \"Wards \",\n", "\"Wards 1, 4, 5, 14\": \"Wards \",\n", "\"Wards 2, 6, 8, 12, 13\": \"Wards \",\n", "\"Wards 3, 9, 10, 11\": \"Wards \",\n", "\"Wd 1,2,5\": \"Wd \",\n", "\"Ward 5, 7B\": \"Ward \",\n", "\"Ward 6, 7A\": \"Ward \",\n", "\"Ward 8, 9, 10\": \"Ward \",\n", "\"Ward 19, 20, 21\": \"Ward \",\n", "\"Wards 1, 2 & 4\": \"Wards \",\n", "\"Ward 1, 2, 3\": \"Ward \",\n", "\"Wards 3, 4, 5\": \"Wards \",\n", "\"Wards 6, 7, 8\": \"Wards \",\n", "\"Wards 9, 10, 12, 13\": \"Wards \",\n", "\"Wards 11, 14, 15, 17\": \"Wards \",\n", "\"Wards 16, 18, 19\": \"Wards \",\n", "\"Ward 1, 2, 4, 16\": \"Ward \",\n", "\"Ward 3, 10, 11, 12, 15\": \"Ward \",\n", "\"Ward 6, 7, 8, 9\": \"Ward \",\n", "\"Ward 19, 21, 22, 23\": \"Ward \",\n", "\"Wards 1-4, 6\": \"Wards \",\n", "\"Wards 1 - 4 & 15\": \"Wards \",\n", "\"Ward 1-3, 13\": \"Ward \",\n", "\"Ward 4, 6, 14\": \"Ward \",\n", "\"Ward 5, 7-9\": \"Ward \",\n", "\"Wards 1-2 & 9\": \"Wards \",\n", "\"Wards 1,2,3,\": \"Wards \",\n", "\"Wards 1- 4, 6\": \"Wards \",\n", "\"Wards 5, 6, 10\": \"Wards \",\n", "\"Ward 7 - 9, 11 - 14\": \"Ward \",\n", "\"Wds 1,2,8\": \"Wds \",\n", "\"Wds 3,4,5,6,7\": \"Wds \",\n", "\"Wards 1,8,10 & 11\": \"Wards \",\n", "\"Wards 2,5,6 & 7\": \"Wards \",\n", "\"Wards 3,16,4,17 & 9\": \"Wards \",\n", "\"Wards 1 2\": \"Wards \",\n", "\"Wards 5 6\": \"Wards \",\n", "\"Wards 7 8 9\": \"Wards \",\n", "\"Wards 6, 9, 10, 15 - 17, 20, 23\": \"Wards \",\n", "\"Wards 11 - 14, 21, 22\": \"Wards \",\n", "\"Wards 23-24, 26\": \"Wards \",\n", "\"Wards 1 - 3 - 4 - 10\": \"Wards \",\n", "\"Wards 2 - 5 - 6 - 7 - 8\": \"Wards \",\n", "\"Wards 1, 2, 5, 6\": \"Wards \",\n", "\"Wards 9, 10, 11\": \"Wards \",\n", "\"Ward 1,3,5 & 9\": \"Ward \",\n", "\"Ward 2, 4 & 10\": \"Ward \",\n", "\"Ward 6,7 & 8\": \"Ward \",\n", "\"Wards 1, 2, 3, 7, 8, 9\": \"Wards \",\n", "\"Wards 4, 5, 6, 10, 11\": \"Wards \",\n", "\"Wards 1 - 3, 7 - 11\": \"Wards \",\n", "\"Wards 1,2,& 4\": \"Wards \",\n", "\"Wards 3,6,& 7\": \"Wards \",\n", "\"Wards 5,8,& 9\": \"Wards \",\n", "\"Ward 1, 8-11\": \"Ward \",\n", "\"Wards 7, 9 - 11\": \"Wards \",\n", "\"Wards 1, 2, 3, 4\": \"Wards \",\n", "\"Wards 2,3,4,5\": \"Wards \",\n", "\"Wards 4 And 5\": \"Wards \",\n", "\"Wards 9 And 10\": \"Wards \",\n", "\"Wards 11 And 12\": \"Wards \",\n", "\"Wards 13 And 14\": \"Wards \",\n", "\"Wards 15 And 16\": \"Wards \",\n", "\"Wards 3, 4 & 8\": \"Wards \",\n", "\"Wards 1 - 2 & 7 - 10\": \"Wards \",\n", "\"Wards 1, 2, 4, 7\": \"Wards \",\n", "\"Wards 3, 5, 6\": \"Wards \",\n", "\"Wards 1A, 2-5\": \"Wards \",\n", "\"Ward 1A And 2\": \"Ward \",\n", "\"Wards 1 And 2A\": \"Wards \",\n", "\"Wards 2B And 2C\": \"Wards \",\n", "\"Wards 1A, 2, 4, 7\": \"Wards \",\n", "\"Wards 3, 14, 15, 30\": \"Wards \",\n", "\"Wards 5A, 6, 8, 9, 23, 24, 25, 26, 27, 28, 29, 31\": \"Wards \",\n", "\"Wards 10, 11, 12, 13, 21, 22\": \"Wards \",\n", "\"Wards 8, 19, 22, 23\": \"Wards \",\n", "\"Wards 6 - 15 & 24 & 26\": \"Wards \",\n", "\"Wards 16 - 23 & 25 & 27\": \"Wards \",\n" ] } ], "source": [ "def checkforexists(row):\n", " if not row in manual:\n", " convert(row)\n", " \n", "junk = dedupedOpenElex2014[dedupedOpenElex2014['type']=='unmatched']['ReportingUnit'].apply(checkforexists)" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "collapsed": true }, "outputs": [], "source": [ "manual2014 = {\n", "\"Wards 1-3, 8-10\": \"Wards 1,2,3,8,9,10\",\n", "\"Wards 2, 8 & 11\": \"Wards 2,8,11\",\n", "\"Ward 3, 4 & 6\": \"Ward 3,4,6\",\n", "\"Ward 9, 10 & 18\": \"Ward 9,10,18\",\n", "\"Wards 1-3 & 6\": \"Wards 1,2,3,6\",\n", "\"Wards 1, 2, 3\": \"Wards 1,2,3\",\n", "\"Wards 7-8, 10-12\": \"Wards 7,8,10,11,12\",\n", "\"Ward 1, 2, 3\": \"Ward 1,2,3\",\n", "\"Wards 1 - 5 & 5S\": \"Wards 1,2,3,4,5,5S\",\n", "\"Wards 1 - 2 & 2 S\": \"Wards 1,2,2S\",\n", "\"Wards 1 - 4, 6 - 7\": \"Wards 1,2,3,4,6,7\",\n", "\"Wards 1, 2 & 3\": \"Wards 1,2,3\",\n", "\"Ward 1, 2 & 3\": \"Ward 1,2,3\",\n", "\"Ward 1, 2, 3,& 4\": \"Ward 1,2,3,4\",\n", "\"Wards 1,9,10\": \"Wards 1,9,10\",\n", "\"Wards 2,3,5\": \"Wards 2,3,5\",\n", "\"Wards 4,6,7,8\": \"Wards 4,6,7,8\",\n", "\"Wards 1 - 3 & 6\": \"Wards 1,2,3,6\",\n", "\"Wards 1, 2, 4, 5, 7\": \"Wards 1,2,4,5,7\",\n", "\"Wards 1 And 2\": \"Wards 1,2\",\n", "\"Ward 1, 3-6, 15\": \"Ward 1,3,4,5,6,15\",\n", "\"Ward 7-10, 12\": \"Ward 7,8,9,10,12\",\n", "\"Ward 1, 5, 6 & 11\": \"Ward 1,5,6,11\",\n", "\"Ward 2, 3, 4 & 12\": \"Ward 2,3,4,12\",\n", "\"Wards 15, 18, 19\": \"Wards 15,18,19\",\n", "\"Wards 1-5, 8-9\": \"Wards 1,2,3,4,5,8,9\",\n", "\"Wards 6-7, 14-18\": \"Wards 6,7,14,15,16,17,18\",\n", "\"Wards 3 - 4, 12\": \"Wards 3,4,12\",\n", "\"Wards 2,3,4\": \"Wards 2,3,4\",\n", "\"Wards 1-4, 7-11\": \"Wards 1,2,3,4,7,8,9,10,11\",\n", "\"Wards 1, 3-7\": \"Wards 1,3,4,5,6,7\",\n", "\"Ward 1, 2, 3\": \"Ward 1,2,3\",\n", "\"Wards 1, 3 & 5\": \"Wards 1,3,5\",\n", "\"Wards 7, 12 & 13\": \"Wards 7,12,13\",\n", "\"Wards 1, 2 & 3\": \"Wards 1,2,3\",\n", "\"Wards 1, 2 & 3\": \"Wards 1,2,3\",\n", "\"Wards 1, 2 & 3\": \"Wards 1,2,3\",\n", "\"Wards 1, 2, 3, 4 & 5\": \"Wards 1,2,3,4,5\",\n", "\"Wards 1-6, 22-24, 29\": \"Wards 1,2,3,4,5,6,22,23,24,29\",\n", "\"Wards 7-10, 18-21, 25-27, 30\": \"Wards 7,8,9,10,18,19,20,21,25,26,27,30\",\n", "\"Wards 11-17, 28\": \"Wards 11,12,13,14,15,16,17,28\",\n", "\"Wards 1, 2 & 3\": \"Wards 1,2,3\",\n", "\"Wards 1 - 3 And 7 - 8\": \"Wards 1,2,3,7,8\",\n", "\"Wards 4 - 6 And 9 - 12\": \"Wards 4,5,6,9,10,11,12\",\n", "\"Wards 13 - 16 And 20 - 22\": \"Wards 13,14,15,16,20,21,22\",\n", "\"Wards 17 - 19 And 30 - 32\": \"Wards 17,18,19,30,31,32\",\n", "\"Wards 23 - 26 And 27 - 29\": \"Wards 23,24,25,26,27,28,29\",\n", "\"Wards 1 + 2\": \"Wards 1,2\",\n", "\"Wards 3 + 4\": \"Wards 3,4\",\n", "\"Wards 5 + 7\": \"Wards 5,7\",\n", "\"Wards 8 + 9\": \"Wards 8,9\",\n", "\"Wards 9B\": \"Wards 9B\",\n", "\"Wards 9, 9A, 10, 11 ,12\": \"Wards 9,9A,10,11,12\",\n", "\"Wards 1, 2, 3\": \"Wards 1,2,3\",\n", "\"Wards 1, 2, 3\": \"Wards 1,2,3\",\n", "\"Wards 1, 2, 3\": \"Wards 1,2,3\",\n", "\"Wards 1, 2, 3, 4, 5\": \"Wards 1,2,3,4,5\",\n", "\"Wards 1, 2 & 3\": \"Wards 1,2,3\",\n", "\"Wards 1, 2 & 3\": \"Wards 1,2,3\",\n", "\"Wards 1, 2 & 3\": \"Wards 1,2,3\",\n", "\"Wards 1-2-3\": \"Wards 1,2,3\",\n", "\"Wards 1-2-3\": \"Wards 1,2,3\",\n", "\"Wards 1-2-3-4\": \"Wards 1,2,3,4\",\n", "\"Wards 1-2-3-4\": \"Wards 1,2,3,4\",\n", "\"Wards 2,3 & 5\": \"Wards 2,3,5\",\n", "\"Wards 1 - 5 & 10\": \"Wards 1,2,3,4,5,10\",\n", "\"Wards 5-6, 9, 12\": \"Wards 5,6,9,12\",\n", "\"Wards 1,2,3 & 8\": \"Wards 1,2,3,8\",\n", "\"Wards 1 And 2\": \"Wards 1,2\",\n", "\"Wards 3 And 4\": \"Wards 3,4\",\n", "\"Wards 5 And 6\": \"Wards 5,6\",\n", "\"Ward 1 - 6 & 8\": \"Ward 1,2,3,4,5,6,8\",\n", "\"Wards 3, 4, 22\": \"Wards 3,4,22\",\n", "\"Wards 17 - 18, 21, 23 - 25\": \"Wards 17,18,21,23,24,25\",\n", "\"Ward 1, 2 & 3\": \"Ward 1,2,3\",\n", "\"Wards 1, 2, 4, 5\": \"Wards 1,2,4,5\",\n", "\"Wards 12, 20, 21, 24\": \"Wards 12,20,21,24\",\n", "\"Wards 1,2,6 & 7\": \"Wards 1,2,6,7\",\n", "\"Wards 1, 2, & 3\": \"Wards 1,2,3\",\n", "\"Wards 1, 2, & 3\": \"Wards 1,2,3\",\n", "\"Wards 1, 2, & 3\": \"Wards 1,2,3\",\n", "\"Wards 1, 2, & 3\": \"Wards 1,2,3\",\n", "\"Wards 1, 3, 5\": \"Wards 1,3,5\",\n", "\"Wards 2, 4, 6\": \"Wards 2,4,6\",\n", "\"Wards 1, 2, & 3\": \"Wards 1,2,3\",\n", "\"Ward 1, 3 - 5\": \"Ward 1,3,4,5\",\n", "\"Wards 1S & 3S\": \"Wards 1S,3S\",\n", "\"Wards - 1, 2 & 5\": \"Wards 1,2,5\",\n", "\"Wards - 3, 4 & 6\": \"Wards 3,4,6\",\n", "\"Wards 2, 8S\": \"Wards 2,8S\",\n", "\"Ward 1 B, 2 B, 3 B\": \"Ward 1B,2B,3B\",\n", "\"Ward 1 A, 2 A, 3 A\": \"Ward 1A,2A,3A\",\n", "\"Wards 1, 2 & 3\": \"Wards 1,2,3\",\n", "\"Wards 1-16, 5A, 18\": \"Wards 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,5A,18\",\n", "\"Wards 1-3 & 5-8\": \"Wards 1,2,3,5,6,7,8\",\n", "\"Wards 1, 4, 5, 14\": \"Wards 1,3,4,14\",\n", "\"Wards 2, 6, 8, 12, 13\": \"Wards 2,6,8,12,13\",\n", "\"Wards 3, 9, 10, 11\": \"Wards 3,9,10,11\",\n", "\"Wd 1,2,5\": \"Ward 1,2,5\",\n", "\"Ward 5, 7B\": \"Ward 5,7B\",\n", "\"Ward 6, 7A\": \"Ward 6,7A\",\n", "\"Ward 8, 9, 10\": \"Ward 8,9,10\",\n", "\"Ward 19, 20, 21\": \"Ward 19,20,21\",\n", "\"Wards 1, 2 & 4\": \"Wards 1,2,4\",\n", "\"Ward 1, 2, 3\": \"Ward 1,2,3\",\n", "\"Wards 3, 4, 5\": \"Wards 3,4,5\",\n", "\"Wards 6, 7, 8\": \"Wards 6,7,8\",\n", "\"Wards 9, 10, 12, 13\": \"Wards 9,10,12,13\",\n", "\"Wards 11, 14, 15, 17\": \"Wards 11,14,15,17\",\n", "\"Wards 16, 18, 19\": \"Wards 16,18,19\",\n", "\"Ward 1, 2, 4, 16\": \"Ward 1,2,3,16\",\n", "\"Ward 3, 10, 11, 12, 15\": \"Ward 3,10,11,12,15\",\n", "\"Ward 6, 7, 8, 9\": \"Ward 6,7,8,9\",\n", "\"Ward 19, 21, 22, 23\": \"Ward 19,21,22,23\",\n", "\"Wards 1-4, 6\": \"Wards 1,2,3,4,6\",\n", "\"Wards 1 - 4 & 15\": \"Wards 1,2,3,4,15\",\n", "\"Ward 1-3, 13\": \"Ward 1,2,3,13\",\n", "\"Ward 4, 6, 14\": \"Ward 4,6,14\",\n", "\"Ward 5, 7-9\": \"Ward 5,7,8,9\",\n", "\"Wards 1-2 & 9\": \"Wards 1,2,9\",\n", "\"Wards 1,2,3,\": \"Wards 1,2,3\",\n", "\"Wards 1- 4, 6\": \"Wards 1,2,3,4,6\",\n", "\"Wards 5, 6, 10\": \"Wards 5,6,10\",\n", "\"Ward 7 - 9, 11 - 14\": \"Ward 7,8,9,11,12,13,14\",\n", "\"Wds 1,2,8\": \"Wards 1,2,8\",\n", "\"Wds 3,4,5,6,7\": \"Wards 3,4,5,6,7\",\n", "\"Wards 1,8,10 & 11\": \"Wards 1,8,10,11\",\n", "\"Wards 2,5,6 & 7\": \"Wards 2,5,6,7\",\n", "\"Wards 3,16,4,17 & 9\": \"Wards 3,16,4,17,9\",\n", "\"Wards 1 2\": \"Wards 1,2\",\n", "\"Wards 5 6\": \"Wards 5,6\",\n", "\"Wards 7 8 9\": \"Wards 7,8,9\",\n", "\"Wards 6, 9, 10, 15 - 17, 20, 23\": \"Wards 6,9,10,15,16,17,20,23\",\n", "\"Wards 11 - 14, 21, 22\": \"Wards 11,12,13,14,21,22\",\n", "\"Wards 23-24, 26\": \"Wards 23,24,26\",\n", "\"Wards 1 - 3 - 4 - 10\": \"Wards 1,3,4,10\",\n", "\"Wards 2 - 5 - 6 - 7 - 8\": \"Wards 2,5,6,7,8\",\n", "\"Wards 1, 2, 5, 6\": \"Wards 1,2,5,6\",\n", "\"Wards 9, 10, 11\": \"Wards 9,10,11\",\n", "\"Ward 1,3,5 & 9\": \"Ward 1,3,5,9\",\n", "\"Ward 2, 4 & 10\": \"Ward 2,4,10\",\n", "\"Ward 6,7 & 8\": \"Ward 6,7,8\",\n", "\"Wards 1, 2, 3, 7, 8, 9\": \"Wards 1,2,3,7,8,9\",\n", "\"Wards 4, 5, 6, 10, 11\": \"Wards 4,5,6,10,11\",\n", "\"Wards 1 - 3, 7 - 11\": \"Wards 1,2,3,7,8,9,10,11\",\n", "\"Wards 1,2,& 4\": \"Wards 1,2,4\",\n", "\"Wards 3,6,& 7\": \"Wards 3,6,7\",\n", "\"Wards 5,8,& 9\": \"Wards 5,8,9\",\n", "\"Ward 1, 8-11\": \"Ward 1,8,9,10,11\",\n", "\"Wards 7, 9 - 11\": \"Wards 7,9,10,11\",\n", "\"Wards 1, 2, 3, 4\": \"Wards 1,2,3,4\",\n", "\"Wards 2,3,4,5\": \"Wards 2,3,4,5\",\n", "\"Wards 4 And 5\": \"Wards 4,5\",\n", "\"Wards 9 And 10\": \"Wards 9,10\",\n", "\"Wards 11 And 12\": \"Wards 11,12\",\n", "\"Wards 13 And 14\": \"Wards 13,14\",\n", "\"Wards 15 And 16\": \"Wards 15,16\",\n", "\"Wards 3, 4 & 8\": \"Wards 3,4,8\",\n", "\"Wards 1 - 2 & 7 - 10\": \"Wards 1,2,7,8,9,10\",\n", "\"Wards 1, 2, 4, 7\": \"Wards 1,2,4,7\",\n", "\"Wards 3, 5, 6\": \"Wards 3,5,6\",\n", "\"Wards 1A, 2-5\": \"Wards 1A,2,3,4,5\",\n", "\"Ward 1A And 2\": \"Ward 1A,2\",\n", "\"Wards 1 And 2A\": \"Wards 1,2A\",\n", "\"Wards 2B And 2C\": \"Wards 2B,2C\",\n", "\"Wards 1A, 2, 4, 7\": \"Wards 1A,2,4,7\",\n", "\"Wards 3, 14, 15, 30\": \"Wards 3,14,15,30\",\n", "\"Wards 5A, 6, 8, 9, 23, 24, 25, 26, 27, 28, 29, 31\": \"Wards 5A,6,8,9,23,24,25,26,27,28,29,31\",\n", "\"Wards 10, 11, 12, 13, 21, 22\": \"Wards 10,11,12,13,21,22\",\n", "\"Wards 8, 19, 22, 23\": \"Wards 8,19,22,23\",\n", "\"Wards 6 - 15 & 24 & 26\": \"Wards 6,7,8,9,10,11,12,13,14,15,24,26\",\n", "\"Wards 16 - 23 & 25 & 27\": \"Wards 16,17,18,19,20,21,22,23,25,27\",\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Merge the 2016 and 2014 data into one dictionary, and then let's normalize all of the 2014 reporting units. " ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "collapsed": true }, "outputs": [], "source": [ "combinedmappings = manual.copy()\n", "combinedmappings.update(manual2014)" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "collapsed": false }, "outputs": [], "source": [ "dedupedOpenElex2014['mapped'] = dedupedOpenElex2014.apply(lookup, args=(combinedmappings,), axis=1)" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "collapsed": false, "scrolled": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
countywardMuniReportingUnitdatatypemapped
0AdamsTown Of Adams Wards 1-3Town Of AdamsWards 1-31-3hypenWards 1,2,3
11AdamsTown Of Big Flats Wards 1 & 2Town Of Big FlatsWards 1 & 21 & 2ampWards 1,2
22AdamsTown Of Colburn Ward 1Town Of ColburnWard 11singleWard 1
33AdamsTown Of Dell Prairie Wards 1-3Town Of Dell PrairieWards 1-31-3hypenWards 1,2,3
44AdamsTown Of Easton Wards 1 & 2Town Of EastonWards 1 & 21 & 2ampWards 1,2
55AdamsTown Of Jackson Wards 1 & 2Town Of JacksonWards 1 & 21 & 2ampWards 1,2
66AdamsTown Of Leola Ward 1Town Of LeolaWard 11singleWard 1
77AdamsTown Of Lincoln Ward 1Town Of LincolnWard 11singleWard 1
88AdamsTown Of Monroe Ward 1Town Of MonroeWard 11singleWard 1
99AdamsTown Of New Chester Ward 1-3Town Of New ChesterWard 1-31-3hypenWards 1,2,3
\n", "
" ], "text/plain": [ " county ward Muni ReportingUnit \\\n", "0 Adams Town Of Adams Wards 1-3 Town Of Adams Wards 1-3 \n", "11 Adams Town Of Big Flats Wards 1 & 2 Town Of Big Flats Wards 1 & 2 \n", "22 Adams Town Of Colburn Ward 1 Town Of Colburn Ward 1 \n", "33 Adams Town Of Dell Prairie Wards 1-3 Town Of Dell Prairie Wards 1-3 \n", "44 Adams Town Of Easton Wards 1 & 2 Town Of Easton Wards 1 & 2 \n", "55 Adams Town Of Jackson Wards 1 & 2 Town Of Jackson Wards 1 & 2 \n", "66 Adams Town Of Leola Ward 1 Town Of Leola Ward 1 \n", "77 Adams Town Of Lincoln Ward 1 Town Of Lincoln Ward 1 \n", "88 Adams Town Of Monroe Ward 1 Town Of Monroe Ward 1 \n", "99 Adams Town Of New Chester Ward 1-3 Town Of New Chester Ward 1-3 \n", "\n", " data type mapped \n", "0 1-3 hypen Wards 1,2,3 \n", "11 1 & 2 amp Wards 1,2 \n", "22 1 single Ward 1 \n", "33 1-3 hypen Wards 1,2,3 \n", "44 1 & 2 amp Wards 1,2 \n", "55 1 & 2 amp Wards 1,2 \n", "66 1 single Ward 1 \n", "77 1 single Ward 1 \n", "88 1 single Ward 1 \n", "99 1-3 hypen Wards 1,2,3 " ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dedupedOpenElex2014.head(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's write out our combined special-case mappings for later. This dictionary will have both 2016 and 2014 data, but when we want to add 2012 data hopefully we'll be able to use it as a starting point. " ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "collapsed": true }, "outputs": [], "source": [ "with open(\"ward_mappings.json\", \"w\") as special_case_output:\n", " json.dump(combinedmappings, special_case_output)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's also write out a copy of a CSV with normalized wards. This is only the 2014 data, and can be used as a lookup table for the real Open Elections data, which we'll do in our next notebook." ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "collapsed": true }, "outputs": [], "source": [ "dedupedOpenElex2014.to_csv(\"2014_wards_normalized.csv\", index=False, columns=('county', 'ward', 'Muni', 'ReportingUnit', 'mapped'))\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*** One note about wards. Earlier, we said that all voters in a ward get the same ballot. This is not true in the case of school board elections. School district boundaries do not follow municipal boundaries, so some wards are split between different districts. The Elections Commission calls these 'District Combos', and the Voter Registration system records which District Combo a voter lives in and which ballot they should get. (In fact, not only do school district boundaries not follow municipal boundaries, they don't even follow parcel boundaries. There are a few condos in Madison where the School District boundary runs through the condo. In that case, for purposes of \"residency\", the district that the bedroom falls into decides which district the voter lives in.) \n", "\n", "\n", "
\n", "
\n", "
" ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python [conda env:gab]", "language": "python", "name": "conda-env-gab-py" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.2" } }, "nbformat": 4, "nbformat_minor": 1 }