{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__________\n",
"## Advisory!\n",
"** This is the stripped down `dw-nominate` notebook.
\n",
"If you're new to Python or want thorough documentation, please view the `dw-nominate-detail` notebook.**\n",
"____________"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# DW-Nominate Exploration\n",
"\n",
"The Nominate scoring scale was first developed by Keith T. Poole and Howard Rosenthal in the late 1980's.
\n",
"Since then, it has undergone several iterations, with the DW-series being the latest.
\n",
"Scores are derived from roll call votes, and contain 2 dimensions:
\n",
"1. Allowing us to place Senators, House members, and their political orgs on the liberal-convervative [-1, 1] spectrum (1st dimension).
\n",
"2. Nominate also quantifies the opposition/support of civil rights for underrepresented minorities (2nd dimension).\n",
"\n",
"Read more about these metrics on the Voteview website.\n",
"\n",
"The following notebook is going to build off a visualization made by the Pew Research Institute, by making a gif of changes of house ideology accross congresses."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here's the finished product:\n",
"\n",
"GIF hosted on Github."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We're going to\n",
"1. Read a fixed-width text file from an anonymous ftp hosted on the web.\n",
"2. Use Pandas dataframes to filter, replace, and aggregate data.\n",
"3. Plot data using Panda's Matplotlib extension.\n",
"4. Generate a GIF out of static png files."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%matplotlib inline\n",
"import os\n",
"import glob\n",
"\n",
"import us\n",
"import numpy as np\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt\n",
"import matplotlib.ticker as ticker\n",
"import imageio"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"where the raw files are hosted:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"senate_dl = 'ftp://k7moa.com/junkord/SL01113D21_BSSE.dat'\n",
"house_dl = 'ftp://k7moa.com/junkord/HL01113D21_BSSE.DAT'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can store these links in a list of tuples, later we'll iterate through `args` to download the correct files."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"args = [('senate', senate_dl),\n",
" ('house', house_dl)]"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"ftp://k7moa.com/junkord/SL01113D21_BSSE.dat\n",
"ftp://k7moa.com/junkord/HL01113D21_BSSE.DAT\n"
]
}
],
"source": [
"for arg in args:\n",
" print(arg[1])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"columns copied and pasted from voteview docs."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"cols = '''Congress Number\n",
"ICPSR ID Number\n",
"State Code\n",
"Congressional District Number\n",
"State Name\n",
"Party Code\n",
"Name\n",
"1st Dimension Coordinate\n",
"2nd Dimension Coordinate\n",
"1st Dimension Bootstrapped Standard Error \n",
"2nd Dimension Bootstrapped Standard Error\n",
"Correlation Between 1st and 2nd Dimension\n",
"Log-Likelihood\n",
"Number of Votes\n",
"Number of Classification Errors\n",
"Geometric Mean Probability'''.split('\\n')"
]
},
{
"cell_type": "code",
"execution_count": 91,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"'We split the columns by a newline character (\\n), to get 16 columns.'"
]
},
"execution_count": 91,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"\"We split the columns by a newline character (\\n), to get {} columns.\".format(len(cols))"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"col_widths = [(0,4), (4,10), (10,13), (13,15), \n",
" (15,23), (23,28), (28,40), (40,50), \n",
" (50,60), (60,70), (70,80), (80,90),\n",
" (90,102), (102,107), (107, 112)]"
]
},
{
"cell_type": "code",
"execution_count": 77,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# This dict is for compat. with Propublica Congress API.\n",
"col_mapping = {'Name' : 'last_name', \n",
" 'Congress Number': 'Congress'}\n",
"\n",
"# dict comprehension to convert state names to abbreviations.\n",
"state_dict = {state.name.upper()[:7] : state.abbr for state in us.states.STATES}"
]
},
{
"cell_type": "code",
"execution_count": 70,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"df = pd.read_fwf(house_dl, names=cols, colspecs=col_widths)"
]
},
{
"cell_type": "code",
"execution_count": 87,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"
\n", " | Congress Number | \n", "ICPSR ID Number | \n", "State Code | \n", "Congressional District Number | \n", "State Name | \n", "Party Code | \n", "Name | \n", "1st Dimension Coordinate | \n", "2nd Dimension Coordinate | \n", "1st Dimension Bootstrapped Standard Error | \n", "2nd Dimension Bootstrapped Standard Error | \n", "Correlation Between 1st and 2nd Dimension | \n", "Log-Likelihood | \n", "Number of Votes | \n", "Number of Classification Errors | \n", "Geometric Mean Probability | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | \n", "1 | \n", "9062 | \n", "1 | \n", "98 | \n", "CONNECT | \n", "5000 | \n", "STURGES | \n", "0.648 | \n", "0.330 | \n", "0.1129 | \n", "0.1204 | \n", "-0.1329 | \n", "-26.87624 | \n", "80 | \n", "12 | \n", "NaN | \n", "
1 | \n", "1 | \n", "9706 | \n", "1 | \n", "98 | \n", "CONNECT | \n", "5000 | \n", "WADSWORTH | \n", "0.759 | \n", "0.136 | \n", "0.0751 | \n", "0.1067 | \n", "-0.2138 | \n", "-18.20466 | \n", "86 | \n", "4 | \n", "NaN | \n", "
2 | \n", "1 | \n", "8457 | \n", "1 | \n", "98 | \n", "CONNECT | \n", "5000 | \n", "SHERMAN | \n", "0.715 | \n", "0.203 | \n", "0.0759 | \n", "0.2234 | \n", "-0.2127 | \n", "-39.45672 | \n", "107 | \n", "18 | \n", "NaN | \n", "