{ "metadata": { "name": "", "signature": "sha256:fa6290b5b2a517d6190ddb141f05b090600a18bd83402d5c31f7f0df1993000c" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Homework 2: More Exploratory Data Analysis\n", "## Gene Expression Data and Election Polls \n", "\n", "Due: Thursday, October 2, 2014 11:59 PM\n", "\n", " Download this assignment\n", "\n", "#### Submission Instructions\n", "To submit your homework, create a folder named lastname_firstinitial_hw# and place your IPython notebooks, data files, and any other files in this folder. Your IPython Notebooks should be completely executed with the results visible in the notebook. We should not have to run any code. Compress the folder (please use .zip compression) and submit to the CS109 dropbox in the appropriate folder. If we cannot access your work because these directions are not followed correctly, we will not grade your work.\n", "\n", "\n", "---\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Introduction\n", "\n", "John Tukey wrote in [Exploratory Data Analysis, 1977](http://www.amazon.com/Exploratory-Data-Analysis-Wilder-Tukey/dp/0201076160/ref=pd_bbs_sr_2/103-4466654-5303007?ie=UTF8&s=books&qid=1189739816&sr=8-2): \"The greatest value of a picture is when it forces us to notice what we never expected to see.\" In this assignment we will continue using our exploratory data analysis tools, but apply them to new sets of data: [gene expression](http://en.wikipedia.org/wiki/Gene_expression) and polls from the [2012 Presidential Election](http://en.wikipedia.org/wiki/United_States_presidential_election,_2012) and from the [2014 Senate Midterm Elections](http://en.wikipedia.org/wiki/United_States_Senate_elections,_2014). 
\n", "\n", "**First**: You will use exploratory data analysis and apply the [singular value decomposition](http://en.wikipedia.org/wiki/Singular_value_decomposition) (SVD) to a gene expression data matrix to determine whether the date on which the gene expression samples were processed has a large effect on the variability seen in the data. \n", "\n", "**Second**: You will use the polls from the 2012 Presidential Election to determine (1) whether there is a pollster bias in presidential election polls and (2) whether the average of polls is better than just one poll.\n", "\n", "**Finally**: You will use the [HuffPost Pollster API](http://elections.huffingtonpost.com/pollster/api) to extract the polls for the current 2014 Senate Midterm Elections and provide a preliminary prediction of the result in each state.\n", "\n", "#### Data\n", "\n", "We will use the following data sets: \n", "\n", "1. A gene expression data set called `exprs_GSE5859.csv` and a sample annotation table called `sampleinfo_GSE5859.csv`, which are both available on GitHub in the 2014_data repository: [expression data set](https://github.com/cs109/2014_data/blob/master/exprs_GSE5859.csv) and [sample annotation table](https://github.com/cs109/2014_data/blob/master/sampleinfo_GSE5859.csv). \n", "\n", "2. Polls from the [2012 Presidential Election: Barack Obama vs Mitt Romney](http://elections.huffingtonpost.com/pollster/2012-general-election-romney-vs-obama). The polls we will use are from the [Huffington Post Pollster](http://elections.huffingtonpost.com/pollster). \n", "\n", "3. Polls from the [2014 Senate Midterm Elections](http://elections.huffingtonpost.com/pollster) from the [HuffPost Pollster API](http://elections.huffingtonpost.com/pollster/api). 
\n", "\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load Python modules" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# special IPython command to prepare the notebook for matplotlib\n", "%matplotlib inline \n", "\n", "import requests \n", "from StringIO import StringIO\n", "import numpy as np\n", "import pandas as pd # pandas\n", "import matplotlib.pyplot as plt # module for plotting \n", "import datetime as dt # module for manipulating dates and times\n", "import numpy.linalg as lin # module for performing linear algebra operations" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 61 }, { "cell_type": "code", "collapsed": false, "input": [ "# special matplotlib argument for improved plots\n", "from matplotlib import rcParams\n", "\n", "#colorbrewer2 Dark2 qualitative color table\n", "dark2_colors = [(0.10588235294117647, 0.6196078431372549, 0.4666666666666667),\n", " (0.8509803921568627, 0.37254901960784315, 0.00784313725490196),\n", " (0.4588235294117647, 0.4392156862745098, 0.7019607843137254),\n", " (0.9058823529411765, 0.1607843137254902, 0.5411764705882353),\n", " (0.4, 0.6509803921568628, 0.11764705882352941),\n", " (0.9019607843137255, 0.6705882352941176, 0.00784313725490196),\n", " (0.6509803921568628, 0.4627450980392157, 0.11372549019607843)]\n", "\n", "rcParams['figure.figsize'] = (10, 6)\n", "rcParams['figure.dpi'] = 150\n", "rcParams['axes.color_cycle'] = dark2_colors\n", "rcParams['lines.linewidth'] = 2\n", "rcParams['axes.facecolor'] = 'white'\n", "rcParams['font.size'] = 14\n", "rcParams['patch.edgecolor'] = 'white'\n", "rcParams['patch.facecolor'] = dark2_colors[0]\n", "rcParams['font.family'] = 'StixGeneral'" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 62 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Problem 1\n", "\n", "In this problem we will be using a [gene expression](http://en.wikipedia.org/wiki/Gene_expression) data set obtained 
from a [microarray](http://en.wikipedia.org/wiki/DNA_microarray) experiment ([read more about the specific experiment here](http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE5859)). There are two data sets we will use: \n", "\n", "1. The gene expression intensities, where the rows represent the features on the microarray (e.g. genes) and the columns represent the different microarray samples. \n", "\n", "2. A table that contains the information about each of the samples (columns in the gene expression data set), such as the sex, the age, the treatment status, and the date the samples were processed. Each row represents one sample. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Problem 1(a) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Read in the two files from GitHub: [exprs_GSE5859.csv](https://github.com/cs109/2014_data/blob/master/exprs_GSE5859.csv) and [sampleinfo_GSE5859.csv](https://github.com/cs109/2014_data/blob/master/sampleinfo_GSE5859.csv) as pandas DataFrames called `exprs` and `sampleinfo`. Use the gene names as the index of the `exprs` DataFrame." ] }, { "cell_type": "code", "collapsed": false, "input": [ "#your code here\n", "url_exprs = \"https://raw.githubusercontent.com/cs109/2014_data/master/exprs_GSE5859.csv\"\n", "exprs = pd.read_csv(url_exprs, index_col=0)\n", "\n", "url_sampleinfo = \"https://raw.githubusercontent.com/cs109/2014_data/master/sampleinfo_GSE5859.csv\"\n", "sampleinfo = pd.read_csv(url_sampleinfo)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 63 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Make sure the order of the columns in the gene expression DataFrame matches the order of file names in the sample annotation DataFrame. If the order of the columns in the `exprs` DataFrame does not match the order of the file names in the `sampleinfo` DataFrame, reorder the columns in the `exprs` DataFrame. 
\n", "\n", "**Note**: The column names of the gene expression DataFrame are the filenames of the original files from which these data were obtained. \n", "\n", "**Hint**: The method `list.index(x)` [[read here](https://docs.python.org/2/tutorial/datastructures.html)] can be used to return the index in the list of the first item whose value is `x`. It raises a `ValueError` if there is no such item. To check if the order of the columns in `exprs` matches the order of the rows in `sampleinfo`, you can call the method `.all()` on the resulting array of Booleans: \n", "\n", "Example code: `(exprs.columns == sampleinfo.filename).all()`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, let's check whether the order of the columns in the `exprs` DataFrame matches the order of the `filename` column in the `sampleinfo` DataFrame using the `==` comparison operator. " ] }, { "cell_type": "code", "collapsed": false, "input": [ "(exprs.columns == sampleinfo.filename).all()" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 64, "text": [ "False" ] } ], "prompt_number": 64 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we know the columns in the `exprs` DataFrame are out of order compared to the order of the rows in the `sampleinfo` DataFrame. To check if there are any columns in the correct order, we can test which column names equal the `filename` entry at the same position in the `sampleinfo` DataFrame. " ] }, { "cell_type": "code", "collapsed": false, "input": [ "sampleinfo[exprs.columns == sampleinfo.filename]" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "

|  | ethnicity | date | filename | sex |
|---|---|---|---|---|
| 23 | CEU | 2002-11-21 | GSM25482.CEL.gz | F |
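
The hint about `list.index(x)` can be turned into a reordering step. What follows is a minimal sketch on made-up data, not the assignment's own solution: the three toy file names and the two toy gene names are hypothetical stand-ins for the real `exprs` and `sampleinfo` DataFrames.

```python
import pandas as pd

# Hypothetical toy data: three samples stored in the "wrong" column order.
exprs = pd.DataFrame(
    {"c.CEL.gz": [3.0, 6.0], "a.CEL.gz": [1.0, 4.0], "b.CEL.gz": [2.0, 5.0]},
    index=["gene1", "gene2"],
)
sampleinfo = pd.DataFrame(
    {"filename": ["a.CEL.gz", "b.CEL.gz", "c.CEL.gz"], "sex": ["F", "M", "F"]}
)

# For each file name in sampleinfo, look up its position among the exprs columns.
columns = list(exprs.columns)
positions = [columns.index(name) for name in sampleinfo.filename]

# Select the columns by position so they follow the sampleinfo row order.
exprs = exprs.iloc[:, positions]

# Same idea as the check in the hint, written as a plain list comparison.
in_order = list(exprs.columns) == list(sampleinfo.filename)
print(in_order)
```

Selecting by label with `exprs[sampleinfo.filename]` would give the same ordering; the positional version is shown only because it follows the `list.index` hint directly.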

|  | GSM25349.CEL.gz | GSM25350.CEL.gz | GSM25356.CEL.gz | GSM25357.CEL.gz | GSM25358.CEL.gz | GSM25359.CEL.gz | GSM25360.CEL.gz | GSM25361.CEL.gz | GSM25377.CEL.gz | GSM25378.CEL.gz | ... | GSM136719.CEL.gz | GSM136720.CEL.gz | GSM136721.CEL.gz | GSM136722.CEL.gz | GSM136723.CEL.gz | GSM136724.CEL.gz | GSM136725.CEL.gz | GSM136726.CEL.gz | GSM136727.CEL.gz | GSM136729.CEL.gz |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1007_s_at | 6.627014 | 6.249807 | 5.934128 | 6.145268 | 6.091270 | 6.041186 | 6.050375 | 6.063847 | 6.226106 | 6.358282 | ... | 6.466445 | 6.533592 | 6.631492 | 6.513362 | 6.440706 | 6.704324 | 6.488579 | 6.809481 | 6.133068 | 6.155473 |
| 1053_at | 6.939184 | 6.818038 | 7.047962 | 7.422477 | 7.272361 | 7.128216 | 6.750719 | 6.836287 | 6.738022 | 7.367895 | ... | 7.032885 | 7.156344 | 7.018025 | 6.973322 | 6.884738 | 7.120898 | 7.517410 | 7.200596 | 7.280781 | 7.111583 |
| 117_at | 5.113570 | 5.074096 | 5.371201 | 5.266550 | 5.342047 | 5.063876 | 5.315898 | 5.483652 | 6.689444 | 6.482782 | ... | 5.661112 | 5.127260 | 5.151840 | 5.505602 | 5.687689 | 4.942651 | 5.247190 | 5.237239 | 5.401876 | 5.302628 |
| 121_at | 7.833862 | 7.780682 | 7.458197 | 7.655948 | 7.546555 | 7.072670 | 7.092984 | 6.954225 | 7.489785 | 7.388539 | ... | 7.769734 | 7.815864 | 7.683279 | 7.883231 | 7.913621 | 7.834196 | 7.331864 | 7.357102 | 7.607461 | 7.456453 |
| 1255_g_at | 3.152269 | 3.111747 | 3.018932 | 3.154545 | 3.107954 | 3.224284 | 3.114241 | 3.044975 | 3.304038 | 2.887919 | ... | 3.257484 | 3.339234 | 3.298384 | 3.150654 | 3.344501 | 3.230285 | 3.175846 | 3.105092 | 3.225123 | 3.090149 |

5 rows × 208 columns

|  | ethnicity | date | filename | sex |
|---|---|---|---|---|
| 0 | CEU | 2003-02-04 | GSM25349.CEL.gz | M |
| 1 | CEU | 2003-02-04 | GSM25350.CEL.gz | M |
| 2 | CEU | 2002-12-17 | GSM25356.CEL.gz | M |
| 3 | CEU | 2003-01-30 | GSM25357.CEL.gz | M |
| 4 | CEU | 2003-01-03 | GSM25358.CEL.gz | M |

|  | ethnicity | date | filename | sex | month | year | elapsedInDays |
|---|---|---|---|---|---|---|---|
| 0 | CEU | 2003-02-04 | GSM25349.CEL.gz | M | 2 | 2003 | 96 |
| 1 | CEU | 2003-02-04 | GSM25350.CEL.gz | M | 2 | 2003 | 96 |
| 2 | CEU | 2002-12-17 | GSM25356.CEL.gz | M | 12 | 2002 | 47 |
| 3 | CEU | 2003-01-30 | GSM25357.CEL.gz | M | 1 | 2003 | 91 |
| 4 | CEU | 2003-01-03 | GSM25358.CEL.gz | M | 1 | 2003 | 64 |
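
The `month`, `year`, and `elapsedInDays` columns in the table above can be reproduced from the `date` column. This is a hedged sketch using only the standard library: the reference date of October 31, 2002 is an assumption inferred from the displayed values (it is not stated in the surviving text), and the variable names are illustrative.

```python
import datetime as dt

# Assumed reference date, inferred from the elapsedInDays values shown above.
reference = dt.date(2002, 10, 31)

# The four distinct dates that appear in the table.
dates = ["2003-02-04", "2002-12-17", "2003-01-30", "2003-01-03"]
parsed = [dt.datetime.strptime(d, "%Y-%m-%d").date() for d in dates]

# Month and year come straight from the parsed date objects.
months = [d.month for d in parsed]
years = [d.year for d in parsed]

# Subtracting two dates gives a timedelta; .days is the elapsed day count.
elapsed_in_days = [(d - reference).days for d in parsed]
print(elapsed_in_days)
```

Under that assumed reference date the sketch yields 96, 47, 91, and 64 days, matching the table.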

|  | ethnicity | date | filename | sex | month | year | elapsedInDays |
|---|---|---|---|---|---|---|---|
| 0 | CEU | 2003-02-04 | GSM25349.CEL.gz | M | 2 | 2003 | 96 |
| 1 | CEU | 2003-02-04 | GSM25350.CEL.gz | M | 2 | 2003 | 96 |
| 2 | CEU | 2002-12-17 | GSM25356.CEL.gz | M | 12 | 2002 | 47 |
| 3 | CEU | 2003-01-30 | GSM25357.CEL.gz | M | 1 | 2003 | 91 |
| 4 | CEU | 2003-01-03 | GSM25358.CEL.gz | M | 1 | 2003 | 64 |

|  | GSM25349.CEL.gz | GSM25350.CEL.gz | GSM25356.CEL.gz | GSM25357.CEL.gz | GSM25358.CEL.gz | GSM25359.CEL.gz | GSM25360.CEL.gz | GSM25361.CEL.gz | GSM25377.CEL.gz | GSM25378.CEL.gz | ... | GSM48658.CEL.gz | GSM48660.CEL.gz | GSM48661.CEL.gz | GSM48662.CEL.gz | GSM48663.CEL.gz | GSM48664.CEL.gz | GSM48665.CEL.gz | GSM136725.CEL.gz | GSM136726.CEL.gz | GSM136727.CEL.gz |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1007_s_at | 6.627014 | 6.249807 | 5.934128 | 6.145268 | 6.091270 | 6.041186 | 6.050375 | 6.063847 | 6.226106 | 6.358282 | ... | 6.501510 | 6.558100 | 6.618286 | 6.869995 | 5.981000 | 6.403285 | 6.248702 | 6.488579 | 6.809481 | 6.133068 |
| 1053_at | 6.939184 | 6.818038 | 7.047962 | 7.422477 | 7.272361 | 7.128216 | 6.750719 | 6.836287 | 6.738022 | 7.367895 | ... | 7.066686 | 7.386702 | 6.407958 | 7.089180 | 7.120923 | 7.018998 | 7.155419 | 7.517410 | 7.200596 | 7.280781 |
| 117_at | 5.113570 | 5.074096 | 5.371201 | 5.266550 | 5.342047 | 5.063876 | 5.315898 | 5.483652 | 6.689444 | 6.482782 | ... | 5.600514 | 5.232676 | 5.630720 | 4.944748 | 5.275577 | 5.770358 | 5.616842 | 5.247190 | 5.237239 | 5.401876 |
| 121_at | 7.833862 | 7.780682 | 7.458197 | 7.655948 | 7.546555 | 7.072670 | 7.092984 | 6.954225 | 7.489785 | 7.388539 | ... | 7.437535 | 7.714650 | 7.416252 | 7.746448 | 8.001434 | 7.626723 | 7.452299 | 7.331864 | 7.357102 | 7.607461 |
| 1255_g_at | 3.152269 | 3.111747 | 3.018932 | 3.154545 | 3.107954 | 3.224284 | 3.114241 | 3.044975 | 3.304038 | 2.887919 | ... | 3.009983 | 3.151203 | 3.199709 | 3.159496 | 3.149710 | 3.242780 | 3.433125 | 3.175846 | 3.105092 | 3.225123 |

5 rows × 102 columns

|  | GSM25349.CEL.gz | GSM25350.CEL.gz | GSM25356.CEL.gz | GSM25357.CEL.gz | GSM25358.CEL.gz | GSM25359.CEL.gz | GSM25360.CEL.gz | GSM25361.CEL.gz | GSM25377.CEL.gz | GSM25378.CEL.gz | ... | GSM48658.CEL.gz | GSM48660.CEL.gz | GSM48661.CEL.gz | GSM48662.CEL.gz | GSM48663.CEL.gz | GSM48664.CEL.gz | GSM48665.CEL.gz | GSM136725.CEL.gz | GSM136726.CEL.gz | GSM136727.CEL.gz |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1007_s_at | 0.365059 | -0.012149 | -0.327827 | -0.116687 | -0.170685 | -0.220769 | -0.211580 | -0.198109 | -0.035849 | 0.096327 | ... | 0.239554 | 0.296144 | 0.356331 | 0.608040 | -0.280956 | 0.141330 | -0.013254 | 0.226624 | 0.547526 | -0.128887 |
| 1053_at | -0.136032 | -0.257178 | -0.027254 | 0.347260 | 0.197144 | 0.053000 | -0.324497 | -0.238930 | -0.337195 | 0.292678 | ... | -0.008531 | 0.311485 | -0.667259 | 0.013964 | 0.045706 | -0.056219 | 0.080203 | 0.442193 | 0.125379 | 0.205564 |
| 117_at | -0.490556 | -0.530031 | -0.232926 | -0.337577 | -0.262080 | -0.540250 | -0.288228 | -0.120475 | 1.085317 | 0.878655 | ... | -0.003613 | -0.371451 | 0.026594 | -0.659379 | -0.328549 | 0.166231 | 0.012716 | -0.356936 | -0.366887 | -0.202251 |
| 121_at | 0.418026 | 0.364847 | 0.042362 | 0.240113 | 0.130720 | -0.343165 | -0.322852 | -0.461611 | 0.073949 | -0.027296 | ... | 0.021700 | 0.298814 | 0.000417 | 0.330612 | 0.585598 | 0.210887 | 0.036463 | -0.083972 | -0.058733 | 0.191626 |
| 1255_g_at | 0.018335 | -0.022187 | -0.115002 | 0.020611 | -0.025980 | 0.090351 | -0.019693 | -0.088959 | 0.170104 | -0.246015 | ... | -0.123951 | 0.017269 | 0.065775 | 0.025562 | 0.015776 | 0.108846 | 0.299192 | 0.041912 | -0.028842 | 0.091189 |

5 rows × 102 columns
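
The values in the table above appear to be the expression values with each row's mean subtracted, which is a common preparation step before applying the SVD named in the introduction. Here is a minimal sketch of that idea on a small made-up matrix; the matrix `X` and the name `variance_explained` are illustrative assumptions, not taken from the assignment.

```python
import numpy as np

# Made-up matrix: rows are features (genes), columns are samples.
X = np.array([[6.0, 7.0, 8.0, 7.0],
              [3.0, 3.5, 2.5, 3.0],
              [5.0, 4.0, 6.0, 5.0]])

# Subtract each row's mean so every feature is centered at zero.
X_centered = X - X.mean(axis=1, keepdims=True)

# Thin SVD: X_centered equals U @ diag(s) @ Vt.
U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Fraction of the total variance carried by each component.
variance_explained = s**2 / np.sum(s**2)
print(variance_explained)
```

Each row of `Vt` has one entry per sample, so plotting the first row against the processing date is one way to look for a date effect in the dominant component.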

|  | Pollster | Start Date | End Date | Entry Date/Time (ET) | Number of Observations | Population | Mode | Obama | Romney | Undecided | Pollster URL | Source URL | Partisan | Affiliation |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Politico/GWU/Battleground | 2012-11-04 | 2012-11-05 | 2012-11-06 08:40:26 | 1000 | Likely Voters | Live Phone | 47 | 47 | 6 | http://elections.huffingtonpost.com/pollster/p... | NaN | Nonpartisan | None |
| 1 | UPI/CVOTER | 2012-11-03 | 2012-11-05 | 2012-11-05 18:30:15 | 3000 | Likely Voters | Live Phone | 49 | 48 | NaN | http://elections.huffingtonpost.com/pollster/p... | NaN | Nonpartisan | None |
| 2 | Gravis Marketing | 2012-11-03 | 2012-11-05 | 2012-11-06 09:22:02 | 872 | Likely Voters | Automated Phone | 48 | 48 | 4 | http://elections.huffingtonpost.com/pollster/p... | NaN | Nonpartisan | None |
| 3 | JZ Analytics/Newsmax | 2012-11-03 | 2012-11-05 | 2012-11-06 07:38:41 | 1041 | Likely Voters | Internet | 47 | 47 | 6 | http://elections.huffingtonpost.com/pollster/p... | NaN | Sponsor | Rep |
| 4 | Rasmussen | 2012-11-03 | 2012-11-05 | 2012-11-06 08:47:50 | 1500 | Likely Voters | Automated Phone | 48 | 49 | NaN | http://elections.huffingtonpost.com/pollster/p... | NaN | Nonpartisan | None |

|  | Pollster | Start Date | End Date | Entry Date/Time (ET) | Number of Observations | Population | Mode | Obama | Romney | Undecided | Pollster URL | Source URL | Partisan | Affiliation | Diff |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Politico/GWU/Battleground | 2012-11-04 | 2012-11-05 | 2012-11-06 08:40:26 | 1000 | Likely Voters | Live Phone | 47 | 47 | 6 | http://elections.huffingtonpost.com/pollster/p... | NaN | Nonpartisan | None | 0.00 |
| 1 | UPI/CVOTER | 2012-11-03 | 2012-11-05 | 2012-11-05 18:30:15 | 3000 | Likely Voters | Live Phone | 49 | 48 | NaN | http://elections.huffingtonpost.com/pollster/p... | NaN | Nonpartisan | None | 0.01 |
| 2 | Gravis Marketing | 2012-11-03 | 2012-11-05 | 2012-11-06 09:22:02 | 872 | Likely Voters | Automated Phone | 48 | 48 | 4 | http://elections.huffingtonpost.com/pollster/p... | NaN | Nonpartisan | None | 0.00 |
| 3 | JZ Analytics/Newsmax | 2012-11-03 | 2012-11-05 | 2012-11-06 07:38:41 | 1041 | Likely Voters | Internet | 47 | 47 | 6 | http://elections.huffingtonpost.com/pollster/p... | NaN | Sponsor | Rep | 0.00 |
| 4 | Rasmussen | 2012-11-03 | 2012-11-05 | 2012-11-06 08:47:50 | 1500 | Likely Voters | Automated Phone | 48 | 49 | NaN | http://elections.huffingtonpost.com/pollster/p... | NaN | Nonpartisan | None | -0.01 |
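
The `Diff` column in the table above is consistent with the Obama percentage minus the Romney percentage, divided by 100. This is a small sketch under that assumption, using three rows copied from the table; the DataFrame name `polls` is illustrative.

```python
import pandas as pd

# Three rows taken from the table above (percentages of respondents).
polls = pd.DataFrame({
    "Pollster": ["Politico/GWU/Battleground", "UPI/CVOTER", "Rasmussen"],
    "Obama": [47, 49, 48],
    "Romney": [47, 48, 49],
})

# Assumed definition: difference in support, expressed as a proportion.
polls["Diff"] = (polls["Obama"] - polls["Romney"]) / 100.0

# Averaging Diff over polls gives one pooled estimate of the margin.
mean_diff = polls["Diff"].mean()
print(polls[["Pollster", "Diff"]])
```

A positive `Diff` means the poll favors Obama and a negative one favors Romney, so the sign of the pooled mean indicates which candidate the averaged polls favor.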

|  | Pollster | Start Date | End Date | Entry Date/Time (ET) | Number of Observations | Population | Mode | McConnell | Grimes | Undecided | Pollster URL | Source URL | Partisan | Affiliation |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | SurveyUSA/Courier-Journal/Herald-Leader/WHAS/WKYT | 2014-09-29 | 2014-10-02 | 2014-10-06 15:54:36 | 632 | Likely Voters | IVR/Online | 44 | 46 | 7 | http://elections.huffingtonpost.com/pollster/p... | http://www.courier-journal.com/story/news/poli... | Nonpartisan | None |
| 1 | CBS/NYT/YouGov | 2014-09-20 | 2014-10-01 | 2014-10-05 15:42:23 | 1689 | Likely Voters | Internet | 47 | 41 | 9 | http://elections.huffingtonpost.com/pollster/p... | https://today.yougov.com/news/2014/09/07/battl... | Nonpartisan | None |
| 2 | Mellman (D-Grimes) | 2014-09-19 | 2014-09-27 | 2014-10-01 15:13:47 | 1800 | Likely Voters | Live Phone | 40 | 42 | 16 | http://elections.huffingtonpost.com/pollster/p... | http://images.politico.com/global/2014/09/30/m... | Sponsor | Dem |
| 3 | Gravis/Human Events (R) | 2014-09-13 | 2014-09-16 | 2014-09-19 14:57:03 | 839 | Likely Voters | Automated Phone | 51 | 41 | 8 | http://elections.huffingtonpost.com/pollster/p... | http://gravismarketing.com/polling-and-market-... | Sponsor | Rep |
| 4 | Ipsos/Reuters | 2014-09-08 | 2014-09-12 | 2014-09-22 15:23:23 | 944 | Likely Voters | Internet | 46 | 42 | 6 | http://elections.huffingtonpost.com/pollster/p... | http://ipsos-na.com/download/pr.aspx?id=13904 | Nonpartisan | None |

|  | Candidate1 | Candidate2 | Difference | Winner |
|---|---|---|---|---|
| 2014-alaska-senate-sullivan-vs-begich | Sullivan | Begich | -0.013 | Begich |
| 2014-arkansas-senate-cotton-vs-pryor | Cotton | Pryor | 0.008 | Cotton |
| 2014-colorado-senate-gardner-vs-udall | Gardner | Udall | -0.011 | Udall |
| 2014-delaware-senate-wade-vs-coons | Coons | Wade | 0.173 | Coons |
| 2014-georgia-senate-perdue-vs-nunn | Perdue | Nunn | 0.022 | Perdue |
| 2014-hawaii-senate-cavasso-vs-schatz | Schatz | Cavasso | 0.388 | Schatz |
| 2014-idaho-senate-risch-vs-mitchell | Risch | Mitchell | 0.325 | Risch |
| 2014-illinois-senate-oberweis-vs-durbin | Durbin | Oberweis | 0.121 | Durbin |
| 2014-iowa-senate-ernst-vs-braley | Ernst | Braley | -0.011 | Braley |
| 2014-kansas-senate-roberts-vs-orman-vs-taylor | Orman | Roberts | -0.044 | Roberts |
| 2014-kentucky-senate-mcconnell-vs-grimes | McConnell | Grimes | 0.024 | McConnell |
| 2014-louisiana-senate-cassidy-vs-landrieu | Cassidy | Landrieu | 0.001 | Cassidy |
| 2014-maine-senate-collins-vs-bellows | Collins | Bellows | 0.342 | Collins |
| 2014-massachusetts-senate-herr-vs-markey | Markey | Herr | 0.24 | Markey |
| 2014-michigan-senate-land-vs-peters | Peters | Land | 0.036 | Peters |
| 2014-minnesota-senate-mcfadden-vs-franken | Franken | McFadden | 0.105 | Franken |
| 2014-mississippi-senate-cochran-vs-childers | Cochran | Childers | 0.146 | Cochran |
| 2014-montana-senate-daines-vs-curtis | Daines | Curtis | 0.197 | Daines |
| 2014-nebraska-senate-sasse-vs-domina | Sasse | Domina | 0.236 | Sasse |
| 2014-new-hampshire-senate-bass-vs-shaheen | Shaheen | Bass | 0.153 | Shaheen |
| 2014-new-hampshire-senate-brown-vs-shaheen | Shaheen | Brown | 0.07 | Shaheen |
| 2014-new-jersey-senate-bell-vs-booker | Booker | Bell | 0.131 | Booker |
| 2014-new-mexico-senate-weh-vs-udall | Udall | Weh | 0.161 | Udall |
| 2014-north-carolina-senate-tillis-vs-hagan | Hagan | Tillis | 0.034 | Hagan |
| 2014-oklahoma-senate-inhofe-vs-silverstein | Inhofe | Silverstein | 0.322 | Inhofe |
| 2014-oklahoma-senate-lankford-vs-johnson | Lankford | Johnson | 0.315 | Lankford |
| 2014-oregon-senate-wehby-vs-merkley | Merkley | Wehby | 0.12 | Merkley |
| 2014-rhode-island-senate-zaccaria-vs-reed | Reed | Zaccaria | 0.323 | Reed |
| 2014-south-carolina-senate-graham-vs-hutto | Graham | Hutto | 0.15 | Graham |
| 2014-south-carolina-senate-scott-vs-dickerson | Scott | Dickerson | 0.2 | Scott |
| 2014-south-dakota-senate-rounds-vs-weiland | Rounds | Weiland | 0.134 | Rounds |
| 2014-tennessee-senate-alexander-vs-ball | Alexander | Ball | 0.184 | Alexander |
| 2014-texas-senate-cornyn-vs-alameel | Cornyn | Alameel | 0.173 | Cornyn |
| 2014-virginia-senate-gillespie-vs-warner | Warner | Gillespie | 0.17 | Warner |
| 2014-west-virginia-senate-capito-vs-tennant | Capito | Tennant | 0.135 | Capito |
| 2014-wyoming-senate | Enzi | Hardy | 0.46 | Enzi |
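
In the table above, the `Winner` column matches `Candidate1` whenever `Difference` is positive and `Candidate2` otherwise. The rule can be sketched as follows, assuming `Difference` is the average polled support for `Candidate1` minus that for `Candidate2`; this definition is inferred from the rows shown rather than stated, and the name `races` is illustrative.

```python
import pandas as pd

# Two rows copied from the table above, indexed by race slug.
races = pd.DataFrame(
    {
        "Candidate1": ["Sullivan", "Cotton"],
        "Candidate2": ["Begich", "Pryor"],
        "Difference": [-0.013, 0.008],
    },
    index=[
        "2014-alaska-senate-sullivan-vs-begich",
        "2014-arkansas-senate-cotton-vs-pryor",
    ],
)

# Keep Candidate1 where the difference is positive, else fall back to Candidate2.
races["Winner"] = races["Candidate1"].where(
    races["Difference"] > 0, races["Candidate2"]
)
print(races)
```

Several of the differences in the table are close to zero (for example 0.001 in Louisiana), so a prediction based on the sign alone is only a preliminary call for those races.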