{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Auditing Federal Contractors Part II\n", "_By [Leon Yin](leonyin.org). Last updated 2017-06-11._\n", "\n", "View this notebook in [NBViewer](http://nbviewer.jupyter.org/github/yinleon/us-spending/blob/master/1_analysis_methods.ipynb) or [GitHub](https://github.com/yinleon/us-spending/blob/master/1_analysis_methods.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Analysis Methods\n", "After downloading the spending data for Corrections Corporation of America (CCA) in [part 1](http://nbviewer.jupyter.org/github/yinleon/us-spending/blob/master/0_get_data.ipynb) of this module, we can analyze it with pandas and Matplotlib." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "%matplotlib inline\n", "import glob\n", "\n", "import pandas as pd\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "data_in = 'data_in/spending_corrections_corporation_of_america.tsv.gz'\n", "df = pd.read_csv(data_in, sep='\\t', compression='gzip')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Recall, this is what the data looks like:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| | unique_transaction_id | transaction_status | dollarsobligated | baseandexercisedoptionsvalue | baseandalloptionsvalue | maj_agency_cat | mod_agency | maj_fund_agency_cat | contractingofficeagencyid | contractingofficeid | ... | prime_awardee_executive4 | prime_awardee_executive4_compensation | prime_awardee_executive5 | prime_awardee_executive5_compensation | interagencycontractingauthority | last_modified_date | lastupdate | contract_year | filename | search_terms |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 7418eaac867ef6ac12aeafa60195ccd8 | active | 45000.0 | 0.0 | 0.0 | 1500: JUSTICE, DEPARTMENT OF | 1501: OFFICES, BOARDS AND DIVISIONS | 1500: JUSTICE, DEPARTMENT OF | 1501: OFFICES, BOARDS AND DIVISIONS | POS01: DEPT OF JUSTICE | ... | NaN | 0.0 | NaN | 0.0 | X: Not Applicable | 09/21/2000 | 2017-03-15 | 2000 | datafeeds\2000_All_Contracts_Full_20170315.tsv | Corrections Corporation of America\|CCA\|CoreCiv... |
| 1 | 01df83f4365464a1a069282bbb1fc832 | active | 31000.0 | 0.0 | 0.0 | 6400: TENNESSEE VALLEY AUTHORITY | 6400: TENNESSEE VALLEY AUTHORITY | 6400: TENNESSEE VALLEY AUTHORITY | 6400: TENNESSEE VALLEY AUTHORITY | PURCH: TENNESSEE VALLEY AUTHORITY | ... | NaN | 0.0 | NaN | 0.0 | X: Not Applicable | 05/22/2000 | 2017-03-15 | 2000 | datafeeds\2000_All_Contracts_Full_20170315.tsv | Corrections Corporation of America\|CCA\|CoreCiv... |
| 2 | 96cbbd9ad0ed6d3e028fae2be8453f24 | active | 25000.0 | 25000.0 | 25000.0 | 1500: JUSTICE, DEPARTMENT OF | 1540: FEDERAL PRISON SYSTEM | : | 1540: FEDERAL PRISON SYSTEM | 70000: CENTRAL OFFICE | ... | NaN | 0.0 | NaN | 0.0 | X: Not Applicable | 06/07/2009 | 2017-03-15 | 2000 | datafeeds\2000_All_Contracts_Full_20170315.tsv | Corrections Corporation of America\|CCA\|CoreCiv... |

3 rows × 229 columns
\n", "