{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#Lede Program\n",
"## Data and databases\n",
"## Dealing with craptastical data sources, text and otherwise\n",
"\n",
"Biggest bugbear: pdf\n",
"\n",
" but in general, any sort of image file, with something you want to extract from it\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Sad part of story: \n",
"\n",
"best current commercial tools better than open source tools for many cases\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import pandas as pd"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##the lesser evil (but still very evil)\n",
"###pdfs with text information included as instructions for drawing text\n",
"####AKA\n",
"###you can try to copy and text gets highlighted and you can copy\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##helpful resources\n",
"\n",
"`https://www.propublica.org/nerds/item/turning-pdfs-to-text-doc-dollars-guide`\n",
"\n",
"`https://thomaslevine.com/!/parsing-pdfs/`\n",
" "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## FIRST LINE OF DEFENSE: `TABULA`\n",
"\n",
"### free software by journalists for journalists and friends\n",
"## *La Naciòn*, Knight foundation funding, Propublica\n",
"\n",
"download and run locally in browser\n",
"\n",
"http://tabula.technology/ \n",
"\n",
"requires JAVA runtime (boo).\n",
"\n",
"OR\n",
"\n",
"access someone else's open server:\n",
"\n",
"http://tabula.dataninja.it/\n",
"\n",
"(this is the older less good version).\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"list=pd.read_csv(\"pdf_examples/tabula-AFD-130118-015.csv\")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"
\n",
" \n",
" \n",
" | \n",
" MAJCOM, FOA, Etc | \n",
" Organizational Level | \n",
" Finding Type | \n",
" Quantity | \n",
" Item(s) discovered | \n",
" Location | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" ACC | \n",
" Staff | \n",
" Unprofessional | \n",
" 1 | \n",
" photo | \n",
" Workplace Common Area | \n",
"
\n",
" \n",
" 1 | \n",
" ACC | \n",
" Staff | \n",
" Unprofessional | \n",
" 1 | \n",
" newspaper with unprofessional cover | \n",
" Workplace Common Area | \n",
"
\n",
" \n",
" 2 | \n",
" ACC | \n",
" Squadron | \n",
" Unprofessional | \n",
" 1 | \n",
" magazine | \n",
" Workplace Common Area | \n",
"
\n",
" \n",
" 3 | \n",
" ACC | \n",
" Squadron | \n",
" Inappropriate/Offensive | \n",
" 1 | \n",
" Bumper sticker | \n",
" Car | \n",
"
\n",
" \n",
" 4 | \n",
" ACC | \n",
" Squadron | \n",
" Unprofessional | \n",
" 6 | \n",
" signs with unproffesional language | \n",
" Workplace Common Area | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" MAJCOM, FOA, Etc Organizational Level Finding Type Quantity \\\n",
"0 ACC Staff Unprofessional 1 \n",
"1 ACC Staff Unprofessional 1 \n",
"2 ACC Squadron Unprofessional 1 \n",
"3 ACC Squadron Inappropriate/Offensive 1 \n",
"4 ACC Squadron Unprofessional 6 \n",
"\n",
" Item(s) discovered Location \n",
"0 photo Workplace Common Area \n",
"1 newspaper with unprofessional cover Workplace Common Area \n",
"2 magazine Workplace Common Area \n",
"3 Bumper sticker Car \n",
"4 signs with unproffesional language Workplace Common Area "
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"list.head()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" MAJCOM, FOA, Etc | \n",
" Organizational Level | \n",
" Finding Type | \n",
" Quantity | \n",
" Item(s) discovered | \n",
"
\n",
" \n",
" Location | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" 1inX1in deck of cards with \\rnude drawing; 2inX2.5in post \\rcard drawing depicting front of \\rairplane with female drawing | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" A-10 Ladder Doors | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" Acft Dock desk | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" Air Terminal Operations Bldg | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" Aircraft | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
"
\n",
" \n",
" Aircraft Parts Store | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" Airfield Server/network drive | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" Airman & Family Readiness \\rCenter | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" Airmen's common work area | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" Although \\rhumorous/functional…could be \\rperceived as offensive | \n",
" 3 | \n",
" 3 | \n",
" 3 | \n",
" 3 | \n",
" 3 | \n",
"
\n",
" \n",
" Ammo Facility | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" Anti-religious sentiment does \\rnot promote a proper work \\renvironment | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" Auditorium | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
"
\n",
" \n",
" Auto Hobby Shop | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
"
\n",
" \n",
" Avionics Programs Office | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
"
\n",
" \n",
" Avionics section | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
"
\n",
" \n",
" Back of office door | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" Bar | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" Bar (class gift) | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" Bar (visiting unit gift) | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" Base Common Area | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" Base Library | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" Base operations men’s room | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" Bathroom | \n",
" 8 | \n",
" 8 | \n",
" 8 | \n",
" 8 | \n",
" 8 | \n",
"
\n",
" \n",
" Bathroom (M) | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" Bathroom (W) | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" Bathroom Stall | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" Bathroom stalls | \n",
" 6 | \n",
" 6 | \n",
" 6 | \n",
" 6 | \n",
" 6 | \n",
"
\n",
" \n",
" Bathroom wall | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" Bias against mentally \\rhandicapped people is not \\rappropriate in the work place | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
"
\n",
" \n",
" computer | \n",
" 4 | \n",
" 4 | \n",
" 4 | \n",
" 4 | \n",
" 4 | \n",
"
\n",
" \n",
" computer files | \n",
" 32 | \n",
" 32 | \n",
" 32 | \n",
" 32 | \n",
" 32 | \n",
"
\n",
" \n",
" computer room | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" detrimental to good order and \\rdiscipline | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
"
\n",
" \n",
" explicit material | \n",
" 6 | \n",
" 6 | \n",
" 6 | \n",
" 6 | \n",
" 6 | \n",
"
\n",
" \n",
" explicit material/mild nudity | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" explicit material/sexuality mild \\rnudity | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" explicit/mild nudity | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" explicit/violent/vulgar material | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" foyer | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" inside the drawer of a common \\ruse desk | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" latrine | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
"
\n",
" \n",
" member’s office | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
"
\n",
" \n",
" nudity/inappropriate subject \\rmatter | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" office cubicles | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" on latrine board | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" potential for inappropriate \\rcontent | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" server | \n",
" 5 | \n",
" 5 | \n",
" 5 | \n",
" 5 | \n",
" 5 | \n",
"
\n",
" \n",
" sexually explicit item in \\rcommon area | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" sexually explicit/offensive | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" sexually explicit/profane item in \\rcommon area | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
"
\n",
" \n",
" share drive | \n",
" 8 | \n",
" 8 | \n",
" 8 | \n",
" 8 | \n",
" 8 | \n",
"
\n",
" \n",
" shared drive | \n",
" 77 | \n",
" 77 | \n",
" 77 | \n",
" 77 | \n",
" 77 | \n",
"
\n",
" \n",
" shared drive/history micro film r | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
"
\n",
" \n",
" shelf | \n",
" 3 | \n",
" 3 | \n",
" 3 | \n",
" 3 | \n",
" 3 | \n",
"
\n",
" \n",
" storage closet | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" unprofessional comments | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
" 2 | \n",
"
\n",
" \n",
" vulgar | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
" vulgar or offensive language | \n",
" 4 | \n",
" 4 | \n",
" 4 | \n",
" 4 | \n",
" 4 | \n",
"
\n",
" \n",
" workspace | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
" 1 | \n",
"
\n",
" \n",
"
\n",
"
577 rows × 5 columns
\n",
"
"
],
"text/plain": [
" MAJCOM, FOA, Etc \\\n",
"Location \n",
"1inX1in deck of cards with \\rnude drawing; 2inX... 1 \n",
"A-10 Ladder Doors 1 \n",
"Acft Dock desk 1 \n",
"Air Terminal Operations Bldg 1 \n",
"Aircraft 2 \n",
"Aircraft Parts Store 1 \n",
"Airfield Server/network drive 1 \n",
"Airman & Family Readiness \\rCenter 1 \n",
"Airmen's common work area 1 \n",
"Although \\rhumorous/functional…could be \\rperce... 3 \n",
"Ammo Facility 1 \n",
"Anti-religious sentiment does \\rnot promote a p... 1 \n",
"Auditorium 2 \n",
"Auto Hobby Shop 2 \n",
"Avionics Programs Office 2 \n",
"Avionics section 2 \n",
"Back of office door 1 \n",
"Bar 1 \n",
"Bar (class gift) 1 \n",
"Bar (visiting unit gift) 1 \n",
"Base Common Area 1 \n",
"Base Library 1 \n",
"Base operations men’s room 1 \n",
"Bathroom 8 \n",
"Bathroom (M) 1 \n",
"Bathroom (W) 1 \n",
"Bathroom Stall 1 \n",
"Bathroom stalls 6 \n",
"Bathroom wall 1 \n",
"Bias against mentally \\rhandicapped people is n... 1 \n",
"... ... \n",
"computer 4 \n",
"computer files 32 \n",
"computer room 1 \n",
"detrimental to good order and \\rdiscipline 2 \n",
"explicit material 6 \n",
"explicit material/mild nudity 1 \n",
"explicit material/sexuality mild \\rnudity 1 \n",
"explicit/mild nudity 1 \n",
"explicit/violent/vulgar material 1 \n",
"foyer 1 \n",
"inside the drawer of a common \\ruse desk 1 \n",
"latrine 2 \n",
"member’s office 2 \n",
"nudity/inappropriate subject \\rmatter 1 \n",
"office cubicles 1 \n",
"on latrine board 1 \n",
"potential for inappropriate \\rcontent 1 \n",
"server 5 \n",
"sexually explicit item in \\rcommon area 1 \n",
"sexually explicit/offensive 1 \n",
"sexually explicit/profane item in \\rcommon area 2 \n",
"share drive 8 \n",
"shared drive 77 \n",
"shared drive/history micro film r 2 \n",
"shelf 3 \n",
"storage closet 1 \n",
"unprofessional comments 2 \n",
"vulgar 1 \n",
"vulgar or offensive language 4 \n",
"workspace 1 \n",
"\n",
" Organizational Level \\\n",
"Location \n",
"1inX1in deck of cards with \\rnude drawing; 2inX... 1 \n",
"A-10 Ladder Doors 1 \n",
"Acft Dock desk 1 \n",
"Air Terminal Operations Bldg 1 \n",
"Aircraft 2 \n",
"Aircraft Parts Store 1 \n",
"Airfield Server/network drive 1 \n",
"Airman & Family Readiness \\rCenter 1 \n",
"Airmen's common work area 1 \n",
"Although \\rhumorous/functional…could be \\rperce... 3 \n",
"Ammo Facility 1 \n",
"Anti-religious sentiment does \\rnot promote a p... 1 \n",
"Auditorium 2 \n",
"Auto Hobby Shop 2 \n",
"Avionics Programs Office 2 \n",
"Avionics section 2 \n",
"Back of office door 1 \n",
"Bar 1 \n",
"Bar (class gift) 1 \n",
"Bar (visiting unit gift) 1 \n",
"Base Common Area 1 \n",
"Base Library 1 \n",
"Base operations men’s room 1 \n",
"Bathroom 8 \n",
"Bathroom (M) 1 \n",
"Bathroom (W) 1 \n",
"Bathroom Stall 1 \n",
"Bathroom stalls 6 \n",
"Bathroom wall 1 \n",
"Bias against mentally \\rhandicapped people is n... 1 \n",
"... ... \n",
"computer 4 \n",
"computer files 32 \n",
"computer room 1 \n",
"detrimental to good order and \\rdiscipline 2 \n",
"explicit material 6 \n",
"explicit material/mild nudity 1 \n",
"explicit material/sexuality mild \\rnudity 1 \n",
"explicit/mild nudity 1 \n",
"explicit/violent/vulgar material 1 \n",
"foyer 1 \n",
"inside the drawer of a common \\ruse desk 1 \n",
"latrine 2 \n",
"member’s office 2 \n",
"nudity/inappropriate subject \\rmatter 1 \n",
"office cubicles 1 \n",
"on latrine board 1 \n",
"potential for inappropriate \\rcontent 1 \n",
"server 5 \n",
"sexually explicit item in \\rcommon area 1 \n",
"sexually explicit/offensive 1 \n",
"sexually explicit/profane item in \\rcommon area 2 \n",
"share drive 8 \n",
"shared drive 77 \n",
"shared drive/history micro film r 2 \n",
"shelf 3 \n",
"storage closet 1 \n",
"unprofessional comments 2 \n",
"vulgar 1 \n",
"vulgar or offensive language 4 \n",
"workspace 1 \n",
"\n",
" Finding Type Quantity \\\n",
"Location \n",
"1inX1in deck of cards with \\rnude drawing; 2inX... 1 1 \n",
"A-10 Ladder Doors 1 1 \n",
"Acft Dock desk 1 1 \n",
"Air Terminal Operations Bldg 1 1 \n",
"Aircraft 2 2 \n",
"Aircraft Parts Store 1 1 \n",
"Airfield Server/network drive 1 1 \n",
"Airman & Family Readiness \\rCenter 1 1 \n",
"Airmen's common work area 1 1 \n",
"Although \\rhumorous/functional…could be \\rperce... 3 3 \n",
"Ammo Facility 1 1 \n",
"Anti-religious sentiment does \\rnot promote a p... 1 1 \n",
"Auditorium 2 2 \n",
"Auto Hobby Shop 2 2 \n",
"Avionics Programs Office 2 2 \n",
"Avionics section 2 2 \n",
"Back of office door 1 1 \n",
"Bar 1 1 \n",
"Bar (class gift) 1 1 \n",
"Bar (visiting unit gift) 1 1 \n",
"Base Common Area 1 1 \n",
"Base Library 1 1 \n",
"Base operations men’s room 1 1 \n",
"Bathroom 8 8 \n",
"Bathroom (M) 1 1 \n",
"Bathroom (W) 1 1 \n",
"Bathroom Stall 1 1 \n",
"Bathroom stalls 6 6 \n",
"Bathroom wall 1 1 \n",
"Bias against mentally \\rhandicapped people is n... 1 1 \n",
"... ... ... \n",
"computer 4 4 \n",
"computer files 32 32 \n",
"computer room 1 1 \n",
"detrimental to good order and \\rdiscipline 2 2 \n",
"explicit material 6 6 \n",
"explicit material/mild nudity 1 1 \n",
"explicit material/sexuality mild \\rnudity 1 1 \n",
"explicit/mild nudity 1 1 \n",
"explicit/violent/vulgar material 1 1 \n",
"foyer 1 1 \n",
"inside the drawer of a common \\ruse desk 1 1 \n",
"latrine 2 2 \n",
"member’s office 2 2 \n",
"nudity/inappropriate subject \\rmatter 1 1 \n",
"office cubicles 1 1 \n",
"on latrine board 1 1 \n",
"potential for inappropriate \\rcontent 1 1 \n",
"server 5 5 \n",
"sexually explicit item in \\rcommon area 1 1 \n",
"sexually explicit/offensive 1 1 \n",
"sexually explicit/profane item in \\rcommon area 2 2 \n",
"share drive 8 8 \n",
"shared drive 77 77 \n",
"shared drive/history micro film r 2 2 \n",
"shelf 3 3 \n",
"storage closet 1 1 \n",
"unprofessional comments 2 2 \n",
"vulgar 1 1 \n",
"vulgar or offensive language 4 4 \n",
"workspace 1 1 \n",
"\n",
" Item(s) discovered \n",
"Location \n",
"1inX1in deck of cards with \\rnude drawing; 2inX... 1 \n",
"A-10 Ladder Doors 1 \n",
"Acft Dock desk 1 \n",
"Air Terminal Operations Bldg 1 \n",
"Aircraft 2 \n",
"Aircraft Parts Store 1 \n",
"Airfield Server/network drive 1 \n",
"Airman & Family Readiness \\rCenter 1 \n",
"Airmen's common work area 1 \n",
"Although \\rhumorous/functional…could be \\rperce... 3 \n",
"Ammo Facility 1 \n",
"Anti-religious sentiment does \\rnot promote a p... 1 \n",
"Auditorium 2 \n",
"Auto Hobby Shop 2 \n",
"Avionics Programs Office 2 \n",
"Avionics section 2 \n",
"Back of office door 1 \n",
"Bar 1 \n",
"Bar (class gift) 1 \n",
"Bar (visiting unit gift) 1 \n",
"Base Common Area 1 \n",
"Base Library 1 \n",
"Base operations men’s room 1 \n",
"Bathroom 8 \n",
"Bathroom (M) 1 \n",
"Bathroom (W) 1 \n",
"Bathroom Stall 1 \n",
"Bathroom stalls 6 \n",
"Bathroom wall 1 \n",
"Bias against mentally \\rhandicapped people is n... 1 \n",
"... ... \n",
"computer 4 \n",
"computer files 32 \n",
"computer room 1 \n",
"detrimental to good order and \\rdiscipline 2 \n",
"explicit material 6 \n",
"explicit material/mild nudity 1 \n",
"explicit material/sexuality mild \\rnudity 1 \n",
"explicit/mild nudity 1 \n",
"explicit/violent/vulgar material 1 \n",
"foyer 1 \n",
"inside the drawer of a common \\ruse desk 1 \n",
"latrine 2 \n",
"member’s office 2 \n",
"nudity/inappropriate subject \\rmatter 1 \n",
"office cubicles 1 \n",
"on latrine board 1 \n",
"potential for inappropriate \\rcontent 1 \n",
"server 5 \n",
"sexually explicit item in \\rcommon area 1 \n",
"sexually explicit/offensive 1 \n",
"sexually explicit/profane item in \\rcommon area 2 \n",
"share drive 8 \n",
"shared drive 77 \n",
"shared drive/history micro film r 2 \n",
"shelf 3 \n",
"storage closet 1 \n",
"unprofessional comments 2 \n",
"vulgar 1 \n",
"vulgar or offensive language 4 \n",
"workspace 1 \n",
"\n",
"[577 rows x 5 columns]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"list.groupby(by=\"Location\").count()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### still a lot of data munging to get into working form\n",
"\n",
" Hello *REGEX* my old friend,\n",
" I've come to talk with you once again"
]
},
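{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of the kind of regex clean-up this calls for, assuming the only problem is the stray `\\r` line breaks that show up inside `Location` cells above. The three-row frame and the helper name `squash_whitespace` are made up for illustration; they are not part of the Tabula export.\n",
"\n",
"```python\n",
"import re\n",
"import pandas as pd\n",
"\n",
"# Hypothetical rows imitating the Tabula export: cells that wrapped in the PDF\n",
"# come through with an embedded carriage return.\n",
"df = pd.DataFrame({\n",
"    'Location': ['Airman & Family Readiness \\rCenter', 'Bathroom stalls', 'share drive'],\n",
"    'Quantity': [1, 6, 8],\n",
"})\n",
"\n",
"def squash_whitespace(value):\n",
"    # Collapse every run of whitespace (including \\r) into a single space.\n",
"    return re.sub(r'\\s+', ' ', value).strip()\n",
"\n",
"df['Location'] = df['Location'].map(squash_whitespace)\n",
"print(df['Location'].tolist())\n",
"```\n",
"\n",
"After this, grouping by `Location` no longer splits a wrapped cell from its unwrapped twin. Near-duplicates such as `share drive` and `shared drive` still need a human decision.\n"
]
},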
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# second line of defense: `pdftotext`\n",
"\n",
"- part of the poppler-utils in most linux flavors\n",
"\n",
"`apt-get install poppler-utils`\n",
"\n",
"\n",
"- Mac or Windows download from:\n",
"\n",
"`http://www.foolabs.com/xpdf/home.html`\n",
"\n",
"\n",
"implementations *vary* a lot. Better on Linux than on Mac. \n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"/home/mljones/repositories/courses/databases-2015/pdf_examples\n"
]
}
],
"source": [
"cd pdf_examples/"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"#does basic conversio\n",
"!pdftotext p5.pdf\n"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Keynote Talk\r\n",
"\r\n",
"The Mathematics of Causal Inference\r\n",
"Judea Pearl\r\n",
"Computer Science Department\r\n",
"University of California Los Angeles\r\n",
"Los Angeles, CA 90024, USA\r\n",
"\r\n",
"judea@cs.ucla.edu\r\n",
"\r\n",
"Abstract\r\n",
"I will review concepts, principles, and mathematical tools that were found useful in applications involving\r\n",
"causal and counterfactual relationships. This semantical framework, enriched with a few ideas from logic\r\n",
"and graph theory, gives rise to a complete, coherent, and friendly calculus of causation that unifies the\r\n",
"graphical and counterfactual approaches to causation and resolves many long-standing problems in several\r\n",
"of the sciences. These include questions of causal effect estimation, policy analysis, and the integration of\r\n",
"data from diverse studies. Of special interest to KDD researchers would be the following topics:\r\n",
"1. The Mediation Formula, and what it tells us about direct and indirect effects.\r\n",
"2. What mathematics can tell us about “external validity” or “generalizing from experiments”\r\n",
"3. What can graph theory tell us about recovering from sample-selection bias.\r\n",
"Categories and Subject Descriptors: G.m [Mathematics of Computing]: Miscellaneous\r\n",
"General Terms: Theory\r\n",
"\r\n",
"Bio\r\n",
"Judea Pearl is a professor of computer science and statistics at the University of California, Los Angeles. He is\r\n",
"a graduate of the Technion, Israel, and has joined the faculty of UCLA in 1970, where he currently directs the\r\n",
"Cognitive Systems Laboratory and conducts research in artificial intelligence, causal inference and philosophy\r\n",
"of science. He has authored three books: Heuristics (1984), Probabilistic Reasoning (1988), and Causality\r\n",
"(2000;2009). A member of the National Academy of Engineering, and a Founding Fellow the American\r\n",
"Association for Artificial Intelligence (AAAI), Judea Pearl is the recipient of the 2008 Benjamin Franklin\r\n",
"Medal for Computer and Cognitive Science and this year’s David Rumelhart Prize from the Cognitive Science\r\n",
"Society.\r\n",
"\r\n",
"Copyright is held by the author/owner(s).\r\n",
"KDD’11, August 21–24, 2011, San Diego, California, USA.\r\n",
"ACM 978-1-4503-0813-7/11/08.\r\n",
"\r\n",
"5\r\n",
"\r\n",
"\f"
]
}
],
"source": [
"!cat p5.txt\n"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"!pdftotext -layout p5.pdf\n"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" Keynote Talk\r\n",
" The Mathematics of Causal Inference\r\n",
"\r\n",
" Judea Pearl\r\n",
" Computer Science Department\r\n",
" University of California Los Angeles\r\n",
" Los Angeles, CA 90024, USA\r\n",
" judea@cs.ucla.edu\r\n",
"\r\n",
"\r\n",
"Abstract\r\n",
"I will review concepts, principles, and mathematical tools that were found useful in applications involving\r\n",
"causal and counterfactual relationships. This semantical framework, enriched with a few ideas from logic\r\n",
"and graph theory, gives rise to a complete, coherent, and friendly calculus of causation that unifies the\r\n",
"graphical and counterfactual approaches to causation and resolves many long-standing problems in several\r\n",
"of the sciences. These include questions of causal effect estimation, policy analysis, and the integration of\r\n",
"data from diverse studies. Of special interest to KDD researchers would be the following topics:\r\n",
"\r\n",
" 1. The Mediation Formula, and what it tells us about direct and indirect effects.\r\n",
" 2. What mathematics can tell us about “external validity” or “generalizing from experiments”\r\n",
" 3. What can graph theory tell us about recovering from sample-selection bias.\r\n",
"\r\n",
"\r\n",
"Categories and Subject Descriptors: G.m [Mathematics of Computing]: Miscellaneous\r\n",
"General Terms: Theory\r\n",
"\r\n",
"Bio\r\n",
"Judea Pearl is a professor of computer science and statistics at the University of California, Los Angeles. He is\r\n",
"a graduate of the Technion, Israel, and has joined the faculty of UCLA in 1970, where he currently directs the\r\n",
"Cognitive Systems Laboratory and conducts research in artificial intelligence, causal inference and philosophy\r\n",
"of science. He has authored three books: Heuristics (1984), Probabilistic Reasoning (1988), and Causality\r\n",
"(2000;2009). A member of the National Academy of Engineering, and a Founding Fellow the American\r\n",
"Association for Artificial Intelligence (AAAI), Judea Pearl is the recipient of the 2008 Benjamin Franklin\r\n",
"Medal for Computer and Cognitive Science and this year’s David Rumelhart Prize from the Cognitive Science\r\n",
"Society.\r\n",
"\r\n",
"\r\n",
"\r\n",
"\r\n",
"Copyright is held by the author/owner(s).\r\n",
"KDD’11, August 21–24, 2011, San Diego, California, USA.\r\n",
"ACM 978-1-4503-0813-7/11/08.\r\n",
"\r\n",
"\r\n",
"\r\n",
"\r\n",
" 5\r\n",
"\f"
]
}
],
"source": [
"!cat p5.txt"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"Let's check out a yucky scanned then OCR'd table from our good friends at DARPA. (It doesn't work on Tabula, alas!)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"!pdftotext 12-F-1039_1999-DARPA-Funding-List.pdf"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"A\r\n",
"1 FY\r\n",
"2\r\n",
"1420 1999\r\n",
"1421\r\n",
"1422\r\n",
"1423\r\n",
"1424\r\n",
"1425\r\n",
"1426\r\n"
]
}
],
"source": [
"!head 12-F-1039_1999-DARPA-Funding-List.txt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# key parameter: `-layout` OR `-fixed` (and a number say 2 or 10)\n"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"!pdftotext -layout 12-F-1039_1999-DARPA-Funding-List.pdf"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" A B c D E F G\r\n",
" 1 FY CONTRACT NUMBER CONTRACT MOD PERFORMER PROGRAM TITLE AWARD DATE AMOUNT\r\n",
" 2\r\n",
"1420 1999 MDA97292J 1029 GR20 CNRI INFORMATION MANAGEMENT 12/10/1998 $687,000.00\r\n",
"1421 MDA97292J1 029 GR22 CNRI COMMUNICATOR 4/22/1999 $400,000.00\r\n",
"1422 MDA97292J1 029 GR22 CNRI WEBINABOX 4122/1999 $360,000.00\r\n",
"1423 MDA97292J1 029 P00025 CNRI WEBINABOX 8/24/1999 $0.00\r\n",
"1424 MDA972931 0030 P00009 GEORGIATEC HIGH DEFINITION SYSTEMS (HDS) 1/29/1999 $1 ,210,694.00\r\n",
"1425 MDA9729320014 P00017 USDISPLAYC FLAT PANEL DISPLAYS 8116/1999 $5,794,000.00\r\n",
"1426 MDA97293C0016 P00043 SYSPLANCOR CHPS: Combat Hybrid Power Systems 1nt1999 $79,441.00\r\n"
]
}
],
"source": [
"!head 12-F-1039_1999-DARPA-Funding-List.txt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"That looks like something we might be able to struggle with!\n",
"\n",
"Let's try it!\n",
"\n",
"Lots of ways of tackling it but the easiest is probably `pandas`' `read_table` function."
]
},
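{
"cell_type": "markdown",
"metadata": {},
"source": [
"One of the other ways of tackling it, sketched under assumptions: `-layout` pads columns with spaces rather than tabs, so a fixed-width reader such as `pandas.read_fwf` is a natural fit. The column widths, the column names and the two rebuilt sample rows below are hypothetical stand-ins, not measurements taken from the real file.\n",
"\n",
"```python\n",
"import io\n",
"import pandas as pd\n",
"\n",
"names = ['ROW', 'FY', 'CONTRACT NUMBER', 'CONTRACT MOD', 'PERFORMER',\n",
"         'PROGRAM TITLE', 'AWARD DATE', 'AMOUNT']\n",
"# Assumed character widths, one per column; the real file needs its own.\n",
"widths = [6, 6, 16, 10, 12, 26, 12, 14]\n",
"\n",
"# Two rows rebuilt from the output above and padded to the assumed widths.\n",
"rows = [\n",
"    ('1420', '1999', 'MDA97292J 1029', 'GR20', 'CNRI', 'INFORMATION MANAGEMENT', '12/10/1998', '$687,000.00'),\n",
"    ('1421', '', 'MDA97292J1 029', 'GR22', 'CNRI', 'COMMUNICATOR', '4/22/1999', '$400,000.00'),\n",
"]\n",
"text = '\\n'.join(''.join(v.ljust(w) for v, w in zip(row, widths)) for row in rows)\n",
"\n",
"# dtype=str keeps OCR'd identifiers and dates as text instead of guessing types.\n",
"df = pd.read_fwf(io.StringIO(text), widths=widths, header=None, names=names, dtype=str)\n",
"# The year appears to be printed only on the first row of a run, so fill it down.\n",
"df['FY'] = df['FY'].ffill()\n",
"# Strip the dollar sign and thousands separators before converting to numbers.\n",
"df['AMOUNT'] = df['AMOUNT'].str.replace(r'[$,]', '', regex=True).astype(float)\n",
"print(df[['ROW', 'FY', 'PERFORMER', 'AMOUNT']])\n",
"```\n",
"\n",
"Fixed positions cope with an empty `FY` cell, which splitting on runs of spaces would not. OCR slips inside a cell, such as the stray space in `MDA97292J 1029`, still have to be cleaned up afterwards.\n"
]
},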
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"#first just make sure in control of encoding\n",
"!pdftotext -layout -enc \"UTF-8\" 12-F-1039_1999-DARPA-Funding-List.pdf"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"darpa1999=pd.read_table(\"12-F-1039_1999-DARPA-Funding-List.txt\", sep=\"\\t\", encoding=\"UTF-8\", header=1)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" 1 FY CONTRACT NUMBER CONTRACT MOD PERFORMER PROGRAM TITLE AWARD DATE AMOUNT | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 2 | \n",
"
\n",
" \n",
" 1 | \n",
" 1420 1999 MDA97292J 1029 GR20 CNRI ... | \n",
"
\n",
" \n",
" 2 | \n",
" 1421 MDA97292J1 029 GR22 CNRI ... | \n",
"
\n",
" \n",
" 3 | \n",
" 1422 MDA97292J1 029 GR22 CNRI ... | \n",
"
\n",
" \n",
" 4 | \n",
" 1423 MDA97292J1 029 P00025 CNRI ... | \n",
"
\n",
" \n",
" 5 | \n",
" 1424 MDA972931 0030 P00009 GEORGI... | \n",
"
\n",
" \n",
" 6 | \n",
" 1425 MDA9729320014 P00017 USDISP... | \n",
"
\n",
" \n",
" 7 | \n",
" 1426 MDA97293C0016 P00043 SYSPLA... | \n",
"
\n",
" \n",
" 8 | \n",
" 1427 MDA97294C0003 A00003 BELLAT... | \n",
"
\n",
" \n",
" 9 | \n",
" 1428 MDA97294C0003 P00026 BELLAT... | \n",
"
\n",
" \n",
" 10 | \n",
" 1429 MDA97294C0003 P00027 BELLAT... | \n",
"
\n",
" \n",
" 11 | \n",
" 1430 MDA97294C0003 P00028 BELLAT... | \n",
"
\n",
" \n",
" 12 | \n",
" 1431 MDA97294C0003 P00029 BELLAT... | \n",
"
\n",
" \n",
" 13 | \n",
" 1432 MDA97294C0003 P00030 BELLAT... | \n",
"
\n",
" \n",
" 14 | \n",
" 1433 MDA97294C0003 P00031 BELLAT... | \n",
"
\n",
" \n",
" 15 | \n",
" 1434 MDA97294C0003 P00032 BELLAT... | \n",
"
\n",
" \n",
" 16 | \n",
" 1435 MDA97294C0016 P00026 BDMFED... | \n",
"
\n",
" \n",
" 17 | \n",
" 1436 MDA97294C0016 P00027 BDMFED... | \n",
"
\n",
" \n",
" 18 | \n",
" 1437 MDA97294C0016 P00028 BDMFED... | \n",
"
\n",
" \n",
" 19 | \n",
" 1438 MDA97294C0016 P00029 BDMFED... | \n",
"
\n",
" \n",
" 20 | \n",
" 1439 MDA97294C0016 P00030 BDMFED... | \n",
"
\n",
" \n",
" 21 | \n",
" 1440 MDA97294D0001 D003/P16 VRT ... | \n",
"
\n",
" \n",
" 22 | \n",
" 1441 MDA97294D0001 0032/3 VRT ... | \n",
"
\n",
" \n",
" 23 | \n",
" 1442 MDA97294D0001 003202 VALLEY... | \n",
"
\n",
" \n",
" 24 | \n",
" 1443 MDA972951 0016 GR03 ARIZON... | \n",
"
\n",
" \n",
" 25 | \n",
" 1444 MDA9729530027 P00014 BELLCO... | \n",
"
\n",
" \n",
" 26 | \n",
" 1445 MDA9729530029 A00009 PLANAR... | \n",
"
\n",
" \n",
" 27 | \n",
" 1446 MDA9729530029 GR0008 PLANAR... | \n",
"
\n",
" \n",
" 28 | \n",
" 1447 MDA9729530036 GR06 ITNENE... | \n",
"
\n",
" \n",
" 29 | \n",
" 1448 MDA9729530042 GR011 CRAYRE... | \n",
"
\n",
" \n",
" ... | \n",
" ... | \n",
"
\n",
" \n",
" 488 | \n",
" 1880 MDA97299F0028 D001 DIGITSY... | \n",
"
\n",
" \n",
" 489 | \n",
" 1881 MDA97299F0029 DO DTAI ... | \n",
"
\n",
" \n",
" 490 | \n",
" 1882 MDA97299F0030 BASIC BOOZALL... | \n",
"
\n",
" \n",
" 491 | \n",
" 1883 MDA97299F0031 BASIC SCHAFER... | \n",
"
\n",
" \n",
" 492 | \n",
" 1884 MDA97299F0032 DO BRADSON... | \n",
"
\n",
" \n",
" 493 | \n",
" 1885 MDA97299F0033 DO SYSPLAN... | \n",
"
\n",
" \n",
" 494 | \n",
" 1886 MDA97299F0033 P00001 SYSPLAN... | \n",
"
\n",
" \n",
" 495 | \n",
" 1887 MDA97299F0034 BASIC DIGITSY... | \n",
"
\n",
" \n",
" 496 | \n",
" 1888 MDA97299M0002 DO INFOSYS... | \n",
"
\n",
" \n",
" 497 | \n",
" 1889 MDA97299M0003 DO SRC ... | \n",
"
\n",
" \n",
" 498 | \n",
" A B c D ... | \n",
"
\n",
" \n",
" 499 | \n",
" 1 FY CONTRACT NUMBER CONTRACT MOD PERFORME... | \n",
"
\n",
" \n",
" 500 | \n",
" 2 | \n",
"
\n",
" \n",
" 501 | \n",
" 1890 MDA97299M0004 DO ARDAK ... | \n",
"
\n",
" \n",
" 502 | \n",
" 1891 MDA97299M0004 P00001 ARDAK ... | \n",
"
\n",
" \n",
" 503 | \n",
" 1892 MDA97299M0004 P00002 ARDAK ... | \n",
"
\n",
" \n",
" 504 | \n",
" 1893 MDA97299M0005 DO SHA ... | \n",
"
\n",
" \n",
" 505 | \n",
" 1894 MDA97299M0005 P00001 SHA ... | \n",
"
\n",
" \n",
" 506 | \n",
" 1895 MDA97299M0006 DO VISTARE... | \n",
"
\n",
" \n",
" 507 | \n",
" 1896 MDA97299M0007 DO VISUALE... | \n",
"
\n",
" \n",
" 508 | \n",
" 1897 MDA97299M0008 BASIC BLUE RI... | \n",
"
\n",
" \n",
" 509 | \n",
" 1898 MDA97299M0009 DO QRI ... | \n",
"
\n",
" \n",
" 510 | \n",
" 1899 MDA97299M001 0 DO PRAJAIN... | \n",
"
\n",
" \n",
" 511 | \n",
" 1900 MDA97299M0011 BASIC lVI ... | \n",
"
\n",
" \n",
" 512 | \n",
" 1901 MDA97299M0012 BASIC JERRYCO... | \n",
"
\n",
" \n",
" 513 | \n",
" 1902 MDA97299M0013 DO DIAMOND... | \n",
"
\n",
" \n",
" 514 | \n",
" 1903 MDA9769630014 P00007 SDLINC ... | \n",
"
\n",
" \n",
" 515 | \n",
" 1904 ... | \n",
"
\n",
" \n",
" 516 | \n",
" 1905 | \n",
"
\n",
" \n",
" 517 | \n",
" | \n",
"
\n",
" \n",
"
\n",
"
518 rows × 1 columns
\n",
"
"
],
"text/plain": [
" 1 FY CONTRACT NUMBER CONTRACT MOD PERFORMER PROGRAM TITLE AWARD DATE AMOUNT\n",
"0 2 \n",
"1 1420 1999 MDA97292J 1029 GR20 CNRI ... \n",
"2 1421 MDA97292J1 029 GR22 CNRI ... \n",
"3 1422 MDA97292J1 029 GR22 CNRI ... \n",
"4 1423 MDA97292J1 029 P00025 CNRI ... \n",
"5 1424 MDA972931 0030 P00009 GEORGI... \n",
"6 1425 MDA9729320014 P00017 USDISP... \n",
"7 1426 MDA97293C0016 P00043 SYSPLA... \n",
"8 1427 MDA97294C0003 A00003 BELLAT... \n",
"9 1428 MDA97294C0003 P00026 BELLAT... \n",
"10 1429 MDA97294C0003 P00027 BELLAT... \n",
"11 1430 MDA97294C0003 P00028 BELLAT... \n",
"12 1431 MDA97294C0003 P00029 BELLAT... \n",
"13 1432 MDA97294C0003 P00030 BELLAT... \n",
"14 1433 MDA97294C0003 P00031 BELLAT... \n",
"15 1434 MDA97294C0003 P00032 BELLAT... \n",
"16 1435 MDA97294C0016 P00026 BDMFED... \n",
"17 1436 MDA97294C0016 P00027 BDMFED... \n",
"18 1437 MDA97294C0016 P00028 BDMFED... \n",
"19 1438 MDA97294C0016 P00029 BDMFED... \n",
"20 1439 MDA97294C0016 P00030 BDMFED... \n",
"21 1440 MDA97294D0001 D003/P16 VRT ... \n",
"22 1441 MDA97294D0001 0032/3 VRT ... \n",
"23 1442 MDA97294D0001 003202 VALLEY... \n",
"24 1443 MDA972951 0016 GR03 ARIZON... \n",
"25 1444 MDA9729530027 P00014 BELLCO... \n",
"26 1445 MDA9729530029 A00009 PLANAR... \n",
"27 1446 MDA9729530029 GR0008 PLANAR... \n",
"28 1447 MDA9729530036 GR06 ITNENE... \n",
"29 1448 MDA9729530042 GR011 CRAYRE... \n",
".. ... \n",
"488 1880 MDA97299F0028 D001 DIGITSY... \n",
"489 1881 MDA97299F0029 DO DTAI ... \n",
"490 1882 MDA97299F0030 BASIC BOOZALL... \n",
"491 1883 MDA97299F0031 BASIC SCHAFER... \n",
"492 1884 MDA97299F0032 DO BRADSON... \n",
"493 1885 MDA97299F0033 DO SYSPLAN... \n",
"494 1886 MDA97299F0033 P00001 SYSPLAN... \n",
"495 1887 MDA97299F0034 BASIC DIGITSY... \n",
"496 1888 MDA97299M0002 DO INFOSYS... \n",
"497 1889 MDA97299M0003 DO SRC ... \n",
"498 \f",
" A B c D ... \n",
"499 1 FY CONTRACT NUMBER CONTRACT MOD PERFORME... \n",
"500 2 \n",
"501 1890 MDA97299M0004 DO ARDAK ... \n",
"502 1891 MDA97299M0004 P00001 ARDAK ... \n",
"503 1892 MDA97299M0004 P00002 ARDAK ... \n",
"504 1893 MDA97299M0005 DO SHA ... \n",
"505 1894 MDA97299M0005 P00001 SHA ... \n",
"506 1895 MDA97299M0006 DO VISTARE... \n",
"507 1896 MDA97299M0007 DO VISUALE... \n",
"508 1897 MDA97299M0008 BASIC BLUE RI... \n",
"509 1898 MDA97299M0009 DO QRI ... \n",
"510 1899 MDA97299M001 0 DO PRAJAIN... \n",
"511 1900 MDA97299M0011 BASIC lVI ... \n",
"512 1901 MDA97299M0012 BASIC JERRYCO... \n",
"513 1902 MDA97299M0013 DO DIAMOND... \n",
"514 1903 MDA9769630014 P00007 SDLINC ... \n",
"515 1904 ... \n",
"516 1905 \n",
"517 \f",
" \n",
"\n",
"[518 rows x 1 columns]"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"darpa1999"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"\n",
"You'll recall the `sep=\"\\t\"` or `sep=\"|\"` to tell `pd.read_csv` to look for tabs.\n",
"\n",
"The trick here is to look for `spaces`. Fortunately, we don't have to convert spaces to tabs. We just tell it that a number of spaces are the delimited using standard regex: `\\s+`!\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/mljones/anaconda/lib/python2.7/site-packages/pandas/io/parsers.py:648: ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support regex separators; you can avoid this warning by specifying engine='python'.\n",
" ParserWarning)\n"
]
}
],
"source": [
"darpa1999=pd.read_table(\"12-F-1039_1999-DARPA-Funding-List.txt\", sep=\"\\s\\s+\", encoding=\"UTF-8\", header=0)"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" A | \n",
" B | \n",
" c | \n",
" D | \n",
" E | \n",
" F | \n",
" G | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 1 FY | \n",
" CONTRACT NUMBER CONTRACT MOD PERFORMER | \n",
" PROGRAM TITLE | \n",
" AWARD DATE | \n",
" AMOUNT | \n",
" None | \n",
" None | \n",
"
\n",
" \n",
" 1 | \n",
" 2 | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
"
\n",
" \n",
" 2 | \n",
" 1420 1999 | \n",
" MDA97292J 1029 | \n",
" GR20 | \n",
" CNRI | \n",
" INFORMATION MANAGEMENT | \n",
" 12/10/1998 | \n",
" $687,000.00 | \n",
"
\n",
" \n",
" 3 | \n",
" 1421 | \n",
" MDA97292J1 029 | \n",
" GR22 | \n",
" CNRI | \n",
" COMMUNICATOR | \n",
" 4/22/1999 | \n",
" $400,000.00 | \n",
"
\n",
" \n",
" 4 | \n",
" 1422 | \n",
" MDA97292J1 029 | \n",
" GR22 | \n",
" CNRI | \n",
" WEBINABOX | \n",
" 4122/1999 | \n",
" $360,000.00 | \n",
"
\n",
" \n",
" 5 | \n",
" 1423 | \n",
" MDA97292J1 029 | \n",
" P00025 | \n",
" CNRI | \n",
" WEBINABOX | \n",
" 8/24/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 6 | \n",
" 1424 | \n",
" MDA972931 0030 | \n",
" P00009 | \n",
" GEORGIATEC | \n",
" HIGH DEFINITION SYSTEMS (HDS) | \n",
" 1/29/1999 | \n",
" $1 ,210,694.00 | \n",
"
\n",
" \n",
" 7 | \n",
" 1425 | \n",
" MDA9729320014 | \n",
" P00017 | \n",
" USDISPLAYC | \n",
" FLAT PANEL DISPLAYS | \n",
" 8116/1999 | \n",
" $5,794,000.00 | \n",
"
\n",
" \n",
" 8 | \n",
" 1426 | \n",
" MDA97293C0016 | \n",
" P00043 | \n",
" SYSPLANCOR | \n",
" CHPS: Combat Hybrid Power Systems | \n",
" 1nt1999 | \n",
" $79,441.00 | \n",
"
\n",
" \n",
" 9 | \n",
" 1427 | \n",
" MDA97294C0003 | \n",
" A00003 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 8/28/1998 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 10 | \n",
" 1428 | \n",
" MDA97294C0003 | \n",
" P00026 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 1/2011999 | \n",
" $332,197.00 | \n",
"
\n",
" \n",
" 11 | \n",
" 1429 | \n",
" MDA97294C0003 | \n",
" P00027 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 2/4/1999 | \n",
" $94,750.00 | \n",
"
\n",
" \n",
" 12 | \n",
" 1430 | \n",
" MDA97294C0003 | \n",
" P00028 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 2/22/1999 | \n",
" $450,000.00 | \n",
"
\n",
" \n",
" 13 | \n",
" 1431 | \n",
" MDA97294C0003 | \n",
" P00029 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 3/1/1999 | \n",
" $254,750.00 | \n",
"
\n",
" \n",
" 14 | \n",
" 1432 | \n",
" MDA97294C0003 | \n",
" P00030 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 4/1 2/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 15 | \n",
" 1433 | \n",
" MDA97294C0003 | \n",
" P00031 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 4/1 3/1999 | \n",
" $254,750.00 | \n",
"
\n",
" \n",
" 16 | \n",
" 1434 | \n",
" MDA97294C0003 | \n",
" P00032 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 9/8/1999 | \n",
" $254,750.00 | \n",
"
\n",
" \n",
" 17 | \n",
" 1435 | \n",
" MDA97294C0016 | \n",
" P00026 | \n",
" BDMFEDERAL | \n",
" STOWACTD | \n",
" 2/1 2/1999 | \n",
" $117,000.00 | \n",
"
\n",
" \n",
" 18 | \n",
" 1436 | \n",
" MDA97294C0016 | \n",
" P00027 | \n",
" BDMFEDERAL | \n",
" STOWACTD | \n",
" 3/1/1999 | \n",
" $273,000.00 | \n",
"
\n",
" \n",
" 19 | \n",
" 1437 | \n",
" MDA97294C0016 | \n",
" P00028 | \n",
" BDMFEDERAL | \n",
" IMAGE UNDERSTANDING | \n",
" 3/2911999 | \n",
" $150,166.00 | \n",
"
\n",
" \n",
" 20 | \n",
" 1438 | \n",
" MDA97294C0016 | \n",
" P00029 | \n",
" BDMFEDERAL | \n",
" STOWACTD | \n",
" 5/27/1999 | \n",
" $40,000.00 | \n",
"
\n",
" \n",
" 21 | \n",
" 1439 | \n",
" MDA97294C0016 | \n",
" P00030 | \n",
" BDMFEDERAL | \n",
" STOWACTD | \n",
" 911 /1999 | \n",
" $55,930.00 | \n",
"
\n",
" \n",
" 22 | \n",
" 1440 | \n",
" MDA97294D0001 | \n",
" D003/P16 | \n",
" VRT | \n",
" BADD | \n",
" 12/9/1998 | \n",
" $73,374.00 | \n",
"
\n",
" \n",
" 23 | \n",
" 1441 | \n",
" MDA97294D0001 | \n",
" 0032/3 | \n",
" VRT | \n",
" AGILE INFO CONTROL ENVIRONMENT | \n",
" 2/12/1999 | \n",
" $100,095.00 | \n",
"
\n",
" \n",
" 24 | \n",
" 1442 | \n",
" MDA97294D0001 | \n",
" 003202 | \n",
" VALLEYELEC | \n",
" AGILE INFO CONTROL ENVIRONMENT | \n",
" 12/22/1998 | \n",
" $100,095.00 | \n",
"
\n",
" \n",
" 25 | \n",
" 1443 | \n",
" MDA972951 0016 | \n",
" GR03 | \n",
" ARIZONASTA | \n",
" VLSI PHOTONICS | \n",
" 3/1 5/1999 | \n",
" $149,984.00 | \n",
"
\n",
" \n",
" 26 | \n",
" 1444 | \n",
" MDA9729530027 | \n",
" P00014 | \n",
" BELLCORE | \n",
" BROADBAND INFORMATION TECHNOLOGY | \n",
" 1/4/1999 | \n",
" $4,547,200.00 | \n",
"
\n",
" \n",
" 27 | \n",
" 1445 | \n",
" MDA9729530029 | \n",
" A00009 | \n",
" PLANARAMER | \n",
" HIGH DEFINITION SYSTEMS (HDS) | \n",
" 5/4/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 28 | \n",
" 1446 | \n",
" MDA9729530029 | \n",
" GR0008 | \n",
" PLANARAMER | \n",
" HIGH DEFINITION SYSTEMS (HDS) | \n",
" 11/10/1998 | \n",
" $7,570,137.00 | \n",
"
\n",
" \n",
" 29 | \n",
" 1447 | \n",
" MDA9729530036 | \n",
" GR06 | \n",
" ITNENERGYS | \n",
" PHOTOVOLTAICS (VP) | \n",
" 11/1 8/1998 | \n",
" $558,900.00 | \n",
"
\n",
" \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
"
\n",
" \n",
" 488 | \n",
" 1879 | \n",
" M DA97299F0028 | \n",
" DO | \n",
" DIGITSYSIN | \n",
" CONTRACT ADMINISTRATION | \n",
" 7/14/1999 | \n",
" $90,000.00 | \n",
"
\n",
" \n",
" 489 | \n",
" 1880 | \n",
" MDA97299F0028 | \n",
" D001 | \n",
" DIGITSYSIN | \n",
" CONTRACTS MANAGEMENT | \n",
" 6/30/1999 | \n",
" $4,422.00 | \n",
"
\n",
" \n",
" 490 | \n",
" 1881 | \n",
" MDA97299F0029 | \n",
" DO | \n",
" DTAI | \n",
" TECH INTEGRATION CENTER/TECH DEV CENTER | \n",
" 8/4/1999 | \n",
" $100,000.00 | \n",
"
\n",
" \n",
" 491 | \n",
" 1882 | \n",
" MDA97299F0030 | \n",
" BASIC | \n",
" BOOZALLEN | \n",
" POLYMER MATERIALS (CONG ADD) | \n",
" 5/15/1999 | \n",
" $423,916.45 | \n",
"
\n",
" \n",
" 492 | \n",
" 1883 | \n",
" MDA97299F0031 | \n",
" BASIC | \n",
" SCHAFER | \n",
" CEROS (FENCED) | \n",
" 8/2/1999 | \n",
" $59,972.00 | \n",
"
\n",
" \n",
" 493 | \n",
" 1884 | \n",
" MDA97299F0032 | \n",
" DO | \n",
" BRADSONCOR | \n",
" ADVANCED SHIP/SENSOR SYSTEMS MRN-02 | \n",
" 8/9/1999 | \n",
" $43,425.18 | \n",
"
\n",
" \n",
" 494 | \n",
" 1885 | \n",
" MDA97299F0033 | \n",
" DO | \n",
" SYSPLANCOR | \n",
" CONTRACTS MANAGEMENT | \n",
" 8/30/1999 | \n",
" $37,075.00 | \n",
"
\n",
" \n",
" 495 | \n",
" 1886 | \n",
" MDA97299F0033 | \n",
" P00001 | \n",
" SYSPLANCOR | \n",
" CONTRACTS MANAGEMENT | \n",
" 9/13/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 496 | \n",
" 1887 | \n",
" MDA97299F0034 | \n",
" BASIC | \n",
" DIGITSYSIN | \n",
" CONTRACTS MANAGEMENT | \n",
" 8/31/1999 | \n",
" $64,755.00 | \n",
"
\n",
" \n",
" 497 | \n",
" 1888 | \n",
" MDA97299M0002 | \n",
" DO | \n",
" INFOSYSLAB | \n",
" ADVANCED GROUND SURVELLIANCE | \n",
" 3/1211999 | \n",
" $99,729.00 | \n",
"
\n",
" \n",
" 498 | \n",
" 1889 | \n",
" MDA97299M0003 | \n",
" DO | \n",
" SRC | \n",
" ADVANCED MICROELECTRONICS | \n",
" 4/14/1999 | \n",
" $10,000.00 | \n",
"
\n",
" \n",
" 499 | \n",
" A | \n",
" B | \n",
" c | \n",
" D | \n",
" E | \n",
" F | \n",
" G | \n",
"
\n",
" \n",
" 500 | \n",
" 1 FY | \n",
" CONTRACT NUMBER CONTRACT MOD PERFORMER | \n",
" PROGRAM TITLE | \n",
" AWARD DATE | \n",
" AMOUNT | \n",
" None | \n",
" None | \n",
"
\n",
" \n",
" 501 | \n",
" 2 | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
"
\n",
" \n",
" 502 | \n",
" 1890 | \n",
" MDA97299M0004 | \n",
" DO | \n",
" ARDAK | \n",
" BW MEDICAL DIAGNOSTICS | \n",
" 3/30/1999 | \n",
" $99,970.00 | \n",
"
\n",
" \n",
" 503 | \n",
" 1891 | \n",
" MDA97299M0004 | \n",
" P00001 | \n",
" ARDAK | \n",
" BW MEDICAL DIAGNOSTICS | \n",
" 5/26/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 504 | \n",
" 1892 | \n",
" MDA97299M0004 | \n",
" P00002 | \n",
" ARDAK | \n",
" BW MEDICAL DIAGNOSTICS | \n",
" 8/4/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 505 | \n",
" 1893 | \n",
" MDA97299M0005 | \n",
" DO | \n",
" SHA | \n",
" SENSOR EMULATION | \n",
" 5/4/1999 | \n",
" $100,000.00 | \n",
"
\n",
" \n",
" 506 | \n",
" 1894 | \n",
" MDA97299M0005 | \n",
" P00001 | \n",
" SHA | \n",
" SENSOR EMULATION | \n",
" 5/12/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 507 | \n",
" 1895 | \n",
" MDA97299M0006 | \n",
" DO | \n",
" VISTARESEA | \n",
" UNDERSEA LITTORAL WARFARE | \n",
" 4/12/1999 | \n",
" $74,827.00 | \n",
"
\n",
" \n",
" 508 | \n",
" 1896 | \n",
" MDA97299M0007 | \n",
" DO | \n",
" VISUALEYES | \n",
" COMBAT CASUALTY DIAGNOSTICS:ULTRASOUND | \n",
" 5/3/1999 | \n",
" $59,500.00 | \n",
"
\n",
" \n",
" 509 | \n",
" 1897 | \n",
" MDA97299M0008 | \n",
" BASIC | \n",
" BLUE RIDGE | \n",
" OFFICE/PROGRAM SUPPORT (related to VTAX4) | \n",
" 5/11/1999 | \n",
" $48,566.00 | \n",
"
\n",
" \n",
" 510 | \n",
" 1898 | \n",
" MDA97299M0009 | \n",
" DO | \n",
" QRI | \n",
" ADVANCED SIMULATION TECH | \n",
" 6/29/1999 | \n",
" $99,494.00 | \n",
"
\n",
" \n",
" 511 | \n",
" 1899 | \n",
" MDA97299M001 0 | \n",
" DO | \n",
" PRAJAINC | \n",
" COUNTER MEASURES | \n",
" 6/14/1999 | \n",
" $80,460.00 | \n",
"
\n",
" \n",
" 512 | \n",
" 1900 | \n",
" MDA97299M0011 | \n",
" BASIC | \n",
" lVI | \n",
" COUNTER MEASURES | \n",
" 7/16/1999 | \n",
" $90,000.00 | \n",
"
\n",
" \n",
" 513 | \n",
" 1901 | \n",
" MDA97299M0012 | \n",
" BASIC | \n",
" JERRYCOOKE | \n",
" CONTRACT ADMINISTRATION | \n",
" 5/3/1999 | \n",
" $100,000.00 | \n",
"
\n",
" \n",
" 514 | \n",
" 1902 | \n",
" MDA97299M0013 | \n",
" DO | \n",
" DIAMONDBAC | \n",
" TECH INTEGRATION CENTER/TECH DEV CENTER | \n",
" 9/8/1999 | \n",
" $50,000.00 | \n",
"
\n",
" \n",
" 515 | \n",
" 1903 | \n",
" MDA9769630014 | \n",
" P00007 | \n",
" SDLINC | \n",
" SOLAR BLIND DETECTORS | \n",
" 7/9/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 516 | \n",
" 1904 | \n",
" FY SUBTOTAL: $340,495,021.94 | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
"
\n",
" \n",
" 517 | \n",
" 1905 | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
"
\n",
" \n",
"
\n",
"
518 rows × 7 columns
\n",
"
"
],
"text/plain": [
" A B c \\\n",
"0 1 FY CONTRACT NUMBER CONTRACT MOD PERFORMER PROGRAM TITLE \n",
"1 2 None None \n",
"2 1420 1999 MDA97292J 1029 GR20 \n",
"3 1421 MDA97292J1 029 GR22 \n",
"4 1422 MDA97292J1 029 GR22 \n",
"5 1423 MDA97292J1 029 P00025 \n",
"6 1424 MDA972931 0030 P00009 \n",
"7 1425 MDA9729320014 P00017 \n",
"8 1426 MDA97293C0016 P00043 \n",
"9 1427 MDA97294C0003 A00003 \n",
"10 1428 MDA97294C0003 P00026 \n",
"11 1429 MDA97294C0003 P00027 \n",
"12 1430 MDA97294C0003 P00028 \n",
"13 1431 MDA97294C0003 P00029 \n",
"14 1432 MDA97294C0003 P00030 \n",
"15 1433 MDA97294C0003 P00031 \n",
"16 1434 MDA97294C0003 P00032 \n",
"17 1435 MDA97294C0016 P00026 \n",
"18 1436 MDA97294C0016 P00027 \n",
"19 1437 MDA97294C0016 P00028 \n",
"20 1438 MDA97294C0016 P00029 \n",
"21 1439 MDA97294C0016 P00030 \n",
"22 1440 MDA97294D0001 D003/P16 \n",
"23 1441 MDA97294D0001 0032/3 \n",
"24 1442 MDA97294D0001 003202 \n",
"25 1443 MDA972951 0016 GR03 \n",
"26 1444 MDA9729530027 P00014 \n",
"27 1445 MDA9729530029 A00009 \n",
"28 1446 MDA9729530029 GR0008 \n",
"29 1447 MDA9729530036 GR06 \n",
".. ... ... ... \n",
"488 1879 M DA97299F0028 DO \n",
"489 1880 MDA97299F0028 D001 \n",
"490 1881 MDA97299F0029 DO \n",
"491 1882 MDA97299F0030 BASIC \n",
"492 1883 MDA97299F0031 BASIC \n",
"493 1884 MDA97299F0032 DO \n",
"494 1885 MDA97299F0033 DO \n",
"495 1886 MDA97299F0033 P00001 \n",
"496 1887 MDA97299F0034 BASIC \n",
"497 1888 MDA97299M0002 DO \n",
"498 1889 MDA97299M0003 DO \n",
"499 A B c \n",
"500 1 FY CONTRACT NUMBER CONTRACT MOD PERFORMER PROGRAM TITLE \n",
"501 2 None None \n",
"502 1890 MDA97299M0004 DO \n",
"503 1891 MDA97299M0004 P00001 \n",
"504 1892 MDA97299M0004 P00002 \n",
"505 1893 MDA97299M0005 DO \n",
"506 1894 MDA97299M0005 P00001 \n",
"507 1895 MDA97299M0006 DO \n",
"508 1896 MDA97299M0007 DO \n",
"509 1897 MDA97299M0008 BASIC \n",
"510 1898 MDA97299M0009 DO \n",
"511 1899 MDA97299M001 0 DO \n",
"512 1900 MDA97299M0011 BASIC \n",
"513 1901 MDA97299M0012 BASIC \n",
"514 1902 MDA97299M0013 DO \n",
"515 1903 MDA9769630014 P00007 \n",
"516 1904 FY SUBTOTAL: $340,495,021.94 None \n",
"517 1905 None None \n",
"\n",
" D E F \\\n",
"0 AWARD DATE AMOUNT None \n",
"1 None None None \n",
"2 CNRI INFORMATION MANAGEMENT 12/10/1998 \n",
"3 CNRI COMMUNICATOR 4/22/1999 \n",
"4 CNRI WEBINABOX 4122/1999 \n",
"5 CNRI WEBINABOX 8/24/1999 \n",
"6 GEORGIATEC HIGH DEFINITION SYSTEMS (HDS) 1/29/1999 \n",
"7 USDISPLAYC FLAT PANEL DISPLAYS 8116/1999 \n",
"8 SYSPLANCOR CHPS: Combat Hybrid Power Systems 1nt1999 \n",
"9 BELLATLANT NEXT GENERATION INTERNET 8/28/1998 \n",
"10 BELLATLANT NEXT GENERATION INTERNET 1/2011999 \n",
"11 BELLATLANT NEXT GENERATION INTERNET 2/4/1999 \n",
"12 BELLATLANT NEXT GENERATION INTERNET 2/22/1999 \n",
"13 BELLATLANT NEXT GENERATION INTERNET 3/1/1999 \n",
"14 BELLATLANT NEXT GENERATION INTERNET 4/1 2/1999 \n",
"15 BELLATLANT NEXT GENERATION INTERNET 4/1 3/1999 \n",
"16 BELLATLANT NEXT GENERATION INTERNET 9/8/1999 \n",
"17 BDMFEDERAL STOWACTD 2/1 2/1999 \n",
"18 BDMFEDERAL STOWACTD 3/1/1999 \n",
"19 BDMFEDERAL IMAGE UNDERSTANDING 3/2911999 \n",
"20 BDMFEDERAL STOWACTD 5/27/1999 \n",
"21 BDMFEDERAL STOWACTD 911 /1999 \n",
"22 VRT BADD 12/9/1998 \n",
"23 VRT AGILE INFO CONTROL ENVIRONMENT 2/12/1999 \n",
"24 VALLEYELEC AGILE INFO CONTROL ENVIRONMENT 12/22/1998 \n",
"25 ARIZONASTA VLSI PHOTONICS 3/1 5/1999 \n",
"26 BELLCORE BROADBAND INFORMATION TECHNOLOGY 1/4/1999 \n",
"27 PLANARAMER HIGH DEFINITION SYSTEMS (HDS) 5/4/1999 \n",
"28 PLANARAMER HIGH DEFINITION SYSTEMS (HDS) 11/10/1998 \n",
"29 ITNENERGYS PHOTOVOLTAICS (VP) 11/1 8/1998 \n",
".. ... ... ... \n",
"488 DIGITSYSIN CONTRACT ADMINISTRATION 7/14/1999 \n",
"489 DIGITSYSIN CONTRACTS MANAGEMENT 6/30/1999 \n",
"490 DTAI TECH INTEGRATION CENTER/TECH DEV CENTER 8/4/1999 \n",
"491 BOOZALLEN POLYMER MATERIALS (CONG ADD) 5/15/1999 \n",
"492 SCHAFER CEROS (FENCED) 8/2/1999 \n",
"493 BRADSONCOR ADVANCED SHIP/SENSOR SYSTEMS MRN-02 8/9/1999 \n",
"494 SYSPLANCOR CONTRACTS MANAGEMENT 8/30/1999 \n",
"495 SYSPLANCOR CONTRACTS MANAGEMENT 9/13/1999 \n",
"496 DIGITSYSIN CONTRACTS MANAGEMENT 8/31/1999 \n",
"497 INFOSYSLAB ADVANCED GROUND SURVELLIANCE 3/1211999 \n",
"498 SRC ADVANCED MICROELECTRONICS 4/14/1999 \n",
"499 D E F \n",
"500 AWARD DATE AMOUNT None \n",
"501 None None None \n",
"502 ARDAK BW MEDICAL DIAGNOSTICS 3/30/1999 \n",
"503 ARDAK BW MEDICAL DIAGNOSTICS 5/26/1999 \n",
"504 ARDAK BW MEDICAL DIAGNOSTICS 8/4/1999 \n",
"505 SHA SENSOR EMULATION 5/4/1999 \n",
"506 SHA SENSOR EMULATION 5/12/1999 \n",
"507 VISTARESEA UNDERSEA LITTORAL WARFARE 4/12/1999 \n",
"508 VISUALEYES COMBAT CASUALTY DIAGNOSTICS:ULTRASOUND 5/3/1999 \n",
"509 BLUE RIDGE OFFICE/PROGRAM SUPPORT (related to VTAX4) 5/11/1999 \n",
"510 QRI ADVANCED SIMULATION TECH 6/29/1999 \n",
"511 PRAJAINC COUNTER MEASURES 6/14/1999 \n",
"512 lVI COUNTER MEASURES 7/16/1999 \n",
"513 JERRYCOOKE CONTRACT ADMINISTRATION 5/3/1999 \n",
"514 DIAMONDBAC TECH INTEGRATION CENTER/TECH DEV CENTER 9/8/1999 \n",
"515 SDLINC SOLAR BLIND DETECTORS 7/9/1999 \n",
"516 None None None \n",
"517 None None None \n",
"\n",
" G \n",
"0 None \n",
"1 None \n",
"2 $687,000.00 \n",
"3 $400,000.00 \n",
"4 $360,000.00 \n",
"5 $0.00 \n",
"6 $1 ,210,694.00 \n",
"7 $5,794,000.00 \n",
"8 $79,441.00 \n",
"9 $0.00 \n",
"10 $332,197.00 \n",
"11 $94,750.00 \n",
"12 $450,000.00 \n",
"13 $254,750.00 \n",
"14 $0.00 \n",
"15 $254,750.00 \n",
"16 $254,750.00 \n",
"17 $117,000.00 \n",
"18 $273,000.00 \n",
"19 $150,166.00 \n",
"20 $40,000.00 \n",
"21 $55,930.00 \n",
"22 $73,374.00 \n",
"23 $100,095.00 \n",
"24 $100,095.00 \n",
"25 $149,984.00 \n",
"26 $4,547,200.00 \n",
"27 $0.00 \n",
"28 $7,570,137.00 \n",
"29 $558,900.00 \n",
".. ... \n",
"488 $90,000.00 \n",
"489 $4,422.00 \n",
"490 $100,000.00 \n",
"491 $423,916.45 \n",
"492 $59,972.00 \n",
"493 $43,425.18 \n",
"494 $37,075.00 \n",
"495 $0.00 \n",
"496 $64,755.00 \n",
"497 $99,729.00 \n",
"498 $10,000.00 \n",
"499 G \n",
"500 None \n",
"501 None \n",
"502 $99,970.00 \n",
"503 $0.00 \n",
"504 $0.00 \n",
"505 $100,000.00 \n",
"506 $0.00 \n",
"507 $74,827.00 \n",
"508 $59,500.00 \n",
"509 $48,566.00 \n",
"510 $99,494.00 \n",
"511 $80,460.00 \n",
"512 $90,000.00 \n",
"513 $100,000.00 \n",
"514 $50,000.00 \n",
"515 $0.00 \n",
"516 None \n",
"517 None \n",
"\n",
"[518 rows x 7 columns]"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"darpa1999"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"darpa1999.columns=[\"Number\", \"CONTRACT_NUMBER\", \"CONTRACT_MOD\", \"PERFORMER\",\"PROGRAM_TITLE\",\"AWARD_DATE\",\"AMOUNT\"]"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Number | \n",
" CONTRACT_NUMBER | \n",
" CONTRACT_MOD | \n",
" PERFORMER | \n",
" PROGRAM_TITLE | \n",
" AWARD_DATE | \n",
" AMOUNT | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 1 FY | \n",
" CONTRACT NUMBER CONTRACT MOD PERFORMER | \n",
" PROGRAM TITLE | \n",
" AWARD DATE | \n",
" AMOUNT | \n",
" None | \n",
" None | \n",
"
\n",
" \n",
" 1 | \n",
" 2 | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
"
\n",
" \n",
" 2 | \n",
" 1420 1999 | \n",
" MDA97292J 1029 | \n",
" GR20 | \n",
" CNRI | \n",
" INFORMATION MANAGEMENT | \n",
" 12/10/1998 | \n",
" $687,000.00 | \n",
"
\n",
" \n",
" 3 | \n",
" 1421 | \n",
" MDA97292J1 029 | \n",
" GR22 | \n",
" CNRI | \n",
" COMMUNICATOR | \n",
" 4/22/1999 | \n",
" $400,000.00 | \n",
"
\n",
" \n",
" 4 | \n",
" 1422 | \n",
" MDA97292J1 029 | \n",
" GR22 | \n",
" CNRI | \n",
" WEBINABOX | \n",
" 4122/1999 | \n",
" $360,000.00 | \n",
"
\n",
" \n",
" 5 | \n",
" 1423 | \n",
" MDA97292J1 029 | \n",
" P00025 | \n",
" CNRI | \n",
" WEBINABOX | \n",
" 8/24/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 6 | \n",
" 1424 | \n",
" MDA972931 0030 | \n",
" P00009 | \n",
" GEORGIATEC | \n",
" HIGH DEFINITION SYSTEMS (HDS) | \n",
" 1/29/1999 | \n",
" $1 ,210,694.00 | \n",
"
\n",
" \n",
" 7 | \n",
" 1425 | \n",
" MDA9729320014 | \n",
" P00017 | \n",
" USDISPLAYC | \n",
" FLAT PANEL DISPLAYS | \n",
" 8116/1999 | \n",
" $5,794,000.00 | \n",
"
\n",
" \n",
" 8 | \n",
" 1426 | \n",
" MDA97293C0016 | \n",
" P00043 | \n",
" SYSPLANCOR | \n",
" CHPS: Combat Hybrid Power Systems | \n",
" 1nt1999 | \n",
" $79,441.00 | \n",
"
\n",
" \n",
" 9 | \n",
" 1427 | \n",
" MDA97294C0003 | \n",
" A00003 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 8/28/1998 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 10 | \n",
" 1428 | \n",
" MDA97294C0003 | \n",
" P00026 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 1/2011999 | \n",
" $332,197.00 | \n",
"
\n",
" \n",
" 11 | \n",
" 1429 | \n",
" MDA97294C0003 | \n",
" P00027 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 2/4/1999 | \n",
" $94,750.00 | \n",
"
\n",
" \n",
" 12 | \n",
" 1430 | \n",
" MDA97294C0003 | \n",
" P00028 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 2/22/1999 | \n",
" $450,000.00 | \n",
"
\n",
" \n",
" 13 | \n",
" 1431 | \n",
" MDA97294C0003 | \n",
" P00029 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 3/1/1999 | \n",
" $254,750.00 | \n",
"
\n",
" \n",
" 14 | \n",
" 1432 | \n",
" MDA97294C0003 | \n",
" P00030 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 4/1 2/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 15 | \n",
" 1433 | \n",
" MDA97294C0003 | \n",
" P00031 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 4/1 3/1999 | \n",
" $254,750.00 | \n",
"
\n",
" \n",
" 16 | \n",
" 1434 | \n",
" MDA97294C0003 | \n",
" P00032 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 9/8/1999 | \n",
" $254,750.00 | \n",
"
\n",
" \n",
" 17 | \n",
" 1435 | \n",
" MDA97294C0016 | \n",
" P00026 | \n",
" BDMFEDERAL | \n",
" STOWACTD | \n",
" 2/1 2/1999 | \n",
" $117,000.00 | \n",
"
\n",
" \n",
" 18 | \n",
" 1436 | \n",
" MDA97294C0016 | \n",
" P00027 | \n",
" BDMFEDERAL | \n",
" STOWACTD | \n",
" 3/1/1999 | \n",
" $273,000.00 | \n",
"
\n",
" \n",
" 19 | \n",
" 1437 | \n",
" MDA97294C0016 | \n",
" P00028 | \n",
" BDMFEDERAL | \n",
" IMAGE UNDERSTANDING | \n",
" 3/2911999 | \n",
" $150,166.00 | \n",
"
\n",
" \n",
" 20 | \n",
" 1438 | \n",
" MDA97294C0016 | \n",
" P00029 | \n",
" BDMFEDERAL | \n",
" STOWACTD | \n",
" 5/27/1999 | \n",
" $40,000.00 | \n",
"
\n",
" \n",
" 21 | \n",
" 1439 | \n",
" MDA97294C0016 | \n",
" P00030 | \n",
" BDMFEDERAL | \n",
" STOWACTD | \n",
" 911 /1999 | \n",
" $55,930.00 | \n",
"
\n",
" \n",
" 22 | \n",
" 1440 | \n",
" MDA97294D0001 | \n",
" D003/P16 | \n",
" VRT | \n",
" BADD | \n",
" 12/9/1998 | \n",
" $73,374.00 | \n",
"
\n",
" \n",
" 23 | \n",
" 1441 | \n",
" MDA97294D0001 | \n",
" 0032/3 | \n",
" VRT | \n",
" AGILE INFO CONTROL ENVIRONMENT | \n",
" 2/12/1999 | \n",
" $100,095.00 | \n",
"
\n",
" \n",
" 24 | \n",
" 1442 | \n",
" MDA97294D0001 | \n",
" 003202 | \n",
" VALLEYELEC | \n",
" AGILE INFO CONTROL ENVIRONMENT | \n",
" 12/22/1998 | \n",
" $100,095.00 | \n",
"
\n",
" \n",
" 25 | \n",
" 1443 | \n",
" MDA972951 0016 | \n",
" GR03 | \n",
" ARIZONASTA | \n",
" VLSI PHOTONICS | \n",
" 3/1 5/1999 | \n",
" $149,984.00 | \n",
"
\n",
" \n",
" 26 | \n",
" 1444 | \n",
" MDA9729530027 | \n",
" P00014 | \n",
" BELLCORE | \n",
" BROADBAND INFORMATION TECHNOLOGY | \n",
" 1/4/1999 | \n",
" $4,547,200.00 | \n",
"
\n",
" \n",
" 27 | \n",
" 1445 | \n",
" MDA9729530029 | \n",
" A00009 | \n",
" PLANARAMER | \n",
" HIGH DEFINITION SYSTEMS (HDS) | \n",
" 5/4/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 28 | \n",
" 1446 | \n",
" MDA9729530029 | \n",
" GR0008 | \n",
" PLANARAMER | \n",
" HIGH DEFINITION SYSTEMS (HDS) | \n",
" 11/10/1998 | \n",
" $7,570,137.00 | \n",
"
\n",
" \n",
" 29 | \n",
" 1447 | \n",
" MDA9729530036 | \n",
" GR06 | \n",
" ITNENERGYS | \n",
" PHOTOVOLTAICS (VP) | \n",
" 11/1 8/1998 | \n",
" $558,900.00 | \n",
"
\n",
" \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
"
\n",
" \n",
" 488 | \n",
" 1879 | \n",
" M DA97299F0028 | \n",
" DO | \n",
" DIGITSYSIN | \n",
" CONTRACT ADMINISTRATION | \n",
" 7/14/1999 | \n",
" $90,000.00 | \n",
"
\n",
" \n",
" 489 | \n",
" 1880 | \n",
" MDA97299F0028 | \n",
" D001 | \n",
" DIGITSYSIN | \n",
" CONTRACTS MANAGEMENT | \n",
" 6/30/1999 | \n",
" $4,422.00 | \n",
"
\n",
" \n",
" 490 | \n",
" 1881 | \n",
" MDA97299F0029 | \n",
" DO | \n",
" DTAI | \n",
" TECH INTEGRATION CENTER/TECH DEV CENTER | \n",
" 8/4/1999 | \n",
" $100,000.00 | \n",
"
\n",
" \n",
" 491 | \n",
" 1882 | \n",
" MDA97299F0030 | \n",
" BASIC | \n",
" BOOZALLEN | \n",
" POLYMER MATERIALS (CONG ADD) | \n",
" 5/15/1999 | \n",
" $423,916.45 | \n",
"
\n",
" \n",
" 492 | \n",
" 1883 | \n",
" MDA97299F0031 | \n",
" BASIC | \n",
" SCHAFER | \n",
" CEROS (FENCED) | \n",
" 8/2/1999 | \n",
" $59,972.00 | \n",
"
\n",
" \n",
" 493 | \n",
" 1884 | \n",
" MDA97299F0032 | \n",
" DO | \n",
" BRADSONCOR | \n",
" ADVANCED SHIP/SENSOR SYSTEMS MRN-02 | \n",
" 8/9/1999 | \n",
" $43,425.18 | \n",
"
\n",
" \n",
" 494 | \n",
" 1885 | \n",
" MDA97299F0033 | \n",
" DO | \n",
" SYSPLANCOR | \n",
" CONTRACTS MANAGEMENT | \n",
" 8/30/1999 | \n",
" $37,075.00 | \n",
"
\n",
" \n",
" 495 | \n",
" 1886 | \n",
" MDA97299F0033 | \n",
" P00001 | \n",
" SYSPLANCOR | \n",
" CONTRACTS MANAGEMENT | \n",
" 9/13/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 496 | \n",
" 1887 | \n",
" MDA97299F0034 | \n",
" BASIC | \n",
" DIGITSYSIN | \n",
" CONTRACTS MANAGEMENT | \n",
" 8/31/1999 | \n",
" $64,755.00 | \n",
"
\n",
" \n",
" 497 | \n",
" 1888 | \n",
" MDA97299M0002 | \n",
" DO | \n",
" INFOSYSLAB | \n",
" ADVANCED GROUND SURVELLIANCE | \n",
" 3/1211999 | \n",
" $99,729.00 | \n",
"
\n",
" \n",
" 498 | \n",
" 1889 | \n",
" MDA97299M0003 | \n",
" DO | \n",
" SRC | \n",
" ADVANCED MICROELECTRONICS | \n",
" 4/14/1999 | \n",
" $10,000.00 | \n",
"
\n",
" \n",
" 499 | \n",
" A | \n",
" B | \n",
" c | \n",
" D | \n",
" E | \n",
" F | \n",
" G | \n",
"
\n",
" \n",
" 500 | \n",
" 1 FY | \n",
" CONTRACT NUMBER CONTRACT MOD PERFORMER | \n",
" PROGRAM TITLE | \n",
" AWARD DATE | \n",
" AMOUNT | \n",
" None | \n",
" None | \n",
"
\n",
" \n",
" 501 | \n",
" 2 | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
"
\n",
" \n",
" 502 | \n",
" 1890 | \n",
" MDA97299M0004 | \n",
" DO | \n",
" ARDAK | \n",
" BW MEDICAL DIAGNOSTICS | \n",
" 3/30/1999 | \n",
" $99,970.00 | \n",
"
\n",
" \n",
" 503 | \n",
" 1891 | \n",
" MDA97299M0004 | \n",
" P00001 | \n",
" ARDAK | \n",
" BW MEDICAL DIAGNOSTICS | \n",
" 5/26/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 504 | \n",
" 1892 | \n",
" MDA97299M0004 | \n",
" P00002 | \n",
" ARDAK | \n",
" BW MEDICAL DIAGNOSTICS | \n",
" 8/4/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 505 | \n",
" 1893 | \n",
" MDA97299M0005 | \n",
" DO | \n",
" SHA | \n",
" SENSOR EMULATION | \n",
" 5/4/1999 | \n",
" $100,000.00 | \n",
"
\n",
" \n",
" 506 | \n",
" 1894 | \n",
" MDA97299M0005 | \n",
" P00001 | \n",
" SHA | \n",
" SENSOR EMULATION | \n",
" 5/12/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 507 | \n",
" 1895 | \n",
" MDA97299M0006 | \n",
" DO | \n",
" VISTARESEA | \n",
" UNDERSEA LITTORAL WARFARE | \n",
" 4/12/1999 | \n",
" $74,827.00 | \n",
"
\n",
" \n",
" 508 | \n",
" 1896 | \n",
" MDA97299M0007 | \n",
" DO | \n",
" VISUALEYES | \n",
" COMBAT CASUALTY DIAGNOSTICS:ULTRASOUND | \n",
" 5/3/1999 | \n",
" $59,500.00 | \n",
"
\n",
" \n",
" 509 | \n",
" 1897 | \n",
" MDA97299M0008 | \n",
" BASIC | \n",
" BLUE RIDGE | \n",
" OFFICE/PROGRAM SUPPORT (related to VTAX4) | \n",
" 5/11/1999 | \n",
" $48,566.00 | \n",
"
\n",
" \n",
" 510 | \n",
" 1898 | \n",
" MDA97299M0009 | \n",
" DO | \n",
" QRI | \n",
" ADVANCED SIMULATION TECH | \n",
" 6/29/1999 | \n",
" $99,494.00 | \n",
"
\n",
" \n",
" 511 | \n",
" 1899 | \n",
" MDA97299M001 0 | \n",
" DO | \n",
" PRAJAINC | \n",
" COUNTER MEASURES | \n",
" 6/14/1999 | \n",
" $80,460.00 | \n",
"
\n",
" \n",
" 512 | \n",
" 1900 | \n",
" MDA97299M0011 | \n",
" BASIC | \n",
" lVI | \n",
" COUNTER MEASURES | \n",
" 7/16/1999 | \n",
" $90,000.00 | \n",
"
\n",
" \n",
" 513 | \n",
" 1901 | \n",
" MDA97299M0012 | \n",
" BASIC | \n",
" JERRYCOOKE | \n",
" CONTRACT ADMINISTRATION | \n",
" 5/3/1999 | \n",
" $100,000.00 | \n",
"
\n",
" \n",
" 514 | \n",
" 1902 | \n",
" MDA97299M0013 | \n",
" DO | \n",
" DIAMONDBAC | \n",
" TECH INTEGRATION CENTER/TECH DEV CENTER | \n",
" 9/8/1999 | \n",
" $50,000.00 | \n",
"
\n",
" \n",
" 515 | \n",
" 1903 | \n",
" MDA9769630014 | \n",
" P00007 | \n",
" SDLINC | \n",
" SOLAR BLIND DETECTORS | \n",
" 7/9/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 516 | \n",
" 1904 | \n",
" FY SUBTOTAL: $340,495,021.94 | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
"
\n",
" \n",
" 517 | \n",
" 1905 | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
"
\n",
" \n",
"
\n",
"
518 rows × 7 columns
\n",
"
"
],
"text/plain": [
" Number CONTRACT_NUMBER CONTRACT_MOD \\\n",
"0 1 FY CONTRACT NUMBER CONTRACT MOD PERFORMER PROGRAM TITLE \n",
"1 2 None None \n",
"2 1420 1999 MDA97292J 1029 GR20 \n",
"3 1421 MDA97292J1 029 GR22 \n",
"4 1422 MDA97292J1 029 GR22 \n",
"5 1423 MDA97292J1 029 P00025 \n",
"6 1424 MDA972931 0030 P00009 \n",
"7 1425 MDA9729320014 P00017 \n",
"8 1426 MDA97293C0016 P00043 \n",
"9 1427 MDA97294C0003 A00003 \n",
"10 1428 MDA97294C0003 P00026 \n",
"11 1429 MDA97294C0003 P00027 \n",
"12 1430 MDA97294C0003 P00028 \n",
"13 1431 MDA97294C0003 P00029 \n",
"14 1432 MDA97294C0003 P00030 \n",
"15 1433 MDA97294C0003 P00031 \n",
"16 1434 MDA97294C0003 P00032 \n",
"17 1435 MDA97294C0016 P00026 \n",
"18 1436 MDA97294C0016 P00027 \n",
"19 1437 MDA97294C0016 P00028 \n",
"20 1438 MDA97294C0016 P00029 \n",
"21 1439 MDA97294C0016 P00030 \n",
"22 1440 MDA97294D0001 D003/P16 \n",
"23 1441 MDA97294D0001 0032/3 \n",
"24 1442 MDA97294D0001 003202 \n",
"25 1443 MDA972951 0016 GR03 \n",
"26 1444 MDA9729530027 P00014 \n",
"27 1445 MDA9729530029 A00009 \n",
"28 1446 MDA9729530029 GR0008 \n",
"29 1447 MDA9729530036 GR06 \n",
".. ... ... ... \n",
"488 1879 M DA97299F0028 DO \n",
"489 1880 MDA97299F0028 D001 \n",
"490 1881 MDA97299F0029 DO \n",
"491 1882 MDA97299F0030 BASIC \n",
"492 1883 MDA97299F0031 BASIC \n",
"493 1884 MDA97299F0032 DO \n",
"494 1885 MDA97299F0033 DO \n",
"495 1886 MDA97299F0033 P00001 \n",
"496 1887 MDA97299F0034 BASIC \n",
"497 1888 MDA97299M0002 DO \n",
"498 1889 MDA97299M0003 DO \n",
"499 A B c \n",
"500 1 FY CONTRACT NUMBER CONTRACT MOD PERFORMER PROGRAM TITLE \n",
"501 2 None None \n",
"502 1890 MDA97299M0004 DO \n",
"503 1891 MDA97299M0004 P00001 \n",
"504 1892 MDA97299M0004 P00002 \n",
"505 1893 MDA97299M0005 DO \n",
"506 1894 MDA97299M0005 P00001 \n",
"507 1895 MDA97299M0006 DO \n",
"508 1896 MDA97299M0007 DO \n",
"509 1897 MDA97299M0008 BASIC \n",
"510 1898 MDA97299M0009 DO \n",
"511 1899 MDA97299M001 0 DO \n",
"512 1900 MDA97299M0011 BASIC \n",
"513 1901 MDA97299M0012 BASIC \n",
"514 1902 MDA97299M0013 DO \n",
"515 1903 MDA9769630014 P00007 \n",
"516 1904 FY SUBTOTAL: $340,495,021.94 None \n",
"517 1905 None None \n",
"\n",
" PERFORMER PROGRAM_TITLE AWARD_DATE \\\n",
"0 AWARD DATE AMOUNT None \n",
"1 None None None \n",
"2 CNRI INFORMATION MANAGEMENT 12/10/1998 \n",
"3 CNRI COMMUNICATOR 4/22/1999 \n",
"4 CNRI WEBINABOX 4122/1999 \n",
"5 CNRI WEBINABOX 8/24/1999 \n",
"6 GEORGIATEC HIGH DEFINITION SYSTEMS (HDS) 1/29/1999 \n",
"7 USDISPLAYC FLAT PANEL DISPLAYS 8116/1999 \n",
"8 SYSPLANCOR CHPS: Combat Hybrid Power Systems 1nt1999 \n",
"9 BELLATLANT NEXT GENERATION INTERNET 8/28/1998 \n",
"10 BELLATLANT NEXT GENERATION INTERNET 1/2011999 \n",
"11 BELLATLANT NEXT GENERATION INTERNET 2/4/1999 \n",
"12 BELLATLANT NEXT GENERATION INTERNET 2/22/1999 \n",
"13 BELLATLANT NEXT GENERATION INTERNET 3/1/1999 \n",
"14 BELLATLANT NEXT GENERATION INTERNET 4/1 2/1999 \n",
"15 BELLATLANT NEXT GENERATION INTERNET 4/1 3/1999 \n",
"16 BELLATLANT NEXT GENERATION INTERNET 9/8/1999 \n",
"17 BDMFEDERAL STOWACTD 2/1 2/1999 \n",
"18 BDMFEDERAL STOWACTD 3/1/1999 \n",
"19 BDMFEDERAL IMAGE UNDERSTANDING 3/2911999 \n",
"20 BDMFEDERAL STOWACTD 5/27/1999 \n",
"21 BDMFEDERAL STOWACTD 911 /1999 \n",
"22 VRT BADD 12/9/1998 \n",
"23 VRT AGILE INFO CONTROL ENVIRONMENT 2/12/1999 \n",
"24 VALLEYELEC AGILE INFO CONTROL ENVIRONMENT 12/22/1998 \n",
"25 ARIZONASTA VLSI PHOTONICS 3/1 5/1999 \n",
"26 BELLCORE BROADBAND INFORMATION TECHNOLOGY 1/4/1999 \n",
"27 PLANARAMER HIGH DEFINITION SYSTEMS (HDS) 5/4/1999 \n",
"28 PLANARAMER HIGH DEFINITION SYSTEMS (HDS) 11/10/1998 \n",
"29 ITNENERGYS PHOTOVOLTAICS (VP) 11/1 8/1998 \n",
".. ... ... ... \n",
"488 DIGITSYSIN CONTRACT ADMINISTRATION 7/14/1999 \n",
"489 DIGITSYSIN CONTRACTS MANAGEMENT 6/30/1999 \n",
"490 DTAI TECH INTEGRATION CENTER/TECH DEV CENTER 8/4/1999 \n",
"491 BOOZALLEN POLYMER MATERIALS (CONG ADD) 5/15/1999 \n",
"492 SCHAFER CEROS (FENCED) 8/2/1999 \n",
"493 BRADSONCOR ADVANCED SHIP/SENSOR SYSTEMS MRN-02 8/9/1999 \n",
"494 SYSPLANCOR CONTRACTS MANAGEMENT 8/30/1999 \n",
"495 SYSPLANCOR CONTRACTS MANAGEMENT 9/13/1999 \n",
"496 DIGITSYSIN CONTRACTS MANAGEMENT 8/31/1999 \n",
"497 INFOSYSLAB ADVANCED GROUND SURVELLIANCE 3/1211999 \n",
"498 SRC ADVANCED MICROELECTRONICS 4/14/1999 \n",
"499 D E F \n",
"500 AWARD DATE AMOUNT None \n",
"501 None None None \n",
"502 ARDAK BW MEDICAL DIAGNOSTICS 3/30/1999 \n",
"503 ARDAK BW MEDICAL DIAGNOSTICS 5/26/1999 \n",
"504 ARDAK BW MEDICAL DIAGNOSTICS 8/4/1999 \n",
"505 SHA SENSOR EMULATION 5/4/1999 \n",
"506 SHA SENSOR EMULATION 5/12/1999 \n",
"507 VISTARESEA UNDERSEA LITTORAL WARFARE 4/12/1999 \n",
"508 VISUALEYES COMBAT CASUALTY DIAGNOSTICS:ULTRASOUND 5/3/1999 \n",
"509 BLUE RIDGE OFFICE/PROGRAM SUPPORT (related to VTAX4) 5/11/1999 \n",
"510 QRI ADVANCED SIMULATION TECH 6/29/1999 \n",
"511 PRAJAINC COUNTER MEASURES 6/14/1999 \n",
"512 lVI COUNTER MEASURES 7/16/1999 \n",
"513 JERRYCOOKE CONTRACT ADMINISTRATION 5/3/1999 \n",
"514 DIAMONDBAC TECH INTEGRATION CENTER/TECH DEV CENTER 9/8/1999 \n",
"515 SDLINC SOLAR BLIND DETECTORS 7/9/1999 \n",
"516 None None None \n",
"517 None None None \n",
"\n",
" AMOUNT \n",
"0 None \n",
"1 None \n",
"2 $687,000.00 \n",
"3 $400,000.00 \n",
"4 $360,000.00 \n",
"5 $0.00 \n",
"6 $1 ,210,694.00 \n",
"7 $5,794,000.00 \n",
"8 $79,441.00 \n",
"9 $0.00 \n",
"10 $332,197.00 \n",
"11 $94,750.00 \n",
"12 $450,000.00 \n",
"13 $254,750.00 \n",
"14 $0.00 \n",
"15 $254,750.00 \n",
"16 $254,750.00 \n",
"17 $117,000.00 \n",
"18 $273,000.00 \n",
"19 $150,166.00 \n",
"20 $40,000.00 \n",
"21 $55,930.00 \n",
"22 $73,374.00 \n",
"23 $100,095.00 \n",
"24 $100,095.00 \n",
"25 $149,984.00 \n",
"26 $4,547,200.00 \n",
"27 $0.00 \n",
"28 $7,570,137.00 \n",
"29 $558,900.00 \n",
".. ... \n",
"488 $90,000.00 \n",
"489 $4,422.00 \n",
"490 $100,000.00 \n",
"491 $423,916.45 \n",
"492 $59,972.00 \n",
"493 $43,425.18 \n",
"494 $37,075.00 \n",
"495 $0.00 \n",
"496 $64,755.00 \n",
"497 $99,729.00 \n",
"498 $10,000.00 \n",
"499 G \n",
"500 None \n",
"501 None \n",
"502 $99,970.00 \n",
"503 $0.00 \n",
"504 $0.00 \n",
"505 $100,000.00 \n",
"506 $0.00 \n",
"507 $74,827.00 \n",
"508 $59,500.00 \n",
"509 $48,566.00 \n",
"510 $99,494.00 \n",
"511 $80,460.00 \n",
"512 $90,000.00 \n",
"513 $100,000.00 \n",
"514 $50,000.00 \n",
"515 $0.00 \n",
"516 None \n",
"517 None \n",
"\n",
"[518 rows x 7 columns]"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"darpa1999"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"darpa1999=darpa1999[2:]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A different problem! The columns titles are repeated at top of each sheet!\n",
"\n",
"Lots of ways to resolve and eliminate the unnecessary rows.\n",
"\n",
"In many cases, means that you'll have column names as values. Pick out just those ones and clean your data.\n"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Number | \n",
" CONTRACT_NUMBER | \n",
" CONTRACT_MOD | \n",
" PERFORMER | \n",
" PROGRAM_TITLE | \n",
" AWARD_DATE | \n",
" AMOUNT | \n",
"
\n",
" \n",
" \n",
" \n",
" 49 | \n",
" A | \n",
" B | \n",
" c | \n",
" D | \n",
" E | \n",
" F | \n",
" G | \n",
"
\n",
" \n",
" 99 | \n",
" A | \n",
" B | \n",
" c | \n",
" D | \n",
" E | \n",
" F | \n",
" G | \n",
"
\n",
" \n",
" 149 | \n",
" A | \n",
" B | \n",
" c | \n",
" D | \n",
" E | \n",
" F | \n",
" G | \n",
"
\n",
" \n",
" 199 | \n",
" A | \n",
" B | \n",
" c | \n",
" D | \n",
" E | \n",
" F | \n",
" G | \n",
"
\n",
" \n",
" 249 | \n",
" A | \n",
" B | \n",
" c | \n",
" D | \n",
" E | \n",
" F | \n",
" G | \n",
"
\n",
" \n",
" 299 | \n",
" A | \n",
" B | \n",
" c | \n",
" D | \n",
" E | \n",
" F | \n",
" G | \n",
"
\n",
" \n",
" 349 | \n",
" A | \n",
" B | \n",
" c | \n",
" D | \n",
" E | \n",
" F | \n",
" G | \n",
"
\n",
" \n",
" 399 | \n",
" A | \n",
" B | \n",
" c | \n",
" D | \n",
" E | \n",
" F | \n",
" G | \n",
"
\n",
" \n",
" 449 | \n",
" A | \n",
" B | \n",
" c | \n",
" D | \n",
" E | \n",
" F | \n",
" G | \n",
"
\n",
" \n",
" 499 | \n",
" A | \n",
" B | \n",
" c | \n",
" D | \n",
" E | \n",
" F | \n",
" G | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Number CONTRACT_NUMBER CONTRACT_MOD PERFORMER PROGRAM_TITLE AWARD_DATE \\\n",
"49 A B c D E F \n",
"99 A B c D E F \n",
"149 A B c D E F \n",
"199 A B c D E F \n",
"249 A B c D E F \n",
"299 A B c D E F \n",
"349 A B c D E F \n",
"399 A B c D E F \n",
"449 A B c D E F \n",
"499 A B c D E F \n",
"\n",
" AMOUNT \n",
"49 G \n",
"99 G \n",
"149 G \n",
"199 G \n",
"249 G \n",
"299 G \n",
"349 G \n",
"399 G \n",
"449 G \n",
"499 G "
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"darpa1999[darpa1999[\"Number\"]==(\"A\")]"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Number | \n",
" CONTRACT_NUMBER | \n",
" CONTRACT_MOD | \n",
" PERFORMER | \n",
" PROGRAM_TITLE | \n",
" AWARD_DATE | \n",
" AMOUNT | \n",
"
\n",
" \n",
" \n",
" \n",
" 50 | \n",
" 1 FY | \n",
" CONTRACT NUMBER CONTRACT MOD PERFORMER | \n",
" PROGRAM TITLE | \n",
" AWARD DATE | \n",
" AMOUNT | \n",
" None | \n",
" None | \n",
"
\n",
" \n",
" 100 | \n",
" 1 FY | \n",
" CONTRACT NUMBER CONTRACT MOD PERFORMER | \n",
" PROGRAM TITLE | \n",
" AWARD DATE | \n",
" AMOUNT | \n",
" None | \n",
" None | \n",
"
\n",
" \n",
" 150 | \n",
" 1 FY | \n",
" CONTRACT NUMBER CONTRACT MOD PERFORMER | \n",
" PROGRAM TITLE | \n",
" AWARD DATE | \n",
" AMOUNT | \n",
" None | \n",
" None | \n",
"
\n",
" \n",
" 200 | \n",
" 1 FY | \n",
" CONTRACT NUMBER CONTRACT MOD PERFORMER | \n",
" PROGRAM TITLE | \n",
" AWARD DATE | \n",
" AMOUNT | \n",
" None | \n",
" None | \n",
"
\n",
" \n",
" 250 | \n",
" 1 FY | \n",
" CONTRACT NUMBER CONTRACT MOD PERFORMER | \n",
" PROGRAM TITLE | \n",
" AWARD DATE | \n",
" AMOUNT | \n",
" None | \n",
" None | \n",
"
\n",
" \n",
" 300 | \n",
" 1 FY | \n",
" CONTRACT NUMBER CONTRACT MOD PERFORMER | \n",
" PROGRAM TITLE | \n",
" AWARD DATE | \n",
" AMOUNT | \n",
" None | \n",
" None | \n",
"
\n",
" \n",
" 350 | \n",
" 1 FY | \n",
" CONTRACT NUMBER CONTRACT MOD PERFORMER | \n",
" PROGRAM TITLE | \n",
" AWARD DATE | \n",
" AMOUNT | \n",
" None | \n",
" None | \n",
"
\n",
" \n",
" 400 | \n",
" 1 FY | \n",
" CONTRACT NUMBER CONTRACT MOD PERFORMER | \n",
" PROGRAM TITLE | \n",
" AWARD DATE | \n",
" AMOUNT | \n",
" None | \n",
" None | \n",
"
\n",
" \n",
" 500 | \n",
" 1 FY | \n",
" CONTRACT NUMBER CONTRACT MOD PERFORMER | \n",
" PROGRAM TITLE | \n",
" AWARD DATE | \n",
" AMOUNT | \n",
" None | \n",
" None | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Number CONTRACT_NUMBER CONTRACT_MOD PERFORMER \\\n",
"50 1 FY CONTRACT NUMBER CONTRACT MOD PERFORMER PROGRAM TITLE AWARD DATE \n",
"100 1 FY CONTRACT NUMBER CONTRACT MOD PERFORMER PROGRAM TITLE AWARD DATE \n",
"150 1 FY CONTRACT NUMBER CONTRACT MOD PERFORMER PROGRAM TITLE AWARD DATE \n",
"200 1 FY CONTRACT NUMBER CONTRACT MOD PERFORMER PROGRAM TITLE AWARD DATE \n",
"250 1 FY CONTRACT NUMBER CONTRACT MOD PERFORMER PROGRAM TITLE AWARD DATE \n",
"300 1 FY CONTRACT NUMBER CONTRACT MOD PERFORMER PROGRAM TITLE AWARD DATE \n",
"350 1 FY CONTRACT NUMBER CONTRACT MOD PERFORMER PROGRAM TITLE AWARD DATE \n",
"400 1 FY CONTRACT NUMBER CONTRACT MOD PERFORMER PROGRAM TITLE AWARD DATE \n",
"500 1 FY CONTRACT NUMBER CONTRACT MOD PERFORMER PROGRAM TITLE AWARD DATE \n",
"\n",
" PROGRAM_TITLE AWARD_DATE AMOUNT \n",
"50 AMOUNT None None \n",
"100 AMOUNT None None \n",
"150 AMOUNT None None \n",
"200 AMOUNT None None \n",
"250 AMOUNT None None \n",
"300 AMOUNT None None \n",
"350 AMOUNT None None \n",
"400 AMOUNT None None \n",
"500 AMOUNT None None "
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"darpa1999[darpa1999[\"Number\"]==(\"1 FY\")]"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"rows_to_include=(darpa1999[\"Number\"]!=\"A\") & (darpa1999[\"Number\"]!=\"1 FY\")"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"darpa1999=darpa1999[rows_to_include]"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Number | \n",
" CONTRACT_NUMBER | \n",
" CONTRACT_MOD | \n",
" PERFORMER | \n",
" PROGRAM_TITLE | \n",
" AWARD_DATE | \n",
" AMOUNT | \n",
"
\n",
" \n",
" \n",
" \n",
" 2 | \n",
" 1420 1999 | \n",
" MDA97292J 1029 | \n",
" GR20 | \n",
" CNRI | \n",
" INFORMATION MANAGEMENT | \n",
" 12/10/1998 | \n",
" $687,000.00 | \n",
"
\n",
" \n",
" 3 | \n",
" 1421 | \n",
" MDA97292J1 029 | \n",
" GR22 | \n",
" CNRI | \n",
" COMMUNICATOR | \n",
" 4/22/1999 | \n",
" $400,000.00 | \n",
"
\n",
" \n",
" 4 | \n",
" 1422 | \n",
" MDA97292J1 029 | \n",
" GR22 | \n",
" CNRI | \n",
" WEBINABOX | \n",
" 4122/1999 | \n",
" $360,000.00 | \n",
"
\n",
" \n",
" 5 | \n",
" 1423 | \n",
" MDA97292J1 029 | \n",
" P00025 | \n",
" CNRI | \n",
" WEBINABOX | \n",
" 8/24/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 6 | \n",
" 1424 | \n",
" MDA972931 0030 | \n",
" P00009 | \n",
" GEORGIATEC | \n",
" HIGH DEFINITION SYSTEMS (HDS) | \n",
" 1/29/1999 | \n",
" $1 ,210,694.00 | \n",
"
\n",
" \n",
" 7 | \n",
" 1425 | \n",
" MDA9729320014 | \n",
" P00017 | \n",
" USDISPLAYC | \n",
" FLAT PANEL DISPLAYS | \n",
" 8116/1999 | \n",
" $5,794,000.00 | \n",
"
\n",
" \n",
" 8 | \n",
" 1426 | \n",
" MDA97293C0016 | \n",
" P00043 | \n",
" SYSPLANCOR | \n",
" CHPS: Combat Hybrid Power Systems | \n",
" 1nt1999 | \n",
" $79,441.00 | \n",
"
\n",
" \n",
" 9 | \n",
" 1427 | \n",
" MDA97294C0003 | \n",
" A00003 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 8/28/1998 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 10 | \n",
" 1428 | \n",
" MDA97294C0003 | \n",
" P00026 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 1/2011999 | \n",
" $332,197.00 | \n",
"
\n",
" \n",
" 11 | \n",
" 1429 | \n",
" MDA97294C0003 | \n",
" P00027 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 2/4/1999 | \n",
" $94,750.00 | \n",
"
\n",
" \n",
" 12 | \n",
" 1430 | \n",
" MDA97294C0003 | \n",
" P00028 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 2/22/1999 | \n",
" $450,000.00 | \n",
"
\n",
" \n",
" 13 | \n",
" 1431 | \n",
" MDA97294C0003 | \n",
" P00029 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 3/1/1999 | \n",
" $254,750.00 | \n",
"
\n",
" \n",
" 14 | \n",
" 1432 | \n",
" MDA97294C0003 | \n",
" P00030 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 4/1 2/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 15 | \n",
" 1433 | \n",
" MDA97294C0003 | \n",
" P00031 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 4/1 3/1999 | \n",
" $254,750.00 | \n",
"
\n",
" \n",
" 16 | \n",
" 1434 | \n",
" MDA97294C0003 | \n",
" P00032 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 9/8/1999 | \n",
" $254,750.00 | \n",
"
\n",
" \n",
" 17 | \n",
" 1435 | \n",
" MDA97294C0016 | \n",
" P00026 | \n",
" BDMFEDERAL | \n",
" STOWACTD | \n",
" 2/1 2/1999 | \n",
" $117,000.00 | \n",
"
\n",
" \n",
" 18 | \n",
" 1436 | \n",
" MDA97294C0016 | \n",
" P00027 | \n",
" BDMFEDERAL | \n",
" STOWACTD | \n",
" 3/1/1999 | \n",
" $273,000.00 | \n",
"
\n",
" \n",
" 19 | \n",
" 1437 | \n",
" MDA97294C0016 | \n",
" P00028 | \n",
" BDMFEDERAL | \n",
" IMAGE UNDERSTANDING | \n",
" 3/2911999 | \n",
" $150,166.00 | \n",
"
\n",
" \n",
" 20 | \n",
" 1438 | \n",
" MDA97294C0016 | \n",
" P00029 | \n",
" BDMFEDERAL | \n",
" STOWACTD | \n",
" 5/27/1999 | \n",
" $40,000.00 | \n",
"
\n",
" \n",
" 21 | \n",
" 1439 | \n",
" MDA97294C0016 | \n",
" P00030 | \n",
" BDMFEDERAL | \n",
" STOWACTD | \n",
" 911 /1999 | \n",
" $55,930.00 | \n",
"
\n",
" \n",
" 22 | \n",
" 1440 | \n",
" MDA97294D0001 | \n",
" D003/P16 | \n",
" VRT | \n",
" BADD | \n",
" 12/9/1998 | \n",
" $73,374.00 | \n",
"
\n",
" \n",
" 23 | \n",
" 1441 | \n",
" MDA97294D0001 | \n",
" 0032/3 | \n",
" VRT | \n",
" AGILE INFO CONTROL ENVIRONMENT | \n",
" 2/12/1999 | \n",
" $100,095.00 | \n",
"
\n",
" \n",
" 24 | \n",
" 1442 | \n",
" MDA97294D0001 | \n",
" 003202 | \n",
" VALLEYELEC | \n",
" AGILE INFO CONTROL ENVIRONMENT | \n",
" 12/22/1998 | \n",
" $100,095.00 | \n",
"
\n",
" \n",
" 25 | \n",
" 1443 | \n",
" MDA972951 0016 | \n",
" GR03 | \n",
" ARIZONASTA | \n",
" VLSI PHOTONICS | \n",
" 3/1 5/1999 | \n",
" $149,984.00 | \n",
"
\n",
" \n",
" 26 | \n",
" 1444 | \n",
" MDA9729530027 | \n",
" P00014 | \n",
" BELLCORE | \n",
" BROADBAND INFORMATION TECHNOLOGY | \n",
" 1/4/1999 | \n",
" $4,547,200.00 | \n",
"
\n",
" \n",
" 27 | \n",
" 1445 | \n",
" MDA9729530029 | \n",
" A00009 | \n",
" PLANARAMER | \n",
" HIGH DEFINITION SYSTEMS (HDS) | \n",
" 5/4/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 28 | \n",
" 1446 | \n",
" MDA9729530029 | \n",
" GR0008 | \n",
" PLANARAMER | \n",
" HIGH DEFINITION SYSTEMS (HDS) | \n",
" 11/10/1998 | \n",
" $7,570,137.00 | \n",
"
\n",
" \n",
" 29 | \n",
" 1447 | \n",
" MDA9729530036 | \n",
" GR06 | \n",
" ITNENERGYS | \n",
" PHOTOVOLTAICS (VP) | \n",
" 11/1 8/1998 | \n",
" $558,900.00 | \n",
"
\n",
" \n",
" 30 | \n",
" 1448 | \n",
" MDA9729530042 | \n",
" GR011 | \n",
" CRAYRESEAR | \n",
" SHOCC | \n",
" 6nt1999 | \n",
" $1 ,289,562.00 | \n",
"
\n",
" \n",
" 31 | \n",
" 1449 | \n",
" MDA97295C0004 | \n",
" P00008 | \n",
" UMASS | \n",
" LARGE MILLIMETER TELESCOPE | \n",
" 8/30/1999 | \n",
" $1 ,151 ,500.00 | \n",
"
\n",
" \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
"
\n",
" \n",
" 486 | \n",
" 1877 | \n",
" MDA97299F0025 | \n",
" BASIC | \n",
" SYSPLANCOR | \n",
" COUNTER UNDERGROUND FACILITIES | \n",
" 6/25/1999 | \n",
" $251 ,924.00 | \n",
"
\n",
" \n",
" 487 | \n",
" 1878 | \n",
" MDA97299F0027 | \n",
" DO | \n",
" ORIONSCSYS | \n",
" COUNTER MEASURES | \n",
" 6/11/1999 | \n",
" $199,991 .00 | \n",
"
\n",
" \n",
" 488 | \n",
" 1879 | \n",
" M DA97299F0028 | \n",
" DO | \n",
" DIGITSYSIN | \n",
" CONTRACT ADMINISTRATION | \n",
" 7/14/1999 | \n",
" $90,000.00 | \n",
"
\n",
" \n",
" 489 | \n",
" 1880 | \n",
" MDA97299F0028 | \n",
" D001 | \n",
" DIGITSYSIN | \n",
" CONTRACTS MANAGEMENT | \n",
" 6/30/1999 | \n",
" $4,422.00 | \n",
"
\n",
" \n",
" 490 | \n",
" 1881 | \n",
" MDA97299F0029 | \n",
" DO | \n",
" DTAI | \n",
" TECH INTEGRATION CENTER/TECH DEV CENTER | \n",
" 8/4/1999 | \n",
" $100,000.00 | \n",
"
\n",
" \n",
" 491 | \n",
" 1882 | \n",
" MDA97299F0030 | \n",
" BASIC | \n",
" BOOZALLEN | \n",
" POLYMER MATERIALS (CONG ADD) | \n",
" 5/15/1999 | \n",
" $423,916.45 | \n",
"
\n",
" \n",
" 492 | \n",
" 1883 | \n",
" MDA97299F0031 | \n",
" BASIC | \n",
" SCHAFER | \n",
" CEROS (FENCED) | \n",
" 8/2/1999 | \n",
" $59,972.00 | \n",
"
\n",
" \n",
" 493 | \n",
" 1884 | \n",
" MDA97299F0032 | \n",
" DO | \n",
" BRADSONCOR | \n",
" ADVANCED SHIP/SENSOR SYSTEMS MRN-02 | \n",
" 8/9/1999 | \n",
" $43,425.18 | \n",
"
\n",
" \n",
" 494 | \n",
" 1885 | \n",
" MDA97299F0033 | \n",
" DO | \n",
" SYSPLANCOR | \n",
" CONTRACTS MANAGEMENT | \n",
" 8/30/1999 | \n",
" $37,075.00 | \n",
"
\n",
" \n",
" 495 | \n",
" 1886 | \n",
" MDA97299F0033 | \n",
" P00001 | \n",
" SYSPLANCOR | \n",
" CONTRACTS MANAGEMENT | \n",
" 9/13/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 496 | \n",
" 1887 | \n",
" MDA97299F0034 | \n",
" BASIC | \n",
" DIGITSYSIN | \n",
" CONTRACTS MANAGEMENT | \n",
" 8/31/1999 | \n",
" $64,755.00 | \n",
"
\n",
" \n",
" 497 | \n",
" 1888 | \n",
" MDA97299M0002 | \n",
" DO | \n",
" INFOSYSLAB | \n",
" ADVANCED GROUND SURVELLIANCE | \n",
" 3/1211999 | \n",
" $99,729.00 | \n",
"
\n",
" \n",
" 498 | \n",
" 1889 | \n",
" MDA97299M0003 | \n",
" DO | \n",
" SRC | \n",
" ADVANCED MICROELECTRONICS | \n",
" 4/14/1999 | \n",
" $10,000.00 | \n",
"
\n",
" \n",
" 501 | \n",
" 2 | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
"
\n",
" \n",
" 502 | \n",
" 1890 | \n",
" MDA97299M0004 | \n",
" DO | \n",
" ARDAK | \n",
" BW MEDICAL DIAGNOSTICS | \n",
" 3/30/1999 | \n",
" $99,970.00 | \n",
"
\n",
" \n",
" 503 | \n",
" 1891 | \n",
" MDA97299M0004 | \n",
" P00001 | \n",
" ARDAK | \n",
" BW MEDICAL DIAGNOSTICS | \n",
" 5/26/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 504 | \n",
" 1892 | \n",
" MDA97299M0004 | \n",
" P00002 | \n",
" ARDAK | \n",
" BW MEDICAL DIAGNOSTICS | \n",
" 8/4/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 505 | \n",
" 1893 | \n",
" MDA97299M0005 | \n",
" DO | \n",
" SHA | \n",
" SENSOR EMULATION | \n",
" 5/4/1999 | \n",
" $100,000.00 | \n",
"
\n",
" \n",
" 506 | \n",
" 1894 | \n",
" MDA97299M0005 | \n",
" P00001 | \n",
" SHA | \n",
" SENSOR EMULATION | \n",
" 5/12/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 507 | \n",
" 1895 | \n",
" MDA97299M0006 | \n",
" DO | \n",
" VISTARESEA | \n",
" UNDERSEA LITTORAL WARFARE | \n",
" 4/12/1999 | \n",
" $74,827.00 | \n",
"
\n",
" \n",
" 508 | \n",
" 1896 | \n",
" MDA97299M0007 | \n",
" DO | \n",
" VISUALEYES | \n",
" COMBAT CASUALTY DIAGNOSTICS:ULTRASOUND | \n",
" 5/3/1999 | \n",
" $59,500.00 | \n",
"
\n",
" \n",
" 509 | \n",
" 1897 | \n",
" MDA97299M0008 | \n",
" BASIC | \n",
" BLUE RIDGE | \n",
" OFFICE/PROGRAM SUPPORT (related to VTAX4) | \n",
" 5/11/1999 | \n",
" $48,566.00 | \n",
"
\n",
" \n",
" 510 | \n",
" 1898 | \n",
" MDA97299M0009 | \n",
" DO | \n",
" QRI | \n",
" ADVANCED SIMULATION TECH | \n",
" 6/29/1999 | \n",
" $99,494.00 | \n",
"
\n",
" \n",
" 511 | \n",
" 1899 | \n",
" MDA97299M001 0 | \n",
" DO | \n",
" PRAJAINC | \n",
" COUNTER MEASURES | \n",
" 6/14/1999 | \n",
" $80,460.00 | \n",
"
\n",
" \n",
" 512 | \n",
" 1900 | \n",
" MDA97299M0011 | \n",
" BASIC | \n",
" lVI | \n",
" COUNTER MEASURES | \n",
" 7/16/1999 | \n",
" $90,000.00 | \n",
"
\n",
" \n",
" 513 | \n",
" 1901 | \n",
" MDA97299M0012 | \n",
" BASIC | \n",
" JERRYCOOKE | \n",
" CONTRACT ADMINISTRATION | \n",
" 5/3/1999 | \n",
" $100,000.00 | \n",
"
\n",
" \n",
" 514 | \n",
" 1902 | \n",
" MDA97299M0013 | \n",
" DO | \n",
" DIAMONDBAC | \n",
" TECH INTEGRATION CENTER/TECH DEV CENTER | \n",
" 9/8/1999 | \n",
" $50,000.00 | \n",
"
\n",
" \n",
" 515 | \n",
" 1903 | \n",
" MDA9769630014 | \n",
" P00007 | \n",
" SDLINC | \n",
" SOLAR BLIND DETECTORS | \n",
" 7/9/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 516 | \n",
" 1904 | \n",
" FY SUBTOTAL: $340,495,021.94 | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
"
\n",
" \n",
" 517 | \n",
" 1905 | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
"
\n",
" \n",
"
\n",
"
497 rows × 7 columns
\n",
"
"
],
"text/plain": [
" Number CONTRACT_NUMBER CONTRACT_MOD PERFORMER \\\n",
"2 1420 1999 MDA97292J 1029 GR20 CNRI \n",
"3 1421 MDA97292J1 029 GR22 CNRI \n",
"4 1422 MDA97292J1 029 GR22 CNRI \n",
"5 1423 MDA97292J1 029 P00025 CNRI \n",
"6 1424 MDA972931 0030 P00009 GEORGIATEC \n",
"7 1425 MDA9729320014 P00017 USDISPLAYC \n",
"8 1426 MDA97293C0016 P00043 SYSPLANCOR \n",
"9 1427 MDA97294C0003 A00003 BELLATLANT \n",
"10 1428 MDA97294C0003 P00026 BELLATLANT \n",
"11 1429 MDA97294C0003 P00027 BELLATLANT \n",
"12 1430 MDA97294C0003 P00028 BELLATLANT \n",
"13 1431 MDA97294C0003 P00029 BELLATLANT \n",
"14 1432 MDA97294C0003 P00030 BELLATLANT \n",
"15 1433 MDA97294C0003 P00031 BELLATLANT \n",
"16 1434 MDA97294C0003 P00032 BELLATLANT \n",
"17 1435 MDA97294C0016 P00026 BDMFEDERAL \n",
"18 1436 MDA97294C0016 P00027 BDMFEDERAL \n",
"19 1437 MDA97294C0016 P00028 BDMFEDERAL \n",
"20 1438 MDA97294C0016 P00029 BDMFEDERAL \n",
"21 1439 MDA97294C0016 P00030 BDMFEDERAL \n",
"22 1440 MDA97294D0001 D003/P16 VRT \n",
"23 1441 MDA97294D0001 0032/3 VRT \n",
"24 1442 MDA97294D0001 003202 VALLEYELEC \n",
"25 1443 MDA972951 0016 GR03 ARIZONASTA \n",
"26 1444 MDA9729530027 P00014 BELLCORE \n",
"27 1445 MDA9729530029 A00009 PLANARAMER \n",
"28 1446 MDA9729530029 GR0008 PLANARAMER \n",
"29 1447 MDA9729530036 GR06 ITNENERGYS \n",
"30 1448 MDA9729530042 GR011 CRAYRESEAR \n",
"31 1449 MDA97295C0004 P00008 UMASS \n",
".. ... ... ... ... \n",
"486 1877 MDA97299F0025 BASIC SYSPLANCOR \n",
"487 1878 MDA97299F0027 DO ORIONSCSYS \n",
"488 1879 M DA97299F0028 DO DIGITSYSIN \n",
"489 1880 MDA97299F0028 D001 DIGITSYSIN \n",
"490 1881 MDA97299F0029 DO DTAI \n",
"491 1882 MDA97299F0030 BASIC BOOZALLEN \n",
"492 1883 MDA97299F0031 BASIC SCHAFER \n",
"493 1884 MDA97299F0032 DO BRADSONCOR \n",
"494 1885 MDA97299F0033 DO SYSPLANCOR \n",
"495 1886 MDA97299F0033 P00001 SYSPLANCOR \n",
"496 1887 MDA97299F0034 BASIC DIGITSYSIN \n",
"497 1888 MDA97299M0002 DO INFOSYSLAB \n",
"498 1889 MDA97299M0003 DO SRC \n",
"501 2 None None None \n",
"502 1890 MDA97299M0004 DO ARDAK \n",
"503 1891 MDA97299M0004 P00001 ARDAK \n",
"504 1892 MDA97299M0004 P00002 ARDAK \n",
"505 1893 MDA97299M0005 DO SHA \n",
"506 1894 MDA97299M0005 P00001 SHA \n",
"507 1895 MDA97299M0006 DO VISTARESEA \n",
"508 1896 MDA97299M0007 DO VISUALEYES \n",
"509 1897 MDA97299M0008 BASIC BLUE RIDGE \n",
"510 1898 MDA97299M0009 DO QRI \n",
"511 1899 MDA97299M001 0 DO PRAJAINC \n",
"512 1900 MDA97299M0011 BASIC lVI \n",
"513 1901 MDA97299M0012 BASIC JERRYCOOKE \n",
"514 1902 MDA97299M0013 DO DIAMONDBAC \n",
"515 1903 MDA9769630014 P00007 SDLINC \n",
"516 1904 FY SUBTOTAL: $340,495,021.94 None None \n",
"517 1905 None None None \n",
"\n",
" PROGRAM_TITLE AWARD_DATE AMOUNT \n",
"2 INFORMATION MANAGEMENT 12/10/1998 $687,000.00 \n",
"3 COMMUNICATOR 4/22/1999 $400,000.00 \n",
"4 WEBINABOX 4122/1999 $360,000.00 \n",
"5 WEBINABOX 8/24/1999 $0.00 \n",
"6 HIGH DEFINITION SYSTEMS (HDS) 1/29/1999 $1 ,210,694.00 \n",
"7 FLAT PANEL DISPLAYS 8116/1999 $5,794,000.00 \n",
"8 CHPS: Combat Hybrid Power Systems 1nt1999 $79,441.00 \n",
"9 NEXT GENERATION INTERNET 8/28/1998 $0.00 \n",
"10 NEXT GENERATION INTERNET 1/2011999 $332,197.00 \n",
"11 NEXT GENERATION INTERNET 2/4/1999 $94,750.00 \n",
"12 NEXT GENERATION INTERNET 2/22/1999 $450,000.00 \n",
"13 NEXT GENERATION INTERNET 3/1/1999 $254,750.00 \n",
"14 NEXT GENERATION INTERNET 4/1 2/1999 $0.00 \n",
"15 NEXT GENERATION INTERNET 4/1 3/1999 $254,750.00 \n",
"16 NEXT GENERATION INTERNET 9/8/1999 $254,750.00 \n",
"17 STOWACTD 2/1 2/1999 $117,000.00 \n",
"18 STOWACTD 3/1/1999 $273,000.00 \n",
"19 IMAGE UNDERSTANDING 3/2911999 $150,166.00 \n",
"20 STOWACTD 5/27/1999 $40,000.00 \n",
"21 STOWACTD 911 /1999 $55,930.00 \n",
"22 BADD 12/9/1998 $73,374.00 \n",
"23 AGILE INFO CONTROL ENVIRONMENT 2/12/1999 $100,095.00 \n",
"24 AGILE INFO CONTROL ENVIRONMENT 12/22/1998 $100,095.00 \n",
"25 VLSI PHOTONICS 3/1 5/1999 $149,984.00 \n",
"26 BROADBAND INFORMATION TECHNOLOGY 1/4/1999 $4,547,200.00 \n",
"27 HIGH DEFINITION SYSTEMS (HDS) 5/4/1999 $0.00 \n",
"28 HIGH DEFINITION SYSTEMS (HDS) 11/10/1998 $7,570,137.00 \n",
"29 PHOTOVOLTAICS (VP) 11/1 8/1998 $558,900.00 \n",
"30 SHOCC 6nt1999 $1 ,289,562.00 \n",
"31 LARGE MILLIMETER TELESCOPE 8/30/1999 $1 ,151 ,500.00 \n",
".. ... ... ... \n",
"486 COUNTER UNDERGROUND FACILITIES 6/25/1999 $251 ,924.00 \n",
"487 COUNTER MEASURES 6/11/1999 $199,991 .00 \n",
"488 CONTRACT ADMINISTRATION 7/14/1999 $90,000.00 \n",
"489 CONTRACTS MANAGEMENT 6/30/1999 $4,422.00 \n",
"490 TECH INTEGRATION CENTER/TECH DEV CENTER 8/4/1999 $100,000.00 \n",
"491 POLYMER MATERIALS (CONG ADD) 5/15/1999 $423,916.45 \n",
"492 CEROS (FENCED) 8/2/1999 $59,972.00 \n",
"493 ADVANCED SHIP/SENSOR SYSTEMS MRN-02 8/9/1999 $43,425.18 \n",
"494 CONTRACTS MANAGEMENT 8/30/1999 $37,075.00 \n",
"495 CONTRACTS MANAGEMENT 9/13/1999 $0.00 \n",
"496 CONTRACTS MANAGEMENT 8/31/1999 $64,755.00 \n",
"497 ADVANCED GROUND SURVELLIANCE 3/1211999 $99,729.00 \n",
"498 ADVANCED MICROELECTRONICS 4/14/1999 $10,000.00 \n",
"501 None None None \n",
"502 BW MEDICAL DIAGNOSTICS 3/30/1999 $99,970.00 \n",
"503 BW MEDICAL DIAGNOSTICS 5/26/1999 $0.00 \n",
"504 BW MEDICAL DIAGNOSTICS 8/4/1999 $0.00 \n",
"505 SENSOR EMULATION 5/4/1999 $100,000.00 \n",
"506 SENSOR EMULATION 5/12/1999 $0.00 \n",
"507 UNDERSEA LITTORAL WARFARE 4/12/1999 $74,827.00 \n",
"508 COMBAT CASUALTY DIAGNOSTICS:ULTRASOUND 5/3/1999 $59,500.00 \n",
"509 OFFICE/PROGRAM SUPPORT (related to VTAX4) 5/11/1999 $48,566.00 \n",
"510 ADVANCED SIMULATION TECH 6/29/1999 $99,494.00 \n",
"511 COUNTER MEASURES 6/14/1999 $80,460.00 \n",
"512 COUNTER MEASURES 7/16/1999 $90,000.00 \n",
"513 CONTRACT ADMINISTRATION 5/3/1999 $100,000.00 \n",
"514 TECH INTEGRATION CENTER/TECH DEV CENTER 9/8/1999 $50,000.00 \n",
"515 SOLAR BLIND DETECTORS 7/9/1999 $0.00 \n",
"516 None None None \n",
"517 None None None \n",
"\n",
"[497 rows x 7 columns]"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"darpa1999"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"darpa1999.loc[2][\"Number\"]=\"1420\""
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Number | \n",
" CONTRACT_NUMBER | \n",
" CONTRACT_MOD | \n",
" PERFORMER | \n",
" PROGRAM_TITLE | \n",
" AWARD_DATE | \n",
" AMOUNT | \n",
"
\n",
" \n",
" \n",
" \n",
" 2 | \n",
" 1420 | \n",
" MDA97292J 1029 | \n",
" GR20 | \n",
" CNRI | \n",
" INFORMATION MANAGEMENT | \n",
" 12/10/1998 | \n",
" $687,000.00 | \n",
"
\n",
" \n",
" 3 | \n",
" 1421 | \n",
" MDA97292J1 029 | \n",
" GR22 | \n",
" CNRI | \n",
" COMMUNICATOR | \n",
" 4/22/1999 | \n",
" $400,000.00 | \n",
"
\n",
" \n",
" 4 | \n",
" 1422 | \n",
" MDA97292J1 029 | \n",
" GR22 | \n",
" CNRI | \n",
" WEBINABOX | \n",
" 4122/1999 | \n",
" $360,000.00 | \n",
"
\n",
" \n",
" 5 | \n",
" 1423 | \n",
" MDA97292J1 029 | \n",
" P00025 | \n",
" CNRI | \n",
" WEBINABOX | \n",
" 8/24/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 6 | \n",
" 1424 | \n",
" MDA972931 0030 | \n",
" P00009 | \n",
" GEORGIATEC | \n",
" HIGH DEFINITION SYSTEMS (HDS) | \n",
" 1/29/1999 | \n",
" $1 ,210,694.00 | \n",
"
\n",
" \n",
" 7 | \n",
" 1425 | \n",
" MDA9729320014 | \n",
" P00017 | \n",
" USDISPLAYC | \n",
" FLAT PANEL DISPLAYS | \n",
" 8116/1999 | \n",
" $5,794,000.00 | \n",
"
\n",
" \n",
" 8 | \n",
" 1426 | \n",
" MDA97293C0016 | \n",
" P00043 | \n",
" SYSPLANCOR | \n",
" CHPS: Combat Hybrid Power Systems | \n",
" 1nt1999 | \n",
" $79,441.00 | \n",
"
\n",
" \n",
" 9 | \n",
" 1427 | \n",
" MDA97294C0003 | \n",
" A00003 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 8/28/1998 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 10 | \n",
" 1428 | \n",
" MDA97294C0003 | \n",
" P00026 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 1/2011999 | \n",
" $332,197.00 | \n",
"
\n",
" \n",
" 11 | \n",
" 1429 | \n",
" MDA97294C0003 | \n",
" P00027 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 2/4/1999 | \n",
" $94,750.00 | \n",
"
\n",
" \n",
" 12 | \n",
" 1430 | \n",
" MDA97294C0003 | \n",
" P00028 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 2/22/1999 | \n",
" $450,000.00 | \n",
"
\n",
" \n",
" 13 | \n",
" 1431 | \n",
" MDA97294C0003 | \n",
" P00029 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 3/1/1999 | \n",
" $254,750.00 | \n",
"
\n",
" \n",
" 14 | \n",
" 1432 | \n",
" MDA97294C0003 | \n",
" P00030 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 4/1 2/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 15 | \n",
" 1433 | \n",
" MDA97294C0003 | \n",
" P00031 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 4/1 3/1999 | \n",
" $254,750.00 | \n",
"
\n",
" \n",
" 16 | \n",
" 1434 | \n",
" MDA97294C0003 | \n",
" P00032 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 9/8/1999 | \n",
" $254,750.00 | \n",
"
\n",
" \n",
" 17 | \n",
" 1435 | \n",
" MDA97294C0016 | \n",
" P00026 | \n",
" BDMFEDERAL | \n",
" STOWACTD | \n",
" 2/1 2/1999 | \n",
" $117,000.00 | \n",
"
\n",
" \n",
" 18 | \n",
" 1436 | \n",
" MDA97294C0016 | \n",
" P00027 | \n",
" BDMFEDERAL | \n",
" STOWACTD | \n",
" 3/1/1999 | \n",
" $273,000.00 | \n",
"
\n",
" \n",
" 19 | \n",
" 1437 | \n",
" MDA97294C0016 | \n",
" P00028 | \n",
" BDMFEDERAL | \n",
" IMAGE UNDERSTANDING | \n",
" 3/2911999 | \n",
" $150,166.00 | \n",
"
\n",
" \n",
" 20 | \n",
" 1438 | \n",
" MDA97294C0016 | \n",
" P00029 | \n",
" BDMFEDERAL | \n",
" STOWACTD | \n",
" 5/27/1999 | \n",
" $40,000.00 | \n",
"
\n",
" \n",
" 21 | \n",
" 1439 | \n",
" MDA97294C0016 | \n",
" P00030 | \n",
" BDMFEDERAL | \n",
" STOWACTD | \n",
" 911 /1999 | \n",
" $55,930.00 | \n",
"
\n",
" \n",
" 22 | \n",
" 1440 | \n",
" MDA97294D0001 | \n",
" D003/P16 | \n",
" VRT | \n",
" BADD | \n",
" 12/9/1998 | \n",
" $73,374.00 | \n",
"
\n",
" \n",
" 23 | \n",
" 1441 | \n",
" MDA97294D0001 | \n",
" 0032/3 | \n",
" VRT | \n",
" AGILE INFO CONTROL ENVIRONMENT | \n",
" 2/12/1999 | \n",
" $100,095.00 | \n",
"
\n",
" \n",
" 24 | \n",
" 1442 | \n",
" MDA97294D0001 | \n",
" 003202 | \n",
" VALLEYELEC | \n",
" AGILE INFO CONTROL ENVIRONMENT | \n",
" 12/22/1998 | \n",
" $100,095.00 | \n",
"
\n",
" \n",
" 25 | \n",
" 1443 | \n",
" MDA972951 0016 | \n",
" GR03 | \n",
" ARIZONASTA | \n",
" VLSI PHOTONICS | \n",
" 3/1 5/1999 | \n",
" $149,984.00 | \n",
"
\n",
" \n",
" 26 | \n",
" 1444 | \n",
" MDA9729530027 | \n",
" P00014 | \n",
" BELLCORE | \n",
" BROADBAND INFORMATION TECHNOLOGY | \n",
" 1/4/1999 | \n",
" $4,547,200.00 | \n",
"
\n",
" \n",
" 27 | \n",
" 1445 | \n",
" MDA9729530029 | \n",
" A00009 | \n",
" PLANARAMER | \n",
" HIGH DEFINITION SYSTEMS (HDS) | \n",
" 5/4/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 28 | \n",
" 1446 | \n",
" MDA9729530029 | \n",
" GR0008 | \n",
" PLANARAMER | \n",
" HIGH DEFINITION SYSTEMS (HDS) | \n",
" 11/10/1998 | \n",
" $7,570,137.00 | \n",
"
\n",
" \n",
" 29 | \n",
" 1447 | \n",
" MDA9729530036 | \n",
" GR06 | \n",
" ITNENERGYS | \n",
" PHOTOVOLTAICS (VP) | \n",
" 11/1 8/1998 | \n",
" $558,900.00 | \n",
"
\n",
" \n",
" 30 | \n",
" 1448 | \n",
" MDA9729530042 | \n",
" GR011 | \n",
" CRAYRESEAR | \n",
" SHOCC | \n",
" 6nt1999 | \n",
" $1 ,289,562.00 | \n",
"
\n",
" \n",
" 31 | \n",
" 1449 | \n",
" MDA97295C0004 | \n",
" P00008 | \n",
" UMASS | \n",
" LARGE MILLIMETER TELESCOPE | \n",
" 8/30/1999 | \n",
" $1 ,151 ,500.00 | \n",
"
\n",
" \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
"
\n",
" \n",
" 486 | \n",
" 1877 | \n",
" MDA97299F0025 | \n",
" BASIC | \n",
" SYSPLANCOR | \n",
" COUNTER UNDERGROUND FACILITIES | \n",
" 6/25/1999 | \n",
" $251 ,924.00 | \n",
"
\n",
" \n",
" 487 | \n",
" 1878 | \n",
" MDA97299F0027 | \n",
" DO | \n",
" ORIONSCSYS | \n",
" COUNTER MEASURES | \n",
" 6/11/1999 | \n",
" $199,991 .00 | \n",
"
\n",
" \n",
" 488 | \n",
" 1879 | \n",
" M DA97299F0028 | \n",
" DO | \n",
" DIGITSYSIN | \n",
" CONTRACT ADMINISTRATION | \n",
" 7/14/1999 | \n",
" $90,000.00 | \n",
"
\n",
" \n",
" 489 | \n",
" 1880 | \n",
" MDA97299F0028 | \n",
" D001 | \n",
" DIGITSYSIN | \n",
" CONTRACTS MANAGEMENT | \n",
" 6/30/1999 | \n",
" $4,422.00 | \n",
"
\n",
" \n",
" 490 | \n",
" 1881 | \n",
" MDA97299F0029 | \n",
" DO | \n",
" DTAI | \n",
" TECH INTEGRATION CENTER/TECH DEV CENTER | \n",
" 8/4/1999 | \n",
" $100,000.00 | \n",
"
\n",
" \n",
" 491 | \n",
" 1882 | \n",
" MDA97299F0030 | \n",
" BASIC | \n",
" BOOZALLEN | \n",
" POLYMER MATERIALS (CONG ADD) | \n",
" 5/15/1999 | \n",
" $423,916.45 | \n",
"
\n",
" \n",
" 492 | \n",
" 1883 | \n",
" MDA97299F0031 | \n",
" BASIC | \n",
" SCHAFER | \n",
" CEROS (FENCED) | \n",
" 8/2/1999 | \n",
" $59,972.00 | \n",
"
\n",
" \n",
" 493 | \n",
" 1884 | \n",
" MDA97299F0032 | \n",
" DO | \n",
" BRADSONCOR | \n",
" ADVANCED SHIP/SENSOR SYSTEMS MRN-02 | \n",
" 8/9/1999 | \n",
" $43,425.18 | \n",
"
\n",
" \n",
" 494 | \n",
" 1885 | \n",
" MDA97299F0033 | \n",
" DO | \n",
" SYSPLANCOR | \n",
" CONTRACTS MANAGEMENT | \n",
" 8/30/1999 | \n",
" $37,075.00 | \n",
"
\n",
" \n",
" 495 | \n",
" 1886 | \n",
" MDA97299F0033 | \n",
" P00001 | \n",
" SYSPLANCOR | \n",
" CONTRACTS MANAGEMENT | \n",
" 9/13/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 496 | \n",
" 1887 | \n",
" MDA97299F0034 | \n",
" BASIC | \n",
" DIGITSYSIN | \n",
" CONTRACTS MANAGEMENT | \n",
" 8/31/1999 | \n",
" $64,755.00 | \n",
"
\n",
" \n",
" 497 | \n",
" 1888 | \n",
" MDA97299M0002 | \n",
" DO | \n",
" INFOSYSLAB | \n",
" ADVANCED GROUND SURVELLIANCE | \n",
" 3/1211999 | \n",
" $99,729.00 | \n",
"
\n",
" \n",
" 498 | \n",
" 1889 | \n",
" MDA97299M0003 | \n",
" DO | \n",
" SRC | \n",
" ADVANCED MICROELECTRONICS | \n",
" 4/14/1999 | \n",
" $10,000.00 | \n",
"
\n",
" \n",
" 501 | \n",
" 2 | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
"
\n",
" \n",
" 502 | \n",
" 1890 | \n",
" MDA97299M0004 | \n",
" DO | \n",
" ARDAK | \n",
" BW MEDICAL DIAGNOSTICS | \n",
" 3/30/1999 | \n",
" $99,970.00 | \n",
"
\n",
" \n",
" 503 | \n",
" 1891 | \n",
" MDA97299M0004 | \n",
" P00001 | \n",
" ARDAK | \n",
" BW MEDICAL DIAGNOSTICS | \n",
" 5/26/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 504 | \n",
" 1892 | \n",
" MDA97299M0004 | \n",
" P00002 | \n",
" ARDAK | \n",
" BW MEDICAL DIAGNOSTICS | \n",
" 8/4/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 505 | \n",
" 1893 | \n",
" MDA97299M0005 | \n",
" DO | \n",
" SHA | \n",
" SENSOR EMULATION | \n",
" 5/4/1999 | \n",
" $100,000.00 | \n",
"
\n",
" \n",
" 506 | \n",
" 1894 | \n",
" MDA97299M0005 | \n",
" P00001 | \n",
" SHA | \n",
" SENSOR EMULATION | \n",
" 5/12/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 507 | \n",
" 1895 | \n",
" MDA97299M0006 | \n",
" DO | \n",
" VISTARESEA | \n",
" UNDERSEA LITTORAL WARFARE | \n",
" 4/12/1999 | \n",
" $74,827.00 | \n",
"
\n",
" \n",
" 508 | \n",
" 1896 | \n",
" MDA97299M0007 | \n",
" DO | \n",
" VISUALEYES | \n",
" COMBAT CASUALTY DIAGNOSTICS:ULTRASOUND | \n",
" 5/3/1999 | \n",
" $59,500.00 | \n",
"
\n",
" \n",
" 509 | \n",
" 1897 | \n",
" MDA97299M0008 | \n",
" BASIC | \n",
" BLUE RIDGE | \n",
" OFFICE/PROGRAM SUPPORT (related to VTAX4) | \n",
" 5/11/1999 | \n",
" $48,566.00 | \n",
"
\n",
" \n",
" 510 | \n",
" 1898 | \n",
" MDA97299M0009 | \n",
" DO | \n",
" QRI | \n",
" ADVANCED SIMULATION TECH | \n",
" 6/29/1999 | \n",
" $99,494.00 | \n",
"
\n",
" \n",
" 511 | \n",
" 1899 | \n",
" MDA97299M001 0 | \n",
" DO | \n",
" PRAJAINC | \n",
" COUNTER MEASURES | \n",
" 6/14/1999 | \n",
" $80,460.00 | \n",
"
\n",
" \n",
" 512 | \n",
" 1900 | \n",
" MDA97299M0011 | \n",
" BASIC | \n",
" lVI | \n",
" COUNTER MEASURES | \n",
" 7/16/1999 | \n",
" $90,000.00 | \n",
"
\n",
" \n",
" 513 | \n",
" 1901 | \n",
" MDA97299M0012 | \n",
" BASIC | \n",
" JERRYCOOKE | \n",
" CONTRACT ADMINISTRATION | \n",
" 5/3/1999 | \n",
" $100,000.00 | \n",
"
\n",
" \n",
" 514 | \n",
" 1902 | \n",
" MDA97299M0013 | \n",
" DO | \n",
" DIAMONDBAC | \n",
" TECH INTEGRATION CENTER/TECH DEV CENTER | \n",
" 9/8/1999 | \n",
" $50,000.00 | \n",
"
\n",
" \n",
" 515 | \n",
" 1903 | \n",
" MDA9769630014 | \n",
" P00007 | \n",
" SDLINC | \n",
" SOLAR BLIND DETECTORS | \n",
" 7/9/1999 | \n",
" $0.00 | \n",
"
\n",
" \n",
" 516 | \n",
" 1904 | \n",
" FY SUBTOTAL: $340,495,021.94 | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
"
\n",
" \n",
" 517 | \n",
" 1905 | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
"
\n",
" \n",
"
\n",
"
497 rows × 7 columns
\n",
"
"
],
"text/plain": [
" Number CONTRACT_NUMBER CONTRACT_MOD PERFORMER \\\n",
"2 1420 MDA97292J 1029 GR20 CNRI \n",
"3 1421 MDA97292J1 029 GR22 CNRI \n",
"4 1422 MDA97292J1 029 GR22 CNRI \n",
"5 1423 MDA97292J1 029 P00025 CNRI \n",
"6 1424 MDA972931 0030 P00009 GEORGIATEC \n",
"7 1425 MDA9729320014 P00017 USDISPLAYC \n",
"8 1426 MDA97293C0016 P00043 SYSPLANCOR \n",
"9 1427 MDA97294C0003 A00003 BELLATLANT \n",
"10 1428 MDA97294C0003 P00026 BELLATLANT \n",
"11 1429 MDA97294C0003 P00027 BELLATLANT \n",
"12 1430 MDA97294C0003 P00028 BELLATLANT \n",
"13 1431 MDA97294C0003 P00029 BELLATLANT \n",
"14 1432 MDA97294C0003 P00030 BELLATLANT \n",
"15 1433 MDA97294C0003 P00031 BELLATLANT \n",
"16 1434 MDA97294C0003 P00032 BELLATLANT \n",
"17 1435 MDA97294C0016 P00026 BDMFEDERAL \n",
"18 1436 MDA97294C0016 P00027 BDMFEDERAL \n",
"19 1437 MDA97294C0016 P00028 BDMFEDERAL \n",
"20 1438 MDA97294C0016 P00029 BDMFEDERAL \n",
"21 1439 MDA97294C0016 P00030 BDMFEDERAL \n",
"22 1440 MDA97294D0001 D003/P16 VRT \n",
"23 1441 MDA97294D0001 0032/3 VRT \n",
"24 1442 MDA97294D0001 003202 VALLEYELEC \n",
"25 1443 MDA972951 0016 GR03 ARIZONASTA \n",
"26 1444 MDA9729530027 P00014 BELLCORE \n",
"27 1445 MDA9729530029 A00009 PLANARAMER \n",
"28 1446 MDA9729530029 GR0008 PLANARAMER \n",
"29 1447 MDA9729530036 GR06 ITNENERGYS \n",
"30 1448 MDA9729530042 GR011 CRAYRESEAR \n",
"31 1449 MDA97295C0004 P00008 UMASS \n",
".. ... ... ... ... \n",
"486 1877 MDA97299F0025 BASIC SYSPLANCOR \n",
"487 1878 MDA97299F0027 DO ORIONSCSYS \n",
"488 1879 M DA97299F0028 DO DIGITSYSIN \n",
"489 1880 MDA97299F0028 D001 DIGITSYSIN \n",
"490 1881 MDA97299F0029 DO DTAI \n",
"491 1882 MDA97299F0030 BASIC BOOZALLEN \n",
"492 1883 MDA97299F0031 BASIC SCHAFER \n",
"493 1884 MDA97299F0032 DO BRADSONCOR \n",
"494 1885 MDA97299F0033 DO SYSPLANCOR \n",
"495 1886 MDA97299F0033 P00001 SYSPLANCOR \n",
"496 1887 MDA97299F0034 BASIC DIGITSYSIN \n",
"497 1888 MDA97299M0002 DO INFOSYSLAB \n",
"498 1889 MDA97299M0003 DO SRC \n",
"501 2 None None None \n",
"502 1890 MDA97299M0004 DO ARDAK \n",
"503 1891 MDA97299M0004 P00001 ARDAK \n",
"504 1892 MDA97299M0004 P00002 ARDAK \n",
"505 1893 MDA97299M0005 DO SHA \n",
"506 1894 MDA97299M0005 P00001 SHA \n",
"507 1895 MDA97299M0006 DO VISTARESEA \n",
"508 1896 MDA97299M0007 DO VISUALEYES \n",
"509 1897 MDA97299M0008 BASIC BLUE RIDGE \n",
"510 1898 MDA97299M0009 DO QRI \n",
"511 1899 MDA97299M001 0 DO PRAJAINC \n",
"512 1900 MDA97299M0011 BASIC lVI \n",
"513 1901 MDA97299M0012 BASIC JERRYCOOKE \n",
"514 1902 MDA97299M0013 DO DIAMONDBAC \n",
"515 1903 MDA9769630014 P00007 SDLINC \n",
"516 1904 FY SUBTOTAL: $340,495,021.94 None None \n",
"517 1905 None None None \n",
"\n",
" PROGRAM_TITLE AWARD_DATE AMOUNT \n",
"2 INFORMATION MANAGEMENT 12/10/1998 $687,000.00 \n",
"3 COMMUNICATOR 4/22/1999 $400,000.00 \n",
"4 WEBINABOX 4122/1999 $360,000.00 \n",
"5 WEBINABOX 8/24/1999 $0.00 \n",
"6 HIGH DEFINITION SYSTEMS (HDS) 1/29/1999 $1 ,210,694.00 \n",
"7 FLAT PANEL DISPLAYS 8116/1999 $5,794,000.00 \n",
"8 CHPS: Combat Hybrid Power Systems 1nt1999 $79,441.00 \n",
"9 NEXT GENERATION INTERNET 8/28/1998 $0.00 \n",
"10 NEXT GENERATION INTERNET 1/2011999 $332,197.00 \n",
"11 NEXT GENERATION INTERNET 2/4/1999 $94,750.00 \n",
"12 NEXT GENERATION INTERNET 2/22/1999 $450,000.00 \n",
"13 NEXT GENERATION INTERNET 3/1/1999 $254,750.00 \n",
"14 NEXT GENERATION INTERNET 4/1 2/1999 $0.00 \n",
"15 NEXT GENERATION INTERNET 4/1 3/1999 $254,750.00 \n",
"16 NEXT GENERATION INTERNET 9/8/1999 $254,750.00 \n",
"17 STOWACTD 2/1 2/1999 $117,000.00 \n",
"18 STOWACTD 3/1/1999 $273,000.00 \n",
"19 IMAGE UNDERSTANDING 3/2911999 $150,166.00 \n",
"20 STOWACTD 5/27/1999 $40,000.00 \n",
"21 STOWACTD 911 /1999 $55,930.00 \n",
"22 BADD 12/9/1998 $73,374.00 \n",
"23 AGILE INFO CONTROL ENVIRONMENT 2/12/1999 $100,095.00 \n",
"24 AGILE INFO CONTROL ENVIRONMENT 12/22/1998 $100,095.00 \n",
"25 VLSI PHOTONICS 3/1 5/1999 $149,984.00 \n",
"26 BROADBAND INFORMATION TECHNOLOGY 1/4/1999 $4,547,200.00 \n",
"27 HIGH DEFINITION SYSTEMS (HDS) 5/4/1999 $0.00 \n",
"28 HIGH DEFINITION SYSTEMS (HDS) 11/10/1998 $7,570,137.00 \n",
"29 PHOTOVOLTAICS (VP) 11/1 8/1998 $558,900.00 \n",
"30 SHOCC 6nt1999 $1 ,289,562.00 \n",
"31 LARGE MILLIMETER TELESCOPE 8/30/1999 $1 ,151 ,500.00 \n",
".. ... ... ... \n",
"486 COUNTER UNDERGROUND FACILITIES 6/25/1999 $251 ,924.00 \n",
"487 COUNTER MEASURES 6/11/1999 $199,991 .00 \n",
"488 CONTRACT ADMINISTRATION 7/14/1999 $90,000.00 \n",
"489 CONTRACTS MANAGEMENT 6/30/1999 $4,422.00 \n",
"490 TECH INTEGRATION CENTER/TECH DEV CENTER 8/4/1999 $100,000.00 \n",
"491 POLYMER MATERIALS (CONG ADD) 5/15/1999 $423,916.45 \n",
"492 CEROS (FENCED) 8/2/1999 $59,972.00 \n",
"493 ADVANCED SHIP/SENSOR SYSTEMS MRN-02 8/9/1999 $43,425.18 \n",
"494 CONTRACTS MANAGEMENT 8/30/1999 $37,075.00 \n",
"495 CONTRACTS MANAGEMENT 9/13/1999 $0.00 \n",
"496 CONTRACTS MANAGEMENT 8/31/1999 $64,755.00 \n",
"497 ADVANCED GROUND SURVELLIANCE 3/1211999 $99,729.00 \n",
"498 ADVANCED MICROELECTRONICS 4/14/1999 $10,000.00 \n",
"501 None None None \n",
"502 BW MEDICAL DIAGNOSTICS 3/30/1999 $99,970.00 \n",
"503 BW MEDICAL DIAGNOSTICS 5/26/1999 $0.00 \n",
"504 BW MEDICAL DIAGNOSTICS 8/4/1999 $0.00 \n",
"505 SENSOR EMULATION 5/4/1999 $100,000.00 \n",
"506 SENSOR EMULATION 5/12/1999 $0.00 \n",
"507 UNDERSEA LITTORAL WARFARE 4/12/1999 $74,827.00 \n",
"508 COMBAT CASUALTY DIAGNOSTICS:ULTRASOUND 5/3/1999 $59,500.00 \n",
"509 OFFICE/PROGRAM SUPPORT (related to VTAX4) 5/11/1999 $48,566.00 \n",
"510 ADVANCED SIMULATION TECH 6/29/1999 $99,494.00 \n",
"511 COUNTER MEASURES 6/14/1999 $80,460.00 \n",
"512 COUNTER MEASURES 7/16/1999 $90,000.00 \n",
"513 CONTRACT ADMINISTRATION 5/3/1999 $100,000.00 \n",
"514 TECH INTEGRATION CENTER/TECH DEV CENTER 9/8/1999 $50,000.00 \n",
"515 SOLAR BLIND DETECTORS 7/9/1999 $0.00 \n",
"516 None None None \n",
"517 None None None \n",
"\n",
"[497 rows x 7 columns]"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"darpa1999"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's get rid of those yucky last two rows."
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
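"# .copy() makes an independent DataFrame, so later column assignments don't trigger SettingWithCopyWarning\n",
"darpa1999=darpa1999[:-2].copy()"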
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import re"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### `re.sub` works like find and replace all\n",
"`sub` is short for substitute\n",
"\n",
"`re.sub(\"REGULAR EXPRESSION\", REPLACEMENT, STRING)`"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
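"## Series.map(f) calls f on each value in a column and returns a new Series of the results.\n",
"## Below, f is a lambda that runs re.sub on each AMOUNT string."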
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"collapsed": false,
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"2 687000.00\n",
"3 400000.00\n",
"4 360000.00\n",
"5 0.00\n",
"6 1210694.00\n",
"7 5794000.00\n",
"8 79441.00\n",
"9 0.00\n",
"10 332197.00\n",
"11 94750.00\n",
"12 450000.00\n",
"13 254750.00\n",
"14 0.00\n",
"15 254750.00\n",
"16 254750.00\n",
"17 117000.00\n",
"18 273000.00\n",
"19 150166.00\n",
"20 40000.00\n",
"21 55930.00\n",
"22 73374.00\n",
"23 100095.00\n",
"24 100095.00\n",
"25 149984.00\n",
"26 4547200.00\n",
"27 0.00\n",
"28 7570137.00\n",
"29 558900.00\n",
"30 1289562.00\n",
"31 1151500.00\n",
" ... \n",
"484 40000.00\n",
"485 117000.00\n",
"486 251924.00\n",
"487 199991.00\n",
"488 90000.00\n",
"489 4422.00\n",
"490 100000.00\n",
"491 423916.45\n",
"492 59972.00\n",
"493 43425.18\n",
"494 37075.00\n",
"495 0.00\n",
"496 64755.00\n",
"497 99729.00\n",
"498 10000.00\n",
"501 \n",
"502 99970.00\n",
"503 0.00\n",
"504 0.00\n",
"505 100000.00\n",
"506 0.00\n",
"507 74827.00\n",
"508 59500.00\n",
"509 48566.00\n",
"510 99494.00\n",
"511 80460.00\n",
"512 90000.00\n",
"513 100000.00\n",
"514 50000.00\n",
"515 0.00\n",
"Name: AMOUNT, dtype: object"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"darpa1999[\"AMOUNT\"].astype(\"str\").map(lambda x: re.sub(\"[^\\d\\.\\(\\)]\", \"\", x))"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/mljones/anaconda/lib/python2.7/site-packages/IPython/kernel/__main__.py:4: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy\n"
]
}
],
"source": [
"## First get rid of everything that is not a numeral, a period, or a `(`.\n",
"## Note: for non-Anglo-American sources, you'll need to get rid of periods, not commas.\n",
"\n",
"darpa1999[\"AMOUNT\"]=darpa1999[\"AMOUNT\"].astype(\"str\").map(lambda x: re.sub(\"[^\\d\\.\\(]\", \"\", x))\n",
"\n",
"# [^\\d\\.\\(] matches any character except a digit, \".\", or \"(\"\n",
"\n",
"## for European style, you'd use `re.sub(\"[^\\d\\,\\(]\", \"\", x)`\n"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array(['687000.00', '400000.00', '360000.00', '0.00', '1210694.00',\n",
" '5794000.00', '79441.00', '0.00', '332197.00', '94750.00',\n",
" '450000.00', '254750.00', '0.00', '254750.00', '254750.00',\n",
" '117000.00', '273000.00', '150166.00', '40000.00', '55930.00',\n",
" '73374.00', '100095.00', '100095.00', '149984.00', '4547200.00',\n",
" '0.00', '7570137.00', '558900.00', '1289562.00', '1151500.00',\n",
" '498383.00', '2000000.00', '290000.00', '1516319.00', '210500.00',\n",
" '210500.00', '0.00', '490071.00', '48000.00', '1321000.00',\n",
" '423212.00', '3417.00', '5286.00', '0.00', '(109494.00', '65572.00',\n",
" '(10000.00', '', '60000.00', '566045.00', '0.00', '1926000.00',\n",
" '808440.00', '244629.00', '0.00', '250000.00', '0.00', '810734.00',\n",
" '1000000.00', '2790100.00', '3497712.00', '300000.00', '0.00',\n",
" '1606470.00', '874.00', '0.00', '0.00', '450000.00', '97000.00',\n",
" '0.00', '50000.00', '333929.00', '97000.00', '0.00', '100000.00',\n",
" '0.00', '500000.00', '0.00', '599878.00', '980000.00', '70000.00',\n",
" '30000.00', '0.00', '2100000.00', '0.00', '0.00', '0.00',\n",
" '921400.00', '100000.00', '1790000.00', '267500.00', '0.00',\n",
" '91071.00', '40056.00', '0.00', '', '741000.00', '300000.00',\n",
" '30000.00', '900000.00', '1950000.00', '499052.00', '10523.00',\n",
" '35528.00', '49905.00', '52158.00', '51861.00', '800000.00', '0.00',\n",
" '0.00', '365396.00', '500000.00', '400000.00', '300000.00', '0.00',\n",
" '3474136.00', '3400000.00', '1380000.00', '162618.00', '0.00',\n",
" '0.00', '150000.00', '15000.00', '45000.00', '0.00', '205000.00',\n",
" '2897000.00', '503630.00', '0.00', '395320.00', '0.00',\n",
" '6510000.00', '4600545.00', '(402700.00', '2700000.00', '898000.00',\n",
" '16000.00', '650000.00', '0.00', '0.00', '9901798.00', '0.00',\n",
" '20000.00', '', '55000.00', '750000.00', '0.00', '345372.00',\n",
" '0.00', '209431.00', '500000.00', '290569.00', '1188000.00',\n",
" '200000.00', '400795.00', '176482.00', '350000.00', '117420.00',\n",
" '430533.00', '912000.00', '500000.00', '480975.00', '0.00',\n",
" '105408.00', '386493.00', '107142.00', '392858.00', '199309.00',\n",
" '2610132.00', '150000.00', '103950.00', '119950.00', '200000.00',\n",
" '0.00', '2130000.00', '1000000.00', '9250000.00', '30950.00',\n",
" '174798.00', '0.00', '0.00', '199899.00', '550000.00', '350000.00',\n",
" '0.00', '200000.00', '149999.00', '100000.00', '258264.00',\n",
" '327206.00', '93780.00', '', '102706.00', '0.00', '0.00',\n",
" '2135000.00', '3750000.00', '100000.00', '0.00', '2200000.00',\n",
" '144000.00', '4384000.00', '0.00', '1785000.00', '1700000.00',\n",
" '0.00', '0.00', '56000.00', '750000.00', '1853441.00', '2036559.00',\n",
" '750000.00', '20526.00', '184428.00', '256638.00', '321768.00',\n",
" '204435.00', '43650.00', '350000.00', '735760.00', '1819008.00',\n",
" '1500000.00', '0.00', '0.00', '7925688.00', '0.00', '800000.00',\n",
" '650000.00', '186083.00', '190842.00', '(500000.00', '6565000.00',\n",
" '0.00', '874010.00', '0.00', '518453.00', '0.00', '847010.00',\n",
" '27000.00', '', '0.00', '3660000.00', '5600000.00', '5600000.00',\n",
" '5600000.00', '5000000.00', '0.00', '5000000.00', '1427526.00',\n",
" '243774.00', '0.00', '0.00', '1064928.00', '556226.00', '867000.00',\n",
" '(800000.00', '200000.00', '769226.00', '1650000.00', '1915045.00',\n",
" '1642515.00', '800000.00', '370416.00', '900000.00', '100000.00',\n",
" '', '', '', '3333000.00', '', '4082000.00', '1515000.00',\n",
" '687927.00', '116667.00', '583333.00', '143433.00', '349909.00',\n",
" '2053443.00', '108700.00', '300000.00', '391300.00', '0.00',\n",
" '6119332.00', '0.00', '1490998.00', '26658.00', '114281.00', '',\n",
" '39415.00', '700000.00', '1293841.00', '0.00', '302233.00', '0.00',\n",
" '199789.00', '247900.00', '0.00', '1200000.00', '124983.00', '0.00',\n",
" '0.00', '294996.00', '244295.00', '0.00', '76655.00', '0.00',\n",
" '0.00', '413864.00', '324929.00', '79167.00', '194148.00',\n",
" '375000.00', '2000000.00', '(120000.00', '', '2000000.00', '0.00',\n",
" '2586066.00', '100000.00', '100000.00', '50000.00', '100000.00',\n",
" '156000.00', '0.00', '50000.00', '0.00', '100000.00', '450000.00',\n",
" '50000.00', '250000.00', '850000.00', '200000.00', '380492.00',\n",
" '155992.00', '130000.00', '', '210000.00', '355947.00', '80000.00',\n",
" '32839.00', '79846.00', '97966.00', '392856.00', '70778.00',\n",
" '24864.00', '395859.00', '124999.00', '1800000.00', '',\n",
" '5883520.00', '1169500.00', '', '698000.00', '3075000.00',\n",
" '3075000.00', '2311497.00', '500000.00', '200000.00', '8826140.00',\n",
" '0.00', '0.00', '95524.00', '25000.00', '400000.00', '150000.00',\n",
" '100000.00', '65000.00', '68000.00', '771805.00', '', '222649.00',\n",
" '0.00', '242880.00', '50000.00', '3282970.00', '0.00', '365731.00',\n",
" '(365731.00', '200000.00', '195867.00', '417065.00', '666262.00',\n",
" '493980.00', '', '374990.00', '833281.00', '357869.00', '95943.00',\n",
" '0.00', '100000.00', '299997.00', '500000.00', '0.00', '400000.00',\n",
" '114000.00', '0.00', '1000000.00', '499927.00', '497.00',\n",
" '218000.00', '176491.00', '0.00', '0.00', '273501.00', '0.00',\n",
" '100000.00', '0.00', '594401.00', '0.00', '0.00', '0.00', '0.00',\n",
" '0.00', '200000.00', '200000.00', '250000.00', '534702.00', '',\n",
" '31569.00', '124928.00', '0.00', '168421.00', '80000.00',\n",
" '60669.00', '140000.00', '40000.00', '20000.00', '82109.00',\n",
" '60000.00', '104539.00', '70000.00', '', '', '59000.00',\n",
" '312360.00', '290580.00', '332000.00', '320000.00', '78000.00',\n",
" '285000.00', '0.00', '31697.37', '114950.00', '0.00', '393486.00',\n",
" '210262.00', '300000.00', '329987.00', '500386.00', '210262.00',\n",
" '129850.00', '119732.00', '119732.00', '0.00', '81772.38',\n",
" '73096.56', '789358.00', '49959.00', '100000.00', '306000.00',\n",
" '306000.00', '600000.00', '75000.00', '183000.00', '179993.00',\n",
" '40000.00', '117000.00', '251924.00', '199991.00', '90000.00',\n",
" '4422.00', '100000.00', '423916.45', '59972.00', '43425.18',\n",
" '37075.00', '0.00', '64755.00', '99729.00', '10000.00', '',\n",
" '99970.00', '0.00', '0.00', '100000.00', '0.00', '74827.00',\n",
" '59500.00', '48566.00', '99494.00', '80460.00', '90000.00',\n",
" '100000.00', '50000.00', '0.00'], dtype=object)"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"darpa1999[\"AMOUNT\"].values"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/mljones/anaconda/lib/python2.7/site-packages/IPython/kernel/__main__.py:3: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy\n",
" app.launch_new_instance()\n"
]
}
],
"source": [
"# turn each ( into a minus sign: parentheses mark negative amounts in accounting notation\n",
"\n",
"darpa1999[\"AMOUNT\"]=darpa1999[\"AMOUNT\"].astype(\"str\").map(lambda x: re.sub(\"[\\(]\", \"-\", x))"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"2 687000.00\n",
"3 400000.00\n",
"4 360000.00\n",
"5 0.00\n",
"6 1210694.00\n",
"7 5794000.00\n",
"8 79441.00\n",
"9 0.00\n",
"10 332197.00\n",
"11 94750.00\n",
"12 450000.00\n",
"13 254750.00\n",
"14 0.00\n",
"15 254750.00\n",
"16 254750.00\n",
"17 117000.00\n",
"18 273000.00\n",
"19 150166.00\n",
"20 40000.00\n",
"21 55930.00\n",
"22 73374.00\n",
"23 100095.00\n",
"24 100095.00\n",
"25 149984.00\n",
"26 4547200.00\n",
"27 0.00\n",
"28 7570137.00\n",
"29 558900.00\n",
"30 1289562.00\n",
"31 1151500.00\n",
" ... \n",
"484 40000.00\n",
"485 117000.00\n",
"486 251924.00\n",
"487 199991.00\n",
"488 90000.00\n",
"489 4422.00\n",
"490 100000.00\n",
"491 423916.45\n",
"492 59972.00\n",
"493 43425.18\n",
"494 37075.00\n",
"495 0.00\n",
"496 64755.00\n",
"497 99729.00\n",
"498 10000.00\n",
"501 NaN\n",
"502 99970.00\n",
"503 0.00\n",
"504 0.00\n",
"505 100000.00\n",
"506 0.00\n",
"507 74827.00\n",
"508 59500.00\n",
"509 48566.00\n",
"510 99494.00\n",
"511 80460.00\n",
"512 90000.00\n",
"513 100000.00\n",
"514 50000.00\n",
"515 0.00\n",
"Name: AMOUNT, dtype: float64"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# finally, convert the strings into numbers.\n",
"# pd.to_numeric does the trick; it replaces convert_objects, which was removed in pandas 1.0.\n",
"pd.to_numeric(darpa1999[\"AMOUNT\"], errors=\"coerce\")"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/mljones/anaconda/lib/python2.7/site-packages/IPython/kernel/__main__.py:1: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy\n",
" if __name__ == '__main__':\n"
]
}
],
"source": [
"darpa1999[\"AMOUNT\"]=pd.to_numeric(darpa1999[\"AMOUNT\"], errors=\"coerce\")"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Number | \n",
" CONTRACT_NUMBER | \n",
" CONTRACT_MOD | \n",
" PERFORMER | \n",
" PROGRAM_TITLE | \n",
" AWARD_DATE | \n",
" AMOUNT | \n",
"
\n",
" \n",
" \n",
" \n",
" 2 | \n",
" 1420 | \n",
" MDA97292J 1029 | \n",
" GR20 | \n",
" CNRI | \n",
" INFORMATION MANAGEMENT | \n",
" 12/10/1998 | \n",
" 687000.00 | \n",
"
\n",
" \n",
" 3 | \n",
" 1421 | \n",
" MDA97292J1 029 | \n",
" GR22 | \n",
" CNRI | \n",
" COMMUNICATOR | \n",
" 4/22/1999 | \n",
" 400000.00 | \n",
"
\n",
" \n",
" 4 | \n",
" 1422 | \n",
" MDA97292J1 029 | \n",
" GR22 | \n",
" CNRI | \n",
" WEBINABOX | \n",
" 4122/1999 | \n",
" 360000.00 | \n",
"
\n",
" \n",
" 5 | \n",
" 1423 | \n",
" MDA97292J1 029 | \n",
" P00025 | \n",
" CNRI | \n",
" WEBINABOX | \n",
" 8/24/1999 | \n",
" 0.00 | \n",
"
\n",
" \n",
" 6 | \n",
" 1424 | \n",
" MDA972931 0030 | \n",
" P00009 | \n",
" GEORGIATEC | \n",
" HIGH DEFINITION SYSTEMS (HDS) | \n",
" 1/29/1999 | \n",
" 1210694.00 | \n",
"
\n",
" \n",
" 7 | \n",
" 1425 | \n",
" MDA9729320014 | \n",
" P00017 | \n",
" USDISPLAYC | \n",
" FLAT PANEL DISPLAYS | \n",
" 8116/1999 | \n",
" 5794000.00 | \n",
"
\n",
" \n",
" 8 | \n",
" 1426 | \n",
" MDA97293C0016 | \n",
" P00043 | \n",
" SYSPLANCOR | \n",
" CHPS: Combat Hybrid Power Systems | \n",
" 1nt1999 | \n",
" 79441.00 | \n",
"
\n",
" \n",
" 9 | \n",
" 1427 | \n",
" MDA97294C0003 | \n",
" A00003 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 8/28/1998 | \n",
" 0.00 | \n",
"
\n",
" \n",
" 10 | \n",
" 1428 | \n",
" MDA97294C0003 | \n",
" P00026 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 1/2011999 | \n",
" 332197.00 | \n",
"
\n",
" \n",
" 11 | \n",
" 1429 | \n",
" MDA97294C0003 | \n",
" P00027 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 2/4/1999 | \n",
" 94750.00 | \n",
"
\n",
" \n",
" 12 | \n",
" 1430 | \n",
" MDA97294C0003 | \n",
" P00028 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 2/22/1999 | \n",
" 450000.00 | \n",
"
\n",
" \n",
" 13 | \n",
" 1431 | \n",
" MDA97294C0003 | \n",
" P00029 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 3/1/1999 | \n",
" 254750.00 | \n",
"
\n",
" \n",
" 14 | \n",
" 1432 | \n",
" MDA97294C0003 | \n",
" P00030 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 4/1 2/1999 | \n",
" 0.00 | \n",
"
\n",
" \n",
" 15 | \n",
" 1433 | \n",
" MDA97294C0003 | \n",
" P00031 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 4/1 3/1999 | \n",
" 254750.00 | \n",
"
\n",
" \n",
" 16 | \n",
" 1434 | \n",
" MDA97294C0003 | \n",
" P00032 | \n",
" BELLATLANT | \n",
" NEXT GENERATION INTERNET | \n",
" 9/8/1999 | \n",
" 254750.00 | \n",
"
\n",
" \n",
" 17 | \n",
" 1435 | \n",
" MDA97294C0016 | \n",
" P00026 | \n",
" BDMFEDERAL | \n",
" STOWACTD | \n",
" 2/1 2/1999 | \n",
" 117000.00 | \n",
"
\n",
" \n",
" 18 | \n",
" 1436 | \n",
" MDA97294C0016 | \n",
" P00027 | \n",
" BDMFEDERAL | \n",
" STOWACTD | \n",
" 3/1/1999 | \n",
" 273000.00 | \n",
"
\n",
" \n",
" 19 | \n",
" 1437 | \n",
" MDA97294C0016 | \n",
" P00028 | \n",
" BDMFEDERAL | \n",
" IMAGE UNDERSTANDING | \n",
" 3/2911999 | \n",
" 150166.00 | \n",
"
\n",
" \n",
" 20 | \n",
" 1438 | \n",
" MDA97294C0016 | \n",
" P00029 | \n",
" BDMFEDERAL | \n",
" STOWACTD | \n",
" 5/27/1999 | \n",
" 40000.00 | \n",
"
\n",
" \n",
" 21 | \n",
" 1439 | \n",
" MDA97294C0016 | \n",
" P00030 | \n",
" BDMFEDERAL | \n",
" STOWACTD | \n",
" 911 /1999 | \n",
" 55930.00 | \n",
"
\n",
" \n",
" 22 | \n",
" 1440 | \n",
" MDA97294D0001 | \n",
" D003/P16 | \n",
" VRT | \n",
" BADD | \n",
" 12/9/1998 | \n",
" 73374.00 | \n",
"
\n",
" \n",
" 23 | \n",
" 1441 | \n",
" MDA97294D0001 | \n",
" 0032/3 | \n",
" VRT | \n",
" AGILE INFO CONTROL ENVIRONMENT | \n",
" 2/12/1999 | \n",
" 100095.00 | \n",
"
\n",
" \n",
" 24 | \n",
" 1442 | \n",
" MDA97294D0001 | \n",
" 003202 | \n",
" VALLEYELEC | \n",
" AGILE INFO CONTROL ENVIRONMENT | \n",
" 12/22/1998 | \n",
" 100095.00 | \n",
"
\n",
" \n",
" 25 | \n",
" 1443 | \n",
" MDA972951 0016 | \n",
" GR03 | \n",
" ARIZONASTA | \n",
" VLSI PHOTONICS | \n",
" 3/1 5/1999 | \n",
" 149984.00 | \n",
"
\n",
" \n",
" 26 | \n",
" 1444 | \n",
" MDA9729530027 | \n",
" P00014 | \n",
" BELLCORE | \n",
" BROADBAND INFORMATION TECHNOLOGY | \n",
" 1/4/1999 | \n",
" 4547200.00 | \n",
"
\n",
" \n",
" 27 | \n",
" 1445 | \n",
" MDA9729530029 | \n",
" A00009 | \n",
" PLANARAMER | \n",
" HIGH DEFINITION SYSTEMS (HDS) | \n",
" 5/4/1999 | \n",
" 0.00 | \n",
"
\n",
" \n",
" 28 | \n",
" 1446 | \n",
" MDA9729530029 | \n",
" GR0008 | \n",
" PLANARAMER | \n",
" HIGH DEFINITION SYSTEMS (HDS) | \n",
" 11/10/1998 | \n",
" 7570137.00 | \n",
"
\n",
" \n",
" 29 | \n",
" 1447 | \n",
" MDA9729530036 | \n",
" GR06 | \n",
" ITNENERGYS | \n",
" PHOTOVOLTAICS (VP) | \n",
" 11/1 8/1998 | \n",
" 558900.00 | \n",
"
\n",
" \n",
" 30 | \n",
" 1448 | \n",
" MDA9729530042 | \n",
" GR011 | \n",
" CRAYRESEAR | \n",
" SHOCC | \n",
" 6nt1999 | \n",
" 1289562.00 | \n",
"
\n",
" \n",
" 31 | \n",
" 1449 | \n",
" MDA97295C0004 | \n",
" P00008 | \n",
" UMASS | \n",
" LARGE MILLIMETER TELESCOPE | \n",
" 8/30/1999 | \n",
" 1151500.00 | \n",
"
\n",
" \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
"
\n",
" \n",
" 484 | \n",
" 1875 | \n",
" MDA97299F0024 | \n",
" BASIC | \n",
" GRCI | \n",
" OFFICE/PROGRAM SUPPORT (RELATED TO VSEE8) | \n",
" 3/1711999 | \n",
" 40000.00 | \n",
"
\n",
" \n",
" 485 | \n",
" 1876 | \n",
" MDA97299F0024 | \n",
" BASIC | \n",
" GRCI | \n",
" WARFIGHTERS INTERNET | \n",
" 3/17/1999 | \n",
" 117000.00 | \n",
"
\n",
" \n",
" 486 | \n",
" 1877 | \n",
" MDA97299F0025 | \n",
" BASIC | \n",
" SYSPLANCOR | \n",
" COUNTER UNDERGROUND FACILITIES | \n",
" 6/25/1999 | \n",
" 251924.00 | \n",
"
\n",
" \n",
" 487 | \n",
" 1878 | \n",
" MDA97299F0027 | \n",
" DO | \n",
" ORIONSCSYS | \n",
" COUNTER MEASURES | \n",
" 6/11/1999 | \n",
" 199991.00 | \n",
"
\n",
" \n",
" 488 | \n",
" 1879 | \n",
" M DA97299F0028 | \n",
" DO | \n",
" DIGITSYSIN | \n",
" CONTRACT ADMINISTRATION | \n",
" 7/14/1999 | \n",
" 90000.00 | \n",
"
\n",
" \n",
" 489 | \n",
" 1880 | \n",
" MDA97299F0028 | \n",
" D001 | \n",
" DIGITSYSIN | \n",
" CONTRACTS MANAGEMENT | \n",
" 6/30/1999 | \n",
" 4422.00 | \n",
"
\n",
" \n",
" 490 | \n",
" 1881 | \n",
" MDA97299F0029 | \n",
" DO | \n",
" DTAI | \n",
" TECH INTEGRATION CENTER/TECH DEV CENTER | \n",
" 8/4/1999 | \n",
" 100000.00 | \n",
"
\n",
" \n",
" 491 | \n",
" 1882 | \n",
" MDA97299F0030 | \n",
" BASIC | \n",
" BOOZALLEN | \n",
" POLYMER MATERIALS (CONG ADD) | \n",
" 5/15/1999 | \n",
" 423916.45 | \n",
"
\n",
" \n",
" 492 | \n",
" 1883 | \n",
" MDA97299F0031 | \n",
" BASIC | \n",
" SCHAFER | \n",
" CEROS (FENCED) | \n",
" 8/2/1999 | \n",
" 59972.00 | \n",
"
\n",
" \n",
" 493 | \n",
" 1884 | \n",
" MDA97299F0032 | \n",
" DO | \n",
" BRADSONCOR | \n",
" ADVANCED SHIP/SENSOR SYSTEMS MRN-02 | \n",
" 8/9/1999 | \n",
" 43425.18 | \n",
"
\n",
" \n",
" 494 | \n",
" 1885 | \n",
" MDA97299F0033 | \n",
" DO | \n",
" SYSPLANCOR | \n",
" CONTRACTS MANAGEMENT | \n",
" 8/30/1999 | \n",
" 37075.00 | \n",
"
\n",
" \n",
" 495 | \n",
" 1886 | \n",
" MDA97299F0033 | \n",
" P00001 | \n",
" SYSPLANCOR | \n",
" CONTRACTS MANAGEMENT | \n",
" 9/13/1999 | \n",
" 0.00 | \n",
"
\n",
" \n",
" 496 | \n",
" 1887 | \n",
" MDA97299F0034 | \n",
" BASIC | \n",
" DIGITSYSIN | \n",
" CONTRACTS MANAGEMENT | \n",
" 8/31/1999 | \n",
" 64755.00 | \n",
"
\n",
" \n",
" 497 | \n",
" 1888 | \n",
" MDA97299M0002 | \n",
" DO | \n",
" INFOSYSLAB | \n",
" ADVANCED GROUND SURVELLIANCE | \n",
" 3/1211999 | \n",
" 99729.00 | \n",
"
\n",
" \n",
" 498 | \n",
" 1889 | \n",
" MDA97299M0003 | \n",
" DO | \n",
" SRC | \n",
" ADVANCED MICROELECTRONICS | \n",
" 4/14/1999 | \n",
" 10000.00 | \n",
"
\n",
" \n",
" 501 | \n",
" 2 | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" NaN | \n",
"
\n",
" \n",
" 502 | \n",
" 1890 | \n",
" MDA97299M0004 | \n",
" DO | \n",
" ARDAK | \n",
" BW MEDICAL DIAGNOSTICS | \n",
" 3/30/1999 | \n",
" 99970.00 | \n",
"
\n",
" \n",
" 503 | \n",
" 1891 | \n",
" MDA97299M0004 | \n",
" P00001 | \n",
" ARDAK | \n",
" BW MEDICAL DIAGNOSTICS | \n",
" 5/26/1999 | \n",
" 0.00 | \n",
"
\n",
" \n",
" 504 | \n",
" 1892 | \n",
" MDA97299M0004 | \n",
" P00002 | \n",
" ARDAK | \n",
" BW MEDICAL DIAGNOSTICS | \n",
" 8/4/1999 | \n",
" 0.00 | \n",
"
\n",
" \n",
" 505 | \n",
" 1893 | \n",
" MDA97299M0005 | \n",
" DO | \n",
" SHA | \n",
" SENSOR EMULATION | \n",
" 5/4/1999 | \n",
" 100000.00 | \n",
"
\n",
" \n",
" 506 | \n",
" 1894 | \n",
" MDA97299M0005 | \n",
" P00001 | \n",
" SHA | \n",
" SENSOR EMULATION | \n",
" 5/12/1999 | \n",
" 0.00 | \n",
"
\n",
" \n",
" 507 | \n",
" 1895 | \n",
" MDA97299M0006 | \n",
" DO | \n",
" VISTARESEA | \n",
" UNDERSEA LITTORAL WARFARE | \n",
" 4/12/1999 | \n",
" 74827.00 | \n",
"
\n",
" \n",
" 508 | \n",
" 1896 | \n",
" MDA97299M0007 | \n",
" DO | \n",
" VISUALEYES | \n",
" COMBAT CASUALTY DIAGNOSTICS:ULTRASOUND | \n",
" 5/3/1999 | \n",
" 59500.00 | \n",
"
\n",
" \n",
" 509 | \n",
" 1897 | \n",
" MDA97299M0008 | \n",
" BASIC | \n",
" BLUE RIDGE | \n",
" OFFICE/PROGRAM SUPPORT (related to VTAX4) | \n",
" 5/11/1999 | \n",
" 48566.00 | \n",
"
\n",
" \n",
" 510 | \n",
" 1898 | \n",
" MDA97299M0009 | \n",
" DO | \n",
" QRI | \n",
" ADVANCED SIMULATION TECH | \n",
" 6/29/1999 | \n",
" 99494.00 | \n",
"
\n",
" \n",
" 511 | \n",
" 1899 | \n",
" MDA97299M001 0 | \n",
" DO | \n",
" PRAJAINC | \n",
" COUNTER MEASURES | \n",
" 6/14/1999 | \n",
" 80460.00 | \n",
"
\n",
" \n",
" 512 | \n",
" 1900 | \n",
" MDA97299M0011 | \n",
" BASIC | \n",
" lVI | \n",
" COUNTER MEASURES | \n",
" 7/16/1999 | \n",
" 90000.00 | \n",
"
\n",
" \n",
" 513 | \n",
" 1901 | \n",
" MDA97299M0012 | \n",
" BASIC | \n",
" JERRYCOOKE | \n",
" CONTRACT ADMINISTRATION | \n",
" 5/3/1999 | \n",
" 100000.00 | \n",
"
\n",
" \n",
" 514 | \n",
" 1902 | \n",
" MDA97299M0013 | \n",
" DO | \n",
" DIAMONDBAC | \n",
" TECH INTEGRATION CENTER/TECH DEV CENTER | \n",
" 9/8/1999 | \n",
" 50000.00 | \n",
"
\n",
" \n",
" 515 | \n",
" 1903 | \n",
" MDA9769630014 | \n",
" P00007 | \n",
" SDLINC | \n",
" SOLAR BLIND DETECTORS | \n",
" 7/9/1999 | \n",
" 0.00 | \n",
"
\n",
" \n",
"
\n",
"
495 rows × 7 columns
\n",
"
"
],
"text/plain": [
" Number CONTRACT_NUMBER CONTRACT_MOD PERFORMER \\\n",
"2 1420 MDA97292J 1029 GR20 CNRI \n",
"3 1421 MDA97292J1 029 GR22 CNRI \n",
"4 1422 MDA97292J1 029 GR22 CNRI \n",
"5 1423 MDA97292J1 029 P00025 CNRI \n",
"6 1424 MDA972931 0030 P00009 GEORGIATEC \n",
"7 1425 MDA9729320014 P00017 USDISPLAYC \n",
"8 1426 MDA97293C0016 P00043 SYSPLANCOR \n",
"9 1427 MDA97294C0003 A00003 BELLATLANT \n",
"10 1428 MDA97294C0003 P00026 BELLATLANT \n",
"11 1429 MDA97294C0003 P00027 BELLATLANT \n",
"12 1430 MDA97294C0003 P00028 BELLATLANT \n",
"13 1431 MDA97294C0003 P00029 BELLATLANT \n",
"14 1432 MDA97294C0003 P00030 BELLATLANT \n",
"15 1433 MDA97294C0003 P00031 BELLATLANT \n",
"16 1434 MDA97294C0003 P00032 BELLATLANT \n",
"17 1435 MDA97294C0016 P00026 BDMFEDERAL \n",
"18 1436 MDA97294C0016 P00027 BDMFEDERAL \n",
"19 1437 MDA97294C0016 P00028 BDMFEDERAL \n",
"20 1438 MDA97294C0016 P00029 BDMFEDERAL \n",
"21 1439 MDA97294C0016 P00030 BDMFEDERAL \n",
"22 1440 MDA97294D0001 D003/P16 VRT \n",
"23 1441 MDA97294D0001 0032/3 VRT \n",
"24 1442 MDA97294D0001 003202 VALLEYELEC \n",
"25 1443 MDA972951 0016 GR03 ARIZONASTA \n",
"26 1444 MDA9729530027 P00014 BELLCORE \n",
"27 1445 MDA9729530029 A00009 PLANARAMER \n",
"28 1446 MDA9729530029 GR0008 PLANARAMER \n",
"29 1447 MDA9729530036 GR06 ITNENERGYS \n",
"30 1448 MDA9729530042 GR011 CRAYRESEAR \n",
"31 1449 MDA97295C0004 P00008 UMASS \n",
".. ... ... ... ... \n",
"484 1875 MDA97299F0024 BASIC GRCI \n",
"485 1876 MDA97299F0024 BASIC GRCI \n",
"486 1877 MDA97299F0025 BASIC SYSPLANCOR \n",
"487 1878 MDA97299F0027 DO ORIONSCSYS \n",
"488 1879 M DA97299F0028 DO DIGITSYSIN \n",
"489 1880 MDA97299F0028 D001 DIGITSYSIN \n",
"490 1881 MDA97299F0029 DO DTAI \n",
"491 1882 MDA97299F0030 BASIC BOOZALLEN \n",
"492 1883 MDA97299F0031 BASIC SCHAFER \n",
"493 1884 MDA97299F0032 DO BRADSONCOR \n",
"494 1885 MDA97299F0033 DO SYSPLANCOR \n",
"495 1886 MDA97299F0033 P00001 SYSPLANCOR \n",
"496 1887 MDA97299F0034 BASIC DIGITSYSIN \n",
"497 1888 MDA97299M0002 DO INFOSYSLAB \n",
"498 1889 MDA97299M0003 DO SRC \n",
"501 2 None None None \n",
"502 1890 MDA97299M0004 DO ARDAK \n",
"503 1891 MDA97299M0004 P00001 ARDAK \n",
"504 1892 MDA97299M0004 P00002 ARDAK \n",
"505 1893 MDA97299M0005 DO SHA \n",
"506 1894 MDA97299M0005 P00001 SHA \n",
"507 1895 MDA97299M0006 DO VISTARESEA \n",
"508 1896 MDA97299M0007 DO VISUALEYES \n",
"509 1897 MDA97299M0008 BASIC BLUE RIDGE \n",
"510 1898 MDA97299M0009 DO QRI \n",
"511 1899 MDA97299M001 0 DO PRAJAINC \n",
"512 1900 MDA97299M0011 BASIC lVI \n",
"513 1901 MDA97299M0012 BASIC JERRYCOOKE \n",
"514 1902 MDA97299M0013 DO DIAMONDBAC \n",
"515 1903 MDA9769630014 P00007 SDLINC \n",
"\n",
" PROGRAM_TITLE AWARD_DATE AMOUNT \n",
"2 INFORMATION MANAGEMENT 12/10/1998 687000.00 \n",
"3 COMMUNICATOR 4/22/1999 400000.00 \n",
"4 WEBINABOX 4122/1999 360000.00 \n",
"5 WEBINABOX 8/24/1999 0.00 \n",
"6 HIGH DEFINITION SYSTEMS (HDS) 1/29/1999 1210694.00 \n",
"7 FLAT PANEL DISPLAYS 8116/1999 5794000.00 \n",
"8 CHPS: Combat Hybrid Power Systems 1nt1999 79441.00 \n",
"9 NEXT GENERATION INTERNET 8/28/1998 0.00 \n",
"10 NEXT GENERATION INTERNET 1/2011999 332197.00 \n",
"11 NEXT GENERATION INTERNET 2/4/1999 94750.00 \n",
"12 NEXT GENERATION INTERNET 2/22/1999 450000.00 \n",
"13 NEXT GENERATION INTERNET 3/1/1999 254750.00 \n",
"14 NEXT GENERATION INTERNET 4/1 2/1999 0.00 \n",
"15 NEXT GENERATION INTERNET 4/1 3/1999 254750.00 \n",
"16 NEXT GENERATION INTERNET 9/8/1999 254750.00 \n",
"17 STOWACTD 2/1 2/1999 117000.00 \n",
"18 STOWACTD 3/1/1999 273000.00 \n",
"19 IMAGE UNDERSTANDING 3/2911999 150166.00 \n",
"20 STOWACTD 5/27/1999 40000.00 \n",
"21 STOWACTD 911 /1999 55930.00 \n",
"22 BADD 12/9/1998 73374.00 \n",
"23 AGILE INFO CONTROL ENVIRONMENT 2/12/1999 100095.00 \n",
"24 AGILE INFO CONTROL ENVIRONMENT 12/22/1998 100095.00 \n",
"25 VLSI PHOTONICS 3/1 5/1999 149984.00 \n",
"26 BROADBAND INFORMATION TECHNOLOGY 1/4/1999 4547200.00 \n",
"27 HIGH DEFINITION SYSTEMS (HDS) 5/4/1999 0.00 \n",
"28 HIGH DEFINITION SYSTEMS (HDS) 11/10/1998 7570137.00 \n",
"29 PHOTOVOLTAICS (VP) 11/1 8/1998 558900.00 \n",
"30 SHOCC 6nt1999 1289562.00 \n",
"31 LARGE MILLIMETER TELESCOPE 8/30/1999 1151500.00 \n",
".. ... ... ... \n",
"484 OFFICE/PROGRAM SUPPORT (RELATED TO VSEE8) 3/1711999 40000.00 \n",
"485 WARFIGHTERS INTERNET 3/17/1999 117000.00 \n",
"486 COUNTER UNDERGROUND FACILITIES 6/25/1999 251924.00 \n",
"487 COUNTER MEASURES 6/11/1999 199991.00 \n",
"488 CONTRACT ADMINISTRATION 7/14/1999 90000.00 \n",
"489 CONTRACTS MANAGEMENT 6/30/1999 4422.00 \n",
"490 TECH INTEGRATION CENTER/TECH DEV CENTER 8/4/1999 100000.00 \n",
"491 POLYMER MATERIALS (CONG ADD) 5/15/1999 423916.45 \n",
"492 CEROS (FENCED) 8/2/1999 59972.00 \n",
"493 ADVANCED SHIP/SENSOR SYSTEMS MRN-02 8/9/1999 43425.18 \n",
"494 CONTRACTS MANAGEMENT 8/30/1999 37075.00 \n",
"495 CONTRACTS MANAGEMENT 9/13/1999 0.00 \n",
"496 CONTRACTS MANAGEMENT 8/31/1999 64755.00 \n",
"497 ADVANCED GROUND SURVELLIANCE 3/1211999 99729.00 \n",
"498 ADVANCED MICROELECTRONICS 4/14/1999 10000.00 \n",
"501 None None NaN \n",
"502 BW MEDICAL DIAGNOSTICS 3/30/1999 99970.00 \n",
"503 BW MEDICAL DIAGNOSTICS 5/26/1999 0.00 \n",
"504 BW MEDICAL DIAGNOSTICS 8/4/1999 0.00 \n",
"505 SENSOR EMULATION 5/4/1999 100000.00 \n",
"506 SENSOR EMULATION 5/12/1999 0.00 \n",
"507 UNDERSEA LITTORAL WARFARE 4/12/1999 74827.00 \n",
"508 COMBAT CASUALTY DIAGNOSTICS:ULTRASOUND 5/3/1999 59500.00 \n",
"509 OFFICE/PROGRAM SUPPORT (related to VTAX4) 5/11/1999 48566.00 \n",
"510 ADVANCED SIMULATION TECH 6/29/1999 99494.00 \n",
"511 COUNTER MEASURES 6/14/1999 80460.00 \n",
"512 COUNTER MEASURES 7/16/1999 90000.00 \n",
"513 CONTRACT ADMINISTRATION 5/3/1999 100000.00 \n",
"514 TECH INTEGRATION CENTER/TECH DEV CENTER 9/8/1999 50000.00 \n",
"515 SOLAR BLIND DETECTORS 7/9/1999 0.00 \n",
"\n",
"[495 rows x 7 columns]"
]
},
"execution_count": 39,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"darpa1999"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" AMOUNT | \n",
"
\n",
" \n",
" PROGRAM_TITLE | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" 3-D MICRO ELECTRONICS | \n",
" NaN | \n",
"
\n",
" \n",
" 6/1711999 | \n",
" NaN | \n",
"
\n",
" \n",
" AA V: Advanced Air Vehicle | \n",
" -500000.00 | \n",
"
\n",
" \n",
" AAV: Advanced Air Vehicle | \n",
" 10225000.00 | \n",
"
\n",
" \n",
" ACMPNIP | \n",
" 302233.00 | \n",
"
\n",
" \n",
" ACTIVE NETWORKS | \n",
" 50000.00 | \n",
"
\n",
" \n",
" ACTIVE TEMPLATES | \n",
" 329987.00 | \n",
"
\n",
" \n",
" ADAPTIVE COMPUTING SYSTEMS | \n",
" 850000.00 | \n",
"
\n",
" \n",
" ADMINISTRATIVE SUPPORT | \n",
" 290000.00 | \n",
"
\n",
" \n",
" ADVANCED FLEXIBLE MANUFACTURING | \n",
" 3400000.00 | \n",
"
\n",
" \n",
" ADVANCED GROUND SURVELLIANCE | \n",
" 124593.00 | \n",
"
\n",
" \n",
" ADVANCED LITHOGRAPHY | \n",
" 3030000.00 | \n",
"
\n",
" \n",
" ADVANCED LOGISTICS TECHNOLOGY | \n",
" 16518949.00 | \n",
"
\n",
" \n",
" ADVANCED MICROELECTRONICS | \n",
" 10000.00 | \n",
"
\n",
" \n",
" ADVANCED NETWORKING TECHNOLOGY | \n",
" 49959.00 | \n",
"
\n",
" \n",
" ADVANCED SHIP/SENSOR SYSTEMS MRN-02 | \n",
" 328425.18 | \n",
"
\n",
" \n",
" ADVANCED SIMULATION TECH | \n",
" 2947837.00 | \n",
"
\n",
" \n",
" AG ILE INFO CONTROL ENVIRONMENT | \n",
" 199899.00 | \n",
"
\n",
" \n",
" AGENT ADMIN SUPPORT | \n",
" 769226.00 | \n",
"
\n",
" \n",
" AGILE INFO CONTROL ENVIRONMENT | \n",
" 3872700.00 | \n",
"
\n",
" \n",
" AIM : Advanced ISR Man'!9ement | \n",
" 600000.00 | \n",
"
\n",
" \n",
" AIRBORNE COMMS NODE | \n",
" 16800000.00 | \n",
"
\n",
" \n",
" AIRBORNE VIDEO SURVEILLANCE | \n",
" 468486.00 | \n",
"
\n",
" \n",
" AM3: Affordable Multi-Missile Manufacturing | \n",
" 18795907.00 | \n",
"
\n",
" \n",
" AMOUNT | \n",
" NaN | \n",
"
\n",
" \n",
" ANTS SEEDLINGS | \n",
" 50000.00 | \n",
"
\n",
" \n",
" APLA: SELF HEALING/TAGS/MGM | \n",
" 20526.00 | \n",
"
\n",
" \n",
" ARRMD: Affordable Rapid Response Missile Demonstrator | \n",
" 500000.00 | \n",
"
\n",
" \n",
" ART: Advanced Rotorcraft Technology | \n",
" 1500000.00 | \n",
"
\n",
" \n",
" BADD | \n",
" 642161.00 | \n",
"
\n",
" \n",
" ... | \n",
" ... | \n",
"
\n",
" \n",
" SECURITY SUPPORT | \n",
" 1321000.00 | \n",
"
\n",
" \n",
" SECURITY SUPPORT . | \n",
" 423212.00 | \n",
"
\n",
" \n",
" SENSOR EMULATION | \n",
" 332215.00 | \n",
"
\n",
" \n",
" SHOCC | \n",
" 1289562.00 | \n",
"
\n",
" \n",
" SKYLINK | \n",
" 100000.00 | \n",
"
\n",
" \n",
" SLID: Small Low-Cost Interceptor Device | \n",
" 1859000.00 | \n",
"
\n",
" \n",
" SMART MATERIALS/ACTUATORS | \n",
" 1633760.00 | \n",
"
\n",
" \n",
" SMART MATERIALS/DEMOS | \n",
" 5280951.00 | \n",
"
\n",
" \n",
" SOLAR BLIND DETECTORS | \n",
" 250000.00 | \n",
"
\n",
" \n",
" STARLIGHT SUPPORT COSTS-ITO | \n",
" 210500.00 | \n",
"
\n",
" \n",
" STOWACTD | \n",
" 831302.00 | \n",
"
\n",
" \n",
" SUB STUDY (SUBMARINE PAYLOADS AND SENSORS | \n",
" 6180000.00 | \n",
"
\n",
" \n",
" SUO: SITUATION AWARENESS SYS (SAS) | \n",
" 19138500.00 | \n",
"
\n",
" \n",
" SURVIVABLTY LARGE SCALE INFO SYS | \n",
" 119732.00 | \n",
"
\n",
" \n",
" Seedlings SGT-02 | \n",
" 242880.00 | \n",
"
\n",
" \n",
" Seedlings TT-06 | \n",
" 900000.00 | \n",
"
\n",
" \n",
" Seedlings TT-07 | \n",
" 200000.00 | \n",
"
\n",
" \n",
" TACTICAL SENSORS | \n",
" 290580.00 | \n",
"
\n",
" \n",
" TECH INTEGRATION CENTER/TECH DEV CENTER | \n",
" 150000.00 | \n",
"
\n",
" \n",
" TMR: URBAN ROBOTICS | \n",
" 332000.00 | \n",
"
\n",
" \n",
" TRVS | \n",
" 500000.00 | \n",
"
\n",
" \n",
" UCAV: Unmanned Combat Air Vehicle | \n",
" 10017493.00 | \n",
"
\n",
" \n",
" UNDERSEA LITTORAL WARFARE | \n",
" 1324827.00 | \n",
"
\n",
" \n",
" VIRTUAL ELECTROMAGNETIC TEST RANGE | \n",
" 498383.00 | \n",
"
\n",
" \n",
" VLSI PHOTONICS | \n",
" 149984.00 | \n",
"
\n",
" \n",
" WARFIGHTERS INTERNET | \n",
" 327000.00 | \n",
"
\n",
" \n",
" WATER HAMMER | \n",
" 2887441.00 | \n",
"
\n",
" \n",
" WEBINABOX | \n",
" 360000.00 | \n",
"
\n",
" \n",
" lA INTEGRATED TESTBED (INFORMATION ASSURANCE) | \n",
" 909090.00 | \n",
"
\n",
" \n",
" lA INTEGRATED TESTBEDJINFORMATION ASSURANCE) | \n",
" 79167.00 | \n",
"
\n",
" \n",
"
\n",
"
164 rows × 1 columns
\n",
"
"
],
"text/plain": [
" AMOUNT\n",
"PROGRAM_TITLE \n",
"3-D MICRO ELECTRONICS NaN\n",
"6/1711999 NaN\n",
"AA V: Advanced Air Vehicle -500000.00\n",
"AAV: Advanced Air Vehicle 10225000.00\n",
"ACMPNIP 302233.00\n",
"ACTIVE NETWORKS 50000.00\n",
"ACTIVE TEMPLATES 329987.00\n",
"ADAPTIVE COMPUTING SYSTEMS 850000.00\n",
"ADMINISTRATIVE SUPPORT 290000.00\n",
"ADVANCED FLEXIBLE MANUFACTURING 3400000.00\n",
"ADVANCED GROUND SURVELLIANCE 124593.00\n",
"ADVANCED LITHOGRAPHY 3030000.00\n",
"ADVANCED LOGISTICS TECHNOLOGY 16518949.00\n",
"ADVANCED MICROELECTRONICS 10000.00\n",
"ADVANCED NETWORKING TECHNOLOGY 49959.00\n",
"ADVANCED SHIP/SENSOR SYSTEMS MRN-02 328425.18\n",
"ADVANCED SIMULATION TECH 2947837.00\n",
"AG ILE INFO CONTROL ENVIRONMENT 199899.00\n",
"AGENT ADMIN SUPPORT 769226.00\n",
"AGILE INFO CONTROL ENVIRONMENT 3872700.00\n",
"AIM : Advanced ISR Man'!9ement 600000.00\n",
"AIRBORNE COMMS NODE 16800000.00\n",
"AIRBORNE VIDEO SURVEILLANCE 468486.00\n",
"AM3: Affordable Multi-Missile Manufacturing 18795907.00\n",
"AMOUNT NaN\n",
"ANTS SEEDLINGS 50000.00\n",
"APLA: SELF HEALING/TAGS/MGM 20526.00\n",
"ARRMD: Affordable Rapid Response Missile Demons... 500000.00\n",
"ART: Advanced Rotorcraft Technology 1500000.00\n",
"BADD 642161.00\n",
"... ...\n",
"SECURITY SUPPORT 1321000.00\n",
"SECURITY SUPPORT . 423212.00\n",
"SENSOR EMULATION 332215.00\n",
"SHOCC 1289562.00\n",
"SKYLINK 100000.00\n",
"SLID: Small Low-Cost Interceptor Device 1859000.00\n",
"SMART MATERIALS/ACTUATORS 1633760.00\n",
"SMART MATERIALS/DEMOS 5280951.00\n",
"SOLAR BLIND DETECTORS 250000.00\n",
"STARLIGHT SUPPORT COSTS-ITO 210500.00\n",
"STOWACTD 831302.00\n",
"SUB STUDY (SUBMARINE PAYLOADS AND SENSORS 6180000.00\n",
"SUO: SITUATION AWARENESS SYS (SAS) 19138500.00\n",
"SURVIVABLTY LARGE SCALE INFO SYS 119732.00\n",
"Seedlings SGT-02 242880.00\n",
"Seedlings TT-06 900000.00\n",
"Seedlings TT-07 200000.00\n",
"TACTICAL SENSORS 290580.00\n",
"TECH INTEGRATION CENTER/TECH DEV CENTER 150000.00\n",
"TMR: URBAN ROBOTICS 332000.00\n",
"TRVS 500000.00\n",
"UCAV: Unmanned Combat Air Vehicle 10017493.00\n",
"UNDERSEA LITTORAL WARFARE 1324827.00\n",
"VIRTUAL ELECTROMAGNETIC TEST RANGE 498383.00\n",
"VLSI PHOTONICS 149984.00\n",
"WARFIGHTERS INTERNET 327000.00\n",
"WATER HAMMER 2887441.00\n",
"WEBINABOX 360000.00\n",
"lA INTEGRATED TESTBED (INFORMATION ASSURANCE) 909090.00\n",
"lA INTEGRATED TESTBEDJINFORMATION ASSURANCE) 79167.00\n",
"\n",
"[164 rows x 1 columns]"
]
},
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#finally can do some operations\n",
"darpa1999.groupby(by=\"PROGRAM_TITLE\").sum()"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" AMOUNT | \n",
"
\n",
" \n",
" PERFORMER | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" ALPHATECH | \n",
" 885851.00 | \n",
"
\n",
" \n",
" ALPINECONS | \n",
" 16110000.00 | \n",
"
\n",
" \n",
" APT I | \n",
" 2809441.00 | \n",
"
\n",
" \n",
" APTI | \n",
" 124999.00 | \n",
"
\n",
" \n",
" ARDAK | \n",
" 99970.00 | \n",
"
\n",
" \n",
" ARIZONASTA | \n",
" 345851.00 | \n",
"
\n",
" \n",
" ART I | \n",
" 2290000.00 | \n",
"
\n",
" \n",
" ART! | \n",
" 210500.00 | \n",
"
\n",
" \n",
" ARTI | \n",
" 4009102.00 | \n",
"
\n",
" \n",
" ATTTECH | \n",
" 200000.00 | \n",
"
\n",
" \n",
" AUBURNU | \n",
" 395320.00 | \n",
"
\n",
" \n",
" AWARD DATE | \n",
" NaN | \n",
"
\n",
" \n",
" BBN | \n",
" 144000.00 | \n",
"
\n",
" \n",
" BDMFEDERAL | \n",
" 751046.00 | \n",
"
\n",
" \n",
" BELLATLANT | \n",
" 1641197.00 | \n",
"
\n",
" \n",
" BELLCORE | \n",
" 4547200.00 | \n",
"
\n",
" \n",
" BLUE RIDGE | \n",
" 48566.00 | \n",
"
\n",
" \n",
" BOEING | \n",
" 8082268.00 | \n",
"
\n",
" \n",
" BOEINGDEFS | \n",
" 4238010.00 | \n",
"
\n",
" \n",
" BOEINGDESP | \n",
" 5883520.00 | \n",
"
\n",
" \n",
" BOEINGNAIN | \n",
" 76655.00 | \n",
"
\n",
" \n",
" BOOZALLEN | \n",
" 3301594.45 | \n",
"
\n",
" \n",
" BRADSONCOR | \n",
" 885747.74 | \n",
"
\n",
" \n",
" CALTECH | \n",
" 256638.00 | \n",
"
\n",
" \n",
" CENTRA | \n",
" 1810672.00 | \n",
"
\n",
" \n",
" CERIDIAN | \n",
" 3797712.00 | \n",
"
\n",
" \n",
" CFDRESCORP | \n",
" 699919.00 | \n",
"
\n",
" \n",
" CNRI | \n",
" 5936135.00 | \n",
"
\n",
" \n",
" COLOSTU | \n",
" 100000.00 | \n",
"
\n",
" \n",
" CRAYRESEAR | \n",
" 1289562.00 | \n",
"
\n",
" \n",
" ... | \n",
" ... | \n",
"
\n",
" \n",
" TRACORAERO | \n",
" 1447900.00 | \n",
"
\n",
" \n",
" TRITECHINC | \n",
" 222649.00 | \n",
"
\n",
" \n",
" TRW | \n",
" 5600000.00 | \n",
"
\n",
" \n",
" UALABAMA | \n",
" 900000.00 | \n",
"
\n",
" \n",
" UARIZONA | \n",
" 2790100.00 | \n",
"
\n",
" \n",
" UCBERKELEY | \n",
" 3260000.00 | \n",
"
\n",
" \n",
" UCIRVINE | \n",
" 50000.00 | \n",
"
\n",
" \n",
" UCLA | \n",
" 350000.00 | \n",
"
\n",
" \n",
" UCLON | \n",
" 156000.00 | \n",
"
\n",
" \n",
" UCSANTABAR | \n",
" 100000.00 | \n",
"
\n",
" \n",
" UFLA | \n",
" 321768.00 | \n",
"
\n",
" \n",
" UILLURBCHA | \n",
" 100000.00 | \n",
"
\n",
" \n",
" UMASS | \n",
" 1151500.00 | \n",
"
\n",
" \n",
" UMINN | \n",
" 1915045.00 | \n",
"
\n",
" \n",
" UNIVNEWORL | \n",
" 3474136.00 | \n",
"
\n",
" \n",
" USCISI | \n",
" 299997.00 | \n",
"
\n",
" \n",
" USDISPLAYC | \n",
" 5794000.00 | \n",
"
\n",
" \n",
" UTAHSTU | \n",
" 327618.00 | \n",
"
\n",
" \n",
" UTEXAS | \n",
" 350000.00 | \n",
"
\n",
" \n",
" UVA | \n",
" 365396.00 | \n",
"
\n",
" \n",
" UWISCONSIN | \n",
" 43650.00 | \n",
"
\n",
" \n",
" VALLEYELEC | \n",
" 100095.00 | \n",
"
\n",
" \n",
" VANDERBILT | \n",
" 204435.00 | \n",
"
\n",
" \n",
" VEDAINC | \n",
" 666262.00 | \n",
"
\n",
" \n",
" VISTARESEA | \n",
" 74827.00 | \n",
"
\n",
" \n",
" VISUALEYES | \n",
" 59500.00 | \n",
"
\n",
" \n",
" VRT | \n",
" 173469.00 | \n",
"
\n",
" \n",
" WALCOFF | \n",
" 20526.00 | \n",
"
\n",
" \n",
" XEROXPARC | \n",
" 1642515.00 | \n",
"
\n",
" \n",
" lVI | \n",
" 90000.00 | \n",
"
\n",
" \n",
"
\n",
"
152 rows × 1 columns
\n",
"
"
],
"text/plain": [
" AMOUNT\n",
"PERFORMER \n",
"ALPHATECH 885851.00\n",
"ALPINECONS 16110000.00\n",
"APT I 2809441.00\n",
"APTI 124999.00\n",
"ARDAK 99970.00\n",
"ARIZONASTA 345851.00\n",
"ART I 2290000.00\n",
"ART! 210500.00\n",
"ARTI 4009102.00\n",
"ATTTECH 200000.00\n",
"AUBURNU 395320.00\n",
"AWARD DATE NaN\n",
"BBN 144000.00\n",
"BDMFEDERAL 751046.00\n",
"BELLATLANT 1641197.00\n",
"BELLCORE 4547200.00\n",
"BLUE RIDGE 48566.00\n",
"BOEING 8082268.00\n",
"BOEINGDEFS 4238010.00\n",
"BOEINGDESP 5883520.00\n",
"BOEINGNAIN 76655.00\n",
"BOOZALLEN 3301594.45\n",
"BRADSONCOR 885747.74\n",
"CALTECH 256638.00\n",
"CENTRA 1810672.00\n",
"CERIDIAN 3797712.00\n",
"CFDRESCORP 699919.00\n",
"CNRI 5936135.00\n",
"COLOSTU 100000.00\n",
"CRAYRESEAR 1289562.00\n",
"... ...\n",
"TRACORAERO 1447900.00\n",
"TRITECHINC 222649.00\n",
"TRW 5600000.00\n",
"UALABAMA 900000.00\n",
"UARIZONA 2790100.00\n",
"UCBERKELEY 3260000.00\n",
"UCIRVINE 50000.00\n",
"UCLA 350000.00\n",
"UCLON 156000.00\n",
"UCSANTABAR 100000.00\n",
"UFLA 321768.00\n",
"UILLURBCHA 100000.00\n",
"UMASS 1151500.00\n",
"UMINN 1915045.00\n",
"UNIVNEWORL 3474136.00\n",
"USCISI 299997.00\n",
"USDISPLAYC 5794000.00\n",
"UTAHSTU 327618.00\n",
"UTEXAS 350000.00\n",
"UVA 365396.00\n",
"UWISCONSIN 43650.00\n",
"VALLEYELEC 100095.00\n",
"VANDERBILT 204435.00\n",
"VEDAINC 666262.00\n",
"VISTARESEA 74827.00\n",
"VISUALEYES 59500.00\n",
"VRT 173469.00\n",
"WALCOFF 20526.00\n",
"XEROXPARC 1642515.00\n",
"lVI 90000.00\n",
"\n",
"[152 rows x 1 columns]"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"darpa1999.groupby(by=\"PERFORMER\").sum()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##note that we've not done any work on the dates, which are filled with badly OCR'd data\n"
]
},
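{
"cell_type": "markdown",
"metadata": {},
"source": [
"Many of the broken dates above look as if a `/` was read as the digit `1` (`3/1211999`, `4122/1999`) or picked up a stray space (`4/1 2/1999`). Here is a minimal sketch of a repair for those two patterns only. `repair_ocr_date` and `parse_mdy` are hypothetical helpers, not part of pandas, and anything that cannot be resolved to exactly one real date (`1nt1999`) comes back as `None` for checking by hand.\n",
"\n",
"```python\n",
"import re\n",
"from datetime import datetime\n",
"\n",
"DATE_PATTERN = re.compile(r'^([0-9]{1,2})/([0-9]{1,2})/([0-9]{4})$')\n",
"\n",
"def parse_mdy(text):\n",
"    # Return a datetime for a well-formed m/d/yyyy string, else None.\n",
"    match = DATE_PATTERN.match(text)\n",
"    if match is None:\n",
"        return None\n",
"    month, day, year = (int(part) for part in match.groups())\n",
"    try:\n",
"        return datetime(year, month, day)\n",
"    except ValueError:\n",
"        return None\n",
"\n",
"def repair_ocr_date(raw):\n",
"    # Drop stray spaces first: '4/1 2/1999' becomes '4/12/1999'.\n",
"    text = str(raw).replace(' ', '')\n",
"    if DATE_PATTERN.match(text):\n",
"        return parse_mdy(text)\n",
"    # Otherwise try turning each '1' back into a slash, one at a time.\n",
"    candidates = set()\n",
"    for position, char in enumerate(text):\n",
"        if char == '1':\n",
"            trial = text[:position] + '/' + text[position + 1:]\n",
"            parsed = parse_mdy(trial)\n",
"            if parsed is not None:\n",
"                candidates.add(parsed)\n",
"    # Only accept the repair when exactly one reading is a real date.\n",
"    return candidates.pop() if len(candidates) == 1 else None\n",
"```\n",
"\n",
"Something like `darpa1999['AWARD_DATE'].apply(repair_ocr_date)` would then give a column to review. The guess that a `1` stands in for a slash is an assumption about this particular scan, so spot-check the results against the pdf."
]
},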
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## back to pdftotext\n",
"\n",
"`pdftotext` converts one file per call, so you cannot hand it a wildcard like `*.pdf`.\n",
"\n",
"To run it on every pdf in a directory from the Unix bash shell (Mac OS X, most Linux distributions):\n",
"\n",
"`for file in *.pdf; do pdftotext \"$file\" \"$file.txt\"; done`\n",
"\n",
"Run this in the shell, not in Python.\n",
"\n",
"\n"
]
},
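{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you would rather stay inside the notebook, the same loop can be sketched with `subprocess`. This assumes the `pdftotext` binary is installed and on your PATH. `pdftotext_command` and `convert_directory` are hypothetical names, and unlike the shell loop above this version writes `report.txt` rather than `report.pdf.txt`.\n",
"\n",
"```python\n",
"import glob\n",
"import os\n",
"import subprocess\n",
"\n",
"def pdftotext_command(pdf_path):\n",
"    # Swap the .pdf extension for .txt so report.pdf becomes report.txt.\n",
"    txt_path = os.path.splitext(pdf_path)[0] + '.txt'\n",
"    return ['pdftotext', pdf_path, txt_path]\n",
"\n",
"def convert_directory(directory):\n",
"    # Run pdftotext once per pdf; check_call raises if a conversion fails.\n",
"    for pdf_path in sorted(glob.glob(os.path.join(directory, '*.pdf'))):\n",
"        subprocess.check_call(pdftotext_command(pdf_path))\n",
"```\n",
"\n",
"Calling `convert_directory('pdf_examples')` would convert every pdf in that folder. Passing the arguments as a list means filenames with spaces need no extra quoting."
]
},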
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#the greater evil\n",
"## image text that needs to be OCR'd (optical character recognition)\n",
"\n",
"Here proprietary solutions rule the day. :("
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Easiest, if you trust people not to be evil, but severely limited\n",
"\n",
"Google Drive, for files under 2 MB or 10 pages.\n",
"\n",
"Google probably has the best OCR out there, but there is no way to access it at scale.\n",
"\n",
"If you've found a pdf online, you can always consult Google's OCR of it via the Google cache:\n",
"\n",
"take yer url and prefix it with:\n",
"\n",
"`https://webcache.googleusercontent.com/search?q=cache:{{your URL}}`\n",
"\n",
"\n",
"Doesn't always work, and the result is challenging HTML that reproduces the *position* of the text.\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Key commercial products\n",
"- Adobe Acrobat Pro\n",
" - slow\n",
" - not great on bulk operations, but does the job ok. \n",
" - embeds the ocr'd text within a new pdf\n",
" - extract the text with `pdftotext` or from a menu item; `pdftotext` is the better bet\n",
"\n",
"- ABBYY FineReader\n",
" - can do multiple languages, tables\n",
" - enterprise grade stuff\n",
" - not horrendously expensive\n",
" - reduced-feature version *not* available in the US. Not sure why.\n",
" \n",
"Locally, can be used on library machines."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Open source alternatives\n",
"\n",
"Old form of Google tech: `tesseract`\n",
"\n",
"Futzy: requires pdfs to be divided into individual pages, then rendered as TIFF images.\n",
"\n",
"Very linux-y world of multiple dependencies and weird incompatibilities.\n",
"\n",
"See https://apple.stackexchange.com/questions/128384/ocr-on-pdfs-in-os-x-with-free-open-source-tools\n"
]
},
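{
"cell_type": "markdown",
"metadata": {},
"source": [
"The split-then-OCR routine can be sketched roughly as follows. It assumes two command-line tools are installed and on your PATH: `pdftoppm` (which ships alongside `pdftotext`) to render one TIFF per page, and `tesseract` itself. The function names are hypothetical, and the `prefix-*.tif` pattern is an assumption about how your `pdftoppm` names its output, so check one run by hand first.\n",
"\n",
"```python\n",
"import glob\n",
"import subprocess\n",
"\n",
"def render_command(pdf_path, prefix):\n",
"    # Render every page as a 300 dpi TIFF named <prefix>-<page>.tif.\n",
"    return ['pdftoppm', '-r', '300', '-tiff', pdf_path, prefix]\n",
"\n",
"def tesseract_command(tiff_path):\n",
"    # tesseract adds .txt to the output base on its own.\n",
"    base = tiff_path[:-len('.tif')]\n",
"    return ['tesseract', tiff_path, base]\n",
"\n",
"def ocr_pdf(pdf_path, prefix):\n",
"    # Step 1: pdf to page images. Step 2: OCR each page image.\n",
"    subprocess.check_call(render_command(pdf_path, prefix))\n",
"    for tiff_path in sorted(glob.glob(prefix + '-*.tif')):\n",
"        subprocess.check_call(tesseract_command(tiff_path))\n",
"```\n",
"\n",
"Running `ocr_pdf('scan.pdf', 'scan')` should leave one text file per page, which you can then stitch together in page order."
]
},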
{
"cell_type": "markdown",
"metadata": {},
"source": [
"###Another potential evil: encryption\n",
"\n",
"*All* major utilities honor the pdf encryption schemes."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For ebooks you \"own\" (i.e. have a license for), such as Kindle books, use the Calibre application and the de-DRM add-ons to extract your licensed text in a more open format."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.10"
}
},
"nbformat": 4,
"nbformat_minor": 0
}