{
"metadata": {
"name": "",
"signature": "sha256:4160fb4cc19cc97187ca3a5f9acbad4dd7ccc7c28bee0a5faa9b043be2fc54ba"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Analytics Beyond the Basics with pandas and SQL\n",
"===\n",
"\n",
"Wes McKinney\n",
"---\n",
"@wesmckinn, wes@cloudera.com\n",
"---\n",
"Strata-Hadoop World SJ 2015\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Tutorial motivation\n",
"===\n",
"- Illustrate useful analytics techniques in both pandas and SQL\n",
"- Tools\n",
" - pandas 0.15.1 or higher (install via Anaconda)\n",
" - PostgreSQL \n",
"- Some SQL differences between database flavors\n",
"- Materials: https://github.com/wesm/strata-sj-2015\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Topics\n",
"===\n",
"- More complex data grouping and augmentation\n",
"- Top-N filtering\n",
"- Subqueries\n",
"- Semi- and anti-joins\n",
"- Using window functions and equivalents\n",
"- Histograms and quantile analysis\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Datasets\n",
"===\n",
"- SF Crime Reports since 2014\n",
"- Crunchbase Data Export (CC-BY-NC license)\n",
"- Companies\n",
"- Funding rounds and investors\n",
"- Acquisitions\n",
"- Find these in CSV form in the GitHub repo, along with a Python data loading script"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Ceremonial preparation of the datas\n",
"===\n",
"\n",
"See gnarly data loading script in GitHub repo to put data into Postgres \n",
"---\n",
"(p.s. please let me know if there's a less painful way that also addresses the bad data issues)\n",
"---"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%matplotlib inline"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 1
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from datetime import datetime, time\n",
"import pandas as pd\n",
"import numpy as np\n",
"import psycopg2 as pg\n",
"import pandas.io.sql as sql\n",
"\n",
"CRIME_DATE_FORMAT = '%m/%d/%Y %H:%M:%S %p'"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 8
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"from sqlalchemy import create_engine\n",
"con = create_engine('postgresql://wesm:foo@localhost:5432/strata_2015')\n",
"con"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 3,
"text": [
"Engine(postgresql://wesm:***@localhost:5432/strata_2015)"
]
}
],
"prompt_number": 3
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"table = pd.read_sql(\"select * from crimes limit 10;\", con)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 4
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"crimes = pd.read_csv('sf_crimes.csv')\n",
"pd.to_datetime(crimes.Date[:1000])"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 5,
"text": [
"0 2014-01-30\n",
"1 2014-07-14\n",
"2 2015-01-18\n",
"3 2014-09-21\n",
"4 2014-03-05\n",
"5 2014-03-08\n",
"6 2014-04-06\n",
"7 2014-09-02\n",
"8 2014-07-19\n",
"9 2014-09-29\n",
"10 2014-04-14\n",
"11 2014-07-16\n",
"12 2014-01-27\n",
"13 2014-05-04\n",
"14 2014-08-16\n",
"...\n",
"985 2014-08-29\n",
"986 2015-01-24\n",
"987 2014-07-09\n",
"988 2014-01-21\n",
"989 2014-05-17\n",
"990 2014-09-30\n",
"991 2014-02-11\n",
"992 2014-05-03\n",
"993 2014-06-30\n",
"994 2014-12-10\n",
"995 2014-07-03\n",
"996 2014-09-20\n",
"997 2014-01-19\n",
"998 2014-12-16\n",
"999 2014-07-24\n",
"Name: Date, Length: 1000, dtype: datetime64[ns]"
]
}
],
"prompt_number": 5
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%timeit pd.to_datetime(crimes.Date[:1000])"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"10 loops, best of 3: 99.5 ms per loop\n"
]
}
],
"prompt_number": 6
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"%timeit pd.to_datetime(crimes.Date[:1000], format=CRIME_DATE_FORMAT)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"100 loops, best of 3: 5.63 ms per loop\n"
]
}
],
"prompt_number": 9
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def null_on_error(series, converter):\n",
" def f(x):\n",
" try:\n",
" return converter(x)\n",
" except (TypeError, ValueError):\n",
" return np.nan\n",
" return series.map(f)\n",
"\n",
"def number_with_comma(x):\n",
" try:\n",
" x = x.replace(',', '')\n",
" return float(x)\n",
" except (ValueError, AttributeError):\n",
" return np.nan\n",
"\n",
"def to_time(x):\n",
" h, m = x.split(':')\n",
" return time(int(h), int(m))\n",
"\n",
"crimes = pd.read_csv('sf_crimes.csv')\n",
"crimes.columns = crimes.columns.map(str.lower)\n",
"crimes.date = pd.to_datetime(crimes.date, format=CRIME_DATE_FORMAT)\n",
"crimes.time = crimes.time.map(to_time)\n",
"\n",
"companies = pd.read_csv('companies.csv',\n",
" parse_dates=['founded_at', 'first_funding_at'])\n",
"companies['funding_total_usd'] = null_on_error(\n",
" companies.pop(' funding_total_usd '), number_with_comma)\n",
"\n",
"investments = pd.read_csv('investments.csv')\n",
"acquisitions = pd.read_csv('acquisitions.csv')\n",
"rounds = pd.read_csv('rounds.csv')\n"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stderr",
"text": [
"/Users/wesm/anaconda/lib/python2.7/site-packages/pandas/io/parsers.py:1154: DtypeWarning: Columns (0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,20,21) have mixed types. Specify dtype option on import or set low_memory=False.\n",
" data = self._reader.read(nrows)\n"
]
}
],
"prompt_number": 10
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"companies.columns = companies.columns.map(str.strip)\n",
"\n",
"rounds.columns = rounds.columns.map(str.strip)\n",
"investments.columns = investments.columns.map(str.strip)\n",
"acquisitions.columns = acquisitions.columns.map(str.strip)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 11
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"acquisitions['price_amount'] = null_on_error(acquisitions.price_amount, number_with_comma)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 12
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Less trivial data grouping\n",
"===\n",
"- Computed fields\n",
"- pandas conveniences: arrays, functions\n",
"- Coarser grouping with value mappings"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"crimes.loc[0]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 13,
"text": [
"incidntnum 140090226\n",
"category WARRANTS\n",
"descript ENROUTE TO PAROLE OFFICER\n",
"dayofweek Thursday\n",
"date 2014-01-30 12:00:00\n",
"time 21:56:00\n",
"pddistrict MISSION\n",
"resolution ARREST, BOOKED\n",
"address 400 Block of SOUTH VAN NESS AV\n",
"x -122.4177\n",
"y 37.76627\n",
"location (37.7662744315825, -122.417662868855)\n",
"Name: 0, dtype: object"
]
}
],
"prompt_number": 13
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"crimes.groupby('category').size().order(ascending=False)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 14,
"text": [
"category\n",
"LARCENY/THEFT 41300\n",
"OTHER OFFENSES 22324\n",
"NON-CRIMINAL 20838\n",
"ASSAULT 13422\n",
"VANDALISM 7712\n",
"VEHICLE THEFT 7671\n",
"WARRANTS 7267\n",
"BURGLARY 6525\n",
"DRUG/NARCOTIC 5801\n",
"SUSPICIOUS OCC 5633\n",
"MISSING PERSON 5052\n",
"ROBBERY 3758\n",
"FRAUD 3176\n",
"SECONDARY CODES 2042\n",
"WEAPON LAWS 1726\n",
"TRESPASS 1247\n",
"STOLEN PROPERTY 1165\n",
"SEX OFFENSES, FORCIBLE 912\n",
"FORGERY/COUNTERFEITING 807\n",
"DRUNKENNESS 666\n",
"KIDNAPPING 559\n",
"PROSTITUTION 477\n",
"DRIVING UNDER THE INFLUENCE 408\n",
"DISORDERLY CONDUCT 389\n",
"ARSON 292\n",
"RUNAWAY 233\n",
"LIQUOR LAWS 192\n",
"EMBEZZLEMENT 153\n",
"FAMILY OFFENSES 94\n",
"SUICIDE 75\n",
"BRIBERY 58\n",
"LOITERING 38\n",
"BAD CHECKS 35\n",
"EXTORTION 33\n",
"SEX OFFENSES, NON FORCIBLE 20\n",
"GAMBLING 9\n",
"PORNOGRAPHY/OBSCENE MAT 2\n",
"TREA 1\n",
"dtype: int64"
]
}
],
"prompt_number": 14
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"query = \"\"\"\n",
"select category, count(*) as occs\n",
"from crimes\n",
"group by 1\n",
"order by occs desc;\n",
"\"\"\"\n",
"\n",
"pd.read_sql(query, con)[:10]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"
\n",
"
\n",
" \n",
" \n",
" | \n",
" category | \n",
" occs | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" LARCENY/THEFT | \n",
" 41300 | \n",
"
\n",
" \n",
" 1 | \n",
" OTHER OFFENSES | \n",
" 22324 | \n",
"
\n",
" \n",
" 2 | \n",
" NON-CRIMINAL | \n",
" 20838 | \n",
"
\n",
" \n",
" 3 | \n",
" ASSAULT | \n",
" 13422 | \n",
"
\n",
" \n",
" 4 | \n",
" VANDALISM | \n",
" 7712 | \n",
"
\n",
" \n",
" 5 | \n",
" VEHICLE THEFT | \n",
" 7671 | \n",
"
\n",
" \n",
" 6 | \n",
" WARRANTS | \n",
" 7267 | \n",
"
\n",
" \n",
" 7 | \n",
" BURGLARY | \n",
" 6525 | \n",
"
\n",
" \n",
" 8 | \n",
" DRUG/NARCOTIC | \n",
" 5801 | \n",
"
\n",
" \n",
" 9 | \n",
" SUSPICIOUS OCC | \n",
" 5633 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 16,
"text": [
" category occs\n",
"0 LARCENY/THEFT 41300\n",
"1 OTHER OFFENSES 22324\n",
"2 NON-CRIMINAL 20838\n",
"3 ASSAULT 13422\n",
"4 VANDALISM 7712\n",
"5 VEHICLE THEFT 7671\n",
"6 WARRANTS 7267\n",
"7 BURGLARY 6525\n",
"8 DRUG/NARCOTIC 5801\n",
"9 SUSPICIOUS OCC 5633"
]
}
],
"prompt_number": 16
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def top(x, k=10):\n",
" return x.order(ascending=False)[:k]\n",
"\n",
"crimes.groupby([crimes.date.dt.year,\n",
" crimes.time.map(lambda x: x.hour)]).size()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 17,
"text": [
" time\n",
"2014 0 7217\n",
" 1 4353\n",
" 2 3642\n",
" 3 2537\n",
" 4 1709\n",
" 5 1636\n",
" 6 2195\n",
" 7 3553\n",
" 8 5124\n",
" 9 5844\n",
" 10 6210\n",
" 11 6464\n",
" 12 8803\n",
" 13 7122\n",
" 14 7373\n",
" 15 8111\n",
" 16 8671\n",
" 17 9389\n",
" 18 9865\n",
" 19 9087\n",
" 20 8452\n",
" 21 7743\n",
" 22 7924\n",
" 23 7092\n",
"2015 0 504\n",
" 1 420\n",
" 2 332\n",
" 3 224\n",
" 4 156\n",
" 5 143\n",
" 6 179\n",
" 7 320\n",
" 8 403\n",
" 9 530\n",
" 10 497\n",
" 11 533\n",
" 12 596\n",
" 13 539\n",
" 14 576\n",
" 15 627\n",
" 16 681\n",
" 17 691\n",
" 18 819\n",
" 19 766\n",
" 20 694\n",
" 21 634\n",
" 22 616\n",
" 23 516\n",
"dtype: int64"
]
}
],
"prompt_number": 17
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"query = \"\"\"\n",
"select extract(year from date) as year,\n",
" extract(hour from time) as hourofday,\n",
" count(*) as occs \n",
"from crimes\n",
"group by 1, 2;\n",
"\"\"\"\n",
"sql.read_sql(query, con)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" year | \n",
" hourofday | \n",
" occs | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 2014 | \n",
" 0 | \n",
" 7217 | \n",
"
\n",
" \n",
" 1 | \n",
" 2014 | \n",
" 1 | \n",
" 4353 | \n",
"
\n",
" \n",
" 2 | \n",
" 2014 | \n",
" 2 | \n",
" 3642 | \n",
"
\n",
" \n",
" 3 | \n",
" 2014 | \n",
" 3 | \n",
" 2537 | \n",
"
\n",
" \n",
" 4 | \n",
" 2014 | \n",
" 4 | \n",
" 1709 | \n",
"
\n",
" \n",
" 5 | \n",
" 2014 | \n",
" 5 | \n",
" 1636 | \n",
"
\n",
" \n",
" 6 | \n",
" 2014 | \n",
" 6 | \n",
" 2195 | \n",
"
\n",
" \n",
" 7 | \n",
" 2014 | \n",
" 7 | \n",
" 3553 | \n",
"
\n",
" \n",
" 8 | \n",
" 2014 | \n",
" 8 | \n",
" 5124 | \n",
"
\n",
" \n",
" 9 | \n",
" 2014 | \n",
" 9 | \n",
" 5844 | \n",
"
\n",
" \n",
" 10 | \n",
" 2014 | \n",
" 10 | \n",
" 6210 | \n",
"
\n",
" \n",
" 11 | \n",
" 2014 | \n",
" 11 | \n",
" 6464 | \n",
"
\n",
" \n",
" 12 | \n",
" 2014 | \n",
" 12 | \n",
" 8803 | \n",
"
\n",
" \n",
" 13 | \n",
" 2014 | \n",
" 13 | \n",
" 7122 | \n",
"
\n",
" \n",
" 14 | \n",
" 2014 | \n",
" 14 | \n",
" 7373 | \n",
"
\n",
" \n",
" 15 | \n",
" 2014 | \n",
" 15 | \n",
" 8111 | \n",
"
\n",
" \n",
" 16 | \n",
" 2014 | \n",
" 16 | \n",
" 8671 | \n",
"
\n",
" \n",
" 17 | \n",
" 2014 | \n",
" 17 | \n",
" 9389 | \n",
"
\n",
" \n",
" 18 | \n",
" 2014 | \n",
" 18 | \n",
" 9865 | \n",
"
\n",
" \n",
" 19 | \n",
" 2014 | \n",
" 19 | \n",
" 9087 | \n",
"
\n",
" \n",
" 20 | \n",
" 2014 | \n",
" 20 | \n",
" 8452 | \n",
"
\n",
" \n",
" 21 | \n",
" 2014 | \n",
" 21 | \n",
" 7743 | \n",
"
\n",
" \n",
" 22 | \n",
" 2014 | \n",
" 22 | \n",
" 7924 | \n",
"
\n",
" \n",
" 23 | \n",
" 2014 | \n",
" 23 | \n",
" 7092 | \n",
"
\n",
" \n",
" 24 | \n",
" 2015 | \n",
" 0 | \n",
" 504 | \n",
"
\n",
" \n",
" 25 | \n",
" 2015 | \n",
" 1 | \n",
" 420 | \n",
"
\n",
" \n",
" 26 | \n",
" 2015 | \n",
" 2 | \n",
" 332 | \n",
"
\n",
" \n",
" 27 | \n",
" 2015 | \n",
" 3 | \n",
" 224 | \n",
"
\n",
" \n",
" 28 | \n",
" 2015 | \n",
" 4 | \n",
" 156 | \n",
"
\n",
" \n",
" 29 | \n",
" 2015 | \n",
" 5 | \n",
" 143 | \n",
"
\n",
" \n",
" 30 | \n",
" 2015 | \n",
" 6 | \n",
" 179 | \n",
"
\n",
" \n",
" 31 | \n",
" 2015 | \n",
" 7 | \n",
" 320 | \n",
"
\n",
" \n",
" 32 | \n",
" 2015 | \n",
" 8 | \n",
" 403 | \n",
"
\n",
" \n",
" 33 | \n",
" 2015 | \n",
" 9 | \n",
" 530 | \n",
"
\n",
" \n",
" 34 | \n",
" 2015 | \n",
" 10 | \n",
" 497 | \n",
"
\n",
" \n",
" 35 | \n",
" 2015 | \n",
" 11 | \n",
" 533 | \n",
"
\n",
" \n",
" 36 | \n",
" 2015 | \n",
" 12 | \n",
" 596 | \n",
"
\n",
" \n",
" 37 | \n",
" 2015 | \n",
" 13 | \n",
" 539 | \n",
"
\n",
" \n",
" 38 | \n",
" 2015 | \n",
" 14 | \n",
" 576 | \n",
"
\n",
" \n",
" 39 | \n",
" 2015 | \n",
" 15 | \n",
" 627 | \n",
"
\n",
" \n",
" 40 | \n",
" 2015 | \n",
" 16 | \n",
" 681 | \n",
"
\n",
" \n",
" 41 | \n",
" 2015 | \n",
" 17 | \n",
" 691 | \n",
"
\n",
" \n",
" 42 | \n",
" 2015 | \n",
" 18 | \n",
" 819 | \n",
"
\n",
" \n",
" 43 | \n",
" 2015 | \n",
" 19 | \n",
" 766 | \n",
"
\n",
" \n",
" 44 | \n",
" 2015 | \n",
" 20 | \n",
" 694 | \n",
"
\n",
" \n",
" 45 | \n",
" 2015 | \n",
" 21 | \n",
" 634 | \n",
"
\n",
" \n",
" 46 | \n",
" 2015 | \n",
" 22 | \n",
" 616 | \n",
"
\n",
" \n",
" 47 | \n",
" 2015 | \n",
" 23 | \n",
" 516 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 18,
"text": [
" year hourofday occs\n",
"0 2014 0 7217\n",
"1 2014 1 4353\n",
"2 2014 2 3642\n",
"3 2014 3 2537\n",
"4 2014 4 1709\n",
"5 2014 5 1636\n",
"6 2014 6 2195\n",
"7 2014 7 3553\n",
"8 2014 8 5124\n",
"9 2014 9 5844\n",
"10 2014 10 6210\n",
"11 2014 11 6464\n",
"12 2014 12 8803\n",
"13 2014 13 7122\n",
"14 2014 14 7373\n",
"15 2014 15 8111\n",
"16 2014 16 8671\n",
"17 2014 17 9389\n",
"18 2014 18 9865\n",
"19 2014 19 9087\n",
"20 2014 20 8452\n",
"21 2014 21 7743\n",
"22 2014 22 7924\n",
"23 2014 23 7092\n",
"24 2015 0 504\n",
"25 2015 1 420\n",
"26 2015 2 332\n",
"27 2015 3 224\n",
"28 2015 4 156\n",
"29 2015 5 143\n",
"30 2015 6 179\n",
"31 2015 7 320\n",
"32 2015 8 403\n",
"33 2015 9 530\n",
"34 2015 10 497\n",
"35 2015 11 533\n",
"36 2015 12 596\n",
"37 2015 13 539\n",
"38 2015 14 576\n",
"39 2015 15 627\n",
"40 2015 16 681\n",
"41 2015 17 691\n",
"42 2015 18 819\n",
"43 2015 19 766\n",
"44 2015 20 694\n",
"45 2015 21 634\n",
"46 2015 22 616\n",
"47 2015 23 516"
]
}
],
"prompt_number": 18
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"crimes[:1].T"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" 0 | \n",
"
\n",
" \n",
" \n",
" \n",
" incidntnum | \n",
" 140090226 | \n",
"
\n",
" \n",
" category | \n",
" WARRANTS | \n",
"
\n",
" \n",
" descript | \n",
" ENROUTE TO PAROLE OFFICER | \n",
"
\n",
" \n",
" dayofweek | \n",
" Thursday | \n",
"
\n",
" \n",
" date | \n",
" 2014-01-30 12:00:00 | \n",
"
\n",
" \n",
" time | \n",
" 21:56:00 | \n",
"
\n",
" \n",
" pddistrict | \n",
" MISSION | \n",
"
\n",
" \n",
" resolution | \n",
" ARREST, BOOKED | \n",
"
\n",
" \n",
" address | \n",
" 400 Block of SOUTH VAN NESS AV | \n",
"
\n",
" \n",
" x | \n",
" -122.4177 | \n",
"
\n",
" \n",
" y | \n",
" 37.76627 | \n",
"
\n",
" \n",
" location | \n",
" (37.7662744315825, -122.417662868855) | \n",
"
\n",
" \n",
"
\n",
"
"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 19,
"text": [
" 0\n",
"incidntnum 140090226\n",
"category WARRANTS\n",
"descript ENROUTE TO PAROLE OFFICER\n",
"dayofweek Thursday\n",
"date 2014-01-30 12:00:00\n",
"time 21:56:00\n",
"pddistrict MISSION\n",
"resolution ARREST, BOOKED\n",
"address 400 Block of SOUTH VAN NESS AV\n",
"x -122.4177\n",
"y 37.76627\n",
"location (37.7662744315825, -122.417662868855)"
]
}
],
"prompt_number": 19
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"crimes.groupby('category').size().order(ascending=False)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 20,
"text": [
"category\n",
"LARCENY/THEFT 41300\n",
"OTHER OFFENSES 22324\n",
"NON-CRIMINAL 20838\n",
"ASSAULT 13422\n",
"VANDALISM 7712\n",
"VEHICLE THEFT 7671\n",
"WARRANTS 7267\n",
"BURGLARY 6525\n",
"DRUG/NARCOTIC 5801\n",
"SUSPICIOUS OCC 5633\n",
"MISSING PERSON 5052\n",
"ROBBERY 3758\n",
"FRAUD 3176\n",
"SECONDARY CODES 2042\n",
"WEAPON LAWS 1726\n",
"TRESPASS 1247\n",
"STOLEN PROPERTY 1165\n",
"SEX OFFENSES, FORCIBLE 912\n",
"FORGERY/COUNTERFEITING 807\n",
"DRUNKENNESS 666\n",
"KIDNAPPING 559\n",
"PROSTITUTION 477\n",
"DRIVING UNDER THE INFLUENCE 408\n",
"DISORDERLY CONDUCT 389\n",
"ARSON 292\n",
"RUNAWAY 233\n",
"LIQUOR LAWS 192\n",
"EMBEZZLEMENT 153\n",
"FAMILY OFFENSES 94\n",
"SUICIDE 75\n",
"BRIBERY 58\n",
"LOITERING 38\n",
"BAD CHECKS 35\n",
"EXTORTION 33\n",
"SEX OFFENSES, NON FORCIBLE 20\n",
"GAMBLING 9\n",
"PORNOGRAPHY/OBSCENE MAT 2\n",
"TREA 1\n",
"dtype: int64"
]
}
],
"prompt_number": 20
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"category_mapping = {\n",
" 'LARCENY/THEFT': 'THEFT',\n",
" 'ROBBERY': 'THEFT',\n",
" 'VEHICLE THEFT': 'THEFT'\n",
"}\n",
"coarse_cat = crimes.category.map(lambda x: category_mapping.get(x, x))\n",
"top(crimes.groupby(coarse_cat).size(), 20)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 21,
"text": [
"category\n",
"THEFT 52729\n",
"OTHER OFFENSES 22324\n",
"NON-CRIMINAL 20838\n",
"ASSAULT 13422\n",
"VANDALISM 7712\n",
"WARRANTS 7267\n",
"BURGLARY 6525\n",
"DRUG/NARCOTIC 5801\n",
"SUSPICIOUS OCC 5633\n",
"MISSING PERSON 5052\n",
"FRAUD 3176\n",
"SECONDARY CODES 2042\n",
"WEAPON LAWS 1726\n",
"TRESPASS 1247\n",
"STOLEN PROPERTY 1165\n",
"SEX OFFENSES, FORCIBLE 912\n",
"FORGERY/COUNTERFEITING 807\n",
"DRUNKENNESS 666\n",
"KIDNAPPING 559\n",
"PROSTITUTION 477\n",
"dtype: int64"
]
}
],
"prompt_number": 21
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"#crimes.category.map(category_mapping.__getitem__)\n",
"crimes.category.map(lambda x: category_mapping[x])"
],
"language": "python",
"metadata": {},
"outputs": [
{
"ename": "KeyError",
"evalue": "'WARRANTS'",
"output_type": "pyerr",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mKeyError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;31m#crimes.category.map(category_mapping.__getitem__)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mcrimes\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcategory\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmap\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;32mlambda\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mcategory_mapping\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;32m/Users/wesm/anaconda/lib/python2.7/site-packages/pandas/core/series.pyc\u001b[0m in \u001b[0;36mmap\u001b[0;34m(self, arg, na_action)\u001b[0m\n\u001b[1;32m 2012\u001b[0m index=self.index).__finalize__(self)\n\u001b[1;32m 2013\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 2014\u001b[0;31m \u001b[0mmapped\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mmap_f\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mvalues\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0marg\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2015\u001b[0m return self._constructor(mapped,\n\u001b[1;32m 2016\u001b[0m index=self.index).__finalize__(self)\n",
"\u001b[0;32m/Users/wesm/anaconda/lib/python2.7/site-packages/pandas/lib.so\u001b[0m in \u001b[0;36mpandas.lib.map_infer (pandas/lib.c:56502)\u001b[0;34m()\u001b[0m\n",
"\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m(x)\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;31m#crimes.category.map(category_mapping.__getitem__)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mcrimes\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcategory\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmap\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;32mlambda\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mcategory_mapping\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;31mKeyError\u001b[0m: 'WARRANTS'"
]
}
],
"prompt_number": 22
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"query = \"\"\"\n",
"select\n",
" case \n",
" when category in ('LARCENY/THEFT', 'ROBBERY', 'VEHICLE THEFT')\n",
" then 'THEFT'\n",
" else category\n",
" end as coarse_cat,\n",
" count(*) as occs\n",
"from crimes\n",
"group by 1\n",
"order by occs desc;\n",
"\"\"\"\n",
"sql.read_sql(query, con)[:10]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" coarse_cat | \n",
" occs | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" THEFT | \n",
" 52729 | \n",
"
\n",
" \n",
" 1 | \n",
" OTHER OFFENSES | \n",
" 22324 | \n",
"
\n",
" \n",
" 2 | \n",
" NON-CRIMINAL | \n",
" 20838 | \n",
"
\n",
" \n",
" 3 | \n",
" ASSAULT | \n",
" 13422 | \n",
"
\n",
" \n",
" 4 | \n",
" VANDALISM | \n",
" 7712 | \n",
"
\n",
" \n",
" 5 | \n",
" WARRANTS | \n",
" 7267 | \n",
"
\n",
" \n",
" 6 | \n",
" BURGLARY | \n",
" 6525 | \n",
"
\n",
" \n",
" 7 | \n",
" DRUG/NARCOTIC | \n",
" 5801 | \n",
"
\n",
" \n",
" 8 | \n",
" SUSPICIOUS OCC | \n",
" 5633 | \n",
"
\n",
" \n",
" 9 | \n",
" MISSING PERSON | \n",
" 5052 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 24,
"text": [
" coarse_cat occs\n",
"0 THEFT 52729\n",
"1 OTHER OFFENSES 22324\n",
"2 NON-CRIMINAL 20838\n",
"3 ASSAULT 13422\n",
"4 VANDALISM 7712\n",
"5 WARRANTS 7267\n",
"6 BURGLARY 6525\n",
"7 DRUG/NARCOTIC 5801\n",
"8 SUSPICIOUS OCC 5633\n",
"9 MISSING PERSON 5052"
]
}
],
"prompt_number": 24
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"keys = [crimes.date.dt.year,\n",
" crimes.time.map(lambda x: x.hour)]\n",
"counts = crimes.groupby(keys).size()\n",
"counts = counts.unstack(0)\n",
"share = counts / counts.sum()\n",
"share.plot(kind='bar')"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 26,
"text": [
""
]
},
{
"metadata": {},
"output_type": "display_data",
"png": "iVBORw0KGgoAAAANSUhEUgAAAXYAAAEQCAYAAACk818iAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3X+cVPV97/HXsAZhYZeF+osquDwUFI0GrJjtjSlg07oS\nAzWpIQRjF3/gDSWa3LTX0Op1oonW0tyi11SxUR72h642+GiwEU0RVm1qiLTuCsgG5JeYNEQjP5Zf\nsnTn/vH9zs6ZszNnzpyZ2TnnzPv5eJzHzvnxmfOd2ZnPnPM53/kOiIiIiIiIiIiIiIiIiIiIiIiI\niIhEXivQDWwDbs+zzYN2fRcw1S47D3jDMR0Abq1oS0VEpKA64G2gGfgI0AlMdm0zC3je3v448JMc\n9zME+C9gXEVaKSIi/YYUWH8ZJrHvAnqBdmCOa5vZwBP29nqgCTjdtc2ngO3AnhLaKiIiPhRK7GeS\nnYzftcsKbXOWa5svAE8GaaCIiBTnpALrUz7vJ+ERNxT4DHnq8+ecc05q+/btPncjIiJWFzAl14pC\nR+w/J7suPg5zRO61zVl2WdpVwH8A7+Xawfbt20mlUjmnu+66K+86r0lxilNcdeKi0Ma4xAEfy5e4\nCyX2DcBEzMXTocBcYJVrm1XA9fZ2C7Af2OtYPw94qsB+RESkTAqVYk4Ai4EXMT1kHgO2ALfY9csx\nPWJmYS6yHgYWOOJHYC6c3hykcbt27QoSpjjFKa5KcVFoYy3EFUrsAKvt5LTcNb84T+xh4JRiG5U2\nZUrO8pHiFKe4kMZFoY21EOe+6FkNKVsvEhERnxKJBOTJ4X6O2KtizJgx7Nu3r9rNCK3Ro0fzwQcf\nVLsZImXR2NRIz4Ge/vmGUQ0c3H+wii2KtkIXT6tm3759ga4i18oU5EOvo6Mj0P9CcYqr9L56DvRA\nkv7JmeQrsb+4x4U2sYuISDChrbEnEglyLRdDz4/ESSKRMEfraUn0+i7Aq8auI3YRkZhRYq8hUakP\nKi66cUH3FVQUnpNqxCmxi4hI2aVyybW8oWF0CjPAWEWmhobROdvi9uGHH6ZuuOGG1Nlnn51qaGhI\nTZkyJbV69er+9WvWrEmdd955qfr6+tTMmTNTu3fv7l+3du3a1IwZM1KjRo1KNTc3591HR0dHKpFI\npO644w7fz49IVAEpko5Jr++C8BikMVJH7D09+6hgXrf3X9iJEycYP348r7zyCgcPHuRb3/oWn//8\n53nnnXd4//33+exnP8u3v/1t9u3bx6WXXsrcuXP7Y0eOHMlNN93E0qVL895/b28vt912Gy0tLekL\nJCKx09g4hkQiodd4BUQqsYdFfX09d911F+PHjwfg05/+NBMmTGDDhg08++yzXHTRRXzuc59j6NCh\nJJNJurq62Lp1KwDTpk1j/vz5TJgwIe/9f+c736G1tZXzzjuvrD0DolIfVFx044qJyT5QCyYKz0k1\n4pTYy2Dv3r1s3bqVj370o2zevJmPfSwzmmZ9fT3nnnsumzZt8nVfu3fvZsWKFdx5553q7iUigYTh\nHCiVK4Hl6qdtTtkqmeyK7xve29vLVVddxcSJE3n44Ye56aabOPXUU7nvvvv6t7n88stZuHAh119/\nff+yNWvWcPPNN7Nz586s+5szZw7XXXcd1157LQsWLGDcuHHcfffdA1uqfuwScdnvZ/VjL5b6sVdI\nX18fX/rSlxg2bBgPPfQQYGroBw9mj3Fx4MABGhoaCt7fc889x6FDh7j22msBnAPqi4j4psQeUCqV\n4sYbb+S9995j5cqV1NXVAXDhhRfS1dXVv93hw4fZvn07F154YcH7XLt2LRs2bGDs2LGMHTuWZ555\nhmXLlnHNNdeUpc1RqQ8qLrpx6scejjgl9oC+/OUv093dzapVqzj55JP7l19zzTVs2rSJZ599lmPH\njvHNb36TKVOmMGnSJMB8IBw7doze3l5SqRQffvghx48fB+Cee+5h27ZtdHV10dnZyezZs1m4cCEr\nVqyoymMUCbtZV8/q71mTSCRobGqsdpPEyttH0y0s/dh37dqVSiQSqeHDh6dGjhzZPz355JOpVMr0\nYz///PNTw4cPH9CPfd26dalEIpFKJBKpIUOGpBKJRGrmzJk599PW1pa68847fT8/IlFi3ncpOwXr\nxx40Lg7wuOAYqYunkqHnR6KuHBdPa3nwMF08FSA69UHFRTdusGvsQUXhuSwlLrS/oCQi0dDYOKb/\nW9vDh4/kyBF/P5IhlaNSTETp+ZGwcJdU/L4uVYopjUoxIiI1RIm9hkSlPqi4aMdFQVSey0r2Y28F\nuoFtwO15tnnQru8CpjqWNwHfB7YAbwEtgVopIiK+Faqx1wE/Az4F/Bx4HZiHSdRps4DF9u/HgQfI\nJPAngJeBxzEXakcAB1z7UI09AD0/EhaqsVdHKTX2y4C3gV1AL9AOzHFtMxuTwAHWY47STwdGAZ/E\nJHWAEwxM6iIiUmaFEvuZwB7H/Lt2WaFtzgImAO8BK4D/BP4WqC+lsVKaqNQHFRftuEorxw90ROW5\nrFSN3e85jfsZTmFKL5cAf2P/Hga+kSu4ra2NZDJJMplk2bJleR9MY1Nj1rgQ5Z78jjNx/Phxbrzx\nRpqbm2lsbGTq1Km88MIL/etfeuklzj//fEaMGMEVV1zBO++8079u3bp1zJw5k6amppw/ttHc3Ex9\nfT0NDQ00NDTQ2trq2ZaOjo6s56sS852dndqf9pd33q2YeOiwk7XTTh7xA36gI3vk64o+n52dnRX/\nf+XbX0dHB21tbf350kuhj7wWTAUrnV2WAH3A/Y5tHsH8Z9rtfDcw3d73a5gjd4DLMYn9atc+fNfY\nB9TTyi3prz535MgRli5dyoIFCxg/fjw//OEPmTdvHps2baK+vp5zzjmHxx9/nM985jPccccdvPrq\nq7z22msAvP7662zdupUjR45w7733DhiPfcKECTz22GNcccUVnm1QjV3CYrBr7BrH3fCqsRf65ukG\nYCLQDPwCmIu5eOq0CnPxtB3zQbAf2GvX7QEmAVsxF2A3F9v4MEr/NF6a86fx3n///f6fxgNIJpOc\ncsopbN26lUmTJjFt2jSmTZvGmjVr8t5/rbwwRaQyCpViTmCS9ouY7opPY3rE3GIngOeBHZiLrMuB\nRY74rwD/iOkGeTFwb7kaHibl/Gk8gPnz53Paaadx5ZVX8uabb5atnblOmxWnuHLHRUFUnsugcX7G\nilltJ6flrvnFeWK7gGnFNipKent7mT9/Pm1tbUyaNInDhw9z6qmnZm3T2NjIoUOHfN3fk08+ySWX\nXEJfXx8PPPAAV155Jd3d3YwaNaoSzReRGIrUWDFhqbGn9fX18cUvfpFDhw7xgx/8gLq6Or761a/S\n29vLd7/73f7tLrroIu6+++6sX0LK95unbpMnT2bp0qVcfXX2pQnV2Aefc7ArgIaG0Rw8+EEVWxQO\nqrFXh8aKqYBUBX4aL5dSunRJeWX3xkhlJXmRMFFiD6gSP423Z88efvzjH3P8+HGOHTvG0qVL+fWv\nf80nPvGJsrQ5KvXBqMQFFdbH5+wfnkgkqK8v/APsUVU/sj5QV+ew/u/cIjUee8OoBnqSlRvruWGU\nvxfy7t27efTRRxk2bBhnnHFG//JHH32UefPmsXLlShYvXsx1111HS0sL7e3t/du8/PLL/V0ZE4kE\nw4cPZ8aMGaxdu5aenh4WLVrE9u3bGTZsGFOnTmX16tWMHj26vA9UJIfMGYlx9GiAs8VE9llmw6gG\nDu4/WIbWldfRw0ezSjiVzCvVEIbzfI0VE4Cen8GXXduFYurJURD08Q12rVxjzBiqsYuI1BAl9hoS\nlfpgVOKCiszjs2WVYmrQcReV/50Su4jklsKUK5LQcyD6NehyDB4WFWF4hKqxB6DnZ/DVYo29vw6d\nDE+tXP3fDdXYRURqiBJ7DYlKfTAqcUHF/fHFWVT+d6Htxz569OiaqIUFpb7tIpJPGDJnzhq7SNio\nxh6Omrdq7IZq7CIiNSTUiT0q9SzF1WZcUBrzJbqi8toMbY1dRIIpy5gvEmlh+I+rxi6REJUae3nG\nfDFxqrGHl2rsIiI1JNSJPSr1LMXVZlxQUWmnDFTMOO7Oax1Br3Ooxi4iUmHFjOPuvNYx2Nc5VGMX\n8SmyNfYhCegzN71++EI19vKO4x70t2D98qqx64hdJO766E9GcfulIMlNNXbFKU41dvFQzeF+Kzke\neyvQDWwDbs+zzYN2fRcw1bF8F/Am8Abw00AtFBGpokytvIRSiuNHSwbjh0sKfQTVAT8DPgX8HHgd\nmAdscWwzC1hs/34ceABoset2Ar8FfOCxD9XYJRIiW2MPVCsPGhfeWnlU4vwqpR/7ZcDbmCPvXqAd\nmOPaZjbwhL29HmgCTnfuv6jWiohISQol9jOBPY75d+0yv9ukgDXABuDmYhsXlVqr4mozLqiotFOq\nr1I1dr/nCvmOyi/H1NyvAv4Y+GSujdra2kgmkySTSZYtW5b1YDo6Ooqe7+zsLCle+9P+8s1Dh53C\n+/iyuefxjHc/Pnbayff+srfX/orfn3O+s7Ozf76jo4O2trb+fOmlUJmkBVMZarXzSzCdp+53bPOI\nbXm7ne8GpgN7Xfd1F3AI+I5ruWrsEglB+4cPNtXYox3nVyk19g3ARKAZGArMBVa5tlkFXG9vtwD7\nMUm9Hkh/j3YE8PvAxqJaLhJm6f7hSeg5oP7hEh6FEvsJTI+XF4G3gKcxPWJusRPA88AOzEXW5cAi\nu/wM4FWgE3NR9V+AHxXTuJynQYpTXEjigvK7P42rLkFfm36+ebraTk7LXfOLc8TtAKYEaZSIaFx1\nCS4MrxTV2CUSgtagy7m/wRtXPWhceGvXUYnzK1LjsTtPPxsbx1S7OSLhMSTz7cVKf3NRoi10id35\n9V1zu3hRqdEqLtpxQQXeny7W1pxK1thFpASNjWOyDlKGDx/JkSNKzFI5sUzsM2bMUJziKh7nly6C\nSjGcBwINDaM5eNBrqK3cYpnYRUSiynkg0NMT7CAgdDX2cohKjVZx0Y4TCatYJnYRkVoWy8QelRqt\n4qIdJxJWsUzsIiK1LJaJPSo1WsVFO04krGKZ2EVCLaFvkEplxbK7Y1RqtIqLdlxgKfrHDelJ6otK\nUn46Ypea19jUqCNoiZVYJvao1GgVl1v9yPpAiTbo/noO9GgMFomVWJZiJNqOHj6qUoVICWJ5xB6V\nGq3iykv90UWMWCZ2ES/un5wTCS3HGPzFlCZjmdijUktWXHn53Z9zzP/sXwwSCRnHGPzFXAOKZWIX\nEallsUzsUaklK668VGMXMcKd2APWl0REalm4E3vA+lJUasmKK6+g/d9F4sZPYm8FuoFtwO15tnnQ\nru8CprrW1QFvAM8FbKNITu7eLf3935P6opHUtkKJvQ54CJPcLwDmAZNd28wCzgUmAguBh13rbwPe\nYhC7H0Sllqy40qh3i0huhRL7ZcDbwC6gF2gH5ri2mQ08YW+vB5qA0+38WZjE/z1AHYZFRAZBocR+\nJrDHMf+uXeZ3m78G/hRTLR80UaklK05EKqFQYvd7fus+Gk8AVwO/wtTXPY/W29raSCaTJJNJu6Qj\ns3KnndJrOjqyEkau+c7OzqK2L3Ve+yvvvJvX9ua14pj38Xpx3bv2V9b9ZW+v/ZVxf/ZvOl96KVQe\nacFcjmq180swR9/3O7Z5BNPydjvfDcwAbgW+BJwAhgGNwErgetc+UqlU5vPDfMU7PZ/oHwwKTEuc\n20o8JRKO/3sy//88+7UCWa8XxVUpzv97VnGlxdnhMHLm8EJH7BswF0WbgaHAXGCVa5tVZJJ1C7Af\n+CXwZ8A4YALwBWAtA5O6iIiUWaHEfgJYDLyI6dnyNLAFuMVOAM8DOzAXWZcDi/Lc16Adauc8nVFc\naOM0KJdIefkZj321nZyWu+YXF7iPl+0kMkCm22KakrtIKcL9zdOAotJfW3EiUgmxTOwiIrUslok9\nrLVkxYnIYIhlYhcRqWWxTOxRqSUrTkQqIZaJXUSklsUysUellqw4EamEWCZ2EZFaFsvEHpVasuJE\npBJimdhFRGpZLBN7VGrJihORSohlYp919aysQaX0w8YiUkv8DAIWOf0/amz1JP39sHFUatBxjxOR\n0sTyiF1EpJYpsTtEpQYd1jj3uOr19Q2B9icipVFil7LJjKtupqPHDuk6h0gVxLLGHlRUatBRiSNF\n/7UOv9c5RKR0OmIXEYkZJXaHsNauoxonItWhxC4iEjNK7A5RqV1HJU5EqkOJXUQkZpTYHaJSu45K\nnIhUh5/E3gp0A9uA2/Ns86Bd3wVMtcuGAeuBTuAt4L6SWioiIr4USux1wEOY5H4BMA+Y7NpmFnAu\nMBFYCDxslx8DZgJTgIvt7cvL0uoKiUrtOipxIlIdhRL7ZcDbwC6gF2gH5ri2mQ08YW+vB5qA0+38\nEft3KOZD4oPSmisiIoUUSuxnAnsc8+/aZYW2OcversOUYvYC6zAlmdCKSu06KnEiUh2FEnvK5/0k\n8sT9N6YUcxbwO8CMXMFtbW0kk0mSyaRd0pFZudNO6TUdHVmJJtd8lp3Zs37ii53v7Oys6P1HaX/m\nf+eY9/H/y+aeR/uL1P7I+Z7T/sqwP/s3nS+9uBOyWwtmtI9WO78E6APud2zzCKbl7Xa+G5iOOUp3\nuhM4CvyVa3kqlcp8fiQSCTKfC4mscdVJgnPbfBKJYHFSmuz/HWT9/5L5/weKi1Oc//ee4kqLM9vl\nzuGFjtg3YC6KNmPq5HOBVa5tVgHX29stwH5MUj8FU28HGA78HvBGgf2JiEiJCiX2E8Bi4EVMffxp\nYAtwi50Angd2YC6yLgcW2eVjgbWYGvt64DngpTK2PYtzLPCgcp4GKS5wnIhUh59he1fbyWm5a35x\njriNwCVBGhVEZixwKFxhEhGJL33z1CEq/cOjEici1aHELiISM0rsDvUj67N+s9Pvz7lFpeatGrtI\nbdBP4zkcPXw0q2uRfs5NRKJIR+xlEJWat2rsIrVBiV0KamxqLLo8JSLVo8ReBlGpeQeN6znQY0pU\nSXtbREJNiV1EJGaU2MsgKjVv1cpFaoMSu4hIzCixl0FUauXqjy5SG5TYRURiRom9DKJSK1eNXaQ2\n1HxiL8dwvyIiYVLziT0z3G/wX1iKSq1cNXaR2lDziV0Gcp7F6ExGJHo0CFgZRKVW7jcu+0dLQD9c\nIhItOmIXEYkZJfYyiEqtXDV2kdqgxC4iEjNK7GUQ1lp5ueJEJFqU2EVEYkaJvQyiUitXjV2kNvhN\n7K1AN7ANuD3PNg/a9V3AVLtsHLAO2AxsAm4N3FIREfHFT2KvAx7CJPcLgHnAZNc2s4BzgYnAQuBh\nu7wX+BpwIdAC/HGO2MiLSq1cNXaR2uAnsV8GvA3swiTqdmCOa5vZwBP29nqgCTgd+CXQaZcfArYA\nv1lSi0VExJOfxH4msMcx/65dVmibs1zbNGNKNOuLa2L4RaVWrhq7SG3wk9j9jo7l/t65M24k8H3g\nNsyRe5a2tjaSySTJZNIu6cis3Gmn9JqOjqwE5Z7Pik3HO9cO2L60/XV0dNDZ2VnU9qXOV3p/2dzz\neMab7R3zPp5P7S9O+yPne077K8P+7N90vvTiZxCQFsxv1Lfa+SVAH3C/Y5tHMK1vt/PdwHRgL/AR\n4F+A1cCyHPefSqUynwFm0Kn0fMLsOS0Jzm2zHsggx8VZ9nMCWc9L0u9zqbjajAvvezZucXaAvpw5\n3M8R+wbMRdFmYCgwF1jl2mYVcL293QLsxyT1BPAY8Ba5k7qIiJSZn8R+AlgMvIhJ0E9jLoLeYieA\n54EdmIusy4FFdvkngOuAmcAbdkof+cdGztOnGMWJSLT4HbZ3tZ2clrvmF+eI+zdq4EtQs66exdHD\nR/vnG0Y1cHD/wSq2SERqmcZjL4Ojh49m1cF6kj2+4tSPXUQqIfZH0yIitUaJvYpUYxeRSlBiFxGJ\nGSX2KlKNXUQqQYk9oMbGMSQSifSXBEREQkOJPaCenn2Yb4cF/2aqauwiUglK7CIiMaPEXkWqsYtI\nJSixi4jEjBJ7FanGLiKVoMReQ2ZdPau/J09jU2O1myMiFaKxYqpoMGrljY1jbA8eK2n++B3PRkSi\nR0fsMVeObpkiEi1K7FWkWrmIVIISu4hIzCixV5H6o4tIJSixV1FjU2N/LxX1VBGRclFir6KeAz2m\nl4qdeg7466miGruIeFFiFxGJGSX2CFKNXUS8KLFHkGrzIuJF3zwdZAO+CRpAf20+Pa9vkYqIg98j\n9lagG9gG3J5nmwft+i5gqmP548BeYGPANsaKvgkqIpXmJ7HXAQ9hkvsFwDxgsmubWcC5wERgIfCw\nY90KGysl0E/xiYhffhL7ZcDbwC6gF2gH5ri2mQ08YW+vB5qAM+z8q0BptQfRkb6I+OYnsZ8J7HHM\nv2uXFbuNiIgMAj8XT/0eIrprBL4PLdva2mhubnYs6QBmmJs77aIJdo39ck66y9/AL+u45ne61rri\na3Z/ebYvuD+7jXv7zP7S22t/tbk/GzOh2P05YrW/3PuzsQPz5UB+CrYtmD4Y6Tr5EqAPuN+xzSOY\nR9Bu57uB6ZiLpgDNwHPARTnuP5VKZT4DTA05PZ/I6v1BEpzbOimuyDjfMYpTXDFxEXwvRDTOXm/L\nmcP9lGI2YC6KNgNDgbnAKtc2q4Dr7e0WYD+ZpC4iIoPIT2I/ASwGXgTeAp4GtgC32AngeWAH5iLr\ncmCRI/4p4N+BSZg6/IJyNFxERHLz+wWl1XZyWu6aX5wndl5RLRIRkZJoSAERkZhRYhcRiRkldhGR\nmFFiFxGJGSV2EZGYUWIXEYkZJXYRkZhRYhcRiRkldhGRmFFiFxGJGSV2EZGYUWIXEYkZJXYRkZhR\nYhcRiRkldhGRmFFiFxGJGSV2EZGYUWIXEYkZJXYRkZhRYhcRiRkldhGRmFFiFxGJGT+JvRXoBrYB\nt+fZ5kG7vguYWmSsiIiUUaHEXgc8hEnQFwDzgMmubWYB5wITgYXAw0XEiohImRVK7JcBbwO7gF6g\nHZjj2mY28IS9vR5oAs7wGSsiImVWKLGfCexxzL9rl/nZ5jd9xIqISJkVSuwpn/eTKLUhIiIyOFqA\nFxzzSxh4EfQR4AuO+W7gdJ+xAJ2YDxBNmjRp0uR/6iSgk4DtQDMw1N5Rrounz9vbLcBPiogVEZEq\nuAr4GeZC6BK77BY7pT1k13cBlxSIFRERERERv+qq3QCHycBNwFxMeedi4APg/Qru72PAe8Bxx/JW\nzBlGPpcDI23cDOAPgeHAzoq0MtsngWuBBkyZK58WoAf4EKgH7gS+jnm8r9vludwK/AI4WGS7Tgau\nA04FdgDzgf8JTMCU4Po8Ys8h83+/0sZs9WijiBQQliEFbgeesrfX22mIXRa0hLPAY92twD8DXwE2\nA3/gWHefR9x9wF9h+u3/JfAXmKR+F/CnRbbv73xs81PH7ZuB/4f5ULkL7+flceCwvf0A0GjbehRY\n4RF3j93nvwGLMInajxWYD+PbgL/HfNj9BPNdhu95xN2Gufh+st32ZGA85v8/0+e+a8Vpg7y/3xjk\n/UXJao91ozDvtX8Avuha9zcVa1FIbQM+kmP5ULyPnr3s8Vi3CZMgwVzc3QB81c6/4RH3FuaicD3m\niHiUXT4ceNMj7jlglf2bng47lufjbMsGMol2hH0M+Wxx3P5P17quAvsbAvw+5sPhPUzPpj/CnCXk\ns9H+PQn4lf0LphvsxpwRxiYyZ431wMv29ni8r/g3Yd483cA+zJldt13W5BHnpRJv1nGYD7Z0u1Zg\nHvPf452ox7im38B80S89n0+r43YT8Bjm+X8S01Mtn/vJvLYuxZx1vQ28gzkrzeUN4A7MGVcxpgHr\nMM/lOOBfgQOYM8mpHnENwN2YA7GDmDP59UBbgf0Ffa1ckmf6LeCXHnHP2vu+BvP+XgkMs+u8ckvQ\n10qodWMSrFsz5uJrPhs9Jq9T+c2u+ZHAi8Bf451QOvPczjXv9Abwj5ij0OmYN8t/2dvTPeLeJPPG\ndr8ovPb3feAGe3sF5s0EMAnzBvJqp9NQzLeF2/EuiW3GHG2PxnzgpY/2hjPwuXbaSOZFPwbz4eW8\nz3x+hDnLO4PMdyjGAt+w6/IZ7DfrS5izwiWY1/g3MB9aX7H3kU8fprTnnHrt3x0ecc62PAZ8C/Me\n+hrmDDUf50FCB9mvl//IE7MTc/b6DuY19TXMlxILeR3TqWIe5kuL12L+h78LvOYRtwpzFj4O+F/A\n/7Ht+zvgXo+4oK+V/8Z8AOWajnrEuQ+c/hz4MXAKlXmthFq6rv0C8Ld2egFTR77KI24v5lO+Ocf0\nC4+4dcAU17KPYF4kXvXg9ZgjS8guYzUx8MjYqQ7zYlxD5qjET01+F5k39g7MCxLM0UuhI9onbMx6\nMknhFUydPR+vF94Ij3VL7L5+hhkv6C3M0ccm4H97xN2GSe7fs7HpD6PTbFvz2Rpw3WC/WZ3/o3c8\n1rl9HfP6v9ixzM/rxdmWLrK/OOh1praFzBnzT1zr8p1xpfeVAH4HM0bULzHP5UKfbSzmOXGfEacP\nAobgffAX9LWyGfPBkYtXNWALA0vcbfb+dnvEBX2thF4d8NuY+uznMBcAT/KMMOWCT+ZZ91Se5WA+\n9c/IsTyBuTiaz7A8y08BLvKISzsL+Cfgu3i/OAqpx1xkLGQU5gPsUnI/XrfzSmhTM5kywTmYi6Fe\nHyJpH8X8z88vYl//ivnAcJYXzsAcma3xiBvsN6szmX7btc6rRAXmNfpPmLPIRvwl9ncxBxBfxxwU\nOBO7V6nwK5jn9AogibkuMx34JqYUkEuuD7STMAdpXtdxfoq5SP55zHN+jV0+HXMQks9rZN7rczBn\n2GleiT3oa+Va8r8m/yDPcoClwO/lWN6KKTnnU8prRULiarxPH8XbGMzF63TddJ+9/Zd416AH+816\nD7mvTUzElMv8mINJeHt9bJvEXFhPT+na7FgKX6yfCTyDSdgbMdccbiH3tS8w5bkgLsOUe54CzsYk\n14OYM95LPeLSPbr2Y86W0gchp2I6Q+QT9LUCpufc75K5HpfmVUUIGleO14pIbHn1hvJyQ+FNqhZX\nT+aMcLBfyhcjAAAB50lEQVQfX5D9BW1jpR7bZOBTDEycrTm2TbsVcybwz5izMucHv1f5LWhcup1B\nPkhEYi9oiUtx5YsLUxuDJtqgPeeCxpXygTBAoRq2SBh51Ry9uvXFIc6r69tg7i9MbfSKW4jp9XQI\nk2hX2r/LPGLAXJ84ZG/vwvRkW4kpH3mNZhs0zt3O7/tsZ05K7BJFp2FOo/flWPfvihuUuCi0EQYm\n2un4S7S/wnQ8SPdIOYS5NvYY2b2VyhUX9ANBJDaC9oZSXPniotBGCN61OWjPuaBxQdspIlJzgiba\nwRaVdoqIiIiIiIiIiIiIiEjMjAK+bG+PxYzJIiIiEdaMBlYSEYmVduAI5qvaz5BJ8m2Yr3T/CDOa\n4mLgTzADVL2GGW8ezOiVqzFfF3+F0kbFFBGRMjibTDJ33m7DjNY4AjMU8wEyY4v/X8z48WB+FOFc\ne/vjdl4kNDSkgNSiRJ7bYL4BeNhO+zG/mAQm+V+MSfr/g+y6/NDKNFMkGCV2kWzOn1Tsc8z3Yd4v\nQzDjlXj9PqdIVbl/FUakFvTg/ePcuaSP7Hsw9fc/dCz3GtxJZNApsUst+jXmV3g2Yn5JJ2WXpxy3\nyXE7PT8fuBEzgt8mYHYlGysiIiIiIiIiIiIiIiIiIiIiIiIiIiIiIhJm/x8evA6RfDpmGwAAAABJ\nRU5ErkJggg==\n",
"text": [
""
]
}
],
"prompt_number": 26
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Use Series indexing to perform vectorized mappings\n",
"---"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"crimes['coarse_cat'] = coarse_cat\n",
"thefts = crimes[crimes.coarse_cat == 'THEFT']\n",
"\n",
"thefts.groupby(thefts.date.dt.weekday).size()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 27,
"text": [
"0 7201\n",
"1 7082\n",
"2 7353\n",
"3 7243\n",
"4 8259\n",
"5 8368\n",
"6 7223\n",
"dtype: int64"
]
}
],
"prompt_number": 27
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"day_names = pd.Series(['Weekday'] * 5 + ['Weekend'] * 2)\n",
"day_names"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 28,
"text": [
"0 Weekday\n",
"1 Weekday\n",
"2 Weekday\n",
"3 Weekday\n",
"4 Weekday\n",
"5 Weekend\n",
"6 Weekend\n",
"dtype: object"
]
}
],
"prompt_number": 28
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"day_type = day_names.take(thefts.date.dt.weekday)\n",
"thefts.groupby(day_type.values).size()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 29,
"text": [
"Weekday 37138\n",
"Weekend 15591\n",
"dtype: int64"
]
}
],
"prompt_number": 29
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Top-N selection / filtering\n",
"===\n",
"- SQL: use `ORDER BY` + `LIMIT` to find\n",
" - Many databases feature optimized Top-N algorithms\n",
"- pandas: more of a manual affair"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"query = \"\"\"\n",
"with t0 as (\n",
" select descript, count(*) as occs\n",
" from crimes\n",
" group by 1\n",
" order by occs desc\n",
" limit 5\n",
")\n",
"select descript, dayofweek, count(*) as occs\n",
"from crimes\n",
"where descript in (select descript from t0)\n",
"group by 1, 2\n",
"order by descript, dayofweek;\n",
"\"\"\"\n",
"sql.read_sql(query, con)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" descript | \n",
" dayofweek | \n",
" occs | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" AIDED CASE, MENTAL DISTURBED | \n",
" Friday | \n",
" 704 | \n",
"
\n",
" \n",
" 1 | \n",
" AIDED CASE, MENTAL DISTURBED | \n",
" Monday | \n",
" 718 | \n",
"
\n",
" \n",
" 2 | \n",
" AIDED CASE, MENTAL DISTURBED | \n",
" Saturday | \n",
" 747 | \n",
"
\n",
" \n",
" 3 | \n",
" AIDED CASE, MENTAL DISTURBED | \n",
" Sunday | \n",
" 710 | \n",
"
\n",
" \n",
" 4 | \n",
" AIDED CASE, MENTAL DISTURBED | \n",
" Thursday | \n",
" 745 | \n",
"
\n",
" \n",
" 5 | \n",
" AIDED CASE, MENTAL DISTURBED | \n",
" Tuesday | \n",
" 735 | \n",
"
\n",
" \n",
" 6 | \n",
" AIDED CASE, MENTAL DISTURBED | \n",
" Wednesday | \n",
" 781 | \n",
"
\n",
" \n",
" 7 | \n",
" DRIVERS LICENSE, SUSPENDED OR REVOKED | \n",
" Friday | \n",
" 899 | \n",
"
\n",
" \n",
" 8 | \n",
" DRIVERS LICENSE, SUSPENDED OR REVOKED | \n",
" Monday | \n",
" 876 | \n",
"
\n",
" \n",
" 9 | \n",
" DRIVERS LICENSE, SUSPENDED OR REVOKED | \n",
" Saturday | \n",
" 785 | \n",
"
\n",
" \n",
" 10 | \n",
" DRIVERS LICENSE, SUSPENDED OR REVOKED | \n",
" Sunday | \n",
" 778 | \n",
"
\n",
" \n",
" 11 | \n",
" DRIVERS LICENSE, SUSPENDED OR REVOKED | \n",
" Thursday | \n",
" 829 | \n",
"
\n",
" \n",
" 12 | \n",
" DRIVERS LICENSE, SUSPENDED OR REVOKED | \n",
" Tuesday | \n",
" 904 | \n",
"
\n",
" \n",
" 13 | \n",
" DRIVERS LICENSE, SUSPENDED OR REVOKED | \n",
" Wednesday | \n",
" 878 | \n",
"
\n",
" \n",
" 14 | \n",
" GRAND THEFT FROM LOCKED AUTO | \n",
" Friday | \n",
" 2492 | \n",
"
\n",
" \n",
" 15 | \n",
" GRAND THEFT FROM LOCKED AUTO | \n",
" Monday | \n",
" 2267 | \n",
"
\n",
" \n",
" 16 | \n",
" GRAND THEFT FROM LOCKED AUTO | \n",
" Saturday | \n",
" 2605 | \n",
"
\n",
" \n",
" 17 | \n",
" GRAND THEFT FROM LOCKED AUTO | \n",
" Sunday | \n",
" 2162 | \n",
"
\n",
" \n",
" 18 | \n",
" GRAND THEFT FROM LOCKED AUTO | \n",
" Thursday | \n",
" 2281 | \n",
"
\n",
" \n",
" 19 | \n",
" GRAND THEFT FROM LOCKED AUTO | \n",
" Tuesday | \n",
" 2234 | \n",
"
\n",
" \n",
" 20 | \n",
" GRAND THEFT FROM LOCKED AUTO | \n",
" Wednesday | \n",
" 2241 | \n",
"
\n",
" \n",
" 21 | \n",
" LOST PROPERTY | \n",
" Friday | \n",
" 860 | \n",
"
\n",
" \n",
" 22 | \n",
" LOST PROPERTY | \n",
" Monday | \n",
" 606 | \n",
"
\n",
" \n",
" 23 | \n",
" LOST PROPERTY | \n",
" Saturday | \n",
" 907 | \n",
"
\n",
" \n",
" 24 | \n",
" LOST PROPERTY | \n",
" Sunday | \n",
" 728 | \n",
"
\n",
" \n",
" 25 | \n",
" LOST PROPERTY | \n",
" Thursday | \n",
" 637 | \n",
"
\n",
" \n",
" 26 | \n",
" LOST PROPERTY | \n",
" Tuesday | \n",
" 593 | \n",
"
\n",
" \n",
" 27 | \n",
" LOST PROPERTY | \n",
" Wednesday | \n",
" 691 | \n",
"
\n",
" \n",
" 28 | \n",
" PETTY THEFT OF PROPERTY | \n",
" Friday | \n",
" 825 | \n",
"
\n",
" \n",
" 29 | \n",
" PETTY THEFT OF PROPERTY | \n",
" Monday | \n",
" 558 | \n",
"
\n",
" \n",
" 30 | \n",
" PETTY THEFT OF PROPERTY | \n",
" Saturday | \n",
" 985 | \n",
"
\n",
" \n",
" 31 | \n",
" PETTY THEFT OF PROPERTY | \n",
" Sunday | \n",
" 749 | \n",
"
\n",
" \n",
" 32 | \n",
" PETTY THEFT OF PROPERTY | \n",
" Thursday | \n",
" 673 | \n",
"
\n",
" \n",
" 33 | \n",
" PETTY THEFT OF PROPERTY | \n",
" Tuesday | \n",
" 607 | \n",
"
\n",
" \n",
" 34 | \n",
" PETTY THEFT OF PROPERTY | \n",
" Wednesday | \n",
" 644 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 34,
"text": [
" descript dayofweek occs\n",
"0 AIDED CASE, MENTAL DISTURBED Friday 704\n",
"1 AIDED CASE, MENTAL DISTURBED Monday 718\n",
"2 AIDED CASE, MENTAL DISTURBED Saturday 747\n",
"3 AIDED CASE, MENTAL DISTURBED Sunday 710\n",
"4 AIDED CASE, MENTAL DISTURBED Thursday 745\n",
"5 AIDED CASE, MENTAL DISTURBED Tuesday 735\n",
"6 AIDED CASE, MENTAL DISTURBED Wednesday 781\n",
"7 DRIVERS LICENSE, SUSPENDED OR REVOKED Friday 899\n",
"8 DRIVERS LICENSE, SUSPENDED OR REVOKED Monday 876\n",
"9 DRIVERS LICENSE, SUSPENDED OR REVOKED Saturday 785\n",
"10 DRIVERS LICENSE, SUSPENDED OR REVOKED Sunday 778\n",
"11 DRIVERS LICENSE, SUSPENDED OR REVOKED Thursday 829\n",
"12 DRIVERS LICENSE, SUSPENDED OR REVOKED Tuesday 904\n",
"13 DRIVERS LICENSE, SUSPENDED OR REVOKED Wednesday 878\n",
"14 GRAND THEFT FROM LOCKED AUTO Friday 2492\n",
"15 GRAND THEFT FROM LOCKED AUTO Monday 2267\n",
"16 GRAND THEFT FROM LOCKED AUTO Saturday 2605\n",
"17 GRAND THEFT FROM LOCKED AUTO Sunday 2162\n",
"18 GRAND THEFT FROM LOCKED AUTO Thursday 2281\n",
"19 GRAND THEFT FROM LOCKED AUTO Tuesday 2234\n",
"20 GRAND THEFT FROM LOCKED AUTO Wednesday 2241\n",
"21 LOST PROPERTY Friday 860\n",
"22 LOST PROPERTY Monday 606\n",
"23 LOST PROPERTY Saturday 907\n",
"24 LOST PROPERTY Sunday 728\n",
"25 LOST PROPERTY Thursday 637\n",
"26 LOST PROPERTY Tuesday 593\n",
"27 LOST PROPERTY Wednesday 691\n",
"28 PETTY THEFT OF PROPERTY Friday 825\n",
"29 PETTY THEFT OF PROPERTY Monday 558\n",
"30 PETTY THEFT OF PROPERTY Saturday 985\n",
"31 PETTY THEFT OF PROPERTY Sunday 749\n",
"32 PETTY THEFT OF PROPERTY Thursday 673\n",
"33 PETTY THEFT OF PROPERTY Tuesday 607\n",
"34 PETTY THEFT OF PROPERTY Wednesday 644"
]
}
],
"prompt_number": 34
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"query = \"\"\"\n",
"select c.descript, c.dayofweek, count(*) as occs\n",
"from crimes c\n",
" inner join (\n",
" select descript, count(*) as occs\n",
" from crimes\n",
" group by 1\n",
" order by occs desc\n",
" limit 5\n",
" ) c2 on c.descript = c2.descript\n",
"group by 1, 2;\n",
"\"\"\"\n",
"sql.read_sql(query, con)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" descript | \n",
" dayofweek | \n",
" occs | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" PETTY THEFT OF PROPERTY | \n",
" Thursday | \n",
" 673 | \n",
"
\n",
" \n",
" 1 | \n",
" LOST PROPERTY | \n",
" Wednesday | \n",
" 691 | \n",
"
\n",
" \n",
" 2 | \n",
" PETTY THEFT OF PROPERTY | \n",
" Wednesday | \n",
" 644 | \n",
"
\n",
" \n",
" 3 | \n",
" LOST PROPERTY | \n",
" Thursday | \n",
" 637 | \n",
"
\n",
" \n",
" 4 | \n",
" LOST PROPERTY | \n",
" Friday | \n",
" 860 | \n",
"
\n",
" \n",
" 5 | \n",
" PETTY THEFT OF PROPERTY | \n",
" Friday | \n",
" 825 | \n",
"
\n",
" \n",
" 6 | \n",
" GRAND THEFT FROM LOCKED AUTO | \n",
" Friday | \n",
" 2492 | \n",
"
\n",
" \n",
" 7 | \n",
" DRIVERS LICENSE, SUSPENDED OR REVOKED | \n",
" Sunday | \n",
" 778 | \n",
"
\n",
" \n",
" 8 | \n",
" DRIVERS LICENSE, SUSPENDED OR REVOKED | \n",
" Monday | \n",
" 876 | \n",
"
\n",
" \n",
" 9 | \n",
" GRAND THEFT FROM LOCKED AUTO | \n",
" Thursday | \n",
" 2281 | \n",
"
\n",
" \n",
" 10 | \n",
" AIDED CASE, MENTAL DISTURBED | \n",
" Saturday | \n",
" 747 | \n",
"
\n",
" \n",
" 11 | \n",
" GRAND THEFT FROM LOCKED AUTO | \n",
" Wednesday | \n",
" 2241 | \n",
"
\n",
" \n",
" 12 | \n",
" AIDED CASE, MENTAL DISTURBED | \n",
" Wednesday | \n",
" 781 | \n",
"
\n",
" \n",
" 13 | \n",
" AIDED CASE, MENTAL DISTURBED | \n",
" Thursday | \n",
" 745 | \n",
"
\n",
" \n",
" 14 | \n",
" GRAND THEFT FROM LOCKED AUTO | \n",
" Saturday | \n",
" 2605 | \n",
"
\n",
" \n",
" 15 | \n",
" AIDED CASE, MENTAL DISTURBED | \n",
" Friday | \n",
" 704 | \n",
"
\n",
" \n",
" 16 | \n",
" LOST PROPERTY | \n",
" Saturday | \n",
" 907 | \n",
"
\n",
" \n",
" 17 | \n",
" PETTY THEFT OF PROPERTY | \n",
" Saturday | \n",
" 985 | \n",
"
\n",
" \n",
" 18 | \n",
" DRIVERS LICENSE, SUSPENDED OR REVOKED | \n",
" Tuesday | \n",
" 904 | \n",
"
\n",
" \n",
" 19 | \n",
" PETTY THEFT OF PROPERTY | \n",
" Tuesday | \n",
" 607 | \n",
"
\n",
" \n",
" 20 | \n",
" AIDED CASE, MENTAL DISTURBED | \n",
" Monday | \n",
" 718 | \n",
"
\n",
" \n",
" 21 | \n",
" LOST PROPERTY | \n",
" Tuesday | \n",
" 593 | \n",
"
\n",
" \n",
" 22 | \n",
" DRIVERS LICENSE, SUSPENDED OR REVOKED | \n",
" Saturday | \n",
" 785 | \n",
"
\n",
" \n",
" 23 | \n",
" AIDED CASE, MENTAL DISTURBED | \n",
" Sunday | \n",
" 710 | \n",
"
\n",
" \n",
" 24 | \n",
" GRAND THEFT FROM LOCKED AUTO | \n",
" Tuesday | \n",
" 2234 | \n",
"
\n",
" \n",
" 25 | \n",
" LOST PROPERTY | \n",
" Monday | \n",
" 606 | \n",
"
\n",
" \n",
" 26 | \n",
" PETTY THEFT OF PROPERTY | \n",
" Monday | \n",
" 558 | \n",
"
\n",
" \n",
" 27 | \n",
" AIDED CASE, MENTAL DISTURBED | \n",
" Tuesday | \n",
" 735 | \n",
"
\n",
" \n",
" 28 | \n",
" PETTY THEFT OF PROPERTY | \n",
" Sunday | \n",
" 749 | \n",
"
\n",
" \n",
" 29 | \n",
" LOST PROPERTY | \n",
" Sunday | \n",
" 728 | \n",
"
\n",
" \n",
" 30 | \n",
" DRIVERS LICENSE, SUSPENDED OR REVOKED | \n",
" Friday | \n",
" 899 | \n",
"
\n",
" \n",
" 31 | \n",
" GRAND THEFT FROM LOCKED AUTO | \n",
" Sunday | \n",
" 2162 | \n",
"
\n",
" \n",
" 32 | \n",
" DRIVERS LICENSE, SUSPENDED OR REVOKED | \n",
" Wednesday | \n",
" 878 | \n",
"
\n",
" \n",
" 33 | \n",
" DRIVERS LICENSE, SUSPENDED OR REVOKED | \n",
" Thursday | \n",
" 829 | \n",
"
\n",
" \n",
" 34 | \n",
" GRAND THEFT FROM LOCKED AUTO | \n",
" Monday | \n",
" 2267 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 33,
"text": [
" descript dayofweek occs\n",
"0 PETTY THEFT OF PROPERTY Thursday 673\n",
"1 LOST PROPERTY Wednesday 691\n",
"2 PETTY THEFT OF PROPERTY Wednesday 644\n",
"3 LOST PROPERTY Thursday 637\n",
"4 LOST PROPERTY Friday 860\n",
"5 PETTY THEFT OF PROPERTY Friday 825\n",
"6 GRAND THEFT FROM LOCKED AUTO Friday 2492\n",
"7 DRIVERS LICENSE, SUSPENDED OR REVOKED Sunday 778\n",
"8 DRIVERS LICENSE, SUSPENDED OR REVOKED Monday 876\n",
"9 GRAND THEFT FROM LOCKED AUTO Thursday 2281\n",
"10 AIDED CASE, MENTAL DISTURBED Saturday 747\n",
"11 GRAND THEFT FROM LOCKED AUTO Wednesday 2241\n",
"12 AIDED CASE, MENTAL DISTURBED Wednesday 781\n",
"13 AIDED CASE, MENTAL DISTURBED Thursday 745\n",
"14 GRAND THEFT FROM LOCKED AUTO Saturday 2605\n",
"15 AIDED CASE, MENTAL DISTURBED Friday 704\n",
"16 LOST PROPERTY Saturday 907\n",
"17 PETTY THEFT OF PROPERTY Saturday 985\n",
"18 DRIVERS LICENSE, SUSPENDED OR REVOKED Tuesday 904\n",
"19 PETTY THEFT OF PROPERTY Tuesday 607\n",
"20 AIDED CASE, MENTAL DISTURBED Monday 718\n",
"21 LOST PROPERTY Tuesday 593\n",
"22 DRIVERS LICENSE, SUSPENDED OR REVOKED Saturday 785\n",
"23 AIDED CASE, MENTAL DISTURBED Sunday 710\n",
"24 GRAND THEFT FROM LOCKED AUTO Tuesday 2234\n",
"25 LOST PROPERTY Monday 606\n",
"26 PETTY THEFT OF PROPERTY Monday 558\n",
"27 AIDED CASE, MENTAL DISTURBED Tuesday 735\n",
"28 PETTY THEFT OF PROPERTY Sunday 749\n",
"29 LOST PROPERTY Sunday 728\n",
"30 DRIVERS LICENSE, SUSPENDED OR REVOKED Friday 899\n",
"31 GRAND THEFT FROM LOCKED AUTO Sunday 2162\n",
"32 DRIVERS LICENSE, SUSPENDED OR REVOKED Wednesday 878\n",
"33 DRIVERS LICENSE, SUSPENDED OR REVOKED Thursday 829\n",
"34 GRAND THEFT FROM LOCKED AUTO Monday 2267"
]
}
],
"prompt_number": 33
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"counts = crimes.groupby('descript').size()\n",
"counts"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 36,
"text": [
"descript\n",
"ABANDONMENT OF CHILD 4\n",
"ABORTION 1\n",
"ACCESS CARD INFORMATION, PUBLICATION OF 1\n",
"ACCESS CARD INFORMATION, THEFT OF 135\n",
"ACCIDENTAL BURNS 1\n",
"ACCIDENTAL SHOOTING 1\n",
"ACTS AGAINST PUBLIC TRANSIT 34\n",
"ADVERTISING DISTRIBUTORS PERMIT VIOLATION 2\n",
"AEROSOL CONTAINER; SALE, PURCHASE OR POSSESSION OF 3\n",
"AFFIXING ADVERTISMENTS TO POLES 1\n",
"AGGRAVATED ASSAULT OF POLICE OFFICER,BODILY FORCE 39\n",
"AGGRAVATED ASSAULT ON POLICE OFFICER WITH A KNIFE 2\n",
"AGGRAVATED ASSAULT WITH A DEADLY WEAPON 1102\n",
"AGGRAVATED ASSAULT WITH A GUN 158\n",
"AGGRAVATED ASSAULT WITH A KNIFE 447\n",
"...\n",
"VIOLATION OF STAY AWAY ORDER 131\n",
"WARRANT ARREST 4042\n",
"WEAPON, ASSAULT, POSSESSION, MANUFACTURE, OR SALE 29\n",
"WEAPON, ASSAULT, REGISTRATION OR TRANSFER VIOLATION 1\n",
"WEAPON, DEADLY, CARRYING WITH INTENT TO COMMIT ASSAULT 37\n",
"WEAPON, DEADLY, EXHIBITING TO RESIST ARREST 6\n",
"WEAPON, DEADLY, POSSESSION OF TO VIOLATE 136.1 PC 1\n",
"WEAPON, POSSESS OR BRING OTHER ON SCHOOL GROUNDS 33\n",
"WEAPON, POSSESSING IN PUBLIC BUILDING OR OPEN MEETING 7\n",
"WEAPON, TAKING OR ATTEMPTING TO TAKE FROM PEACE OFFICER 3\n",
"WEAPONS POSSESSION BY JUVENILE SUSPECT 2\n",
"WEARING MASK OR DISGUISE FOR UNLAWFUL PURPOSE 2\n",
"WILLFUL CRUELTY TO CHILD 25\n",
"WIRETAPS, UNAUTHORIZED 1\n",
"YOUTH COURT 1\n",
"Length: 744, dtype: int64"
]
}
],
"prompt_number": 36
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"counts.order(ascending=False)[:5].index"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 37,
"text": [
"Index([u'GRAND THEFT FROM LOCKED AUTO', u'DRIVERS LICENSE, SUSPENDED OR REVOKED', u'AIDED CASE, MENTAL DISTURBED', u'PETTY THEFT OF PROPERTY', u'LOST PROPERTY'], dtype='object')"
]
}
],
"prompt_number": 37
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# this could be very large\n",
"K = 5\n",
"counts = crimes.groupby('descript').size()\n",
"top_descripts = counts.order(ascending=False)[:K].index\n",
"top_descripts\n",
"\n",
"filtered = crimes[crimes.descript.isin(top_descripts)]\n",
"result = (filtered\n",
" .groupby(['descript', filtered.time.map(lambda x: x.hour)])\n",
" .size())\n",
"#result\n",
"result.unstack('descript')"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"\n",
"
\n",
" \n",
" \n",
" descript | \n",
" AIDED CASE, MENTAL DISTURBED | \n",
" DRIVERS LICENSE, SUSPENDED OR REVOKED | \n",
" GRAND THEFT FROM LOCKED AUTO | \n",
" LOST PROPERTY | \n",
" PETTY THEFT OF PROPERTY | \n",
"
\n",
" \n",
" time | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 170 | \n",
" 384 | \n",
" 526 | \n",
" 250 | \n",
" 284 | \n",
"
\n",
" \n",
" 1 | \n",
" 120 | \n",
" 250 | \n",
" 327 | \n",
" 177 | \n",
" 234 | \n",
"
\n",
" \n",
" 2 | \n",
" 114 | \n",
" 193 | \n",
" 237 | \n",
" 106 | \n",
" 83 | \n",
"
\n",
" \n",
" 3 | \n",
" 103 | \n",
" 121 | \n",
" 133 | \n",
" 42 | \n",
" 43 | \n",
"
\n",
" \n",
" 4 | \n",
" 74 | \n",
" 131 | \n",
" 78 | \n",
" 11 | \n",
" 27 | \n",
"
\n",
" \n",
" 5 | \n",
" 77 | \n",
" 97 | \n",
" 96 | \n",
" 32 | \n",
" 22 | \n",
"
\n",
" \n",
" 6 | \n",
" 123 | \n",
" 45 | \n",
" 144 | \n",
" 44 | \n",
" 35 | \n",
"
\n",
" \n",
" 7 | \n",
" 165 | \n",
" 126 | \n",
" 194 | \n",
" 97 | \n",
" 83 | \n",
"
\n",
" \n",
" 8 | \n",
" 217 | \n",
" 201 | \n",
" 276 | \n",
" 167 | \n",
" 142 | \n",
"
\n",
" \n",
" 9 | \n",
" 234 | \n",
" 208 | \n",
" 405 | \n",
" 206 | \n",
" 207 | \n",
"
\n",
" \n",
" 10 | \n",
" 245 | \n",
" 204 | \n",
" 588 | \n",
" 265 | \n",
" 227 | \n",
"
\n",
" \n",
" 11 | \n",
" 284 | \n",
" 207 | \n",
" 665 | \n",
" 246 | \n",
" 229 | \n",
"
\n",
" \n",
" 12 | \n",
" 288 | \n",
" 286 | \n",
" 655 | \n",
" 395 | \n",
" 283 | \n",
"
\n",
" \n",
" 13 | \n",
" 282 | \n",
" 262 | \n",
" 633 | \n",
" 289 | \n",
" 228 | \n",
"
\n",
" \n",
" 14 | \n",
" 254 | \n",
" 243 | \n",
" 652 | \n",
" 264 | \n",
" 270 | \n",
"
\n",
" \n",
" 15 | \n",
" 265 | \n",
" 236 | \n",
" 666 | \n",
" 321 | \n",
" 294 | \n",
"
\n",
" \n",
" 16 | \n",
" 331 | \n",
" 329 | \n",
" 779 | \n",
" 319 | \n",
" 317 | \n",
"
\n",
" \n",
" 17 | \n",
" 275 | \n",
" 386 | \n",
" 1015 | \n",
" 289 | \n",
" 335 | \n",
"
\n",
" \n",
" 18 | \n",
" 263 | \n",
" 358 | \n",
" 1731 | \n",
" 277 | \n",
" 345 | \n",
"
\n",
" \n",
" 19 | \n",
" 300 | \n",
" 291 | \n",
" 1746 | \n",
" 242 | \n",
" 308 | \n",
"
\n",
" \n",
" 20 | \n",
" 286 | \n",
" 227 | \n",
" 1595 | \n",
" 243 | \n",
" 274 | \n",
"
\n",
" \n",
" 21 | \n",
" 252 | \n",
" 279 | \n",
" 1208 | \n",
" 241 | \n",
" 223 | \n",
"
\n",
" \n",
" 22 | \n",
" 220 | \n",
" 449 | \n",
" 1121 | \n",
" 238 | \n",
" 239 | \n",
"
\n",
" \n",
" 23 | \n",
" 198 | \n",
" 436 | \n",
" 812 | \n",
" 261 | \n",
" 309 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 39,
"text": [
"descript AIDED CASE, MENTAL DISTURBED DRIVERS LICENSE, SUSPENDED OR REVOKED \\\n",
"time \n",
"0 170 384 \n",
"1 120 250 \n",
"2 114 193 \n",
"3 103 121 \n",
"4 74 131 \n",
"5 77 97 \n",
"6 123 45 \n",
"7 165 126 \n",
"8 217 201 \n",
"9 234 208 \n",
"10 245 204 \n",
"11 284 207 \n",
"12 288 286 \n",
"13 282 262 \n",
"14 254 243 \n",
"15 265 236 \n",
"16 331 329 \n",
"17 275 386 \n",
"18 263 358 \n",
"19 300 291 \n",
"20 286 227 \n",
"21 252 279 \n",
"22 220 449 \n",
"23 198 436 \n",
"\n",
"descript GRAND THEFT FROM LOCKED AUTO LOST PROPERTY PETTY THEFT OF PROPERTY \n",
"time \n",
"0 526 250 284 \n",
"1 327 177 234 \n",
"2 237 106 83 \n",
"3 133 42 43 \n",
"4 78 11 27 \n",
"5 96 32 22 \n",
"6 144 44 35 \n",
"7 194 97 83 \n",
"8 276 167 142 \n",
"9 405 206 207 \n",
"10 588 265 227 \n",
"11 665 246 229 \n",
"12 655 395 283 \n",
"13 633 289 228 \n",
"14 652 264 270 \n",
"15 666 321 294 \n",
"16 779 319 317 \n",
"17 1015 289 335 \n",
"18 1731 277 345 \n",
"19 1746 242 308 \n",
"20 1595 243 274 \n",
"21 1208 241 223 \n",
"22 1121 238 239 \n",
"23 812 261 309 "
]
}
],
"prompt_number": 39
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Subqueries\n",
"===\n",
"- Mainly used in WHERE clauses in SQL\n",
"- Scalar and uncorrelated / correlated cases."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"acquisitions.price_amount.isnull().value_counts()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 40,
"text": [
"True 51424\n",
"False 3816\n",
"dtype: int64"
]
}
],
"prompt_number": 40
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"acqs_known = acquisitions[acquisitions.price_amount.notnull()]\n",
"acqs_known[acqs_known.price_amount > acqs_known.price_amount.mean()]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" company_permalink | \n",
" company_name | \n",
" company_category_list | \n",
" company_market | \n",
" company_country_code | \n",
" company_state_code | \n",
" company_region | \n",
" company_city | \n",
" acquirer_permalink | \n",
" acquirer_name | \n",
" ... | \n",
" acquirer_country_code | \n",
" acquirer_state_code | \n",
" acquirer_region | \n",
" acquirer_city | \n",
" acquired_at | \n",
" acquired_month | \n",
" acquired_quarter | \n",
" acquired_year | \n",
" price_amount | \n",
" price_currency_code | \n",
"
\n",
" \n",
" \n",
" \n",
" 31 | \n",
" /organization/21st-century-insurance | \n",
" 21st Century Insurance | \n",
" |Finance|Business Services|Insurance| | \n",
" Business Services | \n",
" USA | \n",
" AL | \n",
" AL - Other | \n",
" De Armanville | \n",
" /organization/american-international-group | \n",
" American International Group | \n",
" ... | \n",
" USA | \n",
" NY | \n",
" New York City | \n",
" New York | \n",
" 2007-09-01 | \n",
" 2007-09 | \n",
" 2007-Q3 | \n",
" 2007 | \n",
" 749000000 | \n",
" USD | \n",
"
\n",
" \n",
" 56 | \n",
" /organization/3com | \n",
" 3Com | \n",
" |Curated Web| | \n",
" Curated Web | \n",
" USA | \n",
" MA | \n",
" Boston | \n",
" Marlborough | \n",
" /organization/hewlett-packard | \n",
" Hewlett-Packard | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Palo Alto | \n",
" 2009-11-11 | \n",
" 2009-11 | \n",
" 2009-Q4 | \n",
" 2009 | \n",
" 2700000000 | \n",
" USD | \n",
"
\n",
" \n",
" 69 | \n",
" /organization/3par | \n",
" 3PAR | \n",
" |Services|Cloud Data Services|Software| | \n",
" Cloud Data Services | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Fremont | \n",
" /organization/hewlett-packard | \n",
" Hewlett-Packard | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Palo Alto | \n",
" 2010-09-02 | \n",
" 2010-09 | \n",
" 2010-Q3 | \n",
" 2010 | \n",
" 2350000000 | \n",
" USD | \n",
"
\n",
" \n",
" 108 | \n",
" /organization/91-wireless | \n",
" 91 Wireless | \n",
" |Mobile| | \n",
" Mobile | \n",
" CHN | \n",
" NaN | \n",
" Fuzhou Shi | \n",
" Fuzhou Shi | \n",
" /organization/baidu | \n",
" Baidu | \n",
" ... | \n",
" CHN | \n",
" NaN | \n",
" Beijing | \n",
" Beijing | \n",
" 2013-07-15 | \n",
" 2013-07 | \n",
" 2013-Q3 | \n",
" 2013 | \n",
" 1900000000 | \n",
" USD | \n",
"
\n",
" \n",
" 136 | \n",
" /organization/abc | \n",
" ABC | \n",
" |News| | \n",
" News | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" /organization/the-walt-disney-company | \n",
" The Walt Disney Company | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" Los Angeles | \n",
" Burbank | \n",
" 1995-08-05 | \n",
" 1995-08 | \n",
" 1995-Q3 | \n",
" 1995 | \n",
" 19000000000 | \n",
" USD | \n",
"
\n",
" \n",
" 152 | \n",
" /organization/abovenet | \n",
" AboveNet | \n",
" |Information Technology|Internet|Service Provi... | \n",
" Internet | \n",
" USA | \n",
" NY | \n",
" New York City | \n",
" White Plains | \n",
" /organization/zayo-group | \n",
" Zayo | \n",
" ... | \n",
" USA | \n",
" CO | \n",
" Denver | \n",
" Boulder | \n",
" 2012-03-19 | \n",
" 2012-03 | \n",
" 2012-Q1 | \n",
" 2012 | \n",
" 2300000000 | \n",
" USD | \n",
"
\n",
" \n",
" 155 | \n",
" /organization/abraxis-bioscience | \n",
" Abraxis BioScience | \n",
" |Pharmaceuticals|Biotechnology| | \n",
" Biotechnology | \n",
" USA | \n",
" CA | \n",
" Los Angeles | \n",
" Los Angeles | \n",
" /organization/celgene | \n",
" Celgene | \n",
" ... | \n",
" USA | \n",
" NJ | \n",
" Newark | \n",
" Summit | \n",
" 2010-06-30 | \n",
" 2010-06 | \n",
" 2010-Q2 | \n",
" 2010 | \n",
" 2900000000 | \n",
" USD | \n",
"
\n",
" \n",
" 173 | \n",
" /organization/accelrys | \n",
" Accelrys | \n",
" |Software| | \n",
" Software | \n",
" USA | \n",
" CA | \n",
" San Diego | \n",
" San Diego | \n",
" /organization/dassault | \n",
" Dassault Systemes | \n",
" ... | \n",
" FRA | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" 2014-01-30 | \n",
" 2014-01 | \n",
" 2014-Q1 | \n",
" 2014 | \n",
" 750000000 | \n",
" USD | \n",
"
\n",
" \n",
" 187 | \n",
" /organization/access-midstream-partners-lp-inv... | \n",
" Access Midstream Partners LP Investments | \n",
" |Natural Gas Uses| | \n",
" Natural Gas Uses | \n",
" USA | \n",
" OK | \n",
" Oklahoma City | \n",
" Oklahoma City | \n",
" /organization/the-williams-companies | \n",
" The Williams Companies | \n",
" ... | \n",
" USA | \n",
" OK | \n",
" OK - Other | \n",
" Snyder | \n",
" 2012-12-21 | \n",
" 2012-12 | \n",
" 2012-Q4 | \n",
" 2012 | \n",
" 2250000000 | \n",
" USD | \n",
"
\n",
" \n",
" 195 | \n",
" /organization/acclarent | \n",
" Acclarent | \n",
" |Medical|Biotechnology| | \n",
" Biotechnology | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Menlo Park | \n",
" /organization/johnson-johnson | \n",
" Johnson & Johnson | \n",
" ... | \n",
" USA | \n",
" NJ | \n",
" Newark | \n",
" New Brunswick | \n",
" 2009-12-17 | \n",
" 2009-12 | \n",
" 2009-Q4 | \n",
" 2009 | \n",
" 785000000 | \n",
" USD | \n",
"
\n",
" \n",
" 343 | \n",
" /organization/admob | \n",
" AdMob | \n",
" |Mobile|Advertising| | \n",
" Advertising | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Mountain View | \n",
" /organization/google | \n",
" Google | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Mountain View | \n",
" 2009-11-09 | \n",
" 2009-11 | \n",
" 2009-Q4 | \n",
" 2009 | \n",
" 750000000 | \n",
" USD | \n",
"
\n",
" \n",
" 382 | \n",
" /organization/advanced-computer-software | \n",
" Advanced Computer Software | \n",
" |Health Care|Information Technology|Software| | \n",
" Software | \n",
" GBR | \n",
" NaN | \n",
" Cobham | \n",
" Cobham | \n",
" /organization/vista-equity-partners | \n",
" Vista Equity Partners | \n",
" ... | \n",
" USA | \n",
" TX | \n",
" Austin | \n",
" Austin | \n",
" 2014-11-25 | \n",
" 2014-11 | \n",
" 2014-Q4 | \n",
" 2014 | \n",
" 725000000 | \n",
" GBP | \n",
"
\n",
" \n",
" 406 | \n",
" /organization/advanstar-communications | \n",
" Advanstar Communications | \n",
" |Market Research|Events|Business Services| | \n",
" Events | \n",
" USA | \n",
" CA | \n",
" Los Angeles | \n",
" Santa Monica | \n",
" /organization/ubm-electronics | \n",
" UBM Electronics | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" San Francisco | \n",
" 2014-10-01 | \n",
" 2014-10 | \n",
" 2014-Q4 | \n",
" 2014 | \n",
" 972000000 | \n",
" USD | \n",
"
\n",
" \n",
" 420 | \n",
" /organization/advo | \n",
" Advo | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" /organization/valassis | \n",
" Valassis | \n",
" ... | \n",
" USA | \n",
" MI | \n",
" Detroit | \n",
" Livonia | \n",
" 2006-07-06 | \n",
" 2006-07 | \n",
" 2006-Q3 | \n",
" 2006 | \n",
" 1300000000 | \n",
" USD | \n",
"
\n",
" \n",
" 430 | \n",
" /organization/aeluros | \n",
" Aeluros | \n",
" |Semiconductors| | \n",
" Semiconductors | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Mountain View | \n",
" /organization/broadcom | \n",
" Broadcom | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" Anaheim | \n",
" Irvine | \n",
" 2012-02-17 | \n",
" 2012-02 | \n",
" 2012-Q1 | \n",
" 2012 | \n",
" 3700000000 | \n",
" USD | \n",
"
\n",
" \n",
" 443 | \n",
" /organization/affiliated-computer-services | \n",
" Affiliated Computer Services | \n",
" |Consulting| | \n",
" Consulting | \n",
" USA | \n",
" TX | \n",
" Dallas | \n",
" Dallas | \n",
" /organization/xerox | \n",
" Xerox | \n",
" ... | \n",
" USA | \n",
" AL | \n",
" AL - Other | \n",
" Normal | \n",
" 2009-09-28 | \n",
" 2009-09 | \n",
" 2009-Q3 | \n",
" 2009 | \n",
" 5750000000 | \n",
" USD | \n",
"
\n",
" \n",
" 516 | \n",
" /organization/airwatch | \n",
" AirWatch | \n",
" |Mobile Devices|Mobile| | \n",
" Mobile | \n",
" USA | \n",
" GA | \n",
" Atlanta | \n",
" Atlanta | \n",
" /organization/vmware | \n",
" VMware | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Palo Alto | \n",
" 2014-01-22 | \n",
" 2014-01 | \n",
" 2014-Q1 | \n",
" 2014 | \n",
" 1540000000 | \n",
" USD | \n",
"
\n",
" \n",
" 536 | \n",
" /organization/aktiv-kapital | \n",
" Aktiv Kapital | \n",
" |Finance| | \n",
" Finance | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" /organization/portfolio-recovery-associates | \n",
" Portfolio Recovery Associates | \n",
" ... | \n",
" USA | \n",
" VA | \n",
" Norfolk - Virginia Beach | \n",
" Norfolk | \n",
" 2014-02-20 | \n",
" 2014-02 | \n",
" 2014-Q1 | \n",
" 2014 | \n",
" 1300000000 | \n",
" USD | \n",
"
\n",
" \n",
" 540 | \n",
" /organization/alabama-gas-corp | \n",
" Alabama Gas Corp | \n",
" NaN | \n",
" NaN | \n",
" USA | \n",
" AL | \n",
" Birmingham | \n",
" Birmingham | \n",
" /organization/laclede-group | \n",
" Laclede Group | \n",
" ... | \n",
" USA | \n",
" MO | \n",
" St. Louis | \n",
" St Louis | \n",
" 2014-04-07 | \n",
" 2014-04 | \n",
" 2014-Q2 | \n",
" 2014 | \n",
" 1340000000 | \n",
" USD | \n",
"
\n",
" \n",
" 576 | \n",
" /organization/algeta | \n",
" Algeta | \n",
" |Biotechnology| | \n",
" Biotechnology | \n",
" NOR | \n",
" NaN | \n",
" Oslo | \n",
" Oslo | \n",
" /organization/bayer-ag-germany | \n",
" Bayer AG | \n",
" ... | \n",
" DEU | \n",
" NaN | \n",
" Leverkusen | \n",
" Leverkusen | \n",
" 2013-12-19 | \n",
" 2013-12 | \n",
" 2013-Q4 | \n",
" 2013 | \n",
" 2900000000 | \n",
" USD | \n",
"
\n",
" \n",
" 588 | \n",
" /organization/alios-biopharma | \n",
" Alios BioPharma | \n",
" |Biotechnology| | \n",
" Biotechnology | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" South San Francisco | \n",
" /organization/johnson-johnson | \n",
" Johnson & Johnson | \n",
" ... | \n",
" USA | \n",
" NJ | \n",
" Newark | \n",
" New Brunswick | \n",
" 2014-09-30 | \n",
" 2014-09 | \n",
" 2014-Q3 | \n",
" 2014 | \n",
" 1750000000 | \n",
" USD | \n",
"
\n",
" \n",
" 606 | \n",
" /organization/allergan | \n",
" Allergan | \n",
" |Medical|Pharmaceuticals|Biotechnology| | \n",
" Medical | \n",
" USA | \n",
" CA | \n",
" Anaheim | \n",
" Irvine | \n",
" /organization/actavis | \n",
" Actavis | \n",
" ... | \n",
" USA | \n",
" NJ | \n",
" Newark | \n",
" Parsippany | \n",
" 2014-11-16 | \n",
" 2014-11 | \n",
" 2014-Q4 | \n",
" 2014 | \n",
" 65500000000 | \n",
" USD | \n",
"
\n",
" \n",
" 618 | \n",
" /organization/adl-media-inc | \n",
" AllMedia Inc | \n",
" |Advertising| | \n",
" Advertising | \n",
" USA | \n",
" IA | \n",
" IA - Other | \n",
" Plano | \n",
" /organization/liberty-global | \n",
" Liberty Global | \n",
" ... | \n",
" USA | \n",
" CO | \n",
" Denver | \n",
" Englewood | \n",
" 2014-05-08 | \n",
" 2014-05 | \n",
" 2014-Q2 | \n",
" 2014 | \n",
" 930000000 | \n",
" USD | \n",
"
\n",
" \n",
" 629 | \n",
" /organization/alltel | \n",
" Alltel | \n",
" |Public Relations| | \n",
" Public Relations | \n",
" USA | \n",
" AR | \n",
" Little Rock | \n",
" Little Rock | \n",
" /organization/verizon | \n",
" Verizon Communications | \n",
" ... | \n",
" USA | \n",
" NY | \n",
" New York City | \n",
" New York | \n",
" 2008-06-05 | \n",
" 2008-06 | \n",
" 2008-Q2 | \n",
" 2008 | \n",
" 28100000000 | \n",
" USD | \n",
"
\n",
" \n",
" 654 | \n",
" /organization/altalink | \n",
" AltaLink | \n",
" |Local Businesses|Energy|Utilities| | \n",
" Utilities | \n",
" CAN | \n",
" AB | \n",
" Calgary | \n",
" Calgary | \n",
" /organization/midamerican-energy | \n",
" Berkshire Hathaway Energy | \n",
" ... | \n",
" USA | \n",
" IA | \n",
" Des Moines | \n",
" Des Moines | \n",
" 2014-05-02 | \n",
" 2014-05 | \n",
" 2014-Q2 | \n",
" 2014 | \n",
" 3520000000 | \n",
" USD | \n",
"
\n",
" \n",
" 691 | \n",
" /organization/amcol-international | \n",
" AMCOL International | \n",
" |Manufacturing| | \n",
" Manufacturing | \n",
" USA | \n",
" IL | \n",
" Chicago | \n",
" Hoffman Estates | \n",
" /organization/minerals-technologies | \n",
" Minerals Technologies | \n",
" ... | \n",
" USA | \n",
" NY | \n",
" New York City | \n",
" New York | \n",
" 2014-03-10 | \n",
" 2014-03 | \n",
" 2014-Q1 | \n",
" 2014 | \n",
" 1700000000 | \n",
" USD | \n",
"
\n",
" \n",
" 703 | \n",
" /organization/americanexpress | \n",
" American Express | \n",
" |Finance| | \n",
" Finance | \n",
" USA | \n",
" NY | \n",
" New York City | \n",
" New York | \n",
" /organization/standard-chartered-bank | \n",
" Standard Chartered Bank | \n",
" ... | \n",
" GBR | \n",
" NaN | \n",
" London | \n",
" London | \n",
" 2008-02-28 | \n",
" 2008-02 | \n",
" 2008-Q1 | \n",
" 2008 | \n",
" 823000000 | \n",
" USD | \n",
"
\n",
" \n",
" 716 | \n",
" /organization/american-petroleum-tankers | \n",
" American Petroleum Tankers | \n",
" NaN | \n",
" NaN | \n",
" USA | \n",
" PA | \n",
" Philadelphia | \n",
" Plymouth Meeting | \n",
" /organization/kinder-morgan | \n",
" Kinder Morgan Energy | \n",
" ... | \n",
" USA | \n",
" TX | \n",
" Houston | \n",
" Houston | \n",
" 2013-12-23 | \n",
" 2013-12 | \n",
" 2013-Q4 | \n",
" 2013 | \n",
" 962000000 | \n",
" USD | \n",
"
\n",
" \n",
" 719 | \n",
" /organization/american-realty-capital-healthca... | \n",
" American Realty Capital Healthcare Trust | \n",
" NaN | \n",
" NaN | \n",
" USA | \n",
" MA | \n",
" Boston | \n",
" Boston | \n",
" /organization/ventas | \n",
" Ventas | \n",
" ... | \n",
" USA | \n",
" IL | \n",
" Chicago | \n",
" Chicago | \n",
" 2014-06-02 | \n",
" 2014-06 | \n",
" 2014-Q2 | \n",
" 2014 | \n",
" 2600000000 | \n",
" USD | \n",
"
\n",
" \n",
" 723 | \n",
" /organization/ami-semiconductor | \n",
" AMI Semiconductor | \n",
" |Electronics|Manufacturing|Design| | \n",
" Design | \n",
" USA | \n",
" ID | \n",
" Idaho Falls | \n",
" Pocatello | \n",
" /organization/on-semiconductor | \n",
" ON Semiconductor | \n",
" ... | \n",
" USA | \n",
" AZ | \n",
" Phoenix | \n",
" Phoenix | \n",
" 2007-12-14 | \n",
" 2007-12 | \n",
" 2007-Q4 | \n",
" 2007 | \n",
" 915000000 | \n",
" USD | \n",
"
\n",
" \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
"
\n",
" \n",
" 12702 | \n",
" /organization/viawest | \n",
" ViaWest | \n",
" |Data Centers|Services|Web Hosting| | \n",
" Web Hosting | \n",
" USA | \n",
" CO | \n",
" Denver | \n",
" Greenwood Village | \n",
" /organization/shaw-communications | \n",
" Shaw Communications | \n",
" ... | \n",
" CAN | \n",
" AB | \n",
" Calgary | \n",
" Calgary | \n",
" 2014-08-04 | \n",
" 2014-08 | \n",
" 2014-Q3 | \n",
" 2014 | \n",
" 1200000000 | \n",
" USD | \n",
"
\n",
" \n",
" 12704 | \n",
" /organization/viber-media | \n",
" Viber Media | \n",
" |Android|iPhone|VoIP|Messaging| | \n",
" Android | \n",
" CYP | \n",
" NaN | \n",
" Cyprus | \n",
" Limassol | \n",
" /organization/rakuten | \n",
" Rakuten | \n",
" ... | \n",
" JPN | \n",
" NaN | \n",
" Tokyo | \n",
" Tokyo | \n",
" 2014-02-14 | \n",
" 2014-02 | \n",
" 2014-Q1 | \n",
" 2014 | \n",
" 900000000 | \n",
" USD | \n",
"
\n",
" \n",
" 12717 | \n",
" /organization/video-gaming-technologies | \n",
" Video Gaming Technologies | \n",
" |Video Games| | \n",
" Video Games | \n",
" USA | \n",
" TN | \n",
" Nashville | \n",
" Franklin | \n",
" /organization/aristocrat-technologies-inc | \n",
" Aristocrat Technologies, Inc | \n",
" ... | \n",
" AUS | \n",
" NaN | \n",
" AUS - Other | \n",
" NaN | \n",
" 2014-07-07 | \n",
" 2014-07 | \n",
" 2014-Q3 | \n",
" 2014 | \n",
" 1300000000 | \n",
" USD | \n",
"
\n",
" \n",
" 12749 | \n",
" /organization/vion-food-group | \n",
" Vion Food Group | \n",
" |Hospitality| | \n",
" Hospitality | \n",
" NLD | \n",
" NaN | \n",
" Eindhoven | \n",
" Eindhoven | \n",
" /organization/darling-international | \n",
" Darling Ingredients | \n",
" ... | \n",
" USA | \n",
" TX | \n",
" Dallas | \n",
" Irving | \n",
" 2013-10-07 | \n",
" 2013-10 | \n",
" 2013-Q4 | \n",
" 2013 | \n",
" 1600000000 | \n",
" EUR | \n",
"
\n",
" \n",
" 12764 | \n",
" /organization/viropharma | \n",
" ViroPharma | \n",
" |Pharmaceuticals|Biotechnology| | \n",
" Biotechnology | \n",
" USA | \n",
" PA | \n",
" Philadelphia | \n",
" Exton | \n",
" /organization/shire | \n",
" Shire | \n",
" ... | \n",
" IRL | \n",
" NaN | \n",
" Dublin | \n",
" Dublin | \n",
" 2013-11-10 | \n",
" 2013-11 | \n",
" 2013-Q4 | \n",
" 2013 | \n",
" 4200000000 | \n",
" USD | \n",
"
\n",
" \n",
" 12843 | \n",
" /organization/vizada | \n",
" Vizada | \n",
" |DOD/Military|Mobile| | \n",
" Mobile | \n",
" USA | \n",
" MD | \n",
" Washington, D.C. | \n",
" Rockville | \n",
" /organization/eads-astrium | \n",
" EADS Astrium | \n",
" ... | \n",
" USA | \n",
" TX | \n",
" Houston | \n",
" Houston | \n",
" 2011-08-03 | \n",
" 2011-08 | \n",
" 2011-Q3 | \n",
" 2011 | \n",
" 960000000 | \n",
" USD | \n",
"
\n",
" \n",
" 12854 | \n",
" /organization/vk | \n",
" Vkontake | \n",
" |Communities|Social Network Media|Social Media| | \n",
" Communities | \n",
" RUS | \n",
" NaN | \n",
" St. Petersburg | \n",
" Saint Petersburg | \n",
" /organization/mail-ru | \n",
" Mail.Ru Group | \n",
" ... | \n",
" RUS | \n",
" NaN | \n",
" Moscow | \n",
" Moscow | \n",
" 2014-09-16 | \n",
" 2014-09 | \n",
" 2014-Q3 | \n",
" 2014 | \n",
" 1470000000 | \n",
" USD | \n",
"
\n",
" \n",
" 12887 | \n",
" /organization/volvo-rents-construction | \n",
" Volvo Rents Construction | \n",
" NaN | \n",
" NaN | \n",
" CHE | \n",
" NaN | \n",
" CHE - Other | \n",
" Eison | \n",
" /organization/platinum-equity-llc | \n",
" Platinum Equity | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" Los Angeles | \n",
" Beverly Hills | \n",
" 2013-12-10 | \n",
" 2013-12 | \n",
" 2013-Q4 | \n",
" 2013 | \n",
" 1100000000 | \n",
" USD | \n",
"
\n",
" \n",
" 12957 | \n",
" /organization/warner-chilcott | \n",
" Warner Chilcott | \n",
" |Pharmaceuticals|Biotechnology| | \n",
" Biotechnology | \n",
" USA | \n",
" NJ | \n",
" Newark | \n",
" Rockaway | \n",
" /organization/actavis | \n",
" Actavis | \n",
" ... | \n",
" USA | \n",
" NJ | \n",
" Newark | \n",
" Parsippany | \n",
" 2013-05-20 | \n",
" 2013-05 | \n",
" 2013-Q2 | \n",
" 2013 | \n",
" 8500000000 | \n",
" USD | \n",
"
\n",
" \n",
" 12981 | \n",
" /organization/waze | \n",
" Waze | \n",
" |Navigation|Transportation| | \n",
" Transportation | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Palo Alto | \n",
" /organization/google | \n",
" Google | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Mountain View | \n",
" 2013-06-11 | \n",
" 2013-06 | \n",
" 2013-Q2 | \n",
" 2013 | \n",
" 996000000 | \n",
" USD | \n",
"
\n",
" \n",
" 13011 | \n",
" /organization/webex | \n",
" WebEx | \n",
" |Curated Web| | \n",
" Curated Web | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Santa Clara | \n",
" /organization/cisco | \n",
" Cisco | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" San Jose | \n",
" 2007-03-15 | \n",
" 2007-03 | \n",
" 2007-Q1 | \n",
" 2007 | \n",
" 3200000000 | \n",
" USD | \n",
"
\n",
" \n",
" 13030 | \n",
" /organization/webtrends | \n",
" Webtrends | \n",
" |SEO|Curated Web| | \n",
" Curated Web | \n",
" USA | \n",
" OR | \n",
" Portland, Oregon | \n",
" Portland | \n",
" /organization/netiq | \n",
" NetIQ | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" San Jose | \n",
" 2001-01-17 | \n",
" 2001-01 | \n",
" 2001-Q1 | \n",
" 2001 | \n",
" 1000000000 | \n",
" USD | \n",
"
\n",
" \n",
" 13061 | \n",
" /organization/western-gas-resources | \n",
" Western Gas Resources | \n",
" |Clean Energy|Oil| | \n",
" Clean Energy | \n",
" USA | \n",
" TX | \n",
" Seminole | \n",
" Seminole | \n",
" /organization/anadarko-petroleum-corporation | \n",
" Anadarko Petroleum Corporation | \n",
" ... | \n",
" USA | \n",
" TX | \n",
" Seminole | \n",
" Seminole | \n",
" 2006-06-01 | \n",
" 2006-06 | \n",
" 2006-Q2 | \n",
" 2006 | \n",
" 4800000000 | \n",
" USD | \n",
"
\n",
" \n",
" 13069 | \n",
" /organization/weyerhaeuser | \n",
" Weyerhaeuser | \n",
" |Manufacturing| | \n",
" Manufacturing | \n",
" USA | \n",
" WA | \n",
" Seattle | \n",
" Federal Way | \n",
" /organization/tri-pointe-homes | \n",
" Tri Pointe Homes | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" San Ramon | \n",
" 2013-11-04 | \n",
" 2013-11 | \n",
" 2013-Q4 | \n",
" 2013 | \n",
" 2700000000 | \n",
" USD | \n",
"
\n",
" \n",
" 13080 | \n",
" /organization/whatsapp | \n",
" WhatsApp | \n",
" |Messaging| | \n",
" Messaging | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Santa Clara | \n",
" /organization/facebook | \n",
" Facebook | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Menlo Park | \n",
" 2014-02-19 | \n",
" 2014-02 | \n",
" 2014-Q1 | \n",
" 2014 | \n",
" 19000000000 | \n",
" USD | \n",
"
\n",
" \n",
" 13081 | \n",
" /organization/wheelabrator-technologies | \n",
" Wheelabrator Technologies | \n",
" |Waste Management| | \n",
" Waste Management | \n",
" USA | \n",
" NH | \n",
" Manchester, New Hampshire | \n",
" Hampton | \n",
" /organization/energy-capital-partners | \n",
" Energy Capital Partners | \n",
" ... | \n",
" USA | \n",
" NJ | \n",
" Newark | \n",
" Short Hills | \n",
" 2014-07-29 | \n",
" 2014-07 | \n",
" 2014-Q3 | \n",
" 2014 | \n",
" 1940000000 | \n",
" USD | \n",
"
\n",
" \n",
" 13121 | \n",
" /organization/wild-flavors | \n",
" WILD Flavors | \n",
" |Food Processing| | \n",
" Food Processing | \n",
" DEU | \n",
" NaN | \n",
" Frankfurt | \n",
" Heidelberg | \n",
" /organization/archer-daniels-midland-company | \n",
" Archer Daniels Midland Company | \n",
" ... | \n",
" USA | \n",
" IL | \n",
" Chicago | \n",
" Chicago | \n",
" 2014-07-07 | \n",
" 2014-07 | \n",
" 2014-Q3 | \n",
" 2014 | \n",
" 3000000000 | \n",
" USD | \n",
"
\n",
" \n",
" 13143 | \n",
" /organization/wind-river | \n",
" Wind River | \n",
" |Software| | \n",
" Software | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Alameda | \n",
" /organization/intel | \n",
" Intel | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Santa Clara | \n",
" 2009-07-17 | \n",
" 2009-07 | \n",
" 2009-Q3 | \n",
" 2009 | \n",
" 884000000 | \n",
" USD | \n",
"
\n",
" \n",
" 13146 | \n",
" /organization/windsor-foods | \n",
" Windsor Foods | \n",
" |Consumer Goods|Food Processing| | \n",
" Food Processing | \n",
" USA | \n",
" TX | \n",
" Houston | \n",
" Houston | \n",
" /organization/ajinomoto | \n",
" Ajinomoto | \n",
" ... | \n",
" JPN | \n",
" NaN | \n",
" Tokyo | \n",
" Tokyo | \n",
" 2014-09-10 | \n",
" 2014-09 | \n",
" 2014-Q3 | \n",
" 2014 | \n",
" 800000000 | \n",
" USD | \n",
"
\n",
" \n",
" 13174 | \n",
" /organization/wittur | \n",
" Wittur | \n",
" |Manufacturing|Public Safety|Heavy Industry| | \n",
" Heavy Industry | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" /organization/bain-capital-2 | \n",
" Bain Capital | \n",
" ... | \n",
" USA | \n",
" MA | \n",
" Boston | \n",
" Boston | \n",
" 2014-12-23 | \n",
" 2014-12 | \n",
" 2014-Q4 | \n",
" 2014 | \n",
" 750000000 | \n",
" USD | \n",
"
\n",
" \n",
" 13238 | \n",
" /organization/wyeth-pharmaceuticals | \n",
" Wyeth Pharmaceuticals | \n",
" |Biotechnology|Pharmaceuticals|Health and Well... | \n",
" Biotechnology | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" /organization/nestl | \n",
" Nestl\u0082 | \n",
" ... | \n",
" CHE | \n",
" NaN | \n",
" Vevey | \n",
" Vevey | \n",
" 2012-04-01 | \n",
" 2012-04 | \n",
" 2012-Q2 | \n",
" 2012 | \n",
" 11850000000 | \n",
" USD | \n",
"
\n",
" \n",
" 13239 | \n",
" /organization/wyeth-pharmaceuticals | \n",
" Wyeth Pharmaceuticals | \n",
" |Biotechnology|Pharmaceuticals|Health and Well... | \n",
" Biotechnology | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" /organization/pfizer | \n",
" Pfizer | \n",
" ... | \n",
" USA | \n",
" NY | \n",
" New York City | \n",
" New York | \n",
" 2009-01-23 | \n",
" 2009-01 | \n",
" 2009-Q1 | \n",
" 2009 | \n",
" 68000000000 | \n",
" USD | \n",
"
\n",
" \n",
" 13277 | \n",
" /organization/xircom | \n",
" Xircom | \n",
" |Web Hosting| | \n",
" Web Hosting | \n",
" USA | \n",
" CA | \n",
" Los Angeles | \n",
" Thousand Oaks | \n",
" /organization/intel | \n",
" Intel | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Santa Clara | \n",
" 2001-01-15 | \n",
" 2001-01 | \n",
" 2001-Q1 | \n",
" 2001 | \n",
" 748000000 | \n",
" USD | \n",
"
\n",
" \n",
" 13323 | \n",
" /organization/yammer | \n",
" Yammer | \n",
" |Twitter Applications|Networking|Social Media|... | \n",
" Enterprise Software | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" San Francisco | \n",
" /organization/microsoft | \n",
" Microsoft | \n",
" ... | \n",
" USA | \n",
" WA | \n",
" Seattle | \n",
" Redmond | \n",
" 2012-06-25 | \n",
" 2012-06 | \n",
" 2012-Q2 | \n",
" 2012 | \n",
" 1200000000 | \n",
" USD | \n",
"
\n",
" \n",
" 13324 | \n",
" /organization/yankee-candle-company | \n",
" Yankee Candle Company | \n",
" |Manufacturing| | \n",
" Manufacturing | \n",
" USA | \n",
" AR | \n",
" Fayetteville | \n",
" Rogers | \n",
" /organization/jarden | \n",
" Jarden | \n",
" ... | \n",
" USA | \n",
" NY | \n",
" New York City | \n",
" Rye | \n",
" 2013-09-03 | \n",
" 2013-09 | \n",
" 2013-Q3 | \n",
" 2013 | \n",
" 1750000000 | \n",
" USD | \n",
"
\n",
" \n",
" 13356 | \n",
" /organization/yoplait | \n",
" Yoplait | \n",
" |Food Processing| | \n",
" Food Processing | \n",
" USA | \n",
" MN | \n",
" Minneapolis | \n",
" Minneapolis | \n",
" /organization/generalmills | \n",
" General Mills | \n",
" ... | \n",
" USA | \n",
" MN | \n",
" Minneapolis | \n",
" Minneapolis | \n",
" 2011-07-01 | \n",
" 2011-07 | \n",
" 2011-Q3 | \n",
" 2011 | \n",
" 1200000000 | \n",
" USD | \n",
"
\n",
" \n",
" 13357 | \n",
" /organization/york-risk-services-group | \n",
" York Risk Services Group | \n",
" |Insurance| | \n",
" Insurance | \n",
" USA | \n",
" NJ | \n",
" Newark | \n",
" Parsippany | \n",
" /organization/onex | \n",
" Onex | \n",
" ... | \n",
" CAN | \n",
" ON | \n",
" Toronto | \n",
" Toronto | \n",
" 2014-07-16 | \n",
" 2014-07 | \n",
" 2014-Q3 | \n",
" 2014 | \n",
" 1330000000 | \n",
" USD | \n",
"
\n",
" \n",
" 13371 | \n",
" /organization/youtube | \n",
" YouTube | \n",
" |Video|Online Rental|Entertainment|Games| | \n",
" Games | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" San Bruno | \n",
" /organization/google | \n",
" Google | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Mountain View | \n",
" 2006-10-01 | \n",
" 2006-10 | \n",
" 2006-Q4 | \n",
" 2006 | \n",
" 1650000000 | \n",
" USD | \n",
"
\n",
" \n",
" 13396 | \n",
" /organization/zappos | \n",
" Zappos | \n",
" |Curated Web| | \n",
" Curated Web | \n",
" USA | \n",
" NV | \n",
" Las Vegas | \n",
" Las Vegas | \n",
" /organization/amazon | \n",
" Amazon | \n",
" ... | \n",
" USA | \n",
" WA | \n",
" Seattle | \n",
" Seattle | \n",
" 2009-07-22 | \n",
" 2009-07 | \n",
" 2009-Q3 | \n",
" 2009 | \n",
" 1200000000 | \n",
" USD | \n",
"
\n",
" \n",
" 13437 | \n",
" /organization/ziggo | \n",
" Ziggo | \n",
" |Information Services|Entertainment|Investment... | \n",
" Investment Management | \n",
" NLD | \n",
" NaN | \n",
" Utrecht | \n",
" Utrecht | \n",
" /organization/liberty-global | \n",
" Liberty Global | \n",
" ... | \n",
" USA | \n",
" CO | \n",
" Denver | \n",
" Englewood | \n",
" 2014-01-27 | \n",
" 2014-01 | \n",
" 2014-Q1 | \n",
" 2014 | \n",
" 13700000000 | \n",
" USD | \n",
"
\n",
" \n",
"
\n",
"
537 rows \u00d7 22 columns
\n",
"
"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 41,
"text": [
" company_permalink \\\n",
"31 /organization/21st-century-insurance \n",
"56 /organization/3com \n",
"69 /organization/3par \n",
"108 /organization/91-wireless \n",
"136 /organization/abc \n",
"152 /organization/abovenet \n",
"155 /organization/abraxis-bioscience \n",
"173 /organization/accelrys \n",
"187 /organization/access-midstream-partners-lp-inv... \n",
"195 /organization/acclarent \n",
"343 /organization/admob \n",
"382 /organization/advanced-computer-software \n",
"406 /organization/advanstar-communications \n",
"420 /organization/advo \n",
"430 /organization/aeluros \n",
"443 /organization/affiliated-computer-services \n",
"516 /organization/airwatch \n",
"536 /organization/aktiv-kapital \n",
"540 /organization/alabama-gas-corp \n",
"576 /organization/algeta \n",
"588 /organization/alios-biopharma \n",
"606 /organization/allergan \n",
"618 /organization/adl-media-inc \n",
"629 /organization/alltel \n",
"654 /organization/altalink \n",
"691 /organization/amcol-international \n",
"703 /organization/americanexpress \n",
"716 /organization/american-petroleum-tankers \n",
"719 /organization/american-realty-capital-healthca... \n",
"723 /organization/ami-semiconductor \n",
"... ... \n",
"12702 /organization/viawest \n",
"12704 /organization/viber-media \n",
"12717 /organization/video-gaming-technologies \n",
"12749 /organization/vion-food-group \n",
"12764 /organization/viropharma \n",
"12843 /organization/vizada \n",
"12854 /organization/vk \n",
"12887 /organization/volvo-rents-construction \n",
"12957 /organization/warner-chilcott \n",
"12981 /organization/waze \n",
"13011 /organization/webex \n",
"13030 /organization/webtrends \n",
"13061 /organization/western-gas-resources \n",
"13069 /organization/weyerhaeuser \n",
"13080 /organization/whatsapp \n",
"13081 /organization/wheelabrator-technologies \n",
"13121 /organization/wild-flavors \n",
"13143 /organization/wind-river \n",
"13146 /organization/windsor-foods \n",
"13174 /organization/wittur \n",
"13238 /organization/wyeth-pharmaceuticals \n",
"13239 /organization/wyeth-pharmaceuticals \n",
"13277 /organization/xircom \n",
"13323 /organization/yammer \n",
"13324 /organization/yankee-candle-company \n",
"13356 /organization/yoplait \n",
"13357 /organization/york-risk-services-group \n",
"13371 /organization/youtube \n",
"13396 /organization/zappos \n",
"13437 /organization/ziggo \n",
"\n",
" company_name \\\n",
"31 21st Century Insurance \n",
"56 3Com \n",
"69 3PAR \n",
"108 91 Wireless \n",
"136 ABC \n",
"152 AboveNet \n",
"155 Abraxis BioScience \n",
"173 Accelrys \n",
"187 Access Midstream Partners LP Investments \n",
"195 Acclarent \n",
"343 AdMob \n",
"382 Advanced Computer Software \n",
"406 Advanstar Communications \n",
"420 Advo \n",
"430 Aeluros \n",
"443 Affiliated Computer Services \n",
"516 AirWatch \n",
"536 Aktiv Kapital \n",
"540 Alabama Gas Corp \n",
"576 Algeta \n",
"588 Alios BioPharma \n",
"606 Allergan \n",
"618 AllMedia Inc \n",
"629 Alltel \n",
"654 AltaLink \n",
"691 AMCOL International \n",
"703 American Express \n",
"716 American Petroleum Tankers \n",
"719 American Realty Capital Healthcare Trust \n",
"723 AMI Semiconductor \n",
"... ... \n",
"12702 ViaWest \n",
"12704 Viber Media \n",
"12717 Video Gaming Technologies \n",
"12749 Vion Food Group \n",
"12764 ViroPharma \n",
"12843 Vizada \n",
"12854 Vkontake \n",
"12887 Volvo Rents Construction \n",
"12957 Warner Chilcott \n",
"12981 Waze \n",
"13011 WebEx \n",
"13030 Webtrends \n",
"13061 Western Gas Resources \n",
"13069 Weyerhaeuser \n",
"13080 WhatsApp \n",
"13081 Wheelabrator Technologies \n",
"13121 WILD Flavors \n",
"13143 Wind River \n",
"13146 Windsor Foods \n",
"13174 Wittur \n",
"13238 Wyeth Pharmaceuticals \n",
"13239 Wyeth Pharmaceuticals \n",
"13277 Xircom \n",
"13323 Yammer \n",
"13324 Yankee Candle Company \n",
"13356 Yoplait \n",
"13357 York Risk Services Group \n",
"13371 YouTube \n",
"13396 Zappos \n",
"13437 Ziggo \n",
"\n",
" company_category_list \\\n",
"31 |Finance|Business Services|Insurance| \n",
"56 |Curated Web| \n",
"69 |Services|Cloud Data Services|Software| \n",
"108 |Mobile| \n",
"136 |News| \n",
"152 |Information Technology|Internet|Service Provi... \n",
"155 |Pharmaceuticals|Biotechnology| \n",
"173 |Software| \n",
"187 |Natural Gas Uses| \n",
"195 |Medical|Biotechnology| \n",
"343 |Mobile|Advertising| \n",
"382 |Health Care|Information Technology|Software| \n",
"406 |Market Research|Events|Business Services| \n",
"420 NaN \n",
"430 |Semiconductors| \n",
"443 |Consulting| \n",
"516 |Mobile Devices|Mobile| \n",
"536 |Finance| \n",
"540 NaN \n",
"576 |Biotechnology| \n",
"588 |Biotechnology| \n",
"606 |Medical|Pharmaceuticals|Biotechnology| \n",
"618 |Advertising| \n",
"629 |Public Relations| \n",
"654 |Local Businesses|Energy|Utilities| \n",
"691 |Manufacturing| \n",
"703 |Finance| \n",
"716 NaN \n",
"719 NaN \n",
"723 |Electronics|Manufacturing|Design| \n",
"... ... \n",
"12702 |Data Centers|Services|Web Hosting| \n",
"12704 |Android|iPhone|VoIP|Messaging| \n",
"12717 |Video Games| \n",
"12749 |Hospitality| \n",
"12764 |Pharmaceuticals|Biotechnology| \n",
"12843 |DOD/Military|Mobile| \n",
"12854 |Communities|Social Network Media|Social Media| \n",
"12887 NaN \n",
"12957 |Pharmaceuticals|Biotechnology| \n",
"12981 |Navigation|Transportation| \n",
"13011 |Curated Web| \n",
"13030 |SEO|Curated Web| \n",
"13061 |Clean Energy|Oil| \n",
"13069 |Manufacturing| \n",
"13080 |Messaging| \n",
"13081 |Waste Management| \n",
"13121 |Food Processing| \n",
"13143 |Software| \n",
"13146 |Consumer Goods|Food Processing| \n",
"13174 |Manufacturing|Public Safety|Heavy Industry| \n",
"13238 |Biotechnology|Pharmaceuticals|Health and Well... \n",
"13239 |Biotechnology|Pharmaceuticals|Health and Well... \n",
"13277 |Web Hosting| \n",
"13323 |Twitter Applications|Networking|Social Media|... \n",
"13324 |Manufacturing| \n",
"13356 |Food Processing| \n",
"13357 |Insurance| \n",
"13371 |Video|Online Rental|Entertainment|Games| \n",
"13396 |Curated Web| \n",
"13437 |Information Services|Entertainment|Investment... \n",
"\n",
" company_market company_country_code company_state_code \\\n",
"31 Business Services USA AL \n",
"56 Curated Web USA MA \n",
"69 Cloud Data Services USA CA \n",
"108 Mobile CHN NaN \n",
"136 News NaN NaN \n",
"152 Internet USA NY \n",
"155 Biotechnology USA CA \n",
"173 Software USA CA \n",
"187 Natural Gas Uses USA OK \n",
"195 Biotechnology USA CA \n",
"343 Advertising USA CA \n",
"382 Software GBR NaN \n",
"406 Events USA CA \n",
"420 NaN NaN NaN \n",
"430 Semiconductors USA CA \n",
"443 Consulting USA TX \n",
"516 Mobile USA GA \n",
"536 Finance NaN NaN \n",
"540 NaN USA AL \n",
"576 Biotechnology NOR NaN \n",
"588 Biotechnology USA CA \n",
"606 Medical USA CA \n",
"618 Advertising USA IA \n",
"629 Public Relations USA AR \n",
"654 Utilities CAN AB \n",
"691 Manufacturing USA IL \n",
"703 Finance USA NY \n",
"716 NaN USA PA \n",
"719 NaN USA MA \n",
"723 Design USA ID \n",
"... ... ... ... \n",
"12702 Web Hosting USA CO \n",
"12704 Android CYP NaN \n",
"12717 Video Games USA TN \n",
"12749 Hospitality NLD NaN \n",
"12764 Biotechnology USA PA \n",
"12843 Mobile USA MD \n",
"12854 Communities RUS NaN \n",
"12887 NaN CHE NaN \n",
"12957 Biotechnology USA NJ \n",
"12981 Transportation USA CA \n",
"13011 Curated Web USA CA \n",
"13030 Curated Web USA OR \n",
"13061 Clean Energy USA TX \n",
"13069 Manufacturing USA WA \n",
"13080 Messaging USA CA \n",
"13081 Waste Management USA NH \n",
"13121 Food Processing DEU NaN \n",
"13143 Software USA CA \n",
"13146 Food Processing USA TX \n",
"13174 Heavy Industry NaN NaN \n",
"13238 Biotechnology NaN NaN \n",
"13239 Biotechnology NaN NaN \n",
"13277 Web Hosting USA CA \n",
"13323 Enterprise Software USA CA \n",
"13324 Manufacturing USA AR \n",
"13356 Food Processing USA MN \n",
"13357 Insurance USA NJ \n",
"13371 Games USA CA \n",
"13396 Curated Web USA NV \n",
"13437 Investment Management NLD NaN \n",
"\n",
" company_region company_city \\\n",
"31 AL - Other De Armanville \n",
"56 Boston Marlborough \n",
"69 SF Bay Area Fremont \n",
"108 Fuzhou Shi Fuzhou Shi \n",
"136 NaN NaN \n",
"152 New York City White Plains \n",
"155 Los Angeles Los Angeles \n",
"173 San Diego San Diego \n",
"187 Oklahoma City Oklahoma City \n",
"195 SF Bay Area Menlo Park \n",
"343 SF Bay Area Mountain View \n",
"382 Cobham Cobham \n",
"406 Los Angeles Santa Monica \n",
"420 NaN NaN \n",
"430 SF Bay Area Mountain View \n",
"443 Dallas Dallas \n",
"516 Atlanta Atlanta \n",
"536 NaN NaN \n",
"540 Birmingham Birmingham \n",
"576 Oslo Oslo \n",
"588 SF Bay Area South San Francisco \n",
"606 Anaheim Irvine \n",
"618 IA - Other Plano \n",
"629 Little Rock Little Rock \n",
"654 Calgary Calgary \n",
"691 Chicago Hoffman Estates \n",
"703 New York City New York \n",
"716 Philadelphia Plymouth Meeting \n",
"719 Boston Boston \n",
"723 Idaho Falls Pocatello \n",
"... ... ... \n",
"12702 Denver Greenwood Village \n",
"12704 Cyprus Limassol \n",
"12717 Nashville Franklin \n",
"12749 Eindhoven Eindhoven \n",
"12764 Philadelphia Exton \n",
"12843 Washington, D.C. Rockville \n",
"12854 St. Petersburg Saint Petersburg \n",
"12887 CHE - Other Eison \n",
"12957 Newark Rockaway \n",
"12981 SF Bay Area Palo Alto \n",
"13011 SF Bay Area Santa Clara \n",
"13030 Portland, Oregon Portland \n",
"13061 Seminole Seminole \n",
"13069 Seattle Federal Way \n",
"13080 SF Bay Area Santa Clara \n",
"13081 Manchester, New Hampshire Hampton \n",
"13121 Frankfurt Heidelberg \n",
"13143 SF Bay Area Alameda \n",
"13146 Houston Houston \n",
"13174 NaN NaN \n",
"13238 NaN NaN \n",
"13239 NaN NaN \n",
"13277 Los Angeles Thousand Oaks \n",
"13323 SF Bay Area San Francisco \n",
"13324 Fayetteville Rogers \n",
"13356 Minneapolis Minneapolis \n",
"13357 Newark Parsippany \n",
"13371 SF Bay Area San Bruno \n",
"13396 Las Vegas Las Vegas \n",
"13437 Utrecht Utrecht \n",
"\n",
" acquirer_permalink \\\n",
"31 /organization/american-international-group \n",
"56 /organization/hewlett-packard \n",
"69 /organization/hewlett-packard \n",
"108 /organization/baidu \n",
"136 /organization/the-walt-disney-company \n",
"152 /organization/zayo-group \n",
"155 /organization/celgene \n",
"173 /organization/dassault \n",
"187 /organization/the-williams-companies \n",
"195 /organization/johnson-johnson \n",
"343 /organization/google \n",
"382 /organization/vista-equity-partners \n",
"406 /organization/ubm-electronics \n",
"420 /organization/valassis \n",
"430 /organization/broadcom \n",
"443 /organization/xerox \n",
"516 /organization/vmware \n",
"536 /organization/portfolio-recovery-associates \n",
"540 /organization/laclede-group \n",
"576 /organization/bayer-ag-germany \n",
"588 /organization/johnson-johnson \n",
"606 /organization/actavis \n",
"618 /organization/liberty-global \n",
"629 /organization/verizon \n",
"654 /organization/midamerican-energy \n",
"691 /organization/minerals-technologies \n",
"703 /organization/standard-chartered-bank \n",
"716 /organization/kinder-morgan \n",
"719 /organization/ventas \n",
"723 /organization/on-semiconductor \n",
"... ... \n",
"12702 /organization/shaw-communications \n",
"12704 /organization/rakuten \n",
"12717 /organization/aristocrat-technologies-inc \n",
"12749 /organization/darling-international \n",
"12764 /organization/shire \n",
"12843 /organization/eads-astrium \n",
"12854 /organization/mail-ru \n",
"12887 /organization/platinum-equity-llc \n",
"12957 /organization/actavis \n",
"12981 /organization/google \n",
"13011 /organization/cisco \n",
"13030 /organization/netiq \n",
"13061 /organization/anadarko-petroleum-corporation \n",
"13069 /organization/tri-pointe-homes \n",
"13080 /organization/facebook \n",
"13081 /organization/energy-capital-partners \n",
"13121 /organization/archer-daniels-midland-company \n",
"13143 /organization/intel \n",
"13146 /organization/ajinomoto \n",
"13174 /organization/bain-capital-2 \n",
"13238 /organization/nestl \n",
"13239 /organization/pfizer \n",
"13277 /organization/intel \n",
"13323 /organization/microsoft \n",
"13324 /organization/jarden \n",
"13356 /organization/generalmills \n",
"13357 /organization/onex \n",
"13371 /organization/google \n",
"13396 /organization/amazon \n",
"13437 /organization/liberty-global \n",
"\n",
" acquirer_name ... \\\n",
"31 American International Group ... \n",
"56 Hewlett-Packard ... \n",
"69 Hewlett-Packard ... \n",
"108 Baidu ... \n",
"136 The Walt Disney Company ... \n",
"152 Zayo ... \n",
"155 Celgene ... \n",
"173 Dassault Systemes ... \n",
"187 The Williams Companies ... \n",
"195 Johnson & Johnson ... \n",
"343 Google ... \n",
"382 Vista Equity Partners ... \n",
"406 UBM Electronics ... \n",
"420 Valassis ... \n",
"430 Broadcom ... \n",
"443 Xerox ... \n",
"516 VMware ... \n",
"536 Portfolio Recovery Associates ... \n",
"540 Laclede Group ... \n",
"576 Bayer AG ... \n",
"588 Johnson & Johnson ... \n",
"606 Actavis ... \n",
"618 Liberty Global ... \n",
"629 Verizon Communications ... \n",
"654 Berkshire Hathaway Energy ... \n",
"691 Minerals Technologies ... \n",
"703 Standard Chartered Bank ... \n",
"716 Kinder Morgan Energy ... \n",
"719 Ventas ... \n",
"723 ON Semiconductor ... \n",
"... ... ... \n",
"12702 Shaw Communications ... \n",
"12704 Rakuten ... \n",
"12717 Aristocrat Technologies, Inc ... \n",
"12749 Darling Ingredients ... \n",
"12764 Shire ... \n",
"12843 EADS Astrium ... \n",
"12854 Mail.Ru Group ... \n",
"12887 Platinum Equity ... \n",
"12957 Actavis ... \n",
"12981 Google ... \n",
"13011 Cisco ... \n",
"13030 NetIQ ... \n",
"13061 Anadarko Petroleum Corporation ... \n",
"13069 Tri Pointe Homes ... \n",
"13080 Facebook ... \n",
"13081 Energy Capital Partners ... \n",
"13121 Archer Daniels Midland Company ... \n",
"13143 Intel ... \n",
"13146 Ajinomoto ... \n",
"13174 Bain Capital ... \n",
"13238 Nestl\u0082 ... \n",
"13239 Pfizer ... \n",
"13277 Intel ... \n",
"13323 Microsoft ... \n",
"13324 Jarden ... \n",
"13356 General Mills ... \n",
"13357 Onex ... \n",
"13371 Google ... \n",
"13396 Amazon ... \n",
"13437 Liberty Global ... \n",
"\n",
" acquirer_country_code acquirer_state_code acquirer_region \\\n",
"31 USA NY New York City \n",
"56 USA CA SF Bay Area \n",
"69 USA CA SF Bay Area \n",
"108 CHN NaN Beijing \n",
"136 USA CA Los Angeles \n",
"152 USA CO Denver \n",
"155 USA NJ Newark \n",
"173 FRA NaN NaN \n",
"187 USA OK OK - Other \n",
"195 USA NJ Newark \n",
"343 USA CA SF Bay Area \n",
"382 USA TX Austin \n",
"406 USA CA SF Bay Area \n",
"420 USA MI Detroit \n",
"430 USA CA Anaheim \n",
"443 USA AL AL - Other \n",
"516 USA CA SF Bay Area \n",
"536 USA VA Norfolk - Virginia Beach \n",
"540 USA MO St. Louis \n",
"576 DEU NaN Leverkusen \n",
"588 USA NJ Newark \n",
"606 USA NJ Newark \n",
"618 USA CO Denver \n",
"629 USA NY New York City \n",
"654 USA IA Des Moines \n",
"691 USA NY New York City \n",
"703 GBR NaN London \n",
"716 USA TX Houston \n",
"719 USA IL Chicago \n",
"723 USA AZ Phoenix \n",
"... ... ... ... \n",
"12702 CAN AB Calgary \n",
"12704 JPN NaN Tokyo \n",
"12717 AUS NaN AUS - Other \n",
"12749 USA TX Dallas \n",
"12764 IRL NaN Dublin \n",
"12843 USA TX Houston \n",
"12854 RUS NaN Moscow \n",
"12887 USA CA Los Angeles \n",
"12957 USA NJ Newark \n",
"12981 USA CA SF Bay Area \n",
"13011 USA CA SF Bay Area \n",
"13030 USA CA SF Bay Area \n",
"13061 USA TX Seminole \n",
"13069 USA CA SF Bay Area \n",
"13080 USA CA SF Bay Area \n",
"13081 USA NJ Newark \n",
"13121 USA IL Chicago \n",
"13143 USA CA SF Bay Area \n",
"13146 JPN NaN Tokyo \n",
"13174 USA MA Boston \n",
"13238 CHE NaN Vevey \n",
"13239 USA NY New York City \n",
"13277 USA CA SF Bay Area \n",
"13323 USA WA Seattle \n",
"13324 USA NY New York City \n",
"13356 USA MN Minneapolis \n",
"13357 CAN ON Toronto \n",
"13371 USA CA SF Bay Area \n",
"13396 USA WA Seattle \n",
"13437 USA CO Denver \n",
"\n",
" acquirer_city acquired_at acquired_month acquired_quarter \\\n",
"31 New York 2007-09-01 2007-09 2007-Q3 \n",
"56 Palo Alto 2009-11-11 2009-11 2009-Q4 \n",
"69 Palo Alto 2010-09-02 2010-09 2010-Q3 \n",
"108 Beijing 2013-07-15 2013-07 2013-Q3 \n",
"136 Burbank 1995-08-05 1995-08 1995-Q3 \n",
"152 Boulder 2012-03-19 2012-03 2012-Q1 \n",
"155 Summit 2010-06-30 2010-06 2010-Q2 \n",
"173 NaN 2014-01-30 2014-01 2014-Q1 \n",
"187 Snyder 2012-12-21 2012-12 2012-Q4 \n",
"195 New Brunswick 2009-12-17 2009-12 2009-Q4 \n",
"343 Mountain View 2009-11-09 2009-11 2009-Q4 \n",
"382 Austin 2014-11-25 2014-11 2014-Q4 \n",
"406 San Francisco 2014-10-01 2014-10 2014-Q4 \n",
"420 Livonia 2006-07-06 2006-07 2006-Q3 \n",
"430 Irvine 2012-02-17 2012-02 2012-Q1 \n",
"443 Normal 2009-09-28 2009-09 2009-Q3 \n",
"516 Palo Alto 2014-01-22 2014-01 2014-Q1 \n",
"536 Norfolk 2014-02-20 2014-02 2014-Q1 \n",
"540 St Louis 2014-04-07 2014-04 2014-Q2 \n",
"576 Leverkusen 2013-12-19 2013-12 2013-Q4 \n",
"588 New Brunswick 2014-09-30 2014-09 2014-Q3 \n",
"606 Parsippany 2014-11-16 2014-11 2014-Q4 \n",
"618 Englewood 2014-05-08 2014-05 2014-Q2 \n",
"629 New York 2008-06-05 2008-06 2008-Q2 \n",
"654 Des Moines 2014-05-02 2014-05 2014-Q2 \n",
"691 New York 2014-03-10 2014-03 2014-Q1 \n",
"703 London 2008-02-28 2008-02 2008-Q1 \n",
"716 Houston 2013-12-23 2013-12 2013-Q4 \n",
"719 Chicago 2014-06-02 2014-06 2014-Q2 \n",
"723 Phoenix 2007-12-14 2007-12 2007-Q4 \n",
"... ... ... ... ... \n",
"12702 Calgary 2014-08-04 2014-08 2014-Q3 \n",
"12704 Tokyo 2014-02-14 2014-02 2014-Q1 \n",
"12717 NaN 2014-07-07 2014-07 2014-Q3 \n",
"12749 Irving 2013-10-07 2013-10 2013-Q4 \n",
"12764 Dublin 2013-11-10 2013-11 2013-Q4 \n",
"12843 Houston 2011-08-03 2011-08 2011-Q3 \n",
"12854 Moscow 2014-09-16 2014-09 2014-Q3 \n",
"12887 Beverly Hills 2013-12-10 2013-12 2013-Q4 \n",
"12957 Parsippany 2013-05-20 2013-05 2013-Q2 \n",
"12981 Mountain View 2013-06-11 2013-06 2013-Q2 \n",
"13011 San Jose 2007-03-15 2007-03 2007-Q1 \n",
"13030 San Jose 2001-01-17 2001-01 2001-Q1 \n",
"13061 Seminole 2006-06-01 2006-06 2006-Q2 \n",
"13069 San Ramon 2013-11-04 2013-11 2013-Q4 \n",
"13080 Menlo Park 2014-02-19 2014-02 2014-Q1 \n",
"13081 Short Hills 2014-07-29 2014-07 2014-Q3 \n",
"13121 Chicago 2014-07-07 2014-07 2014-Q3 \n",
"13143 Santa Clara 2009-07-17 2009-07 2009-Q3 \n",
"13146 Tokyo 2014-09-10 2014-09 2014-Q3 \n",
"13174 Boston 2014-12-23 2014-12 2014-Q4 \n",
"13238 Vevey 2012-04-01 2012-04 2012-Q2 \n",
"13239 New York 2009-01-23 2009-01 2009-Q1 \n",
"13277 Santa Clara 2001-01-15 2001-01 2001-Q1 \n",
"13323 Redmond 2012-06-25 2012-06 2012-Q2 \n",
"13324 Rye 2013-09-03 2013-09 2013-Q3 \n",
"13356 Minneapolis 2011-07-01 2011-07 2011-Q3 \n",
"13357 Toronto 2014-07-16 2014-07 2014-Q3 \n",
"13371 Mountain View 2006-10-01 2006-10 2006-Q4 \n",
"13396 Seattle 2009-07-22 2009-07 2009-Q3 \n",
"13437 Englewood 2014-01-27 2014-01 2014-Q1 \n",
"\n",
" acquired_year price_amount price_currency_code \n",
"31 2007 749000000 USD \n",
"56 2009 2700000000 USD \n",
"69 2010 2350000000 USD \n",
"108 2013 1900000000 USD \n",
"136 1995 19000000000 USD \n",
"152 2012 2300000000 USD \n",
"155 2010 2900000000 USD \n",
"173 2014 750000000 USD \n",
"187 2012 2250000000 USD \n",
"195 2009 785000000 USD \n",
"343 2009 750000000 USD \n",
"382 2014 725000000 GBP \n",
"406 2014 972000000 USD \n",
"420 2006 1300000000 USD \n",
"430 2012 3700000000 USD \n",
"443 2009 5750000000 USD \n",
"516 2014 1540000000 USD \n",
"536 2014 1300000000 USD \n",
"540 2014 1340000000 USD \n",
"576 2013 2900000000 USD \n",
"588 2014 1750000000 USD \n",
"606 2014 65500000000 USD \n",
"618 2014 930000000 USD \n",
"629 2008 28100000000 USD \n",
"654 2014 3520000000 USD \n",
"691 2014 1700000000 USD \n",
"703 2008 823000000 USD \n",
"716 2013 962000000 USD \n",
"719 2014 2600000000 USD \n",
"723 2007 915000000 USD \n",
"... ... ... ... \n",
"12702 2014 1200000000 USD \n",
"12704 2014 900000000 USD \n",
"12717 2014 1300000000 USD \n",
"12749 2013 1600000000 EUR \n",
"12764 2013 4200000000 USD \n",
"12843 2011 960000000 USD \n",
"12854 2014 1470000000 USD \n",
"12887 2013 1100000000 USD \n",
"12957 2013 8500000000 USD \n",
"12981 2013 996000000 USD \n",
"13011 2007 3200000000 USD \n",
"13030 2001 1000000000 USD \n",
"13061 2006 4800000000 USD \n",
"13069 2013 2700000000 USD \n",
"13080 2014 19000000000 USD \n",
"13081 2014 1940000000 USD \n",
"13121 2014 3000000000 USD \n",
"13143 2009 884000000 USD \n",
"13146 2014 800000000 USD \n",
"13174 2014 750000000 USD \n",
"13238 2012 11850000000 USD \n",
"13239 2009 68000000000 USD \n",
"13277 2001 748000000 USD \n",
"13323 2012 1200000000 USD \n",
"13324 2013 1750000000 USD \n",
"13356 2011 1200000000 USD \n",
"13357 2014 1330000000 USD \n",
"13371 2006 1650000000 USD \n",
"13396 2009 1200000000 USD \n",
"13437 2014 13700000000 USD \n",
"\n",
"[537 rows x 22 columns]"
]
}
],
"prompt_number": 41
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"query = \"\"\"\n",
"select *\n",
"from acquisitions a\n",
"where price_amount > (\n",
" select avg(price_amount)\n",
" from acquisitions\n",
");\n",
"\"\"\"\n",
"sql.read_sql(query, con)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" company_permalink | \n",
" company_name | \n",
" company_category_list | \n",
" company_market | \n",
" company_country_code | \n",
" company_state_code | \n",
" company_region | \n",
" company_city | \n",
" acquirer_permalink | \n",
" acquirer_name | \n",
" ... | \n",
" acquirer_country_code | \n",
" acquirer_state_code | \n",
" acquirer_region | \n",
" acquirer_city | \n",
" acquired_at | \n",
" acquired_month | \n",
" acquired_quarter | \n",
" acquired_year | \n",
" price_amount | \n",
" price_currency_code | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" /organization/21st-century-insurance | \n",
" 21st Century Insurance | \n",
" |Finance|Business Services|Insurance| | \n",
" Business Services | \n",
" USA | \n",
" AL | \n",
" AL - Other | \n",
" De Armanville | \n",
" /organization/american-international-group | \n",
" American International Group | \n",
" ... | \n",
" USA | \n",
" NY | \n",
" New York City | \n",
" New York | \n",
" 2007-09-01 | \n",
" 2007-09 | \n",
" 2007-Q3 | \n",
" 2007 | \n",
" 749000000 | \n",
" USD | \n",
"
\n",
" \n",
" 1 | \n",
" /organization/3com | \n",
" 3Com | \n",
" |Curated Web| | \n",
" Curated Web | \n",
" USA | \n",
" MA | \n",
" Boston | \n",
" Marlborough | \n",
" /organization/hewlett-packard | \n",
" Hewlett-Packard | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Palo Alto | \n",
" 2009-11-11 | \n",
" 2009-11 | \n",
" 2009-Q4 | \n",
" 2009 | \n",
" 2700000000 | \n",
" USD | \n",
"
\n",
" \n",
" 2 | \n",
" /organization/3par | \n",
" 3PAR | \n",
" |Services|Cloud Data Services|Software| | \n",
" Cloud Data Services | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Fremont | \n",
" /organization/hewlett-packard | \n",
" Hewlett-Packard | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Palo Alto | \n",
" 2010-09-02 | \n",
" 2010-09 | \n",
" 2010-Q3 | \n",
" 2010 | \n",
" 2350000000 | \n",
" USD | \n",
"
\n",
" \n",
" 3 | \n",
" /organization/91-wireless | \n",
" 91 Wireless | \n",
" |Mobile| | \n",
" Mobile | \n",
" CHN | \n",
" None | \n",
" Fuzhou Shi | \n",
" Fuzhou Shi | \n",
" /organization/baidu | \n",
" Baidu | \n",
" ... | \n",
" CHN | \n",
" None | \n",
" Beijing | \n",
" Beijing | \n",
" 2013-07-15 | \n",
" 2013-07 | \n",
" 2013-Q3 | \n",
" 2013 | \n",
" 1900000000 | \n",
" USD | \n",
"
\n",
" \n",
" 4 | \n",
" /organization/abc | \n",
" ABC | \n",
" |News| | \n",
" News | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" /organization/the-walt-disney-company | \n",
" The Walt Disney Company | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" Los Angeles | \n",
" Burbank | \n",
" 1995-08-05 | \n",
" 1995-08 | \n",
" 1995-Q3 | \n",
" 1995 | \n",
" 19000000000 | \n",
" USD | \n",
"
\n",
" \n",
" 5 | \n",
" /organization/abovenet | \n",
" AboveNet | \n",
" |Information Technology|Internet|Service Provi... | \n",
" Internet | \n",
" USA | \n",
" NY | \n",
" New York City | \n",
" White Plains | \n",
" /organization/zayo-group | \n",
" Zayo | \n",
" ... | \n",
" USA | \n",
" CO | \n",
" Denver | \n",
" Boulder | \n",
" 2012-03-19 | \n",
" 2012-03 | \n",
" 2012-Q1 | \n",
" 2012 | \n",
" 2300000000 | \n",
" USD | \n",
"
\n",
" \n",
" 6 | \n",
" /organization/abraxis-bioscience | \n",
" Abraxis BioScience | \n",
" |Pharmaceuticals|Biotechnology| | \n",
" Biotechnology | \n",
" USA | \n",
" CA | \n",
" Los Angeles | \n",
" Los Angeles | \n",
" /organization/celgene | \n",
" Celgene | \n",
" ... | \n",
" USA | \n",
" NJ | \n",
" Newark | \n",
" Summit | \n",
" 2010-06-30 | \n",
" 2010-06 | \n",
" 2010-Q2 | \n",
" 2010 | \n",
" 2900000000 | \n",
" USD | \n",
"
\n",
" \n",
" 7 | \n",
" /organization/accelrys | \n",
" Accelrys | \n",
" |Software| | \n",
" Software | \n",
" USA | \n",
" CA | \n",
" San Diego | \n",
" San Diego | \n",
" /organization/dassault | \n",
" Dassault Systemes | \n",
" ... | \n",
" FRA | \n",
" None | \n",
" None | \n",
" None | \n",
" 2014-01-30 | \n",
" 2014-01 | \n",
" 2014-Q1 | \n",
" 2014 | \n",
" 750000000 | \n",
" USD | \n",
"
\n",
" \n",
" 8 | \n",
" /organization/access-midstream-partners-lp-inv... | \n",
" Access Midstream Partners LP Investments | \n",
" |Natural Gas Uses| | \n",
" Natural Gas Uses | \n",
" USA | \n",
" OK | \n",
" Oklahoma City | \n",
" Oklahoma City | \n",
" /organization/the-williams-companies | \n",
" The Williams Companies | \n",
" ... | \n",
" USA | \n",
" OK | \n",
" OK - Other | \n",
" Snyder | \n",
" 2012-12-21 | \n",
" 2012-12 | \n",
" 2012-Q4 | \n",
" 2012 | \n",
" 2250000000 | \n",
" USD | \n",
"
\n",
" \n",
" 9 | \n",
" /organization/acclarent | \n",
" Acclarent | \n",
" |Medical|Biotechnology| | \n",
" Biotechnology | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Menlo Park | \n",
" /organization/johnson-johnson | \n",
" Johnson & Johnson | \n",
" ... | \n",
" USA | \n",
" NJ | \n",
" Newark | \n",
" New Brunswick | \n",
" 2009-12-17 | \n",
" 2009-12 | \n",
" 2009-Q4 | \n",
" 2009 | \n",
" 785000000 | \n",
" USD | \n",
"
\n",
" \n",
" 10 | \n",
" /organization/admob | \n",
" AdMob | \n",
" |Mobile|Advertising| | \n",
" Advertising | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Mountain View | \n",
" /organization/google | \n",
" Google | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Mountain View | \n",
" 2009-11-09 | \n",
" 2009-11 | \n",
" 2009-Q4 | \n",
" 2009 | \n",
" 750000000 | \n",
" USD | \n",
"
\n",
" \n",
" 11 | \n",
" /organization/advanced-computer-software | \n",
" Advanced Computer Software | \n",
" |Health Care|Information Technology|Software| | \n",
" Software | \n",
" GBR | \n",
" None | \n",
" Cobham | \n",
" Cobham | \n",
" /organization/vista-equity-partners | \n",
" Vista Equity Partners | \n",
" ... | \n",
" USA | \n",
" TX | \n",
" Austin | \n",
" Austin | \n",
" 2014-11-25 | \n",
" 2014-11 | \n",
" 2014-Q4 | \n",
" 2014 | \n",
" 725000000 | \n",
" GBP | \n",
"
\n",
" \n",
" 12 | \n",
" /organization/advanstar-communications | \n",
" Advanstar Communications | \n",
" |Market Research|Events|Business Services| | \n",
" Events | \n",
" USA | \n",
" CA | \n",
" Los Angeles | \n",
" Santa Monica | \n",
" /organization/ubm-electronics | \n",
" UBM Electronics | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" San Francisco | \n",
" 2014-10-01 | \n",
" 2014-10 | \n",
" 2014-Q4 | \n",
" 2014 | \n",
" 972000000 | \n",
" USD | \n",
"
\n",
" \n",
" 13 | \n",
" /organization/advo | \n",
" Advo | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" /organization/valassis | \n",
" Valassis | \n",
" ... | \n",
" USA | \n",
" MI | \n",
" Detroit | \n",
" Livonia | \n",
" 2006-07-06 | \n",
" 2006-07 | \n",
" 2006-Q3 | \n",
" 2006 | \n",
" 1300000000 | \n",
" USD | \n",
"
\n",
" \n",
" 14 | \n",
" /organization/aeluros | \n",
" Aeluros | \n",
" |Semiconductors| | \n",
" Semiconductors | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Mountain View | \n",
" /organization/broadcom | \n",
" Broadcom | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" Anaheim | \n",
" Irvine | \n",
" 2012-02-17 | \n",
" 2012-02 | \n",
" 2012-Q1 | \n",
" 2012 | \n",
" 3700000000 | \n",
" USD | \n",
"
\n",
" \n",
" 15 | \n",
" /organization/affiliated-computer-services | \n",
" Affiliated Computer Services | \n",
" |Consulting| | \n",
" Consulting | \n",
" USA | \n",
" TX | \n",
" Dallas | \n",
" Dallas | \n",
" /organization/xerox | \n",
" Xerox | \n",
" ... | \n",
" USA | \n",
" AL | \n",
" AL - Other | \n",
" Normal | \n",
" 2009-09-28 | \n",
" 2009-09 | \n",
" 2009-Q3 | \n",
" 2009 | \n",
" 5750000000 | \n",
" USD | \n",
"
\n",
" \n",
" 16 | \n",
" /organization/airwatch | \n",
" AirWatch | \n",
" |Mobile Devices|Mobile| | \n",
" Mobile | \n",
" USA | \n",
" GA | \n",
" Atlanta | \n",
" Atlanta | \n",
" /organization/vmware | \n",
" VMware | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Palo Alto | \n",
" 2014-01-22 | \n",
" 2014-01 | \n",
" 2014-Q1 | \n",
" 2014 | \n",
" 1540000000 | \n",
" USD | \n",
"
\n",
" \n",
" 17 | \n",
" /organization/aktiv-kapital | \n",
" Aktiv Kapital | \n",
" |Finance| | \n",
" Finance | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" /organization/portfolio-recovery-associates | \n",
" Portfolio Recovery Associates | \n",
" ... | \n",
" USA | \n",
" VA | \n",
" Norfolk - Virginia Beach | \n",
" Norfolk | \n",
" 2014-02-20 | \n",
" 2014-02 | \n",
" 2014-Q1 | \n",
" 2014 | \n",
" 1300000000 | \n",
" USD | \n",
"
\n",
" \n",
" 18 | \n",
" /organization/alabama-gas-corp | \n",
" Alabama Gas Corp | \n",
" None | \n",
" None | \n",
" USA | \n",
" AL | \n",
" Birmingham | \n",
" Birmingham | \n",
" /organization/laclede-group | \n",
" Laclede Group | \n",
" ... | \n",
" USA | \n",
" MO | \n",
" St. Louis | \n",
" St Louis | \n",
" 2014-04-07 | \n",
" 2014-04 | \n",
" 2014-Q2 | \n",
" 2014 | \n",
" 1340000000 | \n",
" USD | \n",
"
\n",
" \n",
" 19 | \n",
" /organization/algeta | \n",
" Algeta | \n",
" |Biotechnology| | \n",
" Biotechnology | \n",
" NOR | \n",
" None | \n",
" Oslo | \n",
" Oslo | \n",
" /organization/bayer-ag-germany | \n",
" Bayer AG | \n",
" ... | \n",
" DEU | \n",
" None | \n",
" Leverkusen | \n",
" Leverkusen | \n",
" 2013-12-19 | \n",
" 2013-12 | \n",
" 2013-Q4 | \n",
" 2013 | \n",
" 2900000000 | \n",
" USD | \n",
"
\n",
" \n",
" 20 | \n",
" /organization/alios-biopharma | \n",
" Alios BioPharma | \n",
" |Biotechnology| | \n",
" Biotechnology | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" South San Francisco | \n",
" /organization/johnson-johnson | \n",
" Johnson & Johnson | \n",
" ... | \n",
" USA | \n",
" NJ | \n",
" Newark | \n",
" New Brunswick | \n",
" 2014-09-30 | \n",
" 2014-09 | \n",
" 2014-Q3 | \n",
" 2014 | \n",
" 1750000000 | \n",
" USD | \n",
"
\n",
" \n",
" 21 | \n",
" /organization/allergan | \n",
" Allergan | \n",
" |Medical|Pharmaceuticals|Biotechnology| | \n",
" Medical | \n",
" USA | \n",
" CA | \n",
" Anaheim | \n",
" Irvine | \n",
" /organization/actavis | \n",
" Actavis | \n",
" ... | \n",
" USA | \n",
" NJ | \n",
" Newark | \n",
" Parsippany | \n",
" 2014-11-16 | \n",
" 2014-11 | \n",
" 2014-Q4 | \n",
" 2014 | \n",
" 65500000000 | \n",
" USD | \n",
"
\n",
" \n",
" 22 | \n",
" /organization/adl-media-inc | \n",
" AllMedia Inc | \n",
" |Advertising| | \n",
" Advertising | \n",
" USA | \n",
" IA | \n",
" IA - Other | \n",
" Plano | \n",
" /organization/liberty-global | \n",
" Liberty Global | \n",
" ... | \n",
" USA | \n",
" CO | \n",
" Denver | \n",
" Englewood | \n",
" 2014-05-08 | \n",
" 2014-05 | \n",
" 2014-Q2 | \n",
" 2014 | \n",
" 930000000 | \n",
" USD | \n",
"
\n",
" \n",
" 23 | \n",
" /organization/alltel | \n",
" Alltel | \n",
" |Public Relations| | \n",
" Public Relations | \n",
" USA | \n",
" AR | \n",
" Little Rock | \n",
" Little Rock | \n",
" /organization/verizon | \n",
" Verizon Communications | \n",
" ... | \n",
" USA | \n",
" NY | \n",
" New York City | \n",
" New York | \n",
" 2008-06-05 | \n",
" 2008-06 | \n",
" 2008-Q2 | \n",
" 2008 | \n",
" 28100000000 | \n",
" USD | \n",
"
\n",
" \n",
" 24 | \n",
" /organization/altalink | \n",
" AltaLink | \n",
" |Local Businesses|Energy|Utilities| | \n",
" Utilities | \n",
" CAN | \n",
" AB | \n",
" Calgary | \n",
" Calgary | \n",
" /organization/midamerican-energy | \n",
" Berkshire Hathaway Energy | \n",
" ... | \n",
" USA | \n",
" IA | \n",
" Des Moines | \n",
" Des Moines | \n",
" 2014-05-02 | \n",
" 2014-05 | \n",
" 2014-Q2 | \n",
" 2014 | \n",
" 3520000000 | \n",
" USD | \n",
"
\n",
" \n",
" 25 | \n",
" /organization/amcol-international | \n",
" AMCOL International | \n",
" |Manufacturing| | \n",
" Manufacturing | \n",
" USA | \n",
" IL | \n",
" Chicago | \n",
" Hoffman Estates | \n",
" /organization/minerals-technologies | \n",
" Minerals Technologies | \n",
" ... | \n",
" USA | \n",
" NY | \n",
" New York City | \n",
" New York | \n",
" 2014-03-10 | \n",
" 2014-03 | \n",
" 2014-Q1 | \n",
" 2014 | \n",
" 1700000000 | \n",
" USD | \n",
"
\n",
" \n",
" 26 | \n",
" /organization/americanexpress | \n",
" American Express | \n",
" |Finance| | \n",
" Finance | \n",
" USA | \n",
" NY | \n",
" New York City | \n",
" New York | \n",
" /organization/standard-chartered-bank | \n",
" Standard Chartered Bank | \n",
" ... | \n",
" GBR | \n",
" None | \n",
" London | \n",
" London | \n",
" 2008-02-28 | \n",
" 2008-02 | \n",
" 2008-Q1 | \n",
" 2008 | \n",
" 823000000 | \n",
" USD | \n",
"
\n",
" \n",
" 27 | \n",
" /organization/american-petroleum-tankers | \n",
" American Petroleum Tankers | \n",
" None | \n",
" None | \n",
" USA | \n",
" PA | \n",
" Philadelphia | \n",
" Plymouth Meeting | \n",
" /organization/kinder-morgan | \n",
" Kinder Morgan Energy | \n",
" ... | \n",
" USA | \n",
" TX | \n",
" Houston | \n",
" Houston | \n",
" 2013-12-23 | \n",
" 2013-12 | \n",
" 2013-Q4 | \n",
" 2013 | \n",
" 962000000 | \n",
" USD | \n",
"
\n",
" \n",
" 28 | \n",
" /organization/american-realty-capital-healthca... | \n",
" American Realty Capital Healthcare Trust | \n",
" None | \n",
" None | \n",
" USA | \n",
" MA | \n",
" Boston | \n",
" Boston | \n",
" /organization/ventas | \n",
" Ventas | \n",
" ... | \n",
" USA | \n",
" IL | \n",
" Chicago | \n",
" Chicago | \n",
" 2014-06-02 | \n",
" 2014-06 | \n",
" 2014-Q2 | \n",
" 2014 | \n",
" 2600000000 | \n",
" USD | \n",
"
\n",
" \n",
" 29 | \n",
" /organization/ami-semiconductor | \n",
" AMI Semiconductor | \n",
" |Electronics|Manufacturing|Design| | \n",
" Design | \n",
" USA | \n",
" ID | \n",
" Idaho Falls | \n",
" Pocatello | \n",
" /organization/on-semiconductor | \n",
" ON Semiconductor | \n",
" ... | \n",
" USA | \n",
" AZ | \n",
" Phoenix | \n",
" Phoenix | \n",
" 2007-12-14 | \n",
" 2007-12 | \n",
" 2007-Q4 | \n",
" 2007 | \n",
" 915000000 | \n",
" USD | \n",
"
\n",
" \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
"
\n",
" \n",
" 507 | \n",
" /organization/viawest | \n",
" ViaWest | \n",
" |Data Centers|Services|Web Hosting| | \n",
" Web Hosting | \n",
" USA | \n",
" CO | \n",
" Denver | \n",
" Greenwood Village | \n",
" /organization/shaw-communications | \n",
" Shaw Communications | \n",
" ... | \n",
" CAN | \n",
" AB | \n",
" Calgary | \n",
" Calgary | \n",
" 2014-08-04 | \n",
" 2014-08 | \n",
" 2014-Q3 | \n",
" 2014 | \n",
" 1200000000 | \n",
" USD | \n",
"
\n",
" \n",
" 508 | \n",
" /organization/viber-media | \n",
" Viber Media | \n",
" |Android|iPhone|VoIP|Messaging| | \n",
" Android | \n",
" CYP | \n",
" None | \n",
" Cyprus | \n",
" Limassol | \n",
" /organization/rakuten | \n",
" Rakuten | \n",
" ... | \n",
" JPN | \n",
" None | \n",
" Tokyo | \n",
" Tokyo | \n",
" 2014-02-14 | \n",
" 2014-02 | \n",
" 2014-Q1 | \n",
" 2014 | \n",
" 900000000 | \n",
" USD | \n",
"
\n",
" \n",
" 509 | \n",
" /organization/video-gaming-technologies | \n",
" Video Gaming Technologies | \n",
" |Video Games| | \n",
" Video Games | \n",
" USA | \n",
" TN | \n",
" Nashville | \n",
" Franklin | \n",
" /organization/aristocrat-technologies-inc | \n",
" Aristocrat Technologies, Inc | \n",
" ... | \n",
" AUS | \n",
" None | \n",
" AUS - Other | \n",
" None | \n",
" 2014-07-07 | \n",
" 2014-07 | \n",
" 2014-Q3 | \n",
" 2014 | \n",
" 1300000000 | \n",
" USD | \n",
"
\n",
" \n",
" 510 | \n",
" /organization/vion-food-group | \n",
" Vion Food Group | \n",
" |Hospitality| | \n",
" Hospitality | \n",
" NLD | \n",
" None | \n",
" Eindhoven | \n",
" Eindhoven | \n",
" /organization/darling-international | \n",
" Darling Ingredients | \n",
" ... | \n",
" USA | \n",
" TX | \n",
" Dallas | \n",
" Irving | \n",
" 2013-10-07 | \n",
" 2013-10 | \n",
" 2013-Q4 | \n",
" 2013 | \n",
" 1600000000 | \n",
" EUR | \n",
"
\n",
" \n",
" 511 | \n",
" /organization/viropharma | \n",
" ViroPharma | \n",
" |Pharmaceuticals|Biotechnology| | \n",
" Biotechnology | \n",
" USA | \n",
" PA | \n",
" Philadelphia | \n",
" Exton | \n",
" /organization/shire | \n",
" Shire | \n",
" ... | \n",
" IRL | \n",
" None | \n",
" Dublin | \n",
" Dublin | \n",
" 2013-11-10 | \n",
" 2013-11 | \n",
" 2013-Q4 | \n",
" 2013 | \n",
" 4200000000 | \n",
" USD | \n",
"
\n",
" \n",
" 512 | \n",
" /organization/vizada | \n",
" Vizada | \n",
" |DOD/Military|Mobile| | \n",
" Mobile | \n",
" USA | \n",
" MD | \n",
" Washington, D.C. | \n",
" Rockville | \n",
" /organization/eads-astrium | \n",
" EADS Astrium | \n",
" ... | \n",
" USA | \n",
" TX | \n",
" Houston | \n",
" Houston | \n",
" 2011-08-03 | \n",
" 2011-08 | \n",
" 2011-Q3 | \n",
" 2011 | \n",
" 960000000 | \n",
" USD | \n",
"
\n",
" \n",
" 513 | \n",
" /organization/vk | \n",
" Vkontake | \n",
" |Communities|Social Network Media|Social Media| | \n",
" Communities | \n",
" RUS | \n",
" None | \n",
" St. Petersburg | \n",
" Saint Petersburg | \n",
" /organization/mail-ru | \n",
" Mail.Ru Group | \n",
" ... | \n",
" RUS | \n",
" None | \n",
" Moscow | \n",
" Moscow | \n",
" 2014-09-16 | \n",
" 2014-09 | \n",
" 2014-Q3 | \n",
" 2014 | \n",
" 1470000000 | \n",
" USD | \n",
"
\n",
" \n",
" 514 | \n",
" /organization/volvo-rents-construction | \n",
" Volvo Rents Construction | \n",
" None | \n",
" None | \n",
" CHE | \n",
" None | \n",
" CHE - Other | \n",
" Eison | \n",
" /organization/platinum-equity-llc | \n",
" Platinum Equity | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" Los Angeles | \n",
" Beverly Hills | \n",
" 2013-12-10 | \n",
" 2013-12 | \n",
" 2013-Q4 | \n",
" 2013 | \n",
" 1100000000 | \n",
" USD | \n",
"
\n",
" \n",
" 515 | \n",
" /organization/warner-chilcott | \n",
" Warner Chilcott | \n",
" |Pharmaceuticals|Biotechnology| | \n",
" Biotechnology | \n",
" USA | \n",
" NJ | \n",
" Newark | \n",
" Rockaway | \n",
" /organization/actavis | \n",
" Actavis | \n",
" ... | \n",
" USA | \n",
" NJ | \n",
" Newark | \n",
" Parsippany | \n",
" 2013-05-20 | \n",
" 2013-05 | \n",
" 2013-Q2 | \n",
" 2013 | \n",
" 8500000000 | \n",
" USD | \n",
"
\n",
" \n",
" 516 | \n",
" /organization/waze | \n",
" Waze | \n",
" |Navigation|Transportation| | \n",
" Transportation | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Palo Alto | \n",
" /organization/google | \n",
" Google | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Mountain View | \n",
" 2013-06-11 | \n",
" 2013-06 | \n",
" 2013-Q2 | \n",
" 2013 | \n",
" 996000000 | \n",
" USD | \n",
"
\n",
" \n",
" 517 | \n",
" /organization/webex | \n",
" WebEx | \n",
" |Curated Web| | \n",
" Curated Web | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Santa Clara | \n",
" /organization/cisco | \n",
" Cisco | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" San Jose | \n",
" 2007-03-15 | \n",
" 2007-03 | \n",
" 2007-Q1 | \n",
" 2007 | \n",
" 3200000000 | \n",
" USD | \n",
"
\n",
" \n",
" 518 | \n",
" /organization/webtrends | \n",
" Webtrends | \n",
" |SEO|Curated Web| | \n",
" Curated Web | \n",
" USA | \n",
" OR | \n",
" Portland, Oregon | \n",
" Portland | \n",
" /organization/netiq | \n",
" NetIQ | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" San Jose | \n",
" 2001-01-17 | \n",
" 2001-01 | \n",
" 2001-Q1 | \n",
" 2001 | \n",
" 1000000000 | \n",
" USD | \n",
"
\n",
" \n",
" 519 | \n",
" /organization/western-gas-resources | \n",
" Western Gas Resources | \n",
" |Clean Energy|Oil| | \n",
" Clean Energy | \n",
" USA | \n",
" TX | \n",
" Seminole | \n",
" Seminole | \n",
" /organization/anadarko-petroleum-corporation | \n",
" Anadarko Petroleum Corporation | \n",
" ... | \n",
" USA | \n",
" TX | \n",
" Seminole | \n",
" Seminole | \n",
" 2006-06-01 | \n",
" 2006-06 | \n",
" 2006-Q2 | \n",
" 2006 | \n",
" 4800000000 | \n",
" USD | \n",
"
\n",
" \n",
" 520 | \n",
" /organization/weyerhaeuser | \n",
" Weyerhaeuser | \n",
" |Manufacturing| | \n",
" Manufacturing | \n",
" USA | \n",
" WA | \n",
" Seattle | \n",
" Federal Way | \n",
" /organization/tri-pointe-homes | \n",
" Tri Pointe Homes | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" San Ramon | \n",
" 2013-11-04 | \n",
" 2013-11 | \n",
" 2013-Q4 | \n",
" 2013 | \n",
" 2700000000 | \n",
" USD | \n",
"
\n",
" \n",
" 521 | \n",
" /organization/whatsapp | \n",
" WhatsApp | \n",
" |Messaging| | \n",
" Messaging | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Santa Clara | \n",
" /organization/facebook | \n",
" Facebook | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Menlo Park | \n",
" 2014-02-19 | \n",
" 2014-02 | \n",
" 2014-Q1 | \n",
" 2014 | \n",
" 19000000000 | \n",
" USD | \n",
"
\n",
" \n",
" 522 | \n",
" /organization/wheelabrator-technologies | \n",
" Wheelabrator Technologies | \n",
" |Waste Management| | \n",
" Waste Management | \n",
" USA | \n",
" NH | \n",
" Manchester, New Hampshire | \n",
" Hampton | \n",
" /organization/energy-capital-partners | \n",
" Energy Capital Partners | \n",
" ... | \n",
" USA | \n",
" NJ | \n",
" Newark | \n",
" Short Hills | \n",
" 2014-07-29 | \n",
" 2014-07 | \n",
" 2014-Q3 | \n",
" 2014 | \n",
" 1940000000 | \n",
" USD | \n",
"
\n",
" \n",
" 523 | \n",
" /organization/wild-flavors | \n",
" WILD Flavors | \n",
" |Food Processing| | \n",
" Food Processing | \n",
" DEU | \n",
" None | \n",
" Frankfurt | \n",
" Heidelberg | \n",
" /organization/archer-daniels-midland-company | \n",
" Archer Daniels Midland Company | \n",
" ... | \n",
" USA | \n",
" IL | \n",
" Chicago | \n",
" Chicago | \n",
" 2014-07-07 | \n",
" 2014-07 | \n",
" 2014-Q3 | \n",
" 2014 | \n",
" 3000000000 | \n",
" USD | \n",
"
\n",
" \n",
" 524 | \n",
" /organization/wind-river | \n",
" Wind River | \n",
" |Software| | \n",
" Software | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Alameda | \n",
" /organization/intel | \n",
" Intel | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Santa Clara | \n",
" 2009-07-17 | \n",
" 2009-07 | \n",
" 2009-Q3 | \n",
" 2009 | \n",
" 884000000 | \n",
" USD | \n",
"
\n",
" \n",
" 525 | \n",
" /organization/windsor-foods | \n",
" Windsor Foods | \n",
" |Consumer Goods|Food Processing| | \n",
" Food Processing | \n",
" USA | \n",
" TX | \n",
" Houston | \n",
" Houston | \n",
" /organization/ajinomoto | \n",
" Ajinomoto | \n",
" ... | \n",
" JPN | \n",
" None | \n",
" Tokyo | \n",
" Tokyo | \n",
" 2014-09-10 | \n",
" 2014-09 | \n",
" 2014-Q3 | \n",
" 2014 | \n",
" 800000000 | \n",
" USD | \n",
"
\n",
" \n",
" 526 | \n",
" /organization/wittur | \n",
" Wittur | \n",
" |Manufacturing|Public Safety|Heavy Industry| | \n",
" Heavy Industry | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" /organization/bain-capital-2 | \n",
" Bain Capital | \n",
" ... | \n",
" USA | \n",
" MA | \n",
" Boston | \n",
" Boston | \n",
" 2014-12-23 | \n",
" 2014-12 | \n",
" 2014-Q4 | \n",
" 2014 | \n",
" 750000000 | \n",
" USD | \n",
"
\n",
" \n",
" 527 | \n",
" /organization/wyeth-pharmaceuticals | \n",
" Wyeth Pharmaceuticals | \n",
" |Biotechnology|Pharmaceuticals|Health and Well... | \n",
" Biotechnology | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" /organization/nestl | \n",
" Nestl\u0082 | \n",
" ... | \n",
" CHE | \n",
" None | \n",
" Vevey | \n",
" Vevey | \n",
" 2012-04-01 | \n",
" 2012-04 | \n",
" 2012-Q2 | \n",
" 2012 | \n",
" 11850000000 | \n",
" USD | \n",
"
\n",
" \n",
" 528 | \n",
" /organization/wyeth-pharmaceuticals | \n",
" Wyeth Pharmaceuticals | \n",
" |Biotechnology|Pharmaceuticals|Health and Well... | \n",
" Biotechnology | \n",
" None | \n",
" None | \n",
" None | \n",
" None | \n",
" /organization/pfizer | \n",
" Pfizer | \n",
" ... | \n",
" USA | \n",
" NY | \n",
" New York City | \n",
" New York | \n",
" 2009-01-23 | \n",
" 2009-01 | \n",
" 2009-Q1 | \n",
" 2009 | \n",
" 68000000000 | \n",
" USD | \n",
"
\n",
" \n",
" 529 | \n",
" /organization/xircom | \n",
" Xircom | \n",
" |Web Hosting| | \n",
" Web Hosting | \n",
" USA | \n",
" CA | \n",
" Los Angeles | \n",
" Thousand Oaks | \n",
" /organization/intel | \n",
" Intel | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Santa Clara | \n",
" 2001-01-15 | \n",
" 2001-01 | \n",
" 2001-Q1 | \n",
" 2001 | \n",
" 748000000 | \n",
" USD | \n",
"
\n",
" \n",
" 530 | \n",
" /organization/yammer | \n",
" Yammer | \n",
" |Twitter Applications|Networking|Social Media|... | \n",
" Enterprise Software | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" San Francisco | \n",
" /organization/microsoft | \n",
" Microsoft | \n",
" ... | \n",
" USA | \n",
" WA | \n",
" Seattle | \n",
" Redmond | \n",
" 2012-06-25 | \n",
" 2012-06 | \n",
" 2012-Q2 | \n",
" 2012 | \n",
" 1200000000 | \n",
" USD | \n",
"
\n",
" \n",
" 531 | \n",
" /organization/yankee-candle-company | \n",
" Yankee Candle Company | \n",
" |Manufacturing| | \n",
" Manufacturing | \n",
" USA | \n",
" AR | \n",
" Fayetteville | \n",
" Rogers | \n",
" /organization/jarden | \n",
" Jarden | \n",
" ... | \n",
" USA | \n",
" NY | \n",
" New York City | \n",
" Rye | \n",
" 2013-09-03 | \n",
" 2013-09 | \n",
" 2013-Q3 | \n",
" 2013 | \n",
" 1750000000 | \n",
" USD | \n",
"
\n",
" \n",
" 532 | \n",
" /organization/yoplait | \n",
" Yoplait | \n",
" |Food Processing| | \n",
" Food Processing | \n",
" USA | \n",
" MN | \n",
" Minneapolis | \n",
" Minneapolis | \n",
" /organization/generalmills | \n",
" General Mills | \n",
" ... | \n",
" USA | \n",
" MN | \n",
" Minneapolis | \n",
" Minneapolis | \n",
" 2011-07-01 | \n",
" 2011-07 | \n",
" 2011-Q3 | \n",
" 2011 | \n",
" 1200000000 | \n",
" USD | \n",
"
\n",
" \n",
" 533 | \n",
" /organization/york-risk-services-group | \n",
" York Risk Services Group | \n",
" |Insurance| | \n",
" Insurance | \n",
" USA | \n",
" NJ | \n",
" Newark | \n",
" Parsippany | \n",
" /organization/onex | \n",
" Onex | \n",
" ... | \n",
" CAN | \n",
" ON | \n",
" Toronto | \n",
" Toronto | \n",
" 2014-07-16 | \n",
" 2014-07 | \n",
" 2014-Q3 | \n",
" 2014 | \n",
" 1330000000 | \n",
" USD | \n",
"
\n",
" \n",
" 534 | \n",
" /organization/youtube | \n",
" YouTube | \n",
" |Video|Online Rental|Entertainment|Games| | \n",
" Games | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" San Bruno | \n",
" /organization/google | \n",
" Google | \n",
" ... | \n",
" USA | \n",
" CA | \n",
" SF Bay Area | \n",
" Mountain View | \n",
" 2006-10-01 | \n",
" 2006-10 | \n",
" 2006-Q4 | \n",
" 2006 | \n",
" 1650000000 | \n",
" USD | \n",
"
\n",
" \n",
" 535 | \n",
" /organization/zappos | \n",
" Zappos | \n",
" |Curated Web| | \n",
" Curated Web | \n",
" USA | \n",
" NV | \n",
" Las Vegas | \n",
" Las Vegas | \n",
" /organization/amazon | \n",
" Amazon | \n",
" ... | \n",
" USA | \n",
" WA | \n",
" Seattle | \n",
" Seattle | \n",
" 2009-07-22 | \n",
" 2009-07 | \n",
" 2009-Q3 | \n",
" 2009 | \n",
" 1200000000 | \n",
" USD | \n",
"
\n",
" \n",
" 536 | \n",
" /organization/ziggo | \n",
" Ziggo | \n",
" |Information Services|Entertainment|Investment... | \n",
" Investment Management | \n",
" NLD | \n",
" None | \n",
" Utrecht | \n",
" Utrecht | \n",
" /organization/liberty-global | \n",
" Liberty Global | \n",
" ... | \n",
" USA | \n",
" CO | \n",
" Denver | \n",
" Englewood | \n",
" 2014-01-27 | \n",
" 2014-01 | \n",
" 2014-Q1 | \n",
" 2014 | \n",
" 13700000000 | \n",
" USD | \n",
"
\n",
" \n",
"
\n",
"
537 rows \u00d7 22 columns
\n",
"
"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 42,
"text": [
" company_permalink \\\n",
"0 /organization/21st-century-insurance \n",
"1 /organization/3com \n",
"2 /organization/3par \n",
"3 /organization/91-wireless \n",
"4 /organization/abc \n",
"5 /organization/abovenet \n",
"6 /organization/abraxis-bioscience \n",
"7 /organization/accelrys \n",
"8 /organization/access-midstream-partners-lp-inv... \n",
"9 /organization/acclarent \n",
"10 /organization/admob \n",
"11 /organization/advanced-computer-software \n",
"12 /organization/advanstar-communications \n",
"13 /organization/advo \n",
"14 /organization/aeluros \n",
"15 /organization/affiliated-computer-services \n",
"16 /organization/airwatch \n",
"17 /organization/aktiv-kapital \n",
"18 /organization/alabama-gas-corp \n",
"19 /organization/algeta \n",
"20 /organization/alios-biopharma \n",
"21 /organization/allergan \n",
"22 /organization/adl-media-inc \n",
"23 /organization/alltel \n",
"24 /organization/altalink \n",
"25 /organization/amcol-international \n",
"26 /organization/americanexpress \n",
"27 /organization/american-petroleum-tankers \n",
"28 /organization/american-realty-capital-healthca... \n",
"29 /organization/ami-semiconductor \n",
".. ... \n",
"507 /organization/viawest \n",
"508 /organization/viber-media \n",
"509 /organization/video-gaming-technologies \n",
"510 /organization/vion-food-group \n",
"511 /organization/viropharma \n",
"512 /organization/vizada \n",
"513 /organization/vk \n",
"514 /organization/volvo-rents-construction \n",
"515 /organization/warner-chilcott \n",
"516 /organization/waze \n",
"517 /organization/webex \n",
"518 /organization/webtrends \n",
"519 /organization/western-gas-resources \n",
"520 /organization/weyerhaeuser \n",
"521 /organization/whatsapp \n",
"522 /organization/wheelabrator-technologies \n",
"523 /organization/wild-flavors \n",
"524 /organization/wind-river \n",
"525 /organization/windsor-foods \n",
"526 /organization/wittur \n",
"527 /organization/wyeth-pharmaceuticals \n",
"528 /organization/wyeth-pharmaceuticals \n",
"529 /organization/xircom \n",
"530 /organization/yammer \n",
"531 /organization/yankee-candle-company \n",
"532 /organization/yoplait \n",
"533 /organization/york-risk-services-group \n",
"534 /organization/youtube \n",
"535 /organization/zappos \n",
"536 /organization/ziggo \n",
"\n",
" company_name \\\n",
"0 21st Century Insurance \n",
"1 3Com \n",
"2 3PAR \n",
"3 91 Wireless \n",
"4 ABC \n",
"5 AboveNet \n",
"6 Abraxis BioScience \n",
"7 Accelrys \n",
"8 Access Midstream Partners LP Investments \n",
"9 Acclarent \n",
"10 AdMob \n",
"11 Advanced Computer Software \n",
"12 Advanstar Communications \n",
"13 Advo \n",
"14 Aeluros \n",
"15 Affiliated Computer Services \n",
"16 AirWatch \n",
"17 Aktiv Kapital \n",
"18 Alabama Gas Corp \n",
"19 Algeta \n",
"20 Alios BioPharma \n",
"21 Allergan \n",
"22 AllMedia Inc \n",
"23 Alltel \n",
"24 AltaLink \n",
"25 AMCOL International \n",
"26 American Express \n",
"27 American Petroleum Tankers \n",
"28 American Realty Capital Healthcare Trust \n",
"29 AMI Semiconductor \n",
".. ... \n",
"507 ViaWest \n",
"508 Viber Media \n",
"509 Video Gaming Technologies \n",
"510 Vion Food Group \n",
"511 ViroPharma \n",
"512 Vizada \n",
"513 Vkontake \n",
"514 Volvo Rents Construction \n",
"515 Warner Chilcott \n",
"516 Waze \n",
"517 WebEx \n",
"518 Webtrends \n",
"519 Western Gas Resources \n",
"520 Weyerhaeuser \n",
"521 WhatsApp \n",
"522 Wheelabrator Technologies \n",
"523 WILD Flavors \n",
"524 Wind River \n",
"525 Windsor Foods \n",
"526 Wittur \n",
"527 Wyeth Pharmaceuticals \n",
"528 Wyeth Pharmaceuticals \n",
"529 Xircom \n",
"530 Yammer \n",
"531 Yankee Candle Company \n",
"532 Yoplait \n",
"533 York Risk Services Group \n",
"534 YouTube \n",
"535 Zappos \n",
"536 Ziggo \n",
"\n",
" company_category_list company_market \\\n",
"0 |Finance|Business Services|Insurance| Business Services \n",
"1 |Curated Web| Curated Web \n",
"2 |Services|Cloud Data Services|Software| Cloud Data Services \n",
"3 |Mobile| Mobile \n",
"4 |News| News \n",
"5 |Information Technology|Internet|Service Provi... Internet \n",
"6 |Pharmaceuticals|Biotechnology| Biotechnology \n",
"7 |Software| Software \n",
"8 |Natural Gas Uses| Natural Gas Uses \n",
"9 |Medical|Biotechnology| Biotechnology \n",
"10 |Mobile|Advertising| Advertising \n",
"11 |Health Care|Information Technology|Software| Software \n",
"12 |Market Research|Events|Business Services| Events \n",
"13 None None \n",
"14 |Semiconductors| Semiconductors \n",
"15 |Consulting| Consulting \n",
"16 |Mobile Devices|Mobile| Mobile \n",
"17 |Finance| Finance \n",
"18 None None \n",
"19 |Biotechnology| Biotechnology \n",
"20 |Biotechnology| Biotechnology \n",
"21 |Medical|Pharmaceuticals|Biotechnology| Medical \n",
"22 |Advertising| Advertising \n",
"23 |Public Relations| Public Relations \n",
"24 |Local Businesses|Energy|Utilities| Utilities \n",
"25 |Manufacturing| Manufacturing \n",
"26 |Finance| Finance \n",
"27 None None \n",
"28 None None \n",
"29 |Electronics|Manufacturing|Design| Design \n",
".. ... ... \n",
"507 |Data Centers|Services|Web Hosting| Web Hosting \n",
"508 |Android|iPhone|VoIP|Messaging| Android \n",
"509 |Video Games| Video Games \n",
"510 |Hospitality| Hospitality \n",
"511 |Pharmaceuticals|Biotechnology| Biotechnology \n",
"512 |DOD/Military|Mobile| Mobile \n",
"513 |Communities|Social Network Media|Social Media| Communities \n",
"514 None None \n",
"515 |Pharmaceuticals|Biotechnology| Biotechnology \n",
"516 |Navigation|Transportation| Transportation \n",
"517 |Curated Web| Curated Web \n",
"518 |SEO|Curated Web| Curated Web \n",
"519 |Clean Energy|Oil| Clean Energy \n",
"520 |Manufacturing| Manufacturing \n",
"521 |Messaging| Messaging \n",
"522 |Waste Management| Waste Management \n",
"523 |Food Processing| Food Processing \n",
"524 |Software| Software \n",
"525 |Consumer Goods|Food Processing| Food Processing \n",
"526 |Manufacturing|Public Safety|Heavy Industry| Heavy Industry \n",
"527 |Biotechnology|Pharmaceuticals|Health and Well... Biotechnology \n",
"528 |Biotechnology|Pharmaceuticals|Health and Well... Biotechnology \n",
"529 |Web Hosting| Web Hosting \n",
"530 |Twitter Applications|Networking|Social Media|... Enterprise Software \n",
"531 |Manufacturing| Manufacturing \n",
"532 |Food Processing| Food Processing \n",
"533 |Insurance| Insurance \n",
"534 |Video|Online Rental|Entertainment|Games| Games \n",
"535 |Curated Web| Curated Web \n",
"536 |Information Services|Entertainment|Investment... Investment Management \n",
"\n",
" company_country_code company_state_code company_region \\\n",
"0 USA AL AL - Other \n",
"1 USA MA Boston \n",
"2 USA CA SF Bay Area \n",
"3 CHN None Fuzhou Shi \n",
"4 None None None \n",
"5 USA NY New York City \n",
"6 USA CA Los Angeles \n",
"7 USA CA San Diego \n",
"8 USA OK Oklahoma City \n",
"9 USA CA SF Bay Area \n",
"10 USA CA SF Bay Area \n",
"11 GBR None Cobham \n",
"12 USA CA Los Angeles \n",
"13 None None None \n",
"14 USA CA SF Bay Area \n",
"15 USA TX Dallas \n",
"16 USA GA Atlanta \n",
"17 None None None \n",
"18 USA AL Birmingham \n",
"19 NOR None Oslo \n",
"20 USA CA SF Bay Area \n",
"21 USA CA Anaheim \n",
"22 USA IA IA - Other \n",
"23 USA AR Little Rock \n",
"24 CAN AB Calgary \n",
"25 USA IL Chicago \n",
"26 USA NY New York City \n",
"27 USA PA Philadelphia \n",
"28 USA MA Boston \n",
"29 USA ID Idaho Falls \n",
".. ... ... ... \n",
"507 USA CO Denver \n",
"508 CYP None Cyprus \n",
"509 USA TN Nashville \n",
"510 NLD None Eindhoven \n",
"511 USA PA Philadelphia \n",
"512 USA MD Washington, D.C. \n",
"513 RUS None St. Petersburg \n",
"514 CHE None CHE - Other \n",
"515 USA NJ Newark \n",
"516 USA CA SF Bay Area \n",
"517 USA CA SF Bay Area \n",
"518 USA OR Portland, Oregon \n",
"519 USA TX Seminole \n",
"520 USA WA Seattle \n",
"521 USA CA SF Bay Area \n",
"522 USA NH Manchester, New Hampshire \n",
"523 DEU None Frankfurt \n",
"524 USA CA SF Bay Area \n",
"525 USA TX Houston \n",
"526 None None None \n",
"527 None None None \n",
"528 None None None \n",
"529 USA CA Los Angeles \n",
"530 USA CA SF Bay Area \n",
"531 USA AR Fayetteville \n",
"532 USA MN Minneapolis \n",
"533 USA NJ Newark \n",
"534 USA CA SF Bay Area \n",
"535 USA NV Las Vegas \n",
"536 NLD None Utrecht \n",
"\n",
" company_city acquirer_permalink \\\n",
"0 De Armanville /organization/american-international-group \n",
"1 Marlborough /organization/hewlett-packard \n",
"2 Fremont /organization/hewlett-packard \n",
"3 Fuzhou Shi /organization/baidu \n",
"4 None /organization/the-walt-disney-company \n",
"5 White Plains /organization/zayo-group \n",
"6 Los Angeles /organization/celgene \n",
"7 San Diego /organization/dassault \n",
"8 Oklahoma City /organization/the-williams-companies \n",
"9 Menlo Park /organization/johnson-johnson \n",
"10 Mountain View /organization/google \n",
"11 Cobham /organization/vista-equity-partners \n",
"12 Santa Monica /organization/ubm-electronics \n",
"13 None /organization/valassis \n",
"14 Mountain View /organization/broadcom \n",
"15 Dallas /organization/xerox \n",
"16 Atlanta /organization/vmware \n",
"17 None /organization/portfolio-recovery-associates \n",
"18 Birmingham /organization/laclede-group \n",
"19 Oslo /organization/bayer-ag-germany \n",
"20 South San Francisco /organization/johnson-johnson \n",
"21 Irvine /organization/actavis \n",
"22 Plano /organization/liberty-global \n",
"23 Little Rock /organization/verizon \n",
"24 Calgary /organization/midamerican-energy \n",
"25 Hoffman Estates /organization/minerals-technologies \n",
"26 New York /organization/standard-chartered-bank \n",
"27 Plymouth Meeting /organization/kinder-morgan \n",
"28 Boston /organization/ventas \n",
"29 Pocatello /organization/on-semiconductor \n",
".. ... ... \n",
"507 Greenwood Village /organization/shaw-communications \n",
"508 Limassol /organization/rakuten \n",
"509 Franklin /organization/aristocrat-technologies-inc \n",
"510 Eindhoven /organization/darling-international \n",
"511 Exton /organization/shire \n",
"512 Rockville /organization/eads-astrium \n",
"513 Saint Petersburg /organization/mail-ru \n",
"514 Eison /organization/platinum-equity-llc \n",
"515 Rockaway /organization/actavis \n",
"516 Palo Alto /organization/google \n",
"517 Santa Clara /organization/cisco \n",
"518 Portland /organization/netiq \n",
"519 Seminole /organization/anadarko-petroleum-corporation \n",
"520 Federal Way /organization/tri-pointe-homes \n",
"521 Santa Clara /organization/facebook \n",
"522 Hampton /organization/energy-capital-partners \n",
"523 Heidelberg /organization/archer-daniels-midland-company \n",
"524 Alameda /organization/intel \n",
"525 Houston /organization/ajinomoto \n",
"526 None /organization/bain-capital-2 \n",
"527 None /organization/nestl \n",
"528 None /organization/pfizer \n",
"529 Thousand Oaks /organization/intel \n",
"530 San Francisco /organization/microsoft \n",
"531 Rogers /organization/jarden \n",
"532 Minneapolis /organization/generalmills \n",
"533 Parsippany /organization/onex \n",
"534 San Bruno /organization/google \n",
"535 Las Vegas /organization/amazon \n",
"536 Utrecht /organization/liberty-global \n",
"\n",
" acquirer_name ... \\\n",
"0 American International Group ... \n",
"1 Hewlett-Packard ... \n",
"2 Hewlett-Packard ... \n",
"3 Baidu ... \n",
"4 The Walt Disney Company ... \n",
"5 Zayo ... \n",
"6 Celgene ... \n",
"7 Dassault Systemes ... \n",
"8 The Williams Companies ... \n",
"9 Johnson & Johnson ... \n",
"10 Google ... \n",
"11 Vista Equity Partners ... \n",
"12 UBM Electronics ... \n",
"13 Valassis ... \n",
"14 Broadcom ... \n",
"15 Xerox ... \n",
"16 VMware ... \n",
"17 Portfolio Recovery Associates ... \n",
"18 Laclede Group ... \n",
"19 Bayer AG ... \n",
"20 Johnson & Johnson ... \n",
"21 Actavis ... \n",
"22 Liberty Global ... \n",
"23 Verizon Communications ... \n",
"24 Berkshire Hathaway Energy ... \n",
"25 Minerals Technologies ... \n",
"26 Standard Chartered Bank ... \n",
"27 Kinder Morgan Energy ... \n",
"28 Ventas ... \n",
"29 ON Semiconductor ... \n",
".. ... ... \n",
"507 Shaw Communications ... \n",
"508 Rakuten ... \n",
"509 Aristocrat Technologies, Inc ... \n",
"510 Darling Ingredients ... \n",
"511 Shire ... \n",
"512 EADS Astrium ... \n",
"513 Mail.Ru Group ... \n",
"514 Platinum Equity ... \n",
"515 Actavis ... \n",
"516 Google ... \n",
"517 Cisco ... \n",
"518 NetIQ ... \n",
"519 Anadarko Petroleum Corporation ... \n",
"520 Tri Pointe Homes ... \n",
"521 Facebook ... \n",
"522 Energy Capital Partners ... \n",
"523 Archer Daniels Midland Company ... \n",
"524 Intel ... \n",
"525 Ajinomoto ... \n",
"526 Bain Capital ... \n",
"527 Nestl\u0082 ... \n",
"528 Pfizer ... \n",
"529 Intel ... \n",
"530 Microsoft ... \n",
"531 Jarden ... \n",
"532 General Mills ... \n",
"533 Onex ... \n",
"534 Google ... \n",
"535 Amazon ... \n",
"536 Liberty Global ... \n",
"\n",
" acquirer_country_code acquirer_state_code acquirer_region \\\n",
"0 USA NY New York City \n",
"1 USA CA SF Bay Area \n",
"2 USA CA SF Bay Area \n",
"3 CHN None Beijing \n",
"4 USA CA Los Angeles \n",
"5 USA CO Denver \n",
"6 USA NJ Newark \n",
"7 FRA None None \n",
"8 USA OK OK - Other \n",
"9 USA NJ Newark \n",
"10 USA CA SF Bay Area \n",
"11 USA TX Austin \n",
"12 USA CA SF Bay Area \n",
"13 USA MI Detroit \n",
"14 USA CA Anaheim \n",
"15 USA AL AL - Other \n",
"16 USA CA SF Bay Area \n",
"17 USA VA Norfolk - Virginia Beach \n",
"18 USA MO St. Louis \n",
"19 DEU None Leverkusen \n",
"20 USA NJ Newark \n",
"21 USA NJ Newark \n",
"22 USA CO Denver \n",
"23 USA NY New York City \n",
"24 USA IA Des Moines \n",
"25 USA NY New York City \n",
"26 GBR None London \n",
"27 USA TX Houston \n",
"28 USA IL Chicago \n",
"29 USA AZ Phoenix \n",
".. ... ... ... \n",
"507 CAN AB Calgary \n",
"508 JPN None Tokyo \n",
"509 AUS None AUS - Other \n",
"510 USA TX Dallas \n",
"511 IRL None Dublin \n",
"512 USA TX Houston \n",
"513 RUS None Moscow \n",
"514 USA CA Los Angeles \n",
"515 USA NJ Newark \n",
"516 USA CA SF Bay Area \n",
"517 USA CA SF Bay Area \n",
"518 USA CA SF Bay Area \n",
"519 USA TX Seminole \n",
"520 USA CA SF Bay Area \n",
"521 USA CA SF Bay Area \n",
"522 USA NJ Newark \n",
"523 USA IL Chicago \n",
"524 USA CA SF Bay Area \n",
"525 JPN None Tokyo \n",
"526 USA MA Boston \n",
"527 CHE None Vevey \n",
"528 USA NY New York City \n",
"529 USA CA SF Bay Area \n",
"530 USA WA Seattle \n",
"531 USA NY New York City \n",
"532 USA MN Minneapolis \n",
"533 CAN ON Toronto \n",
"534 USA CA SF Bay Area \n",
"535 USA WA Seattle \n",
"536 USA CO Denver \n",
"\n",
" acquirer_city acquired_at acquired_month acquired_quarter acquired_year \\\n",
"0 New York 2007-09-01 2007-09 2007-Q3 2007 \n",
"1 Palo Alto 2009-11-11 2009-11 2009-Q4 2009 \n",
"2 Palo Alto 2010-09-02 2010-09 2010-Q3 2010 \n",
"3 Beijing 2013-07-15 2013-07 2013-Q3 2013 \n",
"4 Burbank 1995-08-05 1995-08 1995-Q3 1995 \n",
"5 Boulder 2012-03-19 2012-03 2012-Q1 2012 \n",
"6 Summit 2010-06-30 2010-06 2010-Q2 2010 \n",
"7 None 2014-01-30 2014-01 2014-Q1 2014 \n",
"8 Snyder 2012-12-21 2012-12 2012-Q4 2012 \n",
"9 New Brunswick 2009-12-17 2009-12 2009-Q4 2009 \n",
"10 Mountain View 2009-11-09 2009-11 2009-Q4 2009 \n",
"11 Austin 2014-11-25 2014-11 2014-Q4 2014 \n",
"12 San Francisco 2014-10-01 2014-10 2014-Q4 2014 \n",
"13 Livonia 2006-07-06 2006-07 2006-Q3 2006 \n",
"14 Irvine 2012-02-17 2012-02 2012-Q1 2012 \n",
"15 Normal 2009-09-28 2009-09 2009-Q3 2009 \n",
"16 Palo Alto 2014-01-22 2014-01 2014-Q1 2014 \n",
"17 Norfolk 2014-02-20 2014-02 2014-Q1 2014 \n",
"18 St Louis 2014-04-07 2014-04 2014-Q2 2014 \n",
"19 Leverkusen 2013-12-19 2013-12 2013-Q4 2013 \n",
"20 New Brunswick 2014-09-30 2014-09 2014-Q3 2014 \n",
"21 Parsippany 2014-11-16 2014-11 2014-Q4 2014 \n",
"22 Englewood 2014-05-08 2014-05 2014-Q2 2014 \n",
"23 New York 2008-06-05 2008-06 2008-Q2 2008 \n",
"24 Des Moines 2014-05-02 2014-05 2014-Q2 2014 \n",
"25 New York 2014-03-10 2014-03 2014-Q1 2014 \n",
"26 London 2008-02-28 2008-02 2008-Q1 2008 \n",
"27 Houston 2013-12-23 2013-12 2013-Q4 2013 \n",
"28 Chicago 2014-06-02 2014-06 2014-Q2 2014 \n",
"29 Phoenix 2007-12-14 2007-12 2007-Q4 2007 \n",
".. ... ... ... ... ... \n",
"507 Calgary 2014-08-04 2014-08 2014-Q3 2014 \n",
"508 Tokyo 2014-02-14 2014-02 2014-Q1 2014 \n",
"509 None 2014-07-07 2014-07 2014-Q3 2014 \n",
"510 Irving 2013-10-07 2013-10 2013-Q4 2013 \n",
"511 Dublin 2013-11-10 2013-11 2013-Q4 2013 \n",
"512 Houston 2011-08-03 2011-08 2011-Q3 2011 \n",
"513 Moscow 2014-09-16 2014-09 2014-Q3 2014 \n",
"514 Beverly Hills 2013-12-10 2013-12 2013-Q4 2013 \n",
"515 Parsippany 2013-05-20 2013-05 2013-Q2 2013 \n",
"516 Mountain View 2013-06-11 2013-06 2013-Q2 2013 \n",
"517 San Jose 2007-03-15 2007-03 2007-Q1 2007 \n",
"518 San Jose 2001-01-17 2001-01 2001-Q1 2001 \n",
"519 Seminole 2006-06-01 2006-06 2006-Q2 2006 \n",
"520 San Ramon 2013-11-04 2013-11 2013-Q4 2013 \n",
"521 Menlo Park 2014-02-19 2014-02 2014-Q1 2014 \n",
"522 Short Hills 2014-07-29 2014-07 2014-Q3 2014 \n",
"523 Chicago 2014-07-07 2014-07 2014-Q3 2014 \n",
"524 Santa Clara 2009-07-17 2009-07 2009-Q3 2009 \n",
"525 Tokyo 2014-09-10 2014-09 2014-Q3 2014 \n",
"526 Boston 2014-12-23 2014-12 2014-Q4 2014 \n",
"527 Vevey 2012-04-01 2012-04 2012-Q2 2012 \n",
"528 New York 2009-01-23 2009-01 2009-Q1 2009 \n",
"529 Santa Clara 2001-01-15 2001-01 2001-Q1 2001 \n",
"530 Redmond 2012-06-25 2012-06 2012-Q2 2012 \n",
"531 Rye 2013-09-03 2013-09 2013-Q3 2013 \n",
"532 Minneapolis 2011-07-01 2011-07 2011-Q3 2011 \n",
"533 Toronto 2014-07-16 2014-07 2014-Q3 2014 \n",
"534 Mountain View 2006-10-01 2006-10 2006-Q4 2006 \n",
"535 Seattle 2009-07-22 2009-07 2009-Q3 2009 \n",
"536 Englewood 2014-01-27 2014-01 2014-Q1 2014 \n",
"\n",
" price_amount price_currency_code \n",
"0 749000000 USD \n",
"1 2700000000 USD \n",
"2 2350000000 USD \n",
"3 1900000000 USD \n",
"4 19000000000 USD \n",
"5 2300000000 USD \n",
"6 2900000000 USD \n",
"7 750000000 USD \n",
"8 2250000000 USD \n",
"9 785000000 USD \n",
"10 750000000 USD \n",
"11 725000000 GBP \n",
"12 972000000 USD \n",
"13 1300000000 USD \n",
"14 3700000000 USD \n",
"15 5750000000 USD \n",
"16 1540000000 USD \n",
"17 1300000000 USD \n",
"18 1340000000 USD \n",
"19 2900000000 USD \n",
"20 1750000000 USD \n",
"21 65500000000 USD \n",
"22 930000000 USD \n",
"23 28100000000 USD \n",
"24 3520000000 USD \n",
"25 1700000000 USD \n",
"26 823000000 USD \n",
"27 962000000 USD \n",
"28 2600000000 USD \n",
"29 915000000 USD \n",
".. ... ... \n",
"507 1200000000 USD \n",
"508 900000000 USD \n",
"509 1300000000 USD \n",
"510 1600000000 EUR \n",
"511 4200000000 USD \n",
"512 960000000 USD \n",
"513 1470000000 USD \n",
"514 1100000000 USD \n",
"515 8500000000 USD \n",
"516 996000000 USD \n",
"517 3200000000 USD \n",
"518 1000000000 USD \n",
"519 4800000000 USD \n",
"520 2700000000 USD \n",
"521 19000000000 USD \n",
"522 1940000000 USD \n",
"523 3000000000 USD \n",
"524 884000000 USD \n",
"525 800000000 USD \n",
"526 750000000 USD \n",
"527 11850000000 USD \n",
"528 68000000000 USD \n",
"529 748000000 USD \n",
"530 1200000000 USD \n",
"531 1750000000 USD \n",
"532 1200000000 USD \n",
"533 1330000000 USD \n",
"534 1650000000 USD \n",
"535 1200000000 USD \n",
"536 13700000000 USD \n",
"\n",
"[537 rows x 22 columns]"
]
}
],
"prompt_number": 42
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"companies.loc[0]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 43,
"text": [
"permalink /organization/-fame\n",
"name #fame\n",
"homepage_url http://livfame.com\n",
"category_list |Media|\n",
"market Media \n",
"status operating\n",
"country_code IND\n",
"state_code NaN\n",
"region Mumbai\n",
"city Mumbai\n",
"funding_rounds 1\n",
"founded_at nan\n",
"founded_month NaN\n",
"founded_quarter NaN\n",
"founded_year NaN\n",
"first_funding_at 2015-01-05\n",
"last_funding_at 2015-01-05\n",
"funding_total_usd 1e+07\n",
"Name: 0, dtype: object"
]
}
],
"prompt_number": 43
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Semi and Anti-Joins\n",
"===\n",
"- Implement efficient \"set filter\"-type relational algebra\n",
"- Can be implemented as an explicit join, but only when no row duplication possible\n",
"- Some databases support explicit `SEMI JOIN` or `ANTI JOIN`\n",
"- No direct analogue in pandas; must implement using explicit set filtering"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"rounds.loc[0]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 44,
"text": [
"company_permalink /organization/-fame\n",
"company_name #fame\n",
"company_category_list |Media|\n",
"company_market Media\n",
"company_country_code IND\n",
"company_state_code NaN\n",
"company_region Mumbai\n",
"company_city Mumbai\n",
"funding_round_permalink /funding-round/9a01d05418af9f794eebff7ace91f638\n",
"funding_round_type venture\n",
"funding_round_code B\n",
"funded_at 2015-01-05\n",
"funded_month 2015-01\n",
"funded_quarter 2015-Q1 \n",
"funded_year 2015\n",
"raised_amount_usd 10,000,000 \n",
"Unnamed: 16 NaN\n",
"Name: 0, dtype: object"
]
}
],
"prompt_number": 44
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"companies.loc[0]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 45,
"text": [
"permalink /organization/-fame\n",
"name #fame\n",
"homepage_url http://livfame.com\n",
"category_list |Media|\n",
"market Media \n",
"status operating\n",
"country_code IND\n",
"state_code NaN\n",
"region Mumbai\n",
"city Mumbai\n",
"funding_rounds 1\n",
"founded_at nan\n",
"founded_month NaN\n",
"founded_quarter NaN\n",
"founded_year NaN\n",
"first_funding_at 2015-01-05\n",
"last_funding_at 2015-01-05\n",
"funding_total_usd 1e+07\n",
"Name: 0, dtype: object"
]
}
],
"prompt_number": 45
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Use this One Weird Trick\n",
"==="
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"len(companies)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 46,
"text": [
"54292"
]
}
],
"prompt_number": 46
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"query = \"\"\"\n",
"select status, count(*) as count\n",
"from companies c\n",
"where not exists (\n",
" select 1\n",
" from rounds r\n",
" where c.permalink = r.company_permalink\n",
" and r.funding_round_type = 'venture' \n",
") \n",
" and c.founded_at > '2010-01-01'\n",
"group by 1;\n",
"\"\"\"\n",
"sql.read_sql(query, con)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" status | \n",
" count | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" None | \n",
" 250 | \n",
"
\n",
" \n",
" 1 | \n",
" operating | \n",
" 12309 | \n",
"
\n",
" \n",
" 2 | \n",
" closed | \n",
" 542 | \n",
"
\n",
" \n",
" 3 | \n",
" acquired | \n",
" 288 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 47,
"text": [
" status count\n",
"0 None 250\n",
"1 operating 12309\n",
"2 closed 542\n",
"3 acquired 288"
]
}
],
"prompt_number": 47
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" QUERY PLAN \n",
" ----------------------------------------------------------------------------------\n",
" HashAggregate (cost=8454.36..8454.38 rows=2 width=9)\n",
" -> Hash Semi Join (cost=4466.18..8388.91 rows=13090 width=9)\n",
" Hash Cond: ((c.permalink)::text = (r.company_permalink)::text)\n",
" -> Seq Scan on companies c (cost=0.00..2143.65 rows=18526 width=36)\n",
" Filter: (founded_at > '2010-01-01'::date)\n",
" -> Hash (cost=3635.51..3635.51 rows=42933 width=26)\n",
" -> Seq Scan on rounds r (cost=0.00..3635.51 rows=42933 width=26)\n",
" Filter: ((funding_round_type)::text = 'venture'::text)\n",
" (8 rows)\n"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"rounds[:20]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" company_permalink | \n",
" company_name | \n",
" company_category_list | \n",
" company_market | \n",
" company_country_code | \n",
" company_state_code | \n",
" company_region | \n",
" company_city | \n",
" funding_round_permalink | \n",
" funding_round_type | \n",
" funding_round_code | \n",
" funded_at | \n",
" funded_month | \n",
" funded_quarter | \n",
" funded_year | \n",
" raised_amount_usd | \n",
" Unnamed: 16 | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" /organization/-fame | \n",
" #fame | \n",
" |Media| | \n",
" Media | \n",
" IND | \n",
" NaN | \n",
" Mumbai | \n",
" Mumbai | \n",
" /funding-round/9a01d05418af9f794eebff7ace91f638 | \n",
" venture | \n",
" B | \n",
" 2015-01-05 | \n",
" 2015-01 | \n",
" 2015-Q1 | \n",
" 2015 | \n",
" 10,000,000 | \n",
" NaN | \n",
"
\n",
" \n",
" 1 | \n",
" /organization/hashoff | \n",
" #HASHOFF | \n",
" |Digital Media|Internet|Social Media| | \n",
" Digital Media | \n",
" USA | \n",
" CO | \n",
" Denver | \n",
" Denver | \n",
" /funding-round/669d6203c0374e6cf0e8d10f75ba0b8a | \n",
" debt_financing | \n",
" NaN | \n",
" 2014-12-08 | \n",
" 2014-12 | \n",
" 2014-Q4 | \n",
" 2014 | \n",
" 455,000 | \n",
" NaN | \n",
"
\n",
" \n",
" 2 | \n",
" /organization/waywire | \n",
" #waywire | \n",
" |Entertainment|Politics|Social Media|News| | \n",
" News | \n",
" USA | \n",
" NY | \n",
" New York City | \n",
" New York | \n",
" /funding-round/cc409188fa2b63482bd9008f682c2efa | \n",
" seed | \n",
" NaN | \n",
" 2012-06-30 | \n",
" 2012-06 | \n",
" 2012-Q2 | \n",
" 2012 | \n",
" 1,750,000 | \n",
" NaN | \n",
"
\n",
" \n",
" 3 | \n",
" /organization/tv-communications | \n",
" &TV Communications | \n",
" |Games| | \n",
" Games | \n",
" USA | \n",
" CA | \n",
" Los Angeles | \n",
" Los Angeles | \n",
" /funding-round/86d22afc65107b6941e6c43c671ecbb8 | \n",
" venture | \n",
" NaN | \n",
" 2010-06-04 | \n",
" 2010-06 | \n",
" 2010-Q2 | \n",
" 2010 | \n",
" 1,000,000 | \n",
" NaN | \n",
"
\n",
" \n",
" 4 | \n",
" /organization/tv-communications | \n",
" &TV Communications | \n",
" |Games| | \n",
" Games | \n",
" USA | \n",
" CA | \n",
" Los Angeles | \n",
" Los Angeles | \n",
" /funding-round/59a3669a64e39360c2b939300bcda162 | \n",
" venture | \n",
" NaN | \n",
" 2010-09-23 | \n",
" 2010-09 | \n",
" 2010-Q3 | \n",
" 2010 | \n",
" 3,000,000 | \n",
" NaN | \n",
"
\n",
" \n",
" 5 | \n",
" /organization/rock-your-paper | \n",
" 'Rock' Your Paper | \n",
" |Publishing|Education| | \n",
" Publishing | \n",
" EST | \n",
" NaN | \n",
" Tallinn | \n",
" Tallinn | \n",
" /funding-round/f06b420775f7cb6c1541a9db526534bb | \n",
" seed | \n",
" NaN | \n",
" 2012-08-09 | \n",
" 2012-08 | \n",
" 2012-Q3 | \n",
" 2012 | \n",
" 40,000 | \n",
" NaN | \n",
"
\n",
" \n",
" 6 | \n",
" /organization/in-touch-network | \n",
" (In)Touch Network | \n",
" |Electronics|Guides|Coffee|Restaurants|Music|i... | \n",
" Electronics | \n",
" GBR | \n",
" NaN | \n",
" London | \n",
" London | \n",
" /funding-round/33c3f135f05d7b734b8d7b7c8ae82647 | \n",
" seed | \n",
" NaN | \n",
" 2011-04-01 | \n",
" 2011-04 | \n",
" 2011-Q2 | \n",
" 2011 | \n",
" 1,500,000 | \n",
" NaN | \n",
"
\n",
" \n",
" 7 | \n",
" /organization/r-ranch-and-mine | \n",
" -R- Ranch and Mine | \n",
" |Tourism|Entertainment|Games| | \n",
" Entertainment | \n",
" USA | \n",
" TX | \n",
" Dallas | \n",
" Fort Worth | \n",
" /funding-round/029720f7eeb218f51c43df5155671472 | \n",
" equity_crowdfunding | \n",
" NaN | \n",
" 2014-08-17 | \n",
" 2014-08 | \n",
" 2014-Q3 | \n",
" 2014 | \n",
" 10,000 | \n",
" NaN | \n",
"
\n",
" \n",
" 8 | \n",
" /organization/r-ranch-and-mine | \n",
" -R- Ranch and Mine | \n",
" |Tourism|Entertainment|Games| | \n",
" Entertainment | \n",
" USA | \n",
" TX | \n",
" Dallas | \n",
" Fort Worth | \n",
" /funding-round/766b0bcead9ca8560af5e1ade579fb7f | \n",
" equity_crowdfunding | \n",
" NaN | \n",
" 2014-09-26 | \n",
" 2014-09 | \n",
" 2014-Q3 | \n",
" 2014 | \n",
" 50,000 | \n",
" NaN | \n",
"
\n",
" \n",
" 9 | \n",
" /organization/club-domains | \n",
" .Club Domains | \n",
" |Software| | \n",
" Software | \n",
" USA | \n",
" FL | \n",
" Ft. Lauderdale | \n",
" Oakland Park | \n",
" /funding-round/aee671c9707f8278a9544c8ae37650e0 | \n",
" venture | \n",
" B | \n",
" 2013-05-31 | \n",
" 2013-05 | \n",
" 2013-Q2 | \n",
" 2013 | \n",
" 7,000,000 | \n",
" NaN | \n",
"
\n",
" \n",
" 10 | \n",
" /organization/fox-networks | \n",
" .Fox Networks | \n",
" |Advertising| | \n",
" Advertising | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" /funding-round/69a1536a6f4506538afd7aa7241ddbe1 | \n",
" undisclosed | \n",
" NaN | \n",
" 2007-01-16 | \n",
" 2007-01 | \n",
" 2007-Q1 | \n",
" 2007 | \n",
" 4,912,393 | \n",
" NaN | \n",
"
\n",
" \n",
" 11 | \n",
" /organization/0-6-com | \n",
" 0-6.com | \n",
" |Curated Web| | \n",
" Curated Web | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" /funding-round/5727accaeaa57461bd22a9bdd945382d | \n",
" venture | \n",
" A | \n",
" 2008-03-19 | \n",
" 2008-03 | \n",
" 2008-Q1 | \n",
" 2008 | \n",
" 2,000,000 | \n",
" NaN | \n",
"
\n",
" \n",
" 12 | \n",
" /organization/004-technologies | \n",
" 004 Technologies | \n",
" |Software| | \n",
" Software | \n",
" USA | \n",
" IL | \n",
" Springfield, Illinois | \n",
" Champaign | \n",
" /funding-round/1278dd4e6a37fa4b7d7e06c21b3c1830 | \n",
" venture | \n",
" NaN | \n",
" 2014-07-24 | \n",
" 2014-07 | \n",
" 2014-Q3 | \n",
" 2014 | \n",
" NaN | \n",
" NaN | \n",
"
\n",
" \n",
" 13 | \n",
" /organization/01games-technology | \n",
" 01Games Technology | \n",
" |Games| | \n",
" Games | \n",
" HKG | \n",
" NaN | \n",
" Hong Kong | \n",
" Hong Kong | \n",
" /funding-round/7d53696f2b4f607a2f2a8cbb83d01839 | \n",
" seed | \n",
" NaN | \n",
" 2014-07-01 | \n",
" 2014-07 | \n",
" 2014-Q3 | \n",
" 2014 | \n",
" 41,250 | \n",
" NaN | \n",
"
\n",
" \n",
" 14 | \n",
" /organization/1-2-3-listo | \n",
" 1,2,3 Listo | \n",
" |E-Commerce| | \n",
" E-Commerce | \n",
" CHL | \n",
" NaN | \n",
" Santiago | \n",
" Las Condes | \n",
" /funding-round/6de4609e894495105bc791ed64361288 | \n",
" seed | \n",
" NaN | \n",
" 2013-02-18 | \n",
" 2013-02 | \n",
" 2013-Q1 | \n",
" 2013 | \n",
" 40,000 | \n",
" NaN | \n",
"
\n",
" \n",
" 15 | \n",
" /organization/1-4-all | \n",
" 1-4 All | \n",
" |Entertainment|Games|Software| | \n",
" Software | \n",
" USA | \n",
" NC | \n",
" NC - Other | \n",
" Connellys Springs | \n",
" /funding-round/e97a192e13ea0ee3c4f71136b4f3ec16 | \n",
" equity_crowdfunding | \n",
" NaN | \n",
" 2013-04-21 | \n",
" 2013-04 | \n",
" 2013-Q2 | \n",
" 2013 | \n",
" NaN | \n",
" NaN | \n",
"
\n",
" \n",
" 16 | \n",
" /organization/1-800-dentist | \n",
" 1-800-DENTIST | \n",
" |Health and Wellness| | \n",
" Health and Wellness | \n",
" USA | \n",
" CA | \n",
" Los Angeles | \n",
" Los Angeles | \n",
" /funding-round/5274aacc211163fc7c86539ce94bbacc | \n",
" undisclosed | \n",
" NaN | \n",
" 2010-08-19 | \n",
" 2010-08 | \n",
" 2010-Q3 | \n",
" 2010 | \n",
" NaN | \n",
" NaN | \n",
"
\n",
" \n",
" 17 | \n",
" /organization/1-800-doctors | \n",
" 1-800-DOCTORS | \n",
" |Health and Wellness| | \n",
" Health and Wellness | \n",
" USA | \n",
" NJ | \n",
" Newark | \n",
" Iselin | \n",
" /funding-round/9eb8c7790a0c200d79e75785d1c4aa12 | \n",
" convertible_note | \n",
" NaN | \n",
" 2011-03-02 | \n",
" 2011-03 | \n",
" 2011-Q1 | \n",
" 2011 | \n",
" 1,750,000 | \n",
" NaN | \n",
"
\n",
" \n",
" 18 | \n",
" /organization/1-618-technology | \n",
" 1.618 Technology | \n",
" |Real Estate| | \n",
" Real Estate | \n",
" USA | \n",
" FL | \n",
" Orlando | \n",
" Orlando | \n",
" /funding-round/83b8f4c7d37ecef5e001a5e953bf461a | \n",
" equity_crowdfunding | \n",
" NaN | \n",
" 2014-01-22 | \n",
" 2014-01 | \n",
" 2014-Q1 | \n",
" 2014 | \n",
" NaN | \n",
" NaN | \n",
"
\n",
" \n",
" 19 | \n",
" /organization/10-minutes-with | \n",
" 10 Minutes With | \n",
" |Education| | \n",
" Education | \n",
" GBR | \n",
" NaN | \n",
" London | \n",
" London | \n",
" /funding-round/f245a74b4c54610ae843e17bdf4d1113 | \n",
" seed | \n",
" NaN | \n",
" 2013-01-01 | \n",
" 2013-01 | \n",
" 2013-Q1 | \n",
" 2013 | \n",
" 400,000 | \n",
" NaN | \n",
"
\n",
" \n",
"
\n",
"
"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 48,
"text": [
" company_permalink company_name \\\n",
"0 /organization/-fame #fame \n",
"1 /organization/hashoff #HASHOFF \n",
"2 /organization/waywire #waywire \n",
"3 /organization/tv-communications &TV Communications \n",
"4 /organization/tv-communications &TV Communications \n",
"5 /organization/rock-your-paper 'Rock' Your Paper \n",
"6 /organization/in-touch-network (In)Touch Network \n",
"7 /organization/r-ranch-and-mine -R- Ranch and Mine \n",
"8 /organization/r-ranch-and-mine -R- Ranch and Mine \n",
"9 /organization/club-domains .Club Domains \n",
"10 /organization/fox-networks .Fox Networks \n",
"11 /organization/0-6-com 0-6.com \n",
"12 /organization/004-technologies 004 Technologies \n",
"13 /organization/01games-technology 01Games Technology \n",
"14 /organization/1-2-3-listo 1,2,3 Listo \n",
"15 /organization/1-4-all 1-4 All \n",
"16 /organization/1-800-dentist 1-800-DENTIST \n",
"17 /organization/1-800-doctors 1-800-DOCTORS \n",
"18 /organization/1-618-technology 1.618 Technology \n",
"19 /organization/10-minutes-with 10 Minutes With \n",
"\n",
" company_category_list company_market \\\n",
"0 |Media| Media \n",
"1 |Digital Media|Internet|Social Media| Digital Media \n",
"2 |Entertainment|Politics|Social Media|News| News \n",
"3 |Games| Games \n",
"4 |Games| Games \n",
"5 |Publishing|Education| Publishing \n",
"6 |Electronics|Guides|Coffee|Restaurants|Music|i... Electronics \n",
"7 |Tourism|Entertainment|Games| Entertainment \n",
"8 |Tourism|Entertainment|Games| Entertainment \n",
"9 |Software| Software \n",
"10 |Advertising| Advertising \n",
"11 |Curated Web| Curated Web \n",
"12 |Software| Software \n",
"13 |Games| Games \n",
"14 |E-Commerce| E-Commerce \n",
"15 |Entertainment|Games|Software| Software \n",
"16 |Health and Wellness| Health and Wellness \n",
"17 |Health and Wellness| Health and Wellness \n",
"18 |Real Estate| Real Estate \n",
"19 |Education| Education \n",
"\n",
" company_country_code company_state_code company_region \\\n",
"0 IND NaN Mumbai \n",
"1 USA CO Denver \n",
"2 USA NY New York City \n",
"3 USA CA Los Angeles \n",
"4 USA CA Los Angeles \n",
"5 EST NaN Tallinn \n",
"6 GBR NaN London \n",
"7 USA TX Dallas \n",
"8 USA TX Dallas \n",
"9 USA FL Ft. Lauderdale \n",
"10 NaN NaN NaN \n",
"11 NaN NaN NaN \n",
"12 USA IL Springfield, Illinois \n",
"13 HKG NaN Hong Kong \n",
"14 CHL NaN Santiago \n",
"15 USA NC NC - Other \n",
"16 USA CA Los Angeles \n",
"17 USA NJ Newark \n",
"18 USA FL Orlando \n",
"19 GBR NaN London \n",
"\n",
" company_city funding_round_permalink \\\n",
"0 Mumbai /funding-round/9a01d05418af9f794eebff7ace91f638 \n",
"1 Denver /funding-round/669d6203c0374e6cf0e8d10f75ba0b8a \n",
"2 New York /funding-round/cc409188fa2b63482bd9008f682c2efa \n",
"3 Los Angeles /funding-round/86d22afc65107b6941e6c43c671ecbb8 \n",
"4 Los Angeles /funding-round/59a3669a64e39360c2b939300bcda162 \n",
"5 Tallinn /funding-round/f06b420775f7cb6c1541a9db526534bb \n",
"6 London /funding-round/33c3f135f05d7b734b8d7b7c8ae82647 \n",
"7 Fort Worth /funding-round/029720f7eeb218f51c43df5155671472 \n",
"8 Fort Worth /funding-round/766b0bcead9ca8560af5e1ade579fb7f \n",
"9 Oakland Park /funding-round/aee671c9707f8278a9544c8ae37650e0 \n",
"10 NaN /funding-round/69a1536a6f4506538afd7aa7241ddbe1 \n",
"11 NaN /funding-round/5727accaeaa57461bd22a9bdd945382d \n",
"12 Champaign /funding-round/1278dd4e6a37fa4b7d7e06c21b3c1830 \n",
"13 Hong Kong /funding-round/7d53696f2b4f607a2f2a8cbb83d01839 \n",
"14 Las Condes /funding-round/6de4609e894495105bc791ed64361288 \n",
"15 Connellys Springs /funding-round/e97a192e13ea0ee3c4f71136b4f3ec16 \n",
"16 Los Angeles /funding-round/5274aacc211163fc7c86539ce94bbacc \n",
"17 Iselin /funding-round/9eb8c7790a0c200d79e75785d1c4aa12 \n",
"18 Orlando /funding-round/83b8f4c7d37ecef5e001a5e953bf461a \n",
"19 London /funding-round/f245a74b4c54610ae843e17bdf4d1113 \n",
"\n",
" funding_round_type funding_round_code funded_at funded_month \\\n",
"0 venture B 2015-01-05 2015-01 \n",
"1 debt_financing NaN 2014-12-08 2014-12 \n",
"2 seed NaN 2012-06-30 2012-06 \n",
"3 venture NaN 2010-06-04 2010-06 \n",
"4 venture NaN 2010-09-23 2010-09 \n",
"5 seed NaN 2012-08-09 2012-08 \n",
"6 seed NaN 2011-04-01 2011-04 \n",
"7 equity_crowdfunding NaN 2014-08-17 2014-08 \n",
"8 equity_crowdfunding NaN 2014-09-26 2014-09 \n",
"9 venture B 2013-05-31 2013-05 \n",
"10 undisclosed NaN 2007-01-16 2007-01 \n",
"11 venture A 2008-03-19 2008-03 \n",
"12 venture NaN 2014-07-24 2014-07 \n",
"13 seed NaN 2014-07-01 2014-07 \n",
"14 seed NaN 2013-02-18 2013-02 \n",
"15 equity_crowdfunding NaN 2013-04-21 2013-04 \n",
"16 undisclosed NaN 2010-08-19 2010-08 \n",
"17 convertible_note NaN 2011-03-02 2011-03 \n",
"18 equity_crowdfunding NaN 2014-01-22 2014-01 \n",
"19 seed NaN 2013-01-01 2013-01 \n",
"\n",
" funded_quarter funded_year raised_amount_usd Unnamed: 16 \n",
"0 2015-Q1 2015 10,000,000 NaN \n",
"1 2014-Q4 2014 455,000 NaN \n",
"2 2012-Q2 2012 1,750,000 NaN \n",
"3 2010-Q2 2010 1,000,000 NaN \n",
"4 2010-Q3 2010 3,000,000 NaN \n",
"5 2012-Q3 2012 40,000 NaN \n",
"6 2011-Q2 2011 1,500,000 NaN \n",
"7 2014-Q3 2014 10,000 NaN \n",
"8 2014-Q3 2014 50,000 NaN \n",
"9 2013-Q2 2013 7,000,000 NaN \n",
"10 2007-Q1 2007 4,912,393 NaN \n",
"11 2008-Q1 2008 2,000,000 NaN \n",
"12 2014-Q3 2014 NaN NaN \n",
"13 2014-Q3 2014 41,250 NaN \n",
"14 2013-Q1 2013 40,000 NaN \n",
"15 2013-Q2 2013 NaN NaN \n",
"16 2010-Q3 2010 NaN NaN \n",
"17 2011-Q1 2011 1,750,000 NaN \n",
"18 2014-Q1 2014 NaN NaN \n",
"19 2013-Q1 2013 400,000 NaN "
]
}
],
"prompt_number": 48
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"companies_with_round = rounds.company_permalink.unique()\n",
"companies.permalink.isin(companies_with_round)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 49,
"text": [
"0 True\n",
"1 True\n",
"2 True\n",
"3 True\n",
"4 True\n",
"5 True\n",
"6 True\n",
"7 True\n",
"8 True\n",
"9 True\n",
"10 True\n",
"11 True\n",
"12 True\n",
"13 True\n",
"14 True\n",
"...\n",
"54277 False\n",
"54278 False\n",
"54279 False\n",
"54280 False\n",
"54281 False\n",
"54282 False\n",
"54283 False\n",
"54284 False\n",
"54285 False\n",
"54286 False\n",
"54287 False\n",
"54288 False\n",
"54289 False\n",
"54290 False\n",
"54291 False\n",
"Name: permalink, Length: 54292, dtype: bool"
]
}
],
"prompt_number": 49
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Window functions\n",
"===\n",
"- A SQL concept, primarily\n",
"- Functions applying to whole sections / partitions of a table\n",
"- Emulate in pandas using groupby-apply paradigm"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"query = \"\"\"\n",
"select year, hour, occs,\n",
" occs / sum(occs) OVER (PARTITION BY YEAR) as share\n",
"from (\n",
" select extract(year from date) as year,\n",
" extract(hour from time) as hour,\n",
" count(*) as occs\n",
" from crimes\n",
" group by 1, 2\n",
") t0\n",
"\"\"\"\n",
"results = sql.read_sql(query, con)\n",
"results"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" year | \n",
" hour | \n",
" occs | \n",
" share | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 2014 | \n",
" 0 | \n",
" 7217 | \n",
" 0.048076 | \n",
"
\n",
" \n",
" 1 | \n",
" 2014 | \n",
" 1 | \n",
" 4353 | \n",
" 0.028998 | \n",
"
\n",
" \n",
" 2 | \n",
" 2014 | \n",
" 2 | \n",
" 3642 | \n",
" 0.024261 | \n",
"
\n",
" \n",
" 3 | \n",
" 2014 | \n",
" 3 | \n",
" 2537 | \n",
" 0.016900 | \n",
"
\n",
" \n",
" 4 | \n",
" 2014 | \n",
" 4 | \n",
" 1709 | \n",
" 0.011385 | \n",
"
\n",
" \n",
" 5 | \n",
" 2014 | \n",
" 5 | \n",
" 1636 | \n",
" 0.010898 | \n",
"
\n",
" \n",
" 6 | \n",
" 2014 | \n",
" 6 | \n",
" 2195 | \n",
" 0.014622 | \n",
"
\n",
" \n",
" 7 | \n",
" 2014 | \n",
" 7 | \n",
" 3553 | \n",
" 0.023668 | \n",
"
\n",
" \n",
" 8 | \n",
" 2014 | \n",
" 8 | \n",
" 5124 | \n",
" 0.034134 | \n",
"
\n",
" \n",
" 9 | \n",
" 2014 | \n",
" 9 | \n",
" 5844 | \n",
" 0.038930 | \n",
"
\n",
" \n",
" 10 | \n",
" 2014 | \n",
" 10 | \n",
" 6210 | \n",
" 0.041368 | \n",
"
\n",
" \n",
" 11 | \n",
" 2014 | \n",
" 11 | \n",
" 6464 | \n",
" 0.043060 | \n",
"
\n",
" \n",
" 12 | \n",
" 2014 | \n",
" 12 | \n",
" 8803 | \n",
" 0.058641 | \n",
"
\n",
" \n",
" 13 | \n",
" 2014 | \n",
" 13 | \n",
" 7122 | \n",
" 0.047443 | \n",
"
\n",
" \n",
" 14 | \n",
" 2014 | \n",
" 14 | \n",
" 7373 | \n",
" 0.049115 | \n",
"
\n",
" \n",
" 15 | \n",
" 2014 | \n",
" 15 | \n",
" 8111 | \n",
" 0.054032 | \n",
"
\n",
" \n",
" 16 | \n",
" 2014 | \n",
" 16 | \n",
" 8671 | \n",
" 0.057762 | \n",
"
\n",
" \n",
" 17 | \n",
" 2014 | \n",
" 17 | \n",
" 9389 | \n",
" 0.062545 | \n",
"
\n",
" \n",
" 18 | \n",
" 2014 | \n",
" 18 | \n",
" 9865 | \n",
" 0.065716 | \n",
"
\n",
" \n",
" 19 | \n",
" 2014 | \n",
" 19 | \n",
" 9087 | \n",
" 0.060533 | \n",
"
\n",
" \n",
" 20 | \n",
" 2014 | \n",
" 20 | \n",
" 8452 | \n",
" 0.056303 | \n",
"
\n",
" \n",
" 21 | \n",
" 2014 | \n",
" 21 | \n",
" 7743 | \n",
" 0.051580 | \n",
"
\n",
" \n",
" 22 | \n",
" 2014 | \n",
" 22 | \n",
" 7924 | \n",
" 0.052786 | \n",
"
\n",
" \n",
" 23 | \n",
" 2014 | \n",
" 23 | \n",
" 7092 | \n",
" 0.047243 | \n",
"
\n",
" \n",
" 24 | \n",
" 2015 | \n",
" 0 | \n",
" 504 | \n",
" 0.042014 | \n",
"
\n",
" \n",
" 25 | \n",
" 2015 | \n",
" 1 | \n",
" 420 | \n",
" 0.035012 | \n",
"
\n",
" \n",
" 26 | \n",
" 2015 | \n",
" 2 | \n",
" 332 | \n",
" 0.027676 | \n",
"
\n",
" \n",
" 27 | \n",
" 2015 | \n",
" 3 | \n",
" 224 | \n",
" 0.018673 | \n",
"
\n",
" \n",
" 28 | \n",
" 2015 | \n",
" 4 | \n",
" 156 | \n",
" 0.013004 | \n",
"
\n",
" \n",
" 29 | \n",
" 2015 | \n",
" 5 | \n",
" 143 | \n",
" 0.011921 | \n",
"
\n",
" \n",
" 30 | \n",
" 2015 | \n",
" 6 | \n",
" 179 | \n",
" 0.014922 | \n",
"
\n",
" \n",
" 31 | \n",
" 2015 | \n",
" 7 | \n",
" 320 | \n",
" 0.026676 | \n",
"
\n",
" \n",
" 32 | \n",
" 2015 | \n",
" 8 | \n",
" 403 | \n",
" 0.033595 | \n",
"
\n",
" \n",
" 33 | \n",
" 2015 | \n",
" 9 | \n",
" 530 | \n",
" 0.044181 | \n",
"
\n",
" \n",
" 34 | \n",
" 2015 | \n",
" 10 | \n",
" 497 | \n",
" 0.041430 | \n",
"
\n",
" \n",
" 35 | \n",
" 2015 | \n",
" 11 | \n",
" 533 | \n",
" 0.044431 | \n",
"
\n",
" \n",
" 36 | \n",
" 2015 | \n",
" 12 | \n",
" 596 | \n",
" 0.049683 | \n",
"
\n",
" \n",
" 37 | \n",
" 2015 | \n",
" 13 | \n",
" 539 | \n",
" 0.044932 | \n",
"
\n",
" \n",
" 38 | \n",
" 2015 | \n",
" 14 | \n",
" 576 | \n",
" 0.048016 | \n",
"
\n",
" \n",
" 39 | \n",
" 2015 | \n",
" 15 | \n",
" 627 | \n",
" 0.052267 | \n",
"
\n",
" \n",
" 40 | \n",
" 2015 | \n",
" 16 | \n",
" 681 | \n",
" 0.056769 | \n",
"
\n",
" \n",
" 41 | \n",
" 2015 | \n",
" 17 | \n",
" 691 | \n",
" 0.057603 | \n",
"
\n",
" \n",
" 42 | \n",
" 2015 | \n",
" 18 | \n",
" 819 | \n",
" 0.068273 | \n",
"
\n",
" \n",
" 43 | \n",
" 2015 | \n",
" 19 | \n",
" 766 | \n",
" 0.063855 | \n",
"
\n",
" \n",
" 44 | \n",
" 2015 | \n",
" 20 | \n",
" 694 | \n",
" 0.057853 | \n",
"
\n",
" \n",
" 45 | \n",
" 2015 | \n",
" 21 | \n",
" 634 | \n",
" 0.052851 | \n",
"
\n",
" \n",
" 46 | \n",
" 2015 | \n",
" 22 | \n",
" 616 | \n",
" 0.051350 | \n",
"
\n",
" \n",
" 47 | \n",
" 2015 | \n",
" 23 | \n",
" 516 | \n",
" 0.043014 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 50,
"text": [
" year hour occs share\n",
"0 2014 0 7217 0.048076\n",
"1 2014 1 4353 0.028998\n",
"2 2014 2 3642 0.024261\n",
"3 2014 3 2537 0.016900\n",
"4 2014 4 1709 0.011385\n",
"5 2014 5 1636 0.010898\n",
"6 2014 6 2195 0.014622\n",
"7 2014 7 3553 0.023668\n",
"8 2014 8 5124 0.034134\n",
"9 2014 9 5844 0.038930\n",
"10 2014 10 6210 0.041368\n",
"11 2014 11 6464 0.043060\n",
"12 2014 12 8803 0.058641\n",
"13 2014 13 7122 0.047443\n",
"14 2014 14 7373 0.049115\n",
"15 2014 15 8111 0.054032\n",
"16 2014 16 8671 0.057762\n",
"17 2014 17 9389 0.062545\n",
"18 2014 18 9865 0.065716\n",
"19 2014 19 9087 0.060533\n",
"20 2014 20 8452 0.056303\n",
"21 2014 21 7743 0.051580\n",
"22 2014 22 7924 0.052786\n",
"23 2014 23 7092 0.047243\n",
"24 2015 0 504 0.042014\n",
"25 2015 1 420 0.035012\n",
"26 2015 2 332 0.027676\n",
"27 2015 3 224 0.018673\n",
"28 2015 4 156 0.013004\n",
"29 2015 5 143 0.011921\n",
"30 2015 6 179 0.014922\n",
"31 2015 7 320 0.026676\n",
"32 2015 8 403 0.033595\n",
"33 2015 9 530 0.044181\n",
"34 2015 10 497 0.041430\n",
"35 2015 11 533 0.044431\n",
"36 2015 12 596 0.049683\n",
"37 2015 13 539 0.044932\n",
"38 2015 14 576 0.048016\n",
"39 2015 15 627 0.052267\n",
"40 2015 16 681 0.056769\n",
"41 2015 17 691 0.057603\n",
"42 2015 18 819 0.068273\n",
"43 2015 19 766 0.063855\n",
"44 2015 20 694 0.057853\n",
"45 2015 21 634 0.052851\n",
"46 2015 22 616 0.051350\n",
"47 2015 23 516 0.043014"
]
}
],
"prompt_number": 50
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"keys = [crimes.date.dt.year,\n",
" crimes.time.map(lambda x: x.hour)]\n",
"counts = crimes.groupby(keys).size()\n",
"counts"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 52,
"text": [
" time\n",
"2014 0 7217\n",
" 1 4353\n",
" 2 3642\n",
" 3 2537\n",
" 4 1709\n",
" 5 1636\n",
" 6 2195\n",
" 7 3553\n",
" 8 5124\n",
" 9 5844\n",
" 10 6210\n",
" 11 6464\n",
" 12 8803\n",
" 13 7122\n",
" 14 7373\n",
" 15 8111\n",
" 16 8671\n",
" 17 9389\n",
" 18 9865\n",
" 19 9087\n",
" 20 8452\n",
" 21 7743\n",
" 22 7924\n",
" 23 7092\n",
"2015 0 504\n",
" 1 420\n",
" 2 332\n",
" 3 224\n",
" 4 156\n",
" 5 143\n",
" 6 179\n",
" 7 320\n",
" 8 403\n",
" 9 530\n",
" 10 497\n",
" 11 533\n",
" 12 596\n",
" 13 539\n",
" 14 576\n",
" 15 627\n",
" 16 681\n",
" 17 691\n",
" 18 819\n",
" 19 766\n",
" 20 694\n",
" 21 634\n",
" 22 616\n",
" 23 516\n",
"dtype: int64"
]
}
],
"prompt_number": 52
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"keys = [crimes.date.dt.year,\n",
" crimes.time.map(lambda x: x.hour)]\n",
"counts = crimes.groupby(keys).size()\n",
"counts.name = 'count'\n",
"counts.index.names = ['year', 'hour']\n",
"counts2 = counts.reset_index()\n",
"counts2"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" year | \n",
" hour | \n",
" count | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 2014 | \n",
" 0 | \n",
" 7217 | \n",
"
\n",
" \n",
" 1 | \n",
" 2014 | \n",
" 1 | \n",
" 4353 | \n",
"
\n",
" \n",
" 2 | \n",
" 2014 | \n",
" 2 | \n",
" 3642 | \n",
"
\n",
" \n",
" 3 | \n",
" 2014 | \n",
" 3 | \n",
" 2537 | \n",
"
\n",
" \n",
" 4 | \n",
" 2014 | \n",
" 4 | \n",
" 1709 | \n",
"
\n",
" \n",
" 5 | \n",
" 2014 | \n",
" 5 | \n",
" 1636 | \n",
"
\n",
" \n",
" 6 | \n",
" 2014 | \n",
" 6 | \n",
" 2195 | \n",
"
\n",
" \n",
" 7 | \n",
" 2014 | \n",
" 7 | \n",
" 3553 | \n",
"
\n",
" \n",
" 8 | \n",
" 2014 | \n",
" 8 | \n",
" 5124 | \n",
"
\n",
" \n",
" 9 | \n",
" 2014 | \n",
" 9 | \n",
" 5844 | \n",
"
\n",
" \n",
" 10 | \n",
" 2014 | \n",
" 10 | \n",
" 6210 | \n",
"
\n",
" \n",
" 11 | \n",
" 2014 | \n",
" 11 | \n",
" 6464 | \n",
"
\n",
" \n",
" 12 | \n",
" 2014 | \n",
" 12 | \n",
" 8803 | \n",
"
\n",
" \n",
" 13 | \n",
" 2014 | \n",
" 13 | \n",
" 7122 | \n",
"
\n",
" \n",
" 14 | \n",
" 2014 | \n",
" 14 | \n",
" 7373 | \n",
"
\n",
" \n",
" 15 | \n",
" 2014 | \n",
" 15 | \n",
" 8111 | \n",
"
\n",
" \n",
" 16 | \n",
" 2014 | \n",
" 16 | \n",
" 8671 | \n",
"
\n",
" \n",
" 17 | \n",
" 2014 | \n",
" 17 | \n",
" 9389 | \n",
"
\n",
" \n",
" 18 | \n",
" 2014 | \n",
" 18 | \n",
" 9865 | \n",
"
\n",
" \n",
" 19 | \n",
" 2014 | \n",
" 19 | \n",
" 9087 | \n",
"
\n",
" \n",
" 20 | \n",
" 2014 | \n",
" 20 | \n",
" 8452 | \n",
"
\n",
" \n",
" 21 | \n",
" 2014 | \n",
" 21 | \n",
" 7743 | \n",
"
\n",
" \n",
" 22 | \n",
" 2014 | \n",
" 22 | \n",
" 7924 | \n",
"
\n",
" \n",
" 23 | \n",
" 2014 | \n",
" 23 | \n",
" 7092 | \n",
"
\n",
" \n",
" 24 | \n",
" 2015 | \n",
" 0 | \n",
" 504 | \n",
"
\n",
" \n",
" 25 | \n",
" 2015 | \n",
" 1 | \n",
" 420 | \n",
"
\n",
" \n",
" 26 | \n",
" 2015 | \n",
" 2 | \n",
" 332 | \n",
"
\n",
" \n",
" 27 | \n",
" 2015 | \n",
" 3 | \n",
" 224 | \n",
"
\n",
" \n",
" 28 | \n",
" 2015 | \n",
" 4 | \n",
" 156 | \n",
"
\n",
" \n",
" 29 | \n",
" 2015 | \n",
" 5 | \n",
" 143 | \n",
"
\n",
" \n",
" 30 | \n",
" 2015 | \n",
" 6 | \n",
" 179 | \n",
"
\n",
" \n",
" 31 | \n",
" 2015 | \n",
" 7 | \n",
" 320 | \n",
"
\n",
" \n",
" 32 | \n",
" 2015 | \n",
" 8 | \n",
" 403 | \n",
"
\n",
" \n",
" 33 | \n",
" 2015 | \n",
" 9 | \n",
" 530 | \n",
"
\n",
" \n",
" 34 | \n",
" 2015 | \n",
" 10 | \n",
" 497 | \n",
"
\n",
" \n",
" 35 | \n",
" 2015 | \n",
" 11 | \n",
" 533 | \n",
"
\n",
" \n",
" 36 | \n",
" 2015 | \n",
" 12 | \n",
" 596 | \n",
"
\n",
" \n",
" 37 | \n",
" 2015 | \n",
" 13 | \n",
" 539 | \n",
"
\n",
" \n",
" 38 | \n",
" 2015 | \n",
" 14 | \n",
" 576 | \n",
"
\n",
" \n",
" 39 | \n",
" 2015 | \n",
" 15 | \n",
" 627 | \n",
"
\n",
" \n",
" 40 | \n",
" 2015 | \n",
" 16 | \n",
" 681 | \n",
"
\n",
" \n",
" 41 | \n",
" 2015 | \n",
" 17 | \n",
" 691 | \n",
"
\n",
" \n",
" 42 | \n",
" 2015 | \n",
" 18 | \n",
" 819 | \n",
"
\n",
" \n",
" 43 | \n",
" 2015 | \n",
" 19 | \n",
" 766 | \n",
"
\n",
" \n",
" 44 | \n",
" 2015 | \n",
" 20 | \n",
" 694 | \n",
"
\n",
" \n",
" 45 | \n",
" 2015 | \n",
" 21 | \n",
" 634 | \n",
"
\n",
" \n",
" 46 | \n",
" 2015 | \n",
" 22 | \n",
" 616 | \n",
"
\n",
" \n",
" 47 | \n",
" 2015 | \n",
" 23 | \n",
" 516 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 53,
"text": [
" year hour count\n",
"0 2014 0 7217\n",
"1 2014 1 4353\n",
"2 2014 2 3642\n",
"3 2014 3 2537\n",
"4 2014 4 1709\n",
"5 2014 5 1636\n",
"6 2014 6 2195\n",
"7 2014 7 3553\n",
"8 2014 8 5124\n",
"9 2014 9 5844\n",
"10 2014 10 6210\n",
"11 2014 11 6464\n",
"12 2014 12 8803\n",
"13 2014 13 7122\n",
"14 2014 14 7373\n",
"15 2014 15 8111\n",
"16 2014 16 8671\n",
"17 2014 17 9389\n",
"18 2014 18 9865\n",
"19 2014 19 9087\n",
"20 2014 20 8452\n",
"21 2014 21 7743\n",
"22 2014 22 7924\n",
"23 2014 23 7092\n",
"24 2015 0 504\n",
"25 2015 1 420\n",
"26 2015 2 332\n",
"27 2015 3 224\n",
"28 2015 4 156\n",
"29 2015 5 143\n",
"30 2015 6 179\n",
"31 2015 7 320\n",
"32 2015 8 403\n",
"33 2015 9 530\n",
"34 2015 10 497\n",
"35 2015 11 533\n",
"36 2015 12 596\n",
"37 2015 13 539\n",
"38 2015 14 576\n",
"39 2015 15 627\n",
"40 2015 16 681\n",
"41 2015 17 691\n",
"42 2015 18 819\n",
"43 2015 19 766\n",
"44 2015 20 694\n",
"45 2015 21 634\n",
"46 2015 22 616\n",
"47 2015 23 516"
]
}
],
"prompt_number": 53
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"def add_share(x):\n",
" x['share'] = x['count'] / x['count'].sum()\n",
" return x\n",
"counts2.groupby('year').apply(add_share)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" year | \n",
" hour | \n",
" count | \n",
" share | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 2014 | \n",
" 0 | \n",
" 7217 | \n",
" 0.048076 | \n",
"
\n",
" \n",
" 1 | \n",
" 2014 | \n",
" 1 | \n",
" 4353 | \n",
" 0.028998 | \n",
"
\n",
" \n",
" 2 | \n",
" 2014 | \n",
" 2 | \n",
" 3642 | \n",
" 0.024261 | \n",
"
\n",
" \n",
" 3 | \n",
" 2014 | \n",
" 3 | \n",
" 2537 | \n",
" 0.016900 | \n",
"
\n",
" \n",
" 4 | \n",
" 2014 | \n",
" 4 | \n",
" 1709 | \n",
" 0.011385 | \n",
"
\n",
" \n",
" 5 | \n",
" 2014 | \n",
" 5 | \n",
" 1636 | \n",
" 0.010898 | \n",
"
\n",
" \n",
" 6 | \n",
" 2014 | \n",
" 6 | \n",
" 2195 | \n",
" 0.014622 | \n",
"
\n",
" \n",
" 7 | \n",
" 2014 | \n",
" 7 | \n",
" 3553 | \n",
" 0.023668 | \n",
"
\n",
" \n",
" 8 | \n",
" 2014 | \n",
" 8 | \n",
" 5124 | \n",
" 0.034134 | \n",
"
\n",
" \n",
" 9 | \n",
" 2014 | \n",
" 9 | \n",
" 5844 | \n",
" 0.038930 | \n",
"
\n",
" \n",
" 10 | \n",
" 2014 | \n",
" 10 | \n",
" 6210 | \n",
" 0.041368 | \n",
"
\n",
" \n",
" 11 | \n",
" 2014 | \n",
" 11 | \n",
" 6464 | \n",
" 0.043060 | \n",
"
\n",
" \n",
" 12 | \n",
" 2014 | \n",
" 12 | \n",
" 8803 | \n",
" 0.058641 | \n",
"
\n",
" \n",
" 13 | \n",
" 2014 | \n",
" 13 | \n",
" 7122 | \n",
" 0.047443 | \n",
"
\n",
" \n",
" 14 | \n",
" 2014 | \n",
" 14 | \n",
" 7373 | \n",
" 0.049115 | \n",
"
\n",
" \n",
" 15 | \n",
" 2014 | \n",
" 15 | \n",
" 8111 | \n",
" 0.054032 | \n",
"
\n",
" \n",
" 16 | \n",
" 2014 | \n",
" 16 | \n",
" 8671 | \n",
" 0.057762 | \n",
"
\n",
" \n",
" 17 | \n",
" 2014 | \n",
" 17 | \n",
" 9389 | \n",
" 0.062545 | \n",
"
\n",
" \n",
" 18 | \n",
" 2014 | \n",
" 18 | \n",
" 9865 | \n",
" 0.065716 | \n",
"
\n",
" \n",
" 19 | \n",
" 2014 | \n",
" 19 | \n",
" 9087 | \n",
" 0.060533 | \n",
"
\n",
" \n",
" 20 | \n",
" 2014 | \n",
" 20 | \n",
" 8452 | \n",
" 0.056303 | \n",
"
\n",
" \n",
" 21 | \n",
" 2014 | \n",
" 21 | \n",
" 7743 | \n",
" 0.051580 | \n",
"
\n",
" \n",
" 22 | \n",
" 2014 | \n",
" 22 | \n",
" 7924 | \n",
" 0.052786 | \n",
"
\n",
" \n",
" 23 | \n",
" 2014 | \n",
" 23 | \n",
" 7092 | \n",
" 0.047243 | \n",
"
\n",
" \n",
" 24 | \n",
" 2015 | \n",
" 0 | \n",
" 504 | \n",
" 0.042014 | \n",
"
\n",
" \n",
" 25 | \n",
" 2015 | \n",
" 1 | \n",
" 420 | \n",
" 0.035012 | \n",
"
\n",
" \n",
" 26 | \n",
" 2015 | \n",
" 2 | \n",
" 332 | \n",
" 0.027676 | \n",
"
\n",
" \n",
" 27 | \n",
" 2015 | \n",
" 3 | \n",
" 224 | \n",
" 0.018673 | \n",
"
\n",
" \n",
" 28 | \n",
" 2015 | \n",
" 4 | \n",
" 156 | \n",
" 0.013004 | \n",
"
\n",
" \n",
" 29 | \n",
" 2015 | \n",
" 5 | \n",
" 143 | \n",
" 0.011921 | \n",
"
\n",
" \n",
" 30 | \n",
" 2015 | \n",
" 6 | \n",
" 179 | \n",
" 0.014922 | \n",
"
\n",
" \n",
" 31 | \n",
" 2015 | \n",
" 7 | \n",
" 320 | \n",
" 0.026676 | \n",
"
\n",
" \n",
" 32 | \n",
" 2015 | \n",
" 8 | \n",
" 403 | \n",
" 0.033595 | \n",
"
\n",
" \n",
" 33 | \n",
" 2015 | \n",
" 9 | \n",
" 530 | \n",
" 0.044181 | \n",
"
\n",
" \n",
" 34 | \n",
" 2015 | \n",
" 10 | \n",
" 497 | \n",
" 0.041430 | \n",
"
\n",
" \n",
" 35 | \n",
" 2015 | \n",
" 11 | \n",
" 533 | \n",
" 0.044431 | \n",
"
\n",
" \n",
" 36 | \n",
" 2015 | \n",
" 12 | \n",
" 596 | \n",
" 0.049683 | \n",
"
\n",
" \n",
" 37 | \n",
" 2015 | \n",
" 13 | \n",
" 539 | \n",
" 0.044932 | \n",
"
\n",
" \n",
" 38 | \n",
" 2015 | \n",
" 14 | \n",
" 576 | \n",
" 0.048016 | \n",
"
\n",
" \n",
" 39 | \n",
" 2015 | \n",
" 15 | \n",
" 627 | \n",
" 0.052267 | \n",
"
\n",
" \n",
" 40 | \n",
" 2015 | \n",
" 16 | \n",
" 681 | \n",
" 0.056769 | \n",
"
\n",
" \n",
" 41 | \n",
" 2015 | \n",
" 17 | \n",
" 691 | \n",
" 0.057603 | \n",
"
\n",
" \n",
" 42 | \n",
" 2015 | \n",
" 18 | \n",
" 819 | \n",
" 0.068273 | \n",
"
\n",
" \n",
" 43 | \n",
" 2015 | \n",
" 19 | \n",
" 766 | \n",
" 0.063855 | \n",
"
\n",
" \n",
" 44 | \n",
" 2015 | \n",
" 20 | \n",
" 694 | \n",
" 0.057853 | \n",
"
\n",
" \n",
" 45 | \n",
" 2015 | \n",
" 21 | \n",
" 634 | \n",
" 0.052851 | \n",
"
\n",
" \n",
" 46 | \n",
" 2015 | \n",
" 22 | \n",
" 616 | \n",
" 0.051350 | \n",
"
\n",
" \n",
" 47 | \n",
" 2015 | \n",
" 23 | \n",
" 516 | \n",
" 0.043014 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 55,
"text": [
" year hour count share\n",
"0 2014 0 7217 0.048076\n",
"1 2014 1 4353 0.028998\n",
"2 2014 2 3642 0.024261\n",
"3 2014 3 2537 0.016900\n",
"4 2014 4 1709 0.011385\n",
"5 2014 5 1636 0.010898\n",
"6 2014 6 2195 0.014622\n",
"7 2014 7 3553 0.023668\n",
"8 2014 8 5124 0.034134\n",
"9 2014 9 5844 0.038930\n",
"10 2014 10 6210 0.041368\n",
"11 2014 11 6464 0.043060\n",
"12 2014 12 8803 0.058641\n",
"13 2014 13 7122 0.047443\n",
"14 2014 14 7373 0.049115\n",
"15 2014 15 8111 0.054032\n",
"16 2014 16 8671 0.057762\n",
"17 2014 17 9389 0.062545\n",
"18 2014 18 9865 0.065716\n",
"19 2014 19 9087 0.060533\n",
"20 2014 20 8452 0.056303\n",
"21 2014 21 7743 0.051580\n",
"22 2014 22 7924 0.052786\n",
"23 2014 23 7092 0.047243\n",
"24 2015 0 504 0.042014\n",
"25 2015 1 420 0.035012\n",
"26 2015 2 332 0.027676\n",
"27 2015 3 224 0.018673\n",
"28 2015 4 156 0.013004\n",
"29 2015 5 143 0.011921\n",
"30 2015 6 179 0.014922\n",
"31 2015 7 320 0.026676\n",
"32 2015 8 403 0.033595\n",
"33 2015 9 530 0.044181\n",
"34 2015 10 497 0.041430\n",
"35 2015 11 533 0.044431\n",
"36 2015 12 596 0.049683\n",
"37 2015 13 539 0.044932\n",
"38 2015 14 576 0.048016\n",
"39 2015 15 627 0.052267\n",
"40 2015 16 681 0.056769\n",
"41 2015 17 691 0.057603\n",
"42 2015 18 819 0.068273\n",
"43 2015 19 766 0.063855\n",
"44 2015 20 694 0.057853\n",
"45 2015 21 634 0.052851\n",
"46 2015 22 616 0.051350\n",
"47 2015 23 516 0.043014"
]
}
],
"prompt_number": 55
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"rounds[:1].T"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" 0 | \n",
"
\n",
" \n",
" \n",
" \n",
" company_permalink | \n",
" /organization/-fame | \n",
"
\n",
" \n",
" company_name | \n",
" #fame | \n",
"
\n",
" \n",
" company_category_list | \n",
" |Media| | \n",
"
\n",
" \n",
" company_market | \n",
" Media | \n",
"
\n",
" \n",
" company_country_code | \n",
" IND | \n",
"
\n",
" \n",
" company_state_code | \n",
" NaN | \n",
"
\n",
" \n",
" company_region | \n",
" Mumbai | \n",
"
\n",
" \n",
" company_city | \n",
" Mumbai | \n",
"
\n",
" \n",
" funding_round_permalink | \n",
" /funding-round/9a01d05418af9f794eebff7ace91f638 | \n",
"
\n",
" \n",
" funding_round_type | \n",
" venture | \n",
"
\n",
" \n",
" funding_round_code | \n",
" B | \n",
"
\n",
" \n",
" funded_at | \n",
" 2015-01-05 | \n",
"
\n",
" \n",
" funded_month | \n",
" 2015-01 | \n",
"
\n",
" \n",
" funded_quarter | \n",
" 2015-Q1 | \n",
"
\n",
" \n",
" funded_year | \n",
" 2015 | \n",
"
\n",
" \n",
" raised_amount_usd | \n",
" 10,000,000 | \n",
"
\n",
" \n",
" Unnamed: 16 | \n",
" NaN | \n",
"
\n",
" \n",
"
\n",
"
"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 56,
"text": [
" 0\n",
"company_permalink /organization/-fame\n",
"company_name #fame\n",
"company_category_list |Media|\n",
"company_market Media\n",
"company_country_code IND\n",
"company_state_code NaN\n",
"company_region Mumbai\n",
"company_city Mumbai\n",
"funding_round_permalink /funding-round/9a01d05418af9f794eebff7ace91f638\n",
"funding_round_type venture\n",
"funding_round_code B\n",
"funded_at 2015-01-05\n",
"funded_month 2015-01\n",
"funded_quarter 2015-Q1 \n",
"funded_year 2015\n",
"raised_amount_usd 10,000,000 \n",
"Unnamed: 16 NaN"
]
}
],
"prompt_number": 56
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Use ORDER BY in the window clause to achieve cumulative statistics"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"query = \"\"\"\n",
" select company_permalink, funded_at,\n",
" sum(case when raised_amount_usd is null then 0 else raised_amount_usd end) \n",
" OVER (PARTITION BY company_permalink ORDER BY funded_at)\n",
" as cumulative_raised\n",
" from rounds\n",
" order by company_permalink;\n",
"\"\"\"\n",
"sql.read_sql(query, con)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" company_permalink | \n",
" funded_at | \n",
" cumulative_raised | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" /organization/-fame | \n",
" 2015-01-05 | \n",
" 10000000 | \n",
"
\n",
" \n",
" 1 | \n",
" /organization/-qounter | \n",
" 2014-03-01 | \n",
" 700000 | \n",
"
\n",
" \n",
" 2 | \n",
" /organization/-qounter | \n",
" 2014-10-14 | \n",
" 700000 | \n",
"
\n",
" \n",
" 3 | \n",
" /organization/0-6-com | \n",
" 2008-03-19 | \n",
" 2000000 | \n",
"
\n",
" \n",
" 4 | \n",
" /organization/004-technologies | \n",
" 2014-07-24 | \n",
" 0 | \n",
"
\n",
" \n",
" 5 | \n",
" /organization/01games-technology | \n",
" 2014-07-01 | \n",
" 41250 | \n",
"
\n",
" \n",
" 6 | \n",
" /organization/0ndine-biomedical-inc | \n",
" 2009-09-11 | \n",
" 43360 | \n",
"
\n",
" \n",
" 7 | \n",
" /organization/0ndine-biomedical-inc | \n",
" 2009-12-21 | \n",
" 762851 | \n",
"
\n",
" \n",
" 8 | \n",
" /organization/0xdata | \n",
" 2013-01-03 | \n",
" 1700000 | \n",
"
\n",
" \n",
" 9 | \n",
" /organization/0xdata | \n",
" 2014-07-19 | \n",
" 10600000 | \n",
"
\n",
" \n",
" 10 | \n",
" /organization/1 | \n",
" 2011-07-20 | \n",
" 1000050 | \n",
"
\n",
" \n",
" 11 | \n",
" /organization/1 | \n",
" 2013-02-05 | \n",
" 1000050 | \n",
"
\n",
" \n",
" 12 | \n",
" /organization/1 | \n",
" 2014-02-05 | \n",
" 1150050 | \n",
"
\n",
" \n",
" 13 | \n",
" /organization/1-2-3-listo | \n",
" 2013-02-18 | \n",
" 40000 | \n",
"
\n",
" \n",
" 14 | \n",
" /organization/1-4-all | \n",
" 2013-04-21 | \n",
" 0 | \n",
"
\n",
" \n",
" 15 | \n",
" /organization/1-618-technology | \n",
" 2014-01-22 | \n",
" 0 | \n",
"
\n",
" \n",
" 16 | \n",
" /organization/1-800-dentist | \n",
" 2010-08-19 | \n",
" 0 | \n",
"
\n",
" \n",
" 17 | \n",
" /organization/1-800-doctors | \n",
" 2011-03-02 | \n",
" 1750000 | \n",
"
\n",
" \n",
" 18 | \n",
" /organization/10-20-media | \n",
" 2009-06-18 | \n",
" 500000 | \n",
"
\n",
" \n",
" 19 | \n",
" /organization/10-20-media | \n",
" 2010-03-30 | \n",
" 750000 | \n",
"
\n",
" \n",
" 20 | \n",
" /organization/10-20-media | \n",
" 2011-01-11 | \n",
" 1550000 | \n",
"
\n",
" \n",
" 21 | \n",
" /organization/10-20-media | \n",
" 2011-12-28 | \n",
" 2050000 | \n",
"
\n",
" \n",
" 22 | \n",
" /organization/10-minutes-with | \n",
" 2013-01-01 | \n",
" 400000 | \n",
"
\n",
" \n",
" 23 | \n",
" /organization/10-minutes-with | \n",
" 2014-10-09 | \n",
" 4400000 | \n",
"
\n",
" \n",
" 24 | \n",
" /organization/1000-corks | \n",
" 2011-08-23 | \n",
" 40000 | \n",
"
\n",
" \n",
" 25 | \n",
" /organization/1000-markets | \n",
" 2009-05-15 | \n",
" 500000 | \n",
"
\n",
" \n",
" 26 | \n",
" /organization/1000chi | \n",
" 2010-04-01 | \n",
" 43923865 | \n",
"
\n",
" \n",
" 27 | \n",
" /organization/1000jobboersen-de | \n",
" 2011-09-16 | \n",
" 0 | \n",
"
\n",
" \n",
" 28 | \n",
" /organization/1000memories | \n",
" 2010-01-01 | \n",
" 15000 | \n",
"
\n",
" \n",
" 29 | \n",
" /organization/1000memories | \n",
" 2011-02-16 | \n",
" 2535000 | \n",
"
\n",
" \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
"
\n",
" \n",
" 87131 | \n",
" /organization/zynga | \n",
" 2010-04-24 | \n",
" 70213000 | \n",
"
\n",
" \n",
" 87132 | \n",
" /organization/zynga | \n",
" 2010-06-14 | \n",
" 370213000 | \n",
"
\n",
" \n",
" 87133 | \n",
" /organization/zynga | \n",
" 2010-10-12 | \n",
" 376550786 | \n",
"
\n",
" \n",
" 87134 | \n",
" /organization/zynga | \n",
" 2011-02-18 | \n",
" 866550786 | \n",
"
\n",
" \n",
" 87135 | \n",
" /organization/zyngenia | \n",
" 2010-09-09 | \n",
" 25000000 | \n",
"
\n",
" \n",
" 87136 | \n",
" /organization/zynstra | \n",
" 2012-03-03 | \n",
" 225000 | \n",
"
\n",
" \n",
" 87137 | \n",
" /organization/zynstra | \n",
" 2012-09-18 | \n",
" 2550000 | \n",
"
\n",
" \n",
" 87138 | \n",
" /organization/zynstra | \n",
" 2013-09-12 | \n",
" 6350000 | \n",
"
\n",
" \n",
" 87139 | \n",
" /organization/zynstra | \n",
" 2014-07-09 | \n",
" 14750000 | \n",
"
\n",
" \n",
" 87140 | \n",
" /organization/zyomyx-inc | \n",
" 2010-01-29 | \n",
" 555016 | \n",
"
\n",
" \n",
" 87141 | \n",
" /organization/zyomyx-inc | \n",
" 2013-06-30 | \n",
" 12555016 | \n",
"
\n",
" \n",
" 87142 | \n",
" /organization/zyomyx-inc | \n",
" 2013-07-11 | \n",
" 26775015 | \n",
"
\n",
" \n",
" 87143 | \n",
" /organization/zyomyx-inc | \n",
" 2014-03-26 | \n",
" 34275015 | \n",
"
\n",
" \n",
" 87144 | \n",
" /organization/zypsee | \n",
" 2013-08-13 | \n",
" 2150000 | \n",
"
\n",
" \n",
" 87145 | \n",
" /organization/zyraz-technology | \n",
" 2008-01-01 | \n",
" 2007363 | \n",
"
\n",
" \n",
" 87146 | \n",
" /organization/zyraz-technology | \n",
" 2009-09-11 | \n",
" 7407363 | \n",
"
\n",
" \n",
" 87147 | \n",
" /organization/zyraz-technology | \n",
" 2009-10-09 | \n",
" 15398910 | \n",
"
\n",
" \n",
" 87148 | \n",
" /organization/zyraz-technology | \n",
" 2013-02-15 | \n",
" 15419877 | \n",
"
\n",
" \n",
" 87149 | \n",
" /organization/zyrra | \n",
" 2010-10-21 | \n",
" 50000 | \n",
"
\n",
" \n",
" 87150 | \n",
" /organization/zyrra | \n",
" 2010-11-15 | \n",
" 495000 | \n",
"
\n",
" \n",
" 87151 | \n",
" /organization/zyrra | \n",
" 2010-12-15 | \n",
" 1370000 | \n",
"
\n",
" \n",
" 87152 | \n",
" /organization/zyrra | \n",
" 2012-10-18 | \n",
" 1510500 | \n",
"
\n",
" \n",
" 87153 | \n",
" /organization/zytoprotec | \n",
" 2013-01-29 | \n",
" 2686600 | \n",
"
\n",
" \n",
" 87154 | \n",
" /organization/zzish | \n",
" 2014-03-24 | \n",
" 320000 | \n",
"
\n",
" \n",
" 87155 | \n",
" /organization/zznode-science-and-technology-co... | \n",
" 2012-04-01 | \n",
" 1587301 | \n",
"
\n",
" \n",
" 87156 | \n",
" /organization/zzzzapp-com | \n",
" 2011-11-01 | \n",
" 7000 | \n",
"
\n",
" \n",
" 87157 | \n",
" /organization/zzzzapp-com | \n",
" 2013-03-19 | \n",
" 39360 | \n",
"
\n",
" \n",
" 87158 | \n",
" /organization/zzzzapp-com | \n",
" 2013-08-01 | \n",
" 71525 | \n",
"
\n",
" \n",
" 87159 | \n",
" /organization/zzzzapp-com | \n",
" 2014-01-01 | \n",
" 84480 | \n",
"
\n",
" \n",
" 87160 | \n",
" /organization/zzzzapp-com | \n",
" 2014-09-10 | \n",
" 97398 | \n",
"
\n",
" \n",
"
\n",
"
87161 rows \u00d7 3 columns
\n",
"
"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 57,
"text": [
" company_permalink funded_at \\\n",
"0 /organization/-fame 2015-01-05 \n",
"1 /organization/-qounter 2014-03-01 \n",
"2 /organization/-qounter 2014-10-14 \n",
"3 /organization/0-6-com 2008-03-19 \n",
"4 /organization/004-technologies 2014-07-24 \n",
"5 /organization/01games-technology 2014-07-01 \n",
"6 /organization/0ndine-biomedical-inc 2009-09-11 \n",
"7 /organization/0ndine-biomedical-inc 2009-12-21 \n",
"8 /organization/0xdata 2013-01-03 \n",
"9 /organization/0xdata 2014-07-19 \n",
"10 /organization/1 2011-07-20 \n",
"11 /organization/1 2013-02-05 \n",
"12 /organization/1 2014-02-05 \n",
"13 /organization/1-2-3-listo 2013-02-18 \n",
"14 /organization/1-4-all 2013-04-21 \n",
"15 /organization/1-618-technology 2014-01-22 \n",
"16 /organization/1-800-dentist 2010-08-19 \n",
"17 /organization/1-800-doctors 2011-03-02 \n",
"18 /organization/10-20-media 2009-06-18 \n",
"19 /organization/10-20-media 2010-03-30 \n",
"20 /organization/10-20-media 2011-01-11 \n",
"21 /organization/10-20-media 2011-12-28 \n",
"22 /organization/10-minutes-with 2013-01-01 \n",
"23 /organization/10-minutes-with 2014-10-09 \n",
"24 /organization/1000-corks 2011-08-23 \n",
"25 /organization/1000-markets 2009-05-15 \n",
"26 /organization/1000chi 2010-04-01 \n",
"27 /organization/1000jobboersen-de 2011-09-16 \n",
"28 /organization/1000memories 2010-01-01 \n",
"29 /organization/1000memories 2011-02-16 \n",
"... ... ... \n",
"87131 /organization/zynga 2010-04-24 \n",
"87132 /organization/zynga 2010-06-14 \n",
"87133 /organization/zynga 2010-10-12 \n",
"87134 /organization/zynga 2011-02-18 \n",
"87135 /organization/zyngenia 2010-09-09 \n",
"87136 /organization/zynstra 2012-03-03 \n",
"87137 /organization/zynstra 2012-09-18 \n",
"87138 /organization/zynstra 2013-09-12 \n",
"87139 /organization/zynstra 2014-07-09 \n",
"87140 /organization/zyomyx-inc 2010-01-29 \n",
"87141 /organization/zyomyx-inc 2013-06-30 \n",
"87142 /organization/zyomyx-inc 2013-07-11 \n",
"87143 /organization/zyomyx-inc 2014-03-26 \n",
"87144 /organization/zypsee 2013-08-13 \n",
"87145 /organization/zyraz-technology 2008-01-01 \n",
"87146 /organization/zyraz-technology 2009-09-11 \n",
"87147 /organization/zyraz-technology 2009-10-09 \n",
"87148 /organization/zyraz-technology 2013-02-15 \n",
"87149 /organization/zyrra 2010-10-21 \n",
"87150 /organization/zyrra 2010-11-15 \n",
"87151 /organization/zyrra 2010-12-15 \n",
"87152 /organization/zyrra 2012-10-18 \n",
"87153 /organization/zytoprotec 2013-01-29 \n",
"87154 /organization/zzish 2014-03-24 \n",
"87155 /organization/zznode-science-and-technology-co... 2012-04-01 \n",
"87156 /organization/zzzzapp-com 2011-11-01 \n",
"87157 /organization/zzzzapp-com 2013-03-19 \n",
"87158 /organization/zzzzapp-com 2013-08-01 \n",
"87159 /organization/zzzzapp-com 2014-01-01 \n",
"87160 /organization/zzzzapp-com 2014-09-10 \n",
"\n",
" cumulative_raised \n",
"0 10000000 \n",
"1 700000 \n",
"2 700000 \n",
"3 2000000 \n",
"4 0 \n",
"5 41250 \n",
"6 43360 \n",
"7 762851 \n",
"8 1700000 \n",
"9 10600000 \n",
"10 1000050 \n",
"11 1000050 \n",
"12 1150050 \n",
"13 40000 \n",
"14 0 \n",
"15 0 \n",
"16 0 \n",
"17 1750000 \n",
"18 500000 \n",
"19 750000 \n",
"20 1550000 \n",
"21 2050000 \n",
"22 400000 \n",
"23 4400000 \n",
"24 40000 \n",
"25 500000 \n",
"26 43923865 \n",
"27 0 \n",
"28 15000 \n",
"29 2535000 \n",
"... ... \n",
"87131 70213000 \n",
"87132 370213000 \n",
"87133 376550786 \n",
"87134 866550786 \n",
"87135 25000000 \n",
"87136 225000 \n",
"87137 2550000 \n",
"87138 6350000 \n",
"87139 14750000 \n",
"87140 555016 \n",
"87141 12555016 \n",
"87142 26775015 \n",
"87143 34275015 \n",
"87144 2150000 \n",
"87145 2007363 \n",
"87146 7407363 \n",
"87147 15398910 \n",
"87148 15419877 \n",
"87149 50000 \n",
"87150 495000 \n",
"87151 1370000 \n",
"87152 1510500 \n",
"87153 2686600 \n",
"87154 320000 \n",
"87155 1587301 \n",
"87156 7000 \n",
"87157 39360 \n",
"87158 71525 \n",
"87159 84480 \n",
"87160 97398 \n",
"\n",
"[87161 rows x 3 columns]"
]
}
],
"prompt_number": 57
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Bucketing analyses / Histograms\n",
"===\n",
"- pandas has `cut` (linear / ad hoc binning) and `qcut` (quantile binning) functions\n",
"- SQL: must be constructed from primitive operations and window functions"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"acquisitions.price_amount.isnull().value_counts()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 58,
"text": [
"True 51424\n",
"False 3816\n",
"dtype: int64"
]
}
],
"prompt_number": 58
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"acqs_known = acquisitions[acquisitions.price_amount.notnull()]"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 59
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"acqs_known.price_amount.describe()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 60,
"text": [
"count 3.816000e+03\n",
"mean 7.182556e+08\n",
"std 3.884648e+09\n",
"min 1.000000e+00\n",
"25% 1.822500e+07\n",
"50% 8.295000e+07\n",
"75% 3.450000e+08\n",
"max 1.500000e+11\n",
"Name: price_amount, dtype: float64"
]
}
],
"prompt_number": 60
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"top = 1000000000.\n",
"acqs = acqs_known[(acqs_known.price_amount <= top) & (acqs_known.price_amount > 0)]"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 61
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"pd.cut(acqs.price_amount, 4)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 62,
"text": [
"2 (-999998.999, 250000000.75]\n",
"4 (-999998.999, 250000000.75]\n",
"7 (-999998.999, 250000000.75]\n",
"22 (-999998.999, 250000000.75]\n",
"23 (-999998.999, 250000000.75]\n",
"25 (250000000.75, 500000000.5]\n",
"31 (500000000.5, 750000000.25]\n",
"35 (500000000.5, 750000000.25]\n",
"38 (-999998.999, 250000000.75]\n",
"39 (-999998.999, 250000000.75]\n",
"44 (250000000.75, 500000000.5]\n",
"46 (250000000.75, 500000000.5]\n",
"53 (-999998.999, 250000000.75]\n",
"65 (-999998.999, 250000000.75]\n",
"70 (250000000.75, 500000000.5]\n",
"...\n",
"13458 (-999998.999, 250000000.75]\n",
"13463 (-999998.999, 250000000.75]\n",
"13471 (-999998.999, 250000000.75]\n",
"13472 (500000000.5, 750000000.25]\n",
"13473 (-999998.999, 250000000.75]\n",
"13475 (-999998.999, 250000000.75]\n",
"13476 (-999998.999, 250000000.75]\n",
"13480 (-999998.999, 250000000.75]\n",
"13483 (-999998.999, 250000000.75]\n",
"13488 (-999998.999, 250000000.75]\n",
"13491 (-999998.999, 250000000.75]\n",
"13495 (-999998.999, 250000000.75]\n",
"13496 (250000000.75, 500000000.5]\n",
"13505 (-999998.999, 250000000.75]\n",
"13506 (-999998.999, 250000000.75]\n",
"Name: price_amount, Length: 3400, dtype: category\n",
"Categories (4, object): [(-999998.999, 250000000.75] < (250000000.75, 500000000.5] < (500000000.5, 750000000.25] < (750000000.25, 1000000000]]"
]
}
],
"prompt_number": 62
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"bins = pd.cut(acqs.price_amount, [0, top / 4, top / 2, 3 * top / 4, top])\n",
"acqs.groupby(bins).size()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 63,
"text": [
"price_amount\n",
"(0, 250000000] 2650\n",
"(250000000, 500000000] 464\n",
"(500000000, 750000000] 188\n",
"(750000000, 1000000000] 98\n",
"dtype: int64"
]
}
],
"prompt_number": 63
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"# Get Quantile numbers\n",
"(pd.qcut(acqs.price_amount, 4, labels=False) + 1).value_counts()"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 64,
"text": [
"1 871\n",
"3 856\n",
"4 844\n",
"2 829\n",
"dtype: int64"
]
}
],
"prompt_number": 64
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"quartiles = pd.qcut(acqs.price_amount, 4, labels=False)\n",
"grouped = acqs.groupby(quartiles)\n",
"\n",
"def f(x):\n",
" return pd.Series({\n",
" 'lb': x.price_amount.min(),\n",
" 'ub': x.price_amount.max(),\n",
" 'num': len(x)\n",
" })\n",
"\n",
"grouped.apply(f)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" lb | \n",
" num | \n",
" ub | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 1 | \n",
" 871 | \n",
" 15000000 | \n",
"
\n",
" \n",
" 1 | \n",
" 15040000 | \n",
" 829 | \n",
" 59000000 | \n",
"
\n",
" \n",
" 2 | \n",
" 60000000 | \n",
" 856 | \n",
" 210000000 | \n",
"
\n",
" \n",
" 3 | \n",
" 212000000 | \n",
" 844 | \n",
" 1000000000 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 65,
"text": [
" lb num ub\n",
"0 1 871 15000000\n",
"1 15040000 829 59000000\n",
"2 60000000 856 210000000\n",
"3 212000000 844 1000000000"
]
}
],
"prompt_number": 65
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"query = \"\"\"\n",
"with t0 as ( select \n",
" company_permalink, price_amount\n",
" from acquisitions a\n",
" where price_amount is not null\n",
" and price_amount <= 1000000000\n",
")\n",
"select t2.bucket, sum(t2.price_amount), count(*)\n",
"from (\n",
" select t0.company_permalink, t0.price_amount, \n",
" ceil((t0.price_amount - lb) / ((ub - lb) / 4)) as bucket\n",
" from t0 \n",
" cross join (\n",
" select min(price_amount) - 1e-14 as lb, max(price_amount) as ub\n",
" from t0\n",
" ) t1\n",
") t2\n",
"group by 1;\n",
"\"\"\"\n",
"\n",
"sql.read_sql(query, con)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" bucket | \n",
" sum | \n",
" count | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 1 | \n",
" 1.588106e+11 | \n",
" 2650 | \n",
"
\n",
" \n",
" 1 | \n",
" 2 | \n",
" 1.677637e+11 | \n",
" 464 | \n",
"
\n",
" \n",
" 2 | \n",
" 3 | \n",
" 1.173550e+11 | \n",
" 188 | \n",
"
\n",
" \n",
" 3 | \n",
" 4 | \n",
" 8.616420e+10 | \n",
" 98 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 66,
"text": [
" bucket sum count\n",
"0 1 1.588106e+11 2650\n",
"1 2 1.677637e+11 464\n",
"2 3 1.173550e+11 188\n",
"3 4 8.616420e+10 98"
]
}
],
"prompt_number": 66
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"query = \"\"\"\n",
"select bucket, min(price_amount) as lb,\n",
" max(price_amount) as ub, count(*)\n",
"from (\n",
" select \n",
" ntile(4) over (order by price_amount) AS bucket,\n",
" price_amount\n",
" from acquisitions a\n",
" where price_amount is not null\n",
" and price_amount < 1000000000\n",
") t0\n",
"group by 1\n",
"order by bucket;\n",
"\"\"\"\n",
"\n",
"sql.read_sql(query, con)"
],
"language": "python",
"metadata": {},
"outputs": [
{
"html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" bucket | \n",
" lb | \n",
" ub | \n",
" count | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 1 | \n",
" 1 | \n",
" 15000000 | \n",
" 847 | \n",
"
\n",
" \n",
" 1 | \n",
" 2 | \n",
" 15000000 | \n",
" 57500000 | \n",
" 847 | \n",
"
\n",
" \n",
" 2 | \n",
" 3 | \n",
" 57900000 | \n",
" 206000000 | \n",
" 847 | \n",
"
\n",
" \n",
" 3 | \n",
" 4 | \n",
" 206000000 | \n",
" 996000000 | \n",
" 846 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 67,
"text": [
" bucket lb ub count\n",
"0 1 1 15000000 847\n",
"1 2 15000000 57500000 847\n",
"2 3 57900000 206000000 847\n",
"3 4 206000000 996000000 846"
]
}
],
"prompt_number": 67
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
}
],
"metadata": {}
}
]
}