{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Grabbing data with cenpy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Cenpy (`sen - pie`) is a package that exposes APIs from the US Census Bureau and makes it easy to pull down and work with Census data in Pandas. First, notice that there are two core modules in the package, `base` and `explorer`, which each do different things. First, let's look at `explorer`. " ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [], "source": [ "import cenpy as c\n", "import pandas" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "On import, `explorer` requests all currently available APIs from the Census Bureau's [API listing](http://www.census.gov/data/developers/data-sets.html). In future, it will can also read a `JSON` collection describing the databases from disk, if asked.\n", "\n", "Explorer has two functions, `available` and `explain`. `available` will provide a list of the identifiers of all the APIs that `cenpy` knows about. If run with `verbose=True`, `cenpy` will also include the title of the database as a dictionary. It's a good idea to *not* process this directly, and instead use it to explore currently available APIs.\n", "\n", "Also, beware that the US Census Bureau can change the names of the resources. This means that the index of the following table is not necessarily stable over time; sometimes, the same resource can change its identifier, like when the 2010 decennial census changed from `2010sf1` to `DECENNIALSF12010`. So, consult the table built by `cenpy.explorer.available()` if the keys appear to have changed. \n", "\n", "Here, I'll just show the first five entries:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false, "scrolled": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
titletemporalspatialpublisherprogramCodemodifiedkeyworddistributiondescriptioncontactPoint...c_isTimeseriesc_isCubec_isAvailablec_isAggregatec_groupsLinkc_geographyLinkc_examplesLinkc_datasetbureauCodeaccessLevel
NONEMP20072007 Nonemployer Statistics: Non Employer Stat...2007/2007United StatesU.S. Census Bureau006:0072017-02-09NaN{'@type': 'dcat:Distribution', 'accessURL': 'h...Nonemployer Statistics is an annual series tha...{'fn': 'Nonemployer Statistics staff', 'hasEma......NaNTrueTrueTruehttps://api.census.gov/data/2007/nonemp/groups...https://api.census.gov/data/2007/nonemp/geogra...https://api.census.gov/data/2007/nonemp/exampl...(nonemp,)006:07public
POPESTagesexVintage 2014 Population Estimates: National An...April 1, 2010 - CurrentUnited StatesU.S. Census Bureau006:0082015-11-17NaN{'@type': 'dcat:Distribution', 'accessURL': 'h...Annual Estimates of the Resident Population by...{'fn': 'Population Estimates Branch', 'hasEmai......NaNTrueTrueTruehttps://api.census.gov/data/2014/pep/agesex/gr...https://api.census.gov/data/2014/pep/agesex/ge...https://api.census.gov/data/2014/pep/agesex/ex...(pep, agesex)006:07public
ZBPTotal20112011 County Business Patterns - Zip Code Busin...2011/2011United StatesU.S. Census Bureau006:0072017-05-23NaN{'@type': 'dcat:Distribution', 'accessURL': 'h...ZIP Code Business Patterns (ZBP) is an annual ...{'fn': 'CBP Staff', 'hasEmail': 'ewd.county.bu......NaNNaNTrueTruehttps://api.census.gov/data/2011/zbp/groups.jsonhttps://api.census.gov/data/2011/zbp/geography...https://api.census.gov/data/2011/zbp/examples....(zbp,)006:07public
ITMONTHLYIMPORTSUSDATime Series International Trade: Monthly U.S. ...Time SeriesUnited StatesU.S. Census Bureau006:0042017-10-27NaN{'@type': 'dcat:Distribution', 'accessURL': 'h...The Census data API provides access to the mos...{'fn': 'Jeff McHugh', 'hasEmail': 'Jeffrey.P.M......TrueTrueTrueNaNhttps://api.census.gov/data/timeseries/intltra...https://api.census.gov/data/timeseries/intltra...https://api.census.gov/data/timeseries/intltra...(timeseries, intltrade, imports, usda)006:07public
POPESTnatmonthly2015Vintage 2015 Population Estimates: National Mo...April 1,2010United StatesU.S. Census Bureau006:0082016-12-21NaN{'@type': 'dcat:Distribution', 'accessURL': 'h...Monthly Population Estimates by Universe, Age,...{'fn': 'Population Estimates Branch', 'hasEmai......NaNTrueTrueTruehttps://api.census.gov/data/2015/pep/natmonthl...https://api.census.gov/data/2015/pep/natmonthl...https://api.census.gov/data/2015/pep/natmonthl...(pep, natmonthly)006:07public
\n", "

5 rows × 24 columns

\n", "
" ], "text/plain": [ " title \\\n", "NONEMP2007 2007 Nonemployer Statistics: Non Employer Stat... \n", "POPESTagesex Vintage 2014 Population Estimates: National An... \n", "ZBPTotal2011 2011 County Business Patterns - Zip Code Busin... \n", "ITMONTHLYIMPORTSUSDA Time Series International Trade: Monthly U.S. ... \n", "POPESTnatmonthly2015 Vintage 2015 Population Estimates: National Mo... \n", "\n", " temporal spatial \\\n", "NONEMP2007 2007/2007 United States \n", "POPESTagesex April 1, 2010 - Current United States \n", "ZBPTotal2011 2011/2011 United States \n", "ITMONTHLYIMPORTSUSDA Time Series United States \n", "POPESTnatmonthly2015 April 1,2010 United States \n", "\n", " publisher programCode modified keyword \\\n", "NONEMP2007 U.S. Census Bureau 006:007 2017-02-09 NaN \n", "POPESTagesex U.S. Census Bureau 006:008 2015-11-17 NaN \n", "ZBPTotal2011 U.S. Census Bureau 006:007 2017-05-23 NaN \n", "ITMONTHLYIMPORTSUSDA U.S. Census Bureau 006:004 2017-10-27 NaN \n", "POPESTnatmonthly2015 U.S. Census Bureau 006:008 2016-12-21 NaN \n", "\n", " distribution \\\n", "NONEMP2007 {'@type': 'dcat:Distribution', 'accessURL': 'h... \n", "POPESTagesex {'@type': 'dcat:Distribution', 'accessURL': 'h... \n", "ZBPTotal2011 {'@type': 'dcat:Distribution', 'accessURL': 'h... \n", "ITMONTHLYIMPORTSUSDA {'@type': 'dcat:Distribution', 'accessURL': 'h... \n", "POPESTnatmonthly2015 {'@type': 'dcat:Distribution', 'accessURL': 'h... \n", "\n", " description \\\n", "NONEMP2007 Nonemployer Statistics is an annual series tha... \n", "POPESTagesex Annual Estimates of the Resident Population by... \n", "ZBPTotal2011 ZIP Code Business Patterns (ZBP) is an annual ... \n", "ITMONTHLYIMPORTSUSDA The Census data API provides access to the mos... \n", "POPESTnatmonthly2015 Monthly Population Estimates by Universe, Age,... \n", "\n", " contactPoint ... \\\n", "NONEMP2007 {'fn': 'Nonemployer Statistics staff', 'hasEma... ... \n", "POPESTagesex {'fn': 'Population Estimates Branch', 'hasEmai... ... \n", "ZBPTotal2011 {'fn': 'CBP Staff', 'hasEmail': 'ewd.county.bu... ... \n", "ITMONTHLYIMPORTSUSDA {'fn': 'Jeff McHugh', 'hasEmail': 'Jeffrey.P.M... ... \n", "POPESTnatmonthly2015 {'fn': 'Population Estimates Branch', 'hasEmai... ... \n", "\n", " c_isTimeseries c_isCube c_isAvailable c_isAggregate \\\n", "NONEMP2007 NaN True True True \n", "POPESTagesex NaN True True True \n", "ZBPTotal2011 NaN NaN True True \n", "ITMONTHLYIMPORTSUSDA True True True NaN \n", "POPESTnatmonthly2015 NaN True True True \n", "\n", " c_groupsLink \\\n", "NONEMP2007 https://api.census.gov/data/2007/nonemp/groups... \n", "POPESTagesex https://api.census.gov/data/2014/pep/agesex/gr... \n", "ZBPTotal2011 https://api.census.gov/data/2011/zbp/groups.json \n", "ITMONTHLYIMPORTSUSDA https://api.census.gov/data/timeseries/intltra... \n", "POPESTnatmonthly2015 https://api.census.gov/data/2015/pep/natmonthl... \n", "\n", " c_geographyLink \\\n", "NONEMP2007 https://api.census.gov/data/2007/nonemp/geogra... \n", "POPESTagesex https://api.census.gov/data/2014/pep/agesex/ge... \n", "ZBPTotal2011 https://api.census.gov/data/2011/zbp/geography... \n", "ITMONTHLYIMPORTSUSDA https://api.census.gov/data/timeseries/intltra... \n", "POPESTnatmonthly2015 https://api.census.gov/data/2015/pep/natmonthl... \n", "\n", " c_examplesLink \\\n", "NONEMP2007 https://api.census.gov/data/2007/nonemp/exampl... \n", "POPESTagesex https://api.census.gov/data/2014/pep/agesex/ex... \n", "ZBPTotal2011 https://api.census.gov/data/2011/zbp/examples.... \n", "ITMONTHLYIMPORTSUSDA https://api.census.gov/data/timeseries/intltra... \n", "POPESTnatmonthly2015 https://api.census.gov/data/2015/pep/natmonthl... \n", "\n", " c_dataset bureauCode \\\n", "NONEMP2007 (nonemp,) 006:07 \n", "POPESTagesex (pep, agesex) 006:07 \n", "ZBPTotal2011 (zbp,) 006:07 \n", "ITMONTHLYIMPORTSUSDA (timeseries, intltrade, imports, usda) 006:07 \n", "POPESTnatmonthly2015 (pep, natmonthly) 006:07 \n", "\n", " accessLevel \n", "NONEMP2007 public \n", "POPESTagesex public \n", "ZBPTotal2011 public \n", "ITMONTHLYIMPORTSUSDA public \n", "POPESTnatmonthly2015 public \n", "\n", "[5 rows x 24 columns]" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "c.explorer.available().head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `explain` command provides the title and full description of the datasource. If run in verbose mode, the function returns the full `json` listing of the API. " ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "{'Decennial SF1': 'Summary File 1 (SF 1) contains detailed tables focusing on age, sex, households, families, and housing units. These tables provide in-depth figures by race and Hispanic origin> some tables are repeated for each of nine race/Latino groups. Counts also are provided for over forty American Indian and Alaska Native tribes and for groups within race categories. The race categories include eighteen Asian groups and twelve Native Hawaiian and Other Pacific Islander groups. Counts of persons of Hispanic origin by country of origin (twenty-eight groups) are also shown. Summary File 1 presents data for the United States, the 50 states, and the District of Columbia in a hierarchical sequence down to the block level for many tabulations, but only to the census tract level for others. Summaries are included for other geographic areas such as ZIP Code Tabulation Areas (ZCTAs) and Congressional districts. Geographic coverage for Puerto Rico is comparable to the 50 states. Data are presented in a hierarchical sequence down the block level for many tabulations, but only to the census tract level for others. Geographic areas include barrios, barrios-pueblo, subbarrios, places, census tracts, block groups, and blocks. Summaries also are included for other geographic areas such as ZIP Code Tabulation Areas (ZCTAs).'}" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "c.explorer.explain('DECENNIALSF12010')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To actually connect to a database resource, you create a `Connection`. A `Connection` works like a *very* simplified connection from the `sqlalchemy` world. The `Connection` class has a method, `query` that constructs a query string and requests it from the Census server. This result is then parsed into JSON and returned to the user. " ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [], "source": [ "conn = c.base.Connection('DECENNIALSF12010')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That may have taken longer than you'd've expected. This is because, when the `Connection` constructor is called, it populates the connection object with a bit of metadata that makes it possible to construct queries without referring to the census handbooks. \n", "\n", "For instance, a connection's `variables` represent all available search parameters for a given dataset. " ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
attributesconceptgrouplabellimitpredicateOnlypredicateTyperequiredvalues
forNaNCensus API Geography SpecificationN/ACensus API FIPS 'for' clause0Truefips-forNaNNaN
inNaNCensus API Geography SpecificationN/ACensus API FIPS 'in' clause0Truefips-inNaNNaN
ucgidNaNCensus API Geography SpecificationN/AUniform Census Geography Identifier clause0TrueucgidNaNNaN
PCT022021NaNGROUP QUARTERS POPULATION BY SEX BY GROUP QUA...PCT22Total!!Female!!Noninstitutionalized population...0NaNintNaNNaN
P028E001NaNHOUSEHOLD TYPE BY HOUSEHOLD SIZE (NATIVE HAWAI...P28ETotal0NaNintNaNNaN
\n", "
" ], "text/plain": [ " attributes concept \\\n", "for NaN Census API Geography Specification \n", "in NaN Census API Geography Specification \n", "ucgid NaN Census API Geography Specification \n", "PCT022021 NaN GROUP QUARTERS POPULATION BY SEX BY GROUP QUA... \n", "P028E001 NaN HOUSEHOLD TYPE BY HOUSEHOLD SIZE (NATIVE HAWAI... \n", "\n", " group label limit \\\n", "for N/A Census API FIPS 'for' clause 0 \n", "in N/A Census API FIPS 'in' clause 0 \n", "ucgid N/A Uniform Census Geography Identifier clause 0 \n", "PCT022021 PCT22 Total!!Female!!Noninstitutionalized population... 0 \n", "P028E001 P28E Total 0 \n", "\n", " predicateOnly predicateType required values \n", "for True fips-for NaN NaN \n", "in True fips-in NaN NaN \n", "ucgid True ucgid NaN NaN \n", "PCT022021 NaN int NaN NaN \n", "P028E001 NaN int NaN NaN " ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn.variables.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This dataframe is populated just like the census's table describing the variables on the corresponding [api website](https://api.census.gov/data/2010/dec/sf1/variables.html). Fortunately, this means that you can modify and filter this dataframe just like you can regular pandas dataframes, so working out what the exact codes to use in your query is easy. \n", "\n", "I've added a function, `varslike`, that globs variables that fit a regular expression pattern. It can use the builtin python `re` module, in addition to the `fnmatch` module. It also can use any filtering function you want. \n", "\n", "So, you can extract the rows of the variables using the `df.ix` method on the list of columns that match your expression:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
attributesconceptgrouplabellimitpredicateOnlypredicateTyperequiredvalues
H011A002NaNTOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ...H11APopulation in occupied housing units!!Owned wi...0NaNintNaNNaN
H011A001NaNTOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ...H11APopulation in occupied housing units0NaNintNaNNaN
H011A004NaNTOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ...H11APopulation in occupied housing units!!Renter o...0NaNintNaNNaN
H011A003NaNTOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ...H11APopulation in occupied housing units!!Owned fr...0NaNintNaNNaN
H011B001NaNTOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ...H11BPopulation in occupied housing units0NaNintNaNNaN
H011B002NaNTOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ...H11BPopulation in occupied housing units!!Owned wi...0NaNintNaNNaN
H011B003NaNTOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ...H11BPopulation in occupied housing units!!Owned fr...0NaNintNaNNaN
H011B004NaNTOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ...H11BPopulation in occupied housing units!!Renter o...0NaNintNaNNaN
\n", "
" ], "text/plain": [ " attributes concept group \\\n", "H011A002 NaN TOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ... H11A \n", "H011A001 NaN TOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ... H11A \n", "H011A004 NaN TOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ... H11A \n", "H011A003 NaN TOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ... H11A \n", "H011B001 NaN TOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ... H11B \n", "H011B002 NaN TOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ... H11B \n", "H011B003 NaN TOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ... H11B \n", "H011B004 NaN TOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ... H11B \n", "\n", " label limit \\\n", "H011A002 Population in occupied housing units!!Owned wi... 0 \n", "H011A001 Population in occupied housing units 0 \n", "H011A004 Population in occupied housing units!!Renter o... 0 \n", "H011A003 Population in occupied housing units!!Owned fr... 0 \n", "H011B001 Population in occupied housing units 0 \n", "H011B002 Population in occupied housing units!!Owned wi... 0 \n", "H011B003 Population in occupied housing units!!Owned fr... 0 \n", "H011B004 Population in occupied housing units!!Renter o... 0 \n", "\n", " predicateOnly predicateType required values \n", "H011A002 NaN int NaN NaN \n", "H011A001 NaN int NaN NaN \n", "H011A004 NaN int NaN NaN \n", "H011A003 NaN int NaN NaN \n", "H011B001 NaN int NaN NaN \n", "H011B002 NaN int NaN NaN \n", "H011B003 NaN int NaN NaN \n", "H011B004 NaN int NaN NaN " ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn.variables.loc[conn.varslike('H011[AB]')]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Likewise, the different levels of geographic scale are determined from the metadata in the overall API listing and recorded. \n", "\n", "However, many Census products have multiple possible geographical indexing systems, like the deprecated `fips` code system and the new *Geographical Names Information System*, `gnis`. Thus, the `geographies` property is a dictionary of dataframes, where each key is the name of the identifier system and the value is the dataframe describing the identifier system. \n", "\n", "For the 2010 census, the following systems are available:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "dict_keys(['fips'])" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn.geographies.keys()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For an explanation of the geographic hierarchies, the `geographies` tables show the geographies at which the data is summarized:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
geoLevelDisplaynameoptionalWithWCForreferenceDaterequireswildcard
0010usNaN2010-01-01NaNNaN
1020regionNaN2010-01-01NaNNaN
2030divisionNaN2010-01-01NaNNaN
3040stateNaN2010-01-01NaNNaN
4050countystate2010-01-01[state][state]
\n", "
" ], "text/plain": [ " geoLevelDisplay name optionalWithWCFor referenceDate requires wildcard\n", "0 010 us NaN 2010-01-01 NaN NaN\n", "1 020 region NaN 2010-01-01 NaN NaN\n", "2 030 division NaN 2010-01-01 NaN NaN\n", "3 040 state NaN 2010-01-01 NaN NaN\n", "4 050 county state 2010-01-01 [state] [state]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn.geographies['fips'].head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that some geographies in the `fips` system have a **requires** filter to prevent drawing too much data. This will get passed to the `query` method later. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So, let's just grab the housing information from the 2010 Census Short Form. Using the variables table above, we picked out a subset of the fields we wanted. Since the variables table is indexed by the identifiers, we can grab the indexes of the filtered dataframe as query parameters. \n", "\n", "In addition, adding the `NAME` field smart-fills the table with the name of the geographic entity being pulled from the Census." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [], "source": [ "cols = conn.varslike('H00[012]*', engine='fnmatch')" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [], "source": [ "cols.append('NAME')" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "['H001001',\n", " 'H002001',\n", " 'H002006',\n", " 'H002002',\n", " 'H002003',\n", " 'H002004',\n", " 'H002005',\n", " 'NAME']" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cols" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now the query. The query is constructed just like the API query, and works as follows. \n", "\n", "1. cols - list of columns desired from the database, maps to census API's `get=`\n", "2. geo_unit - string denoting the unit of study to pull, maps to census API's `in=`\n", "3. geo_filter - dictionary containing groupings of geo_units, if required, maps to `for=`\n", " \n", "To be specific, a fully query tells the server *what* columns to pull of *what* underlying geography from *what* aggregation units. It's structured using these heterogeneous datatypes so it's easy to change the smallest units quickly, while providing sufficient granularity to change the filters and columns as you go. \n", "\n", "This query below grabs the names, population, and housing estimates from the ACS, as well as their standard errors from census designated places in Arizona. \n" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false }, "outputs": [], "source": [ "data = conn.query(cols, geo_unit = 'place:*', geo_filter = {'state':'04'})" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once constructed, the query executes as fast as your internet connection will move. This query has:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "(451, 10)" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "28 columns and 451 rows. So, rather fast. \n", "\n", "For validity and ease of use, we store the last executed query to the object. If you're dodgy about your census API key never being shown in plaintext, never print this property!" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "'https://api.census.gov/data/2010/dec/sf1?get=H001001,H002001,H002006,H002002,H002003,H002004,H002005,NAME&for=place:*&in=state:04&key=174dc2099125916233a42788cc0ffd0336d2ca85'" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn.last_query" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So, you have a dataframe with the information requested, plus the fields specified in the `geo_filter` and `geo_unit`. Sometimes, the `pandas.infer_objects()` function is not able to infer the types or structures of the data in the ways that you might expect. Thus, you may need to format the final data to ensure that the data types are correct. \n", "\n", "So, the following is a dataframe of the data requested. I've filtered it to only look at data where the population is larger than 40 thousand people.\n", "\n", "Pretty neat!" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
H001001H002001H002006H002002H002003H002004H002005NAMEstateplace
64944049440409439494394010Chandler city, Arizona0412000
11020117320117302009792009790194Mesa city, Arizona0446000
201905059050509049390493012Glendale city, Arizona0427820
267124001124001012004912004903952Scottsdale city, Arizona0465000
30552586525860510825108201504Surprise city, Arizona0471510
31473462734620734627346200Tempe city, Arizona0473000
3332297622297620228577228506711185Tucson city, Arizona0477000
413749077490707488074880027Gilbert town, Arizona0427400
4336481864818064133609393194685Peoria city, Arizona0454050
435590149590149058793658793602213Phoenix city, Arizona0455000
\n", "
" ], "text/plain": [ " H001001 H002001 H002006 H002002 H002003 H002004 H002005 \\\n", "64 94404 94404 0 94394 94394 0 10 \n", "110 201173 201173 0 200979 200979 0 194 \n", "201 90505 90505 0 90493 90493 0 12 \n", "267 124001 124001 0 120049 120049 0 3952 \n", "305 52586 52586 0 51082 51082 0 1504 \n", "314 73462 73462 0 73462 73462 0 0 \n", "333 229762 229762 0 228577 228506 71 1185 \n", "413 74907 74907 0 74880 74880 0 27 \n", "433 64818 64818 0 64133 60939 3194 685 \n", "435 590149 590149 0 587936 587936 0 2213 \n", "\n", " NAME state place \n", "64 Chandler city, Arizona 04 12000 \n", "110 Mesa city, Arizona 04 46000 \n", "201 Glendale city, Arizona 04 27820 \n", "267 Scottsdale city, Arizona 04 65000 \n", "305 Surprise city, Arizona 04 71510 \n", "314 Tempe city, Arizona 04 73000 \n", "333 Tucson city, Arizona 04 77000 \n", "413 Gilbert town, Arizona 04 27400 \n", "433 Peoria city, Arizona 04 54050 \n", "435 Phoenix city, Arizona 04 55000 " ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data[data['H001001'].astype(int) > 40000]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And, just in case you're liable to forget your FIPS codes, the explorer module can look up some fips codes listings for you." ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "reading https://www2.census.gov/geo/docs/reference/codes/files/st04_az_places.txt\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
0123456
0AZ4730Aguila CDPCensus Designated PlaceSMaricopa County
1AZ4870Ajo CDPCensus Designated PlaceSPima County
2AZ4940Ak Chin CDPCensus Designated PlaceSPima County
3AZ41090Ak-Chin Village CDPCensus Designated PlaceSPinal County
4AZ41170Alamo Lake CDPCensus Designated PlaceSLa Paz County
5AZ41560Ali Chuk CDPCensus Designated PlaceSPima County
6AZ41570Ali Chukson CDPCensus Designated PlaceSPima County
7AZ41620Ali Molina CDPCensus Designated PlaceSPima County
8AZ41920Alpine CDPCensus Designated PlaceSApache County
9AZ41990Amado CDPCensus Designated PlaceSSanta Cruz County
10AZ42270Anegam CDPCensus Designated PlaceSPima County
11AZ42410Antares CDPCensus Designated PlaceSMohave County
12AZ42430Anthem CDPCensus Designated PlaceSMaricopa County
13AZ42830Apache Junction cityIncorporated PlaceAMaricopa County, Pinal County
14AZ43320Arivaca CDPCensus Designated PlaceSPima County
15AZ43380Arivaca Junction CDPCensus Designated PlaceSPima County
16AZ43530Arizona City CDPCensus Designated PlaceSPinal County
17AZ43915Arizona Village CDPCensus Designated PlaceSMohave County
18AZ44020Arlington CDPCensus Designated PlaceSMaricopa County
19AZ44440Ash Fork CDPCensus Designated PlaceSYavapai County
20AZ44710Avenue B and C CDPCensus Designated PlaceSYuma County
21AZ44720Avondale cityIncorporated PlaceAMaricopa County
22AZ44880Avra Valley CDPCensus Designated PlaceSPima County
23AZ44930Aztec CDPCensus Designated PlaceSYuma County
24AZ45140Bagdad CDPCensus Designated PlaceSYavapai County
25AZ45450Bear Flat CDPCensus Designated PlaceSGila County
26AZ45490Beaver Dam CDPCensus Designated PlaceSMohave County
27AZ45495Beaver Valley CDPCensus Designated PlaceSGila County
28AZ45770Benson cityIncorporated PlaceACochise County
29AZ45970Beyerville CDPCensus Designated PlaceSSanta Cruz County
........................
421AZ482120Wheatfields CDPCensus Designated PlaceSGila County
422AZ482155Whetstone CDPCensus Designated PlaceSCochise County
423AZ482270Whispering Pines CDPCensus Designated PlaceSGila County
424AZ482390Whitecone CDPCensus Designated PlaceSNavajo County
425AZ482425White Hills CDPCensus Designated PlaceSMohave County
426AZ482450White Mountain Lake CDPCensus Designated PlaceSNavajo County
427AZ482530Whiteriver CDPCensus Designated PlaceSNavajo County
428AZ482660Why CDPCensus Designated PlaceSPima County
429AZ482740Wickenburg townIncorporated PlaceAMaricopa County, Yavapai County
430AZ482810Wide Ruins CDPCensus Designated PlaceSApache County
431AZ482880Wikieup CDPCensus Designated PlaceSMohave County
432AZ482950Wilhoit CDPCensus Designated PlaceSYavapai County
433AZ483090Willcox cityIncorporated PlaceACochise County
434AZ483160Williams cityIncorporated PlaceACoconino County
435AZ483388Williamson CDPCensus Designated PlaceSYavapai County
436AZ483475Willow Canyon CDPCensus Designated PlaceSPima County
437AZ483570Willow Valley CDPCensus Designated PlaceSMohave County
438AZ483720Window Rock CDPCensus Designated PlaceSApache County
439AZ483790Winkelman townIncorporated PlaceAGila County, Pinal County
440AZ483930Winslow cityIncorporated PlaceANavajo County
441AZ483960Winslow West CDPCensus Designated PlaceSCoconino County, Navajo County
442AZ484000Wintersburg CDPCensus Designated PlaceSMaricopa County
443AZ484140Wittmann CDPCensus Designated PlaceSMaricopa County
444AZ484350Woodruff CDPCensus Designated PlaceSNavajo County
445AZ484980Yarnell CDPCensus Designated PlaceSYavapai County
446AZ485260York CDPCensus Designated PlaceSGreenlee County
447AZ485330Young CDPCensus Designated PlaceSGila County
448AZ485400Youngtown townIncorporated PlaceAMaricopa County
449AZ485470Yucca CDPCensus Designated PlaceSMohave County
450AZ485540Yuma cityIncorporated PlaceAYuma County
\n", "

451 rows × 7 columns

\n", "
" ], "text/plain": [ " 0 1 2 3 4 5 \\\n", "0 AZ 4 730 Aguila CDP Census Designated Place S \n", "1 AZ 4 870 Ajo CDP Census Designated Place S \n", "2 AZ 4 940 Ak Chin CDP Census Designated Place S \n", "3 AZ 4 1090 Ak-Chin Village CDP Census Designated Place S \n", "4 AZ 4 1170 Alamo Lake CDP Census Designated Place S \n", "5 AZ 4 1560 Ali Chuk CDP Census Designated Place S \n", "6 AZ 4 1570 Ali Chukson CDP Census Designated Place S \n", "7 AZ 4 1620 Ali Molina CDP Census Designated Place S \n", "8 AZ 4 1920 Alpine CDP Census Designated Place S \n", "9 AZ 4 1990 Amado CDP Census Designated Place S \n", "10 AZ 4 2270 Anegam CDP Census Designated Place S \n", "11 AZ 4 2410 Antares CDP Census Designated Place S \n", "12 AZ 4 2430 Anthem CDP Census Designated Place S \n", "13 AZ 4 2830 Apache Junction city Incorporated Place A \n", "14 AZ 4 3320 Arivaca CDP Census Designated Place S \n", "15 AZ 4 3380 Arivaca Junction CDP Census Designated Place S \n", "16 AZ 4 3530 Arizona City CDP Census Designated Place S \n", "17 AZ 4 3915 Arizona Village CDP Census Designated Place S \n", "18 AZ 4 4020 Arlington CDP Census Designated Place S \n", "19 AZ 4 4440 Ash Fork CDP Census Designated Place S \n", "20 AZ 4 4710 Avenue B and C CDP Census Designated Place S \n", "21 AZ 4 4720 Avondale city Incorporated Place A \n", "22 AZ 4 4880 Avra Valley CDP Census Designated Place S \n", "23 AZ 4 4930 Aztec CDP Census Designated Place S \n", "24 AZ 4 5140 Bagdad CDP Census Designated Place S \n", "25 AZ 4 5450 Bear Flat CDP Census Designated Place S \n", "26 AZ 4 5490 Beaver Dam CDP Census Designated Place S \n", "27 AZ 4 5495 Beaver Valley CDP Census Designated Place S \n", "28 AZ 4 5770 Benson city Incorporated Place A \n", "29 AZ 4 5970 Beyerville CDP Census Designated Place S \n", ".. .. .. ... ... ... .. \n", "421 AZ 4 82120 Wheatfields CDP Census Designated Place S \n", "422 AZ 4 82155 Whetstone CDP Census Designated Place S \n", "423 AZ 4 82270 Whispering Pines CDP Census Designated Place S \n", "424 AZ 4 82390 Whitecone CDP Census Designated Place S \n", "425 AZ 4 82425 White Hills CDP Census Designated Place S \n", "426 AZ 4 82450 White Mountain Lake CDP Census Designated Place S \n", "427 AZ 4 82530 Whiteriver CDP Census Designated Place S \n", "428 AZ 4 82660 Why CDP Census Designated Place S \n", "429 AZ 4 82740 Wickenburg town Incorporated Place A \n", "430 AZ 4 82810 Wide Ruins CDP Census Designated Place S \n", "431 AZ 4 82880 Wikieup CDP Census Designated Place S \n", "432 AZ 4 82950 Wilhoit CDP Census Designated Place S \n", "433 AZ 4 83090 Willcox city Incorporated Place A \n", "434 AZ 4 83160 Williams city Incorporated Place A \n", "435 AZ 4 83388 Williamson CDP Census Designated Place S \n", "436 AZ 4 83475 Willow Canyon CDP Census Designated Place S \n", "437 AZ 4 83570 Willow Valley CDP Census Designated Place S \n", "438 AZ 4 83720 Window Rock CDP Census Designated Place S \n", "439 AZ 4 83790 Winkelman town Incorporated Place A \n", "440 AZ 4 83930 Winslow city Incorporated Place A \n", "441 AZ 4 83960 Winslow West CDP Census Designated Place S \n", "442 AZ 4 84000 Wintersburg CDP Census Designated Place S \n", "443 AZ 4 84140 Wittmann CDP Census Designated Place S \n", "444 AZ 4 84350 Woodruff CDP Census Designated Place S \n", "445 AZ 4 84980 Yarnell CDP Census Designated Place S \n", "446 AZ 4 85260 York CDP Census Designated Place S \n", "447 AZ 4 85330 Young CDP Census Designated Place S \n", "448 AZ 4 85400 Youngtown town Incorporated Place A \n", "449 AZ 4 85470 Yucca CDP Census Designated Place S \n", "450 AZ 4 85540 Yuma city Incorporated Place A \n", "\n", " 6 \n", "0 Maricopa County \n", "1 Pima County \n", "2 Pima County \n", "3 Pinal County \n", "4 La Paz County \n", "5 Pima County \n", "6 Pima County \n", "7 Pima County \n", "8 Apache County \n", "9 Santa Cruz County \n", "10 Pima County \n", "11 Mohave County \n", "12 Maricopa County \n", "13 Maricopa County, Pinal County \n", "14 Pima County \n", "15 Pima County \n", "16 Pinal County \n", "17 Mohave County \n", "18 Maricopa County \n", "19 Yavapai County \n", "20 Yuma County \n", "21 Maricopa County \n", "22 Pima County \n", "23 Yuma County \n", "24 Yavapai County \n", "25 Gila County \n", "26 Mohave County \n", "27 Gila County \n", "28 Cochise County \n", "29 Santa Cruz County \n", ".. ... \n", "421 Gila County \n", "422 Cochise County \n", "423 Gila County \n", "424 Navajo County \n", "425 Mohave County \n", "426 Navajo County \n", "427 Navajo County \n", "428 Pima County \n", "429 Maricopa County, Yavapai County \n", "430 Apache County \n", "431 Mohave County \n", "432 Yavapai County \n", "433 Cochise County \n", "434 Coconino County \n", "435 Yavapai County \n", "436 Pima County \n", "437 Mohave County \n", "438 Apache County \n", "439 Gila County, Pinal County \n", "440 Navajo County \n", "441 Coconino County, Navajo County \n", "442 Maricopa County \n", "443 Maricopa County \n", "444 Navajo County \n", "445 Yavapai County \n", "446 Greenlee County \n", "447 Gila County \n", "448 Maricopa County \n", "449 Mohave County \n", "450 Yuma County \n", "\n", "[451 rows x 7 columns]" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "c.explorer.fips_table('place', in_state='AZ')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### GEO & Tiger Integration" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The Census TIGER geometry API is substantively different from every other API, in that it's an ArcGIS REST API. But, I've tried to expose a consistent interface. It works like this:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import cenpy.tiger as tiger" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "[{'name': 'AIANNHA', 'type': 'MapServer'},\n", " {'name': 'CBSA', 'type': 'MapServer'},\n", " {'name': 'Hydro_LargeScale', 'type': 'MapServer'},\n", " {'name': 'Hydro', 'type': 'MapServer'},\n", " {'name': 'Labels', 'type': 'MapServer'},\n", " {'name': 'Legislative', 'type': 'MapServer'},\n", " {'name': 'Places_CouSub_ConCity_SubMCD', 'type': 'MapServer'},\n", " {'name': 'PUMA_TAD_TAZ_UGA_ZCTA', 'type': 'MapServer'},\n", " {'name': 'Region_Division', 'type': 'MapServer'},\n", " {'name': 'School', 'type': 'MapServer'},\n", " {'name': 'Special_Land_Use_Areas', 'type': 'MapServer'},\n", " {'name': 'State_County', 'type': 'MapServer'},\n", " {'name': 'tigerWMS_ACS2013', 'type': 'MapServer'},\n", " {'name': 'tigerWMS_ACS2014', 'type': 'MapServer'},\n", " {'name': 'tigerWMS_ACS2015', 'type': 'MapServer'},\n", " {'name': 'tigerWMS_ACS2016', 'type': 'MapServer'},\n", " {'name': 'tigerWMS_ACS2017', 'type': 'MapServer'},\n", " {'name': 'tigerWMS_ACS2018', 'type': 'MapServer'},\n", " {'name': 'tigerWMS_Census2010', 'type': 'MapServer'},\n", " {'name': 'tigerWMS_Current', 'type': 'MapServer'},\n", " {'name': 'tigerWMS_ECON2012', 'type': 'MapServer'},\n", " {'name': 'tigerWMS_PhysicalFeatures', 'type': 'MapServer'},\n", " {'name': 'Tracts_Blocks', 'type': 'MapServer'},\n", " {'name': 'Transportation_LargeScale', 'type': 'MapServer'},\n", " {'name': 'Transportation', 'type': 'MapServer'},\n", " {'name': 'TribalTracts', 'type': 'MapServer'},\n", " {'name': 'Urban', 'type': 'MapServer'},\n", " {'name': 'USLandmass', 'type': 'MapServer'}]" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tiger.available()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In some cases, it makes quite a bit of sense to \"attach\" a map server to your connection. In the case of the US Census 2010 we've been using, there is an obvious data product match in `tigerWMS_Census2010`. So, let's attach it to the connection." ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": false }, "outputs": [], "source": [ "conn.set_mapservice('tigerWMS_Census2010')" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn.mapservice" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "neat! this is the same as calling: \n", "\n", "`tiger.TigerConnection('tigerWMS_Census2010')`\n", "\n", "but this attaches that object it to the connection you've been using. The connection also updates with this information:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "Connection to Decennial SF1(ID: https://api.census.gov/data/id/DECENNIALSF12010)\n", "With MapServer: Census 2010 WMS" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An ESRI MapServer is a big thing, and `cenpy` doesn't support all of its features. Since `cenpy` is designed to support retreival of data from the US Census, we only support `GET` statements for defined geographic units, and ignore the vaious other functionalities in the service. \n", "\n", "To work with a service, note that any map server is composed of layers:" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "{0: (ESRILayer) Public Use Microdata Areas,\n", " 1: (ESRILayer) Public Use Microdata Areas Labels,\n", " 2: (ESRILayer) Traffic Analysis Districts,\n", " 3: (ESRILayer) Traffic Analysis Districts Labels,\n", " 4: (ESRILayer) Traffic Analysis Zones,\n", " 5: (ESRILayer) Traffic Analysis Zones Labels,\n", " 6: (ESRILayer) Urban Growth Areas,\n", " 7: (ESRILayer) Urban Growth Areas Labels,\n", " 8: (ESRILayer) ZIP Code Tabulation Areas,\n", " 9: (ESRILayer) ZIP Code Tabulation Areas Labels,\n", " 10: (ESRILayer) Tribal Census Tracts,\n", " 11: (ESRILayer) Tribal Census Tracts Labels,\n", " 12: (ESRILayer) Tribal Block Groups,\n", " 13: (ESRILayer) Tribal Block Groups Labels,\n", " 14: (ESRILayer) Census Tracts,\n", " 15: (ESRILayer) Census Tracts Labels,\n", " 16: (ESRILayer) Census Block Groups,\n", " 17: (ESRILayer) Census Block Groups Labels,\n", " 18: (ESRILayer) Census Blocks,\n", " 19: (ESRILayer) Census Blocks Labels,\n", " 20: (ESRILayer) Unified School Districts,\n", " 21: (ESRILayer) Unified School Districts Labels,\n", " 22: (ESRILayer) Secondary School Districts,\n", " 23: (ESRILayer) Secondary School Districts Labels,\n", " 24: (ESRILayer) Elementary School Districts,\n", " 25: (ESRILayer) Elementary School Districts Labels,\n", " 26: (ESRILayer) Estates,\n", " 27: (ESRILayer) Estates Labels,\n", " 28: (ESRILayer) County Subdivisions,\n", " 29: (ESRILayer) County Subdivisions Labels,\n", " 30: (ESRILayer) Subbarrios,\n", " 31: (ESRILayer) Subbarrios Labels,\n", " 32: (ESRILayer) Consolidated Cities,\n", " 33: (ESRILayer) Consolidated Cities Labels,\n", " 34: (ESRILayer) Incorporated Places,\n", " 35: (ESRILayer) Incorporated Places Labels,\n", " 36: (ESRILayer) Census Designated Places,\n", " 37: (ESRILayer) Census Designated Places Labels,\n", " 38: (ESRILayer) Alaska Native Regional Corporations,\n", " 39: (ESRILayer) Alaska Native Regional Corporations Labels,\n", " 40: (ESRILayer) Tribal Subdivisions,\n", " 41: (ESRILayer) Tribal Subdivisions Labels,\n", " 42: (ESRILayer) Federal American Indian Reservations,\n", " 43: (ESRILayer) Federal American Indian Reservations Labels,\n", " 44: (ESRILayer) Off-Reservation Trust Lands,\n", " 45: (ESRILayer) Off-Reservation Trust Lands Labels,\n", " 46: (ESRILayer) State American Indian Reservations,\n", " 47: (ESRILayer) State American Indian Reservations Labels,\n", " 48: (ESRILayer) Hawaiian Home Lands,\n", " 49: (ESRILayer) Hawaiian Home Lands Labels,\n", " 50: (ESRILayer) Alaska Native Village Statistical Areas,\n", " 51: (ESRILayer) Alaska Native Village Statistical Areas Labels,\n", " 52: (ESRILayer) Oklahoma Tribal Statistical Areas,\n", " 53: (ESRILayer) Oklahoma Tribal Statistical Areas Labels,\n", " 54: (ESRILayer) State Designated Tribal Statistical Areas,\n", " 55: (ESRILayer) State Designated Tribal Statistical Areas Labels,\n", " 56: (ESRILayer) Tribal Designated Statistical Areas,\n", " 57: (ESRILayer) Tribal Designated Statistical Areas Labels,\n", " 58: (ESRILayer) American Indian Joint-Use Areas,\n", " 59: (ESRILayer) American Indian Joint-Use Areas Labels,\n", " 60: (ESRILayer) 113th Congressional Districts,\n", " 61: (ESRILayer) 113th Congressional Districts Labels,\n", " 62: (ESRILayer) 111th Congressional Districts,\n", " 63: (ESRILayer) 111th Congressional Districts Labels,\n", " 64: (ESRILayer) 2013 State Legislative Districts - Upper,\n", " 65: (ESRILayer) 2013 State Legislative Districts - Upper Labels,\n", " 66: (ESRILayer) 2013 State Legislative Districts - Lower,\n", " 67: (ESRILayer) 2013 State Legislative Districts - Lower Labels,\n", " 68: (ESRILayer) 2010 State Legislative Districts - Upper,\n", " 69: (ESRILayer) 2010 State Legislative Districts - Upper Labels,\n", " 70: (ESRILayer) 2010 State Legislative Districts - Lower,\n", " 71: (ESRILayer) 2010 State Legislative Districts - Lower Labels,\n", " 72: (ESRILayer) Voting Districts,\n", " 73: (ESRILayer) Voting Districts Labels,\n", " 74: (ESRILayer) Census Divisions,\n", " 75: (ESRILayer) Census Divisions Labels,\n", " 76: (ESRILayer) Census Regions,\n", " 77: (ESRILayer) Census Regions Labels,\n", " 78: (ESRILayer) Urbanized Areas,\n", " 79: (ESRILayer) Urbanized Areas Labels,\n", " 80: (ESRILayer) Urban Clusters,\n", " 81: (ESRILayer) Urban Clusters Labels,\n", " 82: (ESRILayer) Combined New England City and Town Areas,\n", " 83: (ESRILayer) Combined New England City and Town Areas Labels,\n", " 84: (ESRILayer) New England City and Town Area Divisions,\n", " 85: (ESRILayer) New England City and Town Area Divisions Labels,\n", " 86: (ESRILayer) Metropolitan New England City and Town Areas,\n", " 87: (ESRILayer) Metropolitan New England City and Town Areas Labels,\n", " 88: (ESRILayer) Micropolitan New England City and Town Areas,\n", " 89: (ESRILayer) Micropolitan New England City and Town Areas Labels,\n", " 90: (ESRILayer) Combined Statistical Areas,\n", " 91: (ESRILayer) Combined Statistical Areas Labels,\n", " 92: (ESRILayer) Metropolitan Divisions,\n", " 93: (ESRILayer) Metropolitan Divisions Labels,\n", " 94: (ESRILayer) Metropolitan Statistical Areas,\n", " 95: (ESRILayer) Metropolitan Statistical Areas Labels,\n", " 96: (ESRILayer) Micropolitan Statistical Areas,\n", " 97: (ESRILayer) Micropolitan Statistical Areas Labels,\n", " 98: (ESRILayer) States,\n", " 99: (ESRILayer) States Labels,\n", " 100: (ESRILayer) Counties,\n", " 101: (ESRILayer) Counties Labels}" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn.mapservice.layers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "These layers are what actually implement query operations. For now, let's focus on the same \"class\" of units we were using before, Census Designated Places:" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "(ESRILayer) Census Designated Places" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn.mapservice.layers[36]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A query function is implemented both at the mapservice level and the layer level. At the mapservice level, a layer ID is required in order to complete the query. \n", "\n", "Mapservice queries are driven by SQL. So, to grab all of the geodata that fits the CDPs we pulled before, you could start to construct it like this. \n", "\n", "First, just like the main connection, each layer has a set of variables: " ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
aliasdomainlengthnametype
0OBJECTIDNoneNaNOBJECTIDesriFieldTypeOID
1AREALANDNoneNaNAREALANDesriFieldTypeDouble
2AREAWATERNoneNaNAREAWATEResriFieldTypeDouble
3PLACENSNone8.0PLACENSesriFieldTypeString
4PLACECCNone2.0PLACECCesriFieldTypeString
5PLACENone5.0PLACEesriFieldTypeString
6FUNCSTATNone1.0FUNCSTATesriFieldTypeString
7GEOIDNone7.0GEOIDesriFieldTypeString
8LSADCNone2.0LSADCesriFieldTypeString
9MTFCCNone5.0MTFCCesriFieldTypeString
10BASENAMENone100.0BASENAMEesriFieldTypeString
11NAMENone100.0NAMEesriFieldTypeString
12OIDNoneNaNOIDesriFieldTypeDouble
13CBSAPCINone1.0CBSAPCIesriFieldTypeString
14NECTAPCINone1.0NECTAPCIesriFieldTypeString
15STATENone2.0STATEesriFieldTypeString
16URNone1.0UResriFieldTypeString
17CENTLONNone12.0CENTLONesriFieldTypeString
18CENTLATNone11.0CENTLATesriFieldTypeString
19INTPTLONNone12.0INTPTLONesriFieldTypeString
20INTPTLATNone11.0INTPTLATesriFieldTypeString
21HU100NoneNaNHU100esriFieldTypeDouble
22POP100NoneNaNPOP100esriFieldTypeDouble
23STGEOMETRYNoneNaNSTGEOMETRYesriFieldTypeGeometry
24STGEOMETRY.AREANoneNaNSTGEOMETRY.AREAesriFieldTypeDouble
25STGEOMETRY.LENNoneNaNSTGEOMETRY.LENesriFieldTypeDouble
\n", "
" ], "text/plain": [ " alias domain length name type\n", "0 OBJECTID None NaN OBJECTID esriFieldTypeOID\n", "1 AREALAND None NaN AREALAND esriFieldTypeDouble\n", "2 AREAWATER None NaN AREAWATER esriFieldTypeDouble\n", "3 PLACENS None 8.0 PLACENS esriFieldTypeString\n", "4 PLACECC None 2.0 PLACECC esriFieldTypeString\n", "5 PLACE None 5.0 PLACE esriFieldTypeString\n", "6 FUNCSTAT None 1.0 FUNCSTAT esriFieldTypeString\n", "7 GEOID None 7.0 GEOID esriFieldTypeString\n", "8 LSADC None 2.0 LSADC esriFieldTypeString\n", "9 MTFCC None 5.0 MTFCC esriFieldTypeString\n", "10 BASENAME None 100.0 BASENAME esriFieldTypeString\n", "11 NAME None 100.0 NAME esriFieldTypeString\n", "12 OID None NaN OID esriFieldTypeDouble\n", "13 CBSAPCI None 1.0 CBSAPCI esriFieldTypeString\n", "14 NECTAPCI None 1.0 NECTAPCI esriFieldTypeString\n", "15 STATE None 2.0 STATE esriFieldTypeString\n", "16 UR None 1.0 UR esriFieldTypeString\n", "17 CENTLON None 12.0 CENTLON esriFieldTypeString\n", "18 CENTLAT None 11.0 CENTLAT esriFieldTypeString\n", "19 INTPTLON None 12.0 INTPTLON esriFieldTypeString\n", "20 INTPTLAT None 11.0 INTPTLAT esriFieldTypeString\n", "21 HU100 None NaN HU100 esriFieldTypeDouble\n", "22 POP100 None NaN POP100 esriFieldTypeDouble\n", "23 STGEOMETRY None NaN STGEOMETRY esriFieldTypeGeometry\n", "24 STGEOMETRY.AREA None NaN STGEOMETRY.AREA esriFieldTypeDouble\n", "25 STGEOMETRY.LEN None NaN STGEOMETRY.LEN esriFieldTypeDouble" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn.mapservice.layers[36].variables" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Our prior query grabbed the places in AZ. So, we could use a SQL query that focuses on that. \n", "\n", "I try to pack the geometries into containers that people are used to using. Without knowing if GEOS is installed on a user's computer, I use `PySAL` as the target geometry type. \n", "\n", "If you do have GEOS, that means you can use Shapely or GeoPandas. So, to choose your backend, you can use the following two arguments to this query function. the `pkg` argument will let you choose the three types of python objects to output to. \n", "\n", "Pysal is default. If you select Shapely, the result will just be a pandas dataframe with Shapely geometries instead of pysal geometries. If you choose geopandas (or throw a gpize) option, cenpy will try to convert the pandas dataframe into a GeoPandas dataframe." ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "collapsed": false }, "outputs": [], "source": [ "geodata = conn.mapservice.query(layer=36, where='STATE = 04')" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
AREALANDAREAWATERBASENAMECBSAPCICENTLATCENTLONFUNCSTATGEOIDHU100INTPTLAT...OIDPLACEPLACECCPLACENSPOP100STATESTGEOMETRY.AREASTGEOMETRY.LENURgeometry
0190327753749Copper HillN+33.4379206-110.7516247S041560063+33.4309411...28040371747673015600U102582762108042.740956e+0729887.334159RPOLYGON ((-12332768.2339 3951052.317500003, -1...
137014089637430VicksburgN+33.7292771-113.8254052S0479940687+33.7271570...28040371738894179940U102582892597045.365771e+08139508.893591RPOLYGON ((-12681838.7724 3986891.478200004, -1...
21493560321306SawmillN+35.8924354-109.1538097S0464590243+35.8940886...28040119080879964590U102409289748042.283813e+0726370.978055RPOLYGON ((-12154943.6963 4285660.0066, -121549...
386936010Goodyear VillageN+33.1973247-111.8723879S0428465121+33.1973247...28040386109119128465U202612139457041.244920e+0718140.700058MPOLYGON ((-12456882.7869 3922518.464100003, -1...
42338541255226CarrizoN+33.9866282-110.3314358S041032040+33.9793542...28040371723164810320U102582748127043.418072e+0737640.185623RPOLYGON ((-12289756.4977 4027500.7016, -122896...
\n", "

5 rows × 26 columns

\n", "
" ], "text/plain": [ " AREALAND AREAWATER BASENAME CBSAPCI CENTLAT CENTLON \\\n", "0 19032775 3749 Copper Hill N +33.4379206 -110.7516247 \n", "1 370140896 37430 Vicksburg N +33.7292771 -113.8254052 \n", "2 14935603 21306 Sawmill N +35.8924354 -109.1538097 \n", "3 8693601 0 Goodyear Village N +33.1973247 -111.8723879 \n", "4 23385412 55226 Carrizo N +33.9866282 -110.3314358 \n", "\n", " FUNCSTAT GEOID HU100 INTPTLAT ... OID PLACE PLACECC \\\n", "0 S 0415600 63 +33.4309411 ... 280403717476730 15600 U1 \n", "1 S 0479940 687 +33.7271570 ... 280403717388941 79940 U1 \n", "2 S 0464590 243 +35.8940886 ... 280401190808799 64590 U1 \n", "3 S 0428465 121 +33.1973247 ... 280403861091191 28465 U2 \n", "4 S 0410320 40 +33.9793542 ... 280403717231648 10320 U1 \n", "\n", " PLACENS POP100 STATE STGEOMETRY.AREA STGEOMETRY.LEN UR \\\n", "0 02582762 108 04 2.740956e+07 29887.334159 R \n", "1 02582892 597 04 5.365771e+08 139508.893591 R \n", "2 02409289 748 04 2.283813e+07 26370.978055 R \n", "3 02612139 457 04 1.244920e+07 18140.700058 M \n", "4 02582748 127 04 3.418072e+07 37640.185623 R \n", "\n", " geometry \n", "0 POLYGON ((-12332768.2339 3951052.317500003, -1... \n", "1 POLYGON ((-12681838.7724 3986891.478200004, -1... \n", "2 POLYGON ((-12154943.6963 4285660.0066, -121549... \n", "3 POLYGON ((-12456882.7869 3922518.464100003, -1... \n", "4 POLYGON ((-12289756.4977 4027500.7016, -122896... \n", "\n", "[5 rows x 26 columns]" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "geodata.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To join the geodata to the other data, use pandas functions:" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import pandas as pd" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "collapsed": false }, "outputs": [], "source": [ "newdata = pd.merge(data, geodata, left_on='place', right_on='PLACE')" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
H001001H002001H002006H002002H002003H002004H002005NAME_xstateplace...OIDPLACEPLACECCPLACENSPOP100STATESTGEOMETRY.AREASTGEOMETRY.LENURgeometry
06216210000621Queen Valley CDP, Arizona0458290...28040126023170658290U102409115788043.615206e+0729036.707030RPOLYGON ((-12394427.7659 3930594.1774, -123943...
12262260148014878Rainbow City CDP, Arizona0458620...28040371723169158620U202582847968048.203581e+0615645.985013MPOLYGON ((-12244069.0863 4010211.943099998, -1...
2162162016201620Rancho Mesa Verde CDP, Arizona0458870...28040371738900858870U202582848625044.109284e+052901.827817MPOLYGON ((-12763708.1353 3841881.7513, -127634...
32022020000202Red Mesa CDP, Arizona0459550...28040119080879459550U102409150480045.222080e+0743104.992136RPOLYGON ((-12184877.6187 4433486.655900002, -1...
46464000064Red Rock CDP (Apache County), Arizona0459760...28040371723166059760U102582849169044.694147e+0613576.105484RPOLYGON ((-12142700.7787 4383404.183799997, -1...
\n", "

5 rows × 36 columns

\n", "
" ], "text/plain": [ " H001001 H002001 H002006 H002002 H002003 H002004 H002005 \\\n", "0 621 621 0 0 0 0 621 \n", "1 226 226 0 148 0 148 78 \n", "2 162 162 0 162 0 162 0 \n", "3 202 202 0 0 0 0 202 \n", "4 64 64 0 0 0 0 64 \n", "\n", " NAME_x state place ... OID \\\n", "0 Queen Valley CDP, Arizona 04 58290 ... 280401260231706 \n", "1 Rainbow City CDP, Arizona 04 58620 ... 280403717231691 \n", "2 Rancho Mesa Verde CDP, Arizona 04 58870 ... 280403717389008 \n", "3 Red Mesa CDP, Arizona 04 59550 ... 280401190808794 \n", "4 Red Rock CDP (Apache County), Arizona 04 59760 ... 280403717231660 \n", "\n", " PLACE PLACECC PLACENS POP100 STATE STGEOMETRY.AREA STGEOMETRY.LEN UR \\\n", "0 58290 U1 02409115 788 04 3.615206e+07 29036.707030 R \n", "1 58620 U2 02582847 968 04 8.203581e+06 15645.985013 M \n", "2 58870 U2 02582848 625 04 4.109284e+05 2901.827817 M \n", "3 59550 U1 02409150 480 04 5.222080e+07 43104.992136 R \n", "4 59760 U1 02582849 169 04 4.694147e+06 13576.105484 R \n", "\n", " geometry \n", "0 POLYGON ((-12394427.7659 3930594.1774, -123943... \n", "1 POLYGON ((-12244069.0863 4010211.943099998, -1... \n", "2 POLYGON ((-12763708.1353 3841881.7513, -127634... \n", "3 POLYGON ((-12184877.6187 4433486.655900002, -1... \n", "4 POLYGON ((-12142700.7787 4383404.183799997, -1... \n", "\n", "[5 rows x 36 columns]" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "newdata.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So, that's how you get your geodata in addition to your regular data!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## OK, that's one API, does it work for others?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll try the Economic Census" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "collapsed": false }, "outputs": [], "source": [ "conn2 = c.base.Connection('CBP2012')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alright, let's look at the available columns:" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
attributesconceptgrouplabellimitpredicateOnlypredicateTyperequiredvalues
forNaNCensus API Geography SpecificationN/ACensus API FIPS 'for' clause0Truefips-forNaNNaN
inNaNCensus API Geography SpecificationN/ACensus API FIPS 'in' clause0Truefips-inNaNNaN
ucgidNaNCensus API Geography SpecificationN/AUniform Census Geography Identifier clause0TrueucgidNaNNaN
PAYANNPAYANN_FNaNN/AAnnual payroll ($1,000)0NaNintNaNNaN
GEOTYPENaNNaNN/AType of geography flag0NaNstringNaNNaN
PAYQTR1PAYQTR1_FNaNN/AFirst-quarter payroll ($1,000)0NaNintNaNNaN
YEARYEAR_TTLNaNN/AYear0NaNstringNaN{'item': {'1982': '1982', '1983': '1983', '198...
COUNTYNaNNaNN/AFIPS county code0NaNNaNNaNNaN
STNaNNaNN/AFIPS state code0NaNNaNNaNNaN
EMPSZESEMPSZES_TTLNaNN/AEmployment size of establishment14NaNstringdefault displayed{'item': {'001': 'All establishments', '204': ...
CSANaNNaNN/AFIPS Combined Statistical Area code0NaNstringNaNNaN
MDNaNNaNN/AFIPS Metropolitan Division code0NaNstringNaNNaN
GEO_IDGEO_TTL,FOOTID_GEONaNN/AGeographic identifier code0NaNstringNaNNaN
EMPEMP_FNaNN/APaid employees for pay period including March ...0NaNintNaNNaN
ESTABESTAB_FNaNN/ANumber of establishments0NaNintNaNNaN
MSANaNNaNN/AFIPS Metropolitan Statistical Area or Micropol...0NaNNaNNaNNaN
NAICS2012NAICS2012_TTL,FOOTID_NAICSNaNN/A2012 NAICS code2016NaNstringdefault displayed{'item': {'111331': 'Apple orchards', '111332'...
LFOLFO_TTLNaNN/ALegal form of organization code8NaNstringdefault displayed{'item': {'001': 'All establishments', '002': ...
PAYQTR1_NPAYQTR1_N_FNaNN/ANoise range for first-quarter payroll (%)0NaNintNaNNaN
PAYANN_NPAYANN_N_FNaNN/ANoise range for annual payroll0NaNintNaNNaN
EMP_NEMP_N_FNaNN/ANoise range for number of paid employees for p...0NaNintNaNNaN
\n", "
" ], "text/plain": [ " attributes concept \\\n", "for NaN Census API Geography Specification \n", "in NaN Census API Geography Specification \n", "ucgid NaN Census API Geography Specification \n", "PAYANN PAYANN_F NaN \n", "GEOTYPE NaN NaN \n", "PAYQTR1 PAYQTR1_F NaN \n", "YEAR YEAR_TTL NaN \n", "COUNTY NaN NaN \n", "ST NaN NaN \n", "EMPSZES EMPSZES_TTL NaN \n", "CSA NaN NaN \n", "MD NaN NaN \n", "GEO_ID GEO_TTL,FOOTID_GEO NaN \n", "EMP EMP_F NaN \n", "ESTAB ESTAB_F NaN \n", "MSA NaN NaN \n", "NAICS2012 NAICS2012_TTL,FOOTID_NAICS NaN \n", "LFO LFO_TTL NaN \n", "PAYQTR1_N PAYQTR1_N_F NaN \n", "PAYANN_N PAYANN_N_F NaN \n", "EMP_N EMP_N_F NaN \n", "\n", " group label limit \\\n", "for N/A Census API FIPS 'for' clause 0 \n", "in N/A Census API FIPS 'in' clause 0 \n", "ucgid N/A Uniform Census Geography Identifier clause 0 \n", "PAYANN N/A Annual payroll ($1,000) 0 \n", "GEOTYPE N/A Type of geography flag 0 \n", "PAYQTR1 N/A First-quarter payroll ($1,000) 0 \n", "YEAR N/A Year 0 \n", "COUNTY N/A FIPS county code 0 \n", "ST N/A FIPS state code 0 \n", "EMPSZES N/A Employment size of establishment 14 \n", "CSA N/A FIPS Combined Statistical Area code 0 \n", "MD N/A FIPS Metropolitan Division code 0 \n", "GEO_ID N/A Geographic identifier code 0 \n", "EMP N/A Paid employees for pay period including March ... 0 \n", "ESTAB N/A Number of establishments 0 \n", "MSA N/A FIPS Metropolitan Statistical Area or Micropol... 0 \n", "NAICS2012 N/A 2012 NAICS code 2016 \n", "LFO N/A Legal form of organization code 8 \n", "PAYQTR1_N N/A Noise range for first-quarter payroll (%) 0 \n", "PAYANN_N N/A Noise range for annual payroll 0 \n", "EMP_N N/A Noise range for number of paid employees for p... 0 \n", "\n", " predicateOnly predicateType required \\\n", "for True fips-for NaN \n", "in True fips-in NaN \n", "ucgid True ucgid NaN \n", "PAYANN NaN int NaN \n", "GEOTYPE NaN string NaN \n", "PAYQTR1 NaN int NaN \n", "YEAR NaN string NaN \n", "COUNTY NaN NaN NaN \n", "ST NaN NaN NaN \n", "EMPSZES NaN string default displayed \n", "CSA NaN string NaN \n", "MD NaN string NaN \n", "GEO_ID NaN string NaN \n", "EMP NaN int NaN \n", "ESTAB NaN int NaN \n", "MSA NaN NaN NaN \n", "NAICS2012 NaN string default displayed \n", "LFO NaN string default displayed \n", "PAYQTR1_N NaN int NaN \n", "PAYANN_N NaN int NaN \n", "EMP_N NaN int NaN \n", "\n", " values \n", "for NaN \n", "in NaN \n", "ucgid NaN \n", "PAYANN NaN \n", "GEOTYPE NaN \n", "PAYQTR1 NaN \n", "YEAR {'item': {'1982': '1982', '1983': '1983', '198... \n", "COUNTY NaN \n", "ST NaN \n", "EMPSZES {'item': {'001': 'All establishments', '204': ... \n", "CSA NaN \n", "MD NaN \n", "GEO_ID NaN \n", "EMP NaN \n", "ESTAB NaN \n", "MSA NaN \n", "NAICS2012 {'item': {'111331': 'Apple orchards', '111332'... \n", "LFO {'item': {'001': 'All establishments', '002': ... \n", "PAYQTR1_N NaN \n", "PAYANN_N NaN \n", "EMP_N NaN " ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn2.variables" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To show the required predicates, can filter the `variables` dataframe by the `required` field. Note that *required* means that the query **will fail** if these are not passed as keyword arguments. They don't have to specify a single value, though, so they can be left as a wild card, like we did with `place:*` in the prior query:" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
attributesconceptgrouplabellimitpredicateOnlypredicateTyperequiredvalues
EMPSZESEMPSZES_TTLNaNN/AEmployment size of establishment14NaNstringdefault displayed{'item': {'001': 'All establishments', '204': ...
NAICS2012NAICS2012_TTL,FOOTID_NAICSNaNN/A2012 NAICS code2016NaNstringdefault displayed{'item': {'111331': 'Apple orchards', '111332'...
LFOLFO_TTLNaNN/ALegal form of organization code8NaNstringdefault displayed{'item': {'001': 'All establishments', '002': ...
\n", "
" ], "text/plain": [ " attributes concept group \\\n", "EMPSZES EMPSZES_TTL NaN N/A \n", "NAICS2012 NAICS2012_TTL,FOOTID_NAICS NaN N/A \n", "LFO LFO_TTL NaN N/A \n", "\n", " label limit predicateOnly predicateType \\\n", "EMPSZES Employment size of establishment 14 NaN string \n", "NAICS2012 2012 NAICS code 2016 NaN string \n", "LFO Legal form of organization code 8 NaN string \n", "\n", " required \\\n", "EMPSZES default displayed \n", "NAICS2012 default displayed \n", "LFO default displayed \n", "\n", " values \n", "EMPSZES {'item': {'001': 'All establishments', '204': ... \n", "NAICS2012 {'item': {'111331': 'Apple orchards', '111332'... \n", "LFO {'item': {'001': 'All establishments', '002': ... " ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn2.variables[~ conn2.variables.required.isnull()]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Like before, geographies are shown within a given hierarchy. Here, the only geography is the `fips` geography. " ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "dict_keys(['fips'])" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn2.geographies.keys()" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
geoLevelDisplaylimitnameoptionalWithWCForreferenceDaterequireswildcard
0011usNaN2012-01-01NaNNaN
10251stateNaN2012-01-01NaNNaN
2033304countystate2012-01-01[state][state]
308,09917metropolitan statistical area/micropolitan sta...NaN2012-01-01NaNNaN
\n", "
" ], "text/plain": [ " geoLevelDisplay limit name \\\n", "0 01 1 us \n", "1 02 51 state \n", "2 03 3304 county \n", "3 08,09 917 metropolitan statistical area/micropolitan sta... \n", "\n", " optionalWithWCFor referenceDate requires wildcard \n", "0 NaN 2012-01-01 NaN NaN \n", "1 NaN 2012-01-01 NaN NaN \n", "2 state 2012-01-01 [state] [state] \n", "3 NaN 2012-01-01 NaN NaN " ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn2.geographies['fips']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, we'll do some fun with error handling and passing of additional arguments to the query. Any \"extra\" required predicates beyond `get`, `for` and `in` are added at the end of the query as keyword arguments. These are caught and introduced into the query following the API specifications. \n", "\n", "First, though, let's see what happens when we submit a malformed query!\n", "\n", "Here, we can query for every column in the dataset applied to places in California (`fips = 06`). The dataset we're working with, the Economic Census, requires an `OPTAX` field, which identifies the \"type of operation or tax status code\" along which to slice the data. Just like the other arguments, we will map them to keywords in the API string, and a wildcard represents a slice of all possible values. " ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "collapsed": false }, "outputs": [], "source": [ "cols = conn2.varslike('ESTAB*', engine='fnmatch')" ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "collapsed": false, "scrolled": false }, "outputs": [], "source": [ "data2 = conn2.query(cols=cols, geo_unit='county:*', geo_filter={'state':'06'})" ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
ESTABstatecounty
063506999
13670006001
24306003
380106005
4461506007
\n", "
" ], "text/plain": [ " ESTAB state county\n", "0 635 06 999\n", "1 36700 06 001\n", "2 43 06 003\n", "3 801 06 005\n", "4 4615 06 007" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data2.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And so you get the table of employment by County & NAICS code for employment and establishments in California counties. Since we're using counties as our unit of analysis, we could grab the geodata for counties." ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "collapsed": true }, "outputs": [], "source": [ "conn2.set_mapservice('State_County')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But, there are quite a few layers in this MapService:" ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "71" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(conn2.mapservice.layers)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Oof. If you ever want to check out the web interface to see what it looks like, you can retrieve the URLs of most objects using:" ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "'http://tigerweb.geo.census.gov/arcgis/rest/services/TIGERweb/State_County/MapServer'" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn2.mapservice._baseurl" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Anyway, we know counties don't really change all that much. So, let's just pick a counties layer and pull it down for California:" ] }, { "cell_type": "code", "execution_count": 41, "metadata": { "collapsed": false }, "outputs": [], "source": [ "geodata2= conn2.mapservice.query(layer=1,where='STATE = 06')" ] }, { "cell_type": "code", "execution_count": 42, "metadata": { "collapsed": true }, "outputs": [], "source": [ "newdata2 = pd.merge(data2, geodata2, left_on='county', right_on='COUNTY')" ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
ESTABstatecountyAREALANDAREAWATERBASENAMECENTLATCENTLONCOUNTYCOUNTYCC...INTPTLONLSADCMTFCCNAMEOBJECTIDOIDSTATESTGEOMETRY.AREASTGEOMETRY.LENgeometry
036700060011909598011216923745Alameda+37.6505688-121.9177578001H1...-121.912488006G4020Alameda County209827590141293924063.398122e+09309844.151899POLYGON ((-13612245.2954 4538149.388899997, -1...
14306003191229263112557304Alpine+38.5971043-119.8206026003H1...-119.798352206G4020Alpine County131727590289634197063.156005e+09275565.367754POLYGON ((-13366502.0648 4678945.273900002, -1...
280106005153993357529470567Amador+38.4466174-120.6516693005H1...-120.653856306G4020Amador County272427590143912562062.562527e+09359598.495341POLYGON ((-13472696.4062 4647651.505999997, -1...
34615060074238438258105310959Butte+39.6665788-121.6007017007H1...-121.601918806G4020Butte County223727590417130535067.340033e+09525974.965501POLYGON ((-13565003.3072 4798393.384000003, -1...
489106009264182920043797659Calaveras+38.2044678-120.5546688009H1...-120.561441506G4020Calaveras County34727590202403841064.356212e+09367005.819167POLYGON ((-13428574.0355 4627724.500200003, -1...
\n", "

5 rows × 24 columns

\n", "
" ], "text/plain": [ " ESTAB state county AREALAND AREAWATER BASENAME CENTLAT \\\n", "0 36700 06 001 1909598011 216923745 Alameda +37.6505688 \n", "1 43 06 003 1912292631 12557304 Alpine +38.5971043 \n", "2 801 06 005 1539933575 29470567 Amador +38.4466174 \n", "3 4615 06 007 4238438258 105310959 Butte +39.6665788 \n", "4 891 06 009 2641829200 43797659 Calaveras +38.2044678 \n", "\n", " CENTLON COUNTY COUNTYCC ... INTPTLON LSADC MTFCC \\\n", "0 -121.9177578 001 H1 ... -121.9124880 06 G4020 \n", "1 -119.8206026 003 H1 ... -119.7983522 06 G4020 \n", "2 -120.6516693 005 H1 ... -120.6538563 06 G4020 \n", "3 -121.6007017 007 H1 ... -121.6019188 06 G4020 \n", "4 -120.5546688 009 H1 ... -120.5614415 06 G4020 \n", "\n", " NAME OBJECTID OID STATE STGEOMETRY.AREA \\\n", "0 Alameda County 2098 27590141293924 06 3.398122e+09 \n", "1 Alpine County 1317 27590289634197 06 3.156005e+09 \n", "2 Amador County 2724 27590143912562 06 2.562527e+09 \n", "3 Butte County 2237 27590417130535 06 7.340033e+09 \n", "4 Calaveras County 347 27590202403841 06 4.356212e+09 \n", "\n", " STGEOMETRY.LEN geometry \n", "0 309844.151899 POLYGON ((-13612245.2954 4538149.388899997, -1... \n", "1 275565.367754 POLYGON ((-13366502.0648 4678945.273900002, -1... \n", "2 359598.495341 POLYGON ((-13472696.4062 4647651.505999997, -1... \n", "3 525974.965501 POLYGON ((-13565003.3072 4798393.384000003, -1... \n", "4 367005.819167 POLYGON ((-13428574.0355 4627724.500200003, -1... \n", "\n", "[5 rows x 24 columns]" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "newdata2.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And that's all there is to it! Geodata and tabular data from the Census APIs in one place.\n", "\n", "[File an issue](https://github.com/ljwolf/cenpy/issues/new) if you have concerns!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (Analysis)", "language": "python", "name": "ana" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.7" } }, "nbformat": 4, "nbformat_minor": 2 }