{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# The building blocks of `cenpy`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Cenpy (`sen - pie`) is a package that exposes APIs from the US Census Bureau and makes it easy to pull down and work with Census data in Pandas. \n", "\n", "Below, we'll discuss the `APIConnection` interface. This is the building blocks of the data products supported in `cenpy.products`. If you're looking for datasets that are not supported in `cenpy.products` or are interested in building your own application on top of `cenpy`, this is probably better for you. However, most users will probably want to use entries in `cenpy.products`. " ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [], "source": [ "import cenpy as c\n", "import pandas" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "On import, `cenpy.explorer` requests all currently available APIs from the Census Bureau's [API listing](http://www.census.gov/data/developers/data-sets.html). In future, it will can also read a `JSON` collection describing the databases from disk, if asked.\n", "\n", "Explorer has two functions, `available` and `explain`. `available` will provide a list of the identifiers of all the APIs that `cenpy` knows about. If run with `verbose=True`, `cenpy` will also include the title of the database as a dictionary. It's a good idea to *not* process this directly, and instead use it to explore currently available APIs.\n", "\n", "Also, beware that the US Census Bureau can change the names of the resources. This means that the index of the following table is not necessarily stable over time; sometimes, the same resource can change its identifier, like when the 2010 decennial census changed from `2010sf1` to `DECENNIALSF12010`. So, consult the table built by `cenpy.explorer.available()` if the keys appear to have changed. \n", "\n", "Here, I'll just show the first five entries:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false, "scrolled": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
titletemporalspatialpublisherprogramCodemodifiedkeyworddistributiondescriptioncontactPoint...c_isTimeseriesc_isCubec_isAvailablec_isAggregatec_groupsLinkc_geographyLinkc_examplesLinkc_datasetbureauCodeaccessLevel
2000sf12000 Decennial: Summary File 12000USUS Census BureauNaNNaN(){'@type': 'dcat:Distribution', 'accessURL': 'h...Data files available from Census 2000 and the ...{'fn': 'Census Bureau Call Center', 'hasEmail'......NaNNaNTrueTruehttps://api.census.gov/data/2000/sf1/groups.jsonhttps://api.census.gov/data/2000/sf1/geography...https://api.census.gov/data/2000/sf1/examples....(sf1,)006:07public
2000sf32000 Decennial: Summary File 32000USUS Census BureauNaN2017-05-23(){'@type': 'dcat:Distribution', 'accessURL': 'h...This Census 2000 file presents data on the pop...{'fn': 'Census Bureau Call Center', 'hasEmail'......NaNNaNTrueTruehttps://api.census.gov/data/2000/sf3/groups.jsonhttps://api.census.gov/data/2000/sf3/geography...https://api.census.gov/data/2000/sf3/examples....(sf3,)006:07public
2012acs12012 American Community Survey: 1-Year Estimates2012USUS Census BureauNaNNaN(){'@type': 'dcat:Distribution', 'accessURL': 'h...The American Community Survey (ACS) is a natio...{'fn': 'Census Bureau Call Center', 'hasEmail'......NaNNaNTrueTruehttps://api.census.gov/data/2012/acs1/groups.jsonhttps://api.census.gov/data/2012/acs1/geograph...https://api.census.gov/data/2012/acs1/examples...(acs1,)006:07public
2012acs32012 American Community Survey: 3-Year Estimates2012USUS Census BureauNaNNaN(){'@type': 'dcat:Distribution', 'accessURL': 'h...The American Community Survey (ACS) is a natio...{'fn': 'Census Bureau Call Center', 'hasEmail'......NaNNaNTrueTruehttps://api.census.gov/data/2012/acs3/groups.jsonhttps://api.census.gov/data/2012/acs3/geograph...https://api.census.gov/data/2012/acs3/examples...(acs3,)006:07public
2012acs3profile2012 American Community Survey: 3-Year Profile...2012USUS Census BureauNaNNaN(){'@type': 'dcat:Distribution', 'accessURL': 'h...The American Community Survey (ACS) is a natio...{'fn': 'Census Bureau Call Center', 'hasEmail'......NaNNaNTrueTruehttps://api.census.gov/data/2012/acs3/profile/...https://api.census.gov/data/2012/acs3/profile/...https://api.census.gov/data/2012/acs3/profile/...(acs3, profile)006:07public
\n", "

5 rows × 24 columns

\n", "
" ], "text/plain": [ " title temporal \\\n", "2000sf1 2000 Decennial: Summary File 1 2000 \n", "2000sf3 2000 Decennial: Summary File 3 2000 \n", "2012acs1 2012 American Community Survey: 1-Year Estimates 2012 \n", "2012acs3 2012 American Community Survey: 3-Year Estimates 2012 \n", "2012acs3profile 2012 American Community Survey: 3-Year Profile... 2012 \n", "\n", " spatial publisher programCode modified keyword \\\n", "2000sf1 US US Census Bureau NaN NaN () \n", "2000sf3 US US Census Bureau NaN 2017-05-23 () \n", "2012acs1 US US Census Bureau NaN NaN () \n", "2012acs3 US US Census Bureau NaN NaN () \n", "2012acs3profile US US Census Bureau NaN NaN () \n", "\n", " distribution \\\n", "2000sf1 {'@type': 'dcat:Distribution', 'accessURL': 'h... \n", "2000sf3 {'@type': 'dcat:Distribution', 'accessURL': 'h... \n", "2012acs1 {'@type': 'dcat:Distribution', 'accessURL': 'h... \n", "2012acs3 {'@type': 'dcat:Distribution', 'accessURL': 'h... \n", "2012acs3profile {'@type': 'dcat:Distribution', 'accessURL': 'h... \n", "\n", " description \\\n", "2000sf1 Data files available from Census 2000 and the ... \n", "2000sf3 This Census 2000 file presents data on the pop... \n", "2012acs1 The American Community Survey (ACS) is a natio... \n", "2012acs3 The American Community Survey (ACS) is a natio... \n", "2012acs3profile The American Community Survey (ACS) is a natio... \n", "\n", " contactPoint ... \\\n", "2000sf1 {'fn': 'Census Bureau Call Center', 'hasEmail'... ... \n", "2000sf3 {'fn': 'Census Bureau Call Center', 'hasEmail'... ... \n", "2012acs1 {'fn': 'Census Bureau Call Center', 'hasEmail'... ... \n", "2012acs3 {'fn': 'Census Bureau Call Center', 'hasEmail'... ... \n", "2012acs3profile {'fn': 'Census Bureau Call Center', 'hasEmail'... ... \n", "\n", " c_isTimeseries c_isCube c_isAvailable c_isAggregate \\\n", "2000sf1 NaN NaN True True \n", "2000sf3 NaN NaN True True \n", "2012acs1 NaN NaN True True \n", "2012acs3 NaN NaN True True \n", "2012acs3profile NaN NaN True True \n", "\n", " c_groupsLink \\\n", "2000sf1 https://api.census.gov/data/2000/sf1/groups.json \n", "2000sf3 https://api.census.gov/data/2000/sf3/groups.json \n", "2012acs1 https://api.census.gov/data/2012/acs1/groups.json \n", "2012acs3 https://api.census.gov/data/2012/acs3/groups.json \n", "2012acs3profile https://api.census.gov/data/2012/acs3/profile/... \n", "\n", " c_geographyLink \\\n", "2000sf1 https://api.census.gov/data/2000/sf1/geography... \n", "2000sf3 https://api.census.gov/data/2000/sf3/geography... \n", "2012acs1 https://api.census.gov/data/2012/acs1/geograph... \n", "2012acs3 https://api.census.gov/data/2012/acs3/geograph... \n", "2012acs3profile https://api.census.gov/data/2012/acs3/profile/... \n", "\n", " c_examplesLink \\\n", "2000sf1 https://api.census.gov/data/2000/sf1/examples.... \n", "2000sf3 https://api.census.gov/data/2000/sf3/examples.... \n", "2012acs1 https://api.census.gov/data/2012/acs1/examples... \n", "2012acs3 https://api.census.gov/data/2012/acs3/examples... \n", "2012acs3profile https://api.census.gov/data/2012/acs3/profile/... \n", "\n", " c_dataset bureauCode accessLevel \n", "2000sf1 (sf1,) 006:07 public \n", "2000sf3 (sf3,) 006:07 public \n", "2012acs1 (acs1,) 006:07 public \n", "2012acs3 (acs3,) 006:07 public \n", "2012acs3profile (acs3, profile) 006:07 public \n", "\n", "[5 rows x 24 columns]" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "c.explorer.available().head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `explain` command provides the title and full description of the datasource. If run in verbose mode, the function returns the full `json` listing of the API. " ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "{'Decennial SF1': 'Summary File 1 (SF 1) contains detailed tables focusing on age, sex, households, families, and housing units. These tables provide in-depth figures by race and Hispanic origin> some tables are repeated for each of nine race/Latino groups. Counts also are provided for over forty American Indian and Alaska Native tribes and for groups within race categories. The race categories include eighteen Asian groups and twelve Native Hawaiian and Other Pacific Islander groups. Counts of persons of Hispanic origin by country of origin (twenty-eight groups) are also shown. Summary File 1 presents data for the United States, the 50 states, and the District of Columbia in a hierarchical sequence down to the block level for many tabulations, but only to the census tract level for others. Summaries are included for other geographic areas such as ZIP Code Tabulation Areas (ZCTAs) and Congressional districts. Geographic coverage for Puerto Rico is comparable to the 50 states. Data are presented in a hierarchical sequence down the block level for many tabulations, but only to the census tract level for others. Geographic areas include barrios, barrios-pueblo, subbarrios, places, census tracts, block groups, and blocks. Summaries also are included for other geographic areas such as ZIP Code Tabulation Areas (ZCTAs).'}" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "c.explorer.explain('DECENNIALSF12010')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To actually connect to a database resource, you create a `Connection`. A `Connection` works like a *very* simplified connection from the `sqlalchemy` world. The `Connection` class has a method, `query` that constructs a query string and requests it from the Census server. This result is then parsed into JSON and returned to the user. " ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [], "source": [ "conn = c.remote.APIConnection('DECENNIALSF12010')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That may have taken longer than you'd've expected. This is because, when the `Connection` constructor is called, it populates the connection object with a bit of metadata that makes it possible to construct queries without referring to the census handbooks. \n", "\n", "For instance, a connection's `variables` represent all available search parameters for a given dataset. " ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
attributesconceptgrouplabellimitpredicateOnlypredicateTyperequiredvalues
forNaNCensus API Geography SpecificationN/ACensus API FIPS 'for' clause0Truefips-forNaNNaN
inNaNCensus API Geography SpecificationN/ACensus API FIPS 'in' clause0Truefips-inNaNNaN
ucgidNaNCensus API Geography SpecificationN/AUniform Census Geography Identifier clause0TrueucgidNaNNaN
P029009NaNHOUSEHOLD TYPE BY RELATIONSHIPP29Total!!In households!!In family households!!Ad...0NaNintNaNNaN
P029007NaNHOUSEHOLD TYPE BY RELATIONSHIPP29Total!!In households!!In family households!!Sp...0NaNintNaNNaN
\n", "
" ], "text/plain": [ " attributes concept group \\\n", "for NaN Census API Geography Specification N/A \n", "in NaN Census API Geography Specification N/A \n", "ucgid NaN Census API Geography Specification N/A \n", "P029009 NaN HOUSEHOLD TYPE BY RELATIONSHIP P29 \n", "P029007 NaN HOUSEHOLD TYPE BY RELATIONSHIP P29 \n", "\n", " label limit \\\n", "for Census API FIPS 'for' clause 0 \n", "in Census API FIPS 'in' clause 0 \n", "ucgid Uniform Census Geography Identifier clause 0 \n", "P029009 Total!!In households!!In family households!!Ad... 0 \n", "P029007 Total!!In households!!In family households!!Sp... 0 \n", "\n", " predicateOnly predicateType required values \n", "for True fips-for NaN NaN \n", "in True fips-in NaN NaN \n", "ucgid True ucgid NaN NaN \n", "P029009 NaN int NaN NaN \n", "P029007 NaN int NaN NaN " ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn.variables.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This dataframe is populated just like the census's table describing the variables on the corresponding [api website](https://api.census.gov/data/2010/dec/sf1/variables.html). Fortunately, this means that you can modify and filter this dataframe just like you can regular pandas dataframes, so working out what the exact codes to use in your query is easy. \n", "\n", "I've added a function, `varslike`, that globs variables that fit a regular expression pattern. It can use the builtin python `re` module, in addition to the `fnmatch` module. It also can use any filtering function you want. " ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
attributesconceptgrouplabellimitpredicateOnlypredicateTyperequiredvalues
H011A002NaNTOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ...H11APopulation in occupied housing units!!Owned wi...0NaNintNaNNaN
H011A001NaNTOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ...H11APopulation in occupied housing units0NaNintNaNNaN
H011A004NaNTOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ...H11APopulation in occupied housing units!!Renter o...0NaNintNaNNaN
H011A003NaNTOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ...H11APopulation in occupied housing units!!Owned fr...0NaNintNaNNaN
H011B004NaNTOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ...H11BPopulation in occupied housing units!!Renter o...0NaNintNaNNaN
H011B002NaNTOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ...H11BPopulation in occupied housing units!!Owned wi...0NaNintNaNNaN
H011B003NaNTOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ...H11BPopulation in occupied housing units!!Owned fr...0NaNintNaNNaN
H011B001NaNTOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ...H11BPopulation in occupied housing units0NaNintNaNNaN
\n", "
" ], "text/plain": [ " attributes concept group \\\n", "H011A002 NaN TOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ... H11A \n", "H011A001 NaN TOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ... H11A \n", "H011A004 NaN TOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ... H11A \n", "H011A003 NaN TOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ... H11A \n", "H011B004 NaN TOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ... H11B \n", "H011B002 NaN TOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ... H11B \n", "H011B003 NaN TOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ... H11B \n", "H011B001 NaN TOTAL POPULATION IN OCCUPIED HOUSING UNITS BY ... H11B \n", "\n", " label limit \\\n", "H011A002 Population in occupied housing units!!Owned wi... 0 \n", "H011A001 Population in occupied housing units 0 \n", "H011A004 Population in occupied housing units!!Renter o... 0 \n", "H011A003 Population in occupied housing units!!Owned fr... 0 \n", "H011B004 Population in occupied housing units!!Renter o... 0 \n", "H011B002 Population in occupied housing units!!Owned wi... 0 \n", "H011B003 Population in occupied housing units!!Owned fr... 0 \n", "H011B001 Population in occupied housing units 0 \n", "\n", " predicateOnly predicateType required values \n", "H011A002 NaN int NaN NaN \n", "H011A001 NaN int NaN NaN \n", "H011A004 NaN int NaN NaN \n", "H011A003 NaN int NaN NaN \n", "H011B004 NaN int NaN NaN \n", "H011B002 NaN int NaN NaN \n", "H011B003 NaN int NaN NaN \n", "H011B001 NaN int NaN NaN " ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn.varslike('H011[AB]')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also use this functionality to filter variables using an arbitrary field:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
attributesconceptgrouplabellimitpredicateOnlypredicateTyperequiredvalues
PCT011007NaNHISPANIC OR LATINO BY SPECIFIC ORIGINPCT11Total!!Hispanic or Latino (200-299)!!Dominican...0NaNintNaNNaN
\n", "
" ], "text/plain": [ " attributes concept group \\\n", "PCT011007 NaN HISPANIC OR LATINO BY SPECIFIC ORIGIN PCT11 \n", "\n", " label limit \\\n", "PCT011007 Total!!Hispanic or Latino (200-299)!!Dominican... 0 \n", "\n", " predicateOnly predicateType required values \n", "PCT011007 NaN int NaN NaN " ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn.varslike('Dominican', by='label')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Likewise, the different levels of geographic scale are determined from the metadata in the overall API listing and recorded. \n", "\n", "However, many Census products have multiple possible geographical indexing systems, like the deprecated `fips` code system and the new *Geographical Names Information System*, `gnis`. Thus, the `geographies` property is a dictionary of dataframes, where each key is the name of the identifier system and the value is the dataframe describing the identifier system. \n", "\n", "For the 2010 census, the following systems are available:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "dict_keys(['fips'])" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn.geographies.keys()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For an explanation of the geographic hierarchies, the `geographies` tables show the geographies at which the data is summarized:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
geoLevelDisplaynameoptionalWithWCForreferenceDaterequireswildcard
0010usNaN2010-01-01NaNNaN
1020regionNaN2010-01-01NaNNaN
2030divisionNaN2010-01-01NaNNaN
3040stateNaN2010-01-01NaNNaN
4050countystate2010-01-01[state][state]
\n", "
" ], "text/plain": [ " geoLevelDisplay name optionalWithWCFor referenceDate requires wildcard\n", "0 010 us NaN 2010-01-01 NaN NaN\n", "1 020 region NaN 2010-01-01 NaN NaN\n", "2 030 division NaN 2010-01-01 NaN NaN\n", "3 040 state NaN 2010-01-01 NaN NaN\n", "4 050 county state 2010-01-01 [state] [state]" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn.geographies['fips'].head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that some geographies in the `fips` system have a **requires** filter to prevent drawing too much data. This will get passed to the `query` method later. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So, let's just grab the housing information from the 2010 Census Short Form. Using the variables table above, we picked out a subset of the fields we wanted. Since the variables table is indexed by the identifiers, we can grab the indexes of the filtered dataframe as query parameters. \n", "\n", "In addition, adding the `NAME` field smart-fills the table with the name of the geographic entity being pulled from the Census." ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": false }, "outputs": [], "source": [ "cols = conn.varslike('H00[012]*', engine='fnmatch').index.tolist()" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": false }, "outputs": [], "source": [ "cols.append('NAME')" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "['H001001',\n", " 'H002001',\n", " 'H002006',\n", " 'H002003',\n", " 'H002002',\n", " 'H002005',\n", " 'H002004',\n", " 'NAME']" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cols" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now the query. The query is constructed just like the API query, and works as follows. \n", "\n", "1. cols - list of columns desired from the database, maps to census API's `get=`\n", "2. geo_unit - string denoting the unit of study to pull, maps to census API's `in=`\n", "3. geo_filter - dictionary containing groupings of geo_units, if required, maps to `for=`\n", " \n", "To be specific, a fully query tells the server *what* columns to pull of *what* underlying geography from *what* aggregation units. It's structured using these heterogeneous datatypes so it's easy to change the smallest units quickly, while providing sufficient granularity to change the filters and columns as you go. \n", "\n", "This query below grabs the names, population, and housing estimates from the ACS, as well as their standard errors from census designated places in Arizona. \n" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "collapsed": false }, "outputs": [], "source": [ "data = conn.query(cols, geo_unit = 'place:*', geo_filter = {'state':'04'})" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once constructed, the query executes as fast as your internet connection will move. This query has:" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "(451, 10)" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "28 columns and 451 rows. So, rather fast. \n", "\n", "For validity and ease of use, we store the last executed query to the object. If you're dodgy about your census API key never being shown in plaintext, never print this property!" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "'https://api.census.gov/data/2010/dec/sf1?get=H001001,H002001,H002006,H002003,H002002,H002005,H002004,NAME&for=place:*&in=state:04&key=174dc2099125916233a42788cc0ffd0336d2ca85'" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn.last_query" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So, you have a dataframe with the information requested, plus the fields specified in the `geo_filter` and `geo_unit`. Sometimes, the `pandas.infer_objects()` function is not able to infer the types or structures of the data in the ways that you might expect. Thus, you may need to format the final data to ensure that the data types are correct. \n", "\n", "So, the following is a dataframe of the data requested. I've filtered it to only look at data where the population is larger than 40 thousand people.\n", "\n", "Pretty neat!" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
H001001H002001H002006H002003H002002H002005H002004NAMEstateplace
63944049440409439494394100Chandler city, Arizona0412000
146749077490707488074880270Gilbert town, Arizona0427400
148905059050509049390493120Glendale city, Arizona0427820
22420117320117302009792009791940Mesa city, Arizona0446000
2666481864818060939641336853194Peoria city, Arizona0454050
268590149590149058793658793622130Phoenix city, Arizona0455000
328124001124001012004912004939520Scottsdale city, Arizona0465000
36652586525860510825108215040Surprise city, Arizona0471510
37573462734620734627346200Tempe city, Arizona0473000
3942297622297620228506228577118571Tucson city, Arizona0477000
\n", "
" ], "text/plain": [ " H001001 H002001 H002006 H002003 H002002 H002005 H002004 \\\n", "63 94404 94404 0 94394 94394 10 0 \n", "146 74907 74907 0 74880 74880 27 0 \n", "148 90505 90505 0 90493 90493 12 0 \n", "224 201173 201173 0 200979 200979 194 0 \n", "266 64818 64818 0 60939 64133 685 3194 \n", "268 590149 590149 0 587936 587936 2213 0 \n", "328 124001 124001 0 120049 120049 3952 0 \n", "366 52586 52586 0 51082 51082 1504 0 \n", "375 73462 73462 0 73462 73462 0 0 \n", "394 229762 229762 0 228506 228577 1185 71 \n", "\n", " NAME state place \n", "63 Chandler city, Arizona 04 12000 \n", "146 Gilbert town, Arizona 04 27400 \n", "148 Glendale city, Arizona 04 27820 \n", "224 Mesa city, Arizona 04 46000 \n", "266 Peoria city, Arizona 04 54050 \n", "268 Phoenix city, Arizona 04 55000 \n", "328 Scottsdale city, Arizona 04 65000 \n", "366 Surprise city, Arizona 04 71510 \n", "375 Tempe city, Arizona 04 73000 \n", "394 Tucson city, Arizona 04 77000 " ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data[data['H001001'].astype(int) > 40000]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And, just in case you're liable to forget your FIPS codes, the explorer module can look up some fips codes listings for you." ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
AZ0400730Aguila CDPCensus Designated PlaceSMaricopa County
0AZ4870Ajo CDPCensus Designated PlaceSPima County
1AZ4940Ak Chin CDPCensus Designated PlaceSPima County
2AZ41090Ak-Chin Village CDPCensus Designated PlaceSPinal County
3AZ41170Alamo Lake CDPCensus Designated PlaceSLa Paz County
4AZ41560Ali Chuk CDPCensus Designated PlaceSPima County
5AZ41570Ali Chukson CDPCensus Designated PlaceSPima County
6AZ41620Ali Molina CDPCensus Designated PlaceSPima County
7AZ41920Alpine CDPCensus Designated PlaceSApache County
8AZ41990Amado CDPCensus Designated PlaceSSanta Cruz County
9AZ42270Anegam CDPCensus Designated PlaceSPima County
10AZ42410Antares CDPCensus Designated PlaceSMohave County
11AZ42430Anthem CDPCensus Designated PlaceSMaricopa County
12AZ42830Apache Junction cityIncorporated PlaceAMaricopa County, Pinal County
13AZ43320Arivaca CDPCensus Designated PlaceSPima County
14AZ43380Arivaca Junction CDPCensus Designated PlaceSPima County
15AZ43530Arizona City CDPCensus Designated PlaceSPinal County
16AZ43915Arizona Village CDPCensus Designated PlaceSMohave County
17AZ44020Arlington CDPCensus Designated PlaceSMaricopa County
18AZ44440Ash Fork CDPCensus Designated PlaceSYavapai County
19AZ44710Avenue B and C CDPCensus Designated PlaceSYuma County
20AZ44720Avondale cityIncorporated PlaceAMaricopa County
21AZ44880Avra Valley CDPCensus Designated PlaceSPima County
22AZ44930Aztec CDPCensus Designated PlaceSYuma County
23AZ45140Bagdad CDPCensus Designated PlaceSYavapai County
24AZ45450Bear Flat CDPCensus Designated PlaceSGila County
25AZ45490Beaver Dam CDPCensus Designated PlaceSMohave County
26AZ45495Beaver Valley CDPCensus Designated PlaceSGila County
27AZ45770Benson cityIncorporated PlaceACochise County
28AZ45970Beyerville CDPCensus Designated PlaceSSanta Cruz County
29AZ46260Bisbee cityIncorporated PlaceACochise County
........................
420AZ482120Wheatfields CDPCensus Designated PlaceSGila County
421AZ482155Whetstone CDPCensus Designated PlaceSCochise County
422AZ482270Whispering Pines CDPCensus Designated PlaceSGila County
423AZ482390Whitecone CDPCensus Designated PlaceSNavajo County
424AZ482425White Hills CDPCensus Designated PlaceSMohave County
425AZ482450White Mountain Lake CDPCensus Designated PlaceSNavajo County
426AZ482530Whiteriver CDPCensus Designated PlaceSNavajo County
427AZ482660Why CDPCensus Designated PlaceSPima County
428AZ482740Wickenburg townIncorporated PlaceAMaricopa County, Yavapai County
429AZ482810Wide Ruins CDPCensus Designated PlaceSApache County
430AZ482880Wikieup CDPCensus Designated PlaceSMohave County
431AZ482950Wilhoit CDPCensus Designated PlaceSYavapai County
432AZ483090Willcox cityIncorporated PlaceACochise County
433AZ483160Williams cityIncorporated PlaceACoconino County
434AZ483388Williamson CDPCensus Designated PlaceSYavapai County
435AZ483475Willow Canyon CDPCensus Designated PlaceSPima County
436AZ483570Willow Valley CDPCensus Designated PlaceSMohave County
437AZ483720Window Rock CDPCensus Designated PlaceSApache County
438AZ483790Winkelman townIncorporated PlaceAGila County, Pinal County
439AZ483930Winslow cityIncorporated PlaceANavajo County
440AZ483960Winslow West CDPCensus Designated PlaceSCoconino County, Navajo County
441AZ484000Wintersburg CDPCensus Designated PlaceSMaricopa County
442AZ484140Wittmann CDPCensus Designated PlaceSMaricopa County
443AZ484350Woodruff CDPCensus Designated PlaceSNavajo County
444AZ484980Yarnell CDPCensus Designated PlaceSYavapai County
445AZ485260York CDPCensus Designated PlaceSGreenlee County
446AZ485330Young CDPCensus Designated PlaceSGila County
447AZ485400Youngtown townIncorporated PlaceAMaricopa County
448AZ485470Yucca CDPCensus Designated PlaceSMohave County
449AZ485540Yuma cityIncorporated PlaceAYuma County
\n", "

450 rows × 7 columns

\n", "
" ], "text/plain": [ " AZ 04 00730 Aguila CDP Census Designated Place S \\\n", "0 AZ 4 870 Ajo CDP Census Designated Place S \n", "1 AZ 4 940 Ak Chin CDP Census Designated Place S \n", "2 AZ 4 1090 Ak-Chin Village CDP Census Designated Place S \n", "3 AZ 4 1170 Alamo Lake CDP Census Designated Place S \n", "4 AZ 4 1560 Ali Chuk CDP Census Designated Place S \n", "5 AZ 4 1570 Ali Chukson CDP Census Designated Place S \n", "6 AZ 4 1620 Ali Molina CDP Census Designated Place S \n", "7 AZ 4 1920 Alpine CDP Census Designated Place S \n", "8 AZ 4 1990 Amado CDP Census Designated Place S \n", "9 AZ 4 2270 Anegam CDP Census Designated Place S \n", "10 AZ 4 2410 Antares CDP Census Designated Place S \n", "11 AZ 4 2430 Anthem CDP Census Designated Place S \n", "12 AZ 4 2830 Apache Junction city Incorporated Place A \n", "13 AZ 4 3320 Arivaca CDP Census Designated Place S \n", "14 AZ 4 3380 Arivaca Junction CDP Census Designated Place S \n", "15 AZ 4 3530 Arizona City CDP Census Designated Place S \n", "16 AZ 4 3915 Arizona Village CDP Census Designated Place S \n", "17 AZ 4 4020 Arlington CDP Census Designated Place S \n", "18 AZ 4 4440 Ash Fork CDP Census Designated Place S \n", "19 AZ 4 4710 Avenue B and C CDP Census Designated Place S \n", "20 AZ 4 4720 Avondale city Incorporated Place A \n", "21 AZ 4 4880 Avra Valley CDP Census Designated Place S \n", "22 AZ 4 4930 Aztec CDP Census Designated Place S \n", "23 AZ 4 5140 Bagdad CDP Census Designated Place S \n", "24 AZ 4 5450 Bear Flat CDP Census Designated Place S \n", "25 AZ 4 5490 Beaver Dam CDP Census Designated Place S \n", "26 AZ 4 5495 Beaver Valley CDP Census Designated Place S \n", "27 AZ 4 5770 Benson city Incorporated Place A \n", "28 AZ 4 5970 Beyerville CDP Census Designated Place S \n", "29 AZ 4 6260 Bisbee city Incorporated Place A \n", ".. .. .. ... ... ... .. \n", "420 AZ 4 82120 Wheatfields CDP Census Designated Place S \n", "421 AZ 4 82155 Whetstone CDP Census Designated Place S \n", "422 AZ 4 82270 Whispering Pines CDP Census Designated Place S \n", "423 AZ 4 82390 Whitecone CDP Census Designated Place S \n", "424 AZ 4 82425 White Hills CDP Census Designated Place S \n", "425 AZ 4 82450 White Mountain Lake CDP Census Designated Place S \n", "426 AZ 4 82530 Whiteriver CDP Census Designated Place S \n", "427 AZ 4 82660 Why CDP Census Designated Place S \n", "428 AZ 4 82740 Wickenburg town Incorporated Place A \n", "429 AZ 4 82810 Wide Ruins CDP Census Designated Place S \n", "430 AZ 4 82880 Wikieup CDP Census Designated Place S \n", "431 AZ 4 82950 Wilhoit CDP Census Designated Place S \n", "432 AZ 4 83090 Willcox city Incorporated Place A \n", "433 AZ 4 83160 Williams city Incorporated Place A \n", "434 AZ 4 83388 Williamson CDP Census Designated Place S \n", "435 AZ 4 83475 Willow Canyon CDP Census Designated Place S \n", "436 AZ 4 83570 Willow Valley CDP Census Designated Place S \n", "437 AZ 4 83720 Window Rock CDP Census Designated Place S \n", "438 AZ 4 83790 Winkelman town Incorporated Place A \n", "439 AZ 4 83930 Winslow city Incorporated Place A \n", "440 AZ 4 83960 Winslow West CDP Census Designated Place S \n", "441 AZ 4 84000 Wintersburg CDP Census Designated Place S \n", "442 AZ 4 84140 Wittmann CDP Census Designated Place S \n", "443 AZ 4 84350 Woodruff CDP Census Designated Place S \n", "444 AZ 4 84980 Yarnell CDP Census Designated Place S \n", "445 AZ 4 85260 York CDP Census Designated Place S \n", "446 AZ 4 85330 Young CDP Census Designated Place S \n", "447 AZ 4 85400 Youngtown town Incorporated Place A \n", "448 AZ 4 85470 Yucca CDP Census Designated Place S \n", "449 AZ 4 85540 Yuma city Incorporated Place A \n", "\n", " Maricopa County \n", "0 Pima County \n", "1 Pima County \n", "2 Pinal County \n", "3 La Paz County \n", "4 Pima County \n", "5 Pima County \n", "6 Pima County \n", "7 Apache County \n", "8 Santa Cruz County \n", "9 Pima County \n", "10 Mohave County \n", "11 Maricopa County \n", "12 Maricopa County, Pinal County \n", "13 Pima County \n", "14 Pima County \n", "15 Pinal County \n", "16 Mohave County \n", "17 Maricopa County \n", "18 Yavapai County \n", "19 Yuma County \n", "20 Maricopa County \n", "21 Pima County \n", "22 Yuma County \n", "23 Yavapai County \n", "24 Gila County \n", "25 Mohave County \n", "26 Gila County \n", "27 Cochise County \n", "28 Santa Cruz County \n", "29 Cochise County \n", ".. ... \n", "420 Gila County \n", "421 Cochise County \n", "422 Gila County \n", "423 Navajo County \n", "424 Mohave County \n", "425 Navajo County \n", "426 Navajo County \n", "427 Pima County \n", "428 Maricopa County, Yavapai County \n", "429 Apache County \n", "430 Mohave County \n", "431 Yavapai County \n", "432 Cochise County \n", "433 Coconino County \n", "434 Yavapai County \n", "435 Pima County \n", "436 Mohave County \n", "437 Apache County \n", "438 Gila County, Pinal County \n", "439 Navajo County \n", "440 Coconino County, Navajo County \n", "441 Maricopa County \n", "442 Maricopa County \n", "443 Navajo County \n", "444 Yavapai County \n", "445 Greenlee County \n", "446 Gila County \n", "447 Maricopa County \n", "448 Mohave County \n", "449 Yuma County \n", "\n", "[450 rows x 7 columns]" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "c.explorer.fips_table('place', in_state='AZ')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### GEO & Tiger Integration" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The Census TIGER geometry API is substantively different from every other API, in that it's an ArcGIS REST API. But, I've tried to expose a consistent interface. It works like this:" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import cenpy.tiger as tiger" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "[{'name': 'AIANNHA', 'type': 'MapServer'},\n", " {'name': 'CBSA', 'type': 'MapServer'},\n", " {'name': 'Hydro', 'type': 'MapServer'},\n", " {'name': 'Labels', 'type': 'MapServer'},\n", " {'name': 'Legislative', 'type': 'MapServer'},\n", " {'name': 'Places_CouSub_ConCity_SubMCD', 'type': 'MapServer'},\n", " {'name': 'PUMA_TAD_TAZ_UGA_ZCTA', 'type': 'MapServer'},\n", " {'name': 'Region_Division', 'type': 'MapServer'},\n", " {'name': 'School', 'type': 'MapServer'},\n", " {'name': 'Special_Land_Use_Areas', 'type': 'MapServer'},\n", " {'name': 'State_County', 'type': 'MapServer'},\n", " {'name': 'tigerWMS_ACS2013', 'type': 'MapServer'},\n", " {'name': 'tigerWMS_ACS2014', 'type': 'MapServer'},\n", " {'name': 'tigerWMS_ACS2015', 'type': 'MapServer'},\n", " {'name': 'tigerWMS_ACS2016', 'type': 'MapServer'},\n", " {'name': 'tigerWMS_ACS2017', 'type': 'MapServer'},\n", " {'name': 'tigerWMS_ACS2018', 'type': 'MapServer'},\n", " {'name': 'tigerWMS_ACS2019', 'type': 'MapServer'},\n", " {'name': 'tigerWMS_Census2010', 'type': 'MapServer'},\n", " {'name': 'tigerWMS_Current', 'type': 'MapServer'},\n", " {'name': 'tigerWMS_ECON2012', 'type': 'MapServer'},\n", " {'name': 'tigerWMS_PhysicalFeatures', 'type': 'MapServer'},\n", " {'name': 'Tracts_Blocks', 'type': 'MapServer'},\n", " {'name': 'Transportation', 'type': 'MapServer'},\n", " {'name': 'TribalTracts', 'type': 'MapServer'},\n", " {'name': 'Urban', 'type': 'MapServer'},\n", " {'name': 'USLandmass', 'type': 'MapServer'}]" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tiger.available()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In some cases, it makes quite a bit of sense to \"attach\" a map server to your connection. In the case of the US Census 2010 we've been using, there is an obvious data product match in `tigerWMS_Census2010`. So, let's attach it to the connection." ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "Connection to Decennial SF1(ID: https://api.census.gov/data/id/DECENNIALSF12010)\n", "With MapServer: Census 2010 WMS" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn.set_mapservice('tigerWMS_Census2010')" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn.mapservice" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "neat! this is the same as calling: \n", "\n", "`tiger.TigerConnection('tigerWMS_Census2010')`\n", "\n", "but this attaches that object it to the connection you've been using. The connection also updates with this information:" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "Connection to Decennial SF1(ID: https://api.census.gov/data/id/DECENNIALSF12010)\n", "With MapServer: Census 2010 WMS" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An ESRI MapServer is a big thing, and `cenpy` doesn't support all of its features. Since `cenpy` is designed to support retreival of data from the US Census, we only support `GET` statements for defined geographic units, and ignore the vaious other functionalities in the service. \n", "\n", "To work with a service, note that any map server is composed of layers:" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "[(ESRILayer) Public Use Microdata Areas,\n", " (ESRILayer) Public Use Microdata Areas Labels,\n", " (ESRILayer) Traffic Analysis Districts,\n", " (ESRILayer) Traffic Analysis Districts Labels,\n", " (ESRILayer) Traffic Analysis Zones,\n", " (ESRILayer) Traffic Analysis Zones Labels,\n", " (ESRILayer) Urban Growth Areas,\n", " (ESRILayer) Urban Growth Areas Labels,\n", " (ESRILayer) ZIP Code Tabulation Areas,\n", " (ESRILayer) ZIP Code Tabulation Areas Labels,\n", " (ESRILayer) Tribal Census Tracts,\n", " (ESRILayer) Tribal Census Tracts Labels,\n", " (ESRILayer) Tribal Block Groups,\n", " (ESRILayer) Tribal Block Groups Labels,\n", " (ESRILayer) Census Tracts,\n", " (ESRILayer) Census Tracts Labels,\n", " (ESRILayer) Census Block Groups,\n", " (ESRILayer) Census Block Groups Labels,\n", " (ESRILayer) Census Blocks,\n", " (ESRILayer) Census Blocks Labels,\n", " (ESRILayer) Unified School Districts,\n", " (ESRILayer) Unified School Districts Labels,\n", " (ESRILayer) Secondary School Districts,\n", " (ESRILayer) Secondary School Districts Labels,\n", " (ESRILayer) Elementary School Districts,\n", " (ESRILayer) Elementary School Districts Labels,\n", " (ESRILayer) Estates,\n", " (ESRILayer) Estates Labels,\n", " (ESRILayer) County Subdivisions,\n", " (ESRILayer) County Subdivisions Labels,\n", " (ESRILayer) Subbarrios,\n", " (ESRILayer) Subbarrios Labels,\n", " (ESRILayer) Consolidated Cities,\n", " (ESRILayer) Consolidated Cities Labels,\n", " (ESRILayer) Incorporated Places,\n", " (ESRILayer) Incorporated Places Labels,\n", " (ESRILayer) Census Designated Places,\n", " (ESRILayer) Census Designated Places Labels,\n", " (ESRILayer) Alaska Native Regional Corporations,\n", " (ESRILayer) Alaska Native Regional Corporations Labels,\n", " (ESRILayer) Tribal Subdivisions,\n", " (ESRILayer) Tribal Subdivisions Labels,\n", " (ESRILayer) Federal American Indian Reservations,\n", " (ESRILayer) Federal American Indian Reservations Labels,\n", " (ESRILayer) Off-Reservation Trust Lands,\n", " (ESRILayer) Off-Reservation Trust Lands Labels,\n", " (ESRILayer) State American Indian Reservations,\n", " (ESRILayer) State American Indian Reservations Labels,\n", " (ESRILayer) Hawaiian Home Lands,\n", " (ESRILayer) Hawaiian Home Lands Labels,\n", " (ESRILayer) Alaska Native Village Statistical Areas,\n", " (ESRILayer) Alaska Native Village Statistical Areas Labels,\n", " (ESRILayer) Oklahoma Tribal Statistical Areas,\n", " (ESRILayer) Oklahoma Tribal Statistical Areas Labels,\n", " (ESRILayer) State Designated Tribal Statistical Areas,\n", " (ESRILayer) State Designated Tribal Statistical Areas Labels,\n", " (ESRILayer) Tribal Designated Statistical Areas,\n", " (ESRILayer) Tribal Designated Statistical Areas Labels,\n", " (ESRILayer) American Indian Joint-Use Areas,\n", " (ESRILayer) American Indian Joint-Use Areas Labels,\n", " (ESRILayer) 113th Congressional Districts,\n", " (ESRILayer) 113th Congressional Districts Labels,\n", " (ESRILayer) 111th Congressional Districts,\n", " (ESRILayer) 111th Congressional Districts Labels,\n", " (ESRILayer) 2013 State Legislative Districts - Upper,\n", " (ESRILayer) 2013 State Legislative Districts - Upper Labels,\n", " (ESRILayer) 2013 State Legislative Districts - Lower,\n", " (ESRILayer) 2013 State Legislative Districts - Lower Labels,\n", " (ESRILayer) 2010 State Legislative Districts - Upper,\n", " (ESRILayer) 2010 State Legislative Districts - Upper Labels,\n", " (ESRILayer) 2010 State Legislative Districts - Lower,\n", " (ESRILayer) 2010 State Legislative Districts - Lower Labels,\n", " (ESRILayer) Voting Districts,\n", " (ESRILayer) Voting Districts Labels,\n", " (ESRILayer) Census Divisions,\n", " (ESRILayer) Census Divisions Labels,\n", " (ESRILayer) Census Regions,\n", " (ESRILayer) Census Regions Labels,\n", " (ESRILayer) Urbanized Areas,\n", " (ESRILayer) Urbanized Areas Labels,\n", " (ESRILayer) Urban Clusters,\n", " (ESRILayer) Urban Clusters Labels,\n", " (ESRILayer) Combined New England City and Town Areas,\n", " (ESRILayer) Combined New England City and Town Areas Labels,\n", " (ESRILayer) New England City and Town Area Divisions,\n", " (ESRILayer) New England City and Town Area Divisions Labels,\n", " (ESRILayer) Metropolitan New England City and Town Areas,\n", " (ESRILayer) Metropolitan New England City and Town Areas Labels,\n", " (ESRILayer) Micropolitan New England City and Town Areas,\n", " (ESRILayer) Micropolitan New England City and Town Areas Labels,\n", " (ESRILayer) Combined Statistical Areas,\n", " (ESRILayer) Combined Statistical Areas Labels,\n", " (ESRILayer) Metropolitan Divisions,\n", " (ESRILayer) Metropolitan Divisions Labels,\n", " (ESRILayer) Metropolitan Statistical Areas,\n", " (ESRILayer) Metropolitan Statistical Areas Labels,\n", " (ESRILayer) Micropolitan Statistical Areas,\n", " (ESRILayer) Micropolitan Statistical Areas Labels,\n", " (ESRILayer) States,\n", " (ESRILayer) States Labels,\n", " (ESRILayer) Counties,\n", " (ESRILayer) Counties Labels]" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn.mapservice.layers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "These layers are what actually implement query operations. For now, let's focus on the same \"class\" of units we were using before, Census Designated Places:" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "(ESRILayer) Census Designated Places" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn.mapservice.layers[36]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A query function is implemented both at the mapservice level and the layer level. At the mapservice level, a layer ID is required in order to complete the query. \n", "\n", "Mapservice queries are driven by SQL. So, to grab all of the geodata that fits the CDPs we pulled before, you could start to construct it like this. \n", "\n", "First, just like the main connection, each layer has a set of variables: " ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
aliasdomainlengthnametype
0MTFCCNone5.0MTFCCesriFieldTypeString
1OIDNoneNaNOIDesriFieldTypeDouble
2GEOIDNone7.0GEOIDesriFieldTypeString
3STATENone2.0STATEesriFieldTypeString
4PLACENone5.0PLACEesriFieldTypeString
5BASENAMENone100.0BASENAMEesriFieldTypeString
6NAMENone100.0NAMEesriFieldTypeString
7LSADCNone2.0LSADCesriFieldTypeString
8FUNCSTATNone1.0FUNCSTATesriFieldTypeString
9PLACECCNone2.0PLACECCesriFieldTypeString
10AREALANDNoneNaNAREALANDesriFieldTypeDouble
11AREAWATERNoneNaNAREAWATEResriFieldTypeDouble
12URNone1.0UResriFieldTypeString
13CBSAPCINone1.0CBSAPCIesriFieldTypeString
14NECTAPCINone1.0NECTAPCIesriFieldTypeString
15STGEOMETRYNoneNaNSTGEOMETRYesriFieldTypeGeometry
16CENTLATNone11.0CENTLATesriFieldTypeString
17CENTLONNone12.0CENTLONesriFieldTypeString
18INTPTLATNone11.0INTPTLATesriFieldTypeString
19INTPTLONNone12.0INTPTLONesriFieldTypeString
20PLACENSNone8.0PLACENSesriFieldTypeString
21HU100NoneNaNHU100esriFieldTypeDouble
22POP100NoneNaNPOP100esriFieldTypeDouble
23OBJECTIDNoneNaNOBJECTIDesriFieldTypeOID
\n", "
" ], "text/plain": [ " alias domain length name type\n", "0 MTFCC None 5.0 MTFCC esriFieldTypeString\n", "1 OID None NaN OID esriFieldTypeDouble\n", "2 GEOID None 7.0 GEOID esriFieldTypeString\n", "3 STATE None 2.0 STATE esriFieldTypeString\n", "4 PLACE None 5.0 PLACE esriFieldTypeString\n", "5 BASENAME None 100.0 BASENAME esriFieldTypeString\n", "6 NAME None 100.0 NAME esriFieldTypeString\n", "7 LSADC None 2.0 LSADC esriFieldTypeString\n", "8 FUNCSTAT None 1.0 FUNCSTAT esriFieldTypeString\n", "9 PLACECC None 2.0 PLACECC esriFieldTypeString\n", "10 AREALAND None NaN AREALAND esriFieldTypeDouble\n", "11 AREAWATER None NaN AREAWATER esriFieldTypeDouble\n", "12 UR None 1.0 UR esriFieldTypeString\n", "13 CBSAPCI None 1.0 CBSAPCI esriFieldTypeString\n", "14 NECTAPCI None 1.0 NECTAPCI esriFieldTypeString\n", "15 STGEOMETRY None NaN STGEOMETRY esriFieldTypeGeometry\n", "16 CENTLAT None 11.0 CENTLAT esriFieldTypeString\n", "17 CENTLON None 12.0 CENTLON esriFieldTypeString\n", "18 INTPTLAT None 11.0 INTPTLAT esriFieldTypeString\n", "19 INTPTLON None 12.0 INTPTLON esriFieldTypeString\n", "20 PLACENS None 8.0 PLACENS esriFieldTypeString\n", "21 HU100 None NaN HU100 esriFieldTypeDouble\n", "22 POP100 None NaN POP100 esriFieldTypeDouble\n", "23 OBJECTID None NaN OBJECTID esriFieldTypeOID" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn.mapservice.layers[36].variables" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Our prior query grabbed the places in AZ. So, we could use a SQL query that focuses on that. \n", "\n", "I try to pack the geometries into containers that people are used to using. Without knowing if GEOS is installed on a user's computer, I use `PySAL` as the target geometry type. \n", "\n", "If you do have GEOS, that means you can use Shapely or GeoPandas. So, to choose your backend, you can use the following two arguments to this query function. the `pkg` argument will let you choose the three types of python objects to output to. \n", "\n", "Pysal is default. If you select Shapely, the result will just be a pandas dataframe with Shapely geometries instead of pysal geometries. If you choose geopandas (or throw a gpize) option, cenpy will try to convert the pandas dataframe into a GeoPandas dataframe." ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "collapsed": false }, "outputs": [], "source": [ "geodata = conn.mapservice.query(layer=36, where='STATE = 04')" ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
AREALANDAREAWATERBASENAMECBSAPCICENTLATCENTLONFUNCSTATGEOIDHU100INTPTLAT...NECTAPCIOBJECTIDOIDPLACEPLACECCPLACENSPOP100STATEURgeometry
03141830Donovan EstatesN+32.7093536-114.6782229S0419790394+32.7093536...N1981028040371738901319790U202582773150804UPOLYGON ((-12766151.5981 3857230.2984, -127661...
130343690Kohls RanchN+34.3210530-111.0838546S0438600127+34.3210530...N2030828040371747671938600U1025828094604RPOLYGON ((-12368084.4537 4072701.559100002, -1...
2208842680WheatfieldsN+33.4805233-110.8366065S0482120465+33.5553700...N2032628040371747665482120U20258289978504RPOLYGON ((-12343973.9878 3969254.195299998, -1...
386799020Goodyear VillageN+33.1973130-111.8723437S0428465121+33.1973130...N2021628040386109119128465U20261213945704MPOLYGON ((-12456879.4474 3922507.554499999, -1...
42338541255226CarrizoN+33.9866282-110.3314358S041032040+33.9793542...N2028328040371723164810320U10258274812704RPOLYGON ((-12289756.4977 4027500.7016, -122896...
\n", "

5 rows × 24 columns

\n", "
" ], "text/plain": [ " AREALAND AREAWATER BASENAME CBSAPCI CENTLAT CENTLON \\\n", "0 314183 0 Donovan Estates N +32.7093536 -114.6782229 \n", "1 3034369 0 Kohls Ranch N +34.3210530 -111.0838546 \n", "2 20884268 0 Wheatfields N +33.4805233 -110.8366065 \n", "3 8679902 0 Goodyear Village N +33.1973130 -111.8723437 \n", "4 23385412 55226 Carrizo N +33.9866282 -110.3314358 \n", "\n", " FUNCSTAT GEOID HU100 INTPTLAT ... NECTAPCI OBJECTID \\\n", "0 S 0419790 394 +32.7093536 ... N 19810 \n", "1 S 0438600 127 +34.3210530 ... N 20308 \n", "2 S 0482120 465 +33.5553700 ... N 20326 \n", "3 S 0428465 121 +33.1973130 ... N 20216 \n", "4 S 0410320 40 +33.9793542 ... N 20283 \n", "\n", " OID PLACE PLACECC PLACENS POP100 STATE UR \\\n", "0 280403717389013 19790 U2 02582773 1508 04 U \n", "1 280403717476719 38600 U1 02582809 46 04 R \n", "2 280403717476654 82120 U2 02582899 785 04 R \n", "3 280403861091191 28465 U2 02612139 457 04 M \n", "4 280403717231648 10320 U1 02582748 127 04 R \n", "\n", " geometry \n", "0 POLYGON ((-12766151.5981 3857230.2984, -127661... \n", "1 POLYGON ((-12368084.4537 4072701.559100002, -1... \n", "2 POLYGON ((-12343973.9878 3969254.195299998, -1... \n", "3 POLYGON ((-12456879.4474 3922507.554499999, -1... \n", "4 POLYGON ((-12289756.4977 4027500.7016, -122896... \n", "\n", "[5 rows x 24 columns]" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "geodata.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To join the geodata to the other data, use pandas functions:" ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import pandas as pd" ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "collapsed": false }, "outputs": [], "source": [ "newdata = pd.merge(data, geodata, left_on='place', right_on='PLACE')" ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
H001001H002001H002006H002003H002002H002005H002004NAME_xstateplace...NECTAPCIOBJECTIDOIDPLACEPLACECCPLACENSPOP100STATEURgeometry
03043040003040Aguila CDP, Arizona0400730...N2883528040371747671300730U10258272079804RPOLYGON ((-12599756.2327 4021163.631899998, -1...
1217521750020061692006Ajo CDP, Arizona0400870...N2979128040125418902600870U102407704330404MPOLYGON ((-12573640.5688 3815514.326800004, -1...
21111000110Ak Chin CDP, Arizona0400940...N2341128040371747662600940U1025827213004RPOLYGON ((-12469657.0325 3801071.388700001, -1...
325625600141115141Ak-Chin Village CDP, Arizona0401090...N2470028040126023169801090U10240770586204MPOLYGON ((-12480838.2961 3895295.468500003, -1...
43131000310Alamo Lake CDP, Arizona0401170...N2162128040371738897701170U2025827222504RPOLYGON ((-12647299.4514 4059688.195799999, -1...
\n", "

5 rows × 34 columns

\n", "
" ], "text/plain": [ " H001001 H002001 H002006 H002003 H002002 H002005 H002004 \\\n", "0 304 304 0 0 0 304 0 \n", "1 2175 2175 0 0 2006 169 2006 \n", "2 11 11 0 0 0 11 0 \n", "3 256 256 0 0 141 115 141 \n", "4 31 31 0 0 0 31 0 \n", "\n", " NAME_x state place ... NECTAPCI OBJECTID \\\n", "0 Aguila CDP, Arizona 04 00730 ... N 28835 \n", "1 Ajo CDP, Arizona 04 00870 ... N 29791 \n", "2 Ak Chin CDP, Arizona 04 00940 ... N 23411 \n", "3 Ak-Chin Village CDP, Arizona 04 01090 ... N 24700 \n", "4 Alamo Lake CDP, Arizona 04 01170 ... N 21621 \n", "\n", " OID PLACE PLACECC PLACENS POP100 STATE UR \\\n", "0 280403717476713 00730 U1 02582720 798 04 R \n", "1 280401254189026 00870 U1 02407704 3304 04 M \n", "2 280403717476626 00940 U1 02582721 30 04 R \n", "3 280401260231698 01090 U1 02407705 862 04 M \n", "4 280403717388977 01170 U2 02582722 25 04 R \n", "\n", " geometry \n", "0 POLYGON ((-12599756.2327 4021163.631899998, -1... \n", "1 POLYGON ((-12573640.5688 3815514.326800004, -1... \n", "2 POLYGON ((-12469657.0325 3801071.388700001, -1... \n", "3 POLYGON ((-12480838.2961 3895295.468500003, -1... \n", "4 POLYGON ((-12647299.4514 4059688.195799999, -1... \n", "\n", "[5 rows x 34 columns]" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "newdata.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So, that's how you get your geodata in addition to your regular data!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## OK, that's one API, does it work for others?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll try the Economic Census" ] }, { "cell_type": "code", "execution_count": 41, "metadata": { "collapsed": false }, "outputs": [], "source": [ "conn2 = c.remote.APIConnection('CBP2012')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alright, let's look at the available columns:" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
attributesconceptgrouplabellimitpredicateOnlypredicateTyperequiredvalues
forNaNCensus API Geography SpecificationN/ACensus API FIPS 'for' clause0Truefips-forNaNNaN
inNaNCensus API Geography SpecificationN/ACensus API FIPS 'in' clause0Truefips-inNaNNaN
ucgidNaNCensus API Geography SpecificationN/AUniform Census Geography Identifier clause0TrueucgidNaNNaN
EMP_NEMP_N_FGeography Area Series: County Business Pattern...CB1200CBPNoise range for number of paid employees for p...0NaNintNaNNaN
FOOTID_GEONaNNaNN/AGeo Footnote0NaNstringNaNNaN
PAYQTR1_NPAYQTR1_N_FGeography Area Series: County Business Pattern...CB1200CBPNoise range for first-quarter payroll (%)0NaNintNaNNaN
CSANaNNaNN/AFIPS Combined Statistical Area code0NaNstringNaNNaN
YEARYEAR_TTLGeography Area Series: County Business Pattern...CB1200CBPYear0NaNstringNaN{'item': {'1982': '1982', '1983': '1983', '198...
LFOLFO_TTLGeography Area Series: County Business Pattern...CB1200CBPLegal form of organization code0NaNstringdefault displayed{'item': {'001': 'All establishments', '002': ...
MDNaNNaNN/AFIPS Metropolitan Division code0NaNstringNaNNaN
FOOTID_NAICSNaNNaNN/ANaics Footnote0NaNstringNaNNaN
MSANaNNaNN/AFIPS Metropolitan Statistical Area or Micropol...0NaNNaNNaNNaN
COUNTYNaNNaNN/AFIPS county code0NaNNaNNaNNaN
STNaNNaNN/AFIPS state code0NaNNaNNaNNaN
PAYANN_NPAYANN_N_FGeography Area Series: County Business Pattern...CB1200CBPNoise range for annual payroll0NaNintNaNNaN
PAYQTR1PAYQTR1_FGeography Area Series: County Business Pattern...CB1200CBPFirst-quarter payroll ($1,000)0NaNintNaNNaN
EMPEMP_FGeography Area Series: County Business Pattern...CB1200CBPPaid employees for pay period including March ...0NaNNaNNaNNaN
NAICS2012NAICS2012_TTL,NAICS2012_F,INDLEVEL,SECTOR,SUBS...Geography Area Series: County Business Pattern...CB1200CBP2012 NAICS code2135NaNstringdefault displayed{'item': {'00': 'Total for all sectors', '0000...
GEO_IDGEO_TTL,GEO_ID_FGeography Area Series: County Business Pattern...CB1200CBPGeographic identifier code0NaNNaNNaNNaN
GEOTYPENaNNaNN/AType of geography flag0NaNstringNaNNaN
ESTABESTAB_FGeography Area Series: County Business Pattern...CB1200CBPNumber of establishments0NaNNaNNaNNaN
PAYANNPAYANN_FGeography Area Series: County Business Pattern...CB1200CBPAnnual payroll ($1,000)0NaNintNaNNaN
EMPSZESEMPSZES_TTLGeography Area Series: County Business Pattern...CB1200CBPEmployment size of establishment14NaNstringdefault displayed{'item': {'001': 'All establishments', '204': ...
\n", "
" ], "text/plain": [ " attributes \\\n", "for NaN \n", "in NaN \n", "ucgid NaN \n", "EMP_N EMP_N_F \n", "FOOTID_GEO NaN \n", "PAYQTR1_N PAYQTR1_N_F \n", "CSA NaN \n", "YEAR YEAR_TTL \n", "LFO LFO_TTL \n", "MD NaN \n", "FOOTID_NAICS NaN \n", "MSA NaN \n", "COUNTY NaN \n", "ST NaN \n", "PAYANN_N PAYANN_N_F \n", "PAYQTR1 PAYQTR1_F \n", "EMP EMP_F \n", "NAICS2012 NAICS2012_TTL,NAICS2012_F,INDLEVEL,SECTOR,SUBS... \n", "GEO_ID GEO_TTL,GEO_ID_F \n", "GEOTYPE NaN \n", "ESTAB ESTAB_F \n", "PAYANN PAYANN_F \n", "EMPSZES EMPSZES_TTL \n", "\n", " concept group \\\n", "for Census API Geography Specification N/A \n", "in Census API Geography Specification N/A \n", "ucgid Census API Geography Specification N/A \n", "EMP_N Geography Area Series: County Business Pattern... CB1200CBP \n", "FOOTID_GEO NaN N/A \n", "PAYQTR1_N Geography Area Series: County Business Pattern... CB1200CBP \n", "CSA NaN N/A \n", "YEAR Geography Area Series: County Business Pattern... CB1200CBP \n", "LFO Geography Area Series: County Business Pattern... CB1200CBP \n", "MD NaN N/A \n", "FOOTID_NAICS NaN N/A \n", "MSA NaN N/A \n", "COUNTY NaN N/A \n", "ST NaN N/A \n", "PAYANN_N Geography Area Series: County Business Pattern... CB1200CBP \n", "PAYQTR1 Geography Area Series: County Business Pattern... CB1200CBP \n", "EMP Geography Area Series: County Business Pattern... CB1200CBP \n", "NAICS2012 Geography Area Series: County Business Pattern... CB1200CBP \n", "GEO_ID Geography Area Series: County Business Pattern... CB1200CBP \n", "GEOTYPE NaN N/A \n", "ESTAB Geography Area Series: County Business Pattern... CB1200CBP \n", "PAYANN Geography Area Series: County Business Pattern... CB1200CBP \n", "EMPSZES Geography Area Series: County Business Pattern... CB1200CBP \n", "\n", " label limit \\\n", "for Census API FIPS 'for' clause 0 \n", "in Census API FIPS 'in' clause 0 \n", "ucgid Uniform Census Geography Identifier clause 0 \n", "EMP_N Noise range for number of paid employees for p... 0 \n", "FOOTID_GEO Geo Footnote 0 \n", "PAYQTR1_N Noise range for first-quarter payroll (%) 0 \n", "CSA FIPS Combined Statistical Area code 0 \n", "YEAR Year 0 \n", "LFO Legal form of organization code 0 \n", "MD FIPS Metropolitan Division code 0 \n", "FOOTID_NAICS Naics Footnote 0 \n", "MSA FIPS Metropolitan Statistical Area or Micropol... 0 \n", "COUNTY FIPS county code 0 \n", "ST FIPS state code 0 \n", "PAYANN_N Noise range for annual payroll 0 \n", "PAYQTR1 First-quarter payroll ($1,000) 0 \n", "EMP Paid employees for pay period including March ... 0 \n", "NAICS2012 2012 NAICS code 2135 \n", "GEO_ID Geographic identifier code 0 \n", "GEOTYPE Type of geography flag 0 \n", "ESTAB Number of establishments 0 \n", "PAYANN Annual payroll ($1,000) 0 \n", "EMPSZES Employment size of establishment 14 \n", "\n", " predicateOnly predicateType required \\\n", "for True fips-for NaN \n", "in True fips-in NaN \n", "ucgid True ucgid NaN \n", "EMP_N NaN int NaN \n", "FOOTID_GEO NaN string NaN \n", "PAYQTR1_N NaN int NaN \n", "CSA NaN string NaN \n", "YEAR NaN string NaN \n", "LFO NaN string default displayed \n", "MD NaN string NaN \n", "FOOTID_NAICS NaN string NaN \n", "MSA NaN NaN NaN \n", "COUNTY NaN NaN NaN \n", "ST NaN NaN NaN \n", "PAYANN_N NaN int NaN \n", "PAYQTR1 NaN int NaN \n", "EMP NaN NaN NaN \n", "NAICS2012 NaN string default displayed \n", "GEO_ID NaN NaN NaN \n", "GEOTYPE NaN string NaN \n", "ESTAB NaN NaN NaN \n", "PAYANN NaN int NaN \n", "EMPSZES NaN string default displayed \n", "\n", " values \n", "for NaN \n", "in NaN \n", "ucgid NaN \n", "EMP_N NaN \n", "FOOTID_GEO NaN \n", "PAYQTR1_N NaN \n", "CSA NaN \n", "YEAR {'item': {'1982': '1982', '1983': '1983', '198... \n", "LFO {'item': {'001': 'All establishments', '002': ... \n", "MD NaN \n", "FOOTID_NAICS NaN \n", "MSA NaN \n", "COUNTY NaN \n", "ST NaN \n", "PAYANN_N NaN \n", "PAYQTR1 NaN \n", "EMP NaN \n", "NAICS2012 {'item': {'00': 'Total for all sectors', '0000... \n", "GEO_ID NaN \n", "GEOTYPE NaN \n", "ESTAB NaN \n", "PAYANN NaN \n", "EMPSZES {'item': {'001': 'All establishments', '204': ... " ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn2.variables" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To show the required predicates, can filter the `variables` dataframe by the `required` field. Note that *required* means that the query **will fail** if these are not passed as keyword arguments. They don't have to specify a single value, though, so they can be left as a wild card, like we did with `place:*` in the prior query:" ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
attributesconceptgrouplabellimitpredicateOnlypredicateTyperequiredvalues
LFOLFO_TTLGeography Area Series: County Business Pattern...CB1200CBPLegal form of organization code0NaNstringdefault displayed{'item': {'001': 'All establishments', '002': ...
NAICS2012NAICS2012_TTL,NAICS2012_F,INDLEVEL,SECTOR,SUBS...Geography Area Series: County Business Pattern...CB1200CBP2012 NAICS code2135NaNstringdefault displayed{'item': {'00': 'Total for all sectors', '0000...
EMPSZESEMPSZES_TTLGeography Area Series: County Business Pattern...CB1200CBPEmployment size of establishment14NaNstringdefault displayed{'item': {'001': 'All establishments', '204': ...
\n", "
" ], "text/plain": [ " attributes \\\n", "LFO LFO_TTL \n", "NAICS2012 NAICS2012_TTL,NAICS2012_F,INDLEVEL,SECTOR,SUBS... \n", "EMPSZES EMPSZES_TTL \n", "\n", " concept group \\\n", "LFO Geography Area Series: County Business Pattern... CB1200CBP \n", "NAICS2012 Geography Area Series: County Business Pattern... CB1200CBP \n", "EMPSZES Geography Area Series: County Business Pattern... CB1200CBP \n", "\n", " label limit predicateOnly predicateType \\\n", "LFO Legal form of organization code 0 NaN string \n", "NAICS2012 2012 NAICS code 2135 NaN string \n", "EMPSZES Employment size of establishment 14 NaN string \n", "\n", " required \\\n", "LFO default displayed \n", "NAICS2012 default displayed \n", "EMPSZES default displayed \n", "\n", " values \n", "LFO {'item': {'001': 'All establishments', '002': ... \n", "NAICS2012 {'item': {'00': 'Total for all sectors', '0000... \n", "EMPSZES {'item': {'001': 'All establishments', '204': ... " ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn2.variables[~ conn2.variables.required.isnull()]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Like before, geographies are shown within a given hierarchy. Here, the only geography is the `fips` geography. " ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "dict_keys(['fips'])" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn2.geographies.keys()" ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
geoLevelDisplaylimitnameoptionalWithWCForreferenceDaterequireswildcard
0NaN1usNaN2012-01-01NaNNaN
108,09939metropolitan statistical area/micropolitan sta...NaN2012-01-01NaNNaN
2NaN3249countystate2012-01-01[state][state]
3NaN51stateNaN2012-01-01NaNNaN
\n", "
" ], "text/plain": [ " geoLevelDisplay limit name \\\n", "0 NaN 1 us \n", "1 08,09 939 metropolitan statistical area/micropolitan sta... \n", "2 NaN 3249 county \n", "3 NaN 51 state \n", "\n", " optionalWithWCFor referenceDate requires wildcard \n", "0 NaN 2012-01-01 NaN NaN \n", "1 NaN 2012-01-01 NaN NaN \n", "2 state 2012-01-01 [state] [state] \n", "3 NaN 2012-01-01 NaN NaN " ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn2.geographies['fips']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, we'll do some fun with error handling and passing of additional arguments to the query. Any \"extra\" required predicates beyond `get`, `for` and `in` are added at the end of the query as keyword arguments. These are caught and introduced into the query following the API specifications. \n", "\n", "First, though, let's see what happens when we submit a malformed query!\n", "\n", "Here, we can query for every column in the dataset applied to places in California (`fips = 06`). The dataset we're working with, the Economic Census, requires an `OPTAX` field, which identifies the \"type of operation or tax status code\" along which to slice the data. Just like the other arguments, we will map them to keywords in the API string, and a wildcard represents a slice of all possible values. " ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "collapsed": false }, "outputs": [], "source": [ "cols = conn2.varslike('ESTAB*', engine='fnmatch').index.tolist()" ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "collapsed": false, "scrolled": false }, "outputs": [], "source": [ "data2 = conn2.query(cols=cols, geo_unit='county:*', geo_filter={'state':'06'})" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
ESTABstatecounty
03670006001
14306003
280106005
3461506007
489106009
\n", "
" ], "text/plain": [ " ESTAB state county\n", "0 36700 06 001\n", "1 43 06 003\n", "2 801 06 005\n", "3 4615 06 007\n", "4 891 06 009" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data2.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And so you get the table of employment by County & NAICS code for employment and establishments in California counties. Since we're using counties as our unit of analysis, we could grab the geodata for counties." ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/plain": [ "Connection to 2012 County Business Patterns(ID: https://api.census.gov/data/id/CBP2012)\n", "With MapServer: States and Counties" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn2.set_mapservice('State_County')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But, there are quite a few layers in this MapService:" ] }, { "cell_type": "code", "execution_count": 52, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "71" ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(conn2.mapservice.layers)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Oof. If you ever want to check out the web interface to see what it looks like, you can retrieve the URLs of most objects using:" ] }, { "cell_type": "code", "execution_count": 53, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "'http://tigerweb.geo.census.gov/arcgis/rest/services/TIGERweb/State_County/MapServer'" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conn2.mapservice._baseurl" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Anyway, we know counties don't really change all that much. So, let's just pick a counties layer and pull it down for California:" ] }, { "cell_type": "code", "execution_count": 54, "metadata": { "collapsed": false }, "outputs": [], "source": [ "geodata2= conn2.mapservice.query(layer=1,where='STATE = 06')" ] }, { "cell_type": "code", "execution_count": 55, "metadata": { "collapsed": true }, "outputs": [], "source": [ "newdata2 = pd.merge(data2, geodata2, left_on='county', right_on='COUNTY')" ] }, { "cell_type": "code", "execution_count": 56, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
ESTABstatecountyAREALANDAREAWATERBASENAMECENTLATCENTLONCOUNTYCOUNTYCC...GEOIDINTPTLATINTPTLONLSADCMTFCCNAMEOBJECTIDOIDSTATEgeometry
036700060011909614756216907015Alameda+37.6505687-121.9177578001H1...06001+37.6471385-121.912488006G4020Alameda County20982759014129392406POLYGON ((-13612245.2954 4538149.388899997, -1...
14306003191229260812557304Alpine+38.5971043-119.8206026003H1...06003+38.6217831-119.798352206G4020Alpine County13172759028963419706POLYGON ((-13366502.0648 4678945.273900002, -1...
280106005153993359629470567Amador+38.4466174-120.6516693005H1...06005+38.4435501-120.653856306G4020Amador County27242759014391256206POLYGON ((-13472696.4062 4647651.505999997, -1...
34615060074238488156105261063Butte+39.6665788-121.6007017007H1...06007+39.6659588-121.601918806G4020Butte County22372759041713053506POLYGON ((-13565003.3072 4798393.384000003, -1...
489106009264178499243841871Calaveras+38.2044678-120.5546688009H1...06009+38.1838996-120.561441506G4020Calaveras County3472759020240384106POLYGON ((-13428574.0355 4627724.500200003, -1...
\n", "

5 rows × 22 columns

\n", "
" ], "text/plain": [ " ESTAB state county AREALAND AREAWATER BASENAME CENTLAT \\\n", "0 36700 06 001 1909614756 216907015 Alameda +37.6505687 \n", "1 43 06 003 1912292608 12557304 Alpine +38.5971043 \n", "2 801 06 005 1539933596 29470567 Amador +38.4466174 \n", "3 4615 06 007 4238488156 105261063 Butte +39.6665788 \n", "4 891 06 009 2641784992 43841871 Calaveras +38.2044678 \n", "\n", " CENTLON COUNTY COUNTYCC ... GEOID INTPTLAT INTPTLON LSADC \\\n", "0 -121.9177578 001 H1 ... 06001 +37.6471385 -121.9124880 06 \n", "1 -119.8206026 003 H1 ... 06003 +38.6217831 -119.7983522 06 \n", "2 -120.6516693 005 H1 ... 06005 +38.4435501 -120.6538563 06 \n", "3 -121.6007017 007 H1 ... 06007 +39.6659588 -121.6019188 06 \n", "4 -120.5546688 009 H1 ... 06009 +38.1838996 -120.5614415 06 \n", "\n", " MTFCC NAME OBJECTID OID STATE \\\n", "0 G4020 Alameda County 2098 27590141293924 06 \n", "1 G4020 Alpine County 1317 27590289634197 06 \n", "2 G4020 Amador County 2724 27590143912562 06 \n", "3 G4020 Butte County 2237 27590417130535 06 \n", "4 G4020 Calaveras County 347 27590202403841 06 \n", "\n", " geometry \n", "0 POLYGON ((-13612245.2954 4538149.388899997, -1... \n", "1 POLYGON ((-13366502.0648 4678945.273900002, -1... \n", "2 POLYGON ((-13472696.4062 4647651.505999997, -1... \n", "3 POLYGON ((-13565003.3072 4798393.384000003, -1... \n", "4 POLYGON ((-13428574.0355 4627724.500200003, -1... \n", "\n", "[5 rows x 22 columns]" ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "newdata2.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And that's all there is to it! Geodata and tabular data from the Census APIs in one place.\n", "\n", "[File an issue](https://github.com/ljwolf/cenpy/issues/new) if you have concerns!" ] } ], "metadata": { "kernelspec": { "display_name": "Analysis", "language": "python", "name": "ana" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.2" } }, "nbformat": 4, "nbformat_minor": 2 }