{ "metadata": { "name": "" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "Goal" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's get set up to use the US Census API:\n", "\n", " http://www.census.gov/developers/" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Things we'd like to be able to do:\n", "\n", "* calculate the population of California.\n", "* then calculate the population of every geographic entity going down to census block if possible.\n", "* for a given geographic unit, can we get the racial/ethnic breakdown?\n", "\n", "It's useful to make ties to the county-type calculation we do with the downloaded census files." ] }, { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "Installing census, a useful Python module" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Dependency: to start with -- let's use the Python module: https://pypi.python.org/pypi/census/\n", "\n", " pip install -U census" ] }, { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "Getting and activating an API key" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To use the census API, you will need an API key\n", "\n", "* fill out form at http://www.census.gov/developers/tos/key_request.html \n", "\n", "\"Your request for a new API key has been successfully submitted. Please check your email. In a few minutes you should receive a message with instructions on how to activate your new key.\"\n", "\n", "* click on link you'll get a link of the form http://api.census.gov/data/KeySignup?validate={key} -- where {key} is the one the Census Bureau will send you.\n", "\n", "Then create a settings.py in the same directory as this notebook (or somewhere else in your Python path) to hold `settings.CENSUS_KEY`. (I prefer this approach over directly exposing your API key in the notebook code.)" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# This cell should run successfully if you have a string set up to represent your census key\n", "\n", "try:\n", " import settings\n", " assert type(settings.CENSUS_KEY) == str or type(settings.CENSUS_KEY) == unicode\n", "except Exception as e:\n", " print \"error in importing settings to get at settings.CENSUS_KEY\", e" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "us.states module" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# let's figure out a bit about the us module, in particular, us.states\n", "# https://github.com/unitedstates/python-us\n", "\n", "from us import states\n", "assert states.CA.fips == u'06'" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "# set up your census object\n", "# example from https://github.com/sunlightlabs/census\n", "\n", "from census import Census\n", "from us import states\n", "\n", "c = Census(settings.CENSUS_KEY)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "for (i, state) in enumerate(states.STATES):\n", " print i, state.name, state.fips" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "Formulating URL requests to the API explicitly" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import requests" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "# get the total population of all states\n", "url = \"http://api.census.gov/data/2010/sf1?key={key}&get=P0010001,NAME&for=state:*\".format(key=settings.CENSUS_KEY)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "r = requests.get(url)" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "EXERCISE" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Show how to calculate the total population of the USA, including and excluding Puerto Rico. (I don't know why Puerto Rico is included but not other [unincorporated territories](https://en.wikipedia.org/wiki/Unincorporated_territories_of_the_United_States)" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# FILL IN WITH YOUR CODE\n", "\n", "\n" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "Next Steps: Focusing on sf1 + census" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "How to map out the geographical hierachy and pull out total population figures?\n", "\n", " 1. Nation\n", " 1. Regions\n", " 1. Divisions\n", " 1. State\n", " 1. County\n", " 1. Census Tract\n", " 1. Block Group\n", " 1. Census Block\n", " \n", "Questions\n", " \n", "* What identifiers are used for these various geographic entities? \n", "* Can we get an enumeration of each of these entities?\n", "* How to figure out which census tract, block group, census block one is in?\n", " \n" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Total Population of California" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[2010 Census Summary File 1](http://www.census.gov/prod/cen2010/doc/sf1.pdf)\n", "\n", "P0010001 is found in [2010 SF1 API Variables \\[XML\\]](http://www.census.gov/developers/data/sf1.xml) = \"total population\"" ] }, { "cell_type": "code", "collapsed": false, "input": [ "c.sf1.get(('NAME', 'P0010001'), {'for': 'state:%s' % states.CA.fips})" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "\"population of California: {0}\".format(\n", " int(c.sf1.get(('NAME', 'P0010001'), {'for': 'state:%s' % states.CA.fips})[0]['P0010001']))" ], "language": "python", "metadata": {}, "outputs": [] } ], "metadata": {} } ] }