{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# `census_api` Demo Notebook\n", "\n", "This notebook demonstrates the features of the `census_api` package." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "from datascience import *" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Before you Start\n", "\n", "Before you begin working with the US Census API, you need to obtain an API key. You can request one [here](https://api.census.gov/data/key_signup.html).\n", "\n", "Once you have your API key, paste it in the cell below (where it says `YOUR_API_KEY_GOES_HERE`) and then run the cell. This creates a `config.py` file that stores your API key. Git will ignore this file, since its file pattern is listed in the `.gitignore` file." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_api_key = \"YOUR_API_KEY_GOES_HERE\"\n", "\n", "with open(\"config.py\", \"w+\") as f:\n", "    f.write(\"\"\"api_key = \\\"{}\\\"\"\"\".format(my_api_key))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The next code cell will import this file so that your API key can be used in this notebook." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Initializing the Query Class\n", "\n", "This package runs all queries through the `CensusQuery` class. To instantiate the class, you need your API key and the dataset you want to query. **Currently, this package only supports querying the ACS1, ACS5, and SF1 datasets**. When instantiating the class, you can also optionally provide a year to query data for and an [output type](#Output-Types)." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import config\n", "import census_api\n", "\n", "# create the class instance\n", "c = census_api.CensusQuery(config.api_key, \"acs5\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Making Queries\n", "\n", "To query the API, use the `CensusQuery.query` method. The parameters are listed below." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "| Parameter | Type | Description |\n", "|-----|-----|-----|\n", "| `variables` | `list` | List of variables to extract from the API. For variable identifiers, find the dataset you're querying on [this page](https://api.census.gov/data.html) and click on `variables` in its row. |\n", "| `state` | `str` | The 2-letter abbreviation of the state you want data for. |\n", "| `county` | `str` | Optional. The name of the county you want data for. Defaults to all. |\n", "| `tract` | `str` | Optional. The FIPS code of the tract you want data for. Defaults to all. |\n", "| `year` | `int` | Optional. The year you want data for. If provided, it overrides the `year` given to the instance of `CensusQuery`. See [Years](#Years) for more information. |" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The example below queries `NAME` and `B00001_001E` for Alameda County, California, for 2014." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "output_2014 = c.query([\"NAME\", \"B00001_001E\"], \"CA\", county=\"Alameda\", year=2014)\n", "output_2014.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Years\n", "\n", "There are two ways to define the year that you want to query: in the class instance, or in the `CensusQuery.query` call. If you define it in the class instance, e.g. with\n", "\n", "```python\n", "c_2015 = census_api.CensusQuery(config.api_key, \"acs5\", year=2015)\n", "```\n", "\n", "then you don't need to provide it when you call `CensusQuery.query`. 
_However_, if you do provide it to `CensusQuery.query`, the year for the class instance will be ignored. So, if I were to call\n", "\n", "```python\n", "c_2015.query([\"NAME\"], \"CA\", year=2014)\n", "```\n", "\n", "I would get 2014 data, not 2015 data.\n", "\n", "If you don't define the year in the class instance, you _must_ define it in the `CensusQuery.query` call, or else your output will be empty." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Output Types\n", "\n", "The `CensusQuery` class can output your data in one of two ways: as a `pandas` DataFrame or as a `datascience` Table. Set the `out` argument when instantiating the class to choose between them: `\"pd\"` (the default) returns a `pandas` DataFrame, and `\"ds\"` returns a `datascience` Table." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ds_output = census_api.CensusQuery(config.api_key, \"acs5\", out=\"ds\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, if I were to make the same query as above, the output would be of class `datascience.tables.Table`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# original instance\n", "print(type(c.query([\"NAME\", \"B00001_001E\"], \"CA\", county=\"Alameda\", year=2014)))\n", "\n", "# datascience instance\n", "print(type(ds_output.query([\"NAME\", \"B00001_001E\"], \"CA\", county=\"Alameda\", year=2014)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## More Information\n", "\n", "For more information about the Census API, visit [https://www.census.gov/developers/](https://www.census.gov/developers/). 