{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## 06 Using specialized packages to grab data\n", "Previously we saw how third-party packages greatly extend what Python can do with little effort. The same applies to downloading data: the `census` package wraps the Census API, so Census data can be downloaded in a few lines of Python.\n", "\n", "First we install the package, and then we use it to grab some data. Note that to use the package you need to sign up for a free Census API key. You can do this here: https://api.census.gov/data/key_signup.html\n", "\n", "Documentation for the package is here: \n", "https://pypi.python.org/pypi/census\n", "https://github.com/datamade/census\n", "\n", "We'll discuss APIs, such as the Census API, next." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Import the 'census' package; install it first if it is missing\n", "try:\n", "    from census import Census\n", "except ImportError:\n", "    !pip install census\n", "    from census import Census" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Import pandas for data handling and plotnine for plotting\n", "import pandas as pd\n", "from plotnine import ggplot, aes, geom_boxplot" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Add your Census API key here (as a string):\n", "key = None" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create the connection to the Census API\n", "c = Census(key, year=2015)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Request the tract name and one ACS 5-year variable for every tract in Maryland\n", "variables = ('NAME', 'B19001_001E')\n", "params = {'for':'tract:*', 'in':'state:24'} #FIPS 24 is Maryland\n", "response = c.acs5.get(variables, params)\n", "# Convert the list of records into a DataFrame and inspect the column types\n", "response = pd.DataFrame(response)\n", "response.dtypes" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Convert the variable to a numeric value\n", "response[variables[1]] = pd.to_numeric(response[variables[1]])\n", "# Box plot of the variable across tracts, grouped by county\n", "thePlot = ggplot(data=response)\n", "thePlot + geom_boxplot(aes(x = 'county', y = variables[1]))" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.12" } }, "nbformat": 4, "nbformat_minor": 2 }