{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## 06 Using specialized packages to grab data\n", "Previously we saw how third-party packages vastly increase what Python can do quickly and fairly easily. Here we see that this applies to downloading data as well: the `census` package makes it easy to download Census data.\n", "\n", "First we need to install the packages, and then we'll use them to grab some data. Note, however, that to use these packages you need to sign up for a free Census API key. You can do this here: https://api.census.gov/data/key_signup.html\n", "\n", "Documentation for the `census` package is here:\n", "\n", "- https://pypi.python.org/pypi/census\n", "- https://github.com/datamade/census\n", "\n", "We'll discuss APIs, such as the Census API, next." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Import the 'census' package; install it if needed\n", "try:\n", "    from census import Census\n", "except ImportError:\n", "    # pip.main() is no longer available in pip 10+, so run pip through the current interpreter\n", "    import subprocess, sys\n", "    subprocess.check_call([sys.executable, '-m', 'pip', 'install', 'census'])\n", "    from census import Census" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# And we'll preview 'ggplot'; import it, installing if needed\n", "try:\n", "    import ggplot as gg\n", "except ImportError:\n", "    # pip.main() is no longer available in pip 10+, so run pip through the current interpreter\n", "    import subprocess, sys\n", "    subprocess.check_call([sys.executable, '-m', 'pip', 'install', 'ggplot'])\n", "    import ggplot as gg" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# And finally, import pandas\n", "import pandas as pd" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Add your Census API key here, as a string:\n", "key = None" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create the connection to the Census API\n", "c = Census(key, year=2015)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Request the tract name and the B19001_001E estimate (total households in ACS table B19001, household income)\n", "# for every Census tract in Maryland (state FIPS code 24)\n", "variables = ('NAME', 'B19001_001E')\n", "params = 
{'for':'tract:*', 'in':'state:24'}\n", "response = c.acs5.get(variables, params)\n", "# Load the returned records into a DataFrame and inspect the column types\n", "response = pd.DataFrame(response)\n", "response.dtypes" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Make sure the estimate column is numeric, then plot its distribution by county\n", "response[variables[1]] = pd.to_numeric(response[variables[1]])\n", "gg.ggplot(response, gg.aes(x = 'county', y = variables[1])) + gg.geom_boxplot()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.5" } }, "nbformat": 4, "nbformat_minor": 2 }