{ "cells": [ { "cell_type": "markdown", "id": "b5ad9932-76e1-4a4a-bb34-afebaf20ccd8", "metadata": {}, "source": [ "## Accessing Forest Inventory and Analysis data with the Planetary Computer STAC API\n", "\n", "This notebook demonstrates accessing [Forest Inventory and Analysis](https://planetarycomputer.microsoft.com/dataset/fia) (FIA) data from the Planetary Computer STAC API.\n", "\n", "The Forest Inventory and Analysis collection contains many tables, and each STAC table corresponds to one STAC item in the [FIA collection](http://planetarycomputer.microsoft.com/api/stac/v1/collections/fia). In this example, we'll use a few of the tables to estimate the total amount of aboveground carbon, in pounds, per US county.\n", "\n", "This example builds on the [plot estimation](https://rfia.netlify.app/courses/plt_est/) example from the [rfia](https://rfia.netlify.app/) package." ] }, { "cell_type": "code", "execution_count": 1, "id": "b11609a3-e138-47c9-bcc7-7bae11405bca", "metadata": {}, "outputs": [], "source": [ "from cartopy import crs as ccrs\n", "import dask_gateway\n", "import dask_geopandas\n", "import dask.dataframe as dd\n", "import geopandas\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import pandas as pd\n", "import pystac_client\n", "import planetary_computer" ] }, { "cell_type": "markdown", "id": "be0c70f3-1124-44c5-96bc-d22107062384", "metadata": {}, "source": [ "The `tree` table below is relatively large, so we'll process it in parallel on a Dask cluster. The example will still run without a cluster, it will just take longer." ] }, { "cell_type": "code", "execution_count": 2, "id": "3e8cca5f-d068-4ea8-baff-45dcb485ed53", "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "fc02b078c07b4efb83440164231f0360", "version_major": 2, "version_minor": 0 }, "text/plain": [ "VBox(children=(HTML(value='

GatewayCluster

'), HBox(children=(HTML(value='\\n
\\n\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
PLT_CNCONDIDTREEDRYBIO_AGCARBON_AGTPA_UNADJCNSTATECDCOUNTYCDbiocarbon
npartitions=160
int64int64int64float64float64float64int64int64int64float64float64
.................................
....................................
.................................
.................................
\n", "
\n", "
Dask Name: assign, 1921 tasks
" ], "text/plain": [ "Dask DataFrame Structure:\n", " PLT_CN CONDID TREE DRYBIO_AG CARBON_AG TPA_UNADJ CN STATECD COUNTYCD bio carbon\n", "npartitions=160 \n", " int64 int64 int64 float64 float64 float64 int64 int64 int64 float64 float64\n", " ... ... ... ... ... ... ... ... ... ... ...\n", "... ... ... ... ... ... ... ... ... ... ... ...\n", " ... ... ... ... ... ... ... ... ... ... ...\n", " ... ... ... ... ... ... ... ... ... ... ...\n", "Dask Name: assign, 1921 tasks" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = tree.merge(plot.assign(PLT_CN=plot.CN).compute(), on=\"PLT_CN\").assign(\n", " bio=lambda df: df.DRYBIO_AG * df.TPA_UNADJ / 2000,\n", " carbon=lambda df: df.CARBON_AG * df.TPA_UNADJ / 2000,\n", ")\n", "df" ] }, { "cell_type": "markdown", "id": "efb42525-1800-4612-9c7a-1480c5da3ce0", "metadata": {}, "source": [ "### Compute per-county summaries\n", "\n", "The `df` dataframe now includes the state and county FIPS codes, and the (adjusted) aboveground carbon and biomass. We'll group by the geographic boundaries and sum the aboveground carbon and biomass." ] }, { "cell_type": "code", "execution_count": 7, "id": "423396fb-c05f-43db-bd1e-df0762a363f2", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 259 ms, sys: 36.5 ms, total: 296 ms\n", "Wall time: 35.2 s\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
biocarbonSTATECOUNTY
09130.1799464565.09000001001
127163.82678513565.77134001003
215567.8438067774.59322001005
316643.5338288321.76695401007
48742.4983664364.87686701009
\n", "
" ], "text/plain": [ " bio carbon STATE COUNTY\n", "0 9130.179946 4565.090000 01 001\n", "1 27163.826785 13565.771340 01 003\n", "2 15567.843806 7774.593220 01 005\n", "3 16643.533828 8321.766954 01 007\n", "4 8742.498366 4364.876867 01 009" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%time\n", "result = (\n", " df.groupby([\"STATECD\", \"COUNTYCD\"])[[\"bio\", \"carbon\"]]\n", " .sum()\n", " .compute()\n", " .reset_index()\n", " .assign(\n", " STATE=lambda df: df[\"STATECD\"].astype(\"string\").str.pad(2, fillchar=\"0\"),\n", " COUNTY=lambda df: df[\"COUNTYCD\"].astype(\"string\").str.pad(3, fillchar=\"0\"),\n", " )\n", " .drop(columns=[\"STATECD\", \"COUNTYCD\"])\n", ")\n", "result.head()" ] }, { "cell_type": "markdown", "id": "5652d76a-fc0e-4de2-86b2-63c884471d39", "metadata": {}, "source": [ "### Plot the results\n", "\n", "Now we'll make a chloropleth for the results. We just need to join in the actual geographic boundaries of the datasets, which we can get with geopandas." ] }, { "cell_type": "code", "execution_count": 8, "id": "aaa3d6e1-7731-4446-906e-5e8d2c63c5af", "metadata": {}, "outputs": [], "source": [ "census_item = catalog.get_collection(\"us-census\").get_item(\n", " \"2020-cb_2020_us_county_500k\"\n", ")\n", "\n", "counties = (\n", " dask_geopandas.read_parquet(\n", " census_item.assets[\"data\"].href,\n", " storage_options=census_item.assets[\"data\"].extra_fields[\n", " \"table:storage_options\"\n", " ],\n", " columns=[\"STATEFP\", \"COUNTYFP\", \"geometry\"],\n", " ).rename(columns={\"STATEFP\": \"STATE\", \"COUNTYFP\": \"COUNTY\"})\n", ").compute()" ] }, { "cell_type": "markdown", "id": "4cad86c0-c87b-46f2-be90-c284935467b3", "metadata": {}, "source": [ "Finally, we'll slice the data down to the continental United States (the dataset covers Hawaii, Alaska, and several territories)." ] }, { "cell_type": "code", "execution_count": 9, "id": "ce9de40f-c33c-4d54-8963-7ddb15fa185a", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
biocarbonSTATECOUNTYgeometry
09130.1799464565.09000001001POLYGON ((-86.92120 32.65754, -86.92035 32.658...
127163.82678513565.77134001003POLYGON ((-88.02858 30.22676, -88.02399 30.230...
215567.8438067774.59322001005POLYGON ((-85.74803 31.61918, -85.74544 31.618...
316643.5338288321.76695401007POLYGON ((-87.42194 33.00338, -87.31854 33.006...
48742.4983664364.87686701009POLYGON ((-86.96336 33.85822, -86.95967 33.857...
\n", "
" ], "text/plain": [ " bio carbon STATE COUNTY \\\n", "0 9130.179946 4565.090000 01 001 \n", "1 27163.826785 13565.771340 01 003 \n", "2 15567.843806 7774.593220 01 005 \n", "3 16643.533828 8321.766954 01 007 \n", "4 8742.498366 4364.876867 01 009 \n", "\n", " geometry \n", "0 POLYGON ((-86.92120 32.65754, -86.92035 32.658... \n", "1 POLYGON ((-88.02858 30.22676, -88.02399 30.230... \n", "2 POLYGON ((-85.74803 31.61918, -85.74544 31.618... \n", "3 POLYGON ((-87.42194 33.00338, -87.31854 33.006... \n", "4 POLYGON ((-86.96336 33.85822, -86.95967 33.857... " ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gdf = geopandas.GeoDataFrame(pd.merge(result, counties, on=[\"STATE\", \"COUNTY\"]))\n", "df_conus = gdf.cx[-124.784:-66.951, 24.744:49.346]\n", "df_conus.head()" ] }, { "cell_type": "markdown", "id": "e459f4d4-0aa7-4920-a6dc-7d12c9d64254", "metadata": {}, "source": [ "Finally, we'll plot the (log) of the estimated carbon stored above ground by the trees." ] }, { "cell_type": "code", "execution_count": 10, "id": "a9886f61-470b-44d7-96da-ff459f7cd9a3", "metadata": {}, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "crs = ccrs.LambertConformal()\n", "fig, ax = plt.subplots(subplot_kw={\"projection\": crs}, figsize=(16, 9))\n", "df_conus.assign(carbon=np.log(df_conus.carbon + 1)).to_crs(crs.proj4_init).plot(\n", " column=\"carbon\",\n", " cmap=\"Greens\",\n", " edgecolor=\"k\",\n", " scheme=\"natural_breaks\",\n", " k=8,\n", " ax=ax,\n", " linewidths=0.1,\n", " legend=True,\n", ")\n", "\n", "# Shift the legend\n", "bbox = ax.legend_.get_bbox_to_anchor().transformed(ax.transAxes.inverted())\n", "bbox.x0 += 0.075\n", "bbox.x1 += 0.075\n", "bbox.y0 -= 0.4\n", "bbox.y1 -= 0.4\n", "ax.legend_.set_bbox_to_anchor(bbox)\n", "\n", "ax.axis(\"off\");" ] }, { "cell_type": "markdown", "id": "34527ba3-e795-40de-8cc7-9e0f2fd6a268", "metadata": {}, "source": [ "### Next Steps\n", "\n", "Now that you've an introduction to the Forest Inventory and Analysis dataset, learn more with\n", "\n", "* The [Reading tabular data quickstart](https://planetarycomputer.microsoft.com/docs/quickstarts/reading-tabular-data/) for an introduction to tabular data on the Planeatry Computer" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.10" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 5 }