{ "cells": [ { "cell_type": "markdown", "id": "856f4caa", "metadata": {}, "source": [ "## Accessing gNATSGO data with the Planetary Computer STAC API\n", "\n", "The gridded National Soil Survey Geographic Database (gNATSGO) is a USDA-NRCS Soil & Plant Science Division (SPSD) composite database that provides complete coverage of the best available soils information for all areas of the United States and Island Territories.\n", "\n", "gNATSGO data in the Planetary Computer is provided in two complimentary collections: one containing raster data, and one containing the associated tabular data. \n", "\n", "The raster catalog a number of layers are pre-summarized to the map unit level using best-practice generalization methods intended to meet the needs of most users, as well as a layer which provides the map-unit key (`mukey`) which allows users to join tables from the tables collection for more advanced soil mapping.\n", "\n", "In this example, we will explore the pre-summarized raster assets, and perform a simple join to the tabular data." ] }, { "cell_type": "markdown", "id": "099052d9", "metadata": {}, "source": [ "### Setup environment" ] }, { "cell_type": "code", "execution_count": 1, "id": "88d8880b", "metadata": {}, "outputs": [], "source": [ "import planetary_computer\n", "import numpy as np\n", "import pandas as pd\n", "import rioxarray\n", "import xarray\n", "import pystac_client" ] }, { "cell_type": "markdown", "id": "78776f15", "metadata": {}, "source": [ "### Rasters \n", "\n", "The datasets hosted by the Planetary Computer are available from [Azure Blob Storage](https://docs.microsoft.com/en-us/azure/storage/blobs/). We'll use [pystac-client](https://pystac-client.readthedocs.io/) to search the Planetary Computer's [STAC API](https://planetarycomputer.microsoft.com/api/stac/v1/docs) for the subset of the data that we care about, and then we'll load the data directly from Azure Blob Storage. We'll specify a `modifier` so that we can access the data stored in the Planetary Computer's private Blob Storage Containers. See [Reading from the STAC API](https://planetarycomputer.microsoft.com/docs/quickstarts/reading-stac/) and [Using tokens for data access](https://planetarycomputer.microsoft.com/docs/concepts/sas/) for more.\n", "\n", "Let's start by finding the tile which contains Marshall County, IA and inspecting the stac item for which assets are available." ] }, { "cell_type": "code", "execution_count": 2, "id": "fbeceb40", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Returned 1 item\n", "['mukey', 'aws0_5', 'soc0_5', 'tk0_5a', 'tk0_5s', 'aws0_20', 'aws0_30', 'aws5_20', 'soc0_20', 'soc0_30', 'soc5_20', 'tk0_20a', 'tk0_20s', 'tk0_30a', 'tk0_30s', 'tk5_20a', 'tk5_20s', 'aws0_100', 'aws0_150', 'aws0_999', 'aws20_50', 'droughty', 'nccpi3sg', 'soc0_100', 'soc0_150', 'soc0_999', 'soc20_50', 'tk0_100a', 'tk0_100s', 'tk0_150a', 'tk0_150s', 'tk0_999a', 'tk0_999s', 'tk20_50a', 'tk20_50s', 'aws50_100', 'musumcpct', 'nccpi3all', 'nccpi3cot', 'nccpi3soy', 'pwsl1pomu', 'rootznaws', 'rootznemc', 'soc50_100', 'tk50_100a', 'tk50_100s', 'aws100_150', 'aws150_999', 'musumcpcta', 'musumcpcts', 'nccpi3corn', 'pctearthmc', 'soc100_150', 'soc150_999', 'tk100_150a', 'tk100_150s', 'tk150_999a', 'tk150_999s', 'tilejson', 'rendered_preview']\n" ] } ], "source": [ "catalog = pystac_client.Client.open(\n", " \"https://planetarycomputer.microsoft.com/api/stac/v1\",\n", " modifier=planetary_computer.sign_inplace,\n", ")\n", "\n", "marshall_county = [-92.99900, 42.03580]\n", "search = catalog.search(\n", " collections=[\"gnatsgo-rasters\"],\n", " intersects={\"type\": \"Point\", \"coordinates\": marshall_county},\n", ")\n", "items = list(search.get_items())\n", "print(f\"Returned {len(items)} item\")\n", "raster_item = items[0]\n", "print(list(raster_item.assets.keys()))" ] }, { "cell_type": "markdown", "id": "76579267", "metadata": {}, "source": [ "We can open the asset for Soil Organic Carbon in the top 20cm using `rioxarray` and display the map." ] }, { "cell_type": "code", "execution_count": 3, "id": "ec9f53f4", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Soil organic carbon stock estimate (SOC) in standard zone 2 (0-20 cm depth). The concentration of organic carbon present in the soil expressed in grams C per square meter to a depth of 20 cm. NULL values are presented where data are incomplete or not available.\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/html": [ "" ], "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "signed_asset = raster_item.assets[\"soc0_20\"]\n", "data = (\n", " rioxarray.open_rasterio(signed_asset.href)\n", " .squeeze()\n", " .drop(\"band\")\n", " .coarsen({\"y\": 8, \"x\": 8})\n", " .mean()\n", ")\n", "\n", "print(raster_item.assets[\"soc0_20\"].description)\n", "data.plot.imshow(vmin=0, add_labels=False)" ] }, { "cell_type": "markdown", "id": "3b3e277b", "metadata": {}, "source": [ "### Tabular data\n", "\n", "Beyond the pre-summarized rasters provided, gNATSGO includes around 70 tables containing a wide variety of data on the soil, crops, and more. These tables can be joined to the `mukey` raster asset to map additional quantities. \n", "\n", "As a simple example, we will load the `muaggatt` table which provides some additional quantities that have already been aggregated to the map-unit scale. First, find the item and use `planetary_computer` to `sign` it." ] }, { "cell_type": "code", "execution_count": 4, "id": "801763a2", "metadata": {}, "outputs": [], "source": [ "muaggatt_item = catalog.get_collection(\"gnatsgo-tables\").get_item(\"muaggatt\")" ] }, { "cell_type": "markdown", "id": "9e79a4a1", "metadata": {}, "source": [ "The tables are stored in parquet format, which allows us to efficiently load only the columns we are interested in. Let's load some data on slope (`slopegraddcp`) into a pandas dataframe." ] }, { "cell_type": "code", "execution_count": 5, "id": "49fd8d01", "metadata": {}, "outputs": [], "source": [ "muaggatt_table = pd.read_parquet(\n", " muaggatt_item.assets[\"data\"].href,\n", " columns=[\"mukey\", \"slopegraddcp\"],\n", " storage_options=muaggatt_item.assets[\"data\"].extra_fields[\"table:storage_options\"],\n", " engine=\"pyarrow\",\n", ")\n", "muaggatt_table = muaggatt_table.set_index(\"mukey\")\n", "muaggatt_table[\"slopegraddcp\"] = muaggatt_table[\"slopegraddcp\"].fillna(0)" ] }, { "cell_type": "markdown", "id": "e2a941d1", "metadata": {}, "source": [ "Now we can read the `mukey` raster for our Marshall County tile and create a derived raster by joining to the slope column." ] }, { "cell_type": "code", "execution_count": 6, "id": "d5d63f82", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The difference is elevation between two points, expressed as a percentage of the distance between those points. This column displays the slope gradient of the dominant component of the map unit based on composition percentage.\n" ] }, { "data": { "text/plain": [ "" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" }, { "data": { "text/html": [ "" ], "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "asset = planetary_computer.sign(raster_item.assets[\"mukey\"])\n", "mukey = rioxarray.open_rasterio(asset.href).squeeze().drop(\"band\")\n", "\n", "unique_mukeys, inverse = np.unique(mukey, return_inverse=True)\n", "slope_raster = (\n", " muaggatt_table[\"slopegraddcp\"]\n", " .groupby(level=0)\n", " .first()\n", " .reindex(unique_mukeys, fill_value=0)\n", " .to_numpy()[inverse]\n", " .reshape(mukey.shape)\n", ")\n", "\n", "col = next(\n", " col\n", " for col in muaggatt_item.properties[\"table:columns\"]\n", " if col[\"name\"] == \"slopegraddcp\"\n", ")\n", "print(col[\"description\"])\n", "xarray.DataArray(slope_raster).plot.imshow()" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.13" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 5 }