{ "cells": [ { "cell_type": "markdown", "id": "c0d4e2cb-c1e8-4615-910c-a90e0bf21d38", "metadata": {}, "source": [ "## Accessing NEX-GDDP-CMIP6 data with the Planetary Computer STAC API\n", "\n", "The [NEX-GDDP-CMIP6 dataset](https://planetarycomputer.microsoft.com/dataset/nasa-nex-gddp-cmip6) offers global downscaled climate scenarios derived from the General Circulation Model (GCM) runs conducted under the Coupled Model Intercomparison Project Phase 6 (CMIP6) and across two of the four “Tier 1” greenhouse gas emissions scenarios known as Shared Socioeconomic Pathways (SSPs). The purpose of this dataset is to provide a set of global, high resolution, bias-corrected climate change projections that can be used to evaluate climate change impacts on processes that are sensitive to finer-scale climate gradients and the effects of local topography on climate conditions.\n", "\n", "This dataset uses a Bias-Correction Spatial Disaggregation method to downscale the original General Circulation Model runs to the finer 0.25° resolution. See the [tech note](https://www.nccs.nasa.gov/sites/default/files/NEX-GDDP-CMIP6-Tech_Note.pdf) from the [product homepage](https://www.nccs.nasa.gov/services/data-collections/land-based-products/nex-gddp-cmip6) for more details.\n", "\n", "The NEX-GDDP-CMIP6 files are stored as NetCDF in Azure Blob Storage. Each STAC Item in this collection describes a single year for one scenario for one model." ] }, { "cell_type": "code", "execution_count": 1, "id": "e169c2d3-0751-47d3-90be-c9eac92826cf", "metadata": { "tags": [] }, "outputs": [], "source": [ "import planetary_computer\n", "import xarray as xr\n", "import fsspec\n", "import pystac_client" ] }, { "cell_type": "markdown", "id": "c7a7d42b-aea0-4194-ac19-837121d92cf8", "metadata": {}, "source": [ "### Data access\n", "\n", "The datasets hosted by the Planetary Computer are available from [Azure Blob Storage](https://docs.microsoft.com/en-us/azure/storage/blobs/). 
We'll use [pystac-client](https://pystac-client.readthedocs.io/) to search the Planetary Computer's [STAC API](https://planetarycomputer.microsoft.com/api/stac/v1/docs) for the subset of the data that we care about, and then we'll load the data directly from Azure Blob Storage. We'll specify a `modifier` so that we can access the data stored in the Planetary Computer's private Blob Storage Containers. See [Reading from the STAC API](https://planetarycomputer.microsoft.com/docs/quickstarts/reading-stac/) and [Using tokens for data access](https://planetarycomputer.microsoft.com/docs/concepts/sas/) for more." ] }, { "cell_type": "code", "execution_count": 2, "id": "32470ec3-5395-4337-a967-63c5315468d2", "metadata": { "tags": [] }, "outputs": [], "source": [ "catalog = pystac_client.Client.open(\n", "    \"https://planetarycomputer.microsoft.com/api/stac/v1\",\n", "    modifier=planetary_computer.sign_inplace,\n", ")" ] }, { "cell_type": "markdown", "id": "929c3993-8efa-4c36-a8a9-9e70c54e5f82", "metadata": {}, "source": [ "### Understanding the metadata\n", "\n", "The STAC metadata on the collection, items, and assets describes what data is available." ] }, { "cell_type": "code", "execution_count": 3, "id": "d7e302dc-ae8b-4f0b-85d3-c2f6db1ccded", "metadata": { "tags": [] }, "outputs": [], "source": [ "collection = catalog.get_collection(\"nasa-nex-gddp-cmip6\")" ] }, { "cell_type": "markdown", "id": "883208e2-70d5-4a92-8048-a8cf6f9665fd", "metadata": {}, "source": [ "As usual, the collection object contains information about the dataset including its spatio-temporal extent, license, and so on. We also have information unique to CMIP6. The collection is organized by `{model}-{scenario}-{year}`: there is a single STAC item for each (valid) combination (data is not available for some; see Table 1 in the [tech note](https://www.nccs.nasa.gov/sites/default/files/NEX-GDDP-CMIP6-Tech_Note.pdf) for more). 
The valid values for each of these are stored in the collection's summaries:" ] }, { "cell_type": "code", "execution_count": 4, "id": "375ec285-5149-4ed7-9b0b-6e29f67cd7ab", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "['ACCESS-CM2', 'ACCESS-ESM1-5', 'BCC-CSM2-MR', 'CESM2', 'CESM2-WACCM']" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# List the models. There are ~30 in total.\n", "collection.summaries.get_list(\"cmip6:model\")[:5]" ] }, { "cell_type": "code", "execution_count": 5, "id": "127efd6a-006e-427e-808c-b2d60670f948", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "['historical', 'ssp245', 'ssp585']" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# List the scenarios\n", "collection.summaries.get_list(\"cmip6:scenario\")" ] }, { "cell_type": "markdown", "id": "405ed57d-7b77-4586-a1e1-990c772858e7", "metadata": {}, "source": [ "The \"historical\" scenario covers the years 1950 - 2014 (inclusive). The \"ssp245\" and \"ssp585\" scenarios cover the years 2015 - 2100 (inclusive).\n", "\n", "Each item includes a handful of assets, one per variable, where each asset is a single NetCDF file with the data for that variable for that model-scenario-year." 
] }, { "cell_type": "code", "execution_count": 6, "id": "d9d16ecd-da06-4763-ade4-0b3e58f692ad", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "['hurs', 'huss', 'pr', 'rlds', 'rsds', 'sfcWind', 'tas', 'tasmax', 'tasmin']" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# List the variables\n", "collection.summaries.get_list(\"cmip6:variable\")" ] }, { "cell_type": "markdown", "id": "e39e8df4-82f0-4833-b7b0-f7f63761a38d", "metadata": {}, "source": [ "### Querying the STAC API\n", "\n", "Each STAC item covers the same spatial region, so when using the STAC API you're likely filtering on some combination of time, model, and scenario. For example, we can get the STAC items for the \"ACCESS-CM2\" model for the years 1950 - 1960 (inclusive)." ] }, { "cell_type": "code", "execution_count": 7, "id": "0b87d961-c786-4a3a-9272-6563fe8fcc68", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "11" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "search = catalog.search(\n", "    collections=[\"nasa-nex-gddp-cmip6\"],\n", "    datetime=\"1950/1960\",\n", "    query={\"cmip6:model\": {\"eq\": \"ACCESS-CM2\"}},\n", ")\n", "items = search.item_collection()\n", "len(items)" ] }, { "cell_type": "markdown", "id": "717fbcd5-7cca-4aa5-8c6c-b476c630e5a3", "metadata": {}, "source": [ "Each of these items has nine assets, one per variable, which point to the NetCDF files in Azure Blob Storage:" ] }, { "cell_type": "code", "execution_count": 8, "id": "79f2daa6-3d90-432b-bd58-1d6403679341", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "['pr', 'tas', 'hurs', 'huss', 'rlds', 'rsds', 'tasmax', 'tasmin', 'sfcWind']" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "item = items[0]\n", "list(item.assets)" ] }, { "cell_type": "markdown", "id": "fd77db07-d331-477f-aaf3-b3d1d822f2f6", "metadata": {}, "source": [ "### Loading 
data\n", "\n", "Once you have a STAC item or items, you can load the data directly from Blob Storage using xarray." ] }, { "cell_type": "code", "execution_count": 9, "id": "866239eb-75ea-447e-a14d-cb450822b458", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
<xarray.Dataset>\n", "Dimensions: (time: 366, lat: 600, lon: 1440)\n", "Coordinates:\n", " * time (time) datetime64[ns] 1960-01-01T12:00:00 ... 1960-12-31T12:00:00\n", " * lat (lat) float64 -59.88 -59.62 -59.38 -59.12 ... 89.38 89.62 89.88\n", " * lon (lon) float64 0.125 0.375 0.625 0.875 ... 359.1 359.4 359.6 359.9\n", "Data variables:\n", " hurs (time, lat, lon) float32 ...\n", "Attributes: (12/22)\n", " activity: NEX-GDDP-CMIP6\n", " contact: Dr. Rama Nemani: rama.nemani@nasa.gov, Dr. Bridget...\n", " Conventions: CF-1.7\n", " creation_date: 2021-10-04T13:59:24.689966+00:00\n", " frequency: day\n", " institution: NASA Earth Exchange, NASA Ames Research Center, Mo...\n", " ... ...\n", " history: 2021-10-04T13:59:24.689966+00:00: install global a...\n", " disclaimer: This data is considered provisional and subject to...\n", " external_variables: areacella\n", " cmip6_source_id: ACCESS-CM2\n", " cmip6_institution_id: CSIRO-ARCCSS\n", " cmip6_license: CC-BY-SA 4.0
<xarray.Dataset>\n", "Dimensions: (time: 366, lat: 600, lon: 1440)\n", "Coordinates:\n", " * time (time) datetime64[ns] 1960-01-01T12:00:00 ... 1960-12-31T12:00:00\n", " * lat (lat) float64 -59.88 -59.62 -59.38 -59.12 ... 89.38 89.62 89.88\n", " * lon (lon) float64 0.125 0.375 0.625 0.875 ... 359.1 359.4 359.6 359.9\n", "Data variables:\n", " hurs (time, lat, lon) float32 dask.array<chunksize=(366, 600, 1440), meta=np.ndarray>\n", " huss (time, lat, lon) float32 dask.array<chunksize=(366, 600, 1440), meta=np.ndarray>\n", " pr (time, lat, lon) float32 dask.array<chunksize=(366, 600, 1440), meta=np.ndarray>\n", " rlds (time, lat, lon) float32 dask.array<chunksize=(366, 600, 1440), meta=np.ndarray>\n", " rsds (time, lat, lon) float32 dask.array<chunksize=(366, 600, 1440), meta=np.ndarray>\n", " sfcWind (time, lat, lon) float32 dask.array<chunksize=(366, 600, 1440), meta=np.ndarray>\n", " tas (time, lat, lon) float32 dask.array<chunksize=(366, 600, 1440), meta=np.ndarray>\n", " tasmax (time, lat, lon) float32 dask.array<chunksize=(366, 600, 1440), meta=np.ndarray>\n", " tasmin (time, lat, lon) float32 dask.array<chunksize=(366, 600, 1440), meta=np.ndarray>\n", "Attributes: (12/22)\n", " activity: NEX-GDDP-CMIP6\n", " contact: Dr. Rama Nemani: rama.nemani@nasa.gov, Dr. Bridget...\n", " Conventions: CF-1.7\n", " creation_date: 2021-10-04T13:59:24.689966+00:00\n", " frequency: day\n", " institution: NASA Earth Exchange, NASA Ames Research Center, Mo...\n", " ... ...\n", " history: 2021-10-04T13:59:24.689966+00:00: install global a...\n", " disclaimer: This data is considered provisional and subject to...\n", " external_variables: areacella\n", " cmip6_source_id: ACCESS-CM2\n", " cmip6_institution_id: CSIRO-ARCCSS\n", " cmip6_license: CC-BY-SA 4.0
<xarray.Dataset>\n", "Dimensions: (time: 4018, lat: 600, lon: 1440)\n", "Coordinates:\n", " * lat (lat) float64 -59.88 -59.62 -59.38 -59.12 ... 89.38 89.62 89.88\n", " * lon (lon) float64 0.125 0.375 0.625 0.875 ... 359.1 359.4 359.6 359.9\n", " * time (time) datetime64[ns] 1950-01-01T12:00:00 ... 1960-12-31T12:00:00\n", "Data variables:\n", " hurs (time, lat, lon) float32 dask.array<chunksize=(38, 600, 1440), meta=np.ndarray>\n", " huss (time, lat, lon) float32 dask.array<chunksize=(38, 600, 1440), meta=np.ndarray>\n", " pr (time, lat, lon) float32 dask.array<chunksize=(38, 600, 1440), meta=np.ndarray>\n", " rlds (time, lat, lon) float32 dask.array<chunksize=(38, 600, 1440), meta=np.ndarray>\n", " rsds (time, lat, lon) float32 dask.array<chunksize=(38, 600, 1440), meta=np.ndarray>\n", " sfcWind (time, lat, lon) float32 dask.array<chunksize=(38, 600, 1440), meta=np.ndarray>\n", " tas (time, lat, lon) float32 dask.array<chunksize=(38, 600, 1440), meta=np.ndarray>\n", " tasmax (time, lat, lon) float32 dask.array<chunksize=(38, 600, 1440), meta=np.ndarray>\n", " tasmin (time, lat, lon) float32 dask.array<chunksize=(38, 600, 1440), meta=np.ndarray>\n", "Attributes: (12/22)\n", " Conventions: CF-1.7\n", " activity: NEX-GDDP-CMIP6\n", " cmip6_institution_id: CSIRO-ARCCSS\n", " cmip6_license: CC-BY-SA 4.0\n", " cmip6_source_id: ACCESS-CM2\n", " contact: Dr. Rama Nemani: rama.nemani@nasa.gov, Dr. Bridget...\n", " ... ...\n", " scenario: historical\n", " source: BCSD\n", " title: ACCESS-CM2, r1i1p1f1, historical, global downscale...\n", " tracking_id: cefd5411-1f81-4f48-b9bf-8b38c3ecceb1\n", " variant_label: r1i1p1f1\n", " version: 1.0
<xarray.Dataset>\n", "Dimensions: (time: 23741, lat: 600, lon: 1440)\n", "Coordinates:\n", " * lat (lat) float64 -59.88 -59.62 -59.38 -59.12 ... 89.38 89.62 89.88\n", " * lon (lon) float64 0.125 0.375 0.625 0.875 ... 359.1 359.4 359.6 359.9\n", " * time (time) datetime64[us] 1950-01-01T12:00:00 ... 2014-12-31T12:00:00\n", "Data variables:\n", " hurs (time, lat, lon) float32 dask.array<chunksize=(1, 600, 1440), meta=np.ndarray>\n", " huss (time, lat, lon) float32 dask.array<chunksize=(1, 600, 1440), meta=np.ndarray>\n", " pr (time, lat, lon) float32 dask.array<chunksize=(1, 600, 1440), meta=np.ndarray>\n", " rlds (time, lat, lon) float32 dask.array<chunksize=(1, 600, 1440), meta=np.ndarray>\n", " rsds (time, lat, lon) float32 dask.array<chunksize=(1, 600, 1440), meta=np.ndarray>\n", " sfcWind (time, lat, lon) float32 dask.array<chunksize=(1, 600, 1440), meta=np.ndarray>\n", " tas (time, lat, lon) float32 dask.array<chunksize=(1, 600, 1440), meta=np.ndarray>\n", " tasmax (time, lat, lon) float32 dask.array<chunksize=(1, 600, 1440), meta=np.ndarray>\n", " tasmin (time, lat, lon) float32 dask.array<chunksize=(1, 600, 1440), meta=np.ndarray>\n", "Attributes: (12/22)\n", " Conventions: CF-1.7\n", " activity: NEX-GDDP-CMIP6\n", " cmip6_institution_id: CSIRO-ARCCSS\n", " cmip6_license: CC-BY-SA 4.0\n", " cmip6_source_id: ACCESS-CM2\n", " contact: Dr. Rama Nemani: rama.nemani@nasa.gov, Dr. Bridget...\n", " ... ...\n", " scenario: historical\n", " source: BCSD\n", " title: ACCESS-CM2, r1i1p1f1, historical, global downscale...\n", " tracking_id: 16d27564-470f-41ea-8077-f4cc3efa5bfe\n", " variant_label: r1i1p1f1\n", " version: 1.0