{ "cells": [ { "cell_type": "markdown", "id": "db9799be-8baa-4035-b0ef-e861c99f4c70", "metadata": {}, "source": [ "## Computing climate indicators with xclim\n", "\n", "The Climate Impact Lab Global Downscaled Projections for Climate Impacts Research (CIL-GDPCIR) collections contain bias-corrected and downscaled 1/4° CMIP6 projections for temperature and precipitation.\n", "\n", "See the project homepage for more information: [github.com/ClimateImpactLab/downscaleCMIP6](https://github.com/ClimateImpactLab/downscaleCMIP6).\n", "\n", "This tutorial covers constructing a time series across the CMIP:historical and ScenarioMIP:ssp126 experiments, and computing climate indicators using the [xclim](https://xclim.readthedocs.io/) package. Additional tutorials are available at [github.com/microsoft/PlanetaryComputerExamples](https://github.com/microsoft/PlanetaryComputerExamples/blob/main/datasets/cil-gdpcir)." ] }, { "cell_type": "code", "execution_count": 1, "id": "f6c291ba-5b61-41d9-bde9-3e49afddebaf", "metadata": {}, "outputs": [], "source": [ "# required to locate and authenticate with the STAC collection\n", "import planetary_computer\n", "import pystac_client\n", "\n", "# required to load a zarr array using xarray\n", "import xarray as xr\n", "\n", "# climate indicators with xclim\n", "import xclim.indicators\n", "\n", "# optional imports used in this notebook\n", "from dask.diagnostics import ProgressBar" ] }, { "cell_type": "markdown", "id": "77d64979-7b72-4433-af1a-3edfe801dc77", "metadata": {}, "source": [ "### Building a joint historical and projection time series\n", "\n", "Let's work with the FGOALS-g3 historical and SSP1-2.6 simulations. We'll use the Planetary Computer's STAC API to search for the items we want, which contain all the information necessary to load the data with xarray.\n", "\n", "The FGOALS-g3 data are available under the `cil-gdpcir-cc0` collection (which you can check in the `cmip6:source_id` summary of the collection)."
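, "\n", "To verify this yourself, you can read the model list straight from the collection's STAC summaries. A minimal sketch (assuming the `catalog` client opened in the next cell):\n", "\n", "```python\n", "# list the CMIP6 models (source IDs) included in the CC0-licensed GDPCIR collection\n", "collection = catalog.get_collection(\"cil-gdpcir-cc0\")\n", "print(collection.summaries.get_list(\"cmip6:source_id\"))\n", "```"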
] }, { "cell_type": "code", "execution_count": 2, "id": "998e6893-4ddd-4be1-bf60-fd1139d53c95", "metadata": {}, "outputs": [], "source": [ "catalog = pystac_client.Client.open(\n", "    \"https://planetarycomputer.microsoft.com/api/stac/v1\",\n", "    modifier=planetary_computer.sign_inplace,\n", ")\n", "collection_cc0 = catalog.get_collection(\"cil-gdpcir-cc0\")\n", "items = catalog.search(\n", "    collections=[\"cil-gdpcir-cc0\"],\n", "    query={\n", "        \"cmip6:source_id\": {\"eq\": \"FGOALS-g3\"},\n", "        \"cmip6:experiment_id\": {\"in\": [\"historical\", \"ssp126\"]},\n", "    },\n", ").get_all_items()" ] }, { "cell_type": "code", "execution_count": 3, "id": "771b9b39-be59-4e74-84ba-f09b21760043", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['cil-gdpcir-CAS-FGOALS-g3-ssp126-r1i1p1f1-day',\n", " 'cil-gdpcir-CAS-FGOALS-g3-historical-r1i1p1f1-day']" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[item.id for item in items]" ] }, { "cell_type": "markdown", "id": "79c51100-9d7a-4cda-8157-fe1a65b00ff1", "metadata": {}, "source": [ "Retrieve signed object URLs by authenticating with the Planetary Computer." ] }, { "cell_type": "code", "execution_count": 4, "id": "f3c8aba0-4365-4fb6-97fa-e3d1deb03011", "metadata": {}, "outputs": [], "source": [ "# use the planetary computer API to sign the assets\n", "signed_items = planetary_computer.sign(items)\n", "\n", "# select this variable ID for all models in the collection\n", "variable_id = \"tasmin\"\n", "\n", "# get the API key and other important keyword arguments\n", "open_kwargs = signed_items[0].assets[variable_id].extra_fields[\"xarray:open_kwargs\"]" ] }, { "cell_type": "markdown", "id": "0b9600a8-0270-41fe-b689-caf17a47254f", "metadata": { "tags": [] }, "source": [ "### Reading a single variable" ] }, { "cell_type": "code", "execution_count": 5, "id": "b4168174-629a-41a7-954d-f14471a7d571", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<xarray.Dataset>\n",
       "Dimensions:  (lat: 720, lon: 1440, time: 55115)\n",
       "Coordinates:\n",
       "  * lat      (lat) float64 -89.88 -89.62 -89.38 -89.12 ... 89.38 89.62 89.88\n",
       "  * lon      (lon) float64 -179.9 -179.6 -179.4 -179.1 ... 179.4 179.6 179.9\n",
       "  * time     (time) object 1950-01-01 12:00:00 ... 2100-12-31 12:00:00\n",
       "Data variables:\n",
       "    tasmin   (time, lat, lon) float32 dask.array<chunksize=(365, 360, 360), meta=np.ndarray>\n",
       "Attributes: (12/40)\n",
       "    Conventions:                  CF-1.7 CMIP-6.2\n",
       "    contact:                      climatesci@rhg.com\n",
       "    data_specs_version:           01.00.31\n",
       "    dc6_bias_correction_method:   Quantile Delta Method (QDM)\n",
       "    dc6_citation:                 Please refer to https://github.com/ClimateI...\n",
       "    dc6_creation_date:            2022-01-25\n",
       "    ...                           ...\n",
       "    source_type:                  AOGCM\n",
       "    sub_experiment:               none\n",
       "    sub_experiment_id:            none\n",
       "    table_id:                     day\n",
       "    variable_id:                  tasmin\n",
       "    variant_label:                r1i1p1f1
" ], "text/plain": [ "\n", "Dimensions: (lat: 720, lon: 1440, time: 55115)\n", "Coordinates:\n", " * lat (lat) float64 -89.88 -89.62 -89.38 -89.12 ... 89.38 89.62 89.88\n", " * lon (lon) float64 -179.9 -179.6 -179.4 -179.1 ... 179.4 179.6 179.9\n", " * time (time) object 1950-01-01 12:00:00 ... 2100-12-31 12:00:00\n", "Data variables:\n", " tasmin (time, lat, lon) float32 dask.array\n", "Attributes: (12/40)\n", " Conventions: CF-1.7 CMIP-6.2\n", " contact: climatesci@rhg.com\n", " data_specs_version: 01.00.31\n", " dc6_bias_correction_method: Quantile Delta Method (QDM)\n", " dc6_citation: Please refer to https://github.com/ClimateI...\n", " dc6_creation_date: 2022-01-25\n", " ... ...\n", " source_type: AOGCM\n", " sub_experiment: none\n", " sub_experiment_id: none\n", " table_id: day\n", " variable_id: tasmin\n", " variant_label: r1i1p1f1" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds = xr.open_mfdataset(\n", " [item.assets[variable_id].href for item in signed_items],\n", " combine=\"by_coords\",\n", " combine_attrs=\"drop_conflicts\",\n", " parallel=True,\n", " **open_kwargs,\n", ")\n", "\n", "ds" ] }, { "cell_type": "markdown", "id": "f49c446d-2dd9-454f-9026-33dff69a33dc", "metadata": {}, "source": [ "Let's take a look at the variable `tasmin`. Note the summary provided by the dask preview. This array is 213 GB in total, in 180 MB chunks. The data is chunked such that each year and 90 degrees of latitude and longitude form a chunk.\n", "\n", "To read in the full time series for a single point, you'd need to work through 180.45 MB/chunk * 151 annual chunks = 27 GB of data. This doesn't all need to be held in memory, but it gives a sense of what the operation might look like in terms of download & compute time." ] }, { "cell_type": "code", "execution_count": 6, "id": "84df00e5-933e-4ef0-a926-6ca571b5ca7d", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<xarray.DataArray 'tasmin' (time: 55115, lat: 720, lon: 1440)>\n",
       "dask.array<concatenate, shape=(55115, 720, 1440), dtype=float32, chunksize=(365, 360, 360), chunktype=numpy.ndarray>\n",
       "Coordinates:\n",
       "  * lat      (lat) float64 -89.88 -89.62 -89.38 -89.12 ... 89.38 89.62 89.88\n",
       "  * lon      (lon) float64 -179.9 -179.6 -179.4 -179.1 ... 179.4 179.6 179.9\n",
       "  * time     (time) object 1950-01-01 12:00:00 ... 2100-12-31 12:00:00\n",
       "Attributes:\n",
       "    cell_measures:  area: areacella\n",
       "    cell_methods:   area: mean time: minimum (interval: 10 minutes)\n",
       "    comment:        minimum near-surface (usually, 2 meter) air temperature (...\n",
       "    coordinates:    height\n",
       "    long_name:      Daily Minimum Near-Surface Air Temperature\n",
       "    standard_name:  air_temperature\n",
       "    units:          K
" ], "text/plain": [ "\n", "dask.array\n", "Coordinates:\n", " * lat (lat) float64 -89.88 -89.62 -89.38 -89.12 ... 89.38 89.62 89.88\n", " * lon (lon) float64 -179.9 -179.6 -179.4 -179.1 ... 179.4 179.6 179.9\n", " * time (time) object 1950-01-01 12:00:00 ... 2100-12-31 12:00:00\n", "Attributes:\n", " cell_measures: area: areacella\n", " cell_methods: area: mean time: minimum (interval: 10 minutes)\n", " comment: minimum near-surface (usually, 2 meter) air temperature (...\n", " coordinates: height\n", " long_name: Daily Minimum Near-Surface Air Temperature\n", " standard_name: air_temperature\n", " units: K" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds.tasmin" ] }, { "cell_type": "markdown", "id": "c0e59406-a48a-491e-a230-f3bd154617a1", "metadata": {}, "source": [ "### Applying a climate indicator from xclim" ] }, { "cell_type": "markdown", "id": "f50fc975-e016-4921-8b1a-311295bdbd19", "metadata": {}, "source": [ "The [`xclim`](https://xclim.readthedocs.io) package provides a large number of useful [indicators](https://xclim.readthedocs.io/en/stable/indicators.html) for analyzing climate data. Here, we'll use the Atmospheric Indicator: [Frost Days (`xclim.indicators.atmos.frost_days`)](https://xclim.readthedocs.io/en/stable/indicators_api.html#xclim.indicators.atmos.frost_days):" ] }, { "cell_type": "code", "execution_count": 7, "id": "ac026565-4afa-429e-9047-927b856f1a38", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<xarray.DataArray 'frost_days' (time: 151, lat: 720, lon: 1440)>\n",
       "dask.array<where, shape=(151, 720, 1440), dtype=float64, chunksize=(1, 360, 360), chunktype=numpy.ndarray>\n",
       "Coordinates:\n",
       "  * time     (time) object 1950-01-01 00:00:00 ... 2100-01-01 00:00:00\n",
       "  * lat      (lat) float64 -89.88 -89.62 -89.38 -89.12 ... 89.38 89.62 89.88\n",
       "  * lon      (lon) float64 -179.9 -179.6 -179.4 -179.1 ... 179.4 179.6 179.9\n",
       "Attributes:\n",
       "    units:          days\n",
       "    cell_methods:   area: mean time: minimum (interval: 10 minutes) time: sum...\n",
       "    history:        [2022-04-27 01:25:42] frost_days: FROST_DAYS(tasmin=tasmi...\n",
       "    standard_name:  days_with_air_temperature_below_threshold\n",
       "    long_name:      Number of frost days (tmin < 0 degc)\n",
       "    description:    Annual number of days with minimum daily temperature belo...
" ], "text/plain": [ "\n", "dask.array\n", "Coordinates:\n", " * time (time) object 1950-01-01 00:00:00 ... 2100-01-01 00:00:00\n", " * lat (lat) float64 -89.88 -89.62 -89.38 -89.12 ... 89.38 89.62 89.88\n", " * lon (lon) float64 -179.9 -179.6 -179.4 -179.1 ... 179.4 179.6 179.9\n", "Attributes:\n", " units: days\n", " cell_methods: area: mean time: minimum (interval: 10 minutes) time: sum...\n", " history: [2022-04-27 01:25:42] frost_days: FROST_DAYS(tasmin=tasmi...\n", " standard_name: days_with_air_temperature_below_threshold\n", " long_name: Number of frost days (tmin < 0 degc)\n", " description: Annual number of days with minimum daily temperature belo..." ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "frost_days = xclim.indicators.atmos.frost_days(ds=ds)\n", "frost_days" ] }, { "cell_type": "markdown", "id": "27efedee-ac3e-4588-bf8b-cbcc068fae03", "metadata": {}, "source": [ "Here, the state data requirement has been reduced significantly - but careful - this is the size required by the final product *once computed*. But this is a scheduled [dask](https://docs.xarray.dev/en/latest/user-guide/dask.html) operation, and because of dask's [Lazy Evaluation](https://tutorial.dask.org/01x_lazy.html), we haven't done any work yet. Dask is waiting for us to require operations, e.g. by calling `.compute()`, `.persist()`, or because of blocking operations like writing to disk or plotting. Until we do one of those, we haven't actually read any data yet!\n", "\n", "### Loading a subset of the data\n", "\n", "Let's subset the data and call `.compute()` so we can work with it in locally (in the notebook).\n", "\n", "I'll pick Oslo, Norway, as our oft-frosty location to inspect, and extract one year a decade to plot as a time series. Ideally, we'd look at all of the years and compute a statistic based on a moving multi-decadal window, but this is just an example ;) See [Scale with Dask](https://planetarycomputer.microsoft.com/docs/quickstarts/scale-with-dask/) if you'd like to run this example on a larger amount of data.\n", "\n", "Thanks to [Wikipedia](https://en.wikipedia.org/wiki/Oslo) for the geographic info!" ] }, { "cell_type": "code", "execution_count": 8, "id": "bf89f382-c5ee-47e8-bc59-f1c82e00eb49", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[########################################] | 100% Completed | 13.7s\n" ] } ], "source": [ "with ProgressBar():\n", " oslo_frost_days_summary = (\n", " frost_days.sel(lat=59.913889, lon=10.752222, method=\"nearest\").sel(\n", " time=frost_days.time.dt.year.isin(range(1950, 2101, 10))\n", " )\n", " ).compute()" ] }, { "cell_type": "code", "execution_count": 9, "id": "e9b0a176-69fe-4d50-8aa7-715759593268", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<xarray.DataArray 'frost_days' (time: 16)>\n",
       "array([131., 118., 149., 139., 165., 118., 132., 128., 118., 120., 121.,\n",
       "       127., 100., 122., 118., 106.])\n",
       "Coordinates:\n",
       "  * time     (time) object 1950-01-01 00:00:00 ... 2100-01-01 00:00:00\n",
       "    lat      float64 59.88\n",
       "    lon      float64 10.88\n",
       "Attributes:\n",
       "    units:          days\n",
       "    cell_methods:   area: mean time: minimum (interval: 10 minutes) time: sum...\n",
       "    history:        [2022-04-27 01:25:42] frost_days: FROST_DAYS(tasmin=tasmi...\n",
       "    standard_name:  days_with_air_temperature_below_threshold\n",
       "    long_name:      Number of frost days (tmin < 0 degc)\n",
       "    description:    Annual number of days with minimum daily temperature belo...
" ], "text/plain": [ "\n", "array([131., 118., 149., 139., 165., 118., 132., 128., 118., 120., 121.,\n", " 127., 100., 122., 118., 106.])\n", "Coordinates:\n", " * time (time) object 1950-01-01 00:00:00 ... 2100-01-01 00:00:00\n", " lat float64 59.88\n", " lon float64 10.88\n", "Attributes:\n", " units: days\n", " cell_methods: area: mean time: minimum (interval: 10 minutes) time: sum...\n", " history: [2022-04-27 01:25:42] frost_days: FROST_DAYS(tasmin=tasmi...\n", " standard_name: days_with_air_temperature_below_threshold\n", " long_name: Number of frost days (tmin < 0 degc)\n", " description: Annual number of days with minimum daily temperature belo..." ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "oslo_frost_days_summary" ] }, { "cell_type": "code", "execution_count": 10, "id": "b7714209-ba00-4779-85be-14e104c043b9", "metadata": {}, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "oslo_frost_days_summary.plot();" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.10" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": {}, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 5 }