{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Analysis of Gridded Ensemble Precipitation and Temperature Estimates over the Contiguous United States\n", "====\n", "\n", "For this example, we'll work with a 100 member ensemble of precipitation and temperature data. The analysis we do below is quite simple but the problem is a good illustration of a common task in the atmospheric sciences. \n", "\n", "Link to dataset: https://www.earthsystemgrid.org/dataset/gridded_precip_and_temp.html" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "import numpy as np\n", "import xarray as xr\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Connect to Dask Distributed Cluster" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from dask.distributed import Client, progress\n", "from dask_gateway import Gateway\n", "\n", "gateway = Gateway()\n", "cluster = gateway.new_cluster()\n", "cluster.scale(40)\n", "cluster" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "client = Client(cluster)\n", "client" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Open Dataset\n", "\n", "Here we load the GMET dataset using an [Intake](https://intake.readthedocs.io/en/latest/) catalog. This catalog specifies the location of the GMET data stored in the Zarr format in GCS.\n", "\n", "The dataset has dimensions of time, latitude, longitude, and ensmemble member." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "## Load with intake catalog\n", "import intake\n", "cat = intake.Catalog('https://raw.githubusercontent.com/pangeo-data/pangeo-datastore/master/intake-catalogs/master.yaml')\n", "ds = cat.atmosphere.gmet_v1.read_chunked()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Print dataset\n", "display(ds)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Figure: Elevation and domain mask\n", "\n", "A quick plot of the mask to give us an idea of our spatial domain" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "elevation = ds['elevation'].persist()\n", "elevation = elevation.where(elevation > 0)\n", "elevation.plot(figsize=(10, 6))\n", "plt.title('Domain Elevation')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Intra-ensemble range\n", "\n", "We calculate the intra-ensemble range for all the mean daily temperature in this dataset. This gives us a sense of uncertainty." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "temp_mean = ds['t_mean'].mean(dim='time')\n", "spread = (temp_mean.max(dim='ensemble')\n", " - temp_mean.min(dim='ensemble'))\n", "spread" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Calling compute\n", "The expressions above didn't actually compute anything. They just build the task graph. To do the computations, we call the `compute` or `persist` methods:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "spread = spread.persist()\n", "progress(spread)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Figure: Intra-ensemble range\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "spread\n", "spread.attrs['units'] = 'degC'" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "spread.plot(robust=True, figsize=(10, 6))\n", "plt.title('Intra-ensemble range in mean annual temperature')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Average seasonal snowfall\n", "\n", "We can compute a crude estimate of average seasonal snowfall using the temperature and precipitation variables in our dataset. Here, we'll look at the first 4 ensemble members and make some maps of the seasonal total snowfall in each ensemble member." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "da_snow = ds['pcp'].where(ds['t_mean'] < 0.).resample(time='QS-Mar').sum('time')\n", "seasonal_snow = da_snow.isel(ensemble=slice(0, 4)).groupby('time.season').mean('time').persist()\n", "progress(seasonal_snow)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# properly sort the seasons\n", "seasonal_snow = seasonal_snow.sel(season=['DJF', 'MAM','JJA', 'SON'])\n", "seasonal_snow.attrs['units'] = 'mm/season'\n", "seasonal_snow" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Figure: Average seasonal snowfall totals " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "seasonal_snow.plot.pcolormesh(col='season', row='ensemble', cmap='Blues', robust=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Extract a time series of annual maximum precipitation events over a region\n", "\n", "In the previous two examples, we've mostly reduced the time and/or ensemble dimension. Here, we'll do a reduction operation on the spatial dimension to look at a time series of extreme precipitation events near Austin, TX (30.2672° N, 97.7431° W)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "buf = 0.25 # look at Austin +/- 0.25 deg\n", "\n", "ds_tx = ds.sel(lon=slice(-97.7431-buf, -97.7431+buf), lat=slice(30.2672-buf, 30.2672+buf))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pcp_ann_max = ds_tx['pcp'].resample(time='AS').max('time')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pcp_ann_max_ts = pcp_ann_max.max(('lat', 'lon')).persist()\n", "progress(pcp_ann_max_ts)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Figure: Timeseries of maximum precipitation near Austin, Tx." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ax = pcp_ann_max_ts.transpose().to_pandas().plot(title='Maximum precipitation near Austin, Tx', legend=False)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.6" } }, "nbformat": 4, "nbformat_minor": 2 }