{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Interactive Visualization using `GeoViews`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Overview\n",
"A team at [Google Research & Cloud](https://research.google/) is making parts of the [ECMWF Reanalysis version 5](https://www.ecmwf.int/en/forecasts/dataset/ecmwf-reanalysis-v5) (aka **ERA-5**) accessible in an [Analysis-Ready, Cloud-Optimized](https://www.frontiersin.org/articles/10.3389/fclim.2021.782909/full) (aka **ARCO**) format.\n",
"\n",
"In this notebook, we will do the following:\n",
"\n",
"1. Access the [ERA-5 ARCO](https://github.com/google-research/arco-era5) catalog\n",
"1. Select a particular dataset and variable from the catalog\n",
"1. Convert the data from Gaussian to Cartesian coordinates\n",
"1. Plot a map of sea-level pressure contours and a 2-meter temperature mesh using GeoViews."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Prerequisites\n",
"\n",
"| Concepts | Importance | Notes |\n",
"| --- | --- | --- |\n",
"| [Cartopy](https://foundations.projectpythia.org/core/cartopy/cartopy) | Necessary | |\n",
"| [Xarray](https://foundations.projectpythia.org/core/xarray) | Necessary | |\n",
"| [GeoViews](https://geoviews.org) | Necessary | |\n",
"\n",
"\n",
"- **Time to learn**: 30 minutes"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Imports"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import fsspec\n",
"import xarray as xr\n",
"import matplotlib.pyplot as plt\n",
"import cartopy.crs as ccrs\n",
"import cartopy.feature as cfeature\n",
"import scipy.spatial\n",
"import numpy as np\n",
"import geoviews as gv\n",
"from geoviews import opts\n",
"from pyproj import Transformer\n",
"from holoviews.operation.datashader import rasterize"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Access the ARCO ERA-5 catalog on Google Cloud"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's open the **single-level-reanalysis** Zarr store and save two of its variables."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"reanalysis = xr.open_zarr(\n",
" 'gs://gcp-public-data-arco-era5/co/single-level-reanalysis.zarr', \n",
" chunks={'time': 48},\n",
" consolidated=True,\n",
")\n",
"\n",
"msl = reanalysis.msl\n",
"t2m = reanalysis.t2m"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Regrid to Cartesian coordinates"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"These reanalyses are in their native Gaussian grid coordinates. We will define and use several functions to convert them to a regular latitude-longitude grid, as documented in the [ARCO ERA-5 GCP example notebooks](https://github.com/google-research/arco-era5/tree/main/docs)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def mirror_point_at_360(ds):\n",
" extra_point = (\n",
" ds.where(ds.longitude == 0, drop=True)\n",
" .assign_coords(longitude=lambda x: x.longitude + 360)\n",
" )\n",
" return xr.concat([ds, extra_point], dim='values')\n",
"\n",
"def build_triangulation(x, y):\n",
" grid = np.stack([x, y], axis=1)\n",
" return scipy.spatial.Delaunay(grid)\n",
"\n",
"def interpolate(data, tri, mesh):\n",
" indices = tri.find_simplex(mesh)\n",
" ndim = tri.transform.shape[-1]\n",
" T_inv = tri.transform[indices, :ndim, :]\n",
" r = tri.transform[indices, ndim, :]\n",
" c = np.einsum('...ij,...j', T_inv, mesh - r)\n",
" c = np.concatenate([c, 1 - c.sum(axis=-1, keepdims=True)], axis=-1)\n",
" result = np.einsum('...i,...i', data[:, tri.simplices[indices]], c)\n",
" return np.where(indices == -1, np.nan, result)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Select a particular time range from the dataset."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"msl93 = msl.sel(time=slice('1993-03-13T18:00:00','1993-03-14T00:00:00')).compute().pipe(mirror_point_at_360)\n",
"t2m93 = t2m.sel(time=slice('1993-03-13T18:00:00','1993-03-14T00:00:00')).compute().pipe(mirror_point_at_360)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Regrid to a lat-lon grid."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"tri = build_triangulation(msl93.longitude, msl93.latitude)\n",
"\n",
"longitude = np.linspace(0, 360, num=360*4+1)\n",
"latitude = np.linspace(-90, 90, num=180*4+1)\n",
"\n",
"mesh = np.stack(np.meshgrid(longitude, latitude, indexing='ij'), axis=-1)\n",
"\n",
"grid_mesh_msl = interpolate(msl93.values, tri, mesh)\n",
"grid_mesh_t2m = interpolate(t2m93.values, tri, mesh)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Construct Xarray `DataArray` objects from the regridded arrays."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"da_msl = xr.DataArray(data=grid_mesh_msl,\n",
" dims=[\"time\", \"longitude\", \"latitude\"],\n",
" coords=[('time', msl93.time.data), ('longitude', longitude), ('latitude', latitude)],\n",
" name='msl')\n",
"\n",
"da_t2m = xr.DataArray(data=grid_mesh_t2m,\n",
" dims=[\"time\", \"longitude\", \"latitude\"],\n",
" coords=[('time', t2m93.time.data), ('longitude', longitude), ('latitude', latitude)],\n",
" name='t2m')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Convert to hPa and deg C."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"slp = da_msl / 100     # Pa to hPa\n",
"t2m = da_t2m - 273.15  # K to deg C"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Plot the data using GeoViews"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Add an interactive element by using the [HoloViz](https://holoviz.org)/[GeoViews](https://geoviews.org) libraries.\n",
"\n",
"First, we choose the rendering backend that we want GeoViews to use; in this case, Bokeh."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"gv.extension('bokeh')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you want to restrict the plot to a specific part of the world, you can specify its extent. However, we first need to transform the corner coordinates from PlateCarree to Web Mercator (aka *EPSG:3857*), the default display projection for GeoViews with the Bokeh backend."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"lonW, lonE, latS, latN = -130, -60, 20, 55 "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"transformer = Transformer.from_crs(ccrs.PlateCarree(), \"EPSG:3857\")\n",
"lon1, lat1 = transformer.transform(lonW, latS)\n",
"lon2, lat2 = transformer.transform(lonE, latN)"
]
},
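{
"cell_type": "markdown",
"metadata": {},
"source": [
"As an illustrative sketch (the `conus_extent` name is introduced here; `xlim`/`ylim`, `width`, and `height` are standard HoloViews/Bokeh plot options), the transformed corner values can be supplied as axis limits so that only the chosen extent is rendered:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: apply the Web Mercator corner values as axis limits.\n",
"# The values must be in Web Mercator because that is the display projection.\n",
"conus_extent = opts.Image(xlim=(lon1, lon2), ylim=(lat1, lat2), width=700, height=400)"
]
},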
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Convert our Xarray data arrays into GeoViews `Dataset` objects, specifying the dataset's dimensions (longitude, latitude, and time) as well as the variable we want to plot. For this first example, we'll visualize just the first time in the selected time range."
]
},
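{
"cell_type": "markdown",
"metadata": {},
"source": [
"One possible way to do this (a sketch using the variables defined above; the `slp0` name and the styling options are illustrative):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: wrap the first time step of the regridded SLP in a GeoViews Dataset.\n",
"# central_longitude=180 accounts for ERA-5's 0-360 longitude convention.\n",
"slp0 = gv.Dataset(slp.isel(time=0), kdims=['longitude', 'latitude'],\n",
"                  vdims='msl', crs=ccrs.PlateCarree(central_longitude=180))\n",
"slp0.to(gv.LineContours).opts(width=700, height=400) * gv.feature.coastline"
]
},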
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<div class=\"admonition alert alert-info\">\n",
"    <p class=\"admonition-title\" style=\"font-weight:bold\">Note</p>\n",
"    By default, GeoViews expects a dataset whose coordinates are lat-lon (i.e., using a PlateCarree projection) to have the longitude dimension vary from -180 to 180. In this case, the ERA-5's longitudes range from 0 to 360, so the <code>crs</code> of the element must declare the actual <code>central_longitude</code>, e.g. 180 degrees rather than the default 0 degrees.\n",
"</div>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<div class=\"admonition alert alert-info\">\n",
"    <p class=\"admonition-title\" style=\"font-weight:bold\">Info</p>\n",
"    Notice how this only plots a single time. We can iterate over the time dimension by specifying time as a dimension when we create the GeoViews dataset, as shown below!\n",
"</div>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<div class=\"admonition alert alert-info\">\n",
"    <p class=\"admonition-title\" style=\"font-weight:bold\">Note</p>\n",
"    Did you notice that it took a little longer for the graphic to appear? That is because we have loaded seven total time steps, instead of just one.\n",
"</div>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<div class=\"admonition alert alert-warning\">\n",
"    <p class=\"admonition-title\" style=\"font-weight:bold\">Warning</p>\n",
"    If you plot a large dataset without rasterizing it, e.g. the entire globe for multiple time steps, load times and RAM usage will scale up in tandem.\n",
"</div>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<div class=\"admonition alert alert-danger\">\n",
"    <p class=\"admonition-title\" style=\"font-weight:bold\">Time slider bug!</p>\n",
"    While the visualization loaded much faster, the rasterized dataset does not update as we manipulate the time slider!\n",
"</div>"
]
}
],
"metadata": {},
"nbformat": 4,
"nbformat_minor": 4
}