{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "<img src=\"images/ProjectPythia_Logo_Final-01-Blue.svg\" width=250 alt=\"Project Pythia Logo\"> <img src=\"images/ecmwf.png\" style=\"width:250px\" alt=\"ECMWF logo\"> <img src=\"images/googleresearch.png\" style=\"width:250px\" alt=\"Google logo\">" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Interactive Visualization using `GeoViews`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Overview\n", "A team at [Google Research & Cloud](https://research.google/) are making parts of the [ECMWF Reanalysis version 5](https://www.ecmwf.int/en/forecasts/dataset/ecmwf-reanalysis-v5) (aka **ERA-5**) accessible in a [Analysis Ready, Cloud Optimized](https://www.frontiersin.org/articles/10.3389/fclim.2021.782909/full) (aka **ARCO**) format.\n", "\n", "In this notebook, we will do the following:\n", "\n", "1. Access the [ERA-5 ARCO](https://github.com/google-research/arco-era5) catalog\n", "1. Select a particular dataset and variable from the catalog\n", "1. Convert the data from Gaussian to Cartesian coordinates\n", "1. Plot a map of sea-level pressure contours and 2-meter temperature mesh using Geoviews." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prerequisites\n", "\n", "| Concepts | Importance | Notes |\n", "| --- | --- | --- |\n", "| [Cartopy](https://foundations.projectpythia.org/core/cartopy/cartopy.html) | Necessary | |\n", "| [Xarray](https://foundations.projectpythia.org/core/xarray) | Necessary | |\n", "| [Geoviews] | Necessary | |\n", "\n", "\n", "- **Time to learn**: 30 minutes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Imports" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import fsspec\n", "import xarray as xr\n", "import matplotlib.pyplot as plt\n", "import cartopy.crs as ccrs\n", "import cartopy.feature as cfeature\n", "import scipy.spatial\n", "import numpy as np\n", "import geoviews as gv\n", "from geoviews import opts\n", "from pyproj import Transformer\n", "from holoviews.operation.datashader import rasterize" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Access the ARCO ERA-5 catalog on Google Cloud" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's open the **single-level-reanalysis** Zarr file and save two variables from it" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "reanalysis = xr.open_zarr(\n", " 'gs://gcp-public-data-arco-era5/co/single-level-reanalysis.zarr', \n", " chunks={'time': 48},\n", " consolidated=True,\n", ")\n", "\n", "msl = reanalysis.msl\n", "t2m = reanalysis.t2m" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Regrid to Cartesian coordinates" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "These reanalyses are in their native, Guassian coordinates. We will define and use several functions to convert them to a lat-lon grid, as documented in the [ARCO ERA-5 GCP example notebooks](https://github.com/google-research/arco-era5/tree/main/docs)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def mirror_point_at_360(ds):\n", " extra_point = (\n", " ds.where(ds.longitude == 0, drop=True)\n", " .assign_coords(longitude=lambda x: x.longitude + 360)\n", " )\n", " return xr.concat([ds, extra_point], dim='values')\n", "\n", "def build_triangulation(x, y):\n", " grid = np.stack([x, y], axis=1)\n", " return scipy.spatial.Delaunay(grid)\n", "\n", "def interpolate(data, tri, mesh):\n", " indices = tri.find_simplex(mesh)\n", " ndim = tri.transform.shape[-1]\n", " T_inv = tri.transform[indices, :ndim, :]\n", " r = tri.transform[indices, ndim, :]\n", " c = np.einsum('...ij,...j', T_inv, mesh - r)\n", " c = np.concatenate([c, 1 - c.sum(axis=-1, keepdims=True)], axis=-1)\n", " result = np.einsum('...i,...i', data[:, tri.simplices[indices]], c)\n", " return np.where(indices == -1, np.nan, result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Select a particular time range from the dataset." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "msl93 = msl.sel(time=slice('1993-03-13T18:00:00','1993-03-14T00:00:00')).compute().pipe(mirror_point_at_360)\n", "t2m93 = t2m.sel(time=slice('1993-03-13T18:00:00','1993-03-14T00:00:00')).compute().pipe(mirror_point_at_360)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Regrid to a lat-lon grid." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tri = build_triangulation(msl93.longitude, msl93.latitude)\n", "\n", "longitude = np.linspace(0, 360, num=360*4+1)\n", "latitude = np.linspace(-90, 90, num=180*4+1)\n", "\n", "mesh = np.stack(np.meshgrid(longitude, latitude, indexing='ij'), axis=-1)\n", "\n", "grid_mesh_msl = interpolate(msl93.values, tri, mesh)\n", "grid_mesh_t2m = interpolate(t2m93.values, tri, mesh)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Construct an Xarray `DataArray` from the regridded array." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "da_msl = xr.DataArray(data=grid_mesh_msl,\n", " dims=[\"time\", \"longitude\", \"latitude\"],\n", " coords=[('time', msl93.time.data), ('longitude', longitude), ('latitude', latitude)],\n", " name='msl')\n", "\n", "da_t2m = xr.DataArray(data=grid_mesh_t2m,\n", " dims=[\"time\", \"longitude\", \"latitude\"],\n", " coords=[('time', t2m93.time.data), ('longitude', longitude), ('latitude', latitude)],\n", " name='t2m')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Convert to hPa and deg C." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "slp = da_msl/100\n", "t2m = da_t2m-273.15" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Plot the data using geoviews" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Add an interactive element by using the [holoviz](https://holoviz.org)/[geoviews](https://geoviews.org) libraries.\n", "\n", "We need to choose the rendering backend that we want to use in Geoviews, in this case we are using Bokeh." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "gv.extension('bokeh')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you want the plot restricted to a specific part of the world, you can specify where, however, we need to transform the coordinates from PlateCarree to WebMercator (aka *EPSG-3857*), which is the default projection for Geoviews with the Bokeh backend." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "lonW, lonE, latS, latN = -130, -60, 20, 55 " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "transformer = Transformer.from_crs(ccrs.PlateCarree(), \"EPSG:3857\")\n", "lon1, lat1 = transformer.transform(lonW,latS)\n", "lon2, lat2 = transformer.transform(lonE, latN)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Convert our Xarray datasets/data arrays into Geoviews dataset objects, specifying the different dimensions of the dataset (lat, lon, time) as well as the variable we want to plot. For this first example, we'll just visualize the first time in the selected time range." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<div class=\"admonition alert alert-warning\">\n", " <p class=\"admonition-title\" style=\"font-weight:bold\">Note:</p>\n", " By default, Geoviews expects a dataset whose coordinates are lat-lon (i.e., using a <code>PlateCarree</code> projection) to have the longitude dimension vary from -180 to 180. In this case, the ERA-5's longitudes range from 0 to 360. \n", " <br>\n", " In this case, ensure that the <code>crs</code> of the element declares the actual <code>central_longitude</code>, e.g. 180 degrees rather than the default 0 degrees.\n", "</div>" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dataset_era = gv.Dataset(slp.isel(time=0), kdims=['longitude','latitude'],vdims='msl') #create a geoviews dataset\n", "contour = gv.project(dataset_era.to(gv.LineContours, ['longitude', 'latitude'],crs=ccrs.PlateCarree(central_longitude=180))) #create a Geoviews contour object" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plotting MSLP" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cint = np.arange(900,1080,4)\n", "(gv.tile_sources.OSM * contour).opts(\n", " opts.LineContours(tools=['hover'], levels=cint, frame_width=700, frame_height=400,show_legend=False, line_width=3,xlim=(lon1,lon2),ylim=(lat1,lat2)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Use the Bokeh tools to the right of the plot to zoom in/out or reset to the original dimensions. The bottom-most tool tip (the *hover* tool) will produce a pop-up window that shows the sea-level pressure value at the closest grid point to where your cursor lies." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<div class=\"admonition alert alert-info\">\n", " <p class=\"admonition-title\" style=\"font-weight:bold\">Info</p>\n", " Notice how this only plots a single time. We can iterate over the time dimension by specifying time as a dimension when we create the Geoviews dataset, as shown below!\n", "</div>" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dataset_era = gv.Dataset(slp, kdims=['longitude','latitude','time'],vdims='msl') #create a geoviews dataset; here we do not select just a single time\n", "contour = gv.project(dataset_era.to(gv.LineContours, ['longitude', 'latitude'])) #create line contours" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create the interactive visualization. We overlay the SLP contours on a map from a web tile server. We specify various options, such as frame size, spatial extent, and line thickness as well." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "(gv.tile_sources.OSM * contour).opts(\n", " opts.LineContours(tools=['hover'], levels=cint, frame_width=700, frame_height=400,show_legend=False, line_width=3,xlim=(lon1,lon2),ylim=(lat1,lat2)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since there are multiple times in the dataset, we now get a time slider that you can manipulate. Notice that as you change the time, the plot automatically updates! " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<div class=\"admonition alert alert-warning\">\n", " <p class=\"admonition-title\" style=\"font-weight:bold\">Note</p>\n", " Did you notice that it took a little longer for the graphic to appear? That is because we have loaded seven total time steps, instead of just one.\n", "</div>" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plotting T2M" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's plot 2 meter temperatures. This time, we'll visualize the data via a Quadmesh (`gv.Quadmesh`) instead of using contour lines (`gv.LineContours`)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Quadmesh is more computationally expensive, so it is good practice to subset the data from its original global extent to an area that you're interested in. Here, we select the same region as we did earlier for sea-level pressure. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def lon_to_360(dlon: float) -> float:\n", " return ((360 + (dlon % 360)) % 360)\n", "\n", "cond = (t2m.longitude > lon_to_360(lonW)) & (t2m.latitude > latS) & \\\n", " (t2m.longitude < lon_to_360(lonE)) & (t2m.latitude < latN)\n", "\n", "t2m_cropped = t2m.where(cond, drop=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dataset_era = gv.Dataset(t2m_cropped, kdims=['longitude','latitude','time'],vdims='t2m')\n", "qm = gv.project(dataset_era.to(gv.QuadMesh, ['longitude', 'latitude']))\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "(gv.tile_sources.OSM().opts(xlim=(lon1, lon2),ylim=(lat1, lat2),title=f'ERA-5 2mT', frame_width=700, frame_height=400) * (qm.opts(tools=['hover'], axiswise=True, cmap='coolwarm', colorbar=True, clim=(-50,50), alpha=0.8)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You probably noticed this visualization took a while to render. A more performant strategy is to **rasterize** the Geoviews dataset. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<div class=\"admonition alert alert-warning\">\n", " <p class=\"admonition-title\" style=\"font-weight:bold\">Warning</p>\n", " If you plot a large dataset without rasterizing it, e.g. the entire globe for multiple timeplots of data, load times and RAM will scale up in tandem.\n", "</div>" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "qm_raster = rasterize(qm, precompute=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "(gv.tile_sources.OSM().opts(xlim=(lon1, lon2),ylim=(lat1, lat2),title=f'ERA-5 2mT', frame_width=700, frame_height=400) * (qm_raster.opts(tools=['hover'], axiswise=True, cmap='coolwarm', colorbar=True, clim=(-50,50), alpha=0.8)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<div class=\"admonition alert alert-danger\">\n", " <p class=\"admonition-title\" style=\"font-weight:bold\">Time slider bug!</p>While the visualization loaded much faster, the rasterized dataset does not update as we manipulate the time slider! \n", "</div>" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## What's next?\n", "In the next notebook, we will add more user-configurable options to our Geoviews-powered visualizations, using another key player in the Holoviz ecosystem, [Panel](https://panel.holoviz.org/)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Summary\n", "In this notebook, we have accessed one of the ARCO ERA-5 datasets, regridded from the ECMWF native spectral to cartesian lat-lon coordinates, and used geoviews to create interactive maps of sea-level pressure and 2-meter temperature." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Resources and references\n", "\n", " - This notebook follows the general workflow as used in the [Google Research ARCO-ERA5 Surface Reanlysis Walkthrough notebook](https://github.com/google-research/arco-era5/blob/main/docs/0-Surface-Reanalysis-Walkthrough.ipynb)\n", " - [Holoviz](https://holoviz.org)\n", " - [Geoviews](https://geoviews.org/)\n", " \n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.13" }, "nbdime-conflicts": { "local_diff": [ { "diff": [ { "diff": [ { "key": 0, "op": "addrange", "valuelist": [ "Python 3" ] }, { "key": 0, "length": 1, "op": "removerange" } ], "key": "display_name", "op": "patch" } ], "key": "kernelspec", "op": "patch" } ], "remote_diff": [ { "diff": [ { "diff": [ { "key": 0, "op": "addrange", "valuelist": [ "Python3" ] }, { "key": 0, "length": 1, "op": "removerange" } ], "key": "display_name", "op": "patch" } ], "key": "kernelspec", "op": "patch" } ] }, "toc-autonumbering": false, "vscode": { "interpreter": { "hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6" } } }, "nbformat": 4, "nbformat_minor": 4 }