{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Google Cloud CMIP6 Public Data: Basic Python Example\n", "\n", "This notebook shows how to query the catalog and load the data using Python." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from matplotlib import pyplot as plt\n", "import numpy as np\n", "import pandas as pd\n", "import xarray as xr\n", "import zarr\n", "import fsspec\n", "\n", "%matplotlib inline\n", "%config InlineBackend.figure_format = 'retina'\n", "plt.rcParams['figure.figsize'] = 12, 6" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Browse Catalog\n", "\n", "The data catalog is stored as a CSV file. Here we read it with pandas." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df = pd.read_csv('https://storage.googleapis.com/cmip6/cmip6-zarr-consolidated-stores.csv')\n", "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The columns of the dataframe correspond to the CMIP6 controlled vocabulary. A beginners' guide to these terms is available in [this document](https://docs.google.com/document/d/1yUx6jr9EdedCOLd--CPdTfGDwEwzPpCF6p1jRmqx-0Q).\n", "\n", "Here we filter the data to find monthly surface air temperature (`tas`) for historical experiments." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_ta = df.query(\"activity_id == 'CMIP' & table_id == 'Amon' & variable_id == 'tas' & experiment_id == 'historical'\")\n", "df_ta" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we do further filtering to find just the models from NCAR."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_ta_ncar = df_ta.query('institution_id == \"NCAR\"')\n", "df_ta_ncar" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load Data\n", "\n", "Now we will load a single store using fsspec, zarr, and xarray." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# get the path to a specific zarr store (the last one from the dataframe above)\n", "zstore = df_ta_ncar.zstore.values[-1]\n", "print(zstore)\n", "\n", "# create a mutable-mapping-style interface to the store\n", "mapper = fsspec.get_mapper(zstore)\n", "\n", "# open it using xarray and zarr\n", "ds = xr.open_zarr(mapper, consolidated=True)\n", "ds" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot a map for a specific month." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ds.tas.sel(time='1950-01').squeeze().plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a timeseries of global-average surface air temperature. For this we need the area weighting factor for each grid point, which is stored as the variable `areacella`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_area = df.query(\"variable_id == 'areacella' & source_id == 'CESM2'\")\n", "ds_area = xr.open_zarr(fsspec.get_mapper(df_area.zstore.values[0]), consolidated=True)\n", "ds_area" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "total_area = ds_area.areacella.sum(dim=['lon', 'lat'])\n", "ta_timeseries = (ds.tas * ds_area.areacella).sum(dim=['lon', 'lat']) / total_area\n", "ta_timeseries" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By default the data are loaded lazily, as Dask arrays. Here we trigger computation explicitly."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%time ta_timeseries.load()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ta_timeseries.plot(label='monthly')\n", "ta_timeseries.rolling(time=12).mean().plot(label='12 month rolling mean')\n", "plt.legend()\n", "plt.title('Global Mean Surface Air Temperature')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.1" } }, "nbformat": 4, "nbformat_minor": 4 }