{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# CM2.6 Ocean Model Analysis\n", "\n", "This notebook shows how to load and analyze ocean data from the GFDL [CM2.6](https://www.gfdl.noaa.gov/cm2-6/) high-resolution climate simulation.\n", "\n", "![CM2.6 SST](https://www.gfdl.noaa.gov/wp-content/uploads/ih/2012/06/cm2.6.png)\n", "\n", "Right now the only output available is the 5-day-averaged 3D fields of horizontal velocity, temperature, and salinity. We hope to add more going forward.\n", "\n", "Thanks to [Stephen Griffies](https://www.gfdl.noaa.gov/stephen-griffies-homepage/) for providing the data.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "\n", "import numpy as np\n", "import xarray as xr\n", "import matplotlib.pyplot as plt\n", "import holoviews as hv\n", "import datashader\n", "from holoviews.operation.datashader import regrid, shade, datashade\n", "\n", "hv.extension('bokeh', width=100)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create and Connect to Dask Distributed Cluster\n", "\n", "The cell below launches a cluster of virtual machines in the cloud."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from dask.distributed import Client, progress\n", "from dask_kubernetes import KubeCluster\n", "cluster = KubeCluster(n_workers=40)\n", "cluster" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "👆 Don't forget to click the link above to open the cluster dashboard." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "client = Client(cluster)\n", "client" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load CM2.6 Data\n", "\n", "This data is stored in [xarray-zarr](http://xarray.pydata.org/en/latest/io.html#zarr) format in Google Cloud Storage.\n", "This format is optimized for parallel distributed reads from within the cloud environment.\n", "\n", "It may take up to a minute to initialize the dataset when you run the next cell." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Choose the experiment: 'control' or 'one_percent'\n", "#experiment = 'one_percent'\n", "experiment = 'control'\n", "\n", "# Load the dataset from Google Cloud Storage\n", "import gcsfs\n", "gcsmap = gcsfs.mapping.GCSMap('pangeo-data/cm2.6/%s/temp_salt_u_v-5day_avg/' % experiment)\n", "ds = xr.open_zarr(gcsmap, decode_cf=True, decode_times=False)\n", "\n", "# Print dataset\n", "ds" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visualize Temperature Data with HoloViews and Datashader\n", "\n", "The cells below show how to interactively explore the dataset.\n", "\n", "_**Warning**: it takes ~10-20 seconds to render each image after moving the sliders. Please be patient. 
There is an open [GitHub issue](https://github.com/bokeh/datashader/issues/598) about improving the performance of Datashader with this sort of dataset._" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "hv_ds = hv.Dataset(ds['temp'])\n", "qm = hv_ds.to(hv.QuadMesh, kdims=[\"xt_ocean\", \"yt_ocean\"], dynamic=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%opts QuadMesh [width=800 height=500 colorbar=True] (cmap='magma') \n", "regrid(qm, precompute=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Make an Expensive Calculation\n", "\n", "Here we make a big reduction by taking the time and zonal mean of the temperature. This demonstrates how the cluster distributes the reads from storage." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "temp_zonal_mean = ds.temp.mean(dim=('time', 'xt_ocean'))\n", "temp_zonal_mean" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The next cell triggers the actual computation, so it will take a while; how long depends on the size of your cluster. On a cluster of 40 workers, it took ~12 minutes." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%time temp_zonal_mean.load()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(figsize=(16,8))\n", "temp_zonal_mean.plot.contourf(yincrease=False, levels=np.arange(-2,30))\n", "plt.title('Naive Zonal Mean Temperature')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python [default]", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.5" } }, "nbformat": 4, "nbformat_minor": 2 }