{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Sea Surface Altimetry Data Analysis\n",
    "\n",
    "For this example we will use gridded sea-surface altimetry data from The Copernicus Marine Environment:\n",
    "\n",
    "http://marine.copernicus.eu/services-portfolio/access-to-products/?option=com_csw&view=details&product_id=SEALEVEL_GLO_PHY_L4_REP_OBSERVATIONS_008_047\n",
    "\n",
    "This is a widely used dataset in physical oceanography and climate.\n",
    "\n",
    "![globe image](http://marine.copernicus.eu/documents/IMG/SEALEVEL_GLO_SLA_MAP_L4_REP_OBSERVATIONS_008_027.png)\n",
    "\n",
    "The dataset has already been extracted from copernicus and stored in google cloud storage in [xarray-zarr](http://xarray.pydata.org/en/latest/io.html#zarr) format."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import xarray as xr\n",
    "import holoviews as hv\n",
    "import hvplot.xarray\n",
    "import hvplot.pandas"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Initialize Dataset\n",
    "\n",
    "Here we load the dataset from the zarr store. Note that this very large dataset initializes nearly instantly, and we can see the full list of variables and coordinates."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from intake import open_catalog\n",
    "\n",
    "cat = open_catalog(\"https://raw.githubusercontent.com/pangeo-data/pangeo-datastore/master/intake-catalogs/master.yaml\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds = cat.ocean.sea_surface_height.to_dask()\n",
    "ds"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Examine Metadata\n",
    "\n",
    "For those unfamiliar with this dataset, the variable metadata is very helpful for understanding what the variables actually represent"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for v in ds.data_vars:\n",
    "    print('{:>10}: {}'.format(v, ds[v].attrs['long_name']))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Create and Connect to Dask Distributed Cluster"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from dask.distributed import Client, progress\n",
    "from dask_gateway import Gateway\n",
    "\n",
    "gateway = Gateway()\n",
    "cluster = gateway.new_cluster()\n",
    "cluster.scale(10)\n",
    "cluster"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "** ☝️ Don't forget to click the link above to view the scheduler dashboard! **"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "client = Client(cluster)\n",
    "client"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Visually Examine Some of the Data\n",
    "\n",
    "Let's do a sanity check that the data looks reasonable:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds.sla.sel(time='1982-08-07', method='nearest').hvplot(colormap='RdBu_r', width=900, height=550, rasterize=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Timeseries of Global Mean Sea Level\n",
    "\n",
    "Here we make a simple yet fundamental calculation: the rate of increase of global mean sea level over the observational period."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# the number of GB involved in the reduction\n",
    "ds.sla.nbytes/1e9"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# the computationally intensive step\n",
    "sla_timeseries = ds.sla.mean(dim=('latitude', 'longitude')).load()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "sla_full = sla_timeseries.hvplot(label='full data', grid=True,\n",
    "                          title='Global Sea Level Rise', \n",
    "                          width=800, height=400)\n",
    "sla_filt = sla_timeseries.rolling(time=365, center=True).mean().hvplot(label='rolling annual mean')\n",
    "hv.Overlay([sla_full, sla_filt]).options(legend_position='top_left')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In order to understand how the sea level rise is distributed in latitude, we can make a sort of [Hovmöller diagram](https://en.wikipedia.org/wiki/Hovm%C3%B6ller_diagram)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "sla_hov = ds.sla.mean(dim='longitude').load()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "sla_hov.transpose().hvplot(rasterize=True, colormap='RdBu_r', width=900, height=400, clim=(-.2,.2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can see that most sea level rise is actually in the Southern Hemisphere."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Sea Level Variability\n",
    "\n",
    "We can examine the natural variability in sea level by looking at its standard deviation in time."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "sla_std = ds.sla.std(dim='time').load()\n",
    "sla_std.name = 'Sea Level Variability [m]'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "sla_std.hvplot(colormap='viridis', width=900, height=550, rasterize=True)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}