{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "cdca5c71-3d8d-4043-b89d-0c9e708e6782",
   "metadata": {
    "tags": []
   },
   "source": [
    "# Thalassa\n",
    "\n",
    "`thalassa` (the greek word for \"sea\") is a library for Large Scale Sea level visualizations of unstructured mesh data.\n",
    "\n",
    "https://github.com/ec-jrc/Thalassa\n",
    "\n",
    "## Design goals\n",
    "\n",
    "- Simple API\n",
    "- Performance"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fdd93f1d-38be-4396-abec-b3b8bf2dbb4e",
   "metadata": {
    "tags": []
   },
   "source": [
    "## Imports"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "240fb748-ff8f-47a7-9946-6f8b68f41659",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "%load_ext autoreload\n",
    "%autoreload 2\n",
    "\n",
    "from __future__ import annotations\n",
    "\n",
    "import dask\n",
    "import geoviews as gv\n",
    "import holoviews as hv\n",
    "import numcodecs\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "import shapely\n",
    "import xarray as xr\n",
    "\n",
    "from holoviews import opts as hvopts\n",
    "from holoviews import streams\n",
    "from holoviews.streams import PointerXY\n",
    "from holoviews.streams import Tap\n",
    "\n",
    "hv.extension(\"bokeh\")\n",
    "\n",
    "import thalassa\n",
    "\n",
    "from thalassa import api\n",
    "from thalassa import normalization\n",
    "from thalassa import utils\n",
    "\n",
    "# Set some defaults for the visualization of the graphs\n",
    "hvopts.defaults(\n",
    "    hvopts.Image(\n",
    "        width=800,\n",
    "        height=600,\n",
    "        show_title=True,\n",
    "        tools=[\"hover\"],\n",
    "        active_tools=[\"pan\", \"box_zoom\"],\n",
    "        cmap=\"jet\",\n",
    "    ),\n",
    ")\n",
    "\n",
    "COMPRESSOR = numcodecs.Blosc(cname=\"zstd\", clevel=1)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "25aded57-b7ed-4d5e-9f84-0ec05582026b",
   "metadata": {
    "tags": []
   },
   "source": [
    "## Retrieve STOFS-3D-Atl data\n",
    "\n",
    "STOFS-3D-Atl is a 3D model that uses SCHISM.\n",
    "\n",
    "Output data are split into multiple files. More specifically, the Sea Water elevation is distributed as 3 netcdf files (3-day prediction - one file per day). \n",
    "We will download them for the run of 29th of August 2023.\n",
    "\n",
    "For more info please check here: https://noaa-nos-stofs3d-pds.s3.amazonaws.com/README.htmlhttps://noaa-nos-stofs3d-pds.s3.amazonaws.com/README.html"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "832d7fbe-a381-468c-93e9-e6b17a9a9ee8",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "%%bash\n",
    "\n",
    "for day in 28 29 30; do\n",
    "    filename=schout_adcirc_202308\"${day}\".nc\n",
    "    if [[ ! -f \"${filename}\" ]]; then\n",
    "        echo Downloading netcdf: \"${filename}\"\n",
    "        wget --quiet https://noaa-nos-stofs3d-pds.s3.amazonaws.com/STOFS-3D-Atl/stofs_3d_atl.20230829/schout_adcirc_202308\"${day}\".nc -O \"${filename}\"\n",
    "    else\n",
    "        echo Netcdf already exists: \"${filename}\"\n",
    "    fi\n",
    "done"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f6b1d984-bb5e-4b68-aaa4-515919d8956f",
   "metadata": {
    "tags": []
   },
   "source": [
    "## Prepare input data\n",
    "\n",
    "We want to check the maximum elevation per node for the full 72-hour time span. \n",
    "This means that we need to do some (lightweight) post-processing.\n",
    "\n",
    "Nevertheless, since we start to do post-processing let's also convert the netcdf files to a zarr archive in order to take advantage of the faster compression algorithms that are supported by zarr."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1458b8f1-5ee0-4a21-90ea-1e0b7ad3f9db",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "ds = xr.open_mfdataset(\"./schout_adcirc_202308??.nc\", coords=\"minimal\", data_vars=\"minimal\", compat=\"override\")\n",
    "ds"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8330c9e9-1d0d-4daa-8a6a-5082505d9d84",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# For this example, let's only keep the depth and the zeta variables\n",
    "ds = ds[[\"element\", \"depth\", \"zeta\"]]\n",
    "#ds[\"time\"] = pd.DatetimeIndex(ds.time)\n",
    "\n",
    "# Recalculate zeta_max\n",
    "ds[\"zeta_max\"] = ds.zeta.max(\"time\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "70a05b0d-7c2a-46a7-a34a-f5c71cf85f12",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "encoding = {var: {\"compressor\": COMPRESSOR} for var in ds}\n",
    "\n",
    "# The default Zarr compression seems to mangle the timestamps. Using ZSTD seems to work fine though.\n",
    "encoding[\"time\"] = {\"compressor\": COMPRESSOR}\n",
    "\n",
    "ds = ds.chunk(dict(time=1, node=200_000))\n",
    "\n",
    "# Save as a zarr archive in the same directory\n",
    "store = ds.to_zarr(\"schout_adcirc_20230829.zarr\", mode=\"w\", consolidated=True, encoding=encoding)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6792872e-97c2-4d17-8b8f-ad4aababc0ab",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "!du -h *.nc\n",
    "\n",
    "!du -hd0 *.zarr"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cb7a42c7-72e8-4202-8a19-51ebe131cfa6",
   "metadata": {
    "tags": []
   },
   "source": [
    "## Normalize dataset\n",
    "\n",
    "Thalassa supports multiple solvers. Currently supported ones include:\n",
    "\n",
    "- Schism (both 2D and 3D)\n",
    "- ADCIRC\n",
    "- Telemac (WIP)\n",
    "\n",
    "In order to support these solvers, thalassa converts their output to what we call the \"thalassa schema\".\n",
    "\n",
    "To make a long story short, we just need to use the `thalassa.open_dataset()` function. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5e24237d-5f78-4b49-ae1b-677052140c1f",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "import thalassa\n",
    "\n",
    "ds = thalassa.open_dataset(\"schout_adcirc_20230829.zarr\")\n",
    "ds"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "04401188-0981-4181-a1e2-9569fee6194e",
   "metadata": {
    "tags": []
   },
   "source": [
    "## Plot data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8c6d1ace-adc5-4e94-b61b-5e69b57bd69f",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "import thalassa\n",
    "\n",
    "thalassa.plot(ds, \"depth\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6062486d-1768-41d5-840a-6d375b988f61",
   "metadata": {
    "tags": []
   },
   "source": [
    "## Control the colorbar"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0f64ddbf-e550-4b87-963b-417d813b9958",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "thalassa.plot(\n",
    "    ds=ds,\n",
    "    variable=\"zeta_max\",\n",
    "    clim_min=0.5,\n",
    "    clim_max=3,\n",
    "    clabel=\"meters\",\n",
    "    title=\"Custom title for 'zeta_max'\"\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "310c90df-f667-4051-bcf2-d044566b2dd2",
   "metadata": {
    "tags": []
   },
   "source": [
    "## Plot time-dependent variables"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bdc7d4dd-ff50-45b8-b572-efb40a5bda1d",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "timestamp = pd.Timestamp(ds.time[16].values)\n",
    "\n",
    "thalassa.plot(\n",
    "    ds=ds.sel(time=timestamp),  # or `.isel() etc\n",
    "    variable=\"zeta\", \n",
    "    clim_max=1,\n",
    "    title=f\"zeta: {timestamp}\",\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "df35cfb4-d66b-4e4b-95c4-7a92d38b07d6",
   "metadata": {
    "tags": []
   },
   "source": [
    "## Plot mesh"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3b1fd42a-f4f2-4ed0-ab5b-40663a3d5dfd",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "thalassa.plot_mesh(ds)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3c4d0803-7e14-418f-a549-173ff05b9bd4",
   "metadata": {
    "tags": []
   },
   "source": [
    "## Region of Interest (RoI)\n",
    "\n",
    "If we have a specific RoI, we could crop the dataset. \n",
    "\n",
    "Cropping with a big Bounding Box takes a few seconds, but it is something that only needs to be done once and then everyting is snappier!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f2002c89-7089-48ce-9a4c-3b2283aa79c3",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "bbox = shapely.box(-89, 29.5, -87.5, 31)\n",
    "bbox"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fa9bdc57-6917-4dca-bb41-422c4324aa96",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "cds = thalassa.crop(ds, bbox)\n",
    "cds.dims"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "26bed2e6-2329-4bc7-be8f-b624c0accaaf",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "thalassa.plot(cds, \"zeta_max\", show_mesh=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "70d670c8-ea3c-4410-9e24-e442e684d1e7",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "thalassa.plot_mesh(ds=cds)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4c86fde6-56d0-420a-a6bf-d6569fdad6ee",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# We can also do the cropping on the fly\n",
    "thalassa.plot(\n",
    "    ds=ds,\n",
    "    variable=\"depth\",\n",
    "    clim_min=-5, \n",
    "    clim_max=20, \n",
    "    bbox=bbox\n",
    ")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "thalassa",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}