{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a name=\"top\"></a>\n",
    "<div style=\"width:1000 px\">\n",
    "\n",
    "<div style=\"float:right; width:98 px; height:98px;\">\n",
    "<img src=\"https://raw.githubusercontent.com/Unidata/MetPy/master/src/metpy/plots/_static/unidata_150x150.png\" alt=\"Unidata Logo\" style=\"height: 98px;\">\n",
    "</div>\n",
    "\n",
    "<h1>Siphon (remote_access)</h1>\n",
    "<h3>Unidata AMS 2021 Student Conference</h3>\n",
    "\n",
    "<div style=\"clear:both\"></div>\n",
    "</div>\n",
    "\n",
    "---\n",
    "\n",
    "In this notebook, we'll cover opening, inspecting, subsetting, and plotting a TDS dataset using Siphon's `remote_access` method.\n",
    "<div style=\"float:right; width:250 px\"><img src=\"../../instructors/images/siphon_remote_access_preview.png\" alt=\"plot of data accessed via Siphon's remote_access\" style=\"height: 300px;\"></div>\n",
    "\n",
    "\n",
    "### Focuses\n",
    "* Use Siphon `remote_access` to open a TDS dataset\n",
    "* Access dataset using both [CDM Remote](https://www.unidata.ucar.edu/software/netcdf-java/v4.5/reference/stream/CdmRemote.html) and [OPENDAP](https://www.opendap.org/)\n",
    "* Subset and download variables in dataset\n",
    "* Plot downloaded data\n",
    "\n",
    "\n",
    "### Objectives\n",
    "1. [Find a dataset in a TDS Catalog](#1.-Find-a-dataset-in-a-TDS-Catalog)\n",
    "1. [Access the dataset using `remote_access`](#2.-Access-the-dataset-using-remote_access)\n",
    "1. [Use the remote dataset to subset, download, and display data](#3.-Use-the-remote-dataset-to-subset,-download,-and-display-data)\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Imports\n",
    "Before beginning, let's import the packages to be used throughout this training:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "from siphon.catalog import TDSCatalog"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Find a dataset in a TDS Catalog\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before we use `remote_access`, we need to find a dataset that we'd like to access.  \n",
    "As an example, we'll use this [dataset](https://thredds-test.unidata.ucar.edu/thredds/catalog/casestudies/harvey/model/gfs/GFS_Global_0p5deg_20170825_1800.grib2/catalog.html?dataset=casestudies/harvey/model/gfs/GFS_Global_0p5deg_20170825_1800.grib2) from the Unidata THREDDS test catalog."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To access a dataset, we need to know two things:\n",
    "* the url of the catalog where the dataset lives\n",
    "* the dataset name  \n",
    "\n",
    "The dataset name can be found on the [dataset HTML page](https://thredds-test.unidata.ucar.edu/thredds/catalog/casestudies/harvey/model/gfs/GFS_Global_0p5deg_20170825_1800.grib2/catalog.html?dataset=casestudies/harvey/model/gfs/GFS_Global_0p5deg_20170825_1800.grib2), e.g. \"GFS_Global_0p5deg_20170825_1800.grib2\".  \n",
    "The catalog URL is the URL of the dataset page up to \".html\", replacing \".html\" with \".xml\"."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "catUrl = \"https://thredds-test.unidata.ucar.edu/thredds/catalog/casestudies/harvey/model/gfs/GFS_Global_0p5deg_20170825_1800.grib2/catalog.xml\";\n",
    "datasetName = \"GFS_Global_0p5deg_20170825_1800.grib2\";"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you have another TDS dataset in mind, you can replace the catlog URL and dataset name above to point to that dataset instead."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we access the catalog using the catalog URL:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "catalog = TDSCatalog(catUrl)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And then select our dataset using the dataset name:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds = catalog.datasets[datasetName]\n",
    "ds.name"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can now view the access protocols available for our dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "list(ds.access_urls)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "<a href=\"#top\">Top</a>\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Access the dataset using `remote_access`\n",
    "\n",
    "Now that we have our dataset and know its access protocols, we can access the remote dataset."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Access via CDM Remote\n",
    "If the name of the service is not provided, `remote_access` defaults to using the `CdmRemote` service."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dataset = ds.remote_access()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The call to `ds.remote_access` opens the remote dataset and returns a netCDF4-like dataset object, which provides access to the metadata."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# list attributes\n",
    "list(dataset.ncattrs())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# list variables\n",
    "list(dataset.variables)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Access via OPENDAP\n",
    "We can also use `remote_access` to open the dataset via OPENDAP."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dataset = ds.remote_access('OPENDAP')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The returned netCDF4-like dataset object contains the same metadata as that returned by access via CdmRemote. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "list(dataset.ncattrs())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "list(dataset.variables)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Other than possible reordering of listed attributes and variables, users should see no difference in the object returned by `remote_acesss` using OPENDAP versus CdmRemote. To read more about the two services, see the [resource links](#See-also)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a href=\"#top\">Top</a>\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Use the remote dataset to subset, download, and display data\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can access variables by name using the dataset's `variables` dictionary."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "var = dataset.variables['Precipitable_water_entire_atmosphere_single_layer'];"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And view the variable's metadata:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(var.shape)\n",
    "print(var.dimensions)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we can start plotting our data. Let's plot our variable, `Precipitable_water_entire_atmosphere_single_layer`, for all `lat` and `lon` at `time=0`. First, we need to access the `lat` and `lon` variables. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "lat = dataset.variables['lat']\n",
    "lon = dataset.variables['lon']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*Note:* At this point, no data have been transferred over the network. Data will not be transferred until a variable is sliced, and only data corresponding to the slice are downloaded."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "v = np.squeeze(var[0,:,:]) # precipitable water data are subsetted and downloaded here\n",
    "\n",
    "# plot reflectivity\n",
    "plt.pcolormesh(lon[:], lat[:], v, shading='auto') # lat and lon data are subsetted and downloaded here.\n",
    "plt.title(var.name);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Data are finally downloaded when we slice our variables to plot the data. Try changing the indices to request a different subset of data."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a href=\"#top\">Top</a>\n",
    "\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## See also\n",
    "\n",
    "For more information on Siphon and `remote_access`, see the [Siphon docs](https://unidata.github.io/siphon/latest/api/catalog.html?highlight=remote%20open#siphon.catalog.Dataset.remote_access).\n",
    "\n",
    "You may also be interested in  reading more about [OPENDAP](https://www.opendap.org/) and [CDM Remote](https://www.unidata.ucar.edu/software/netcdf-java/v4.5/reference/stream/CdmRemote.html).\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a href=\"#top\">Top</a>\n",
    "\n",
    "---"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [conda env:pyaos-ams-2021]",
   "language": "python",
   "name": "conda-env-pyaos-ams-2021-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}