<a name="top"></a>
<div style="width:1000 px">

<div style="float:right; width:98 px; height:98px;">
<img src="https://raw.githubusercontent.com/Unidata/MetPy/master/src/metpy/plots/_static/unidata_150x150.png" alt="Unidata Logo" style="height: 98px;">
</div>

<h1>Siphon (remote_access)</h1>
<h3>Unidata AMS 2021 Student Conference</h3>

<div style="clear:both"></div>
</div>

---

In this notebook, we'll cover opening, inspecting, subsetting, and plotting a TDS dataset using Siphon's `remote_access` method.
<div style="float:right; width:250 px"><img src="../../instructors/images/siphon_remote_access_preview.png" alt="plot of data accessed via Siphon's remote_access" style="height: 300px;"></div>


### Focuses
* Use Siphon `remote_access` to open a TDS dataset
* Access the dataset using both [CDM Remote](https://www.unidata.ucar.edu/software/netcdf-java/v4.5/reference/stream/CdmRemote.html) and [OPENDAP](https://www.opendap.org/)
* Subset and download variables in the dataset
* Plot downloaded data


### Objectives
1. [Find a dataset in a TDS Catalog](#1.-Find-a-dataset-in-a-TDS-Catalog)
1. [Access the dataset using `remote_access`](#2.-Access-the-dataset-using-remote_access)
1. [Use the remote dataset to subset, download, and display data](#3.-Use-the-remote-dataset-to-subset,-download,-and-display-data)

---

### Imports
Before beginning, let's import the packages to be used throughout this training:

In [None]:
import matplotlib.pyplot as plt
import numpy as np
from siphon.catalog import TDSCatalog

---

## 1. Find a dataset in a TDS Catalog


Before we use `remote_access`, we need to find a dataset that we'd like to access.  
As an example, we'll use this [dataset](https://thredds-test.unidata.ucar.edu/thredds/catalog/casestudies/harvey/model/gfs/GFS_Global_0p5deg_20170825_1800.grib2/catalog.html?dataset=casestudies/harvey/model/gfs/GFS_Global_0p5deg_20170825_1800.grib2) from the Unidata THREDDS test catalog.

To access a dataset, we need to know two things:
* the URL of the catalog where the dataset lives
* the dataset name  

The dataset name can be found on the [dataset HTML page](https://thredds-test.unidata.ucar.edu/thredds/catalog/casestudies/harvey/model/gfs/GFS_Global_0p5deg_20170825_1800.grib2/catalog.html?dataset=casestudies/harvey/model/gfs/GFS_Global_0p5deg_20170825_1800.grib2), e.g. "GFS_Global_0p5deg_20170825_1800.grib2".  
The catalog URL is the dataset page URL with the query string (everything from "?" onward) removed and ".html" replaced with ".xml".

In [None]:
catUrl = "https://thredds-test.unidata.ucar.edu/thredds/catalog/casestudies/harvey/model/gfs/GFS_Global_0p5deg_20170825_1800.grib2/catalog.xml"
datasetName = "GFS_Global_0p5deg_20170825_1800.grib2"

If you have another TDS dataset in mind, you can replace the catalog URL and dataset name above to point to that dataset instead.
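If you are starting from a dataset HTML page, the catalog URL can be derived programmatically. The following is a minimal sketch that uses only the standard library; the helper name `catalog_xml_url` is hypothetical and is not part of Siphon.

```python
from urllib.parse import urlsplit, urlunsplit


def catalog_xml_url(dataset_page_url):
    """Return the catalog XML URL for a TDS dataset HTML page URL."""
    parts = urlsplit(dataset_page_url)
    if not parts.path.endswith(".html"):
        raise ValueError("expected a URL whose path ends in '.html'")
    # Swap the ".html" suffix for ".xml"
    xml_path = parts.path[: -len(".html")] + ".xml"
    # Drop the query string (?dataset=...) and any fragment
    return urlunsplit((parts.scheme, parts.netloc, xml_path, "", ""))


page = (
    "https://thredds-test.unidata.ucar.edu/thredds/catalog/casestudies/harvey/"
    "model/gfs/GFS_Global_0p5deg_20170825_1800.grib2/catalog.html"
    "?dataset=casestudies/harvey/model/gfs/GFS_Global_0p5deg_20170825_1800.grib2"
)
print(catalog_xml_url(page))
```

The printed value should match the catalog URL used in this notebook.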

Next, we access the catalog using the catalog URL:

In [None]:
catalog = TDSCatalog(catUrl)

And then select our dataset using the dataset name:

In [None]:
ds = catalog.datasets[datasetName]
ds.name

We can now view the access protocols available for our dataset.

In [None]:
list(ds.access_urls)
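Because `access_urls` maps service names to URLs, it can be used to choose a service before opening the dataset. The sketch below assumes a hypothetical helper, `pick_service`, and uses a plain dictionary with placeholder URLs as a stand-in for `ds.access_urls`, so it runs without a network connection.

```python
def pick_service(access_urls, preferred=("CdmRemote", "OPENDAP")):
    """Return the first preferred service name present in access_urls."""
    # Match names case-insensitively but return the name as the catalog spells it
    available = {name.lower(): name for name in access_urls}
    for name in preferred:
        if name.lower() in available:
            return available[name.lower()]
    raise KeyError(f"none of {preferred} offered; available: {sorted(access_urls)}")


# Stand-in for ds.access_urls; these URLs are placeholders, not real endpoints
fake_access_urls = {
    "OPENDAP": "https://example.invalid/thredds/dodsC/some/dataset",
    "HTTPServer": "https://example.invalid/thredds/fileServer/some/dataset",
}
print(pick_service(fake_access_urls))  # OPENDAP
```

With a real dataset, the result could be passed along as `ds.remote_access(pick_service(ds.access_urls))`.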


<a href="#top">Top</a>

---

## 2. Access the dataset using `remote_access`

Now that we have our dataset and know its access protocols, we can access the remote dataset.

### Access via CDM Remote
If the name of the service is not provided, `remote_access` defaults to using the `CdmRemote` service.

In [None]:
dataset = ds.remote_access()

The call to `ds.remote_access` opens the remote dataset and returns a netCDF4-like dataset object, which provides access to the metadata.

In [None]:
# list attributes
list(dataset.ncattrs())

In [None]:
# list variables
list(dataset.variables)

### Access via OPENDAP
We can also use `remote_access` to open the dataset via OPENDAP.

In [None]:
dataset = ds.remote_access('OPENDAP')

The returned netCDF4-like dataset object contains the same metadata as that returned by access via CdmRemote. 

In [None]:
list(dataset.ncattrs())

In [None]:
list(dataset.variables)

Other than possible reordering of listed attributes and variables, users should see no difference in the object returned by `remote_access` using OPENDAP versus CdmRemote. To read more about the two services, see the [resource links](#See-also).
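One way to confirm that two listings differ only in order is to compare them after sorting. This is a small sketch with a hypothetical helper, `same_names`, and made-up stand-in lists; in the notebook you would pass the results of `list(dataset.variables)` from each service instead.

```python
def same_names(names_a, names_b):
    """Return True if both listings hold the same names, ignoring order."""
    return sorted(names_a) == sorted(names_b)


# Stand-ins for the variable listings returned by each service
via_cdmremote = ["lat", "lon", "Precipitable_water_entire_atmosphere_single_layer"]
via_opendap = ["Precipitable_water_entire_atmosphere_single_layer", "lat", "lon"]

print(same_names(via_cdmremote, via_opendap))  # True
```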

<a href="#top">Top</a>

---

## 3. Use the remote dataset to subset, download, and display data


We can access variables by name using the dataset's `variables` dictionary.

In [None]:
var = dataset.variables['Precipitable_water_entire_atmosphere_single_layer']

And view the variable's metadata:

In [None]:
print(var.shape)
print(var.dimensions)

Now we can start plotting our data. Let's plot our variable, `Precipitable_water_entire_atmosphere_single_layer`, for all `lat` and `lon` at `time=0`. First, we need to access the `lat` and `lon` variables. 

In [None]:
lat = dataset.variables['lat']
lon = dataset.variables['lon']

*Note:* At this point, no data have been transferred over the network. Data will not be transferred until a variable is sliced, and only data corresponding to the slice are downloaded.
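To make the idea of deferred transfer concrete, here is a toy illustration. `LazyVariable` is a hypothetical class written for this example only; it is not how Siphon or the remote services are implemented. It simply logs each slice request so you can see that holding a reference costs nothing until the object is sliced.

```python
class LazyVariable:
    """Toy stand-in for a remote variable that records every slice request."""

    def __init__(self, values):
        self._values = values  # pretend these values live on a remote server
        self.requests = []     # log of slices that would trigger a transfer

    def __getitem__(self, key):
        self.requests.append(key)
        return self._values[key]


remote = LazyVariable(list(range(10)))
print(len(remote.requests))          # 0: nothing requested yet

subset = remote[2:5]                 # slicing triggers the only "download"
print(subset, len(remote.requests))  # [2, 3, 4] 1
```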

In [None]:
v = np.squeeze(var[0,:,:]) # precipitable water data are subsetted and downloaded here

# plot precipitable water
plt.pcolormesh(lon[:], lat[:], v, shading='auto') # lat and lon data are subsetted and downloaded here.
plt.title(var.name);

Data are finally downloaded when we slice our variables to plot the data. Try changing the indices to request a different subset of data.
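When choosing different indices, it is often easier to start from a latitude/longitude box than from raw index numbers. The sketch below uses a hypothetical helper, `index_range`, and synthetic coordinate lists that assume a 0.5-degree global grid with latitude running from 90 to -90 and longitude from 0 to 359.5. Check the real `lat` and `lon` values before relying on that layout.

```python
def index_range(coords, lo, hi):
    """Return a slice covering every coordinate value within [lo, hi].

    Works for ascending or descending one-dimensional coordinates.
    """
    hits = [i for i, c in enumerate(coords) if lo <= c <= hi]
    if not hits:
        raise ValueError(f"no coordinate values fall within [{lo}, {hi}]")
    return slice(hits[0], hits[-1] + 1)


# Synthetic coordinates (assumed layout, not read from the dataset)
lat_values = [90 - 0.5 * i for i in range(361)]  # 90.0 down to -90.0
lon_values = [0.5 * i for i in range(720)]       # 0.0 up to 359.5

lat_slice = index_range(lat_values, 20, 35)
lon_slice = index_range(lon_values, 260, 280)
print(lat_slice, lon_slice)  # slice(110, 141, None) slice(520, 561, None)
```

In the notebook, the same helper could be given `lat[:]` and `lon[:]`, and the resulting slices used as `var[0, lat_slice, lon_slice]` so that only the boxed region is downloaded.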

<a href="#top">Top</a>

---

## See also

For more information on Siphon and `remote_access`, see the [Siphon docs](https://unidata.github.io/siphon/latest/api/catalog.html?highlight=remote%20open#siphon.catalog.Dataset.remote_access).

You may also be interested in reading more about [OPENDAP](https://www.opendap.org/) and [CDM Remote](https://www.unidata.ucar.edu/software/netcdf-java/v4.5/reference/stream/CdmRemote.html).


<a href="#top">Top</a>

---