{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Siphon (remote_access)\n", "\n", "### Unidata AMS 2021 Student Conference\n", "\n", "---\n", "\n", "In this notebook, we'll cover opening, inspecting, subsetting, and plotting a TDS dataset using Siphon's `remote_access` method.\n", "\n", "### Focuses\n", "* Use Siphon `remote_access` to open a TDS dataset\n", "* Access dataset using both [CDM Remote](https://www.unidata.ucar.edu/software/netcdf-java/v4.5/reference/stream/CdmRemote.html) and [OPENDAP](https://www.opendap.org/)\n", "* Subset and download variables in dataset\n", "* Plot downloaded data\n", "\n", "\n", "### Objectives\n", "1. [Find a dataset in a TDS Catalog](#1.-Find-a-dataset-in-a-TDS-Catalog)\n", "1. [Access the dataset using `remote_access`](#2.-Access-the-dataset-using-remote_access)\n", "1. [Use the remote dataset to subset, download, and display data](#3.-Use-the-remote-dataset-to-subset,-download,-and-display-data)\n", "\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Imports\n", "Before beginning, let's import the packages to be used throughout this training:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "from siphon.catalog import TDSCatalog" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Find a dataset in a TDS Catalog\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before we use `remote_access`, we need to find a dataset that we'd like to access. \n", "As an example, we'll use this [dataset](https://thredds-test.unidata.ucar.edu/thredds/catalog/casestudies/harvey/model/gfs/GFS_Global_0p5deg_20170825_1800.grib2/catalog.html?dataset=casestudies/harvey/model/gfs/GFS_Global_0p5deg_20170825_1800.grib2) from the Unidata THREDDS test catalog."
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To access a dataset, we need to know two things:\n", "* the URL of the catalog where the dataset lives\n", "* the dataset name\n", "\n", "The dataset name can be found on the [dataset HTML page](https://thredds-test.unidata.ucar.edu/thredds/catalog/casestudies/harvey/model/gfs/GFS_Global_0p5deg_20170825_1800.grib2/catalog.html?dataset=casestudies/harvey/model/gfs/GFS_Global_0p5deg_20170825_1800.grib2); in this case it is \"GFS_Global_0p5deg_20170825_1800.grib2\". \n", "The catalog URL is the URL of the dataset page up to and including \"catalog.html\" (dropping the \"?dataset=\" query string that follows), with \".html\" replaced by \".xml\"." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "catUrl = \"https://thredds-test.unidata.ucar.edu/thredds/catalog/casestudies/harvey/model/gfs/GFS_Global_0p5deg_20170825_1800.grib2/catalog.xml\"\n", "datasetName = \"GFS_Global_0p5deg_20170825_1800.grib2\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you have another TDS dataset in mind, you can replace the catalog URL and dataset name above to point to that dataset instead." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we access the catalog using the catalog URL:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "catalog = TDSCatalog(catUrl)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And then select our dataset using the dataset name:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ds = catalog.datasets[datasetName]\n", "ds.name" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can now view the access protocols available for our dataset."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "list(ds.access_urls)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Access the dataset using `remote_access`\n", "\n", "Now that we have our dataset and know its access protocols, we can access the remote dataset." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Access via CDM Remote\n", "If the name of the service is not provided, `remote_access` defaults to using the `CdmRemote` service." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dataset = ds.remote_access()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The call to `ds.remote_access` opens the remote dataset and returns a netCDF4-like dataset object, which provides access to the metadata." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# list attributes\n", "list(dataset.ncattrs())" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# list variables\n", "list(dataset.variables)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Access via OPENDAP\n", "We can also use `remote_access` to open the dataset via OPENDAP." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dataset = ds.remote_access('OPENDAP')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The returned netCDF4-like dataset object contains the same metadata as that returned by access via CdmRemote." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# list attributes\n", "list(dataset.ncattrs())" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# list variables\n", "list(dataset.variables)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Other than possible reordering of listed attributes and variables, users should see no difference in the object returned by `remote_access` using OPENDAP versus CdmRemote. To read more about the two services, see the [resource links](#See-also)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Use the remote dataset to subset, download, and display data\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can access variables by name using the dataset's `variables` dictionary." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "var = dataset.variables['Precipitable_water_entire_atmosphere_single_layer']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And view the variable's metadata:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(var.shape)\n", "print(var.dimensions)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can start plotting our data. Let's plot our variable, `Precipitable_water_entire_atmosphere_single_layer`, for all `lat` and `lon` at `time=0`. First, we need to access the `lat` and `lon` variables. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "lat = dataset.variables['lat']\n", "lon = dataset.variables['lon']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Note:* At this point, no data have been transferred over the network. Data will not be transferred until a variable is sliced, and only data corresponding to the slice are downloaded."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "v = np.squeeze(var[0, :, :])  # precipitable water data are subset and downloaded here\n", "\n", "# plot precipitable water\n", "plt.pcolormesh(lon[:], lat[:], v, shading='auto')  # lat and lon data are subset and downloaded here\n", "plt.title(var.name);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Data are downloaded only when we slice our variables to plot the data. Try changing the indices to request a different subset of data." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## See also\n", "\n", "For more information on Siphon and `remote_access`, see the [Siphon docs](https://unidata.github.io/siphon/latest/api/catalog.html?highlight=remote%20open#siphon.catalog.Dataset.remote_access).\n", "\n", "You may also be interested in reading more about [OPENDAP](https://www.opendap.org/) and [CDM Remote](https://www.unidata.ucar.edu/software/netcdf-java/v4.5/reference/stream/CdmRemote.html).\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] } ], "metadata": { "kernelspec": { "display_name": "Python [conda env:pyaos-ams-2021]", "language": "python", "name": "conda-env-pyaos-ams-2021-py" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.1" } }, "nbformat": 4, "nbformat_minor": 4 }