{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "
\n", "\n", "
\n", "
\n", "\n", "

Siphon (subset)

\n", "

Unidata AMS 2021 Student Conference

\n", "\n", "
\n", "
\n", "\n", "---\n", "\n", "This notebook demonstrates how to use Siphon to subset and download data with the NetcdfSubset service (NCSS). NCSS supports coordinate-based subsetting, i.e. selecting data by latitude, longitude, time, and other coordinates.\n", "
\n", "\n", "\n", "### Focuses\n", "* Use an NCSS client to view metadata of a dataset\n", "* Build NCSS queries\n", "  * Query point data\n", "  * Query grid data\n", "* Download data subsets by lat, lon, and time\n", "\n", "\n", "### Objectives\n", "1. [Find a dataset in a TDS Catalog](#1.-Find-a-dataset-in-a-TDS-Catalog)\n", "1. [Create an NCSS client and access metadata](#2.-Create-an-NCSS-client-and-access-metadata)\n", "1. [Use NCSS to query and subset data at a single point](#3.-Use-NCSS-to-query-and-subset-data-at-a-single-point)\n", "1. [Use NCSS to query and subset data for a gridded region](#4.-Use-NCSS-to-query-and-subset-data-for-a-gridded-region)\n", "\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Imports\n", "Before beginning, let's import the packages to be used throughout this training:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "from siphon.catalog import TDSCatalog\n", "from datetime import datetime, timedelta" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Find a dataset in a TDS Catalog\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Our first step is to find a dataset that we'd like to access and subset. \n", "In this example, we'll use the latest [`GFS Quarter Degree Forecast`](https://thredds.ucar.edu/thredds/catalog/grib/NCEP/GFS/Global_0p25deg/catalog.html) dataset from the Unidata THREDDS catalog." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's start with the top-level catalog:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "top_cat = TDSCatalog('http://thredds.ucar.edu/thredds/catalog.xml')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And then navigate down two levels to the GFS catalog:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "models_cat = top_cat.catalog_refs[0].follow()  # follow() returns the catalog that the reference points to\n", "gfs_cat = models_cat.catalog_refs['GFS Quarter Degree Forecast'].follow()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we get a handle for our dataset using `latest`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ds = gfs_cat.latest\n", "ds.name" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can now view the access protocols available for our dataset." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "list(ds.access_urls)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This list includes the `NetcdfSubset` service (or NCSS), which is the service we'll be using to subset and download our data." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Top\n", "\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Create an NCSS client and access metadata\n", "\n", "To use the NetcdfSubset service, we first call `subset` to get an NCSS client." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ncss = ds.subset()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With this client, we can view the variables in our dataset." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "list(ncss.variables)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also access the metadata, which is returned as an [NCSSDataset object](https://unidata.github.io/siphon/latest/api/ncssdataset.html#siphon.ncss_dataset.NCSSDataset)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "metadata = ncss.metadata\n", "# print metadata\n", "print(\"time span: \" + str(metadata.time_span))\n", "print(\"\\naccept list: \" + str(metadata.accept_list))\n", "print(\"\\nlat_lon_box: \" + str(metadata.lat_lon_box))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will use this metadata to create our subset query in the next section." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Top\n", "\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Use NCSS to query and subset data at a single point\n", "We can now use our NCSS client to create a query for the data we want." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this example, we'll request a subset of data containing the next 24 hours of forecast at a single point. \n", "First, we create a query object." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "query = ncss.query()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we populate the query to request the data we want." 
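, "\n", "\n", "
To make it clearer what a populated query carries, the sketch below assembles the kind of key/value parameters an NCSS point request uses and encodes them with the standard library. It is an illustration only, not Siphon's implementation: the parameter names are taken from the NCSS REST interface (`var`, `longitude`, `latitude`, `time_start`, `time_end`, `accept`), the fixed start time is made up so the result is reproducible, and no request is sent.

```python
from datetime import datetime, timedelta
from urllib.parse import urlencode

# Made-up start time so the example is reproducible (assumption, not from the dataset)
start = datetime(2021, 1, 10, 12)

# Key/value pairs for a point request: one variable, one location, a 24-hour window
params = [
    ('var', 'Temperature_surface'),
    ('longitude', -105),
    ('latitude', 40),
    ('time_start', start.isoformat()),
    ('time_end', (start + timedelta(days=1)).isoformat()),
    ('accept', 'netcdf4'),
]

# Percent-encode the pairs into a URL query string
query_string = urlencode(params)
print(query_string)
```

The query object used in this notebook plays the same role: each method call records one or more of these parameters, and `get_data` sends them to the server.
"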
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "query.lonlat_point(lon=-105, lat=40)  # set coordinates of point of interest\n", "now = datetime.utcnow()  # get current time (UTC)\n", "query.time_range(now, now + timedelta(days=1))  # create time range of 24 hours\n", "query.variables('Temperature_surface')  # request surface temperature variable\n", "query.accept('netcdf4')  # return data as a netCDF4 object" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once our query is fully populated, we can request the data." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "point_data = ncss.get_data(query)\n", "list(point_data.variables)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, let's plot our returned data." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "temp = point_data.variables['Temperature_surface'][:]  # get surface temperature data\n", "time = point_data.variables['time'][:]  # get time data\n", "plt.plot(time, temp, 'k-');  # plot data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Top\n", "\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4. Use NCSS to query and subset data for a gridded region\n", "We can also request data for a region using a bounding box." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We start by creating a query object, just as before." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "query = ncss.query()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will populate this query much as before, except that we'll use `lonlat_box` instead of `lonlat_point` and request a single time with `time` instead of a time range." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "query.lonlat_box(east=-80, west=-90, south=35, north=45)  # set bounding coordinates\n", "query.time(now + timedelta(days=1))  # request a single time, 24 hours from now\n", "query.variables('Temperature_surface')\n", "query.accept('netcdf4')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Again, we request the data using `get_data`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "grid_data = ncss.get_data(query)\n", "list(grid_data.variables)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And plot the surface temperature forecast for our region of interest, valid 24 hours from now." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "temp = grid_data.variables['Temperature_surface']\n", "lat = grid_data.variables['lat']\n", "lon = grid_data.variables['lon']\n", "plt.pcolormesh(lon[:], lat[:], temp[0], shading='auto');\n", "plt.title(temp.name);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Try creating your own NCSS query to request different subsets of data, e.g. different regions, different times..." 
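, "\n", "\n", "
One way to experiment is to wrap the region and time choices in small helpers. The two functions below are hypothetical (they are not part of Siphon) and only build the arguments that `lonlat_box` and `time_range` take, so they can be tried without a server connection. The centre point, box size, and start time are arbitrary example values.

```python
from datetime import datetime, timedelta

def box_around(lon, lat, half_width_deg):
    # Hypothetical helper: keyword arguments for a square box centred on (lon, lat),
    # with latitudes clamped to the valid range
    return {
        'west': lon - half_width_deg,
        'east': lon + half_width_deg,
        'south': max(lat - half_width_deg, -90.0),
        'north': min(lat + half_width_deg, 90.0),
    }

def forecast_window(start, hours):
    # Hypothetical helper: (start, end) pair covering the requested number of hours
    return start, start + timedelta(hours=hours)

box = box_around(lon=-105.0, lat=40.0, half_width_deg=5.0)
start, end = forecast_window(datetime(2021, 1, 10, 12), hours=48)
print(box)
print(start.isoformat(), end.isoformat())

# With the `ncss` client from this notebook, these could then be used as:
# query = ncss.query()
# query.lonlat_box(**box)
# query.time_range(start, end)
# query.variables('Temperature_surface')
# query.accept('netcdf4')
```

Changing only the arguments to `box_around` and `forecast_window` is then enough to request a different region or time window.
"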
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Top\n", "\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## See also\n", "\n", "For more information on Siphon and using NCSS read the [docs](https://unidata.github.io/siphon/latest/api/ncss.html).\n", "\n", "You can also read more about the NetcdfSubset service [here](https://www.unidata.ucar.edu/software/tds/current/reference/NetcdfSubsetServiceReference.html).\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Top\n", "\n", "---" ] } ], "metadata": { "kernelspec": { "display_name": "Python [conda env:pyaos-ams-2021]", "language": "python", "name": "conda-env-pyaos-ams-2021-py" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.1" } }, "nbformat": 4, "nbformat_minor": 4 }