\n",
"\n",
"---\n",
"\n",
"This notebook will demonstrate how to use Siphon to subset and download data using the NetcdfSubset service (NCSS). NCSS supports coordinate-based subsetting, i.e. selecting data by latitude, longitude, time, etc.\n",
"\n",
"\n",
"\n",
"### Focuses\n",
"* Use a NCSS client to view metadata of a dataset\n",
"* Build NCSS queries\n",
" * Query point data\n",
" * Query grid data\n",
"* Download data subsets by lat, lon, and time\n",
"\n",
"\n",
"### Objectives\n",
"1. [Find a dataset in a TDS Catalog](#1.-Find-a-dataset-in-a-TDS-Catalog)\n",
"1. [Create an NCSS Client and access metadata](#2.-Create-an-NCSS-client-and-access-metadata)\n",
"1. [Use NCSS to query and subset data at a single point](#3.-Use-NCSS-to-query-and-subset-data-at-a-single-point)\n",
"1. [Use NCSS to query and subset data for a gridded region](#4.-Use-NCSS-to-query-and-subset-data-for-a-gridded-region)\n",
"\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Imports\n",
"Before beginning, let's import the packages to be used throughout this training:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"from siphon.catalog import TDSCatalog\n",
"from datetime import datetime, timedelta"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Find a dataset in a TDS Catalog\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Our first step is to find a dataset that we'd like to access and subset. \n",
"In this example, we'll use the latest [`GFS Quarter Degree Forecaset`](https://thredds.ucar.edu/thredds/catalog/grib/NCEP/GFS/Global_0p25deg/catalog.html) dataset from the Unidata THREDDS catalog."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's start with the top level catalog:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"top_cat = TDSCatalog('http://thredds.ucar.edu/thredds/catalog.xml')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And then navigate down two levels to the GFS catalog:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"models_cat = top_cat.catalog_refs[0].follow() # follow reaturns a handle to the specified dataset\n",
"gfs_cat = models_cat.catalog_refs['GFS Quarter Degree Forecast'].follow()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, we get a handle for our dataset using `latest`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ds = gfs_cat.latest\n",
"ds.name"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can now view the access protocols available for our dataset."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"list(ds.access_urls)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This list includes the `NetcdfSubset` service (or NCSS), which is the service we'll be using to subset and download our data."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Top\n",
"\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Create an NCSS client and access metadata\n",
"\n",
"To use the NetcdfSubset service, we first call `subset` to get an NCSS client."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ncss = ds.subset()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"With this client, we can view the variables in our dataset."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"list(ncss.variables)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can also access the metata, which will be returned as [NCSSDataset object](https://unidata.github.io/siphon/latest/api/ncssdataset.html#siphon.ncss_dataset.NCSSDataset)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"metadata = ncss.metadata\n",
"# print metadata\n",
"print(\"time span: \" + str(metadata.time_span))\n",
"print(\"\\naccept list: \" + str(metadata.accept_list))\n",
"print(\"\\nlat_lon_box: \" + str(metadata.lat_lon_box))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will use this metadata to create our subset query in the next section."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Top\n",
"\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Use NCSS to query and subset data at a single point\n",
"We can now use our NCSS client to create a query for the data we want."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this example, we'll request a subset of data containing the next 24 hours of forecast at a single point. \n",
"First, we create a query object."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"query = ncss.query()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, we populate the query to request the data we want."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"query.lonlat_point(lon=-105, lat=40) # set coordinates of point of interest.\n",
"now = datetime.utcnow() # get current time\n",
"query.time_range(now, now + timedelta(days=1)) # create time range of 24 hours\n",
"query.variables('Temperature_surface') # request surface temperature variable\n",
"query.accept('netcdf4') # return data as a netCDF4 object"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Once our query is fully populated, we can request the data."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"point_data = ncss.get_data(query)\n",
"list(point_data.variables)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, let's plot our returned data."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"temp = point_data.variables['Temperature_surface'][:] # get surface temperature data\n",
"time = point_data.variables['time'][:] # get time data\n",
"plt.plot(time, temp, 'k-'); # plot data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Top\n",
"\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Use NCSS to query and subset data for a gridded region\n",
"We can also request data for a region using a bounding box."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We start by creating a query object, just as before."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"query = ncss.query()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will populate this query with the same values as before, except instead of `latlon_point` we'll use `latlon_box`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"query.lonlat_box(east=-80, west=-90, south=35, north=45) # set bounding coordinates\n",
"query.time(now + timedelta(days=1))\n",
"query.variables('Temperature_surface')\n",
"query.accept('netcdf4')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Again, we request the data using `get_data`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"grid_data = ncss.get_data(query)\n",
"list(grid_data.variables)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And plot the surface temperature forecast in our region of interest over the next 24 hours."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"temp = grid_data.variables['Temperature_surface']\n",
"lat = grid_data.variables['lat']\n",
"lon = grid_data.variables['lon']\n",
"plt.pcolormesh(lon[:], lat[:], temp[0], shading='auto');\n",
"plt.title(temp.name);"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Try creating your own NCSS query to request different subsets of data, e.g. different regions, different times..."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Top\n",
"\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## See also\n",
"\n",
"For more information on Siphon and using NCSS read the [docs](https://unidata.github.io/siphon/latest/api/ncss.html).\n",
"\n",
"You can also read more about the NetcdfSubset service [here](https://www.unidata.ucar.edu/software/tds/current/reference/NetcdfSubsetServiceReference.html).\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Top\n",
"\n",
"---"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python [conda env:pyaos-ams-2021]",
"language": "python",
"name": "conda-env-pyaos-ams-2021-py"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 4
}