{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Reading in Climate Data with `xarray`\n", "Xarrays is a new-ish Python library that makes working with n-dimensional data relatively painless. \n", "\n", "More info: http://xarray.pydata.org/en/stable/" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Import libraries\n", "import xarray as xr\n", "import numpy as np\n", "import pandas as pd\n", "from matplotlib import pyplot as plt" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Load the netCDF dataset into an `xarray` object\n", "fileName = 'macav2livneh_pr_bcc-csm1-1_r1i1p1_historical_1990_2005_CONUS_monthly.nc'\n", "ds = xr.open_dataset(fileName)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Extract the precipitation data into an xarray dataset object\n", "ds = xr.open_dataset(fileName)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Save the precipitation values into a single data array\n", "arrPrecip = ds['precipitation']" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Display information about the arrPrecip data array\n", "print(arrPrecip)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Extract the time dimension from the data array\n", "arrTime = arrPrecip.time" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Display the first 5 records of the arrTime data array\n", "arrTime[:5].data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Extract the lat and long values\n", "arrLat = arrPrecip.lat\n", "arrLon = arrPrecip.lon" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "arrLat[30].data,arrLon[30].data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Extract data for one time-location combination\n", "Use the xarray dataset's `sel` function to select the datum nearest the specified time and location" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "theTime = np.datetime64('1990-03-15')\n", "theLat = 36.005\n", "theLon = 360-78.942" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Extract the value corresponding to the point nearest to the specified time & location\n", "theResult = ds.sel(lat=theLat,lon=theLon,time=theTime,method='nearest')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Show the precipitation at that point\n", "theResult.precipitation.data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Plot a time series for one location\n", "Dropping one of the dimensions in the `sel` statement retrieves all data in that dimension fitting the criteria specified in the other dimensions. Here, we omit the time constraint." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Select only on lat and lon and we get all precip data for all times\n", "theTimeSeries = ds.sel(lat=theLat,lon=theLon,method='nearest')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Extract the precipitation data array from the filtered dataset\n", "ts_Precip = theTimeSeries.precipitation" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Plot the time series\n", "fig = plt.figure(figsize=(15,5))\n", "ts_Precip.plot.line();" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Map precipitation for one time period" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Drop the lat and lon filters to grab data for all locations\n", "theMapResult = ds.sel(time=theTime).precipitation" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Plot the data\n", "fig = plt.figure(figsize=(12,5))\n", "theMapResult.plot(cmap=\"YlGnBu\", #Specifies the colors to use\n", " robust=True); #Drops outliers (<2%,>98%) from plot" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create a spatial subset\n", "We uses \"slices\" to extract subsets of data. Here we subset the data spatially and compute the mean" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Specify the spatial slices to grab\n", "theLats = slice(25,36.5)\n", "theLons = slice(360-91,360-76)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Extract the spatial subset into its own data array\n", "theSubset = ds.sel(lon=theLons,lat=theLats).precipitation" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Display the shape of the returned data array\n", "theSubset.shape" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Plot a histogram of the data within the subset\n", "theSubset.plot();" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Compute the mean across the time axis and show a map\n", "theSubsetAvg = theSubset.mean(axis=0)\n", "theSubsetAvg.plot(cmap=\"YlGnBu\");" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Fancier plots\n", "plt.figure(figsize=(15,3))\n", "plt.subplot(1,3,1)\n", "theSubset.min(axis=0).plot(cmap=\"YlGnBu\")\n", "plt.subplot(1,3,2)\n", "theSubset.mean(axis=0).plot(cmap=\"YlGnBu\")\n", "plt.subplot(1,3,3)\n", "theSubset.max(axis=0).plot(cmap=\"YlGnBu\")\n", "plt.show();" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Calculate summer (JJA) average\n", "The `xarray` package supports seasons to make easy seasonal averages. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Create a new data array by converting the dates in the `time` array to seasons\n", "arrSeason = ds['time'].dt.season" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Replace the time dimension with seasons\n", "ds['time'] = ds['time'].dt.season" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Extract precipitation for just the summer months; we have 48 summers of data\n", "summerPrecip = ds.sel(time='JJA',lat=theLats,lon=theLons).precipitation\n", "summerPrecip.shape" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "summerPrecip.plot();" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#Fancier plots\n", "plt.figure(figsize=(15,5))\n", "plt.subplot(1,2,1)\n", "summerPrecip.mean(axis=0).plot(cmap=\"YlGnBu\")\n", "plt.subplot(1,2,2)\n", "summerPrecip.std(axis=0).plot(cmap=\"YlGnBu\")\n", "#plt.subplot(1,3,3)\n", "#summerPrecip.max(axis=0).plot(cmap=\"YlGnBu\")\n", "plt.show();" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.4" } }, "nbformat": 4, "nbformat_minor": 2 }