{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## The Disappearing Walker Lake\n",
    "\n",
    "While the loss of the Aral Sea in Kazakhstan and Lake Urmia in Iran have received a lot of attention over the last few decades, this trend is a global phenomena.  Reciently a number of __[papers](https://earthobservatory.nasa.gov/IOTD/view.php?id=91921)__ have been published including one focusing on the __[Decline of the world's saline lakes](https://www.nature.com/articles/ngeo3052)__.  Many of these lakes have lost the majority of their volume over the last century, including Walker Lake (Nevada, USA) which has lost 90 percent of its volume over the last 100 years.\n",
    "\n",
    "The following example is intended to replicate the typical processing required in change detection studies similar to the __[Decline of the world's saline lakes](https://www.nature.com/articles/ngeo3052)__."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import intake\n",
    "import numpy as np\n",
    "import xarray as xr\n",
    "import holoviews as hv\n",
    "import geoviews as gv\n",
    "import datashader as ds\n",
    "import cartopy.crs as ccrs\n",
    "import pandas as pd\n",
    "import glob\n",
    "from holoviews.operation.datashader import rasterize\n",
    "\n",
    "from holoviews.operation.datashader import regrid, shade\n",
    "\n",
    "hv.extension('bokeh', width=80)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# arbitrarily choose a small memory limit (4GB) to stress the \n",
    "# out of core processing infrastructure\n",
    "from dask.distributed import Client\n",
    "client = Client(memory_limit=10e10, processes=False) # Note: was 6e9 \n",
    "client"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Landsat Image Data\n",
    "\n",
    "To replicate this study, we first have to obtain the data from primary sources.  The conventional way to obtain Landsat image data is to download it through USGS's \n",
    "__[EarthExplorer](https://earthexplorer.usgs.gov/)__ or NASA's __[Giovanni](https://giovanni.gsfc.nasa.gov/giovanni/)__, but to facilitate the example two images have been downloaded from EarthExployer and cached.  \n",
    "\n",
    "The two images used by the original study are LT05_L1TP_042033_19881022_20161001_01_T1 and \n",
    "LC08_L1TP_042033_20171022_20171107_01_T1 from 1988/10/22 and 2017/10/22 respectivly.  These images contain Landsat Surface Reflectance Level-2 Science Product images."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Loading into xarray via ``intake``\n",
    "\n",
    "In the next cell, we load the Landsat-5 files into a single xarray ``DataArray`` using [intake](https://intake.readthedocs.io/en/latest/overview.html).  Data sources and caching parameters are specified in a catalog file.  Intake is optional, since any other method of creating an ``xarray.DataArray`` object would work here as well, but it makes it simpler to work with remote datasets while caching them locally."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cat = intake.open_catalog('catalog.yml')\n",
    "l5 = cat.l5()\n",
    "l5.cache[0].clear_all()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "L5_img = l5.read_chunked()\n",
    "print(L5_img)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "L5_img = L5_img.squeeze(dim='band', drop=True)\n",
    "L5_img"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "l8 = cat.l8()\n",
    "L8_img = l8.read_chunked()\n",
    "L8_img = L8_img.squeeze(dim='band', drop=True)\n",
    "print(L8_img)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let us view this ``DataArray``:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And some of the related metadata:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"The shape of the DataArray is :\", L5_img.shape)\n",
    "print(\"With attributes:\\n \", '\\n  '.join('%s=%s'%(k,v) for k,v in L5_img.attrs.items()))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can use this EPSG value shown above under the ``crs`` key to create a cartopy coordinate reference system that we will be using later on in this notebook:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "crs=ccrs.epsg(32611)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Computing the NDVI (1988)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let us compute the [NDVI](https://en.wikipedia.org/wiki/Normalized_difference_vegetation_index) for the 1988 image. Note that we need to promote the ``DataArray`` format as returned by ``rasterio`` to an xarray ``DataSet``. This restriction should be lifted in future (see geoviews issue [209](https://github.com/pyviz/geoviews/issues/209))."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "L5_img.data[L5_img.data==-9999] = np.NaN  # Replace the -9999\n",
    "ndvi5_array = (L5_img[4]-L5_img[3])/(L5_img[4]+L5_img[3])\n",
    "ndvi5 = ndvi5_array.to_dataset(name='ndvi')[['x','y', 'ndvi']]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Computing the NDVI (2017)\n",
    "\n",
    "Now we can do this for the Landsat 8 files for the 2017 image:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "L8_img.data[L8_img.data==-9999] = np.NaN  # Replace the -9999\n",
    "ndvi8_array = (L8_img[4]-L8_img[3])/(L8_img[4]+L8_img[3])\n",
    "ndvi8 = ndvi8_array.to_dataset(name='ndvi')[['x','y', 'ndvi']]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Viewing change via dropdown\n",
    "\n",
    "Using [datashader](http://datashader.org/) together with [geoviews](http://geoviews.org/), we can now easily build an interactive visualization where we select between the 1988 and 2017 images. The use of datashader allows these images to be dynamically updated according to zoom level (Note: it can take datashader a minute to 'warm up' before it becomes fully interactive). For more information on how the dropdown widget was created using ``HoloMap``, please refer to the [HoloMap reference](http://holoviews.org/reference/containers/bokeh/HoloMap.html#bokeh-gallery-holomap)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%opts Image (cmap='viridis') [width=500 height=500 tools=['hover'] colorbar=True]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "hmap = hv.HoloMap({'1988':gv.Image(ndvi5, crs=crs, vdims=['ndvi']), \n",
    "                   '2017':gv.Image(ndvi8, crs=crs, vdims=['ndvi'])}, \n",
    "                  kdims=['Year']).redim(x='lon', y='lat') # Mapping 'x' and 'y' from rasterio to 'lon' and 'lat'\n",
    "rasterize(hmap)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Computing statistics and projecting display\n",
    "\n",
    "The rest of the notebook shows how statistical operations can reduce the dimensionality of the data that may be used to compute new features that may be used as part of an ML pipeline. \n",
    "\n",
    "### The mean and sum over the two time points\n",
    "\n",
    "The next plot (may take a minute to compute) shows the mean of the two NDVI images next to the sum of them:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mean_avg = hmap.collapse(dimensions=['Year'], function=np.mean)\n",
    "mean_img = gv.Image(mean_avg.data, crs=ccrs.epsg(32611), \n",
    "                    kdims=['lon', 'lat'], vdims=['ndvi']).relabel('Mean over Year')\n",
    "\n",
    "summed = hmap.collapse(dimensions=['Year'], function=np.sum)\n",
    "summed_image = gv.Image(summed.data, crs=ccrs.epsg(32611), \n",
    "                        kdims=['lon', 'lat'], vdims=['ndvi']).relabel('Sum over Year')\n",
    "\n",
    "rasterize(mean_img) + rasterize(summed_image)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Difference in NDVI between 1988 and 2017\n",
    "\n",
    "The change in Walker Lake as viewed using the NDVI can be shown by subtracting the NDVI recorded in 1988 from the NDVI recorded in 2017:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "difference = gv.Image(np.subtract(hmap['1988'].data, hmap['2017'].data), crs=ccrs.epsg(32611), \n",
    "                      kdims=['lon', 'lat'], vdims=['ndvi']).relabel('Difference in NDVI')\n",
    "rasterize(difference.redim(ndvi='delta_ndvi'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can see a large change (positive delta) in the areas where there is water, indicating a reduction in the size of the lake over this time period."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Slicing across ``lon`` and ``lat``\n",
    "\n",
    "As a final example, we can use the ``sample`` method to slice across the difference in NDVI along (roughly)the midpoint of the latitude and the midpoint of the longitude. To do this, we define the following helper function to convert latitude/longitude into the appropriate coordinate value used by the ``DataSet``:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def from_lon_lat(x,y):\n",
    "    return ccrs.epsg(32611).transform_point(x,y, ccrs.PlateCarree())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%opts Curve [width=600 tools=['hover']]\n",
    "lon_y, lat_x = from_lon_lat(-118, 39) # Longitude of -118 and Latitude of 39\n",
    "(difference.sample(lat=lat_x) + difference.sample(lon=lon_y)).cols(1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}