{ "cells": [ { "cell_type": "markdown", "metadata": { "hide_input": false, "tags": [] }, "source": [ "![OpenSARlab notebook banner](NotebookAddons/blackboard-banner.png)\n", "\n", "## **Mapping Areas of Active Agriculture using SAR Time Series Data and Coefficient of Variation**\n", "\n", "### **Franz J Meyer; University of Alaska Fairbanks & Josef Kellndorfer, [Earth Big Data, LLC](http://earthbigdata.com/)**\n", "\n", "\n", "\n", "This notebook introduces you to a simple method to delineate areas of active agriculture from SAR time series data. The method is based on the **Coefficient of Variation** and results in the detection of actively managed crop areas. The coefficient of variation approach will be used as the main algorithm for agriculture mapping by the upcoming NASA/ISRO SAR mission NISAR. Hence, the approach demonstrated here can be applied to data from this future mission.\n", " \n", "The exercises will use *Jupyter Notebooks*, an environment that is easy to launch in any web browser for interactive data exploration with provided or new training data. Notebooks combine text written in markdown formatting (including LaTeX-style mathematical equations) with executable Python code. Another advantage of Jupyter Notebooks is that they can easily be expanded, changed, and shared with new data sets or newly available time series steps. Therefore, they provide an excellent basis for collaborative and repeatable data analysis.\n", "\n", "**This notebook covers the following data analysis concepts:**\n", "\n", "- How to calculate the Coefficient of Variation from time series SAR data.\n", "- How to use the Coefficient of Variation to indicate areas of active agriculture." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "\n", "---\n", "\n", "**Important Note about JupyterHub**\n", "\n", "**Your JupyterHub server will automatically shut down when left idle for more than 1 hour. Your notebooks will not be lost but you will have to restart their kernels and re-run them from the beginning. You will not be able to seamlessly continue running a partially run notebook.**" ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "import url_widget as url_w\n", "notebookUrl = url_w.URLWidget()\n", "display(notebookUrl)" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.display import Markdown\n", "from IPython.display import display\n", "\n", "notebookUrl = notebookUrl.value\n", "user = !echo $JUPYTERHUB_USER\n", "env = !echo $CONDA_PREFIX\n", "if env[0] == '':\n", "    env[0] = 'Python 3 (base)'\n", "if env[0] != '/home/jovyan/.local/envs/rtc_analysis':\n", "    display(Markdown(f'WARNING:'))\n", "    display(Markdown(f'This notebook should be run using the \"rtc_analysis\" conda environment.'))\n", "    display(Markdown(f'It is currently using the \"{env[0].split(\"/\")[-1]}\" environment.'))\n", "    display(Markdown(f'Select \"rtc_analysis\" from the \"Change Kernel\" submenu of the \"Kernel\" menu.'))\n", "    display(Markdown(f'If the \"rtc_analysis\" environment is not present, use Create_OSL_Conda_Environments.ipynb to create it.'))\n", "    display(Markdown(f'Note that you must restart your server after creating a new environment before it is usable by notebooks.'))" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "\n", "---\n", "\n", "## **0. 
Importing Relevant Python Packages**\n", "\n", "In this notebook we will use the following scientific libraries:\n", "\n", "1. **[Pandas](https://pandas.pydata.org/)** is a Python library that provides high-level data structures and a vast variety of tools for analysis. The great feature of this package is the ability to translate rather complex operations with data into one or two commands. Pandas contains many built-in methods for filtering and combining data, as well as time-series functionality. \n", "1. **[GDAL](https://www.gdal.org/)** is a software library for reading and writing raster and vector geospatial data formats. It includes a collection of programs tailored for geospatial data processing. Most modern GIS systems (such as ArcGIS or QGIS) use GDAL in the background.\n", "1. **[NumPy](http://www.numpy.org/)** is one of the principal packages for scientific applications of Python. It is intended for processing large multidimensional arrays and matrices, and an extensive collection of high-level mathematical functions and implemented methods makes it possible to perform various operations with these objects.\n", "1. **[Matplotlib](https://matplotlib.org/index.html)** is a low-level library for creating two-dimensional diagrams and graphs. With its help, you can build diverse charts, from histograms and scatterplots to non-Cartesian coordinate graphs. Moreover, many popular plotting libraries are designed to work in conjunction with matplotlib.\n", "1. The **[asf-hyp3 API](https://www.pydoc.io/pypi/asf-hyp3-1.1.1/index.html)** provides useful functions and scripts for accessing and processing SAR data via the Alaska Satellite Facility's Hybrid Pluggable Processing Pipeline, or HyP3 (pronounced \"hype\").\n", "1. **[SciPy](https://www.scipy.org/about.html)** is a library that provides functions for numerical integration, interpolation, optimization, linear algebra and statistics. \n", "\n", "Our first step is to **import them:**" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%capture\n", "from pathlib import Path\n", "\n", "import pandas as pd # for DatetimeIndex\n", "from osgeo import gdal # for Info\n", "gdal.UseExceptions()\n", "import numpy as np\n", "\n", "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "from matplotlib import animation\n", "from matplotlib import rc\n", "\n", "from IPython.display import HTML\n", "\n", "import opensarlab_lib as asfn\n", "asfn.jupytertheme_matplotlib_format()" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "\n", "---\n", "\n", "## **1. Load Data Stack** \n", "\n", "This notebook will work with a 46-image deep C-band SAR data stack south of Bogota, Colombia, covering the southern end of the city and adjacent agriculture areas. The area's proximity to Bogota facilitates visual validation of the SAR-derived agriculture extent information. The C-band data were acquired by ESA's Sentinel-1 SAR sensor constellation and are available to you through the services of the [Alaska Satellite Facility](https://www.asf.alaska.edu/). \n", "\n", "Data were acquired between January of 2017 and December of 2018 in descending orbit direction.\n",

"\n", "Before we get started, let's first **create a working directory for this analysis:**" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "path = Path(\"/home/jovyan/notebooks/SAR_Training/English/Ecosystems/AgricultureMapping\")\n", "\n", "if not path.exists():\n", "    path.mkdir()" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "We will **retrieve the relevant data** from an [Amazon Web Services (AWS)](https://aws.amazon.com/) cloud storage bucket **using the following command**:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "time_series_path = 's3://asf-jupyter-data-west/BogotaAg.zip'\n", "time_series = Path(time_series_path).name\n", "!aws --region=us-west-2 --no-sign-request s3 cp $time_series_path $time_series" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Now, let's **unzip the file (overwriting previous extractions) and clean up after ourselves:**" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "if Path(time_series).exists():\n", "    asfn.asf_unzip(str(path), time_series)\n", "    Path(time_series).unlink()" ] },
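{ "cell_type": "markdown", "metadata": {}, "source": [ "*Optional:* As a quick sanity check, we can confirm that the VV GeoTIFFs were extracted. This assumes the archive unpacks into a `tiffs` subdirectory, which is the layout that the VRT-building step later in this notebook relies on:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Optional sanity check: list the extracted VV GeoTIFFs (assumes a 'tiffs' subdirectory)\n", "tiff_paths = sorted((path/'tiffs').glob('*_VV.tiff'))\n", "print(f\"Found {len(tiff_paths)} VV GeoTIFFs under {path/'tiffs'}\")" ] },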
{ "cell_type": "markdown", "metadata": {}, "source": [ "## **2. Define Some Python Helper Functions for this Notebook**\n", "\n", "We are defining two helper functions for this notebook:\n", "\n", "- `create_geotiff()` to write out images\n", "- `timeseries_metrics()` to compute various metrics from a time series data stack" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def create_geotiff(name, array, data_type, ndv, bandnames=None, ref_image=None,\n", "                   geo_t=None, projection=None):\n", "    # If it's a 2D image we fake a third dimension:\n", "    if len(array.shape) == 2:\n", "        array = np.array([array])\n", "    if ref_image is None and (geo_t is None or projection is None):\n", "        raise RuntimeWarning('ref_image or settings required.')\n", "    if bandnames is not None:\n", "        if len(bandnames) != array.shape[0]:\n", "            raise RuntimeError('Need {} bandnames. {} given'\n", "                               .format(array.shape[0], len(bandnames)))\n", "    else:\n", "        bandnames = ['Band {}'.format(i+1) for i in range(array.shape[0])]\n", "    if ref_image is not None:\n", "        refimg = gdal.Open(ref_image)\n", "        geo_t = refimg.GetGeoTransform()\n", "        projection = refimg.GetProjection()\n", "    driver = gdal.GetDriverByName('GTiff')\n", "    # Replace NaNs with the no-data value (only meaningful for floating point arrays)\n", "    if array.dtype.kind == 'f':\n", "        array[np.isnan(array)] = ndv\n", "    DataSet = driver.Create(name, array.shape[2],\n", "                            array.shape[1], array.shape[0],\n", "                            data_type)\n", "    DataSet.SetGeoTransform(geo_t)\n", "    DataSet.SetProjection(projection)\n", "    for i, image in enumerate(array, 1):\n", "        DataSet.GetRasterBand(i).WriteArray(image)\n", "        DataSet.GetRasterBand(i).SetNoDataValue(ndv)\n", "        DataSet.GetRasterBand(i).SetDescription(bandnames[i-1])\n", "    DataSet.FlushCache()\n", "    return name" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def timeseries_metrics(raster, ndv=0):\n", "    # Make use of numpy nan functions\n", "    # Check if type is a float array\n", "    if raster.dtype.kind != 'f':\n", "        raster = raster.astype(np.float64)\n", "    # Set ndv to nan\n", "    if not np.isnan(ndv):\n", "        raster[np.equal(raster, ndv)] = np.nan\n", "    # Build dictionary of the metrics\n", "    tsmetrics = {}\n", "    rperc = np.nanpercentile(raster, [5, 50, 95], axis=0)\n", "    tsmetrics['mean'] = np.nanmean(raster, axis=0)\n", "    tsmetrics['max'] = np.nanmax(raster, axis=0)\n", "    tsmetrics['min'] = np.nanmin(raster, axis=0)\n", "    tsmetrics['range'] = tsmetrics['max'] - tsmetrics['min']\n", "    tsmetrics['median'] = rperc[1]\n", "    tsmetrics['p5'] = rperc[0]\n", "    tsmetrics['p95'] = rperc[2]\n", "    tsmetrics['prange'] = rperc[2] - rperc[0]\n", "    tsmetrics['var'] = np.nanvar(raster, axis=0)\n", "    tsmetrics['cov'] = tsmetrics['var'] / tsmetrics['mean']\n", "    return tsmetrics" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## **3. Define Data Directory and Path to VRT**\n", "\n", "**Create a variable containing the VRT filename and the image acquisition dates:**" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!gdalbuildvrt -separate {path}/raster_stack.vrt {path}/tiffs/*_VV.tiff\n", "image_file = path/\"raster_stack.vrt\"" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "**Create a `DatetimeIndex` of acquisition dates with Pandas:**" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!ls {path}/tiffs/*_VV.tiff | sort | sed 's/[^0-9]*//g' | cut -c 1-8 > {path}/raster_stack.dates\n", "datefile = path/'raster_stack.dates'\n", "dates = open(str(datefile)).readlines()\n", "tindex = pd.DatetimeIndex(dates)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "**Print the bands and dates for all images in the virtual raster table (VRT):**" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "j = 1\n", "print(f\"Bands and dates for {image_file}\")\n", "for i in tindex:\n", "    print(\"{:4d} {}\".format(j, i.date()), end=' ')\n", "    j += 1\n", "    if j % 5 == 1:\n", "        print()" ] },
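{ "cell_type": "markdown", "metadata": {}, "source": [ "*Optional:* Before animating and analyzing the stack, it can be useful to check how regularly the time series samples time. The following added cell summarizes the intervals (in days) between consecutive acquisitions in `tindex`:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Optional: inspect the time intervals (in days) between consecutive acquisitions\n", "intervals = tindex.to_series().diff().dt.days.dropna()\n", "print(intervals.describe())" ] },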
{ "cell_type": "markdown", "metadata": {}, "source": [ "\n", "---\n", "\n", "## **4. Create a Time Series Animation to get an Idea of the Dynamics at the Site**\n", "\n", "### **4.1 Load Time Series Stack**\n", "\n", "Now we are ready to create a time series animation from the calibrated SAR data.\n", "\n", "**First, create a raster from `band 0` and a raster stack from all the images:**" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "img = gdal.Open(str(image_file))\n", "band = img.GetRasterBand(1)\n", "raster0 = band.ReadAsArray()\n", "band_number = 0 # Needed for updates\n", "rasterstack = img.ReadAsArray()" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "**Print the bands, pixels, and lines:**" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(f\"Number of bands: {img.RasterCount}\")\n", "print(f\"Number of pixels: {img.RasterXSize}\")\n", "print(f\"Number of lines: {img.RasterYSize}\")" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### **4.2 Calibration and Data Conversion between dB and Power Scales**\n", "\n", "**Note that if your data were generated by HyP3, this step is not necessary! HyP3 performs the full data calibration and provides you with calibrated data in power scale.**\n", " \n", "If your data are from a different source, however, calibration may be necessary to ensure that image gray values correspond to proper radar cross section information. \n", "\n", "Calibration coefficients for SAR data are often defined in the decibel (dB) scale due to the high dynamic range of the imaging system. For L-band ALOS PALSAR data, for example, the conversion from uncalibrated DN values to calibrated radar cross section values in dB scale is performed by applying a standard **calibration factor of -83 dB**. \n", "\n", "$\gamma^0_{dB} = 20 \cdot \log_{10}(DN) - 83$\n", "\n", "The data at hand are radiometrically terrain corrected images, which are often expressed as terrain flattened $\gamma^0$ backscattering coefficients. For forest and land cover monitoring applications $\gamma^o$ is the preferred metric.\n", "\n", "**To apply the calibration constant for your data and export in *dB* scale, uncomment the following code cell**: " ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ " #caldB=20*np.log10(rasterstack)-83" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "\n", "While **dB**-scaled images are often \"visually pleasing\", they are not a good basis for mathematical operations on data. For instance, when we compute the mean of observations, it makes a difference whether we do that in power or dB scale. Since dB scale is a logarithmic scale, we cannot simply average data in that scale. \n", " \n", "Please note that the **correct scale** in which operations need to be performed **is the power scale.** This is critical, e.g. when speckle filters are applied, spatial operations like block averaging are performed, or time series are analyzed.\n", "\n", "To **convert from dB to power**, apply: $\gamma^o_{pwr} = 10^{\frac{\gamma^o_{dB}}{10}}$ " ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "#calPwr=np.power(10.,caldB/10.)" ] },
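{ "cell_type": "markdown", "metadata": {}, "source": [ "*Optional:* To illustrate why the averaging scale matters, the following added example compares the mean of two synthetic backscatter values computed in the power domain with a mean computed directly on their dB values. The numbers are made up purely for illustration:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Illustrative example (synthetic values): averaging in power vs. averaging in dB\n", "pwr = np.array([0.0001, 0.1])              # two backscatter values in power scale (-40 dB and -10 dB)\n", "mean_pwr_db = 10 * np.log10(np.mean(pwr))  # correct: average in power scale, then convert to dB\n", "mean_db = np.mean(10 * np.log10(pwr))      # incorrect: average the dB values directly\n", "print(f\"Mean computed in power scale: {mean_pwr_db:.2f} dB\")\n", "print(f\"Mean computed in dB scale:    {mean_db:.2f} dB\")" ] },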
{ "cell_type": "markdown", "metadata": {}, "source": [ "### **4.3 Create Time Series Animation**\n", "\n", "**Create a directory in which to store our plots and animations:**" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "product_path = path/'plots_and_animations'\n", "\n", "if not product_path.exists():\n", "    product_path.mkdir()" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%capture \n", "fig = plt.figure(figsize=(14, 8))\n", "ax = fig.subplots()\n", "ax.axis('off')\n", "vmin = np.percentile(rasterstack.flatten(), 5)\n", "vmax = np.percentile(rasterstack.flatten(), 95)\n", "\n", "im = ax.imshow(raster0, cmap='gray', vmin=vmin, vmax=vmax)\n", "ax.set_title(\"{}\".format(tindex[0].date()))\n", "\n", "def animate(i):\n", "    ax.set_title(\"{}\".format(tindex[i].date()))\n", "    im.set_data(rasterstack[i])\n", "\n", "# Interval is given in milliseconds\n", "ani = animation.FuncAnimation(fig, animate, frames=rasterstack.shape[0], interval=400)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "**Configure matplotlib's RC settings for the animation:**" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "rc('animation', embed_limit=40971520.0) # Increase the embed limit so the entire animation can be displayed inline" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "**Create a JavaScript animation of the time-series running inline in the notebook:**" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "HTML(ani.to_jshtml())" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "**Delete the dummy png** that was saved while generating the JavaScript animation in the last code cell." ] },
{ "cell_type": "code", "execution_count": null, "metadata": { "hide_input": false }, "outputs": [], "source": [ "try:\n", "    tmp = product_path/'None0000000.png'\n", "    tmp.unlink()\n", "except FileNotFoundError:\n", "    pass" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "**Save the animation (`animation.gif`):**" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ani.save(product_path/'animation-AgSite.gif', writer='pillow', fps=2)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "\n", "---\n", "\n", "## **5. Computation and Visualization of Time Series Metrics**\n", "\n", "Once a time series has been constructed, we can compute **a set of metrics** that will aid us later in applications such as *change detection and active agriculture detection*. 
In the next code cells, we will compute the following variables for each pixel in the stack:\n", "\n", "- Mean \n", "- Median\n", "- Maximum\n", "- Minimum\n", "- Range (Maximum - Minimum)\n", "- 5th Percentile\n", "- 95th Percentile\n", "- PRange (95th - 5th Percentile)\n", "- Variance\n", "- Coefficient of Variation (Variance/Mean)\n", "\n", "---\n", "\n", "First, we **mask out pixels** without relevant information so that they do not bias the statistics we calculate later. Then we **calculate the time series metrics**:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "mask = rasterstack == 0\n", "rasterPwr = np.ma.array(rasterstack, mask=mask, dtype=np.float64)" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%capture\n", "metrics = timeseries_metrics(rasterPwr.filled(np.nan), ndv=np.nan)" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "metrics.keys()" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Let's look at the histograms for the time series variance and coefficient of variation to aid in displaying those images:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots(1, 2, figsize=(16, 4))\n", "ax[0].hist(metrics['var'].flatten(), bins=100, range=(0, 0.005))\n", "ax[1].hist(metrics['cov'].flatten(), bins=100, range=(0, 0.04))\n", "_ = ax[0].set_title('Variance')\n", "_ = ax[1].set_title('Coefficient of Variation')" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "We use thresholds determined from those histograms to set the scaling in the time series visualization. For the backscatter metrics we choose a typical range appropriate for this ecosystem, the C-band radar sensor, and the VV polarization. A typical range is -30 dB (0.001) to -10 dB (0.1)." ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# List the metrics keys you want to plot\n", "metric_keys = ['mean', 'median', 'max',\n", "               'min', 'p95', 'p5', 'range',\n", "               'prange', 'var', 'cov']\n", "fig = plt.figure(figsize=(16, 40))\n", "idx = 1\n", "for i in metric_keys:\n", "    ax = fig.add_subplot(5, 2, idx)\n", "    if i == 'var':\n", "        vmin, vmax = (0.0, 0.003)\n", "    elif i == 'cov':\n", "        vmin, vmax = (0.0, 0.03)\n", "    else:\n", "        vmin, vmax = (0.0, 0.3)\n", "    ax.imshow(metrics[i], vmin=vmin, vmax=vmax, cmap='gray')\n", "    ax.set_title(i.upper())\n", "    ax.axis('off')\n", "    idx += 1" ] },
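{ "cell_type": "markdown", "metadata": {}, "source": [ "*Optional:* To build some intuition for the Coefficient of Variation metric computed above (defined here as variance divided by mean), the following added example compares a strongly fluctuating, crop-like time series with a temporally stable one. The numbers are synthetic and purely illustrative:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Synthetic illustration: CoV (variance/mean, as defined above) for a stable vs. a seasonally varying series\n", "rng = np.random.default_rng(42)\n", "stable = rng.normal(0.05, 0.005, 46)  # temporally stable backscatter (e.g., undisturbed surface)\n", "seasonal = 0.05 + 0.04 * np.sin(np.linspace(0, 4 * np.pi, 46)) + rng.normal(0, 0.005, 46)  # crop-like cycle\n", "for label, series in [('stable', stable), ('seasonal', seasonal)]:\n", "    cov = np.var(series) / np.mean(series)\n", "    print(f\"{label:8s} CoV = {cov:.4f}\")" ] },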
{ "cell_type": "markdown", "metadata": {}, "source": [ "\n", "---\n", "\n", "## **6. Use Coefficient of Variation for Mapping Active Agriculture**\n", "\n", "Areas of active agriculture show strong changes of vegetation throughout a growth season. From planting through maturation and harvest, the strong variation of vegetation structure causes the radar brightness to vary significantly throughout a season, resulting in large values for *Range*, *Variance*, and *Coefficient of Variation* metrics. Steps for agriculture mapping using Coefficient of Variation include:\n", "\n", "- RTC process your data (done outside of notebook)\n", "- Stack up your data (beginning of notebook)\n", "- Calculate the Coefficient of Variation (CoV) metrics\n", "- Threshold the Coefficient of Variation to create candidate areas for active agriculture\n", "- Export map as a GeoTIFF for analysis in a GIS.\n", "\n", "With the CoV metric already calculated above, we can now **move toward thresholding the metric**:" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "We can set a threshold $t$ for the **coefficient of variation image**\n", "to classify change in the time series:\n", " \n", "${cp}_{x,y} = \frac{\sigma_{x,y}^2}{\overline{X}_{x,y}} > t$ \n", "\n", "Let's look at the histogram and the Cumulative Distribution Function (CDF) of the coefficient of variation:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.rcParams.update({'font.size': 14})\n", "fig = plt.figure(figsize=(14, 4)) # Initialize figure with a size\n", "ax1 = fig.add_subplot(121) # 121 determines: 1 row, 2 plots, first plot\n", "ax2 = fig.add_subplot(122)\n", "\n", "h = ax1.hist(metrics['cov'].flatten(), bins=200, range=(0, 0.03))\n", "ax1.xaxis.set_label_text('Coefficient of Variation Metric')\n", "ax1.set_title('Histogram')\n", "\n", "n, bins, patches = ax2.hist(metrics['cov'].flatten(), bins=200,\n", "                            range=(0, 0.03), cumulative=True,\n", "                            density=True, histtype='step',\n", "                            label='Empirical')\n", "ax2.xaxis.set_label_text('Coefficient of Variation Metric')\n", "ax2.set_title('CDF')\n", "plt.savefig(product_path/'thresh_cov_histogram.png', dpi=200, transparent=True)\n", "\n", "outind = np.where(n > 0.9)\n", "threshind = np.min(outind)\n", "threshcov = bins[threshind]\n", "ax1.axvline(threshcov, color='red')\n", "ax2.axvline(threshcov, color='red')" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "With the threshold $t$ set at the 90th percentile of the CoV CDF (i.e., keeping the 10% of pixels with the highest coefficient of variation), the change pixels would look like the following image:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.figure(figsize=(8, 8))\n", "mask = metrics['cov'] < threshcov # Pixels below the threshold (not flagged as active agriculture)\n", "maskcov = ~mask # Active agriculture candidates; stored for later output\n", "plt.imshow(mask, cmap='gray')\n", "_ = plt.title('Threshold Classifier on Coefficient of Variation > %1.3f' % threshcov)\n", "plt.savefig(product_path/'changes_cov_threshold.png', dpi=200, transparent=True)" ] },
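{ "cell_type": "markdown", "metadata": {}, "source": [ "*Optional:* As an added cross-check, the CDF-derived threshold should be roughly comparable to the 90th percentile of the Coefficient of Variation image (the two can differ slightly because the CDF above was computed only over the plotted range of 0 to 0.03). We can also report how much of the valid image area was flagged as active agriculture:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Optional cross-check of the threshold and summary of the mapped area\n", "perc90 = np.nanpercentile(metrics['cov'], 90)\n", "print(f\"Threshold derived from the CDF:   {threshcov:.4f}\")\n", "print(f\"90th percentile of the CoV image: {perc90:.4f}\")\n", "\n", "# Fraction of valid (non-NaN) pixels flagged as active agriculture\n", "valid = ~np.isnan(metrics['cov'])\n", "ag_fraction = maskcov[valid].sum() / valid.sum()\n", "print(f\"Fraction of valid pixels flagged as active agriculture: {ag_fraction:.1%}\")" ] },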
{ "cell_type": "markdown", "metadata": {}, "source": [ "\n", "---\n", "\n", "## **7. Write Our Change Detection Results and Metrics Images to GeoTIFF files**\n", "\n", "### **7.1 Determine Output Geometry**\n", "\n", "First, we need to **set the correct geotransformation and projection information**. We retrieve the values from the input images:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "proj = img.GetProjection()\n", "geotrans = list(img.GetGeoTransform())\n", "geotrans" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### **7.2 Output Time Series Metrics Images**\n", "\n", "We use the root of the time series data stack name, appending `_tsmetrics` for the output directory and the metric name for each output GeoTIFF filename:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dirname = path/(image_file.stem + '_tsmetrics')\n", "\n", "if not dirname.exists():\n", "    dirname.mkdir()" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Now we can **output the individual metrics as GeoTIFF images:**" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "names = [] # List to keep track of all the names\n", "for i in metrics:\n", "    # Arguments: name, array, data type, no-data value, band names, geotransform, projection\n", "    name = dirname/(f'{image_file.stem}{i}.tif')\n", "    create_geotiff(str(name), metrics[i], gdal.GDT_Float32, np.nan,\n", "                   [i], geo_t=geotrans, projection=proj)\n", "    names.append(name)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "### **7.3 Output Detected Agriculture Areas Map**\n", "\n", "Now we can **output the map of active agriculture that we have created using the Coefficient of Variation metric. We will output to a GeoTIFF image**:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "imagenamecov = path/(image_file.stem + 'Ag_from_cov_threshold.tif')\n", "\n", "# Write the boolean mask as 8-bit integers, using 255 as the no-data value and a descriptive band name\n", "create_geotiff(str(imagenamecov), maskcov.astype(np.uint8), gdal.GDT_Byte,\n", "               255, ['Active agriculture (CoV threshold)'], geo_t=geotrans, projection=proj)" ] },
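{ "cell_type": "markdown", "metadata": {}, "source": [ "*Optional:* As a final added check, we can inspect the size, projection, and extent of the exported map with `gdal.Info()`:" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Optional: verify the exported GeoTIFF's size, projection, and extent\n", "print(gdal.Info(str(imagenamecov)))" ] },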
{ "cell_type": "markdown", "metadata": {}, "source": [ "\n", "---\n", "\n", "## **8. Conclusion**\n", "\n", "The Coefficient of Variation is a good indicator for agricultural activity, however, by itself it is not fail safe. Further analysis of radar brightness can be performed. Also, other remote sensing data can be added to remove false alarms. \n", "\n", "For a bit more information on change detection and SAR in general, please look at the recently published [SAR Handbook: Comprehensive Methodologies for Forest Monitoring and Biomass Estimation](https://gis1.servirglobal.net/TrainingMaterials/SAR/SARHB_FullRes.pdf)." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "\n", "---\n", "\n", "### **EXERCISES**\n", "\n", "Explore agriculture mapping from the 46-image deep data stack a bit more:\n", "\n", "- Change the threshold used for identifying active agriculture areas.\n", "- Load created agriculture extent maps into QGIS and compare the detected areas with other image data (e.g., ESRI satellite layers). What do you think about the quality of the created information? What are the main error sources that you can see? What would be possible methods to mitigate these errors?
\n", "\n", "---\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "*Exercise8-AgricultureMappingwithSARUsingCoV.ipynb - Version 1.4.1 - November 2021*\n", "\n", "**Version Changes:**\n", "\n", "- Converted `os` and obsolete `asfn` modules with `pathlib` counterparts\n", "- Ran entire notebook with new `asf_notebook.py`\n", "- `html` converted to `Markdown`\n", "- Replaced JavaScript cell with `url_widget`\n", "- Replaced `asfn_notebook` with `opensarlab_lib`." ] } ], "metadata": { "kernelspec": { "display_name": "rtc_analysis [conda env:.local-rtc_analysis]", "language": "python", "name": "conda-env-.local-rtc_analysis-py" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.16" } }, "nbformat": 4, "nbformat_minor": 4 }