{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "hideCode": false,
    "hidePrompt": false,
    "tags": []
   },
   "source": [
    "![OpenSARlab notebook banner](NotebookAddons/blackboard-banner.png)\n",
    "\n",
    "# Subset Data Stack\n",
    "\n",
    "### Alex Lewandowski; University of Alaska Fairbanks\n",
    "\n",
    "<img style=\"padding: 7px\" src=\"NotebookAddons/UAFLogo_A_647.png\" width=\"170\" align=\"right\"/></font>\n",
    "\n",
    "This notebook crops a directory of tiffs to a subset area of interest using an interactive Matplotlib plot of an image in your data stack.\n",
    "This notebook assumes that users have generated the original image from `Prepare_Data_Stack` notebook. Users now have an option to subset the original image in following methods:\n",
    "\n",
    "1. Drag and drop to cut out sqare shape\n",
    "1. Given a correct Well-Known Text (WKT), users can define and cut a specific polygon shape.\n",
    "1. Upload shapely file along with its relavent files to cut a specific polygon.\n",
    "\n",
    "<!-- <img style=\"padding: 7px\" src=\"NotebookAddons/UAFLogo_A_647.png\" width=\"170\" align=\"right\"/></font> -->"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "source": [
    "\n",
    "---\n",
    "\n",
    "<font face=\"Calibri\" size=\"5\" color=\"darkred\"> <b>Important Note about JupyterHub</b> </font>\n",
    "\n",
    "**Your JupyterHub server will automatically shutdown when left idle for more than 1 hour. Your notebooks will not be lost but you will have to restart their kernels and re-run them from the beginning. You will not be able to seamlessly continue running a partially run notebook.**\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false,
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "import url_widget as url_w\n",
    "notebookUrl = url_w.URLWidget()\n",
    "display(notebookUrl)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "outputs": [],
   "source": [
    "from IPython.display import Markdown\n",
    "from IPython.display import display\n",
    "\n",
    "notebookUrl = notebookUrl.value\n",
    "user = !echo $JUPYTERHUB_USER\n",
    "env = !echo $CONDA_PREFIX\n",
    "if env[0] == '':\n",
    "    env[0] = 'Python 3 (base)'\n",
    "if env[0] != '/home/jovyan/.local/envs/rtc_analysis':\n",
    "    display(Markdown(f'<text style=color:red><strong>WARNING:</strong></text>'))\n",
    "    display(Markdown(f'<text style=color:red>This notebook should be run using the \"rtc_analysis\" conda environment.</text>'))\n",
    "    display(Markdown(f'<text style=color:red>It is currently using the \"{env[0].split(\"/\")[-1]}\" environment.</text>'))\n",
    "    display(Markdown(f'<text style=color:red>Select the \"rtc_analysis\" from the \"Change Kernel\" submenu of the \"Kernel\" menu.</text>'))\n",
    "    display(Markdown(f'<text style=color:red>If the \"rtc_analysis\" environment is not present, use <a href=\"{notebookUrl.split(\"/user\")[0]}/user/{user[0]}/notebooks/conda_environments/Create_OSL_Conda_Environments.ipynb\"> Create_OSL_Conda_Environments.ipynb </a> to create it.</text>'))\n",
    "    display(Markdown(f'<text style=color:red>Note that you must restart your server after creating a new environment before it is usable by notebooks.</text>'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "source": [
    "\n",
    "--- \n",
    "    \n",
    "## 0. Importing Relevant Python Packages\n",
    "\n",
    "In this notebook we will use the following scientific library:\n",
    "\n",
    "- [GDAL](https://www.gdal.org/) is a software library for reading and writing raster and vector geospatial data formats. It includes a collection of programs tailored for geospatial data processing. Most modern GIS systems (such as ArcGIS or QGIS) use GDAL in the background.\n",
    "\n",
    "**Import the necesssary libraries and modules:**\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "outputs": [],
   "source": [
    "%%capture\n",
    "from pathlib import Path\n",
    "import json # for loads\n",
    "import shutil\n",
    "\n",
    "from osgeo import gdal\n",
    "import pyproj \n",
    "\n",
    "from IPython.display import Markdown\n",
    "from IPython.display import display\n",
    "%matplotlib widget\n",
    "\n",
    "from ipyfilechooser import FileChooser\n",
    "\n",
    "import matplotlib.pyplot as plt \n",
    "plt.rcParams.update({'font.size': 12})\n",
    "\n",
    "import opensarlab_lib as asfn\n",
    "asfn.jupytertheme_matplotlib_format()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "source": [
    "\n",
    "---\n",
    "**Write functions to gather and print individual tiff paths:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "outputs": [],
   "source": [
    "def get_tiff_paths(paths):\n",
    "    tiff_paths = list(paths.parent.rglob(paths.name))    \n",
    "    tiff_paths.sort()\n",
    "    return tiff_paths\n",
    "\n",
    "def print_tiff_paths(tiff_paths):\n",
    "    print(\"Tiff paths:\")\n",
    "    for p in tiff_paths:\n",
    "        print(f\"{p}\\n\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Select the directory holding your tiffs**\n",
    "- Click the `Select` button\n",
    "- Navigate to your data directory\n",
    "- Click the `Select` button\n",
    "- Confirm that the desired path appears in green text\n",
    "- Click the `Change` button to alter your selection\n",
    "- *Note: If you used `Prepare_Data_Stack_Hyp3` notebook to generate directory, tiffs should be in `RTC_GAMMA_tiffs` directory.*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fc = FileChooser('/home/jovyan/notebooks')\n",
    "display(fc)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "source": [
    "**Determine the path to the analysis directory containing the tiff directory:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "outputs": [],
   "source": [
    "tiff_dir = Path(fc.selected_path)\n",
    "analysis_dir = tiff_dir.parent\n",
    "print(f\"analysis_dir: {analysis_dir}\")\n",
    "\n",
    "paths = tiff_dir/\"*.tif*\"\n",
    "tiff_paths = get_tiff_paths(paths)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "source": [
    "**Determine the UTM zone for your images.** \n",
    "\n",
    "This assumes you have already reprojected multiple UTM projections to a single predominant projection using the `Prepare_Data_Stack_Hyp3` notebook."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "outputs": [],
   "source": [
    "# Note - you should only have files that ends with .tif in same format\n",
    "info = gdal.Info(str(tiff_paths[0]), format='json')\n",
    "info = info['coordinateSystem']['wkt']\n",
    "utm = info.split('ID')[-1].split(',')[1][0:-2]\n",
    "print(f\"UTM Zone: {utm}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false,
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "tiff_paths = get_tiff_paths(paths)\n",
    "print_tiff_paths(tiff_paths)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "source": [
    "**Create a string containing paths to one image for each area represented in the stack:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "outputs": [],
   "source": [
    "to_merge = {}\n",
    "for pth in tiff_paths:\n",
    "    \n",
    "    if 'subset' in str(pth):\n",
    "        continue\n",
    "    \n",
    "    info = (gdal.Info(str(pth), options = ['-json']))\n",
    "    info = json.dumps(info)\n",
    "    info = (json.loads(info))['wgs84Extent']['coordinates']\n",
    "    \n",
    "    coords = [info[0][0], info[0][3]]\n",
    "    for i in range(0, 2):\n",
    "        for j in range(0, 2):\n",
    "            coords[i][j] = round(coords[i][j])\n",
    "    str_coords = f\"{str(coords[0])}{str(coords[1])}\"\n",
    "    if str_coords not in to_merge:\n",
    "        to_merge.update({str_coords: pth})\n",
    "merge_paths = \"\"\n",
    "for pth in to_merge:\n",
    "    merge_paths = f\"{merge_paths} {to_merge[pth]}\"\n",
    "    \n",
    "print(merge_paths)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "source": [
    "**Merge the images for display in the Area-Of-Interest selector:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "outputs": [],
   "source": [
    "full_scene = analysis_dir/'full_scene.tif'\n",
    "\n",
    "if full_scene.exists():\n",
    "    full_scene.unlink()\n",
    "    \n",
    "gdal_command = f\"gdal_merge.py -o {full_scene} {merge_paths}\"\n",
    "!{gdal_command}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "source": [
    "---\n",
    "## Subset The Tiffs\n",
    "\n",
    "**Create a Virtual Raster Stack:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "outputs": [],
   "source": [
    "image_file = f\"{analysis_dir}/raster_stack.vrt\"\n",
    "!gdalbuildvrt -separate $image_file -overwrite $full_scene"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "source": [
    "**Convert the VRT into an array:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "outputs": [],
   "source": [
    "img = gdal.Open(image_file)\n",
    "rasterstack = img.ReadAsArray()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "source": [
    "**Print the number of bands, pixels, and lines:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "outputs": [],
   "source": [
    "print(img.RasterCount) # Number of Bands\n",
    "print(img.RasterXSize) # Number of Pixels\n",
    "print(img.RasterYSize) # Number of Lines"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Choose a best method to cutout image**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"Please choose one of three options:\")\n",
    "\n",
    "option_key = [\n",
    "    None,\n",
    "    'Option 1: Draw rectangle with drag and drop.',\n",
    "    'Option 2: Write/paste Well-Known Text (WKT).',\n",
    "    'Option 3: Upload shapefile.'\n",
    "]\n",
    "\n",
    "option = asfn.select_parameter([option_key[1], option_key[2] , option_key[3],], '')\n",
    "display(option)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "choice = option_key.index(option.value)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "tags": []
   },
   "source": [
    "### **Option 1: Create an AOI selector from an image in your raster stack:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "outputs": [],
   "source": [
    "## this generates interactive plots (AOI subset) (option 1)\n",
    "if choice == 1:\n",
    "    fig_xsize = 7\n",
    "    fig_ysize = 7\n",
    "    aoi = asfn.AOI_Selector(rasterstack, fig_xsize, fig_ysize)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### **Option 2 & 3: Preset**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Choose a `.tif` file to generate your shapefile**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if choice != 1:    \n",
    "    display(Markdown(f'<text style=color:red>NOTE: As of now, your WKT/shapefile will work if it meets following conditions: </text>'))\n",
    "    display(Markdown(f'<ol text style=color:red><li>Simple polygons </li> <li>Has same projection as the original image.</li></ol>'))\n",
    "    display(Markdown(f'<text style=color:red>Using complex shapes (e.g. MULTIPOLYGON, POLYGON with holes, etc.) may cause an unexpected results.</text>'))\n",
    "\n",
    "    try:\n",
    "        infile = tiff_paths[0]\n",
    "    except:\n",
    "        raise OSError('Directory that contains your .tif files are empty.')\n",
    "        \n",
    "    print(infile)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Check if you have a correct infile**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if choice != 1:\n",
    "    try:\n",
    "        suffix = infile.suffix\n",
    "    except:\n",
    "        raise TypeError(f'{infile} is not a valid path.')\n",
    "    \n",
    "    if suffix != '.tif':\n",
    "        raise ValueError(f'File you chose is not a \".tif\" file. Pick a valid \".tif\" file.')\n",
    "      \n",
    "    # path to your subset image file\n",
    "    outfile = str(infile.parent/f'subset_{infile.stem}.tif')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Useful functions used in option 2 & 3:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if choice != 1:\n",
    "    from osgeo import osr\n",
    "    \n",
    "    def oldSubsetExist(directory_path) -> bool:\n",
    "        \"\"\"\n",
    "        Given path to a directory containing old 'subset_...tif' file,\n",
    "        it will determine if that subset exists in that directory or not.\n",
    "        \"\"\"\n",
    "        for path in directory_path.iterdir():\n",
    "            if 'subset_' in str(path) and path.suffix == '.tif':\n",
    "                return True\n",
    "            \n",
    "        return False\n",
    "    \n",
    "    def removeOldSubset(directory_path) -> None:\n",
    "        \"\"\"\n",
    "        Given path to a directory containing old 'subset_...tif' file,\n",
    "        it will remove all instances of 'subset_...tif' file.\n",
    "        \"\"\"\n",
    "        for path in directory_path.iterdir():\n",
    "            if 'subset_' in str(path) and path.suffix == '.tif':\n",
    "                print(f'Removed {path}...')\n",
    "                path.unlink()\n",
    "\n",
    "    \n",
    "    # Gets EPSG from your infile\n",
    "    def getEPSG(tif_path) -> int:\n",
    "        try:\n",
    "            Path(tif_path).exists()\n",
    "        except:\n",
    "            print('That .tif file does not exist. Please enter a valid path.')\n",
    "            \n",
    "        ds = gdal.OpenEx(tif_path)\n",
    "        prj = ds.GetProjection()\n",
    "        \n",
    "        return int(osr.SpatialReference(prj).GetAttrValue('AUTHORITY', 1))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### **Option 2: Create subset image and `.shp`**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Clean up previous shapely files (`.shp`, `.prj`, `.dbf`, `.shx`)**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if choice == 2:\n",
    "    print(\"Would you like to remove all instances of previous/unused shapely files?\")\n",
    "    shp_option = asfn.select_parameter(['Yes', 'No',], '')\n",
    "    display(shp_option)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if choice == 2 and shp_option.value == 'Yes':\n",
    "    keywords = ['.shp','.dbf','.prj','.shx']\n",
    "    \n",
    "    for path in infile.parent.iterdir():\n",
    "        for k in keywords:\n",
    "            if path.suffix == k:\n",
    "                print(f'Removed {path}')\n",
    "                path.unlink()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Choose a name for your `.shp` file:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if choice == 2:\n",
    "    shp_name = input('Choose a name for your shapefly file: ')\n",
    "    \n",
    "    # default name if user does not input anything\n",
    "    if not shp_name:\n",
    "        shp_name = 'shape'\n",
    "    \n",
    "    shp = str(infile.parent/f'{shp_name}.shp')\n",
    "    \n",
    "    print(shp)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Generate `.shp`, `.proj`, and other relavent files.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "outputs": [],
   "source": [
    "# Let user input WKT (option 2)\n",
    "if choice == 2:\n",
    "    from osgeo import ogr\n",
    "    import shapely.wkt as sWkt\n",
    "    import geopandas\n",
    "    \n",
    "    print(\"When inputting your WKT, here are few things to note:\\n\"\\\n",
    "         \"\\t1) If you don't already have WKT, you can use GIS software (e.g. ArcGIS) to obtain WKT for your image.\\n\"\\\n",
    "         \"\\t2) It will only accept single polygon (e.g. POLYGON((x y, x y, ...)). \\n\"\\\n",
    "         \"\\t3) Your WKT has to be based off of your original image that you wish to subset.\")\n",
    "    \n",
    "\n",
    "    correctWktInput = False\n",
    "    while not correctWktInput:\n",
    "        wkt = input(\"Please enter your WKT: \")\n",
    "        \n",
    "        try:\n",
    "            geoShape = sWkt.loads(wkt)\n",
    "            series = geopandas.GeoSeries([geoShape])\n",
    "            isNotConnected = series.is_valid[0]\n",
    "                        \n",
    "            if not isNotConnected:\n",
    "                raise('Obsecure shape detected.')\n",
    "                continue\n",
    "                \n",
    "        except:\n",
    "            print('Error due to an unclosed shape, obsecure shape, or bad input. Try again...')\n",
    "            continue\n",
    "            \n",
    "        correctWktInput = True\n",
    "                \n",
    "    outfile = str(infile.parent/f'subset_{infile.stem}.tif')\n",
    "    epsg = getEPSG(str(infile))\n",
    "#   For reference, latlong epsg = 4326\n",
    "\n",
    "    driver = ogr.GetDriverByName('Esri Shapefile')\n",
    "    ds = driver.CreateDataSource(shp)\n",
    "    srs = osr.SpatialReference()\n",
    "    srs.ImportFromEPSG(epsg)\n",
    "    layer = ds.CreateLayer('', srs, ogr.wkbPolygon)\n",
    "    defn = layer.GetLayerDefn()\n",
    "\n",
    "    # Create a new feature (attribute and geometry)\n",
    "    feat = ogr.Feature(defn)\n",
    "    \n",
    "    # Make a geometry from Shapely object\n",
    "    geom = ogr.CreateGeometryFromWkt(wkt) \n",
    "    feat.SetGeometry(geom)\n",
    "    \n",
    "    layer.CreateFeature(feat)\n",
    "    feat = geom = None  # destroy these\n",
    "\n",
    "    # Save and close everything\n",
    "    ds = layer = feat = geom = None"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### **Option 3: Upload `.shp`**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "source": [
    "If you chose option 3, upload your shapely file (`.shp`) as well as any files that are related to it (`.proj`, `.shx` and `.dbf` files). Once you uploaded them, select the .shp file using file selector. If you selected existing `.shp` it should be highlighted in orange.\n",
    "\n",
    "**Note**: `.shp`, `.shx`, `.dbf`, and `.proj` files must be in the same directory."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "outputs": [],
   "source": [
    "if choice == 3:\n",
    "    display(Markdown(f'<text style=color:red>WARNING: The UPLOADED .shp FILE MUST HAVE A SAME EPSG/PROJECTION AS THE ORIGINAL IMAGE.</text>'))\n",
    "    display(Markdown(f'<text style=color:red>IF YOUR .shp FILE DOES NOT HAVE PROJECTION OR HAVE DIFFERENT PROJECTION FROM ORIGINAL IMAGE,</text>'))\n",
    "    display(Markdown(f'<text style=color:red>IT MAY CAUSE AN UNEXPECTED RESULTS.</text>'))\n",
    "    \n",
    "    shpfc = FileChooser('/home/jovyan/notebooks/SAR_Training/English/Master')\n",
    "    display(shpfc)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "outputs": [],
   "source": [
    "if choice == 3:\n",
    "    try:\n",
    "        shp = Path(shpfc.selected)\n",
    "    except:\n",
    "        raise TypeError('Please choose the file path before proceeding.')\n",
    "        \n",
    "    print(shp)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "outputs": [],
   "source": [
    "# Extract wkt from shapefile:\n",
    "\n",
    "if choice == 3:\n",
    "    if shp.suffix == '.shp':\n",
    "        gInfo = gdal.OpenEx(str(shp))\n",
    "        layer = gInfo.GetLayer()\n",
    "        feature = layer.GetFeature(0)\n",
    "        wkt = feature.GetGeometryRef().ExportToWkt()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### **Option 2 & 3: Create subset image**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**If you have subset image from previous run, it will ask you to either keep the old ones or remove and generate new.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if choice != 1:\n",
    "    # Check if 'subset' files from previous run exists \n",
    "    if oldSubsetExist(infile.parent):\n",
    "        print('Items from previous run exists. Would you like to keep them or remove them to generate new items?')\n",
    "        clean_option = asfn.select_parameter(['Clean and Generate New Items', 'Keep Old Items', ], '')\n",
    "        display(clean_option)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if choice != 1:\n",
    "        \n",
    "    print(shp)\n",
    "    generate_command = f'gdalwarp -cutline {shp} -crop_to_cutline {infile} {outfile}'\n",
    "    \n",
    "    if oldSubsetExist(infile.parent):\n",
    "        if clean_option.value == 'Clean and Generate New Items':\n",
    "            print('Cleaning previously generated subset file(s)...')\n",
    "            removeOldSubset(infile.parent)\n",
    "            print('\\n')\n",
    "        else:\n",
    "            generate_command = f'echo Kept previous subset.'\n",
    "        \n",
    "    !{generate_command}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "source": [
    "**Gather and define projection details:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "outputs": [],
   "source": [
    "geotrans = img.GetGeoTransform()\n",
    "projlatlon = pyproj.Proj('EPSG:4326') # WGS84\n",
    "projimg = pyproj.Proj(f'EPSG:{utm}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "source": [
    "**Write a function to convert the pixel, line coordinates from the AOI selector into geographic coordinates in the stack's EPSG projection:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "outputs": [],
   "source": [
    "# xy -> geocoords\n",
    "def xy_to_geocoord(x, y, geotrans,latlon=True):    \n",
    "    ref_x = geotrans[0] + (x * geotrans[1])\n",
    "    ref_y = geotrans[3] + (y * geotrans[5])\n",
    "    if latlon:\n",
    "        ref_y, ref_x = pyproj.transform(projimg, projlatlon, ref_x, ref_y)\n",
    "    return [ref_x, ref_y]\n",
    "\n",
    "# geocoords -> xy\n",
    "\"\"\"\n",
    "ref_x: x geocoordinate from WKT\n",
    "ref_y: y geocoordinate from WKT\n",
    "\"\"\"\n",
    "def geocoord_to_xy(ref_x, ref_y, geotrans):\n",
    "    x = (ref_x - geotrans[0]) / geotrans[1]\n",
    "    y = (ref_y - geotrans[3]) / geotrans[5]\n",
    "    \n",
    "    return [x,y]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Option 2 & 3 - Write a function to display the cut out image:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Display cropped image & place cropped image on top on original image.\n",
    "\n",
    "import numpy as np\n",
    "import matplotlib.patches as patches\n",
    "import re\n",
    "\n",
    "# Display image (different option depending on options)\n",
    "def displayImage(rasterstack,\n",
    "                 fig_xsize=None, fig_ysize=None,\n",
    "                 wkt=None,\n",
    "                 cmap=plt.cm.gist_gray,\n",
    "                 vmin=None, vmax=None,):\n",
    "\n",
    "    if not vmin:\n",
    "        vmin = np.nanpercentile(rasterstack, 1)\n",
    " \n",
    "    if not vmax:\n",
    "        vmax=np.nanpercentile(rasterstack, 99)\n",
    "\n",
    "    if fig_xsize and fig_ysize:\n",
    "        fig, current_ax = plt.subplots(figsize=(fig_xsize, fig_ysize))\n",
    "    else:\n",
    "        fig, current_ax = plt.subplots()\n",
    "        \n",
    "    current_ax.imshow(rasterstack, cmap=plt.cm.gist_gray, vmin=vmin, vmax=vmax)\n",
    "    \n",
    "    # If it's original image, show where it's been cropped    \n",
    "    if wkt:\n",
    "        showCropped(wkt, current_ax)\n",
    "\n",
    "# Place cropped image on top of the image\n",
    "def showCropped(wkt, ax):\n",
    "\n",
    "    x,y = [],[]\n",
    "    wkt_coords = [float(i) for i in (re.findall(r\"[-+]?\\d*\\.\\d+|\\d+\", wkt))]\n",
    "\n",
    "    # Since \"patches.Polygon\" does not require last coordinate, remove last coordinate (x,y)  \n",
    "    wkt_coords.pop()\n",
    "    wkt_coords.pop()\n",
    "\n",
    "    isX = True\n",
    "    for c in wkt_coords:\n",
    "        x.append(c) if isX else y.append(c)\n",
    "        isX = not(isX)\n",
    "\n",
    "    if len(x) == len(y):\n",
    "        tmp = []\n",
    "\n",
    "        for i in range(0, len(x)):\n",
    "            tmp = geocoord_to_xy(x[i],y[i],geotrans)\n",
    "            x[i],y[i] = tmp[0], tmp[1]\n",
    "\n",
    "    poly = patches.Polygon(xy=list(zip(x,y)), linestyle='-')\n",
    "\n",
    "    # Add the patches to the Axes\n",
    "    ax.add_patch(poly)\n",
    "    plt.show()\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Option 2 & 3 - Displaying original image and cropped image. If you are satisfied with the cropped image, proceed.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "outputs": [],
   "source": [
    "if choice != 1:    \n",
    "    \n",
    "    # If file size is too big, it will not display image:\n",
    "    fileSize = Path(outfile).stat().st_size\n",
    "    fileSizeGB = fileSize/(1024 ** 3) # in GB\n",
    "    \n",
    "    if fileSizeGB > 10.00:\n",
    "        raise MemoryError('Subset file too big and will crash. Please regenerate your subset image using correct values.')  \n",
    "    \n",
    "    # First, show the entire map, and highlight the parts where it has been cropped.\n",
    "    displayImage(rasterstack,7.5,7.5,wkt)       \n",
    "    \n",
    "    # generalized - there should only be one 'subset' file in your directory\n",
    "    cropped_img = gdal.Open(str(outfile)).ReadAsArray()\n",
    "    \n",
    "    # Display cropped img\n",
    "    displayImage(cropped_img,7.5,7.5) "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "source": [
    "**Option 1: Call `xy_to_geocoord` to gather the `aoi_coords`:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "outputs": [],
   "source": [
    "try:\n",
    "    if choice == 1:\n",
    "        aoi_coords = [xy_to_geocoord(aoi.x1, aoi.y1, geotrans, latlon=False), xy_to_geocoord(aoi.x2, aoi.y2, geotrans, latlon=False)]\n",
    "        print(f\"aoi_coords in EPSG {utm}: {aoi_coords}\")\n",
    "    else:\n",
    "        # make warning more visable?\n",
    "        display(Markdown(f'<text style=color:red>If the image above: </text>'))\n",
    "        display(Markdown(f'<ol text style=color:red><li>Does not show blue shape on the first image.</li> <li>and/or it displays entirely black image on the second</li></ol>'))\n",
    "        display(Markdown(f'<text style=color:red>It indicates that there is something wrong with your input. Please use WKT or .shp file that has the same projection as original image and generate the subset image.</text>'))\n",
    "        display(Markdown(f'<text style=color:blue>If both images displayed properly as you expected, please proceed ahead.</text>'))\n",
    "except TypeError:\n",
    "    print('TypeError')\n",
    "    display(Markdown(f'<text style=color:red>This error occurs if an AOI was not selected.</text>'))\n",
    "    display(Markdown(f'<text style=color:red>Note that the square tool icon in the AOI selector menu is <b>NOT</b> the selection tool. It is the zoom tool.</text>'))\n",
    "    display(Markdown(f'<text style=color:red>Read the tips above the AOI selector carefully.</text>'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "source": [
    "**Collect the paths to the tiffs:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "outputs": [],
   "source": [
    "tiff_paths = get_tiff_paths(paths)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "source": [
    "**Create a subdirectory in which to store the subset tiffs:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "outputs": [],
   "source": [
    "print(\"Choose a directory name in which to store the subset geotiffs.\")\n",
    "print(\"Note: this will sit alongside the directory containing your pre-subset geotiffs.\")\n",
    "while True:\n",
    "    sub_name = input()\n",
    "    if sub_name == \"\":\n",
    "        print(\"Please enter a valid directory name\")\n",
    "        continue\n",
    "    else:\n",
    "        break"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "source": [
    "**Subset the tiffs and move them from the individual product directories into their own directory, /tiffs:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "outputs": [],
   "source": [
    "tiff_paths = get_tiff_paths(paths)\n",
    "for p in tiff_paths:\n",
    "    print(p)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Presets subset directory. For option 2 & 3, it also checks if the subset directory is empty or not.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "subset_dir = analysis_dir/f'{sub_name}'\n",
    "\n",
    "if not subset_dir.exists():\n",
    "    subset_dir.mkdir()\n",
    "    \n",
    "if choice != 1:\n",
    "    isEmptyDir = not(any(subset_dir.rglob('*.tiff')))    \n",
    "    \n",
    "    if not isEmptyDir:\n",
    "        print(\"Tiff files from previous run exists. Would you like to remove them and generate new tiff files?\")\n",
    "        reset_option = asfn.select_parameter(['Generate New Tiffs', 'Keep Previous Tiffs', ], '')\n",
    "        display(reset_option)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# sometimes, tiff doesn't follow '[0-9]{7}T[0-9]6' format, hence just get the numbers in those cases \n",
    "\n",
    "# option 2 & 3 \n",
    "if choice != 1:\n",
    "    if isEmptyDir and any(subset_dir.rglob('*.tiff')):\n",
    "        isEmptyDir = False\n",
    "               \n",
    "dup_date_polar = set()\n",
    "for i, tiff_path in enumerate(tiff_paths):\n",
    "    if 'subset' not in str(tiff_path):\n",
    "                \n",
    "        date = Path(asfn.date_from_product_name(str(tiff_path))).name.split('T')[0]\n",
    "        polar = asfn.get_polarity_from_path(str(tiff_path))\n",
    "        print(f\"Product #{i+1}:\")\n",
    "        print(f'Path: {tiff_path}\\n')\n",
    "        \n",
    "        date_polar = f'{date}_{polar}'\n",
    "        \n",
    "        if date_polar in dup_date_polar:\n",
    "            date_polar += f'_{i}'\n",
    "        else:\n",
    "            dup_date_polar.add(date_polar)\n",
    "        \n",
    "        outfile = subset_dir/f'{date_polar}.tiff'  \n",
    "            \n",
    "        if choice == 1:\n",
    "            gdal_command = f\"gdal_translate -projwin {aoi_coords[0][0]} {aoi_coords[0][1]} {aoi_coords[1][0]} {aoi_coords[1][1]} -projwin_srs 'EPSG:{utm}' -co \\\"COMPRESS=DEFLATE\\\" -eco -a_nodata 0 {tiff_path} {outfile}\"\n",
    "\n",
    "        else: # choice 2 & 3\n",
    "            gdal_command = f'gdalwarp -cutline {shp} -crop_to_cutline {tiff_path} {outfile}'\n",
    "\n",
    "            if not isEmptyDir:\n",
    "                if reset_option.value == 'Generate New Tiffs':\n",
    "                    print(f'Removed: {outfile}\\n')\n",
    "                    outfile.unlink()\n",
    "                else:\n",
    "                    gdal_command = f'echo Nothing was executed...\\n'\n",
    "\n",
    "        !{gdal_command} # runs command    \n",
    "        print(f\"Calling the command: {gdal_command}\\n\")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "source": [
    "**Delete any subset `tifs` that are filled with `NaNs` and contain no data.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "outputs": [],
   "source": [
    "subset_paths = subset_dir/f\"*.tif*\"\n",
    "tiff_paths = get_tiff_paths(subset_paths)\n",
    "asfn.remove_nan_filled_tifs(tiff_paths)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "source": [
    "**Sometimes, when using gdal translate to subset a stack of images, there will be slight differences in sizes of the resulting images, off by a single pixel in either direction. The following code checks the newly subset stack for this problem, and if found, it re-subsets all the images to the size of the smallest image in the stack.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "outputs": [],
   "source": [
    "# Align geotiffs to an integer resolution value\n",
    "# list of new subsets\n",
    "\n",
    "fnames = list(subset_dir.rglob('*.tiff'))\n",
    "fnames.sort()\n",
    "\n",
    "resolution = int(gdal.Info(str(fnames[0]), format='json')['geoTransform'][1])\n",
    "for fname in fnames:\n",
    "    gdal.Warp(str(fname), str(fname), \n",
    "              dstSRS=f'EPSG:{utm}', srcSRS=f'EPSG:{utm}', \n",
    "              xRes=resolution, yRes=resolution, targetAlignedPixels=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "source": [
    "**Decide whether or not to cleanup the original tiffs:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "outputs": [],
   "source": [
    "cleanup = asfn.select_parameter([\"Save original tiffs\", \"Delete original tiffs\"], '')\n",
    "cleanup"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "outputs": [],
   "source": [
    "if cleanup.value == 'Delete original tiffs':\n",
    "    shutil.rmtree(tiff_dir)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "source": [
    "**Print the path to your subset directory:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "hideCode": false,
    "hidePrompt": false
   },
   "outputs": [],
   "source": [
    "print(subset_dir)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Relavent notebooks:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Run this cell to display links\n",
    "from IPython.display import display, HTML\n",
    "\n",
    "current = Path.cwd()\n",
    "abs_path = [\n",
    "    Path('/home/jovyan/notebooks/SAR_Training/English/Master/Change_Detection_From_Prepared_Data_Stack.ipynb'),\n",
    "    Path('/home/jovyan/notebooks/SAR_Training/English/Master/Explore_SAR_Time_Series_From_Prepared_Data_Stack.ipynb'),\n",
    "    Path('/home/jovyan/notebooks/SAR_Training/English/Master/SARChangeDetectionMethods_From_Prepared_Data_Stack.ipynb'),\n",
    "    Path('/home/jovyan/notebooks/SAR_Training/English/Master/Time_Series_From_Prepared_Stack.ipynb')\n",
    "]\n",
    "\n",
    "details = [\n",
    "    'Introduce you to the analysis of deep multi-temporal SAR image data stacks.',\n",
    "    'Introduces you to a some popular change detection methods that can be applied on SAR time series data.',\n",
    "    'Introduces you to the time series signatures associated with flooding.',\n",
    "    'Applies Change Point Detection on a deep multi-temporal SAR image data stack acquired by Sentinel-1.'\n",
    "]\n",
    "\n",
    "\n",
    "for a in abs_path:\n",
    "    name = a.stem\n",
    "    relative_path = a.relative_to(current)\n",
    "    detail = details.pop()\n",
    "    link_t = f\"<li><a href='{relative_path}'> {name} </a>: {detail} </li>\"\n",
    "    html = HTML(link_t)\n",
    "    display(html)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*GEOS 657 Microwave Remote Sensing - Version 2.0.0 - January 2022*\n",
    "\n",
    "*Version Changes:*\n",
    "- *Users can now use `wkt` as well as `.shp` file to subset image*"
   ]
  }
 ],
 "metadata": {
  "hide_code_all_hidden": false,
  "kernelspec": {
   "display_name": "rtc_analysis",
   "language": "python",
   "name": "conda-env-.local-rtc_analysis-py"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}