{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Detecting floating objects using deep learning and Sentinel-2 imagery\n",
    "\n",
    "{bdg-primary}`Ocean`\n",
    "{bdg-secondary}`Modelling`\n",
    "{bdg-warning}`Standard`\n",
    "{bdg-info}`Python`\n",
    "\n",
    "<p align=\"left\">\n",
    "    <a href=\"https://github.com/eds-book-gallery/b34facfa-cea8-48f5-89f6-f11ce00812a9/blob/main/LICENSE\">\n",
    "        <img alt=\"license\" src=\"https://img.shields.io/badge/license-MIT-yellow.svg\">\n",
    "    </a>\n",
    "    <a href=\"https://github.com/eds-book-gallery/b34facfa-cea8-48f5-89f6-f11ce00812a9/actions/workflows/render.yaml\">\n",
    "        <img alt=\"render\" src=\"https://github.com/eds-book-gallery/b34facfa-cea8-48f5-89f6-f11ce00812a9/actions/workflows/render.yaml/badge.svg\">\n",
    "    </a>\n",
    "    <a href=\"https://github.com/alan-turing-institute/environmental-ds-book/pull/22\">\n",
    "        <img alt=\"review\" src=\"https://img.shields.io/badge/view-review-purple\">\n",
    "    </a>\n",
    "    <br/>\n",
    "</p>\n",
    "\n",
    "<p align=\"left\">\n",
    "    <a href=\"http://mybinder.org/v2/gh/eds-book-gallery/b34facfa-cea8-48f5-89f6-f11ce00812a9/main?labpath=notebook.ipynb\">\n",
    "        <img alt=\"binder\" src=\"https://mybinder.org/badge_logo.svg\">\n",
    "    </a>\n",
    "    <a href=\"https://replay.notebooks.egi.eu/v2/gh/eds-book-gallery/b34facfa-cea8-48f5-89f6-f11ce00812a9/main?labpath=notebook.ipynb\">\n",
    "        <img alt=\"binder\" src=\"https://img.shields.io/badge/launch-EGI%20binder-F5A252.svg?logo=data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAFkAAABZCAMAAABi1XidAAAB8lBMVEX///9XmsrmZYH1olJXmsr1olJXmsrmZYH1olJXmsr1olJXmsrmZYH1olL1olJXmsr1olJXmsrmZYH1olL1olJXmsrmZYH1olJXmsr1olL1olJXmsrmZYH1olL1olJXmsrmZYH1olL1olL0nFf1olJXmsrmZYH1olJXmsq8dZb1olJXmsrmZYH1olJXmspXmspXmsr1olL1olJXmsrmZYH1olJXmsr1olL1olJXmsrmZYH1olL1olLeaIVXmsrmZYH1olL1olL1olJXmsrmZYH1olLna31Xmsr1olJXmsr1olJXmsrmZYH1olLqoVr1olJXmsr1olJXmsrmZYH1olL1olKkfaPobXvviGabgadXmsqThKuofKHmZ4Dobnr1olJXmsr1olJXmspXmsr1olJXmsrfZ4TuhWn1olL1olJXmsqBi7X1olJXmspZmslbmMhbmsdemsVfl8ZgmsNim8Jpk8F0m7R4m7F5nLB6jbh7jbiDirOEibOGnKaMhq+PnaCVg6qWg6qegKaff6WhnpKofKGtnomxeZy3noG6dZi+n3vCcpPDcpPGn3bLb4/Mb47UbIrVa4rYoGjdaIbeaIXhoWHmZYHobXvpcHjqdHXreHLroVrsfG/uhGnuh2bwj2Hxk17yl1vzmljzm1j0nlX1olL3AJXWAAAAbXRSTlMAEBAQHx8gICAuLjAwMDw9PUBAQEpQUFBXV1hgYGBkcHBwcXl8gICAgoiIkJCQlJicnJ2goKCmqK+wsLC4usDAwMjP0NDQ1NbW3Nzg4ODi5+3v8PDw8/T09PX29vb39/f5+fr7+/z8/Pz9/v7+zczCxgAABC5JREFUeAHN1ul3k0UUBvCb1CTVpmpaitAGSLSpSuKCLWpbTKNJFGlcSMAFF63iUmRccNG6gLbuxkXU66JAUef/9LSpmXnyLr3T5AO/rzl5zj137p136BISy44fKJXuGN/d19PUfYeO67Znqtf2KH33Id1psXoFdW30sPZ1sMvs2D060AHqws4FHeJojLZqnw53cmfvg+XR8mC0OEjuxrXEkX5ydeVJLVIlV0e10PXk5k7dYeHu7Cj1j+49uKg7uLU61tGLw1lq27ugQYlclHC4bgv7VQ+TAyj5Zc/UjsPvs1sd5cWryWObtvWT2EPa4rtnWW3JkpjggEpbOsPr7F7EyNewtpBIslA7p43HCsnwooXTEc3UmPmCNn5lrqTJxy6nRmcavGZVt/3Da2pD5NHvsOHJCrdc1G2r3DITpU7yic7w/7Rxnjc0kt5GC4djiv2Sz3Fb2iEZg41/ddsFDoyuYrIkmFehz0HR2thPgQqMyQYb2OtB0WxsZ3BeG3+wpRb1vzl2UYBog8FfGhttFKjtAclnZYrRo9ryG9uG/FZQU4AEg8ZE9LjGMzTmqKXPLnlWVnIlQQTvxJf8ip7VgjZjyVPrjw1te5otM7RmP7xm+sK2Gv9I8Gi++BRbEkR9EBw8zRUcKxwp73xkaLiqQb+kGduJTNHG72zcW9LoJgqQxpP3/Tj//c3yB0tqzaml05/+orHLksVO+95kX7/7qgJvnjlrfr2Ggsyx0eoy9uPzN5SPd86aXggOsEKW2Prz7du3VID3/tzs/sSRs2w7ovVHKtjrX2pd7ZMlTxAYfBAL9jiDwfLkq55Tm7ifhMlTGPyCAs7RFRhn47JnlcB9RM5T97ASuZXIcVNuUDIndpDbdsfrqsOppeXl5Y+XVKdjFCTh+zGaVuj0d9zy05PPK3QzBamxdwtTCrzyg/2Rvf2EstUjordGwa/kx9mSJLr8mLLtCW8HHGJc2R5hS219IiF6PnTusOqcMl57gm0Z8kanKMAQg0qSyuZfn7zItsbGyO9QlnxY0eCuD1XL2ys/MsrQhltE7Ug0uFOzufJFE2PxBo/YAx8XPPdDwWN0MrDRYIZF0mSMKCNHgaIVFoBbNoLJ7tEQDKxGF0kcLQimojCZopv0OkNOyWCCg9XMVAi7ARJzQdM2QUh0gmBozjc3Skg6dSBRqDGYSUOu66Zg+I2fNZs/M3/f/Grl/XnyF1Gw3VKCez0PN5IUfFLqvgUN4C0qNqYs5YhPL+aVZYDE4IpUk57oSFnJm4FyCqqOE0jhY2SMyLFoo56zyo6becOS5UVDdj7Vih0zp+tcMhwRpBeLyqtIjlJKAIZSbI8SGSF3k0pA3mR5tHuwPFoa7N7reoq2bqCsAk1HqCu5uvI1n6JuRXI+S1Mco54YmYTwcn6Aeic+kssXi8XpXC4V3t7/ADuTNKaQJdScAAAAAElFTkSuQmCC\">\n",
    "    </a>\n",
    "    <br/>\n",
    "</p>\n",
    "\n",
    "<p align=\"left\">\n",
    "    <a href=\"https://w3id.org/ro-id/b34facfa-cea8-48f5-89f6-f11ce00812a9\">\n",
    "        <img alt=\"rohub\" src=\"https://img.shields.io/badge/RoHub-FAIR_Executable_Research_Object-2ea44f?logo=Open+Access&logoColor=blue\">\n",
    "    </a>\n",
    "    <a href=\"https://zenodo.org/badge/latestdoi/493600192\">\n",
    "        <img alt=\"doi\" src=\"https://zenodo.org/badge/493600192.svg\">\n",
    "    </a>\n",
    "</p>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Context\n",
    "\n",
    "### Purpose\n",
    "Detect floating targets that might contain plastic, algea, sargassum, wood, etc. in coastal areas using deep learning methods and Sentinel-2 data.\n",
    "\n",
    "### Modelling approach\n",
    "\n",
    "In this notebook we show the potential of deep learning methods to detect and highlight floating debris of various natures in coastal areas. First, the dataset can be downloaded from [Zenodo](https://zenodo.org/record/5827377#.YdgfjGjMK9I) using this notebook and it contains: a Sentinel-2 image over a coastal area of Rio de Janeiro in Brazil, a validation example from the [Plastic Litter Project (PLP)](http://plp.aegean.gr/category/experiment-log-2021/), and two use-case Sentinel-2 images of scenes containing various floating targets. We then apply pre-trained weights in order to identify the potential existence of floating targets. We visualise the results in an interactive way and we also show that the model works on validation data and on examples containing different types of floating objects. We finally export the predictions in Geotiff formats to the output folders. Our experiments were implemented using Pytorch. Further details on the labeled dataset and the code can be found here: https://github.com/ESA-PhiLab/floatingobjects.\n",
    "\n",
    "### Highlights\n",
    "\n",
    "* We demonstrate the use of deep neural networks for the detection of floating objects on Sentinel-2 data.\n",
    "* Once the user downloads the dataset from Zenodo using this notebook, the predictions can be run and the results will be visualised.\n",
    "* Several use cases are included in order to show the variety of the detected objects.\n",
    "* The user can visualise the RGB image, the NDVI and FDI indices along with the predictions and classifications.\n",
    "* The predictions will be available locally in the user's folder.\n",
    "\n",
    "### Contributions\n",
    "\n",
    "#### Notebook\n",
    "* Jamila Mifdal (author), European Space Agency Φ-lab, [@jmifdal](https://github.com/jmifdal)\n",
    "* Raquel Carmo (author), European Space Agency Φ-lab, [@raquelcarmo](https://github.com/raquelcarmo)\n",
    "* Alejandro Coca-Castro (reviewer), The Alan Turing Institute, [@acocac](https://github.com/acocac)\n",
    "\n",
    "#### Modelling codebase\n",
    "* Jamila Mifdal (author), European Space Agency Φ-lab, [@jmifdal](https://github.com/jmifdal)\n",
    "* Raquel Carmo (author), European Space Agency Φ-lab, [@raquelcarmo](https://github.com/raquelcarmo)\n",
    "* Marc Rußwurm (author), EPFL-ECEO, [@marccoru](https://github.com/MarcCoru)\n",
    "\n",
    "#### Modelling publications\n",
    "```{bibliography}\n",
    "  :style: plain\n",
    "  :list: bullet\n",
    "  :filter: topic % \"b34facfa-cea8-48f5-89f6-f11ce00812a9\"\n",
    "```\n",
    "\n",
    "#### Modelling funding\n",
    "This project is supported by [Φ-lab, European Space Agency](https://philab.phi.esa.int/)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "tags": []
   },
   "source": [
    "## Install and load libraries"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip -q install gdown"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": [
     "hide-input"
    ]
   },
   "outputs": [],
   "source": [
    "import os\n",
    "import numpy as np\n",
    "\n",
    "# intake and pooch library\n",
    "import intake\n",
    "import pooch\n",
    "\n",
    "# xarray\n",
    "import xarray as xr\n",
    "\n",
    "# machine learning libraries\n",
    "import torch\n",
    "\n",
    "# image libraries\n",
    "from PIL import Image\n",
    "\n",
    "# visualisation\n",
    "import rasterio as rio \n",
    "\n",
    "import matplotlib, matplotlib.cm\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "from skimage.exposure import equalize_hist\n",
    "\n",
    "import holoviews as hv\n",
    "import hvplot.xarray\n",
    "\n",
    "import warnings\n",
    "warnings.filterwarnings(action='ignore')\n",
    "\n",
    "hv.extension('bokeh', width=100)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Set folder structure"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Define the project main folder\n",
    "data_folder = './notebook'\n",
    "\n",
    "# Set the folder structure\n",
    "config = {\n",
    "    'in_geotiff': os.path.join(data_folder, 'input','tiff'),\n",
    "    'out_geotiff': os.path.join(data_folder, 'output','raster'),\n",
    "    'out_usecases': os.path.join(data_folder, 'output','usecases')\n",
    "}\n",
    "\n",
    "# List comprehension for the folder structure code\n",
    "[os.makedirs(val) for key, val in config.items() if not os.path.exists(val)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load and prepare input image\n",
    "\n",
    "### Fetch input data using `intake`\n",
    "\n",
    "The input data of this notebook can be fetched directly from a [Zenodo repository](https://zenodo.org/record/5827377#.YdgfjGjMK9I) using `pooch`. We provide the repository's DOI (10.5281/zenodo.5827377) and we set the target entries with corresponding output folders where the downloaded files will be stored."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "outputs": [
    {
     "ename": "NameError",
     "evalue": "name 'pooch' is not defined",
     "output_type": "error",
     "traceback": [
      "\u001B[0;31m---------------------------------------------------------------------------\u001B[0m",
      "\u001B[0;31mNameError\u001B[0m                                 Traceback (most recent call last)",
      "Cell \u001B[0;32mIn[1], line 1\u001B[0m\n\u001B[0;32m----> 1\u001B[0m inputs \u001B[38;5;241m=\u001B[39m \u001B[43mpooch\u001B[49m\u001B[38;5;241m.\u001B[39mcreate(\n\u001B[1;32m      2\u001B[0m     path\u001B[38;5;241m=\u001B[39mconfig[\u001B[38;5;124m'\u001B[39m\u001B[38;5;124min_geotiff\u001B[39m\u001B[38;5;124m'\u001B[39m],\n\u001B[1;32m      3\u001B[0m     base_url\u001B[38;5;241m=\u001B[39m\u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mdoi:10.5281/zenodo.5827377\u001B[39m\u001B[38;5;124m\"\u001B[39m,\n\u001B[1;32m      4\u001B[0m )\n",
      "\u001B[0;31mNameError\u001B[0m: name 'pooch' is not defined"
     ]
    }
   ],
   "source": [
    "dataset = pooch.create(\n",
    "    path=config['in_geotiff'],\n",
    "    # Use the figshare DOI\n",
    "    base_url=\"doi:10.5281/zenodo.5827377/\",\n",
    "    registry={\n",
    "        \"cancun_20180914.tif\": \"md5:1d4cdb7db75d3ade702767bbfbd3ea44\",\n",
    "        \"mytilini_20210621.tif\": \"md5:36edf5dace9c09914cc5fc3165109462\",        \n",
    "        \"RioDeJaneiro.tif\": \"md5:734ab89c2cbde1ad76b460f4b029d9a6\", \n",
    "        \"tangshan_20180130.tif\": \"md5:e62f1faf4a6a875c88892efa4fe72723\",\n",
    "    },\n",
    ")\n",
    "\n",
    "# Download all the files in a Zenodo record\n",
    "for fn in dataset.registry_files:\n",
    "    dataset.fetch(fn)"
   ],
   "metadata": {
    "collapsed": false,
    "ExecuteTime": {
     "end_time": "2024-03-12T09:10:49.416698Z",
     "start_time": "2024-03-12T09:10:49.285216Z"
    }
   }
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# write a catalog YAML file for GeoTIFF images\n",
    "catalog_images = os.path.join(data_folder, 'catalog_images.yaml')\n",
    "\n",
    "with open(catalog_images, 'w') as f:\n",
    "    f.write('''\n",
    "plugins:\n",
    "  source:\n",
    "    - module: intake_xarray\n",
    "sources:\n",
    "  cancun:\n",
    "    driver: rasterio\n",
    "    description: 'GeoTIFF image in Cancun'\n",
    "    args:\n",
    "      urlpath: \"{{ CATALOG_DIR }}/input/tiff/cancun_20180914.tif\"\n",
    "  mytilini:\n",
    "    driver: rasterio\n",
    "    description: 'GeoTIFF image in Mytilini'\n",
    "    args:\n",
    "      urlpath: \"{{ CATALOG_DIR }}/input/tiff/mytilini_20210621.tif\"\n",
    "  riodejaneiro:\n",
    "    driver: rasterio\n",
    "    description: 'GeoTIFF image in Rio de Janeiro'\n",
    "    args:\n",
    "      urlpath: \"{{ CATALOG_DIR }}/input/tiff/RioDeJaneiro.tif\"\n",
    "  tangshan:\n",
    "    driver: rasterio\n",
    "    description: 'GeoTIFF image in tangshan'\n",
    "    args:\n",
    "      urlpath: \"{{ CATALOG_DIR }}/input/tiff/tangshan_20180130.tif\"\n",
    "      ''')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cat_images = intake.open_catalog(catalog_images)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Inspect the Sentinel-2 data image\n",
    "In the example below, we inspect the .tif image provided in the [Zenodo repository](https://zenodo.org/record/5827377#.YdgfjGjMK9I), called \"RioDeJaneiro.tif\".\n",
    "\n",
    "Let's investigate the loaded `data-array` fetched by intake."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "image = cat_images['riodejaneiro'].read()\n",
    "\n",
    "print('shape =', image.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "tags": []
   },
   "source": [
    "## Compute model predictions\n",
    "In this work we focus on the spatial patterns of the floating targets, thus we address this detection task as a binary classification problem of floating objects versus water surface. A deep learning-based segmentation model was chosen to perform the detection and delineation of the floating targets: a U-Net Convolutional Neural Network (CNN). This model has been pre-trained on a large open-source hand-labeled Sentinel-2 dataset, developed by the authors, containing both Level-1C Top-Of-Atmosphere (TOA) and Level-2A Bottom-Of-Atmosphere (BOA) products over coastal water bodies. For more details, please refer to https://github.com/ESA-PhiLab/floatingobjects."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Load the model and its pre-trained weights\n",
    "For the U-Net, there are two pre-trained models available (described in this [link](https://github.com/ESA-PhiLab/floatingobjects/blob/master/hubconf.py)). For the sake of simplicity we are loading the weights of the 'unet_seed0', which refers to a U-Net architecture pre-trained for 50 epochs, with batch size of 160, learning rate of 0.001 and seed 0."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Device to run the computations on\n",
    "device = 'cuda' if torch.cuda.is_available() else 'cpu'\n",
    "\n",
    "#Available models and weights\n",
    "unet_seed0 = torch.hub.load('ESA-PhiLab/floatingobjects:master', 'unet_seed0', map_location=torch.device(device))\n",
    "#unet_seed1 = torch.hub.load('ESA-PhiLab/floatingobjects:master', 'unet_seed1', map_location=torch.device(device))\n",
    "\n",
    "#Select model\n",
    "model = unet_seed0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Compute predictions\n",
    "Input the image to the model to yield predictions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "l1cbands = [\"B1\", \"B2\", \"B3\", \"B4\", \"B5\", \"B6\", \"B7\", \"B8\", \"B8A\", \"B9\", \"B10\", \"B11\", \"B12\"]\n",
    "l2abands = [\"B1\", \"B2\", \"B3\", \"B4\", \"B5\", \"B6\", \"B7\", \"B8\", \"B8A\", \"B9\", \"B11\", \"B12\"]\n",
    "\n",
    "image = image.assign_coords(band_id=('band', l1cbands))\n",
    "image = image.set_index(band=\"band_id\")\n",
    "\n",
    "#If L1C image (13 bands), read only the 12 bands compatible with L2A data\n",
    "if (image.shape[0] == 13):\n",
    "    image = image.sel(band=l2abands)\n",
    "\n",
    "image= image[:,400:1400,400:1400] #subset to avoid memory issues in Binder (TODO dask delay or map_blocks might help to manipulate the whole image)\n",
    "image = image.values.astype(float)\n",
    "image *= 1e-4\n",
    "image = torch.from_numpy(image)\n",
    "\n",
    "#Compute predictions\n",
    "with torch.no_grad():\n",
    "    x = image.unsqueeze(0)\n",
    "    y_logits = torch.sigmoid(model(x.to(device)).squeeze(0))\n",
    "    y_score = y_logits.cpu().detach().numpy()[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Compute NDVI and FDI indices\n",
    "For detecting marine litter, pixelwise spectral features such as the Normalized Difference Vegetation Index (NDVI) and the [Floating Debris Index (FDI)](https://www.nature.com/articles/s41598-020-62298-z) are often chosen as problem-specific features when using model-driven classifiers (e.g. Random Forest or Naïve Bayes). In this case, because we're applying a data-driven approach (U-Net), we only resort to these indices for visual inspection purposes.\n",
    "To compute these indices, we used the following equations:\n",
    "\n",
    "$$NDVI=\\dfrac{R_{rs,NIR}-R_{rs,RED}}{R_{rs,NIR}+R_{rs,RED}}$$\n",
    "\n",
    "$$FDI=R_{rs,NIR}-R'_{rs,NIR}$$\n",
    "\n",
    "$$R'_{rs,NIR} = R_{rs,RED_2} + (R_{rs,SWIR_1} - R_{rs,RED_2}) \\times \\dfrac{(\\lambda_{NIR} - \\lambda_{RED})}{(\\lambda_{SWIR_1} - \\lambda_{RED})} \\times 10$$\n",
    "\n",
    "where:\n",
    "\n",
    "- $R_{rs,NIR}$ is the spectral reflectance measured in the near infrared waveband (band B08)\n",
    "- $R_{rs,RED}$ is the spectral reflectance measured in the red waveband (band B04)\n",
    "- $R_{rs,RED_2}$ is the spectral reflectance measured in the red edge waveband (band B06)\n",
    "- $R_{rs,SWIR_1}$ is the spectral reflectance measured in the shortwave infrared (band B11)\n",
    "- $\\lambda_{NIR} = 832.9$\n",
    "- $\\lambda_{RED} = 664.8$\n",
    "- $\\lambda_{SWIR_1} = 1612.05$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def calculate_fdi(scene):\n",
    "    '''Compute FDI index'''\n",
    "    NIR = scene[l2abands.index(\"B8\")]\n",
    "    RED2 = scene[l2abands.index(\"B6\")]\n",
    "    SWIR1 = scene[l2abands.index(\"B11\")]\n",
    "\n",
    "    lambda_NIR = 832.9\n",
    "    lambda_RED = 664.8\n",
    "    lambda_SWIR1 = 1612.05\n",
    "    NIR_prime = RED2 + (SWIR1 - RED2) * 10 * (lambda_NIR - lambda_RED) / (lambda_SWIR1 - lambda_RED)\n",
    "    return NIR - NIR_prime\n",
    "\n",
    "def calculate_ndvi(scene):\n",
    "    '''Compute NDVI index'''\n",
    "    NIR = scene[l2abands.index(\"B8\")].float()\n",
    "    RED = scene[l2abands.index(\"B4\")].float()\n",
    "    return (NIR - RED) / (NIR + RED + 1e-12)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Compute the NDVI and FDI bands corresponding to the image\n",
    "fdi = calculate_fdi(image).cpu().detach().numpy()\n",
    "fdi = np.expand_dims(fdi,0)\n",
    "fdi = np.squeeze(fdi,0)\n",
    "ndvi = calculate_ndvi(image).cpu().detach().numpy()\n",
    "\n",
    "#Compute RGB representation\n",
    "tensor = np.stack([image[l2abands.index('B4')], image[l2abands.index('B3')], image[l2abands.index('B2')]])\n",
    "rgb = equalize_hist(tensor.swapaxes(0,1).swapaxes(1,2))\n",
    "\n",
    "#Configure visualisation settings\n",
    "cmap_magma = matplotlib.cm.get_cmap('magma')\n",
    "cmap_viridis = matplotlib.cm.get_cmap('viridis')\n",
    "cmap_terrain = matplotlib.cm.get_cmap('terrain')\n",
    "norm_fdi = matplotlib.colors.Normalize(vmin=0, vmax=0.1)\n",
    "norm_ndvi = matplotlib.colors.Normalize(vmin=-.4, vmax=0.4)\n",
    "norm = matplotlib.colors.Normalize(vmin=0, vmax=0.4)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we create the interactive plots."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "general_settings = {'x':'x', 'y':'y', 'data_aspect':1, 'flip_yaxis':True, \n",
    "                    'xaxis':False, 'yaxis':None, 'tools':['tap', 'box_select']}\n",
    "\n",
    "#RGB\n",
    "#convert to 'xarray.DataArray'\n",
    "RGB_xr = xr.DataArray(rgb, dims=['y', 'x', 'band'], \n",
    "                      coords={'y': np.arange(rgb.shape[0]),\n",
    "                              'x': np.arange(rgb.shape[1]), \n",
    "                              'band': np.arange(rgb.shape[2])})\n",
    "plot_RGB = RGB_xr.hvplot.rgb(**general_settings, bands='band', title='RGB')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#FDI\n",
    "FDI = cmap_magma(norm_fdi(fdi))\n",
    "FDI_tmp = FDI[:,:,0:3]\n",
    "#convert to 'xarray.DataArray'\n",
    "FDI_xr = xr.DataArray(FDI_tmp, dims=['y', 'x', 'band'], \n",
    "                      coords={'y': np.arange(FDI_tmp.shape[0]),\n",
    "                              'x': np.arange(FDI_tmp.shape[1]), \n",
    "                              'band': np.arange(FDI_tmp.shape[2])})\n",
    "plot_FDI = FDI_xr.hvplot.rgb(**general_settings, bands='band', title='FDI')\n",
    "\n",
    "#NDVI\n",
    "NDVI = cmap_viridis(norm_ndvi(ndvi))\n",
    "NDVI_tmp = NDVI[:,:,0:3]\n",
    "#convert to 'xarray.DataArray'\n",
    "NDVI_xr = xr.DataArray(NDVI_tmp, dims=['y', 'x', 'band'], \n",
    "                       coords={'y': np.arange(NDVI_tmp.shape[0]),\n",
    "                               'x': np.arange(NDVI_tmp.shape[1]),\n",
    "                               'band': np.arange(NDVI_tmp.shape[2])})\n",
    "plot_NDVI = NDVI_xr.hvplot.rgb(**general_settings, bands='band', title='NDVI')\n",
    "\n",
    "#Predictions\n",
    "Predictions = cmap_magma(norm(y_score))\n",
    "#convert to 'xarray.DataArray'\n",
    "Predictions_xr = xr.DataArray(Predictions, dims=['y', 'x', 'band'], \n",
    "                              coords={'y': np.arange(Predictions.shape[0]),\n",
    "                                      'x': np.arange(Predictions.shape[1]), \n",
    "                                      'band': np.arange(Predictions.shape[2])})\n",
    "plot_Predictions = Predictions_xr.hvplot.rgb(**general_settings, bands='band', title='Predictions')\n",
    "\n",
    "#Classification\n",
    "Classification = np.where(y_score>0.4, 1, 0)\n",
    "#convert to 'xarray.DataArray'\n",
    "Classification_xr = xr.DataArray(Classification, dims=['y', 'x'], \n",
    "                                 coords={'y': np.arange(Classification.shape[0]),\n",
    "                                         'x': np.arange(Classification.shape[1])})\n",
    "plot_Classification = Classification_xr.hvplot(**general_settings, cmap='viridis', colorbar=False, title='Classification')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cplot =  plot_RGB + hv.Empty() + plot_FDI + plot_NDVI + plot_Predictions + plot_Classification\n",
    "cplot.cols(2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": [
     "remove-cell"
    ]
   },
   "outputs": [],
   "source": [
    "fpath = os.path.join(data_folder, 'interactive-plot_RioDeJaneiro.html')\n",
    "hvplot.save(cplot, fpath)\n",
    "print('Done.')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Convert predictions into geospatial files\n",
    "In this section we demonstrate how to export the model predictions to GeoTIFF format."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "input_tiff = config['in_geotiff'] + '/RioDeJaneiro.tif'\n",
    "output_raster = config['out_geotiff'] + '/predicted_rasters_RioDeJaneiro.tif'\n",
    "\n",
    "x = np.moveaxis(Predictions, -1, 0)  #(bands, height, width)\n",
    "\n",
    "with rio.open(input_tiff, \"r\") as src:\n",
    "    with rio.open(output_raster, 'w', driver='GTiff',\n",
    "                    height=x.shape[1], width=x.shape[2], count=x.shape[0],\n",
    "                    dtype=str(x.dtype),\n",
    "                    crs=src.crs, transform=src.transform) as save_preds:\n",
    "\n",
    "        #Save predictions as GeoTIFF\n",
    "        save_preds.write(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "tags": []
   },
   "source": [
    "## Use Cases\n",
    "In this section we validate our model on specific use cases. The images tested are provided in the [Zenodo repository](https://zenodo.org/record/5827377#.YdgfjGjMK9I)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Plastic Litter Project (PLP) 2021\n",
    "Here we validate the selected model on a Sentinel-2 scene capturing the deployed targets of the latest edition of the [Plastic Litter Project (PLP)](http://plp.aegean.gr/category/experiment-log-2021/) in Mytilene, Greece. The scene was acquired on the 21st of June 2021 (http://plp.aegean.gr/2021/06/21/target-2-placement-2/) and captures two targets:\n",
    "- a circular 28m diameter target composed of high-density polyethylene (HDPE) mesh, covering a total area of 615m$^2$;\n",
    "- a wooden target built with a rectangular grid pattern achieving the same pixel area coverage as the HDPE mesh target."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": [
     "hide-input"
    ]
   },
   "outputs": [],
   "source": [
    "plt.rcParams.update({'axes.titlesize': 'x-large'})\n",
    "\n",
    "def visualise(**images):\n",
    "    '''Visualisation function'''\n",
    "    n = len(images)\n",
    "    plt.figure(figsize=(25, 25))\n",
    "    for i, (name, image) in enumerate(images.items()):\n",
    "        plt.subplot(1, n, i + 1)\n",
    "        plt.xticks([])\n",
    "        plt.yticks([])\n",
    "        plt.title(name, size='xx-large')\n",
    "        plt.imshow(image)\n",
    "    plt.show()\n",
    "\n",
    "def run_preds(test_image, name):\n",
    "    '''Run the model over the test image and plot the results'''\n",
    "    image = test_image.assign_coords(band_id=('band', l1cbands))\n",
    "    image = image.set_index(band=\"band_id\")\n",
    "\n",
    "    #If L1C image (13 bands), read only the 12 bands compatible with L2A data\n",
    "    if (image.shape[0] == 13):\n",
    "        image = image.sel(band=l2abands)\n",
    "        \n",
    "    #Compute RGB representation\n",
    "    tensor = np.stack([image[l2abands.index('B4')], image[l2abands.index('B3')], image[l2abands.index('B2')]])\n",
    "    rgb = equalize_hist(tensor.swapaxes(0,1).swapaxes(1,2))\n",
    "\n",
    "    image = image.values.astype(float)\n",
    "    image *= 1e-4\n",
    "    image = torch.from_numpy(image)\n",
    "\n",
    "    #Compute predictions\n",
    "    with torch.no_grad():\n",
    "        x = image.unsqueeze(0)\n",
    "        y_logits = torch.sigmoid(model(x.to(device)).squeeze(0))\n",
    "        y_score = y_logits.cpu().detach().numpy()[0]\n",
    "        \n",
    "        #Save the output as .tif\n",
    "        output_usecase = config['out_usecases'] +'/' + 'predicted_' + name + '.tif'\n",
    "        im = Image.fromarray(y_score)\n",
    "        im.save(output_usecase)\n",
    "\n",
    "    #Compute the NDVI and FDI bands corresponding to the image\n",
    "    fdi = np.squeeze(np.expand_dims(calculate_fdi(image),0),0)\n",
    "    ndvi = calculate_ndvi(image)\n",
    "    \n",
    "    visualise(\n",
    "        RGB = rgb,\n",
    "        FDI = cmap_magma(norm_fdi(fdi)),\n",
    "        NDVI = cmap_viridis(norm_ndvi(ndvi)),\n",
    "        Predictions = cmap_magma(norm(y_score)),\n",
    "        Classification = np.where(y_score>0.5, 1, 0)\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can see that, despite the small size of the objects, the model was able to accurately predict the floating targets."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Read test image\n",
    "test_image = cat_images['mytilini'].read()\n",
    "run_preds(test_image, 'mytilini')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Sargassum\n",
    "Here we validate our model on a Sentinel-2 scene captured on the 14th of September 2018, likely to contain sargassum in the coastal area of Cancun, Mexico. Sargassum is a type of algae that could turn into a natural pollutant if it stays in the water for a long period of time. Thus it is important to monitor it and remove it when necessary. We notice that our model managed to detect the floating sargassum efficiently along with some waves. The detection of waves is a false positive and will be addressed in our future work. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Read test image\n",
    "test_image = cat_images['cancun'].read()\n",
    "run_preds(test_image, 'cancun')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Ice\n",
    "We also validate our model on a Sentinel-2 scene captured on the 30th of January 2018 containing ice in the coastal area of Tangshan, China. Sea ice is a floating object that has a different spatial geometry from the classical floating targets (algae, plumes, plastic, etc.). Nevertheless, our model was still able to detect accurately its shape."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#Read test image\n",
    "test_image = cat_images['tangshan'].read()\n",
    "run_preds(test_image, 'tangshan')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Summary"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this notebook we have explored the use of a deep learning-based segmentation model to detect and delineate floating targets on Sentinel-2 coastal scenes. We have developed the following pipeline:\n",
    "\n",
    "* fetch and load data from a Zenodo repository, using the `pooch` and `intake` packages;\n",
    "* inspect the Sentinel-2 images retrieved;\n",
    "* compute model predictions on several use cases using pre-trained model weights for U-Net;\n",
    "* visualise the RGB image, the NDVI and FDI indices along with the predictions and classification on interactive maps, using the `hvplot` package;\n",
    "* export the model predictions in GeoTIFF format, using the `rasterio` library;\n",
    "* export the model predictions for the use cases in GeoTIFF format."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Citing this Notebook"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Please see [CITATION.cff](https://github.com/eds-book-gallery/b34facfa-cea8-48f5-89f6-f11ce00812a9/blob/main/CITATION.cff) for the full citation information. The citation file can be exported to APA or BibTex formats (learn more [here](https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-citation-files))."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Additional information\n",
    "**License**: The code in this notebook is licensed under the MIT License. The Environmental Data Science book is licensed under the Creative Commons by Attribution 4.0 license. See further details [here](https://github.com/alan-turing-institute/environmental-ds-book/blob/master/LICENSE.md).\n",
    "\n",
    "**Contact**: If you have any suggestion or report an issue with this notebook, feel free to [create an issue](https://github.com/alan-turing-institute/environmental-ds-book/issues/new/choose) or send a direct message to [environmental.ds.book@gmail.com](mailto:environmental.ds.book@gmail.com)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": [
     "remove-input"
    ]
   },
   "outputs": [],
   "source": [
    "from datetime import date\n",
    "\n",
    "print('Notebook repository version: v1.1.1')\n",
    "print(f'Last tested: {date.today()}')"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}