{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "![OpenSARlab notebook banner](NotebookAddons/blackboard-banner.png)\n", "\n", "# Exploring SAR Time Series Data for Flood Monitoring\n", "\n", "### Franz J Meyer; University of Alaska Fairbanks & Josef Kellndorfer, [Earth Big Data, LLC](http://earthbigdata.com/)\n", "\n", "\n", "\n", "This notebook introduces you to the time series signatures associated with flooding. The data analysis is doen in the framework of *Jupyter Notebooks*. The Jupyter Notebook environment is easy to launch in any web browser for interactive data exploration with provided or new training data. Notebooks are comprised of text written in a combination of executable python code and markdown formatting including latex style mathematical equations. Another advantage of Jupyter Notebooks is that they can easily be expanded, changed, and shared with new data sets or newly available time series steps. Therefore, they provide an excellent basis for collaborative and repeatable data analysis.\n", "\n", "**This notebook covers the following data analysis concepts:**\n", "\n", "- How to load time series stacks into Jupyter Notebooks and how to explore image content using basic functions such as mean value calculation and histogram analysis.\n", "- How to extract time series information for individual pixels of an image.\n", "- Typical time series signatures over forests and deforestation sites.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Important Notes about JupyterHub**\n", "\n", "Your JupyterHub server will automatically shutdown when left idle for more than 1 hour. Your notebooks will not be lost but you will have to restart their kernels and re-run them from the beginning. You will not be able to seamlessly continue running a partially run notebook." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import url_widget as url_w\n", "notebookUrl = url_w.URLWidget()\n", "display(notebookUrl)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "from IPython.display import Markdown\n", "from IPython.display import display\n", "\n", "notebookUrl = notebookUrl.value\n", "user = !echo $JUPYTERHUB_USER\n", "env = !echo $CONDA_PREFIX\n", "if env[0] == '':\n", " env[0] = 'Python 3 (base)'\n", "if env[0] != '/home/jovyan/.local/envs/rtc_analysis':\n", " display(Markdown(f'WARNING:'))\n", " display(Markdown(f'This notebook should be run using the \"rtc_analysis\" conda environment.'))\n", " display(Markdown(f'It is currently using the \"{env[0].split(\"/\")[-1]}\" environment.'))\n", " display(Markdown(f'Select the \"rtc_analysis\" from the \"Change Kernel\" submenu of the \"Kernel\" menu.'))\n", " display(Markdown(f'If the \"rtc_analysis\" environment is not present, use Create_OSL_Conda_Environments.ipynb to create it.'))\n", " display(Markdown(f'Note that you must restart your server after creating a new environment before it is usable by notebooks.'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 0. Importing Relevant Python Packages \n", "\n", "In this notebook we will use the following scientific libraries:\n", "\n", "- [Pandas](https://pandas.pydata.org/) is a Python library that provides high-level data structures and a vast variety of tools for analysis. The great feature of this package is the ability to translate rather complex operations with data into one or two commands. Pandas contains many built-in methods for filtering and combining data, as well as the time-series functionality.\n", "- [GDAL](https://www.gdal.org/) is a software library for reading and writing raster and vector geospatial data formats. It includes a collection of programs tailored for geospatial data processing. Most modern GIS systems (such as ArcGIS or QGIS) use GDAL in the background.\n", "- [NumPy](http://www.numpy.org/) is one of the principal packages for scientific applications of Python. It is intended for processing large multidimensional arrays and matrices, and an extensive collection of high-level mathematical functions and implemented methods makes it possible to perform various operations with these objects.\n", "- [Matplotlib](https://matplotlib.org/index.html) is a low-level library for creating two-dimensional diagrams and graphs. With its help, you can build diverse charts, from histograms and scatterplots to non-Cartesian coordinates graphs. Moreover, many popular plotting libraries are designed to work in conjunction with matplotlib." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Check Python version:\n", "import sys\n", "pn = sys.version_info[0]\n", "\n", "from pathlib import Path\n", "import re\n", "from math import ceil\n", "\n", "import pandas as pd # for DatetimeIndex\n", "from osgeo import gdal # for GetRasterBand, Open, ReadAsArray\n", "gdal.UseExceptions()\n", "import numpy as np #for log10, mean, percentile, power\n", "from pyproj import Transformer\n", "\n", "%matplotlib widget\n", "import matplotlib.pyplot as plt # for add_subplot, axis, figure, imshow, legend, plot, set_axis_off, set_data,\n", " # set_title, set_xlabel, set_ylabel, set_ylim, subplots, title, twinx\n", "import matplotlib.patches as patches # for Rectangle\n", "import matplotlib.animation as an # for FuncAnimation\n", "from matplotlib import rc\n", "\n", "from ipyfilechooser import FileChooser\n", "\n", "from IPython.display import HTML\n", "plt.rcParams.update({'font.size': 12})\n", "\n", "if pn == 2:\n", " import cStringIO #needed for the image checkboxes\n", "elif pn == 3:\n", " import io\n", " import base64\n", " \n", "from opensarlab_lib import select_parameter\n", "\n", "# For exporting:\n", "from PIL import Image" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "--- \n", "\n", "### 1. Load Data Stack" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def get_tiff_paths(paths):\n", " tiff_paths = !ls $paths | sort -t_ -k5,5\n", " return tiff_paths" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Select the directory holding your tiffs**\n", "- Click the `Select` button\n", "- Navigate to your data directory\n", "- Click the `Select` button\n", "- Confirm that the desired path appears in green text\n", "- Click the `Change` button to alter your selection" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fc = FileChooser('/home/jovyan/notebooks')\n", "display(fc)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Determine the path to the analysis directory containing the tiff directory:**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tiff_dir = Path(fc.selected_path)\n", "analysis_dir = tiff_dir.parent\n", "print(f\"analysis_dir: {analysis_dir}\")\n", "\n", "paths = tiff_dir/\"*.tif*\"\n", "tiff_paths = get_tiff_paths(paths)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Select a polarization:**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pol = select_parameter(['vv', 'vh'])\n", "display(pol)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class PolarizationNotFoundError(Exception):\n", " pass\n", "polarization = pol.value\n", "if polarization == 'vv':\n", " path_options = ['*VV*.tif*', '*vv*.tif*']\n", "else:\n", " path_options = ['*VH*.tif*', '*vh*.tif*']\n", "\n", "wildcard_path = None \n", "\n", "for p in path_options:\n", " pths = list(tiff_dir.rglob(p))\n", " if pths:\n", " wildcard_path = tiff_dir/p\n", " break\n", " \n", "if not wildcard_path:\n", " raise PolarizationNotFoundError(f\"No files found in {tiff_dir} with {pol.value} polarization\") \n", "\n", "print(wildcard_path) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Write a function to extract the acquisition dates.**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def get_date(pth):\n", " date_regex = r'(?<=_)\\d{8}T\\d{6}(?=_)'\n", " try:\n", " return re.search(date_regex, str(pth)).group(0)\n", " except AttributeError:\n", " raise Exception(f\"Date string not found in {pth}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Call get_dates() to collect the product acquisition dates:**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dates = [get_date(d) for d in pths ]\n", "dates.sort()\n", "print(dates)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Create the VRT\n", "\n", "**Parse the polarization from a tiff name and define a path to the vrt:**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "raster_path = f'{analysis_dir}/raster_stack_{polarization}.vrt'\n", "print(raster_path)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Create the virtual raster table for the GeoTiffs:**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!gdalbuildvrt -separate $raster_path $wildcard_path" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Create Pandas time index and print the dates:**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "time_index = pd.DatetimeIndex(dates)\n", "\n", "for jacqdate, acqdate in enumerate(time_index):\n", " print('{:4d} {}'.format(jacqdate, acqdate.date()),end=' ')\n", " if (jacqdate % 5 == 4): print()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## 3. Data exploration with an animation\n", "\n", "**Read the data:**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "img = gdal.Open(raster_path)\n", "band = img.GetRasterBand(1)\n", "raster0 = band.ReadAsArray()\n", "band_number = 0 # Needed for updates\n", "rasterstack = img.ReadAsArray()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before analyzing the data, decide whether to use **linear or logarithmic scaling**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "use_dB = False\n", "\n", "def convert(raster, use_dB=use_dB):\n", " # some Python trickery: \n", " # if you call the convert function later, you can set the keyword \n", " # argument use_dB to True or False\n", " # if you do not provide a keyword argument, the value that you set\n", " # above (when defining the function) is used\n", " if use_dB:\n", " return 10 * np.log10(raster)\n", " else:\n", " return raster" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Create an animation to get an idea of where and when flooding might have occurred**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%capture \n", "figani = plt.figure(figsize=(10, 5))\n", "axani = figani.subplots()\n", "axani.axis('off')\n", "\n", "rasterstack_ = convert(rasterstack)\n", "\n", "imani = axani.imshow(rasterstack_[0,...], cmap='gray', vmin=np.nanpercentile(rasterstack_, 1), \n", " vmax=np.nanpercentile(rasterstack_, 99))\n", "axani.set_title(\"{}\".format(time_index[0].date()))\n", "\n", "def animate(i):\n", " axani.set_title(\"{}\".format(time_index[i].date()))\n", " imani.set_data(rasterstack_[i,...])\n", "\n", "# Interval is given in milliseconds\n", "ani = an.FuncAnimation(figani, animate, frames=rasterstack_.shape[0], interval=300)\n", "rc('animation', embed_limit=40971520.0) # We need to increase the limit maybe to show the entire animation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Render**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "HTML(ani.to_jshtml())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## 4. Create Minimum Image to Identify Inundated Areas\n", "\n", "As flooding is often associated with very low backscater, we first compute the minimum backscatter for each pixel to get a first impression of areas that could have been flooded during the entire period.\n", "\n", "**The following line calculates the minimum backscatter per pixel across the time series:**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "temporal_min = np.nanmin(convert(rasterstack), axis=0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 4.2 Visualize the Minimum Image with Curser Information Included\n", "\n", "We will now visualize the minimum image in a way that we can move our mouse over the image and visualize the line/sample image coordinates. This will help us create time-series information for the most interesting image locations. \n", " \n", "To do so, we first **create some helper functions:** " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "class pixelPicker:\n", " def __init__(self, image, width, height):\n", " self.x = None\n", " self.y = None\n", " self.fig = plt.figure(figsize=(width, height))\n", " self.ax = self.fig.add_subplot(111, visible=False)\n", " self.rect = patches.Rectangle(\n", " (0.0, 0.0), width, height, \n", " fill=False, clip_on=False, visible=False\n", " )\n", " \n", " self.rect_patch = self.ax.add_patch(self.rect)\n", " self.cid = self.rect_patch.figure.canvas.mpl_connect('button_press_event', \n", " self)\n", " self.image = image\n", " self.plot = self.gray_plot(self.image, fig=self.fig, return_ax=True)\n", " self.plot.set_title('Select a Point of Interest')\n", " \n", " \n", " def gray_plot(self, image, vmin=None, vmax=None, fig=None, return_ax=False):\n", " '''\n", " Plots an image in grayscale.\n", " Parameters:\n", " - image: 2D array of raster values\n", " - vmin: Minimum value for colormap\n", " - vmax: Maximum value for colormap\n", " - return_ax: Option to return plot axis\n", " '''\n", " if vmin is None:\n", " vmin = np.nanpercentile(self.image, 1)\n", " if vmax is None:\n", " vmax = np.nanpercentile(self.image, 99)\n", " if fig is None:\n", " my_fig = plt.figure() \n", " ax = fig.add_axes([0.1,0.1,0.8,0.8])\n", " ax.imshow(image, cmap=plt.cm.gist_gray, vmin=vmin, vmax=vmax)\n", " if return_ax:\n", " return(ax)\n", " \n", " \n", " def __call__(self, event):\n", " print('click', event)\n", " self.x = event.xdata\n", " self.y = event.ydata\n", " for pnt in self.plot.get_lines():\n", " pnt.remove()\n", " plt.plot(self.x, self.y, 'ro')\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we are ready to plot the minimum image. **Click a point interest for which you want to analyze radar brightness over time:**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig_xsize = 7.5\n", "fig_ysize = 7.5\n", "my_plot = pixelPicker(temporal_min, fig_xsize, fig_ysize)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Save the selected coordinates:**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sarloc = (ceil(my_plot.x), ceil(my_plot.y))\n", "print(sarloc)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## 5. Plot SAR Brightness Time Series at Point Locations\n", "\n", "### 5.1 SAR Brightness Time Series at Point Locations\n", "\n", "We will pick a pixel location identified in the SAR image above and plot the time series for this identified point. By focusing on image locations undergoing deforestation, we should see the changes in the radar cross section related to the deforestation event.\n", " \n", "First, for processing of the imagery in this notebook we generate a list of image handles and retrieve projection and georeferencing information. We also define a function for mapping image pixels to a geographic projection." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "geotrans = img.GetGeoTransform()\n", "proj = img.GetProjection().split('[')[-1][:-2].split(',')[-1][1:-1]\n", "xsize = img.RasterXSize\n", "ysize = img.RasterYSize\n", "bands = img.RasterCount\n", "transformer = Transformer.from_crs(f\"epsg:{proj}\", \"epsg:4326\")\n", "\n", "class MissingTansformerError(Exception):\n", " pass\n", "\n", "def geolocation(x, geotrans, y=None, latlon=False, transformer=None):\n", " if len(x) == 2:\n", " y = x[1]\n", " x = x[0]\n", " ref_x=geotrans[0]+sarloc[0]*geotrans[1]\n", " ref_y=geotrans[3]+sarloc[1]*geotrans[5]\n", " if latlon:\n", " if transformer:\n", " ref_y, ref_x = transformer.transform(ref_x, ref_y)\n", " else:\n", " raise MissingTansformerError(\n", " \"You must pass a pyproj transformer to geolocation to convert UTM to EPSG\")\n", " return (ref_x, ref_y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, let's **pick a rectangle around a center pixel which we selected and defined in variable *sarloc* ...**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "extent = (5, 5) # choose a 5 by 5 rectangle\n", "latlon = True\n", "refsarloc = geolocation(sarloc, geotrans, latlon=True, transformer=transformer)\n", "projsymbol = '°' if latlon else 'm'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**... and extract the time series for this small area around the selected center pixel in a memory-efficient way (needed for larger stacks):**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.rcParams.update({'font.size': 9})\n", "bs_aggregated = []\n", "for band in range(bands):\n", " rs = img.GetRasterBand(band+1).ReadAsArray(sarloc[0], sarloc[1], \n", " extent[0], extent[1])\n", " rs_mean = convert(np.nanmean(rs))\n", " bs_aggregated.append(rs_mean)\n", "\n", "fig, ax = plt.subplots(1, 1, figsize=(8, 8))\n", "labeldB = 'dB' if use_dB else 'linear'\n", "ax.plot(time_index, bs_aggregated, color='k', marker='o', markersize=3)\n", "ax.set_xlabel('Date')\n", "ax.set_ylabel(f'Sentinel-1 $\\\\gamma^0$ [{labeldB}]')\n", "plt.xticks(rotation = 45)\n", "\n", "plt.grid()\n", "_ = fig.suptitle(f'Location: {refsarloc[0]:.3f}{projsymbol} '\n", " f'{refsarloc[1]:.3f}{projsymbol}')\n", "\n", "# fig.tight_layout() \n", "figname = (f'RCSTimeSeries-{refsarloc[0]:.3f}{projsymbol} '\n", " f'{refsarloc[1]:.3f}{projsymbol}.png')\n", "plt.savefig(f'{analysis_dir}/{figname}', dpi=300, transparent='true')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*ExploreSARTimeSeriesFlood_From_Prepared_Data_Stack.ipynb - Version 1.3.2 - February 2024*\n", "\n", "*Version Changes:*\n", "\n", "- *update get_date function*" ] } ], "metadata": { "kernelspec": { "display_name": "rtc_analysis", "language": "python", "name": "conda-env-.local-rtc_analysis-py" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.1" } }, "nbformat": 4, "nbformat_minor": 4 }