{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "
\n", " Flood Mapping from Single Sentinel-1 SAR Images\n", "\n", "
\n", " Franz J Meyer, University of Alaska Fairbanks
\n", "
\n", "\n", "This notebook presents a SAR-based flood mapping approach that largely follows a methodology developed by the German Aerospace Center and published in Sentinel-1-based flood mapping: a fully automated processing chain by Twele et al.. The approach is based on radiometricall terrain corrected (RTC processed) Sentinel-1 SAR data and applies a dynamic thresholding method followed by fuzzy-logic-based post processing procedure. This notebook implements the initial threshold-based flood mapping approach but, for simplicity, does not include the fuzzy-logic post processing steps. \n", "\n", "The approach is based on image amplitude data and is capable of detecting standing surface water. Note that flooding under vegetation will not be detected by this approach.\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Important Notes about Binder \n", "

\n", " The Binder server will automatically shutdown when left idle for more than 10 minutes. Your notebook edits will be lost when this happens. You will need to relaunch the binder to continue working in a fresh copy of the notebook.\n", "

\n", " How to Save your Notebook Edits\n", "

\n", "The Easy Way\n", "
\n", "Click on the Jupyter logo at the top left of the screen to access the file manager. Download the notebook, then upload and run it the next time you restart the server.\n", "

\n", "The Better, More Complicated Way\n", "
\n", "This solution requires some knowledge of git. Fork the asf-jupyter-notebook repository and update the url for the Binder launch button to the url of your fork. The url to edit can be found in the first line of the README.md file for this branch. Once you have your own fork, push any notebook changes to it prior to shutting down the server or allowing it to time out.
\n", "

\n", "
\n", "
" ] }, { "cell_type": "markdown", "metadata": { "heading_collapsed": true }, "source": [ "# General Methodology and Workflow" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "The workflow of the Sentinel-1-based processing chain, as outlined in the figure below, is composed of the following main elements: \n", "
    \n", "
  1. Find relevant SAR data over your area of interest at the Alaska Satellite Facility's SAR archive and Perform geometric and radiometric terrain correction using the RTC processing flow by GAMMA Remote Sensing,
  2. \n", "
  3. adaptive and automatic threshold calculation as discussed in the course,
  4. \n", "
  5. initial flood detection by applying the calculated threshold image wide,
  6. \n", "
  7. fuzzy-logic-based classification refinement,
  8. \n", "
  9. final classification into permanent and flood waters using auxiliary data, and
  10. \n", "
  11. dissemination of the results.
  12. \n", "
\n", "\n", "Step 1 of this workflow are operationally implemented in the Alaska Satellite Facility's Hybrid Pluggable Processing Pipeline (HyP3) environment and accessible to the public at the HyP3 Website. This notebook will focus on Steps 2 and 3\n", "
\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Flood Mapping Procedure" ] }, { "cell_type": "markdown", "metadata": { "heading_collapsed": true }, "source": [ "## Loading Python Libraries" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hidden": true }, "outputs": [], "source": [ "import glob\n", "import os\n", "from typing import Tuple\n", "\n", "import numpy as np\n", "from osgeo import gdal\n", "import pyproj\n", "import pandas as pd\n", "import rasterio\n", "import skfuzzy\n", "\n", "import ipywidgets as ui\n", "\n", "import matplotlib.pyplot as plt # for add_subplot, axis, figure, imshow, legend, plot, set_axis_off, set_data,\n", " # set_title, set_xlabel, set_ylabel, set_ylim, subplots, title, twinx\n", "\n", "from asf_notebook_FloodMapping import handle_old_data\n", "from asf_notebook_FloodMapping import input_path\n", "from asf_notebook_FloodMapping import NoHANDLayerException\n", "\n", "%matplotlib widget" ] }, { "cell_type": "markdown", "metadata": { "heading_collapsed": true }, "source": [ "## Write Some Helper Scripts" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "Write a function to pad an image, so it may be split into tiles with consistent dimensions" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hidden": true }, "outputs": [], "source": [ "def pad_image(image: np.ndarray, to: int) -> np.ndarray:\n", " height, width = image.shape\n", "\n", " n_rows, n_cols = get_tile_row_col_count(height, width, to)\n", " new_height = n_rows * to\n", " new_width = n_cols * to\n", "\n", " padded = np.zeros((new_height, new_width))\n", " padded[:image.shape[0], :image.shape[1]] = image\n", " return padded" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "Write a function to tile an image" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hidden": true }, "outputs": [], "source": [ "#def tile_image(image: np.ndarray, width: int = 200, height: int = 200) -> np.ndarray:\n", "def tile_image(image: np.ndarray, width, height) -> np.ndarray:\n", " _nrows, _ncols = image.shape\n", " _strides = image.strides\n", "\n", " nrows, _m = divmod(_nrows, height)\n", " ncols, _n = divmod(_ncols, width)\n", "\n", " assert _m == 0, \"Image must be evenly tileable. Please pad it first\"\n", " assert _n == 0, \"Image must be evenly tileable. Please pad it first\"\n", "\n", " return np.lib.stride_tricks.as_strided(\n", " np.ravel(image),\n", " shape=(nrows, ncols, height, width),\n", " strides=(height * _strides[0], width * _strides[1], *_strides),\n", " writeable=False\n", " ).reshape(nrows * ncols, height, width)\n" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "Write a function for multi-class Expectation Maximization Thresholding" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hidden": true }, "outputs": [], "source": [ "def EMSeg_opt(image, number_of_classes):\n", " image_copy = image.copy()\n", " image_copy2 = np.ma.filled(image.astype(float), np.nan) # needed for valid posterior_lookup keys\n", " image = image.flatten()\n", " minimum = np.amin(image)\n", " image = image - minimum + 1\n", " maximum = np.amax(image)\n", "\n", " size = image.size\n", " histogram = make_histogram(image)\n", " nonzero_indices = np.nonzero(histogram)[0]\n", " histogram = histogram[nonzero_indices]\n", " histogram = histogram.flatten()\n", " class_means = (\n", " (np.arange(number_of_classes) + 1) * maximum /\n", " (number_of_classes + 1)\n", " )\n", " class_variances = np.ones((number_of_classes)) * maximum\n", " class_proportions = np.ones((number_of_classes)) * 1 / number_of_classes\n", " sml = np.mean(np.diff(nonzero_indices)) / 1000\n", " iteration = 0\n", " while(True):\n", " class_likelihood = make_distribution(\n", " class_means, class_variances, class_proportions, nonzero_indices\n", " )\n", " sum_likelihood = np.sum(class_likelihood, 1) + np.finfo(\n", " class_likelihood[0][0]).eps\n", " log_likelihood = np.sum(histogram * np.log(sum_likelihood))\n", " for j in range(0, number_of_classes):\n", " class_posterior_probability = (\n", " histogram * class_likelihood[:,j] / sum_likelihood\n", " )\n", " class_proportions[j] = np.sum(class_posterior_probability)\n", " class_means[j] = (\n", " np.sum(nonzero_indices * class_posterior_probability)\n", " / class_proportions[j]\n", " )\n", " vr = (nonzero_indices - class_means[j])\n", " class_variances[j] = (\n", " np.sum(vr *vr * class_posterior_probability)\n", " / class_proportions[j] +sml\n", " )\n", " del class_posterior_probability, vr\n", " class_proportions = class_proportions + 1e-3\n", " class_proportions = class_proportions / np.sum(class_proportions)\n", " class_likelihood = make_distribution(\n", " class_means, class_variances, class_proportions, nonzero_indices\n", " )\n", " sum_likelihood = np.sum(class_likelihood, 1) + np.finfo(\n", " class_likelihood[0,0]).eps\n", " del class_likelihood\n", " new_log_likelihood = np.sum(histogram * np.log(sum_likelihood))\n", " del sum_likelihood\n", " if((new_log_likelihood - log_likelihood) < 0.000001):\n", " break\n", " iteration = iteration + 1\n", " del log_likelihood, new_log_likelihood\n", " class_means = class_means + minimum - 1\n", " s = image_copy.shape\n", " posterior = np.zeros((s[0], s[1], number_of_classes))\n", " posterior_lookup = dict()\n", " for i in range(0, s[0]):\n", " for j in range(0, s[1]):\n", " pixel_val = image_copy2[i,j] \n", " if pixel_val in posterior_lookup:\n", " for n in range(0, number_of_classes): \n", " posterior[i,j,n] = posterior_lookup[pixel_val][n]\n", " else:\n", " posterior_lookup.update({pixel_val: [0]*number_of_classes})\n", " for n in range(0, number_of_classes): \n", " x = make_distribution(\n", " class_means[n], class_variances[n], class_proportions[n],\n", " image_copy[i,j]\n", " )\n", " posterior[i,j,n] = x * class_proportions[n]\n", " posterior_lookup[pixel_val][n] = posterior[i,j,n]\n", " return posterior, class_means, class_variances, class_proportions\n", "\n", "def make_histogram(image):\n", " image = image.flatten()\n", " indices = np.nonzero(np.isnan(image))\n", " image[indices] = 0\n", " indices = np.nonzero(np.isinf(image))\n", " image[indices] = 0\n", " del indices\n", " size = image.size\n", " maximum = int(np.ceil(np.amax(image)) + 1)\n", " #maximum = (np.ceil(np.amax(image)) + 1)\n", " histogram = np.zeros((1, maximum))\n", " for i in range(0,size):\n", " #floor_value = int(np.floor(image[i]))\n", " floor_value = np.floor(image[i]).astype(np.uint8)\n", " #floor_value = (np.floor(image[i]))\n", " if floor_value > 0 and floor_value < maximum - 1:\n", " temp1 = image[i] - floor_value\n", " temp2 = 1 - temp1\n", " histogram[0,floor_value] = histogram[0,floor_value] + temp1\n", " histogram[0,floor_value - 1] = histogram[0,floor_value - 1] + temp2\n", " histogram = np.convolve(histogram[0], [1,2,3,2,1])\n", " histogram = histogram[2:(histogram.size - 3)]\n", " histogram = histogram / np.sum(histogram)\n", " return histogram\n", "\n", "def make_distribution(m, v, g, x):\n", " x = x.flatten()\n", " m = m.flatten()\n", " v = v.flatten()\n", " g = g.flatten()\n", " y = np.zeros((len(x), m.shape[0]))\n", " for i in range(0,m.shape[0]):\n", " d = x - m[i]\n", " amp = g[i] / np.sqrt(2*np.pi*v[i])\n", " y[:,i] = amp * np.exp(-0.5 * (d * d) / v[i])\n", " return y\n", "\n", "def get_proj4(filename):\n", " f=rasterio.open(filename)\n", " return pyproj.Proj(f.crs, preserve_units=True) #used in pysheds\n", " \n", "def gdal_read(filename, ndtype=np.float64):\n", " '''\n", " z=readData('/path/to/file')\n", " '''\n", " ds = gdal.Open(filename) \n", " return np.array(ds.GetRasterBand(1).ReadAsArray()).astype(ndtype);\n", "def gdal_get_geotransform(filename):\n", " '''\n", " [top left x, w-e pixel resolution, rotation, top left y, rotation, n-s pixel resolution]=gdal_get_geotransform('/path/to/file')\n", " '''\n", " #http://stackoverflow.com/questions/2922532/obtain-latitude-and-longitude-from-a-geotiff-file\n", " ds = gdal.Open(filename)\n", " return ds.GetGeoTransform()" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "Write a function to calculate the number of rows and columns of tiles needed to tile an image to a given size" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hidden": true }, "outputs": [], "source": [ "def get_tile_row_col_count(height: int, width: int, tile_size: int) -> Tuple[int, int]:\n", " return int(np.ceil(height / tile_size)), int(np.ceil(width / tile_size))" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "Write a function to extract the tiff dates from a wildcard path: " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hidden": true }, "outputs": [], "source": [ "def get_dates(paths):\n", " dates = []\n", " pths = glob.glob(paths)\n", " for p in pths:\n", " date = p.split('/')[-1].split(\"_\")[0]\n", " dates.append(date)\n", " dates.sort()\n", " return dates" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "Write a function to save a mask" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hidden": true }, "outputs": [], "source": [ "def write_mask_to_file(mask: np.ndarray, file_name: str, projection: str, geo_transform: str) -> None:\n", " (width, height) = mask.shape\n", " out_image = gdal.GetDriverByName('GTiff').Create(\n", " file_name, height, width, bands=1\n", " )\n", " out_image.SetProjection(projection)\n", " out_image.SetGeoTransform(geo_transform)\n", " out_image.GetRasterBand(1).WriteArray(mask)\n", " out_image.GetRasterBand(1).SetNoDataValue(0)\n", " out_image.FlushCache()" ] }, { "cell_type": "markdown", "metadata": { "heading_collapsed": true }, "source": [ "## Load SAR Data Sets to Process" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hidden": true }, "outputs": [], "source": [ "def get_tiff_paths(paths: str) -> list:\n", " tiff_paths = !ls $paths | sort -t_ -k5,5\n", " return tiff_paths" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "Reading a SAR Data Stack:\n", "\n", "This lab uses a subset of a Sentinel-1 SAR time series acquired near the city of Malda, on the Indian Bangladesh border. This area experienced extensive flooding during the 2020 South Asia monsoon season. The time series covers June to August of 2020 and combines ascending and descending RTC imagery into a joint and consistent time series to monitoring this rapidly developing event. " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hidden": true }, "outputs": [], "source": [ "name = \"BangladeshFloodMapping_binder\"\n", "tiff_dir = path = f\"/home/jovyan/{name}\"\n", "if not os.path.exists(tiff_dir):\n", " os.mkdir(tiff_dir)\n", "os.chdir(tiff_dir)\n", "print(f\"Current working directory: {os.getcwd()}\")\n", "\n", "time_series_path = f\"s3://asf-jupyter-data-west/{name}.tar.gz\"\n", "time_series = os.path.basename(time_series_path)\n", "!aws --no-sign-request --region us-west-2 s3 cp $time_series_path $time_series\n", "\n", "!tar -xvzf {name}.tar.gz\n", "!rm *.tar.gz" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "Move into the parent directory of the directory containing the data and create a directory in which to store the water masks" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hidden": true }, "outputs": [], "source": [ "analysis_directory = tiff_dir\n", "os.chdir(tiff_dir)\n", "mask_directory = f'{analysis_directory}/Water_Masks'\n", "if not os.path.exists(mask_directory):\n", " os.mkdir(mask_directory)\n", "print(f\"Current working directory: {os.getcwd()}\")\n", "\n", "paths = f\"{tiff_dir}/*_V*.tif*\"\n", "if os.path.exists(tiff_dir):\n", " tiff_paths = get_tiff_paths(paths)" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "Write a function to create a dictionary containing lists of each vv/vh pair" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hidden": true }, "outputs": [], "source": [ "def group_polarizations(tiff_paths: list) -> dict:\n", " pths = {}\n", " for tiff in tiff_paths:\n", " product_name = tiff.split('.')[0][:-2]\n", " if product_name in pths:\n", " pths[product_name].append(tiff)\n", " else:\n", " pths.update({product_name: [tiff]})\n", " pths[product_name].sort()\n", " return pths" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "Write a function to confirm the presence of both VV and VH images in all image sets" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hidden": true }, "outputs": [], "source": [ "def confirm_dual_polarizations(paths: dict) -> bool:\n", " for p in paths:\n", " if len(paths[p]) == 2:\n", " if ('vv' not in paths[p][1] and 'VV' not in paths[p][1]) or \\\n", " ('vh' not in paths[p][0] and 'VH' not in paths[p][0]):\n", " return False\n", " return True " ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "Create a dictionary of VV/VH pairs and check it for completeness" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hidden": true }, "outputs": [], "source": [ "grouped_pths = group_polarizations(tiff_paths)\n", "if not confirm_dual_polarizations(grouped_pths):\n", " print(\"ERROR: AI_Water requires both VV and VH polarizations.\")\n", "else:\n", " print(\"Confirmed presence of VV and VH polarities for each product.\")\n", " \n", "#print(grouped_pths) #uncomment to print VV/VH path pairs" ] }, { "cell_type": "markdown", "metadata": { "heading_collapsed": true }, "source": [ "## Load HAND Layer for your AOI and Create HAND-Exclusion Mask (HAND-EM)" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "This notebook uses a prepared Height Above Nearest Drainage (HAND) file that was cut to the same extent as the SAR image time series. A DEM is used to create HAND by calculating the height difference between a particular image pixels and the nearest drainage (such as the nearest river). \n", " \n", "To create the HAND Exclusion Mask (HAND-EM) we then threshold the HAND layer by assuming that pixels with a HAND value > 15 m are unlikely to be flooded. This means, in other words, we assume that flood waters are unlikely to exceed a depth of 15 m. \n", "" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hidden": true }, "outputs": [], "source": [ "#load HAND and derive HAND-EM\n", "Hthresh = 15\n", "\n", "HAND_file=\"/home/jovyan/BangladeshFloodMapping_binder/Bangladesh_Training_DEM_hand.tif\"\n", "print(f\"Selected HAND: {HAND_file}\")\n", "try:\n", " HAND_gT=gdal_get_geotransform(HAND_file)\n", "except AttributeError:\n", " raise NoHANDLayerException(\"Remember to select a HAND layer in the previous cell.\")\n", "HAND_proj4=get_proj4(HAND_file)\n", "HAND=gdal_read(HAND_file)\n", "hand = np.nan_to_num(HAND)\n", "\n", "# Create Binary HAND-EM\n", "Hmask = hand < Hthresh\n", "handem = np.zeros_like(hand)\n", "sel = np.ones_like(hand)\n", "handem[Hmask] = sel[Hmask]" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "Now let's plot HAND and HAND-EM side-by-side. Dark blue regions in the HAND file (left) area areas near drainage systems such as rivers. These areas are most likely to be affected by floods. Areas in red, however, are at higher elevations and less likely to be flooded.\n", "\n", "The HAND-EM layer to the right shows pixels unlikely to contain flood waters in black. These pixels are at least 15 m above the nearest drainage system." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hidden": true }, "outputs": [], "source": [ "fig = plt.figure(figsize=(9, 5))\n", "plt.suptitle('Height Above Nearest Drainage [HAND] (m)')\n", "ax1 = fig.add_subplot(121)\n", "ax2 = fig.add_subplot(122)\n", "vmin = np.percentile(hand, 5)\n", "vmax = np.percentile(hand, 95)\n", "hh = ax1.imshow(hand, cmap='jet', vmin=vmin, vmax=vmax)\n", "ax1.set_title('HAND [m]')\n", "ax1.grid()\n", "fig.colorbar(hh,ax=ax1)\n", "ax2.imshow(handem, cmap='gray')\n", "ax2.set_title('HAND-EM')\n", "ax2.grid()" ] }, { "cell_type": "markdown", "metadata": { "heading_collapsed": true }, "source": [ "## Do Initial Flood Mapping using Adaptive Dynamic Thresholding" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "A bit of background on the implemented approach:\n", "\n", "This is what is implemented in this notebook: An automatic tile-based thresholding procedure (Martinis, Twele, and Voigt 2009) is used to generate an land/water classification. The selection of tiles is performed on a bilevel quadtree structure with parent level $L^+$ and child level $L^−$:\n", " \n", "1. Firstly, the entire data are separated into quadratic non-overlapping parent tiles on level $L^+$ with a size of $100 \\times 100$ pixels. Each parent object is further represented by four quadratic child objects on a second level $L^−$. The tile selection process is based on statistical hierarchical relations between parent and child objects. \n", "2. A number of parent tiles is automatically selected which offer the highest (>95% quantile) coefficient of variation on $L^+$ of the mean backscatter values of the respective child objects on $L^−$. This criterion serves as a measure of the degree of variation within the data and can therefore be used as an indicator of the probability that the tiles are characterized by spatial inhomogeneity and contain more than one semantic class. The selected parent objects should also have a mean individual backscatter value lower than the mean of all parent tiles on $L^+$. This ensures that tiles lying on the boundary between water and no water areas are selected. In case that no tiles fulfil these criteria, the tile size on $L^+$ and $L^−$ is halved and the quantile for the tile selection is reduced to 90% to guarantee a successful tile selection also in data with a relatively low extent of water surfaces or with smaller dispersed water bodies. \n", "3. To improve the robustness of the automatic threshold derivation the approach restricts the tile selection in Step (3) to only pixels situated in flood-prone regions defined by a Height Above Nearest Drainage (HAND)-based binary Exclustion Mask (HAND-EM). To create HAND-EM, a threshold is applied to HAND to identify non-flood prone areas. A threshold value of $\\geq 15m$ is proposed. The HAND-EM further is shrunk by one pixel using an 8-neighbour function to account for potential geometric inaccuracies between the exclude layer and SAR data. Tiles are only considered in case less than 20% of its data pixels are excluded by HAND-EM.\n", "4. Out of the number of the initially selected tiles, a limited number of N parent tiles are finally chosen for threshold computation. This selection is accomplished by ranking the parent tiles according to the standard deviation of the mean backscatter values of the respective child objects. Tiles with the highest values are chosen for $N$. Extensive testing yielded that $N = 5$ is a sufficient number of parent tiles for threshold computation. \n", "5. A multi-mode Expectation Maximization minimum error thresholding approach is then employed to derive local threshold values using a cost function which is based on the statistical parameterization of the sub-histograms of all selected tiles as bi-modal Gaussian mixture distributions. In order to derive a global (i.e. scenebased) threshold, the locally derived thresholds are combined by computing their arithmetic mean. \n", "6. Using the dynamically calculated threshold, both the VV and VH scenes are thresholded for water detection\n", "7. The detected water maps are combined for arrive at an intial water mask that can be further refined in post processing

\n", "\n", "Here some additional refinement steps that could be added to the notebook in the future: \n", "\n", "1. Calculation of attributes for fuzzy-logic post processing: Based on the preliminary classification, the mean elevation of water objects and the size of all individual flood objects should be calculated. Together with the earlier derived land–water threshold, these parameters calculated from the initial classification can later be used to define fuzzy thresholds for the standard $S$ and $Z$ membership functions (see Section on Post Processing). This means that the initial classification based on the automatic thresholding procedure is mandatory to build elements for the fuzzy-logic-based refinement.
" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "Now Let's do the Work:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hidden": true, "scrolled": true }, "outputs": [], "source": [ "# define some variables you might want to change\n", "precentile = 0.95 # Standard deviation percentile threshold for pivotal tile selection\n", "tilesize = 100 # Setting tile size to use in thresholding algorithm\n", "tilesize2 = 50\n", "Hpick = 0.8 # Threshold for required fraction of valid HAND-EM pixels per tile\n", "vv_corr = -17.0 # VV threshold to use if threshold calculation did not succeed\n", "vh_corr = -24.0 # VH threshold to use if threshold calculation did not succeed\n", "\n", "# Tile up HAND-EM data\n", "handem_p = pad_image(handem, tilesize)\n", "hand_tiles = tile_image(handem_p,width=tilesize,height=tilesize)\n", "Hsum = np.sum(hand_tiles, axis=(1,2))\n", "Hpercent = Hsum/(tilesize*tilesize)\n", "\n", "# Now do adaptive threshold selection\n", "vv_thresholds = np.array([])\n", "vh_thresholds = np.array([])\n", "floodarea = np.array([])\n", "vh_thresholds_corr = np.array([])\n", "vv_thresholds_corr = np.array([])\n", "\n", "posterior_lookup = dict()\n", "\n", "for i, pair in enumerate(grouped_pths):\n", " print(f\"Processing pair {i+1} of {len(grouped_pths)}\")\n", " for tiff in grouped_pths[pair]:\n", " f = gdal.Open(tiff)\n", " img_array = f.ReadAsArray()\n", " original_shape = img_array.shape\n", " n_rows, n_cols = get_tile_row_col_count(*original_shape, tile_size=tilesize)\n", " print(f'tiff: {tiff}')\n", " if 'vv' in tiff or 'VV' in tiff:\n", " vv_array = pad_image(f.ReadAsArray(), tilesize)\n", " invalid_pixels = np.nonzero(vv_array == 0.0)\n", " vv_tiles = tile_image(vv_array,width=tilesize,height=tilesize)\n", " a = np.shape(vv_tiles)\n", " vv_std = np.zeros(a[0])\n", " vvt_masked = np.ma.masked_where(vv_tiles==0, vv_tiles)\n", " vv_picktiles = np.zeros_like(vv_tiles)\n", " for k in range(a[0]):\n", " vv_subtiles = tile_image(vvt_masked[k,:,:],width=tilesize2,height=tilesize2)\n", " vvst_mean = np.ma.mean(vv_subtiles, axis=(1,2))\n", " vvst_std = np.ma.std(vvst_mean)\n", " vv_std[k] = np.ma.std(vvst_mean) \n", " \n", " # find tiles with largest standard deviations\n", " vv_mean = np.ma.median(vvt_masked, axis=(1,2))\n", " x_vv = np.sort(vv_std/vv_mean)\n", " y_vv = np.arange(1, x_vv.size+1) / x_vv.size\n", " \n", " percentile2 = precentile\n", " sort_index = 0\n", " while np.size(sort_index) < 5: \n", " threshold_index_vv = np.ma.min(np.where(y_vv>percentile2))\n", " threshold_vv = x_vv[threshold_index_vv]\n", " #sd_select_vv = np.nonzero(vv_std/vv_mean>threshold_vv)\n", " s_select_vv = np.nonzero(vv_std/vv_mean>threshold_vv) \n", " h_select_vv = np.nonzero(Hpercent > Hpick) # Includes HAND-EM in selection\n", " sd_select_vv = np.intersect1d(s_select_vv, h_select_vv)\n", " \n", " # find tiles with mean values lower than the average mean\n", " omean_vv = np.ma.median(vv_mean[h_select_vv])\n", " mean_select_vv = np.nonzero(vv_meanpercentile2))\n", " threshold_vh = x_vh[threshold_index_vh]\n", " #sd_select_vh = np.nonzero(vh_std/vh_mean>threshold_vh)\n", " s_select_vh = np.nonzero(vh_std/vh_mean>threshold_vh) \n", " h_select_vh = np.nonzero(Hpercent > Hpick) # Includes HAND-EM in selection\n", " sd_select_vh = np.intersect1d(s_select_vh, h_select_vh)\n", " \n", " # find tiles with mean values lower than the average mean \n", " omean_vh = np.ma.median(vh_mean[h_select_vh])\n", " mean_select_vh = np.nonzero(vh_mean 0\n", " flood_comb = np.zeros_like(vv_array)\n", " flood_comb[comb_mask] = sel[comb_mask]\n", " filename, ext = os.path.basename(tiff).split('.')\n", " outfile = f\"{mask_directory}/{filename[:-3]}_water_mask_combined.{ext}\"\n", " write_mask_to_file(flood_comb, outfile, f.GetProjection(), f.GetGeoTransform())\n", " \n", " # Create Information on Thresholds used as well as Flood extent information in km2\n", " vv_thresholds = np.append(vv_thresholds, 10.0*(m_thresh_vv-30))\n", " vh_thresholds = np.append(vh_thresholds, 10.0*(m_thresh_vh-30)) \n", " floodarea = np.append(floodarea,(np.sum(flood_comb)*30**2./(1000**2)))" ] }, { "cell_type": "markdown", "metadata": { "heading_collapsed": true }, "source": [ "## Evaluate Flood Mapping Results" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "The code cell below plots the time series of flooded area that was found in the analyzed SAR scenes" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hidden": true }, "outputs": [], "source": [ "temp_path = f\"{tiff_dir}/*.tiff\"\n", "dates = get_dates(temp_path)\n", "time_index = pd.DatetimeIndex(dates)\n", "\n", "fig = plt.figure(figsize=(9, 4))\n", "ax1 = fig.add_subplot(111) # 121 determines: 2 rows, 2 plots, first plot\n", "ax1.plot(np.unique(time_index), floodarea, color='b', marker='o', markersize=3, label='Flood Area [$km^2$]')\n", "ax1.set_ylim([np.min(floodarea)-10,np.max(floodarea)+10])\n", "ax1.set_xlabel('Date')\n", "ax1.axhline(y=np.mean(floodarea), color='r', linestyle='--')\n", "ax1.set_ylabel('Flood Area [$km^2$]')\n", "ax1.grid()\n", "figname = ('ThresholdAndAreaTS.png')\n", "ax1.legend(loc='upper right')\n", "plt.savefig(figname, dpi=300, transparent='true')\n" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "The following code cells allow you to visualize individual flood maps superimposed on the respective SAR image they were derived from.\n", "\n", "This next code cell first creates the information we are plotting below such as surface water and SAR data stacks.\n", "" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hidden": true }, "outputs": [], "source": [ "wpaths = f\"{tiff_dir}/Water_Masks/*combined.tif*\"\n", "spaths = f\"{tiff_dir}/*VV.tif*\"\n", "vrtcommand = f\"gdalbuildvrt -separate Water.vrt {wpaths}\"\n", "!{vrtcommand}\n", "water_file=\"Water.vrt\"\n", "vrtcommand = f\"gdalbuildvrt -separate SAR.vrt {spaths}\"\n", "!{vrtcommand}\n", "SAR_file = \"SAR.vrt\"\n", "\n", "img = gdal.Open(SAR_file)\n", "wm = gdal.Open(water_file)\n", "SARstack = img.ReadAsArray(5, 20, 5, 5)\n", "SARsize = np.shape(SARstack)\n", "SARbands = SARsize[0]" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "Please change the ```band_num``` setting in the next code cell to visualize flood mapping results for different SAR image acquisition dates.\n", "\n", "Note the tool bar in the bottom left corner of the image that's created. Feel free to use the toolbar to zoom into the image and navigate around." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hidden": true }, "outputs": [], "source": [ "band_num = 7# Change the band number to visualize different SAR acquisitions and respective flood mapping results.\n", "if band_num > SARbands:\n", " band_num = SARbands\n", "\n", "SARraster = img.GetRasterBand(band_num).ReadAsArray()\n", "waterraster = wm.GetRasterBand(band_num).ReadAsArray()\n", "water_masked = np.ma.masked_where(waterraster==0, waterraster)\n", "waterraster = 0\n", "plt.figure(figsize=(9, 6))\n", "vmin = np.percentile(SARraster, 5) #vh_array\n", "vmax = np.percentile(SARraster, 95)\n", "plt.imshow(SARraster, cmap='gray', vmin=vmin, vmax=vmax)\n", "plt.suptitle(f'Water Mask on SAR Image: {time_index.year[band_num*2-1]} / {time_index.month[band_num*2-1]} / {time_index.day[band_num*2-1]}')\n", "plt.grid()\n", "plt.imshow(water_masked, cmap='Blues',vmin=0, vmax=1.2)" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "
\n", "
\n", " EXERCISE: Analyze the Quality of the Water Masks\n", "\n", " Look at different Flood Maps for your 16 dates and evaluate the performance of the flood map. To do so, use the toolbar in the bottom left corner of the image above to zoom in and navigate around: \n", "- Zoom into the map and look at the details. Are there some pxiels that you would have added to the mask? \n", "- If you think flood areas were missed, do you seen anything in the underlying image that may explain why a pixel was not detected? \n", "
\n", "
\n", "
" ] }, { "cell_type": "markdown", "metadata": { "heading_collapsed": true }, "source": [ "## Create Summary Statistics" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "Once flood maps for each individual image acquisition date were created, summary statistics can be derived that describe the severity and duration of an event. In the following, we will be deriving a metric describing how many days each image pixel was inundated during an analyzed event. This should provide a template for other metrics to be created." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hidden": true }, "outputs": [], "source": [ "rasterstack = wm.ReadAsArray()\n", "srs = np.shape(rasterstack)\n", "floodcount = np.sum(rasterstack,0)\n", "floodpercent = floodcount / srs[0] * 100\n", "dt = time_index.dayofyear[SARbands] - time_index.dayofyear[0]\n", "flooddays = floodpercent / 100 * dt\n", "rasterstack = 0\n", "fd_masked = np.ma.masked_where(flooddays==0, flooddays)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hidden": true }, "outputs": [], "source": [ "SARraster = img.GetRasterBand(SARbands).ReadAsArray()\n", "plt.figure(figsize=(9, 6))\n", "vmin = np.percentile(SARraster, 5) #vh_array\n", "vmax = np.percentile(SARraster, 95)\n", "plt.suptitle('Number of Inundated Days Per Pixel - Minimum SAR Image as Background')\n", "plt.imshow(SARraster, cmap='gray', vmin=vmin, vmax=vmax)\n", "im = plt.imshow(fd_masked, cmap='jet')\n", "plt.colorbar(im, orientation='vertical')\n", "plt.grid()\n", "outfile = f\"{mask_directory}/flooddays.tif\"\n", "write_mask_to_file(flooddays, outfile, img.GetProjection(), img.GetGeoTransform())" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ "
\n", "
\n", " EXERCISE: Analyze the Summary Statistic Plot\n", "\n", " Look at different colors in the plot above and try to understand what they mean and whether or not they make sense to you. To do so, use the toolbar in the bottom left corner of the image above to zoom in and navigate around. \n", "
\n", "
\n", "
" ] }, { "cell_type": "markdown", "metadata": { "heading_collapsed": true }, "source": [ "# Version Log" ] }, { "cell_type": "markdown", "metadata": { "hidden": true }, "source": [ " SARHazards_Lab_Floods.ipynb - Version 1.0.4 - 11/03/2021\n", "\n", "Recent Changes:\n", "- Correct dates in Water Mask on SAR Image plot\n", "" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.10" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 4 }