{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "0",
   "metadata": {},
   "source": [
    "# Speckle Statistics\n",
    "\n",
    "\n",
    "This notebook will provide an empirical demonstration of speckle - how it originates, how it visually and statistically looks like, and some of the most common approaches to filter it.\n",
    "\n",
    "Speckle is defined as a kind of noise that affects all radar images. Given the multiple scattering contributions originating from the various elementary objects present within a resolution cell, the resulting backscatter signal can be described as a random constructive and destructive interference of wavelets. As a consequence, speckle is the reason why a granular pattern normally affects SAR images, making it more challenging to interpret and analyze them.\n",
    "\n",
    "![](../../images/speckle_effect.png)\n",
    "\n",
    "*Credits: ESRI*\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1",
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "from functools import partial\n",
    "\n",
    "import holoviews as hv\n",
    "import intake\n",
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "from holoviews.streams import RangeXY\n",
    "from scipy.ndimage import uniform_filter\n",
    "\n",
    "hv.extension(\"bokeh\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2",
   "metadata": {},
   "source": [
    "Let's make an example of a cornfield (with a typical backscattering value of about -10 dB). According to the following equation:\n",
    "\n",
    "$$\n",
    "\\sigma^0 = \\frac{1}{\\text{area}} \\sum_{n \\in \\text{area}} \\sigma_n\n",
    "$$\n",
    "\n",
    "We should ideally have a uniform discrete **sigma naught** $\\sigma^0$ value, given that the cornfield pixel is the only individual contributor.\n",
    "\n",
    "However, since we already learned from the previous notebooks that a pixel's ground size can be in the order of tens of meters (i.e., 10 meters for Sentinel-1), we can imagine that different distributed targets in the scene contribute to the global backscattered information.\n",
    "\n",
    "Let´s replicate this behavior with an ideal uniform area constituted by 100 pixels and then by adding 30% of speckle.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3",
   "metadata": {},
   "outputs": [],
   "source": [
    "ideal_backscatter = -10  # in dB, a typical value for cornfields\n",
    "width = 12\n",
    "size = (width, width)\n",
    "ideal_data = np.full(size, ideal_backscatter)\n",
    "ideal_data_linear = 10 ** (\n",
    "    ideal_data / 10\n",
    ")  # Convert dB to linear scale for speckle addition\n",
    "\n",
    "speckle_fraction = 0.3\n",
    "num_speckled_pixels = int(\n",
    "    size[0] * size[1] * speckle_fraction\n",
    ")  # Rayleigh speckle noise\n",
    "speckled_indices = np.random.choice(\n",
    "    width * width, num_speckled_pixels, replace=False\n",
    ")  # random indices for speckle\n",
    "\n",
    "# Initialize speckled data as the same as the ideal data\n",
    "speckled_data_linear = ideal_data_linear.copy()\n",
    "\n",
    "speckle_noise = np.random.gumbel(scale=1.0, size=num_speckled_pixels)\n",
    "speckled_data_linear.ravel()[\n",
    "    speckled_indices\n",
    "] *= speckle_noise  # Add speckle to the selected pixels\n",
    "\n",
    "ideal_data_dB = 10 * np.log10(ideal_data_linear)\n",
    "speckled_data_dB = 10 * np.log10(speckled_data_linear)\n",
    "plt.figure(figsize=(16, 10))\n",
    "\n",
    "# Ideal data\n",
    "plt.subplot(2, 2, 1)\n",
    "plt.imshow(ideal_data_dB, cmap=\"gray\", vmin=-20, vmax=0)\n",
    "plt.title(\"Ideal Backscatter (Cornfield)\")\n",
    "plt.colorbar(label=\"Backscatter (dB)\")\n",
    "\n",
    "# Speckled data\n",
    "plt.subplot(2, 2, 2)\n",
    "plt.imshow(speckled_data_dB, cmap=\"gray\", vmin=-20, vmax=0)\n",
    "plt.title(f\"Speckled Backscatter ({int(speckle_fraction * 100)}% of Pixels)\")\n",
    "plt.colorbar(label=\"Backscatter (dB)\")\n",
    "\n",
    "bins = 25\n",
    "hist_ideal, bins_ideal = np.histogram(ideal_data_dB.ravel(), bins=bins, range=(-20, 0))\n",
    "hist_speckled, bins_speckled = np.histogram(\n",
    "    speckled_data_dB.ravel(), bins=bins, range=(-20, 0)\n",
    ")\n",
    "max_freq = max(\n",
    "    hist_ideal.max(), hist_speckled.max()\n",
    ")  # maximum frequency for normalization\n",
    "\n",
    "# Histogram for ideal data\n",
    "plt.subplot(2, 2, 3)\n",
    "plt.hist(ideal_data_dB.ravel(), bins=bins, range=(-20, 0), color=\"gray\", alpha=0.7)\n",
    "plt.ylim(0, max_freq)\n",
    "plt.title(\"Histogram of Ideal Backscatter\")\n",
    "plt.xlabel(\"Backscatter (dB)\")\n",
    "plt.ylabel(\"Frequency\")\n",
    "\n",
    "# Histogram for speckled data\n",
    "plt.subplot(2, 2, 4)\n",
    "plt.hist(speckled_data_dB.ravel(), bins=bins, range=(-20, 0), color=\"gray\", alpha=0.7)\n",
    "plt.ylim(0, max_freq)\n",
    "plt.title(f\"Histogram of Speckled Backscatter ({int(speckle_fraction * 100)}%)\")\n",
    "plt.xlabel(\"Backscatter (dB)\")\n",
    "plt.ylabel(\"Frequency\")\n",
    "\n",
    "plt.tight_layout()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4",
   "metadata": {},
   "source": [
    "*Figure 1: Synthetic data that emulates speckles in microwave backscattering*\n",
    "\n",
    "We can imagine that the second plot represents a real SAR acquisition over a cornfield, while the first plot represents an ideal uniform SAR image over a cornfield land (no speckle). The introduction of a simulated 30% speckle noise could be related to the presence of distributed scatterers of any sort present in the scene, which would cause a pixel-to-pixel variation in terms of intensity.\n",
    "\n",
    "All the random contributions (such as the wind) would result in a different speckle pattern each time a SAR scene is acquired over the same area. Many subpixel contributors build up a complex scattered pattern in any SAR image, making it erroneous to rely on a single pixel intensity for making reliable image analysis. In order to enhance the degree of usability of a SAR image, several techniques have been put in place to mitigate speckle.\n",
    "We will now show two of the most common approaches: the temporal and the spatial filter.\n",
    "\n",
    "## Lake Neusiedl data\n",
    "\n",
    "We load a dataset that contains the CORINE land cover and Sentinel-1 $\\sigma^0_E$ at a 20 meter resolution.  This is the same data presented in notebook 6.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5",
   "metadata": {},
   "outputs": [],
   "source": [
    "url = \"https://huggingface.co/datasets/martinschobben/microwave-remote-sensing/resolve/main/microwave-remote-sensing.yml\"\n",
    "cat = intake.open_catalog(url)\n",
    "fused_ds = cat.speckle.read().compute()\n",
    "fused_ds"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6",
   "metadata": {},
   "source": [
    "We also create the same dashboard for backscatter of different landcover types over time. In order to make this code reusable and adaptable we will define the following function `plot_variability`, which allows the injection of a spatial and/or temporal filter. It is not important to understand all the code of the following cell!\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Load encoding\n",
    "with cat.corine_cmap.read()[0] as f:\n",
    "    color_mapping_data = json.load(f)\n",
    "\n",
    "# Get mapping\n",
    "color_mapping = {item[\"value\"]: item for item in color_mapping_data[\"land_cover\"]}\n",
    "\n",
    "# Get landcover codes present in the image\n",
    "present_landcover_codes = np.unique(\n",
    "    fused_ds.land_cover.values[~np.isnan(fused_ds.land_cover.values)].astype(int)\n",
    ")\n",
    "\n",
    "\n",
    "def load_image(var_ds, time, land_cover, x_range, y_range, filter_fun_spatial=None):\n",
    "    \"\"\"\n",
    "    Callback Function Landcover.\n",
    "\n",
    "    Parameters\n",
    "    ----------\n",
    "    time: panda.datetime\n",
    "        time slice\n",
    "    landcover: int\n",
    "        land cover type\n",
    "    x_range: array_like\n",
    "        longitude range\n",
    "    y_range: array_like\n",
    "        latitude range\n",
    "\n",
    "    Returns\n",
    "    -------\n",
    "    holoviews.Image\n",
    "    \"\"\"\n",
    "\n",
    "    if time is not None:\n",
    "        var_ds = var_ds.sel(time=time)\n",
    "\n",
    "    if land_cover == \"\\xa0\\xa0\\xa0 Complete Land Cover\":\n",
    "        sig0_selected_ds = var_ds.sig0\n",
    "    else:\n",
    "        land_cover_value = int(land_cover.split()[0])\n",
    "        mask_ds = var_ds.land_cover == land_cover_value\n",
    "        sig0_selected_ds = var_ds.sig0.where(mask_ds)\n",
    "\n",
    "    if filter_fun_spatial is not None:\n",
    "        sig0_np = filter_fun_spatial(sig0_selected_ds.values)\n",
    "    else:\n",
    "        sig0_np = sig0_selected_ds.values\n",
    "\n",
    "    # Convert unfiltered data into Holoviews Image\n",
    "    img = hv.Dataset(\n",
    "        (sig0_selected_ds[\"x\"], sig0_selected_ds[\"y\"], sig0_np), [\"x\", \"y\"], \"sig0\"\n",
    "    )\n",
    "\n",
    "    if x_range and y_range:\n",
    "        img = img.select(x=x_range, y=y_range)\n",
    "\n",
    "    return hv.Image(img)\n",
    "\n",
    "\n",
    "def plot_variability(var_ds, filter_fun_spatial=None, filter_fun_temporal=None):\n",
    "\n",
    "    robust_min = var_ds.sig0.quantile(0.02).item()\n",
    "    robust_max = var_ds.sig0.quantile(0.98).item()\n",
    "\n",
    "    bin_edges = [\n",
    "        i + j * 0.5\n",
    "        for i in range(int(robust_min) - 2, int(robust_max) + 2)\n",
    "        for j in range(2)\n",
    "    ]\n",
    "\n",
    "    land_cover = {\"\\xa0\\xa0\\xa0 Complete Land Cover\": 1}\n",
    "    land_cover.update(\n",
    "        {\n",
    "            f\"{int(value): 02} {color_mapping[value]['label']}\": int(value)\n",
    "            for value in present_landcover_codes\n",
    "        }\n",
    "    )\n",
    "    time = var_ds.sig0[\"time\"].values\n",
    "\n",
    "    rangexy = RangeXY()\n",
    "\n",
    "    if filter_fun_temporal is not None:\n",
    "        var_ds = filter_fun_temporal(var_ds)\n",
    "        load_image_ = partial(\n",
    "            load_image, var_ds=var_ds, filter_fun_spatial=filter_fun_spatial, time=None\n",
    "        )\n",
    "        dmap = (\n",
    "            hv.DynamicMap(load_image_, kdims=[\"Landcover\"], streams=[rangexy])\n",
    "            .redim.values(Landcover=land_cover)\n",
    "            .hist(normed=True, bins=bin_edges)\n",
    "        )\n",
    "\n",
    "    else:\n",
    "        load_image_ = partial(\n",
    "            load_image, var_ds=var_ds, filter_fun_spatial=filter_fun_spatial\n",
    "        )\n",
    "        dmap = (\n",
    "            hv.DynamicMap(load_image_, kdims=[\"Time\", \"Landcover\"], streams=[rangexy])\n",
    "            .redim.values(Time=time, Landcover=land_cover)\n",
    "            .hist(normed=True, bins=bin_edges)\n",
    "        )\n",
    "\n",
    "    image_opts = hv.opts.Image(\n",
    "        cmap=\"Greys_r\",\n",
    "        colorbar=True,\n",
    "        tools=[\"hover\"],\n",
    "        clim=(robust_min, robust_max),\n",
    "        aspect=\"equal\",\n",
    "        framewise=False,\n",
    "        frame_height=500,\n",
    "        frame_width=500,\n",
    "    )\n",
    "\n",
    "    hist_opts = hv.opts.Histogram(width=350, height=555)\n",
    "\n",
    "    return dmap.opts(image_opts, hist_opts)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8",
   "metadata": {},
   "source": [
    "Now, lets work on the real-life dataset to see how speckle actually looks like.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9",
   "metadata": {},
   "outputs": [],
   "source": [
    "plot_variability(fused_ds)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "10",
   "metadata": {},
   "source": [
    "*Figure 2: Lake Neusiedl $\\sigma^0_E$ without any filter.*\n",
    "\n",
    "The speckle noise typically appears as a \"salt-and-pepper\" pattern. Also, please note the distribution of backscatter for each land cover. Even though speckle is known for following non-normal distributions (i.e., Rayleigh distribution for amplitude in the linear domain, and the Gumple for intensity in the log domain), we can assume that due to the Central Limit Theorem, the overall backscatter means (dB) tend to follow a Gaussian distribution.\n",
    "\n",
    "We can mitigate speckle (it is impossible to remove it completely) by following approaches such as:\n",
    "- spatial filtering - taking mean backscatter value over the same land cover, or\n",
    "- temporal filtering - taking the average backscatter value over some time period.\n",
    "\n",
    "Either way, one pixel is never representative of ground truth! Therefore we need to look at samples and distributions.\n",
    "\n",
    "## Spatial filtering\n",
    "\n",
    "We first introduce a common spatial filter. The Lee filter is an adaptive speckle filter. The filter works using a kernel window with a configurable size, which refers to the dimensions of the neighborhood over which the filter operates. The kernel slides across the data, applying the smoothing operation at each pixel position of the image. It follows three assumptions:\n",
    "\n",
    "1) SAR speckle is modeled as a multiplicative noise - the brighter the area the noisier the data.\n",
    "2) The noise and the signal are statistically independent of each other.\n",
    "3) The sample mean and sample variance of a pixel is equal to its local mean and local variance.\n",
    "\n",
    "This approach comes with some limitations: it reduces the spatial resolution of the SAR image.\n",
    "\n",
    "Let's build up a function for applying a Lee filter with a kernel window size of 7 (do not forget to switch back to linear units before doing this computation and to dB after it):\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "11",
   "metadata": {},
   "outputs": [],
   "source": [
    "def lee_filter(raster, size=7):\n",
    "    \"\"\"\n",
    "    Parameters:\n",
    "    raster: ndarray\n",
    "        2D array representing the noisy image (e.g., radar image with speckle)\n",
    "    size: int\n",
    "        Size of the kernel window for the filter (must be odd, default is 7)\n",
    "\n",
    "    Returns:\n",
    "    filtered_image (ndarray): The filtered image with reduced speckle noise\n",
    "    \"\"\"\n",
    "\n",
    "    raster = np.nan_to_num(raster)\n",
    "    raster = 10 ** (raster / 10)\n",
    "\n",
    "    # Mean and variance over local kernel window\n",
    "    mean_window = uniform_filter(raster, size=size)\n",
    "    mean_sq_window = uniform_filter(raster**2, size=size)\n",
    "    variance_window = mean_sq_window - mean_window**2\n",
    "\n",
    "    # Noise variance estimation (this could also be set manually)\n",
    "    overall_variance = np.var(raster)\n",
    "\n",
    "    # Compute the Lee filter\n",
    "    weights = variance_window / (variance_window + overall_variance)\n",
    "\n",
    "    return 10 * np.log10(mean_window + weights * (raster - mean_window))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "12",
   "metadata": {},
   "outputs": [],
   "source": [
    "plot_variability(fused_ds, filter_fun_spatial=lee_filter)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "13",
   "metadata": {},
   "source": [
    "*Figure 3: Lake Neusiedl $\\sigma^0_E$ with a Lee filter applied.*\n",
    "\n",
    "## Temporal filtering\n",
    "\n",
    "Temporal filtering would involve taking the average of all previous (past) observations for each pixel. This approach comes with some limitations: it takes out the content-rich information tied to the temporal variability of backscatter.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "14",
   "metadata": {},
   "outputs": [],
   "source": [
    "def temporal_filter(raster):\n",
    "    \"\"\"\n",
    "    Parameters:\n",
    "    raster: ndarray\n",
    "        3D array representing the noisy image over time\n",
    "        (e.g., radar image with speckle)\n",
    "\n",
    "    Returns:\n",
    "    filtered_image (ndarray): The filtered image with reduced speckle noise\n",
    "    \"\"\"\n",
    "\n",
    "    return raster.mean(\"time\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "15",
   "metadata": {},
   "outputs": [],
   "source": [
    "plot_variability(fused_ds, filter_fun_temporal=temporal_filter)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "16",
   "metadata": {},
   "source": [
    "*Figure 4: Lake Neusiedl $\\sigma^0_E$ with a temporal filter applied.*\n",
    "\n",
    "Let´s observe the histograms of the two plots. Especially in the region around the lake, it is clear that the distribution is now less dispersed and more centered around a central value."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}