{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Contrast and Image Enhancement\n", "## Transfer Functions\n", "stough 202-\n", "\n", "1. [Windowing and Piece-wise Linear Transforms](#windowing)\n", "1. [Power Transforms](#power)\n", "1. [Log and Miscellaneous Transforms](#log)\n", " - [Changing Colorspace](#colorspace)\n", "\n", "*A lot of pictures from my slides might go here.*\n", "\n", "- Enhancement: manipulating an image to be more suitable *for the application*\n", " - Human perception\n", " - Compressibility\n", " - ?\n", "- Contrast: difference in color or luminance between nearby elements or over a whole image\n", " - quantifiable globally through histograms\n", " - think of the difference between the Gaussian and Uniform distributions and how that looked in the RGB cube.\n", " \n", "In this notebook we'll consider some images that could be *enhanced* by improving the *contrast*. We'll apply **Transfer functions** of various type, which can all very generally be expressed as\n", "\n", "\\begin{equation*}\n", "s = \\mathbf{T}(r)\n", "\\end{equation*}\n", "\n", "where $r$ is the image intensity stored in the file, and $s$ is a *corrected* or transformed intensity that we want the display device to use. \n", "\n", "### Transfer functions are already being used.\n", "Whether or not we use them to enhance our images, transfer functions are already in use. Every display device uses some mapping to turn the image file's pixel intensities into voltages for the display elements/pixels (or pigment mixtures for the printer). The mapping generally used is a Gamma function, which we'll [actually see below](#power). \n", "\n", "So doing nothing to your images is in fact a decision: that the image as is will be displayed exactly as you intended." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Imports\n", "Note the addition of the `vis_pair` function, which can nicely plot an original and changed image for side-by-side comparison." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%matplotlib widget\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import matplotlib.colors as mcolors\n", "import skimage.color as color\n", "from ipywidgets import VBox, HBox, FloatLogSlider\n", "\n", "# For importing from alternative directory sources\n", "import sys \n", "sys.path.insert(0, '../dip_utils')\n", "\n", "from matrix_utils import arr_info\n", "from vis_utils import (vis_rgb_cube,\n", " vis_hsv_cube,\n", " vis_hists,\n", " vis_pair,\n", " lab_uniform)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## Windowing and Piece-wise Linear Transforms\n", "\n", "The simplest transfer function one might imagine is the linear function\n", "\n", "\\begin{equation*}\n", "s = \\mathbf{T}(r) = Ax+B\n", "\\end{equation*}\n", "\n", "where $A$ and $B$ are constants. Let's consider a particular case." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "I = plt.imread('../dip_pics/got_tooDark.jpg')\n", "vis_hists(I, bins=np.arange(257))\n", "arr_info(I)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`arr_info` tells us this image has data that are 8-bit unsigned integer. Given that there are 3 color channels, that implies a 24-bit color image, very typical. From the histograms though, we can also see this image has pretty much all of its pixels bunched up in the lower half of the nominal display range $[0,255]$. Notwithstanding the photographer's intent in choreographing the shot, let's \"correct\" for my old eyes. A simple linear function might do. But we have to remember that `matplotlib` expects three-channel images to stay in $[0,255]$ (if integer type)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def simple_linear(I, A = 1.0, B = 0.0):\n", " '''\n", " simpleLinear(I, A = 1.0, B = 0.0): return a linearly transformed image with the \n", " provided constants A and B. \n", " '''\n", " Tr = lambda x: A*x + B\n", " J = Tr(I.astype('float'))\n", " return np.clip(J, 0, 255).astype('uint8')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The above function implementation has three key steps:\n", "\n", "- Create a [`lambda`](https://realpython.com/python-lambda/) function `Tr` that applies the $Ax+B$\n", "- Make the initial output image that is `Tr` applied to the original image. We assure we're doing floating math by sending `I.astype('float')`. Note that this initial output `J` could have values outside $[0,255]$ and will be floating point data, neither of which are valid for integer image storage and display.\n", "- The return statement both clips the image `J` to ensure a $[0,255]$ range and casts to unsigned integer `uint8` datatype.\n", "\n", "We can even see practically what this simple linear transform does in transforming the input intensity." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = np.arange(256)\n", "plt.figure(figsize=(4,3))\n", "plt.plot(x, x, label='equal')\n", "plt.plot(x, simple_linear(x, A = 3), label = '3x+0')\n", "plt.xlabel('Input Intensity')\n", "plt.ylabel('Output Intensity')\n", "plt.legend()\n", "plt.tight_layout()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "I_corrected = simple_linear(I, A = 3)\n", "vis_pair(I, I_corrected, second_title='scaled by 3')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can see what we did by viewing the histograms, or even (cooler), seeing what happened in the rgb cube!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "vis_hists(I, bins=np.arange(257))\n", "vis_hists(I_corrected, bins=np.arange(257))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "vis_rgb_cube(I)\n", "vis_rgb_cube(I_corrected)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One approach you may think of is to renormalize the intensity so that $[min,max]$ is exactly $[0,255]$. This is of course *also a linear transform*:\n", "\n", "\\begin{equation*}\n", "s = \\mathbf{T}(r) = 255 \\frac{r - \\tt{I.min}}{\\tt{I.max} - \\tt{I.min}} \\\\\n", "\\end{equation*}\n", "\n", "or $A = \\frac{255}{\\tt{I.max} - \\tt{I.min}}$, and $B = \\frac{-255\\times\\tt{I.min}}{\\tt{I.max} - \\tt{I.min}}$" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "I_normed = simple_linear(I, A = 255.0/(I.max()-I.min()), B = -255*I.min()/(I.max()-I.min()))\n", "vis_pair(I, I_normed, second_title='min-max normed')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we plot all of these potential linear maps we can see their varying effect." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = np.arange(256)\n", "plt.figure(figsize=(4,3))\n", "plt.plot(x, x, label='equal')\n", "plt.plot(x, simple_linear(x, A = 3), label = '3x+0')\n", "plt.plot(x, simple_linear(x, A = 255.0/(I.max()-I.min()), B = -255*I.min()/(I.max()-I.min())), label='min-max normed')\n", "plt.xlabel('Input Intensity')\n", "plt.ylabel('Output Intensity')\n", "plt.legend()\n", "plt.tight_layout()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Gaussian Noise Example\n", "\n", "We'll do one more example with windowing. We'll use a Gaussian distributed random image that we've come to know from previous notebooks. It abstractly represents poor contrast, because its intensities are bunched up relative to a uniform distributed image." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "I_uniform = np.uint8(256*np.random.random((100,100,3)))\n", "I_gauss = np.clip(128 + 30*np.random.randn(100,100,3), 0, 255).astype('uint8')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "vis_pair(I_uniform, I_gauss, first_title='Uniform Dist', second_title='Gaussian Dist')\n", "vis_rgb_cube(I_gauss)\n", "vis_hists(I_gauss, bins=np.arange(257))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Looking at the histogram of `I_gauss`, you see that most of the intensities seem to be bunched up in the $[50,200]$ range. But our output range is $[0,255]$, so we have some room to work with. We want to define a transfer function that will dedicate the entire output range to where in the input range our data actually are. Let's make a function that linearly maps some input range to some output range. While we could formulate this in terms of [$Ax+B$](http://www.webmath.com/equline1.html) or the [two point form](https://www.cuemath.com/geometry/two-point-form/), I like to think of it as mixing: we're mixing between 0 and 255, and the mixing amount is equal to how far we are along the road from 50 to 200.\n", "\n", "\\begin{equation*}\n", "\\mathbf{T}(r) = (1-\\alpha)\\times0 + \\alpha\\times255\\\\\n", "\\alpha = \\frac{r - 50}{200-50}\n", "\\end{equation*}" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def make_linmap(inputrange, outputrange):\n", " a,b = inputrange\n", " c,d = outputrange\n", " \n", " return lambda x: (1-((x-a)/(b-a)))*c + ((x-a)/(b-a))*d" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The above gives us a linear map defined by input and output ranges, as in our $[50,200]$->$[0,255]$ attempt above. However, the linear map doesn't clip the results of the transformation like `simple_linear` above. Let's use it in a larger function that takes care of the out-of-bounds issues. We can also see practically the transfer function." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def interval_stretch(I, inputrange, outputrange):\n", " Tr = make_linmap(inputrange, outputrange)\n", " J = Tr(I.astype('float'))\n", " return np.clip(J, 0, 255).astype('uint8')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = np.arange(256)\n", "plt.figure(figsize=(4,3))\n", "plt.plot(x, x, label='equal')\n", "plt.plot(interval_stretch(x, [50,200], [0,255]), label='enhanced middle')\n", "plt.xlabel('Input Intensity')\n", "plt.ylabel('Output Intensity')\n", "plt.legend()\n", "plt.tight_layout()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As you can see, our attempted `interval_stretch` will stretch the middle $[50,200]$ of the input range, where our data actually are, to the entire display range $[0,255]$." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "I_gauss_enhanced = interval_stretch(I_gauss, [50,200], [0,255])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "vis_hists(I_gauss_enhanced)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "vis_pair(I_gauss, I_gauss_enhanced, first_title='Original Gauss', second_title='Enhanced Gauss')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "vis_rgb_cube(I_gauss)\n", "vis_rgb_cube(I_gauss_enhanced)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can see that our windowing redistributed the image intensities to better cover the whole display range or RGB cube. We call this windowing because we dedicate the entire display range to a *window*, or valid input range, of intensities (in this case, $[50,200]$). \n", "\n", "We can also see from the histogram that values which were already outside of our arbitrary $[50,200]$ range start piling up at the tail ends of `I_gauss_enhanced`'s histogram. The tighter we set our window (e.g., $[70,180]$), the more pixels that will just end up with their intensities maxed out at 0 or 255. Go ahead and try it out." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## Power Transforms\n", "\n", "While windowing has the very simple effect of linearly scaling input ranges, we see that it can create artifacts at the extreme ends. Additionally the perception of light itself is nonlinear, so that maybe a nonlinear approach could be better. \n", "\n", "The nonlinear transfer function that is used to map intensity to voltage in display devices is commonly the Gamma function:\n", "\n", "\\begin{equation*}\n", "s = \\mathbf{T}(r) = cr^{\\gamma}\n", "\\end{equation*}\n", "\n", "Functions of this form also include what we think of as powers or roots. For example, $\\gamma=2$ corresponds to squaring the intensity, while the reciprocol $\\gamma=\\frac{1}{2}$ leads to the square root. We can plot this transfer function for various values of $\\gamma$, with $c$ set to correct the output scale to $[0,255]$" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = np.arange(256)\n", "plt.figure(figsize=(4,3))\n", "plt.plot(x, x, c = 'blue', label = '$\\gamma$ = 1')\n", "for gamma in [2.4, 5.0]:\n", " c = 255.0/(np.power(255.0, gamma))\n", " cinv = 255/(np.power(255.0, 1/gamma))\n", " plt.plot(x, c*np.power(x,gamma), label = f'{gamma:.01f}')\n", " plt.plot(x, cinv*np.power(x,1/gamma), label = f'{1/gamma:.02f}')\n", "\n", "plt.xlabel('Input Intensity')\n", "plt.ylabel('Output Intensity')\n", "plt.legend()\n", "plt.tight_layout()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can see that where the bulk of an image's intensities are can inform you as to a good $\\gamma$ that can enhance the image. In the above I intentionally place 2.4 and its reciprocal (0.42) in the plot because 2.4 is special: it is a [standard gamma value](https://www.cnet.com/news/what-is-gamma-and-hdr-eotf-on-tvs-and-why-should-you-care/) for High Dynamic Range (HDR) content for movies viewed in darker rooms. \n", "\n", "Here is an interactive visualization to play around with it for yourself." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "I = plt.imread('../dip_pics/world_overexposed.jpg')\n", "vis_hists(I, bins=np.arange(257))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def applyGamma(I, gamma = 1):\n", " c = 255.0/(np.power(255.0, gamma))\n", " J = c*np.power(I.astype('float'), gamma)\n", " return J.astype('uint8')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.ioff()\n", "\n", "# We'll use a log slider \n", "gamma_slider = FloatLogSlider(\n", " value=1,\n", " base=10,\n", " min=-2, # max exponent of base\n", " max=1, # min exponent of base\n", " step=0.1, # exponent step\n", " description='Log Gamma'\n", ")\n", "\n", "\n", "# Setting up the figure.\n", "fig_args = {'num':101, 'frameon':True} # , 'gridspec_kw':{'width_ratios': [1, 1, 1]}}\n", "fig, ax = plt.subplots(1,3, figsize=(8,3), **fig_args)\n", "fig.canvas.header_visible = False\n", "\n", "# assign the display artists I'll update\n", "ax[0].imshow(I)\n", "mplot = ax[1].plot(x, x)\n", "ax[1].set_aspect('equal', 'box')\n", "ax[1].set_title('Gamma Curve')\n", "rdisp = ax[2].imshow(I)\n", "\n", "ax[0].set_title(f'Original (1.0)')\n", "rtext = ax[2].set_title(f'Gamma: {gamma_slider.value:.01f}')\n", "\n", "[a.axes.get_xaxis().set_visible(False) for a in ax];\n", "[a.axes.get_yaxis().set_visible(False) for a in ax];\n", "\n", "\n", "def update_image(change):\n", " global I, mplot, rdisp, rtext\n", " \n", " J = applyGamma(I, gamma_slider.value)\n", " mplot[0].set_data(x, applyGamma(x, gamma_slider.value))\n", " rdisp.set_array(J)\n", " rtext.set_text(f'Gamma: {gamma_slider.value:.01f}') #hue_slider.value\n", " fig.canvas.draw()\n", " fig.canvas.flush_events()\n", "\n", "gamma_slider.observe(update_image, names='value')\n", "\n", "plt.tight_layout()\n", "\n", "VBox([gamma_slider, fig.canvas])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Close the above interactive demo figure, so that it won't appear later.\n", "plt.close(101)\n", "plt.ion()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## Log and Miscellaneous Transforms\n", "\n", "Other nonlinear families often used for image enhancement are logs and their inverse the exponentials:\n", "\n", "- $s = \\mathbf{T}(r) = c \\log(1+r)$\n", "- $s = \\mathbf{T}(r) = c e^{r}$" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = np.arange(256)\n", "plt.figure(figsize=(4,3))\n", "plt.plot(x, x, c = 'blue', label = 'equal')\n", "plt.plot(x, (255/8)*np.log2(1 + x), label = '$log_{2}$')\n", "plt.plot(x, np.power(2,(256/255)*x/32)-1, label = '$2^{x/32}$')\n", "\n", "\n", "plt.xlabel('Input Intensity')\n", "plt.ylabel('Output Intensity')\n", "plt.legend()\n", "plt.tight_layout()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(In the above code there are some confounding bits to get the output correct. These come from the fact that the discrete interval $[0,255]$, if understood as continous, is the right-open interval $[0,256)$.)\n", "\n", "From the plot above we can see that the $\\log$ function has the effect of dedicating much of the display range to the darker pixels in the image. Further this contrast stretching is dynamic (nonlinear) in the input range. For example pixel instensities that are 50 units apart at the dark end of the domain will be mapped to intensities that are 150+ apart in the output range, while the same difference of 50 units at the high end will end up being compressed to about 10.\n", "\n", "Let's see what this does to a severely underexposed image." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "I = plt.imread('../dip_pics/underExposed.jpg')\n", "vis_hists(I, bins=np.arange(257))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "I_log_corrected = ((255/8)*np.log2(1.0 + I)).astype('uint8')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "vis_pair(I, I_log_corrected, second_title='Log Corrected', show_ticks=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "In the above you can see clearly the differential contrast adjustment in the dark and light areas of the image. In this image the $\\log$ transform greatly expanded contrast on the dark buildings, but at the cost of reducing the contrast in the bright sky. \n", "\n", "### Log Correction in [Lab](../Color/color_Lab.ipynb) Space\n", "\n", "You might notice a few additional artifacts in our corrected image above. For example the blue sky has turned a kind of aqua (maybe?). Additionally the colorful [lens flare](https://en.wikipedia.org/wiki/Lens_flare) on the church and the blue tone of the leftmost rooftop are likely artifacts of the intense scaling done at the very low end of the input range, and not the most faithful representation of the underexposed foreground in reality. (Not that faithfulness to reality has to be the goal). Often in image acquisition, very small differences in R, G, and B are unreliable when found at the extreme low end. \n", "\n", "We would like to contrast-enhance the original underexposed image in a way that does not change hue too drastically (blue sky turning aqua). Conveniently then, in the [Color](../Color/color_intro.ipynb) module we saw several alternative colorspaces that separate the \"color\" from the brightness/value/luminance dimension of perception, and we can use any of these instead of RGB as above. \n", "\n", "Among [HSV](../Color/color_HSV.ipynb), [YCbCr](../Color/color_YCbCr.ipynb), and [Lab](../Color/color_Lab.ipynb), let's try Lab. If we $\\log$ transform the Luminance (L) only, leaving the chromaticity dimensions (a, b) constant, we might more objectively enhance the contrast in the underexposed foreground. Do it! Keep in mind though\n", "\n", "- You can use `color.rgb2lab` to convert to Lab space.\n", "- When you log transform the first channel of the Lab image, use the constants above to maintain the expected $[0, 100]$ range of the Luminance scale (see [`rgb2lab`](https://scikit-image.org/docs/dev/api/skimage.color.html#skimage.color.rgb2lab)). Creating values outside of this range may lead to errors when converting back using [`color.lab2rgb`](https://scikit-image.org/docs/dev/api/skimage.color.html#lab2rgb) \n", "- Make frequent use of our `arr_info` function to make sure you're considering the correct datatypes and ranges. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.11" } }, "nbformat": 4, "nbformat_minor": 4 }