{"worksheets": [{"cells": [{"cell_type": "markdown", "metadata": {}, "source": ["Gradient Descent Methods", "========================", "", "*Important:* Please read the [installation page](http://gpeyre.github.io/numerical-tours/installation_python/) for details about how to install the toolboxes.", "$\\newcommand{\\dotp}[2]{\\langle #1, #2 \\rangle}$", "$\\newcommand{\\enscond}[2]{\\lbrace #1, #2 \\rbrace}$", "$\\newcommand{\\pd}[2]{ \\frac{ \\partial #1}{\\partial #2} }$", "$\\newcommand{\\umin}[1]{\\underset{#1}{\\min}\\;}$", "$\\newcommand{\\umax}[1]{\\underset{#1}{\\max}\\;}$", "$\\newcommand{\\umin}[1]{\\underset{#1}{\\min}\\;}$", "$\\newcommand{\\uargmin}[1]{\\underset{#1}{argmin}\\;}$", "$\\newcommand{\\norm}[1]{\\|#1\\|}$", "$\\newcommand{\\abs}[1]{\\left|#1\\right|}$", "$\\newcommand{\\choice}[1]{ \\left\\{ \\begin{array}{l} #1 \\end{array} \\right. }$", "$\\newcommand{\\pa}[1]{\\left(#1\\right)}$", "$\\newcommand{\\diag}[1]{{diag}\\left( #1 \\right)}$", "$\\newcommand{\\qandq}{\\quad\\text{and}\\quad}$", "$\\newcommand{\\qwhereq}{\\quad\\text{where}\\quad}$", "$\\newcommand{\\qifq}{ \\quad \\text{if} \\quad }$", "$\\newcommand{\\qarrq}{ \\quad \\Longrightarrow \\quad }$", "$\\newcommand{\\ZZ}{\\mathbb{Z}}$", "$\\newcommand{\\CC}{\\mathbb{C}}$", "$\\newcommand{\\RR}{\\mathbb{R}}$", "$\\newcommand{\\EE}{\\mathbb{E}}$", "$\\newcommand{\\Zz}{\\mathcal{Z}}$", "$\\newcommand{\\Ww}{\\mathcal{W}}$", "$\\newcommand{\\Vv}{\\mathcal{V}}$", "$\\newcommand{\\Nn}{\\mathcal{N}}$", "$\\newcommand{\\NN}{\\mathcal{N}}$", "$\\newcommand{\\Hh}{\\mathcal{H}}$", "$\\newcommand{\\Bb}{\\mathcal{B}}$", "$\\newcommand{\\Ee}{\\mathcal{E}}$", "$\\newcommand{\\Cc}{\\mathcal{C}}$", "$\\newcommand{\\Gg}{\\mathcal{G}}$", "$\\newcommand{\\Ss}{\\mathcal{S}}$", "$\\newcommand{\\Pp}{\\mathcal{P}}$", "$\\newcommand{\\Ff}{\\mathcal{F}}$", "$\\newcommand{\\Xx}{\\mathcal{X}}$", "$\\newcommand{\\Mm}{\\mathcal{M}}$", "$\\newcommand{\\Ii}{\\mathcal{I}}$", "$\\newcommand{\\Dd}{\\mathcal{D}}$", "$\\newcommand{\\Ll}{\\mathcal{L}}$", "$\\newcommand{\\Tt}{\\mathcal{T}}$", "$\\newcommand{\\si}{\\sigma}$", "$\\newcommand{\\al}{\\alpha}$", "$\\newcommand{\\la}{\\lambda}$", "$\\newcommand{\\ga}{\\gamma}$", "$\\newcommand{\\Ga}{\\Gamma}$", "$\\newcommand{\\La}{\\Lambda}$", "$\\newcommand{\\si}{\\sigma}$", "$\\newcommand{\\Si}{\\Sigma}$", "$\\newcommand{\\be}{\\beta}$", "$\\newcommand{\\de}{\\delta}$", "$\\newcommand{\\De}{\\Delta}$", "$\\newcommand{\\phi}{\\varphi}$", "$\\newcommand{\\th}{\\theta}$", "$\\newcommand{\\om}{\\omega}$", "$\\newcommand{\\Om}{\\Omega}$"]}, {"cell_type": "markdown", "metadata": {}, "source": ["This tour explores the use of gradient descent method for unconstrained", "and constrained optimization of a smooth function"]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["from __future__ import division", "import nt_toolbox as nt", "from nt_solutions import optim_1_gradient_descent as solutions", "%matplotlib inline", "%load_ext autoreload", "%autoreload 2"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Gradient Descent for Unconstrained Problems", "-------------------------------------------", "We consider the problem of finding a minimum of a function $f$, hence", "solving", "$$ \\umin{x \\in \\RR^d} f(x) $$", "where $f : \\RR^d \\rightarrow \\RR$ is a smooth function.", "", "", "Note that the minimum is not necessarily unique. In the general case,", "$f$ might exhibit local minima, in which case the proposed algorithms", "is not expected to find a global minimizer of the problem.", "In this tour, we restrict our attention to convex function, so that", "the methods will converge to a global minimizer.", "", "", "The simplest method is the gradient descent, that computes", "$$ x^{(k+1)} = x^{(k)} - \\tau_k \\nabla f(x^{(k)}), $$", "where $\\tau_k>0$ is a step size, and $\\nabla f(x) \\in \\RR^d$ is the", "gradient of $f$ at the point $x$, and $x^{(0)} \\in \\RR^d$ is any initial point.", "", "", "In the convex case, if $f$ is of class $C^2$,", "in order to ensure convergence, the step size should satisfy", "$$ 0 < \\tau_k < \\frac{2}{ \\sup_x \\norm{Hf(x)} } $$", "where $Hf(x) \\in \\RR^{d \\times d}$ is the Hessian of $f$ at $x$", "and $ \\norm{\\cdot} $ is the spectral operator norm (largest eigenvalue).", "", "Gradient Descent in 2-D", "-----------------------", "We consider a simple problem, corresponding to the minimization of a 2-D", "quadratic form", "$$ f(x) = \\frac{1}{2} \\pa{ x_1^2 + \\eta x_2^2, } $$", "where $ \\eta>0 $ controls the anisotropy, and hence the difficulty, of", "the problem.", "", "", "Anisotropy parameter $\\eta$."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["eta = 10"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Function $f$."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["f = lambda x: (x(1)^2 + eta*x(2)^2) / 2"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Background image of the function."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["t = linspace(-.7, .7, 101)", "[u, v] = meshgrid(t, t)", "F = (u.^2 + eta*v.^2)/ 2"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Display the function as a 2-D image."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["clf; hold on", "imagesc(t, t, F); colormap jet(256)", "contour(t, t, F, 20, 'k')", "axis off; axis equal"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Gradient."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["Gradf = lambda x: [x(1); eta*x(2)]"]}, {"cell_type": "markdown", "metadata": {}, "source": ["The step size should satisfy $\\tau_k < 2/\\eta$. We use here a constrant", "step size."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["tau = 1.8/ eta"]}, {"cell_type": "markdown", "metadata": {}, "source": ["__Exercise 1__", "", "Perform the gradient descent using a fixed step size $\\tau_k=\\tau$.", "Display the decay of the energy $f(x^{(k)})$ through the iteration.", "Save the iterates so that |X(:,k)| corresponds to $x^{(k)}$."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["solutions.exo1()"]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["## Insert your code here."]}, {"cell_type": "markdown", "metadata": {}, "source": ["Display the iterations."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["clf; hold on", "imagesc(t, t, F); colormap jet(256)", "contour(t, t, F, 20, 'k')", "h = plot(X(1, : ), X(2, : ), 'k.-')", "set(h, 'LineWidth', 2)", "set(h, 'MarkerSize', 15)", "axis off; axis equal"]}, {"cell_type": "markdown", "metadata": {}, "source": ["__Exercise 2__", "", "Display the iteration for several different step sizes."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["solutions.exo2()"]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["## Insert your code here."]}, {"cell_type": "markdown", "metadata": {}, "source": ["Gradient and Divergence of Images", "---------------------------------", "Local differential operators like gradient, divergence and laplacian are", "the building blocks for variational image processing.", "", "", "Load an image $x_0 \\in \\RR^N$ of $N=n \\times n$ pixels."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["n = 256", "x0 = rescale(load_image('lena', n))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Display it."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["imageplot(x0)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["For a continuous function $g$, the gradient reads", "$$ \\nabla g(s) = \\pa{ \\pd{g(s)}{s_1}, \\pd{g(s)}{s_2} } \\in \\RR^2. $$", "(note that here, the variable $s$ denotes the 2-D spacial position).", "", "", "We discretize this differential operator on a discrete image $x \\in \\RR^N$ using first order finite", "differences.", "$$ (\\nabla x)_i = ( x_{i_1,i_2}-x_{i_1-1,i_2}, x_{i_1,i_2}-x_{i_1,i_2-1} ) \\in \\RR^2. $$", "Note that for simplity we use periodic boundary conditions.", "", "", "Compute its gradient, using finite differences."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["grad = lambda x: cat(3, x-x([end 1: end-1], : ), x-x(: , [end 1: end-1]))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["One thus has $ \\nabla : \\RR^N \\mapsto \\RR^{N \\times 2}. $"]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["v = grad(x0)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["One can display each of its components."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["imageplot(v(: , : , 1), 'd/ dx', 1, 2, 1)", "imageplot(v(: , : , 2), 'd/ dy', 1, 2, 2)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["One can also display it using a color image."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["imageplot(v)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["One can display its magnitude $\\norm{(\\nabla x)_i}$, which is large near edges."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["imageplot(sqrt(sum3(v.^2, 3)))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["The divergence operator maps vector field to images.", "For continuous vector fields $v(s) \\in \\RR^2$, it is defined as", "$$ \\text{div}(v)(s) = \\pd{v_1(s)}{s_1} + \\pd{v_2(s)}{s_2} \\in \\RR. $$", "(note that here, the variable $s$ denotes the 2-D spacial position).", "It is minus the adjoint of the gadient, i.e. $\\text{div} = - \\nabla^*$.", "", "", "It is discretized, for $v=(v^1,v^2)$ as", "$$ \\text{div}(v)_i = v^1_{i_1+1,i_2} - v^1_{i_1,i_2} + v^2_{i_1,i_2+1} - v^2_{i_1,i_2} . $$"]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["div = lambda v: v([2: end 1], : , 1)-v(: , : , 1) + v(: , [2: end 1], 2)-v(: , : , 2)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["The Laplacian operatore is defined as $\\Delta=\\text{div} \\circ \\nabla =", "-\\nabla^* \\circ \\nabla$. It is thus a negative symmetric operator."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["delta = lambda x: div(grad(x))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Display $\\Delta x_0$."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["imageplot(delta(x0))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Check that the relation $ \\norm{\\nabla x} = - \\dotp{\\Delta x}{x}. $"]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["dotp = lambda a, b: sum(a(: ).*b(: ))", "fprintf('Should be 0: %.3i\\n', dotp(grad(x0), grad(x0)) + dotp(delta(x0), x0))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Gradient Descent in Image Processing", "------------------------------------", "We consider now the problem of denoising an image $y \\in \\RR^d$ where", "$d = n \\times n$ is the number of pixels ($n$ being the number of rows/columns in the image).", "", "", "Add noise to the original image, to simulate a noisy image."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["sigma = .1", "y = x0 + randn(n)*sigma"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Display the noisy image $y$."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["imageplot(clamp(y))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Denoising is obtained by minimizing the following functional", "$$ \\umin{x \\in \\RR^d} f(x) = \\frac{1}{2} \\norm{y-x}^2 + \\la J_\\epsilon(x) $$", "where $J_\\epsilon(x)$ is a smoothed total variation of the image.", "$$ J_\\epsilon(x) = \\sum_i \\norm{ (G x)_i }_{\\epsilon} $$", "where $ (Gx)_i \\in \\RR^2 $ is an approximation of the gradient of $x$ at pixel", "$i$ and for $u \\in \\RR^2$, we use the following smoothing of the", "$L^2$ norm in $\\RR^2$", "$$ \\norm{u}_\\epsilon = \\sqrt{ \\epsilon^2 + \\norm{u}^2 }, $$", "for a small value of $\\epsilon>0$.", "", "", "The gradient of the functional read", "$$ \\nabla f(x) = x-y + \\lambda \\nabla J_\\epsilon(x) $$", "where the gradient of the smoothed TV norm is", "$$ \\nabla J_\\epsilon(x)_i = G^*( u ) \\qwhereq", " u_i = \\frac{ (G x)_i }{\\norm{ (G x)_i }_\\epsilon} $$", "where $G^*$ is the adjoint operator of $G$ which corresponds to minus", "a discretized divergence.", "", "", "Value for $\\lambda$."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["lambda = .3/ 5"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Value for $\\epsilon$."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["epsilon = 1e-3"]}, {"cell_type": "markdown", "metadata": {}, "source": ["TV norm."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["NormEps = lambda u, epsilon: sqrt(epsilon^2 + sum(u.^2, 3))", "J = lambda x, epsilon: sum(sum(NormEps(grad(x), epsilon)))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Function $f$ to minimize."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["f = lambda y, x, epsilon: 1/ 2*norm(x-y, 'fro')^2 + lambda*J(x, epsilon)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Gradient of $J_\\epsilon$. Note that |div| implement $-G^*$."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["Normalize = lambda u, epsilon: u./ repmat(NormEps(u, epsilon), [1 1 2])", "GradJ = lambda x, epsilon: -div(Normalize(grad(x), epsilon))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Gradient of the functional."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["Gradf = lambda y, x, epsilon: x-y + lambda*GradJ(x, epsilon)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["The step size should satisfy", "$$ 0 < \\tau_k < \\frac{2}{ 1 + 4 \\lambda / \\epsilon }. $$", "Here we use a slightly larger step size, which still work in practice."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["tau = 1.8/ (1 + lambda*8/ epsilon)", "tau = tau*4"]}, {"cell_type": "markdown", "metadata": {}, "source": ["__Exercise 3__", "", "Implement the gradient descent. Monitor the decay of $f$ through the", "iterations."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["solutions.exo3()"]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["## Insert your code here."]}, {"cell_type": "markdown", "metadata": {}, "source": ["Display the resulting denoised image."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["imageplot(clamp(x))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Constrained Optimization Using Projected Gradient Descent", "---------------------------------------------------------", "We consider a linear imaging operator $\\Phi : x \\mapsto \\Phi(x)$", "that maps high resolution images to low dimensional observations.", "Here we consider a pixel masking operator, that is diagonal over the", "spacial domain.", "", "", "To emphasis the effect of the TV functional, we use a simple geometric", "image."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["n = 64", "name = 'square'", "x0 = load_image(name, n)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["We consider here the inpainting problem. This simply corresponds to a", "masking operator.", "Here we remove the central part of the image."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["a = 4", "Lambda = ones(n)", "Lambda(end/ 2-a: end/ 2 + a, : ) = 0"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Masking operator $ \\Phi $. Note that it is symmetric, i.e. $\\Phi^*=\\Phi$"]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["Phi = lambda x: x.*Lambda", "PhiS = lambda x: Phi(x)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Noiseless observations $y=\\Phi x_0$."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["y = Phi(x0)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Display."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["imageplot(x0, 'Original', 1, 2, 1)", "imageplot(y, 'Damaged', 1, 2, 2)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["We want to solve the noiseless inverse problem $y=\\Phi f$ using a total", "variation regularization:", "$$ \\umin{ y=\\Phi x } J_\\epsilon(x). $$", "We use the following projected gradient descent", "$$ x^{(k+1)} = \\text{Proj}_{\\Hh}( x^{(k)} - \\tau_k \\nabla J_{\\epsilon}(x^{(k)}) ) $$", "where $\\text{Proj}_{\\Hh}$ is the orthogonal projection on the set of", "linear constraint $\\Phi x = y$, and is easy to compute for inpainting"]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["ProjH = lambda x, y: x + PhiS(y - Phi(x))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["__Exercise 4__", "", "Display the evolution of the inpainting process."]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["solutions.exo4()"]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["## Insert your code here."]}, {"cell_type": "markdown", "metadata": {}, "source": ["__Exercise 5__", "", "Try with several values of $\\epsilon$.", "au = tau * 100;"]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["solutions.exo5()"]}, {"cell_type": "code", "language": "python", "metadata": {}, "outputs": [], "collapsed": false, "input": ["## Insert your code here."]}]}], "nbformat": 3, "metadata": {"name": ""}, "nbformat_minor": 0}