{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "$\\LaTeX \\text{ commands here}\n", "\\newcommand{\\R}{\\mathbb{R}}\n", "\\newcommand{\\im}{\\text{im}\\,}\n", "\\newcommand{\\norm}[1]{||#1||}\n", "\\newcommand{\\inner}[1]{\\langle #1 \\rangle}\n", "\\newcommand{\\span}{\\mathrm{span}}\n", "\\newcommand{\\proj}{\\mathrm{proj}}\n", "\\newcommand{\\OPT}{\\mathrm{OPT}}\n", "$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "
\n", "\n", "**Georgia Tech, CS 4540**\n", "\n", "# L6: Duality, Convexity, & Postive Semidefiniteness\n", "\n", "Jake Abernethy & Benjamin Bray\n", "\n", "*Thursday, September 5, 2019*" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### More Calc. Problems\n", "\n", "Let $w$ be an $n$-dim vector. Calculate the first derivatives of the following functions:\n", "+ $f(w) = \\|w\\|_2$\n", "+ $f(w) = \\| w \\|_1$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "#### Answer\n", "+ $f(w) = \\|w\\|_2 \\implies \\nabla f(w) = \\frac{w}{\\|w\\|_2}$\n", "+ $f(w) = \\| w \\|_1 \\implies \\nabla f(w) = [\\text{sign}(w_i)]_{i=1\\ldots n}$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Even More Calc. Problems\n", "\n", "Let $w$ be an $n$-dim vector. Calculate the derivatives of the following functions:\n", "+ $f(w) = \\frac 1 2 \\|w - v\\|_2^2$ where $v$ is some fixed vector\n", "+ $f(w) = \\frac 1 2 w^\\top M w$ (calculate the hessian for this one as well!)\n", "+ $f(w) = \\exp(w^\\top M w)$ where $M$ is some symmetric $n \\times n $ matrix" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "#### Answer\n", "\n", "+ $f(w) = \\frac 1 2 \\|w - v\\|_2^2 \\implies \\nabla f(w) = (w - v)$\n", "+ $f(w) = \\frac 1 2 w^\\top M w \\implies \\nabla f(w) = \\frac 1 2 w^\\top(M + M^\\top)$\n", "+ $f(w) = \\exp(w^\\top M w) \\implies \\nabla f(w) = \\exp(w^\\top M w)Mw$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Positive Semidefinite Matrices\n", "\n", "**Definition**: A matrix $M \\in \\R^{n\\times n}$ is *positive semidefinite* (PSD) if $x^\\top M x \\geq 0\\; \\forall x \\in \\R^n$\n", "\n", "Commonly, we work with matrices $M$ that are *symmetric* (that is, $M = M^\\top$). In this case, the following are equivalent:\n", "\n", "1. $M$ is PSD\n", "2. The eigenvalues of $M$ are all $\\geq 0$\n", "3. The matrix $M$ can be written as $B^\\top B$ for some $B \\in \\R^{n \\times n}$\n", "\n", "#### Problem\n", "Show that (3) $\\implies$ (1) $\\implies$ (2)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "#### Solution\n", "\n", "+ (3) $\\implies$ (1): If $M = B^\\top B$ then for any $x$ we have\n", "$$x^\\top M x = x^\\top (B^\\top B) x = (x^\\top B^\\top) (B x) = (B x)^\\top (B x) = \\|B x\\|_2^2 \\geq 0$$\n", "+ (1) $\\implies$ (2): If $M$ is PSD, then for any vector $x$ we have $x^\\top M x \\geq 0$. Let's take an vector $v$ of $M$, with eigenvalue $\\lambda$, hence $M v = \\lambda v$. We need to show that $\\lambda \\geq 0$. Notice that $v \\ne 0$, hence, if we multiply on the left hand side by $v$, we use the fact that $M$ is PSD to get\n", "$$ 0 \\leq v^\\top M v = v^\\top(\\lambda v) = \\lambda v^\\top v = \\lambda \\| v\\|^2.$$\n", "The fact that $0 \\leq \\lambda \\| v \\|^2$ can only occur if $\\lambda \\geq 0$. \n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### PSD matrices\n", "\n", "We often use the notation $M \\succcurlyeq 0$ to denote that $M$ is PSD. The notation $M \\succcurlyeq N$ is equivalent to $M - N \\succcurlyeq 0$, i.e. $M - N$ is PSD.\n", "\n", "#### Problem A\n", "\n", "Let $M \\in \\R^{n \\times n}$ be an arbitrary symmetric matrix. Let $P \\in \\R$ be any nonsingular matrix. Show that\n", "$$M \\succcurlyeq 0 \\Longleftrightarrow P^\\top M P \\succcurlyeq 0$$\n", "\n", "#### Problem B\n", "\n", "Show that the set $S_+^n := \\{ M \\in \\R^{n\\times n}: M = M^\\top, M \\succcurlyeq 0 \\}$ forms a convex cone." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "#### Solutions\n", "\n", "+ (Problem A)\n", " + First assume that $M \\succcurlyeq 0$. Then $x^\\top M x \\geq 0$ for all $x$. Now take an arbitrary vector $y$, and observe that $y^\\top (P^\\top M P) y = (P y)^\\top M (P y)$. Of course, $P y$ is just some arbitrary vector, hence $(P y)^\\top M (P y) \\geq 0$ as desired\n", " + For the second direction, assume that for all $y$ we have $y^\\top (P^\\top M P) y \\geq 0$. We need to show that for all $x$, $x^\\top M x \\geq 0$. Notice that $P$ is invertible, hence we can set $y = P^{-1} x$, and observe that\n", " $$ 0 \\leq (P^{-1} x)^\\top (P^\\top M P) (P^{-1} x) = x^\\top (P^{-1})^\\top P^\\top M P P^{-1} x = x^\\top M x$$\n", " as desired.\n", "+ (Problem B) Let us quickly check convexity. Let $N,M \\in S_+^n$, and observe that for any $x$ we have $x^\\top M x \\geq 0$ and $x^\\top N x \\geq 0$. Hence, if we take a convex combination $\\theta M + (1-\\theta)$, with $\\theta \\in [0,1]$ we have\n", "$$ x^\\top (\\theta M + (1-\\theta) N) x = \\theta x^\\top M x + (1-\\theta) x^\\top N x$$\n", "which is nonnegative since both terms are nonnegative" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Convex Functions\n", "\n", "A function $f : \\R^d \\to \\R$ is convex if, for any $x,y \\in \\text{dom}(f)$ and any $\\theta \\in [0,1]$, we have\n", "$$f(\\theta x + (1-\\theta)y) \\leq \\theta f(x) + (1-\\theta) f(y)$$\n", "\n", "**First-order Condition** When $f$ is differentiable, then $f$ is convex if and only if\n", "$$f(y) \\geq f(x) + \\nabla f(x)^\\top(y - x) \\quad \\text{ for any } x,y \\in \\text{dom}(f)$$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Problem\n", "\n", "**First-order Condition** When $f$ is differentiable, then $f$ is convex if and only if\n", "$$f(y) \\geq f(x) + \\nabla f(x)^\\top(y - x) \\quad \\text{ for any } x,y \\in \\text{dom}(f)$$\n", "\n", "Use the first order condition to show that, for any convex and differentiable $f$, we have\n", "$$(\\nabla f(y) - \\nabla f(x))^\\top(y - x) \\geq 0 \\text{ for any } x,y \\in \\text{dom}(f)$$\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "#### Answer\n", "\n", "Let's apply the first order condition twice, once at $x$ and once at $y$:\n", "\\begin{eqnarray*}\n", "f(y) & \\geq & f(x) + \\nabla f(x)^\\top(y - x) \\\\\n", "f(x) & \\geq & f(y) + \\nabla f(y)^\\top(x - y).\n", "\\end{eqnarray*}\n", "Add these two inequalities together, and subtract $f(x) + f(y)$ from both sides and you are done!" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Problem\n", "\n", "**Second-order Condition** When $f$ is twice differentiable, then $f$ is convex if and only if\n", "the hessian satisfies $\\nabla^2f(x) \\succcurlyeq 0$.\n", "\n", "Find conditions under which the following function is convex $f(x) = \\frac 1 2 x^\\top A x$ for a symmetric matrix $A \\in \\R^{d \\times d}$" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "#### Answer\n", "\n", "As we computed earlier in lecture, for $f(x) = \\frac 1 2 x^\\top A x$, the hessian is $\\nabla^2 f(x) = A$. We know that a twice-differentiable function is convex if its hessian is positive semi-definite. Therefore, $f$ is convex if and only if $A$ is positive semi-definite." ] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 4 }