{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "***\n", "# Lecture 7 - Convex Sets\n", "***\n", "$\\newcommand{\\vct}[1]{\\mathbf{#1}}$\n", "$\\newcommand{\\mtx}[1]{\\mathbf{#1}}$\n", "$\\newcommand{\\e}{\\varepsilon}$\n", "$\\newcommand{\\norm}[1]{\\|#1\\|}$\n", "$\\newcommand{\\minimize}{\\text{minimize}\\quad}$\n", "$\\newcommand{\\maximize}{\\text{maximize}\\quad}$\n", "$\\newcommand{\\subjto}{\\quad\\text{subject to}\\quad}$\n", "$\\newcommand{\\R}{\\mathbb{R}}$\n", "$\\newcommand{\\trans}{T}$\n", "$\\newcommand{\\ip}[2]{\\langle {#1}, {#2} \\rangle}$\n", "$\\newcommand{\\zerovct}{\\vct{0}}$\n", "$\\newcommand{\\diff}[1]{\\mathrm{d}{#1}}$\n", "$\\newcommand{\\conv}{\\operatorname{conv}}$\n", "$\\newcommand{\\inter}{{\\operatorname{int}}}$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this lecture we begin our study of the theory underlying constrained convex optimization. One way to define a **convex optimization problem** is\n", "\n", "\\begin{align*}\n", "\\begin{split}\n", " \\minimize & f(\\vct{x})\\\\\n", " \\subjto & f_1(\\vct{x})\\leq 0\\\\\n", " & \\cdots \\\\\n", " & f_m(\\vct{x})\\leq 0\\\\\n", " & \\vct{x}\\in \\Omega\n", " \\end{split}\n", "\\end{align*}\n", "\n", "where $f,f_1,\\dots,f_m\\colon \\R^n\\to \\R$ are *convex* functions and $\\Omega\\subseteq \\R^n$ is a *convex* set. The special case where the $f$ and the $f_i$ are linear functions and $\\Omega=\\R^n$ is known as linear programming, and is studied first. Before embarking on the study of models and algorithms for convex optimization, we need to study convex sets in more depth." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Convex sets\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We recall the definition of a convex set.\n", "\n", "**Definition.** A set $C\\subseteq \\R^n$ is a *convex set*, if for all $\\vct{x},\\vct{y}\\in C$ and $\\lambda\\in [0,1]$, $\\lambda\\vct{x}+(1-\\lambda)\\vct{y}\\in C$. In words, for every two points in $C$, the line joining them is also in $C$. A compact (closed and bounded) convex set is called a *convex body*.\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will denote by $\\mathcal{C}(\\R^n)$ the collection of convex sets and by $\\mathcal{K}(\\R^n)$ the collection of convex bodies. The following Lemma is left as an exercise.\n", "\n", "**Lemma. ** Let $C,D\\in \\mathcal{C}(\\R^n)$ be convex sets. Then the following are also convex.\n", "* $\\displaystyle C\\cap D$;\n", "* $\\displaystyle C+D=\\{\\vct{x}+\\vct{y} \\mid \\vct{x}\\in C, \\vct{y}\\in D\\}$;\n", "* $\\displaystyle \\mtx{A}C=\\{\\mtx{A}\\vct{x} \\mid \\vct{x}\\in C\\}$, where $\\mtx{A}\\in \\R^{m\\times n}$.\n", "\n", "The **convex hull** $\\conv{S}$ of a set $S$ is the intersection of all convex sets containing $S$. Clearly, if $S$ is convex, then $S=\\conv{S}$.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Example. ** Let $S=\\{(1,1)^{\\trans},(1,-1)^{\\trans},(-1,1)^{\\trans},(-1,-1)^{\\trans},(0,0)^{\\trans}\\}$. The convex hull of this set is the square.\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A **convex combination** of points $\\vct{x}_1,\\dots,\\vct{x}_k$ is a linear combination\n", "\n", "\\begin{equation*}\n", " \\sum_{i=1}^k \\lambda_i \\vct{x}_i\n", "\\end{equation*}\n", "\n", "such that $\\lambda_i\\geq 0$ and $\\sum_{i=1}^k \\lambda_i = 1$. It can be shown inductively that convex sets are closed under convex combinations: any convex combination of points in $C\\in \\mathcal{C}(\\R^n)$ is still in $C$. In fact, the set of all convex combinations of points in a set $S$ is the convex hull of $S$.\n", "\n", "**Lemma.** Let $S$ be a set. Then \n", " \\begin{equation*}\n", " \\conv{S} = \\{\\vct{x}\\in \\R^n \\mid \\vct{x}=\\sum_{i=1}^k \\lambda_i \\vct{x}_k, \\ \\vct{x}_i\\in S, \\ \\sum_{i=1}^k \\lambda_i = 1, \\ \\lambda_i\\geq 0\\}.\n", " \\end{equation*}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Example. ** A hyperplane, defined as the solution set of one linear equation,\n", "\n", "\\begin{equation*}\n", " H = \\{\\vct{x} \\mid \\ip{\\vct{a}}{\\vct{x}} = b\\},\n", "\\end{equation*}\n", "\n", "is a convex set.\n", "Define the halfspaces $H_+$ and $H_-$ as the two sides that $H$ divides $\\R^n$ into:\n", "\n", "\\begin{equation*}\n", " H_- = \\{\\vct{x}\\mid \\ip{\\vct{a}}{\\vct{x}}\\leq b\\}, \\quad H_+ = \\{\\vct{x}\\mid \\ip{\\vct{a}}{\\vct{x}}\\geq b\\}\n", "\\end{equation*}\n", "\n", "These are also convex sets. \n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Example.** Euclidean balls and ellipsoids are common examples of convex sets. Let $\\mtx{P}$ be a positive semidefinite symmetric matrix. Then an ellipsoid with center $\\vct{x}_0$ is a set of the form\n", "\n", " \\begin{equation*}\n", " \\mathcal{E} = \\{\\vct{x} \\mid \\ip{\\vct{x}-\\vct{x}_0}{\\mtx{P}^{-1}(\\vct{x}-\\vct{x}_0)}\\leq 1\\}.\n", " \\end{equation*}\n", " \n", "A Euclidean unit ball is the special case $\\mtx{P}=\\mtx{I}$.\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A **convex cone** is a set $C$ such that for all $\\vct{x},\\vct{y}$ and $\\lambda\\geq 0, \\mu\\geq 0$,\n", " $\\lambda \\vct{x}+\\mu \\vct{y}\\in C$. It is easily verified that such a set is convex. Three important cones are the following:\n", "\n", "* The non-negative orthant $\\R^n_{+}=\\{\\vct{x}\\in \\R^n \\mid x_i\\geq 0, 1\\leq i\\leq n\\}$,\n", "* The second order (ice cream) cone (or Lorentz cone)\n", "\n", " \\begin{equation*}\n", " C_{\\alpha} = \\{\\vct{x} \\mid \\sum_{i=1}^{n-1}x_i^2\\leq x_n^2\\},\n", " \\end{equation*}\n", " \n", "* The cone $\\mathcal{S}_{+}^n$ of positive semidefinite symmetric matrices.\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Possibly the most important result in convex geometry is the {\\em hyperplane separation theorem}. We first need the following.\n", "\n", "**Lemma.** Let $C$ be a non-empty convex set and $\\vct{x}\\not\\in C$. Then there exists a point $\\vct{y}\\in C$ that minimizes the distance $\\norm{\\vct{x}-\\vct{y}}$. Moreover, for all $\\vct{z}\\in C$ we have\n", " \n", " \\begin{equation*}\n", " \\ip{\\vct{z}-\\vct{y}}{\\vct{x}-\\vct{y}}\\leq 0.\n", " \\end{equation*}\n", "\n", "In words, the vectors $\\vct{z}-\\vct{y}$ and $\\vct{x}-\\vct{y}$ form an obtuse angle.\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Proof.** Since $C\\neq \\emptyset$, there exists $r>0$ such that the ball $B(\\vct{x},r):=\\{\\vct{y}\\in \\R^n \\mid \\norm{\\vct{y}-\\vct{x}}\\leq \\e\\}$ intersected with $C$ is not empty. Since $K:=C\\cap B(\\vct{x},r)$ is compact (closed and bounded) and the function $\\norm{\\vct{y}-\\vct{x}}$ is continuous on $K$, it has a minimizer $\\vct{y}\\in K$. For the second claim, note that since $C$ is convex, for every $\\lambda\\in [0,1]$,\n", "\n", " \\begin{equation*}\n", " \\vct{w} = \\lambda \\vct{z}+(1-\\lambda)\\vct{y} \\in C.\n", " \\end{equation*}\n", "\n", "For the distance between $\\vct{z}$ and $\\vct{x}$ we then get\n", "\n", "\\begin{align*}\n", " \\norm{\\vct{w}-\\vct{x}}^2 &= \\norm{\\lambda\\vct{z}+(1-\\lambda) \\vct{y}-\\vct{x}}^2 = \\norm{\\lambda (\\vct{z}-\\vct{y})-(\\vct{x}-\\vct{y})}^2\\\\\n", " &= \\lambda^2\\norm{\\vct{z}-\\vct{y}}^2-2\\lambda \\ip{\\vct{z}-\\vct{y}}{\\vct{x}-\\vct{y}}+\\norm{\\vct{x}-\\vct{y}}^2.\n", "\\end{align*}\n", "\n", "We now prove the claim by contradition. Assume $\\ip{\\vct{z}-\\vct{y}}{\\vct{x}-\\vct{y}}>0$. Then we can choose $\\lambda$ such that\n", "\n", "\\begin{equation*}\n", " 0< \\lambda < \\min\\left\\{ \\frac{2\\ip{\\vct{x}-\\vct{y}}{\\vct{z}-\\vct{y}}}{\\norm{\\vct{z}-\\vct{y}}^2} , 1\\right\\}.\n", "\\end{equation*}\n", "\n", "With such a $\\lambda$ we get\n", "\n", "\\begin{equation*}\n", " \\norm{\\vct{w}-\\vct{x}}^2 = \\lambda^2\\norm{\\vct{z}-\\vct{y}}^2-2\\lambda \\ip{\\vct{z}-\\vct{y}}{\\vct{x}-\\vct{y}}+\\norm{\\vct{x}-\\vct{y}}^2 < \\norm{\\vct{x}-\\vct{y}}^2.\n", "\\end{equation*}\n", "\n", "This inequality, however, contradicts the assumption that $\\vct{y}$ is a closest point, so that\n", "$\\ip{\\vct{z}-\\vct{y}}{\\vct{x}-\\vct{y}}\\leq 0$ has to hold." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In what follows write $\\inter S$ for the *interior* of a set $S$.\n", "\n", "**Theorem. **\n", " Let $C$ be a closed convex set and $\\vct{x}\\not\\in C$. Then there exists a hyperplane $H$ such that $C\\subset \\inter H_-$ and $\\vct{x}\\in \\inter H_+$. \n", "\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Proof.** Let $\\vct{y}\\in C$ be a nearest point to $\\vct{x}$ in $C$, i.e., a point such that for all other $\\vct{z}\\in C$, $\\norm{\\vct{x}-\\vct{y}}\\leq \\norm{\\vct{x}-\\vct{z}}$. Define\n", "\n", " \\begin{equation*}\n", " \\vct{a}= \\vct{x}-\\vct{y}, \\quad b = (\\norm{\\vct{x}}^2-\\norm{\\vct{y}}^2)/2.\n", " \\end{equation*}\n", " \n", "We aim to show that $\\ip{\\vct{a}}{\\vct{x}} = b$ defines a separating hyperplane. \n", "\n", "For this we have to show that\n", "\n", "1. $\\ip{\\vct{a}}{\\vct{x}}>b$;\n", "2. For all $\\vct{z}\\in C$, $\\ip{\\vct{a}}{\\vct{z}}\\ip{\\vct{x}-\\vct{y}}{\\vct{x}}-\\frac{1}{2}\\norm{\\vct{x}-\\vct{y}}^2 = \\frac{1}{2}(\\norm{\\vct{x}}^2-\\norm{\\vct{y}}^2) = b.\n", "\\end{equation*}\n", "\n", "To prove (2), assume on the contrary that there exists a $\\vct{z}\\in C$ such that $\\ip{\\vct{a}}{\\vct{z}}\\geq b$. We know that the point $\\vct{y}\\in C$ satisfies the inequality (2), since\n", "\n", "\\begin{equation*}\n", " \\ip{\\vct{a}}{\\vct{y}} < \\ip{\\vct{a}}{\\vct{y}}+\\frac{1}{2}\\norm{\\vct{a}}^2 = \\ip{\\vct{a}}{\\vct{y}}+\\frac{1}{2}\\norm{\\vct{x}-\\vct{y}}^2 = b.\n", "\\end{equation*}\n", "\n", "Therefore, \n", "\n", "\\begin{equation*}\n", " \\ip{\\vct{a}}{\\vct{z}-\\vct{y}} = \\ip{\\vct{a}}{\\vct{z}}-\\ip{\\vct{a}}{\\vct{y}} > b-b = 0,\n", "\\end{equation*}\n", "\n", "but this contradicts the above Lemma. We therefore conclude $\\ip{\\vct{a}}{\\vct{z}}