{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"***\n",
"# Lecture 7 - Convex Sets\n",
"***\n",
"$\\newcommand{\\vct}[1]{\\mathbf{#1}}$\n",
"$\\newcommand{\\mtx}[1]{\\mathbf{#1}}$\n",
"$\\newcommand{\\e}{\\varepsilon}$\n",
"$\\newcommand{\\norm}[1]{\\|#1\\|}$\n",
"$\\newcommand{\\minimize}{\\text{minimize}\\quad}$\n",
"$\\newcommand{\\maximize}{\\text{maximize}\\quad}$\n",
"$\\newcommand{\\subjto}{\\quad\\text{subject to}\\quad}$\n",
"$\\newcommand{\\R}{\\mathbb{R}}$\n",
"$\\newcommand{\\trans}{T}$\n",
"$\\newcommand{\\ip}[2]{\\langle {#1}, {#2} \\rangle}$\n",
"$\\newcommand{\\zerovct}{\\vct{0}}$\n",
"$\\newcommand{\\diff}[1]{\\mathrm{d}{#1}}$\n",
"$\\newcommand{\\conv}{\\operatorname{conv}}$\n",
"$\\newcommand{\\inter}{{\\operatorname{int}}}$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this lecture we begin our study of the theory underlying constrained convex optimization. One way to define a **convex optimization problem** is\n",
"\n",
"\\begin{align*}\n",
"\\begin{split}\n",
" \\minimize & f(\\vct{x})\\\\\n",
" \\subjto & f_1(\\vct{x})\\leq 0\\\\\n",
" & \\cdots \\\\\n",
" & f_m(\\vct{x})\\leq 0\\\\\n",
" & \\vct{x}\\in \\Omega\n",
" \\end{split}\n",
"\\end{align*}\n",
"\n",
"where $f,f_1,\\dots,f_m\\colon \\R^n\\to \\R$ are *convex* functions and $\\Omega\\subseteq \\R^n$ is a *convex* set. The special case where the $f$ and the $f_i$ are linear functions and $\\Omega=\\R^n$ is known as linear programming, and is studied first. Before embarking on the study of models and algorithms for convex optimization, we need to study convex sets in more depth."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"## Convex sets\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We recall the definition of a convex set.\n",
"\n",
"**Definition.** A set $C\\subseteq \\R^n$ is a *convex set*, if for all $\\vct{x},\\vct{y}\\in C$ and $\\lambda\\in [0,1]$, $\\lambda\\vct{x}+(1-\\lambda)\\vct{y}\\in C$. In words, for every two points in $C$, the line joining them is also in $C$. A compact (closed and bounded) convex set is called a *convex body*.\n",
"\n",
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will denote by $\\mathcal{C}(\\R^n)$ the collection of convex sets and by $\\mathcal{K}(\\R^n)$ the collection of convex bodies. The following Lemma is left as an exercise.\n",
"\n",
"**Lemma. ** Let $C,D\\in \\mathcal{C}(\\R^n)$ be convex sets. Then the following are also convex.\n",
"* $\\displaystyle C\\cap D$;\n",
"* $\\displaystyle C+D=\\{\\vct{x}+\\vct{y} \\mid \\vct{x}\\in C, \\vct{y}\\in D\\}$;\n",
"* $\\displaystyle \\mtx{A}C=\\{\\mtx{A}\\vct{x} \\mid \\vct{x}\\in C\\}$, where $\\mtx{A}\\in \\R^{m\\times n}$.\n",
"\n",
"The **convex hull** $\\conv{S}$ of a set $S$ is the intersection of all convex sets containing $S$. Clearly, if $S$ is convex, then $S=\\conv{S}$.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Example. ** Let $S=\\{(1,1)^{\\trans},(1,-1)^{\\trans},(-1,1)^{\\trans},(-1,-1)^{\\trans},(0,0)^{\\trans}\\}$. The convex hull of this set is the square.\n",
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A **convex combination** of points $\\vct{x}_1,\\dots,\\vct{x}_k$ is a linear combination\n",
"\n",
"\\begin{equation*}\n",
" \\sum_{i=1}^k \\lambda_i \\vct{x}_i\n",
"\\end{equation*}\n",
"\n",
"such that $\\lambda_i\\geq 0$ and $\\sum_{i=1}^k \\lambda_i = 1$. It can be shown inductively that convex sets are closed under convex combinations: any convex combination of points in $C\\in \\mathcal{C}(\\R^n)$ is still in $C$. In fact, the set of all convex combinations of points in a set $S$ is the convex hull of $S$.\n",
"\n",
"**Lemma.** Let $S$ be a set. Then \n",
" \\begin{equation*}\n",
" \\conv{S} = \\{\\vct{x}\\in \\R^n \\mid \\vct{x}=\\sum_{i=1}^k \\lambda_i \\vct{x}_k, \\ \\vct{x}_i\\in S, \\ \\sum_{i=1}^k \\lambda_i = 1, \\ \\lambda_i\\geq 0\\}.\n",
" \\end{equation*}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Example. ** A hyperplane, defined as the solution set of one linear equation,\n",
"\n",
"\\begin{equation*}\n",
" H = \\{\\vct{x} \\mid \\ip{\\vct{a}}{\\vct{x}} = b\\},\n",
"\\end{equation*}\n",
"\n",
"is a convex set.\n",
"Define the halfspaces $H_+$ and $H_-$ as the two sides that $H$ divides $\\R^n$ into:\n",
"\n",
"\\begin{equation*}\n",
" H_- = \\{\\vct{x}\\mid \\ip{\\vct{a}}{\\vct{x}}\\leq b\\}, \\quad H_+ = \\{\\vct{x}\\mid \\ip{\\vct{a}}{\\vct{x}}\\geq b\\}\n",
"\\end{equation*}\n",
"\n",
"These are also convex sets. \n",
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Example.** Euclidean balls and ellipsoids are common examples of convex sets. Let $\\mtx{P}$ be a positive semidefinite symmetric matrix. Then an ellipsoid with center $\\vct{x}_0$ is a set of the form\n",
"\n",
" \\begin{equation*}\n",
" \\mathcal{E} = \\{\\vct{x} \\mid \\ip{\\vct{x}-\\vct{x}_0}{\\mtx{P}^{-1}(\\vct{x}-\\vct{x}_0)}\\leq 1\\}.\n",
" \\end{equation*}\n",
" \n",
"A Euclidean unit ball is the special case $\\mtx{P}=\\mtx{I}$.\n",
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A **convex cone** is a set $C$ such that for all $\\vct{x},\\vct{y}$ and $\\lambda\\geq 0, \\mu\\geq 0$,\n",
" $\\lambda \\vct{x}+\\mu \\vct{y}\\in C$. It is easily verified that such a set is convex. Three important cones are the following:\n",
"\n",
"* The non-negative orthant $\\R^n_{+}=\\{\\vct{x}\\in \\R^n \\mid x_i\\geq 0, 1\\leq i\\leq n\\}$,\n",
"* The second order (ice cream) cone (or Lorentz cone)\n",
"\n",
" \\begin{equation*}\n",
" C_{\\alpha} = \\{\\vct{x} \\mid \\sum_{i=1}^{n-1}x_i^2\\leq x_n^2\\},\n",
" \\end{equation*}\n",
" \n",
"* The cone $\\mathcal{S}_{+}^n$ of positive semidefinite symmetric matrices.\n",
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Possibly the most important result in convex geometry is the {\\em hyperplane separation theorem}. We first need the following.\n",
"\n",
"**Lemma.** Let $C$ be a non-empty convex set and $\\vct{x}\\not\\in C$. Then there exists a point $\\vct{y}\\in C$ that minimizes the distance $\\norm{\\vct{x}-\\vct{y}}$. Moreover, for all $\\vct{z}\\in C$ we have\n",
" \n",
" \\begin{equation*}\n",
" \\ip{\\vct{z}-\\vct{y}}{\\vct{x}-\\vct{y}}\\leq 0.\n",
" \\end{equation*}\n",
"\n",
"In words, the vectors $\\vct{z}-\\vct{y}$ and $\\vct{x}-\\vct{y}$ form an obtuse angle.\n",
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Proof.** Since $C\\neq \\emptyset$, there exists $r>0$ such that the ball $B(\\vct{x},r):=\\{\\vct{y}\\in \\R^n \\mid \\norm{\\vct{y}-\\vct{x}}\\leq \\e\\}$ intersected with $C$ is not empty. Since $K:=C\\cap B(\\vct{x},r)$ is compact (closed and bounded) and the function $\\norm{\\vct{y}-\\vct{x}}$ is continuous on $K$, it has a minimizer $\\vct{y}\\in K$. For the second claim, note that since $C$ is convex, for every $\\lambda\\in [0,1]$,\n",
"\n",
" \\begin{equation*}\n",
" \\vct{w} = \\lambda \\vct{z}+(1-\\lambda)\\vct{y} \\in C.\n",
" \\end{equation*}\n",
"\n",
"For the distance between $\\vct{z}$ and $\\vct{x}$ we then get\n",
"\n",
"\\begin{align*}\n",
" \\norm{\\vct{w}-\\vct{x}}^2 &= \\norm{\\lambda\\vct{z}+(1-\\lambda) \\vct{y}-\\vct{x}}^2 = \\norm{\\lambda (\\vct{z}-\\vct{y})-(\\vct{x}-\\vct{y})}^2\\\\\n",
" &= \\lambda^2\\norm{\\vct{z}-\\vct{y}}^2-2\\lambda \\ip{\\vct{z}-\\vct{y}}{\\vct{x}-\\vct{y}}+\\norm{\\vct{x}-\\vct{y}}^2.\n",
"\\end{align*}\n",
"\n",
"We now prove the claim by contradition. Assume $\\ip{\\vct{z}-\\vct{y}}{\\vct{x}-\\vct{y}}>0$. Then we can choose $\\lambda$ such that\n",
"\n",
"\\begin{equation*}\n",
" 0< \\lambda < \\min\\left\\{ \\frac{2\\ip{\\vct{x}-\\vct{y}}{\\vct{z}-\\vct{y}}}{\\norm{\\vct{z}-\\vct{y}}^2} , 1\\right\\}.\n",
"\\end{equation*}\n",
"\n",
"With such a $\\lambda$ we get\n",
"\n",
"\\begin{equation*}\n",
" \\norm{\\vct{w}-\\vct{x}}^2 = \\lambda^2\\norm{\\vct{z}-\\vct{y}}^2-2\\lambda \\ip{\\vct{z}-\\vct{y}}{\\vct{x}-\\vct{y}}+\\norm{\\vct{x}-\\vct{y}}^2 < \\norm{\\vct{x}-\\vct{y}}^2.\n",
"\\end{equation*}\n",
"\n",
"This inequality, however, contradicts the assumption that $\\vct{y}$ is a closest point, so that\n",
"$\\ip{\\vct{z}-\\vct{y}}{\\vct{x}-\\vct{y}}\\leq 0$ has to hold."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In what follows write $\\inter S$ for the *interior* of a set $S$.\n",
"\n",
"**Theorem. **\n",
" Let $C$ be a closed convex set and $\\vct{x}\\not\\in C$. Then there exists a hyperplane $H$ such that $C\\subset \\inter H_-$ and $\\vct{x}\\in \\inter H_+$. \n",
"\n",
""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Proof.** Let $\\vct{y}\\in C$ be a nearest point to $\\vct{x}$ in $C$, i.e., a point such that for all other $\\vct{z}\\in C$, $\\norm{\\vct{x}-\\vct{y}}\\leq \\norm{\\vct{x}-\\vct{z}}$. Define\n",
"\n",
" \\begin{equation*}\n",
" \\vct{a}= \\vct{x}-\\vct{y}, \\quad b = (\\norm{\\vct{x}}^2-\\norm{\\vct{y}}^2)/2.\n",
" \\end{equation*}\n",
" \n",
"We aim to show that $\\ip{\\vct{a}}{\\vct{x}} = b$ defines a separating hyperplane. \n",
"\n",
"For this we have to show that\n",
"\n",
"1. $\\ip{\\vct{a}}{\\vct{x}}>b$;\n",
"2. For all $\\vct{z}\\in C$, $\\ip{\\vct{a}}{\\vct{z}}\\ip{\\vct{x}-\\vct{y}}{\\vct{x}}-\\frac{1}{2}\\norm{\\vct{x}-\\vct{y}}^2 = \\frac{1}{2}(\\norm{\\vct{x}}^2-\\norm{\\vct{y}}^2) = b.\n",
"\\end{equation*}\n",
"\n",
"To prove (2), assume on the contrary that there exists a $\\vct{z}\\in C$ such that $\\ip{\\vct{a}}{\\vct{z}}\\geq b$. We know that the point $\\vct{y}\\in C$ satisfies the inequality (2), since\n",
"\n",
"\\begin{equation*}\n",
" \\ip{\\vct{a}}{\\vct{y}} < \\ip{\\vct{a}}{\\vct{y}}+\\frac{1}{2}\\norm{\\vct{a}}^2 = \\ip{\\vct{a}}{\\vct{y}}+\\frac{1}{2}\\norm{\\vct{x}-\\vct{y}}^2 = b.\n",
"\\end{equation*}\n",
"\n",
"Therefore, \n",
"\n",
"\\begin{equation*}\n",
" \\ip{\\vct{a}}{\\vct{z}-\\vct{y}} = \\ip{\\vct{a}}{\\vct{z}}-\\ip{\\vct{a}}{\\vct{y}} > b-b = 0,\n",
"\\end{equation*}\n",
"\n",
"but this contradicts the above Lemma. We therefore conclude $\\ip{\\vct{a}}{\\vct{z}}