{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Optimization Examples - Semidefinite Programming (SDP)\n", "\n", "## SDP\n", "\n", "\n", "\n", "\n", "\n", "* A **semidefinite program (SDP)** has the form\n", "\\begin{eqnarray*}\n", "\t&\\text{minimize}& \\mathbf{c}^T \\mathbf{x} \\\\\n", "\t&\\text{subject to}& x_1 \\mathbf{F}_1 + \\cdots + x_n \\mathbf{F}_n + \\mathbf{G} \\preceq \\mathbf{0} \\quad (\\text{LMI, linear matrix inequality}) \\\\\n", "\t& & \\mathbf{A} \\mathbf{x} = \\mathbf{b},\n", "\\end{eqnarray*}\n", "where $\\mathbf{G}, \\mathbf{F}_1, \\ldots, \\mathbf{F}_n \\in \\mathbf{S}^k$, $\\mathbf{A} \\in \\mathbb{R}^{p \\times n}$, and $\\mathbf{b} \\in \\mathbb{R}^p$.\n", "\n", " When $\\mathbf{G}, \\mathbf{F}_1, \\ldots, \\mathbf{F}_n$ are all diagonal, SDP reduces to LP.\n", "\n", "* The **standard form SDP** has form\n", "\\begin{eqnarray*}\n", "\t&\\text{minimize}& \\text{tr}(\\mathbf{C} \\mathbf{X}) \\\\\n", "\t&\\text{subject to}& \\text{tr}(\\mathbf{A}_i \\mathbf{X}) = b_i, \\quad i = 1, \\ldots, p \\\\\n", "\t& & \\mathbf{X} \\succeq \\mathbf{0},\n", "\\end{eqnarray*}\n", "where $\\mathbf{C}, \\mathbf{A}_1, \\ldots, \\mathbf{A}_p \\in \\mathbf{S}^n$.\n", "\n", "* An **inequality form SDP** has form\n", "\\begin{eqnarray*}\n", "\t&\\text{minimize}& \\mathbf{c}^T \\mathbf{x} \\\\\n", "\t&\\text{subject to}& x_1 \\mathbf{A}_1 + \\cdots + x_n \\mathbf{A}_n \\preceq \\mathbf{B},\n", "\\end{eqnarray*}\n", "with variable $\\mathbf{x} \\in \\mathbb{R}^n$, and parameters $\\mathbf{B}, \\mathbf{A}_1, \\ldots, \\mathbf{A}_n \\in \\mathbf{S}^n$, $\\mathbf{c} \\in \\mathbb{R}^n$.\n", "\n", "* Exercise. Write LP, QP, QCQP, and SOCP in form of SDP. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## SDP example: nearest correlation matrix\n", "\n", "* Let $\\mathbb{C}^n$ be the convex set of $n \\times n$ correlation matrices\n", "\\begin{eqnarray*}\n", "\t\\mathbb{C}^n = \\{ \\mathbf{X} \\in \\mathbf{S}_+^n: x_{ii}=1, i=1,\\ldots,n\\}.\n", "\\end{eqnarray*}\n", "Given $\\mathbf{A} \\in \\mathbf{S}^n$, often we need to find the closest correlation matrix to $\\mathbf{A}$\n", "\\begin{eqnarray*}\n", "\t&\\text{minimize}& \\|\\mathbf{A} - \\mathbf{X}\\|_{\\text{F}} \\\\\n", "\t&\\text{subject to}& \\mathbf{X} \\in \\mathbb{C}^n.\n", "\\end{eqnarray*}\n", "This projection problem can be solved via an SDP\n", "\\begin{eqnarray*}\n", "\t&\\text{minimize}& t \\\\\n", "\t&\\text{subject to}& \\|\\mathbf{A} - \\mathbf{X}\\|_{\\text{F}} \\le t \\\\\n", "\t& & \\mathbf{X} = \\mathbf{X}^T, \\, \\text{diag}(\\mathbf{X}) = \\mathbf{1} \\\\\n", "\t& & \\mathbf{X} \\succeq \\mathbf{0}\n", "\\end{eqnarray*}\n", "in variables $\\mathbf{X} \\in \\mathbb{R}^{n \\times n}$ and $t \\in \\mathbb{R}$. The SOC constraint can be written as an LMI\n", "\\begin{eqnarray*}\n", "\t\\begin{pmatrix}\n", "\t\tt \\mathbf{I} & \\text{vec} (\\mathbf{A} - \\mathbf{X}) \\\\\n", "\t\t\\text{vec} (\\mathbf{A} - \\mathbf{X})^T & t\n", "\t\\end{pmatrix} \\succeq \\mathbf{0}\n", "\\end{eqnarray*}\n", "by the Schur complement lemma." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## SDP example: eigenvalue problems\n", "\n", "* Suppose \n", "\\begin{eqnarray*}\n", "\t\\mathbf{A}(\\mathbf{x}) = \\mathbf{A}_0 + x_1 \\mathbf{A}_1 + \\cdots x_n \\mathbf{A}_n,\n", "\\end{eqnarray*}\n", "where $\\mathbf{A}_i \\in \\mathbf{S}^m$, $i = 0, \\ldots, n$. 
{ "cell_type": "markdown", "metadata": {}, "source": [
"## SDP example: eigenvalue problems\n",
"\n",
"* Suppose\n",
"\\begin{eqnarray*}\n",
"\t\\mathbf{A}(\\mathbf{x}) = \\mathbf{A}_0 + x_1 \\mathbf{A}_1 + \\cdots + x_n \\mathbf{A}_n,\n",
"\\end{eqnarray*}\n",
"where $\\mathbf{A}_i \\in \\mathbf{S}^m$, $i = 0, \\ldots, n$. Let $\\lambda_1(\\mathbf{x}) \\ge \\lambda_2(\\mathbf{x}) \\ge \\cdots \\ge \\lambda_m(\\mathbf{x})$ be the ordered eigenvalues of $\\mathbf{A}(\\mathbf{x})$.\n",
"\n",
"* Minimizing the maximal eigenvalue is equivalent to the SDP\n",
"\\begin{eqnarray*}\n",
"\t&\\text{minimize}& t \\\\\n",
"\t&\\text{subject to}& \\mathbf{A}(\\mathbf{x}) \\preceq t \\mathbf{I}\n",
"\\end{eqnarray*}\n",
"in variables $\\mathbf{x} \\in \\mathbb{R}^n$ and $t \\in \\mathbb{R}$. See the Convex.jl sketch after this list.\n",
"\n",
"* Minimizing the sum of the $k$ largest eigenvalues is an SDP too. How about minimizing the sum of all eigenvalues?\n",
"\n",
"* Maximizing the minimum eigenvalue is an SDP as well.\n",
"\n",
"* Minimizing the spread of the eigenvalues $\\lambda_1(\\mathbf{x}) - \\lambda_m(\\mathbf{x})$ is equivalent to the SDP\n",
"\\begin{eqnarray*}\n",
"\t&\\text{minimize}& t_1 - t_m \\\\\n",
"\t&\\text{subject to}& t_m \\mathbf{I} \\preceq \\mathbf{A}(\\mathbf{x}) \\preceq t_1 \\mathbf{I}\n",
"\\end{eqnarray*}\n",
"in variables $\\mathbf{x} \\in \\mathbb{R}^n$ and $t_1, t_m \\in \\mathbb{R}$.\n",
"\n",
"* Minimizing the spectral radius (or spectral norm) $\\rho(\\mathbf{x}) = \\max_{i=1,\\ldots,m} |\\lambda_i(\\mathbf{x})|$ is equivalent to the SDP\n",
"\\begin{eqnarray*}\n",
"\t&\\text{minimize}& t \\\\\n",
"\t&\\text{subject to}& - t \\mathbf{I} \\preceq \\mathbf{A}(\\mathbf{x}) \\preceq t \\mathbf{I}\n",
"\\end{eqnarray*}\n",
"in variables $\\mathbf{x} \\in \\mathbb{R}^n$ and $t \\in \\mathbb{R}$.\n",
"\n",
"* To minimize the condition number $\\kappa(\\mathbf{x}) = \\lambda_1(\\mathbf{x}) / \\lambda_m(\\mathbf{x})$, note $\\lambda_1(\\mathbf{x}) / \\lambda_m(\\mathbf{x}) \\le t$ if and only if there exists a $\\mu > 0$ such that $\\mu \\mathbf{I} \\preceq \\mathbf{A}(\\mathbf{x}) \\preceq \\mu t \\mathbf{I}$, or equivalently, $\\mathbf{I} \\preceq \\mu^{-1} \\mathbf{A}(\\mathbf{x}) \\preceq t \\mathbf{I}$. With the change of variables $y_i = x_i / \\mu$ and $s = 1/\\mu$, we can solve the SDP\n",
"\\begin{eqnarray*}\n",
"\t&\\text{minimize}& t \\\\\n",
"\t&\\text{subject to}& \\mathbf{I} \\preceq s \\mathbf{A}_0 + y_1 \\mathbf{A}_1 + \\cdots + y_n \\mathbf{A}_n \\preceq t \\mathbf{I} \\\\\n",
"\t& & s \\ge 0,\n",
"\\end{eqnarray*}\n",
"in variables $\\mathbf{y} \\in \\mathbb{R}^n$ and $s, t \\ge 0$. In other words, we normalize the spectrum by the smallest eigenvalue and then minimize the largest eigenvalue of the normalized LMI.\n",
"\n",
"* Minimizing the $\\ell_1$ norm of the eigenvalues $|\\lambda_1(\\mathbf{x})| + \\cdots + |\\lambda_m(\\mathbf{x})|$ is equivalent to the SDP\n",
"\\begin{eqnarray*}\n",
"\t&\\text{minimize}& \\text{tr} (\\mathbf{A}^+) + \\text{tr}(\\mathbf{A}^-) \\\\\n",
"\t&\\text{subject to}& \\mathbf{A}(\\mathbf{x}) = \\mathbf{A}^+ - \\mathbf{A}^- \\\\\n",
"\t& & \\mathbf{A}^+ \\succeq \\mathbf{0}, \\mathbf{A}^- \\succeq \\mathbf{0},\n",
"\\end{eqnarray*}\n",
"in variables $\\mathbf{x} \\in \\mathbb{R}^n$ and $\\mathbf{A}^+, \\mathbf{A}^- \\in \\mathbf{S}_+^m$.\n",
"\n",
"* Roots of the determinant. The determinant of a positive semidefinite matrix, $\\text{det} (\\mathbf{A}(\\mathbf{x})) = \\prod_{i=1}^m \\lambda_i (\\mathbf{x})$, is neither convex nor concave, but rational powers of the determinant can be modeled using linear matrix inequalities. For a rational power $0 \\le q \\le 1/m$, the function $\\text{det} (\\mathbf{A}(\\mathbf{x}))^q$ is concave and we have\n",
"\\begin{eqnarray*}\n",
"\t& & t \\le \\text{det} (\\mathbf{A}(\\mathbf{x}))^q\\\\\n",
"\t&\\Leftrightarrow& \\begin{pmatrix}\n",
"\t\\mathbf{A}(\\mathbf{x}) & \\mathbf{Z} \\\\\n",
"\t\\mathbf{Z}^T & \\text{diag}(\\mathbf{Z})\n",
"\t\\end{pmatrix} \\succeq \\mathbf{0}, \\quad (z_{11} z_{22} \\cdots z_{mm})^q \\ge t,\n",
"\\end{eqnarray*}\n",
"where $\\mathbf{Z} \\in \\mathbb{R}^{m \\times m}$ is a lower-triangular matrix. Similarly, for any rational $q>0$, we have\n",
"\\begin{eqnarray*}\n",
"\t& & t \\ge \\text{det} (\\mathbf{A}(\\mathbf{x}))^{-q} \\\\\n",
"\t&\\Leftrightarrow& \\begin{pmatrix}\n",
"\t\\mathbf{A}(\\mathbf{x}) & \\mathbf{Z} \\\\\n",
"\t\\mathbf{Z}^T & \\text{diag}(\\mathbf{Z})\n",
"\t\\end{pmatrix} \\succeq \\mathbf{0}, \\quad (z_{11} z_{22} \\cdots z_{mm})^{-q} \\le t\n",
"\\end{eqnarray*}\n",
"for a lower-triangular $\\mathbf{Z}$.\n",
"\n",
"* References: see Lecture 4 (p146-p151) of [Ben-Tal and Nemirovski (2001)](https://doi.org/10.1137/1.9780898718829) for proofs of the above facts.\n",
"\n",
"* `lambda_max`, `lambda_min`, `lambda_sum_largest`, `lambda_sum_smallest`, `det_rootn`, and `trace_inv` are implemented in cvx for Matlab.\n",
"\n",
"* `lambda_max`, `lambda_min` are implemented in the Convex.jl package for Julia."
] },
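{ "cell_type": "markdown", "metadata": {}, "source": [
"Here is a hedged Convex.jl sketch of the maximal-eigenvalue SDP above, with the LMI $\\mathbf{A}(\\mathbf{x}) \\preceq t \\mathbf{I}$ encoded explicitly through a slack `Semidefinite` variable (the data matrices are random placeholders):\n",
"\n",
"```julia\n",
"using Convex, SCS, LinearAlgebra\n",
"\n",
"m, n = 4, 2\n",
"sym(B) = (B + B') / 2\n",
"A = [sym(randn(m, m)) for _ in 0:n]    # A[1] = A_0, A[2] = A_1, ...\n",
"\n",
"x = Variable(n)\n",
"t = Variable()\n",
"Ax = A[1] + sum(x[i] * A[i + 1] for i in 1:n)\n",
"S = Semidefinite(m)                    # slack: t*I - A(x) must equal a PSD matrix\n",
"problem = minimize(t, t * Matrix(1.0I, m, m) - Ax == S)\n",
"solve!(problem, SCS.Optimizer)\n",
"problem.optval                         # approximately the minimal largest eigenvalue\n",
"```"
] },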
{ "cell_type": "markdown", "metadata": {}, "source": [
"## SDP example: experiment design\n",
"\n",
"See HW6 Q1."
] }, { "cell_type": "markdown", "metadata": {}, "source": [
"## SDP example: singular value problems\n",
"\n",
"* Let $\\mathbf{A}(\\mathbf{x}) = \\mathbf{A}_0 + x_1 \\mathbf{A}_1 + \\cdots + x_n \\mathbf{A}_n$, where $\\mathbf{A}_i \\in \\mathbb{R}^{p \\times q}$, and let $\\sigma_1(\\mathbf{x}) \\ge \\cdots \\ge \\sigma_{\\min\\{p,q\\}}(\\mathbf{x}) \\ge 0$ be the ordered singular values.\n",
"\n",
"* **Spectral norm** (or **operator norm** or **matrix-2 norm**) minimization. Consider minimizing the spectral norm $\\|\\mathbf{A}(\\mathbf{x})\\|_2 = \\sigma_1(\\mathbf{x})$. Note $\\|\\mathbf{A}\\|_2 \\le t$ if and only if $\\mathbf{A}^T \\mathbf{A} \\preceq t^2 \\mathbf{I}$ (and $t \\ge 0$) if and only if $\\begin{pmatrix} t\\mathbf{I} & \\mathbf{A} \\\\ \\mathbf{A}^T & t \\mathbf{I} \\end{pmatrix} \\succeq \\mathbf{0}$. This results in the SDP\n",
"\\begin{eqnarray*}\n",
"\t&\\text{minimize}& t \\\\\n",
"\t&\\text{subject to}& \\begin{pmatrix} t\\mathbf{I} & \\mathbf{A}(\\mathbf{x}) \\\\ \\mathbf{A}(\\mathbf{x})^T & t \\mathbf{I} \\end{pmatrix} \\succeq \\mathbf{0}\n",
"\\end{eqnarray*}\n",
"in variables $\\mathbf{x} \\in \\mathbb{R}^n$ and $t \\in \\mathbb{R}$.\n",
"\n",
"Minimizing the sum of the $k$ largest singular values is an SDP as well.\n",
"\n",
"* Nuclear norm minimization. Minimization of the *nuclear norm* (or *trace norm*) $\\|\\mathbf{A}(\\mathbf{x})\\|_* = \\sum_i \\sigma_i(\\mathbf{x})$ can be formulated as an SDP. A Convex.jl sketch follows this cell.\n",
"\n",
"Argument 1: The singular values of $\\mathbf{A}$ coincide with the eigenvalues of the symmetric matrix\n",
"\\begin{eqnarray*}\n",
"\t\\begin{pmatrix}\n",
"\t\\mathbf{0} & \\mathbf{A} \\\\\n",
"\t\\mathbf{A}^T & \\mathbf{0}\n",
"\t\\end{pmatrix},\n",
"\\end{eqnarray*}\n",
"which has eigenvalues $(\\sigma_1, \\ldots, \\sigma_p, - \\sigma_p, \\ldots, - \\sigma_1)$, plus $|p - q|$ zeros when $\\mathbf{A}$ is rectangular. Therefore minimizing the nuclear norm of $\\mathbf{A}$ is the same as minimizing the $\\ell_1$ norm of the eigenvalues of the augmented matrix (up to a factor of two), which we know is an SDP.\n",
"\n",
"Argument 2: An alternative characterization of the nuclear norm is $\\|\\mathbf{A}\\|_* = \\sup_{\\|\\mathbf{Z}\\|_2 \\le 1} \\text{tr} (\\mathbf{A}^T \\mathbf{Z})$. That is,\n",
"\\begin{eqnarray*}\n",
"\t&\\text{maximize}& \\text{tr}(\\mathbf{A}^T \\mathbf{Z}) \\\\\n",
"\t&\\text{subject to}& \\begin{pmatrix} \\mathbf{I} & \\mathbf{Z}^T \\\\ \\mathbf{Z} & \\mathbf{I} \\end{pmatrix} \\succeq \\mathbf{0},\n",
"\\end{eqnarray*}\n",
"with the dual problem\n",
"\\begin{eqnarray*}\n",
"\t&\\text{minimize}& \\text{tr}(\\mathbf{U} + \\mathbf{V}) /2 \\\\\n",
"\t&\\text{subject to}& \\begin{pmatrix}\n",
"\t\\mathbf{U} & \\mathbf{A}(\\mathbf{x})^T \\\\\n",
"\t\\mathbf{A}(\\mathbf{x}) & \\mathbf{V}\n",
"\t\\end{pmatrix} \\succeq \\mathbf{0}.\n",
"\\end{eqnarray*}\n",
"\n",
"Therefore the epigraph of the nuclear norm can be represented by the LMI\n",
"\\begin{eqnarray*}\n",
"\t& & \\|\\mathbf{A}(\\mathbf{x})\\|_* \\le t \\\\\n",
"\t&\\Leftrightarrow& \\begin{pmatrix}\n",
"\t\\mathbf{U} & \\mathbf{A}(\\mathbf{x})^T \\\\\n",
"\t\\mathbf{A}(\\mathbf{x}) & \\mathbf{V}\n",
"\t\\end{pmatrix} \\succeq \\mathbf{0}, \\quad \\text{tr}(\\mathbf{U} + \\mathbf{V}) /2 \\le t\n",
"\\end{eqnarray*}\n",
"in variables $t \\in \\mathbb{R}$, $\\mathbf{U} \\in \\mathbf{S}^{q}$, and $\\mathbf{V} \\in \\mathbf{S}^{p}$.\n",
"\n",
"Argument 3: See Proposition 4.2.2, p154 of [Ben-Tal and Nemirovski (2001)](https://doi.org/10.1137/1.9780898718829).\n",
"\n",
"* `sigma_max` and `norm_nuc` are implemented in cvx for Matlab.\n",
"\n",
"* `operator_norm` and `nuclear_norm` are implemented in the Convex.jl package for Julia."
] },
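{ "cell_type": "markdown", "metadata": {}, "source": [
"A hedged Convex.jl sketch of nuclear norm minimization using the epigraph LMI from Argument 2 (the data matrices below are random placeholders):\n",
"\n",
"```julia\n",
"using Convex, SCS, LinearAlgebra\n",
"\n",
"p, q, n = 3, 4, 2\n",
"A = [randn(p, q) for _ in 0:n]   # A[1] = A_0, A[2] = A_1, ...\n",
"\n",
"x = Variable(n)\n",
"t = Variable()\n",
"U = Variable(q, q)\n",
"V = Variable(p, p)\n",
"Ax = A[1] + sum(x[i] * A[i + 1] for i in 1:n)\n",
"# epigraph LMI: [U A(x)'; A(x) V] is PSD and tr(U + V)/2 <= t\n",
"S = Semidefinite(p + q)\n",
"problem = minimize(t, [[U Ax'; Ax V] == S, (tr(U) + tr(V)) / 2 <= t])\n",
"solve!(problem, SCS.Optimizer)\n",
"problem.optval                   # approximately the minimal nuclear norm of A(x)\n",
"```"
] },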
{ "cell_type": "markdown", "metadata": {}, "source": [
"## SDP example: matrix completion"
] },
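{ "cell_type": "markdown", "metadata": {}, "source": [
"One standard formulation, closely tied to the nuclear norm discussion above, seeks the matrix of smallest nuclear norm that agrees with the observed entries. A minimal Convex.jl sketch with made-up observations (the `nuclearnorm` atom computes $\\|\\mathbf{X}\\|_*$):\n",
"\n",
"```julia\n",
"using Convex, SCS\n",
"\n",
"m, n = 4, 4\n",
"obs = [(1, 1, 1.0), (1, 3, 2.0), (2, 2, 1.5), (3, 4, -0.5), (4, 1, 0.7)]   # (i, j, value), made up\n",
"\n",
"X = Variable(m, n)\n",
"constraints = [X[i, j] == v for (i, j, v) in obs]   # match the observed entries\n",
"problem = minimize(nuclearnorm(X), constraints)\n",
"solve!(problem, SCS.Optimizer)\n",
"evaluate(X)   # a completed matrix of small nuclear norm\n",
"```"
] },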
Then\n", "\\begin{eqnarray*}\n", "\t& & F(\\mathbf{X}) \\preceq \\mathbf{Y} \\\\\n", "\t&\\Leftrightarrow& \\begin{pmatrix} \n", "\t\\mathbf{I} & (\\mathbf{A} \\mathbf{X} \\mathbf{B})^T \\\\\n", "\t\\mathbf{A} \\mathbf{X} \\mathbf{B} & \\mathbf{Y} - \\mathbf{E} - \\mathbf{C} \\mathbf{X} \\mathbf{D} - (\\mathbf{C} \\mathbf{X} \\mathbf{D})^T\n", "\t\\end{pmatrix} \\preceq \\mathbf{0}\n", "\\end{eqnarray*}\n", "by the Schur complement lemma.\n", "\n", "* Another matrix inequality\n", "\\begin{eqnarray*}\n", "\t& & \\mathbf{X} \\succeq \\mathbf{0}, \\mathbf{Y} \\preceq (\\mathbf{C}^T \\mathbf{X}^{-1} \\mathbf{C})^{-1} \\\\\n", "\t&\\Leftrightarrow& \\mathbf{Y} \\preceq \\mathbf{Z}, \\mathbf{Z} \\succeq \\mathbf{0}, \\mathbf{X} \\succeq \\mathbf{C} \\mathbf{Z} \\mathbf{C}^T.\n", "\\end{eqnarray*}\n", "See Chapter 20.c (p155) of [Ben-Tal and Nemirovski (2001)](https://doi.org/10.1137/1.9780898718829)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## SDP example: cone of nonnegative polynomials\n", "\n", "* Consider nonnegative polynomial of degree $2n$ \n", "\\begin{eqnarray*}\n", "\tf(t) = \\mathbf{x}^T \\mathbf{v}(t) = x_0 + x_1 t + \\cdots x_{2n} t^{2n} \\ge 0, \\text{ for all } t.\n", "\\end{eqnarray*}\n", "The cone\n", "\\begin{eqnarray*}\n", "\t\\mathbf{K}^n = \\{\\mathbf{x} \\in \\mathbb{R}^{2n+1}: f(t) = \\mathbf{x}^T \\mathbf{v}(t) \\ge 0, \\text{ for all } t \\in \\mathbb{R}\\}\n", "\\end{eqnarray*}\n", "can be characterized by LMI\n", "\\begin{eqnarray*}\n", "\tf(t) \\ge 0 \\text{ for all } t \\Leftrightarrow x_i = \\langle \\mathbf{X}, \\mathbf{H}_i \\rangle, i=0,\\ldots,2n, \\mathbf{X} \\in \\mathbf{S}_+^{n+1},\n", "\\end{eqnarray*}\n", "where $\\mathbf{H}_i \\in \\mathbb{R}^{(n+1) \\times (n+1)}$ are Hankel matrices with entries $(\\mathbf{H}_i)_{kl} = 1$ if $k+l=i$ or 0 otherwise. Here $k, l \\in \\{0, 1, \\ldots, n\\}$. \n", "\n", "* Similarly the cone of nonnegative polynomials on a finite interval\n", "\\begin{eqnarray*}\n", "\t\\mathbf{K}_{a,b}^n = \\{\\mathbf{x} \\in \\mathbb{R}^{n+1}: f(t) = \\mathbf{x}^T \\mathbf{v}(t) \\ge 0, \\text{ for all } t \\in [a,b]\\}\n", "\\end{eqnarray*}\n", "can be characterized by LMI as well. MosekModelling.pdf p48.\n", " * (Even degree) Let $n = 2m$. Then\n", " \\begin{eqnarray*}\n", " \\mathbf{K}_{a,b}^n &=& \\{\\mathbf{x} \\in \\mathbb{R}^{n+1}: x_i = \\langle \\mathbf{X}_1, \\mathbf{H}_i^m \\rangle + \\langle \\mathbf{X}_2, (a+b) \\mathbf{H}_{i-1}^{m-1} - ab \\mathbf{H}_i^{m-1} - \\mathbf{H}_{i-2}^{m-1} \\rangle, \\\\\n", " & & \\quad i = 0,\\ldots,n, \\mathbf{X}_1 \\in \\mathbf{S}_+^m, \\mathbf{X}_2 \\in \\mathbf{S}_+^{m-1}\\}.\n", " \\end{eqnarray*}\n", " * (Odd degree) Let $n = 2m + 1$. Then\n", " \\begin{eqnarray*}\n", " \\mathbf{K}_{a,b}^n &=& \\{\\mathbf{x} \\in \\mathbb{R}^{n+1}: x_i = \\langle \\mathbf{X}_1, \\mathbf{H}_{i-1}^m - a \\mathbf{H}_i^m \\rangle + \\langle \\mathbf{X}_2, b \\mathbf{H}_{i}^{m} - \\mathbf{H}_{i-1}^{m} \\rangle, \\\\\n", " & & \\quad i = 0,\\ldots,n, \\mathbf{X}_1, \\mathbf{X}_2 \\in \\mathbf{S}_+^m\\}.\n", " \\end{eqnarray*}\n", "\n", "* References: [Nesterov (2000)](https://doi.org/10.1007/978-1-4757-3216-0_17) and Lecture 4 (p157-p159) of [Ben-Tal and Nemirovski (2001)](https://doi.org/10.1137/1.9780898718829).\n", "\n", "* Example. Polynomial curve fitting. 
{ "cell_type": "markdown", "metadata": {}, "source": [
"## SDP example: cone of nonnegative polynomials\n",
"\n",
"* Consider a nonnegative polynomial of degree $2n$,\n",
"\\begin{eqnarray*}\n",
"\tf(t) = \\mathbf{x}^T \\mathbf{v}(t) = x_0 + x_1 t + \\cdots + x_{2n} t^{2n} \\ge 0, \\text{ for all } t,\n",
"\\end{eqnarray*}\n",
"where $\\mathbf{v}(t) = (1, t, \\ldots, t^{2n})^T$. The cone\n",
"\\begin{eqnarray*}\n",
"\t\\mathbf{K}^n = \\{\\mathbf{x} \\in \\mathbb{R}^{2n+1}: f(t) = \\mathbf{x}^T \\mathbf{v}(t) \\ge 0, \\text{ for all } t \\in \\mathbb{R}\\}\n",
"\\end{eqnarray*}\n",
"can be characterized by the LMI\n",
"\\begin{eqnarray*}\n",
"\tf(t) \\ge 0 \\text{ for all } t \\Leftrightarrow x_i = \\langle \\mathbf{X}, \\mathbf{H}_i \\rangle, i=0,\\ldots,2n, \\mathbf{X} \\in \\mathbf{S}_+^{n+1},\n",
"\\end{eqnarray*}\n",
"where $\\mathbf{H}_i \\in \\mathbb{R}^{(n+1) \\times (n+1)}$ are Hankel matrices with entries $(\\mathbf{H}_i)_{kl} = 1$ if $k+l=i$ and 0 otherwise. Here $k, l \\in \\{0, 1, \\ldots, n\\}$.\n",
"\n",
"* Similarly, the cone of polynomials nonnegative on a finite interval\n",
"\\begin{eqnarray*}\n",
"\t\\mathbf{K}_{a,b}^n = \\{\\mathbf{x} \\in \\mathbb{R}^{n+1}: f(t) = \\mathbf{x}^T \\mathbf{v}(t) \\ge 0, \\text{ for all } t \\in [a,b]\\}\n",
"\\end{eqnarray*}\n",
"can be characterized by LMIs as well; see p48 of the MOSEK modeling manual (MosekModelling.pdf).\n",
"    * (Even degree) Let $n = 2m$. Then\n",
"    \\begin{eqnarray*}\n",
"    \\mathbf{K}_{a,b}^n &=& \\{\\mathbf{x} \\in \\mathbb{R}^{n+1}: x_i = \\langle \\mathbf{X}_1, \\mathbf{H}_i^m \\rangle + \\langle \\mathbf{X}_2, (a+b) \\mathbf{H}_{i-1}^{m-1} - ab \\mathbf{H}_i^{m-1} - \\mathbf{H}_{i-2}^{m-1} \\rangle, \\\\\n",
"    & & \\quad i = 0,\\ldots,n, \\mathbf{X}_1 \\in \\mathbf{S}_+^m, \\mathbf{X}_2 \\in \\mathbf{S}_+^{m-1}\\}.\n",
"    \\end{eqnarray*}\n",
"    * (Odd degree) Let $n = 2m + 1$. Then\n",
"    \\begin{eqnarray*}\n",
"    \\mathbf{K}_{a,b}^n &=& \\{\\mathbf{x} \\in \\mathbb{R}^{n+1}: x_i = \\langle \\mathbf{X}_1, \\mathbf{H}_{i-1}^m - a \\mathbf{H}_i^m \\rangle + \\langle \\mathbf{X}_2, b \\mathbf{H}_{i}^{m} - \\mathbf{H}_{i-1}^{m} \\rangle, \\\\\n",
"    & & \\quad i = 0,\\ldots,n, \\mathbf{X}_1, \\mathbf{X}_2 \\in \\mathbf{S}_+^m\\}.\n",
"    \\end{eqnarray*}\n",
"\n",
"* References: [Nesterov (2000)](https://doi.org/10.1007/978-1-4757-3216-0_17) and Lecture 4 (p157-p159) of [Ben-Tal and Nemirovski (2001)](https://doi.org/10.1137/1.9780898718829).\n",
"\n",
"* Example. Polynomial curve fitting. We want to fit a univariate polynomial of degree $n$,\n",
"\\begin{eqnarray*}\n",
"\tf(t) = x_0 + x_1 t + x_2 t^2 + \\cdots + x_n t^n,\n",
"\\end{eqnarray*}\n",
"to a set of measurements $(t_i, y_i)$, $i=1,\\ldots,m$, such that $f(t_i) \\approx y_i$. Define the Vandermonde matrix\n",
"\\begin{eqnarray*}\n",
"\t\\mathbf{A} = \\begin{pmatrix}\n",
"\t1 & t_1 & t_1^2 & \\cdots & t_1^n \\\\\n",
"\t1 & t_2 & t_2^2 & \\cdots & t_2^n \\\\\n",
"\t\\vdots & \\vdots & \\vdots & & \\vdots \\\\\n",
"\t1 & t_m & t_m^2 & \\cdots & t_m^n\n",
"\t\\end{pmatrix};\n",
"\\end{eqnarray*}\n",
"then we wish $\\mathbf{A} \\mathbf{x} \\approx \\mathbf{y}$. Using the least squares criterion, we obtain the optimal solution $\\mathbf{x}_{\\text{LS}} = (\\mathbf{A}^T \\mathbf{A})^{-1} \\mathbf{A}^T \\mathbf{y}$. With various shape constraints, it is still possible to find the optimal $\\mathbf{x}$ by SDP.\n",
"    * Nonnegativity. We require $\\mathbf{x} \\in \\mathbf{K}_{a,b}^n$.\n",
"    * Monotonicity. We can ensure monotonicity of $f(t)$ by requiring that $f'(t) \\ge 0$ or $f'(t) \\le 0$. That is, $(x_1, 2x_2, \\ldots, nx_n) \\in \\mathbf{K}_{a,b}^{n-1}$ or $-(x_1, 2x_2, \\ldots, nx_n) \\in \\mathbf{K}_{a,b}^{n-1}$.\n",
"    * Convexity or concavity. Convexity or concavity of $f(t)$ corresponds to $f''(t) \\ge 0$ or $f''(t) \\le 0$. That is, $(2x_2, 6x_3, \\ldots, (n-1)nx_n) \\in \\mathbf{K}_{a,b}^{n-2}$ or $-(2x_2, 6x_3, \\ldots, (n-1)nx_n) \\in \\mathbf{K}_{a,b}^{n-2}$.\n",
"\n",
"* `nonneg_poly_coeffs()` and `convex_poly_coeffs()` are implemented in cvx, but not in Convex.jl yet."
] }, { "cell_type": "markdown", "metadata": {}, "source": [
"## SDP relaxation of binary optimization\n",
"\n",
"* Consider a binary linear optimization problem\n",
"\\begin{eqnarray*}\n",
"\t&\\text{minimize}& \\mathbf{c}^T \\mathbf{x} \\\\\n",
"\t&\\text{subject to}& \\mathbf{A} \\mathbf{x} = \\mathbf{b}, \\quad \\mathbf{x} \\in \\{0,1\\}^n.\n",
"\\end{eqnarray*}\n",
"Note\n",
"\\begin{eqnarray*}\n",
"\t& & x_i \\in \\{0,1\\} \n",
"\t\\Leftrightarrow x_i^2 = x_i \n",
"\t\\Leftrightarrow \\mathbf{X} = \\mathbf{x} \\mathbf{x}^T, \\text{diag}(\\mathbf{X}) = \\mathbf{x}.\n",
"\\end{eqnarray*}\n",
"By relaxing the rank-1 constraint on $\\mathbf{X}$, we obtain the SDP relaxation\n",
"\\begin{eqnarray*}\n",
"\t&\\text{minimize}& \\mathbf{c}^T \\mathbf{x} \\\\\n",
"\t&\\text{subject to}& \\mathbf{A} \\mathbf{x} = \\mathbf{b}, \\text{diag}(\\mathbf{X}) = \\mathbf{x}, \\mathbf{X} \\succeq \\mathbf{x} \\mathbf{x}^T,\n",
"\\end{eqnarray*}\n",
"which can be solved efficiently and provides a lower bound for the original problem. If the optimal $\\mathbf{X}$ has rank 1, then it also yields a solution to the original binary problem. Note $\\mathbf{X} \\succeq \\mathbf{x} \\mathbf{x}^T$ is equivalent to the LMI\n",
"\\begin{eqnarray*}\n",
"\t\\begin{pmatrix}\n",
"\t1 & \\mathbf{x}^T \\\\ \\mathbf{x} &\\mathbf{X}\n",
"\t\\end{pmatrix} \\succeq \\mathbf{0}.\n",
"\\end{eqnarray*}\n",
"We can tighten the relaxation by adding other constraints that cut away part of the feasible set without excluding rank-1 solutions, for instance $0 \\le x_i \\le 1$ and $0 \\le X_{ij} \\le 1$. A Convex.jl sketch follows this cell."
] },
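{ "cell_type": "markdown", "metadata": {}, "source": [
"A hedged Convex.jl sketch of the relaxation above, with the constraint $\\mathbf{X} \\succeq \\mathbf{x} \\mathbf{x}^T$ written in its LMI form. The problem data `c`, `A`, `b` are random placeholders, with `b` chosen so a binary feasible point exists:\n",
"\n",
"```julia\n",
"using Convex, SCS, LinearAlgebra\n",
"\n",
"n, p = 6, 2\n",
"c = randn(n)\n",
"A = randn(p, n)\n",
"b = A * rand(0:1, n)             # made-up right-hand side from a binary point\n",
"\n",
"x = Variable(n)\n",
"X = Variable(n, n)\n",
"S = Semidefinite(n + 1)          # encodes the LMI [1 x'; x X] PSD\n",
"problem = minimize(c' * x,\n",
"    [A * x == b, diag(X) == x, [1 x'; x X] == S,\n",
"     x >= 0, x <= 1, X >= 0, X <= 1])\n",
"solve!(problem, SCS.Optimizer)\n",
"problem.optval                   # lower bound on the binary optimum\n",
"```"
] },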
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## SDP relaxation of boolean optimization\n", "\n", "* For Boolean constraints $\\mathbf{x} \\in \\{-1, 1\\}^n$, we note\n", "\\begin{eqnarray*}\n", "\t& & x_i \\in \\{0,1\\} \\Leftrightarrow \\mathbf{X} = \\mathbf{x} \\mathbf{x}^T, \\text{diag}(\\mathbf{X}) = \\mathbf{1}.\n", "\\end{eqnarray*}" ] } ], "metadata": { "@webio": { "lastCommId": null, "lastKernelId": null }, "kernelspec": { "display_name": "Julia 1.1.0", "language": "julia", "name": "julia-1.1" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.1.0" }, "toc": { "colors": { "hover_highlight": "#DAA520", "running_highlight": "#FF0000", "selected_highlight": "#FFD700" }, "moveMenuLeft": true, "nav_menu": { "height": "180px", "width": "252px" }, "navigate_menu": true, "number_sections": true, "sideBar": true, "skip_h1_title": true, "threshold": 4, "toc_cell": false, "toc_position": { "height": "390.3333435058594px", "left": "0px", "right": "818px", "top": "140.6666717529297px", "width": "180px" }, "toc_section_display": "block", "toc_window_display": true, "widenNotebook": false } }, "nbformat": 4, "nbformat_minor": 2 }