{ "cells": [ { "cell_type": "markdown", "metadata": { "toc": "true" }, "source": [ "# Table of Contents\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Optimization Examples - Quadratic Programming\n", "\n", "## Quadratic programming (QP)\n", "\n", "\n", "\n", "* A **quadratic program** (QP) has quadratic objective function and affine constraint functions\n", "\\begin{eqnarray*}\n", "\t&\\text{minimize}& (1/2) \\mathbf{x}^T \\mathbf{P} \\mathbf{x} + \\mathbf{q}^T \\mathbf{x} + r \\\\\n", "\t&\\text{subject to}& \\mathbf{G} \\mathbf{x} \\preceq \\mathbf{h} \\\\\n", "\t& & \\mathbf{A} \\mathbf{x} = \\mathbf{b},\n", "\\end{eqnarray*}\n", "where we require $\\mathbf{P} \\in \\mathbf{S}_+^n$ (why?). Apparently LP is a special case of QP with $\\mathbf{P} = \\mathbf{0}_{n \\times n}$.\n", "\n", "## Examples\n", "\n", "* Example. The _least squares_ problem minimizes $\\|\\mathbf{y} - \\mathbf{X} \\beta\\|_2^2$, which obviously is a QP.\n", "\n", "* Example. Least squares with linear constraints. For example, _nonnegative least squares_ (NNLS)\n", "\\begin{eqnarray*}\n", "\t&\\text{minimize}& \\frac 12 \\|\\mathbf{y} - \\mathbf{X} \\beta\\|_2^2 \\\\\n", "\t&\\text{subject to}& \\beta \\succeq \\mathbf{0}.\n", "\\end{eqnarray*}\n", "\n", " In NNMF (nonnegative matrix factorization), the objective $\\|\\mathbf{X} - \\mathbf{V} \\mathbf{W}\\|_{\\text{F}}^2$ can be minimized by alternating NNLS.\n", "\n", "* Example. Lasso regression [Tibshirani (1996)](https://www.jstor.org/stable/2346178), [Donoho (1994)](https://doi.org/10.1093/biomet/81.3.425) minimizes the least squares loss with $\\ell_1$ (lasso) penalty\n", "\\begin{eqnarray*}\n", "\t&\\text{minimize}& \\frac 12 \\|\\mathbf{y} - \\beta_0 \\mathbf{1} - \\mathbf{X} \\beta\\|_2^2 + \\lambda \\|\\beta\\|_1,\n", "\\end{eqnarray*}\n", "where $\\lambda \\ge 0$ is a tuning parameter. 
Writing $\\beta = \\beta^+ - \\beta^-$ with $\\beta^+, \\beta^- \\succeq \\mathbf{0}$ (so that $\\mathbf{1}^T (\\beta^+ + \\beta^-) = \\|\\beta\\|_1$ at the optimum) and profiling out the intercept $\\beta_0$, which introduces the centering matrix $\\mathbf{I} - \\mathbf{1} \\mathbf{1}^T / n$, the equivalent QP is\n", "\\begin{eqnarray*}\n", "\t&\\text{minimize}& \\frac 12 (\\beta^+ - \\beta^-)^T \\mathbf{X}^T \\left(\\mathbf{I} - \\frac{\\mathbf{1} \\mathbf{1}^T}{n} \\right) \\mathbf{X} (\\beta^+ - \\beta^-) - \\\\\n", "\t& & \\quad \\mathbf{y}^T \\left(\\mathbf{I} - \\frac{\\mathbf{1} \\mathbf{1}^T}{n} \\right) \\mathbf{X} (\\beta^+ - \\beta^-) + \\lambda \\mathbf{1}^T (\\beta^+ + \\beta^-) \\\\\n", "\t&\\text{subject to}& \\beta^+ \\succeq \\mathbf{0}, \\, \\beta^- \\succeq \\mathbf{0}\n", "\\end{eqnarray*}\n", "in $\\beta^+$ and $\\beta^-$.\n", "\n", "\n", "\n", "\n", "\n", "* Example: Elastic net [Zou and Hastie (2005)](https://www.jstor.org/stable/3647580)\n", "\\begin{eqnarray*}\n", "\t&\\text{minimize}& \\frac 12 \\|\\mathbf{y} - \\beta_0 \\mathbf{1} - \\mathbf{X} \\beta\\|_2^2 + \\lambda (\\alpha \\|\\beta\\|_1 + (1-\\alpha) \\|\\beta\\|_2^2),\n", "\\end{eqnarray*}\n", "where $\\lambda \\ge 0$ and $\\alpha \\in [0,1]$ are tuning parameters.\n", "\n", "* Example: Image denoising by anisotropic penalty. See