{ "cells": [ { "cell_type": "markdown", "metadata": { "toc": "true" }, "source": [ "# Table of Contents\n", "

1  Optimization Examples - Quadratic Programming
1.1  Quadratic programming (QP)
1.2  Examples
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Optimization Examples - Quadratic Programming\n", "\n", "## Quadratic programming (QP)\n", "\n", "\n", "\n", "* A **quadratic program** (QP) has quadratic objective function and affine constraint functions\n", "\\begin{eqnarray*}\n", "\t&\\text{minimize}& (1/2) \\mathbf{x}^T \\mathbf{P} \\mathbf{x} + \\mathbf{q}^T \\mathbf{x} + r \\\\\n", "\t&\\text{subject to}& \\mathbf{G} \\mathbf{x} \\preceq \\mathbf{h} \\\\\n", "\t& & \\mathbf{A} \\mathbf{x} = \\mathbf{b},\n", "\\end{eqnarray*}\n", "where we require $\\mathbf{P} \\in \\mathbf{S}_+^n$ (why?). Apparently LP is a special case of QP with $\\mathbf{P} = \\mathbf{0}_{n \\times n}$.\n", "\n", "## Examples\n", "\n", "* Example. The _least squares_ problem minimizes $\\|\\mathbf{y} - \\mathbf{X} \\beta\\|_2^2$, which obviously is a QP.\n", "\n", "* Example. Least squares with linear constraints. For example, _nonnegative least squares_ (NNLS)\n", "\\begin{eqnarray*}\n", "\t&\\text{minimize}& \\frac 12 \\|\\mathbf{y} - \\mathbf{X} \\beta\\|_2^2 \\\\\n", "\t&\\text{subject to}& \\beta \\succeq \\mathbf{0}.\n", "\\end{eqnarray*}\n", "\n", " In NNMF (nonnegative matrix factorization), the objective $\\|\\mathbf{X} - \\mathbf{V} \\mathbf{W}\\|_{\\text{F}}^2$ can be minimized by alternating NNLS.\n", "\n", "* Example. Lasso regression [Tibshirani (1996)](https://www.jstor.org/stable/2346178), [Donoho (1994)](https://doi.org/10.1093/biomet/81.3.425) minimizes the least squares loss with $\\ell_1$ (lasso) penalty\n", "\\begin{eqnarray*}\n", "\t&\\text{minimize}& \\frac 12 \\|\\mathbf{y} - \\beta_0 \\mathbf{1} - \\mathbf{X} \\beta\\|_2^2 + \\lambda \\|\\beta\\|_1,\n", "\\end{eqnarray*}\n", "where $\\lambda \\ge 0$ is a tuning parameter. Writing $\\beta = \\beta^+ - \\beta^-$, the equivalent QP is\n", "\\begin{eqnarray*}\n", "\t&\\text{minimize}& \\frac 12 (\\beta^+ - \\beta^-)^T \\mathbf{X}^T \\left(\\mathbf{I} - \\frac{\\mathbf{1} \\mathbf{1}^T}{n} \\right) \\mathbf{X} (\\beta^+ - \\beta^-) + \\\\\n", "\t& & \\quad \\mathbf{y}^T \\left(\\mathbf{I} - \\frac{\\mathbf{1} \\mathbf{1}^T}{n} \\right) \\mathbf{X} (\\beta^+ - \\beta^-) + \\lambda \\mathbf{1}^T (\\beta^+ + \\beta^-) \\\\\n", "\t&\\text{subject to}& \\beta^+ \\succeq \\mathbf{0}, \\, \\beta^- \\succeq \\mathbf{0}\n", "\\end{eqnarray*}\n", "in $\\beta^+$ and $\\beta^-$.\n", "\n", "\n", "\n", "\n", "\n", "* Example: Elastic net [Zou and Hastie (2005)](https://www.jstor.org/stable/3647580)\n", "\\begin{eqnarray*}\n", "\t&\\text{minimize}& \\frac 12 \\|\\mathbf{y} - \\beta_0 \\mathbf{1} - \\mathbf{X} \\beta\\|_2^2 + \\lambda (\\alpha \\|\\beta\\|_1 + (1-\\alpha) \\|\\beta\\|_2^2),\n", "\\end{eqnarray*}\n", "where $\\lambda \\ge 0$ and $\\alpha \\in [0,1]$ are tuning parameters.\n", "\n", "* Example: Image denoising by anisotropic penalty. See .\n", "\n", "* Example: (Linearly) constrained lasso\n", "\\begin{eqnarray*}\n", "\t&\\text{minimize}& \\frac 12 \\|\\mathbf{y} - \\beta_0 \\mathbf{1} - \\mathbf{X} \\beta\\|_2^2 + \\lambda \\|\\beta\\|_1 \\\\\n", "\t&\\text{subject to}& \\mathbf{G} \\beta \\preceq \\mathbf{h} \\\\\n", "\t& & \\mathbf{A} \\beta = \\mathbf{b},\n", "\\end{eqnarray*}\n", "where $\\lambda \\ge 0$ is a tuning parameter. \n", "\n", "* Example: The Huber loss function \n", "\\begin{eqnarray*}\n", "\t\\phi(r) = \\begin{cases}\n", "\tr^2 & |r| \\le M \\\\\n", "\tM(2|r| - M) & |r| > M\n", "\t\\end{cases}\n", "\\end{eqnarray*}\n", "is commonly used in robust statistics. The robust regression problem\n", "\\begin{eqnarray*}\n", "\t&\\text{minimize}& \\sum_{i=1}^n \\phi(y_i - \\beta_0 - \\mathbf{x}_i^T \\beta)\n", "\\end{eqnarray*}\n", "can be transformed to a QP\n", "\\begin{eqnarray*}\n", "\t&\\text{minimize}& \\mathbf{u}^T \\mathbf{u} + 2 M \\mathbf{1}^T \\mathbf{v} \\\\\n", "\t&\\text{subject to}& - \\mathbf{u} - \\mathbf{v} \\preceq \\mathbf{y} - \\mathbf{X} \\beta \\preceq \\mathbf{u} + \\mathbf{v} \\\\\n", "\t& & \\mathbf{0} \\preceq \\mathbf{u} \\preceq M \\mathbf{1}, \\mathbf{v} \\succeq \\mathbf{0}\n", "\\end{eqnarray*}\n", "in $\\mathbf{u}, \\mathbf{v} \\in \\mathbb{R}^n$ and $\\beta \\in \\mathbb{R}^p$. Hint: write $|r_i| = (|r_i| \\wedge M) + (|r_i| - M)_+ = u_i + v_i$.\n", "\n", "\n", "\n", "* Example: Support vector machines (SVM). In two-class classification problems, we are given training data $(\\mathbf{x}_i, y_i)$, $i=1,\\ldots,n$, where $\\mathbf{x}_i \\in \\mathbb{R}^n$ are feature vector and $y_i \\in \\{-1, 1\\}$ are class labels. Support vector machine solves the optimization problem\n", "\\begin{eqnarray*}\n", "\t&\\text{minimize}& \\sum_{i=1}^n \\left[ 1 - y_i \\left( \\beta_0 + \\sum_{j=1}^p x_{ij} \\beta_j \\right) \\right]_+ + \\lambda \\|\\beta\\|_2^2,\n", "\\end{eqnarray*}\n", "where $\\lambda \\ge 0$ is a tuning parameters. This is a QP." ] } ], "metadata": { "@webio": { "lastCommId": null, "lastKernelId": null }, "kernelspec": { "display_name": "Julia 1.1.0", "language": "julia", "name": "julia-1.1" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.1.0" }, "toc": { "colors": { "hover_highlight": "#DAA520", "running_highlight": "#FF0000", "selected_highlight": "#FFD700" }, "moveMenuLeft": true, "nav_menu": { "height": "64.4000015258789px", "width": "251.60000610351562px" }, "navigate_menu": true, "number_sections": true, "sideBar": true, "skip_h1_title": true, "threshold": 4, "toc_cell": true, "toc_position": { "height": "399.3333435058594px", "left": "0px", "right": "899px", "top": "140.6666717529297px", "width": "151px" }, "toc_section_display": "block", "toc_window_display": true, "widenNotebook": false } }, "nbformat": 4, "nbformat_minor": 2 }