{ "cells": [ { "cell_type": "markdown", "metadata": { "toc": "true" }, "source": [ "# Table of Contents\n", "

- 1  Newton's Method for Constrained Optimization (BV Chapters 10, 11)
    - 1.1  Equality constraint
        - 1.1.1  KKT condition
        - 1.1.2  Newton algorithm
    - 1.2  Inequality constraint - interior point method
        - 1.2.1  Barrier method
        - 1.2.2  Primal-dual interior-point method
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Newton's Method for Constrained Optimization (BV Chapters 10, 11)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We only consider convex optimization in this lecture.\n", "\n", "## Equality constraint\n", "\n", "* Consider equality constrained optimization\n", "\\begin{eqnarray*}\n", " &\\text{minimize}& f(\\mathbf{x}) \\\\\n", " &\\text{subject to}& \\mathbf{A} \\mathbf{x} = \\mathbf{b},\n", "\\end{eqnarray*}\n", "where $f$ is convex.\n", "\n", "### KKT condition\n", "\n", "* The Langrangian function is\n", "\\begin{eqnarray*}\n", " L(\\mathbf{x}, \\lambda) = f(\\mathbf{x}) + \\nu^T(\\mathbf{A} \\mathbf{x} - \\mathbf{b}),\n", "\\end{eqnarray*}\n", "where $\\nu$ is the vector of Langrange multipliers. \n", "\n", "* Setting the gradient of Langrangian function to zero yields the optimality condition (**Karush-Kuhn-Tucker condition**)\n", "\\begin{eqnarray*}\n", " \\mathbf{A} \\mathbf{x}^\\star &=& \\mathbf{b} \\quad \\quad (\\text{primal feasibility condition}) \\\\\n", " \\nabla f(\\mathbf{x}^\\star) + \\mathbf{A}^T \\nu^\\star &=& \\mathbf{0} \\quad \\quad (\\text{dual feasibility condition})\n", "\\end{eqnarray*}\n", "\n", "### Newton algorithm\n", "\n", "* Let $\\mathbf{x}$ be a feasible point, i.e., $\\mathbf{A} \\mathbf{x} = \\mathbf{b}$, and denote **Newton direction** by $\\Delta \\mathbf{x}$. 
By a second-order Taylor expansion,\n", "\\begin{eqnarray*}\n", "\tf(\\mathbf{x} + \\Delta \\mathbf{x}) \\approx f(\\mathbf{x}) + \\nabla f(\\mathbf{x})^T \\Delta \\mathbf{x} + \\frac 12 \\Delta \\mathbf{x}^T \\nabla^2 f(\\mathbf{x}) \\Delta \\mathbf{x}.\n", "\\end{eqnarray*}\n", "To minimize the quadratic approximation subject to the constraint $\\mathbf{A}(\\mathbf{x} + \\Delta \\mathbf{x}) = \\mathbf{b}$, we solve the KKT equation\n", "\\begin{eqnarray*}\n", " \\begin{pmatrix}\n", " \\nabla^2 f(\\mathbf{x}) & \\mathbf{A}^T \\\\\n", " \\mathbf{A} & \\mathbf{0}\n", " \\end{pmatrix} \\begin{pmatrix}\n", " \\Delta \\mathbf{x} \\\\\n", " \\nu\n", " \\end{pmatrix} = \\begin{pmatrix}\n", " - \\nabla f(\\mathbf{x}) \\\\\n", " \\mathbf{0}\n", " \\end{pmatrix}\n", "\\end{eqnarray*}\n", "for the Newton direction. \n", "\n", "* When $\\nabla^2 f(\\mathbf{x})$ is positive definite and $\\mathbf{A}$ has full row rank, the KKT matrix is nonsingular, so the Newton direction is uniquely determined.\n", "\n", "* Line search is similar to the unconstrained case." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* **Infeasible Newton step**. So far we have assumed a feasible starting point. How do we derive a Newton step from an infeasible point $\\mathbf{x}$? 
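\n", "\n", "Before turning to the infeasible case, the feasible-start Newton step above can be sketched in code. This is a minimal illustration: the quadratic test problem, the constraint, and the helper name `newton_direction` are hypothetical, not from the text.\n", "\n", "
```julia
using LinearAlgebra

# Newton direction for: minimize f(x) subject to A x = b, from a feasible x.
# Solve the KKT system [∇²f(x) Aᵀ; A 0] [Δx; ν] = [-∇f(x); 0].
function newton_direction(grad, hess, A, x)
    n, m = length(x), size(A, 1)
    K = [hess(x) A'; A zeros(m, m)]
    sol = K \ [-grad(x); zeros(m)]
    return sol[1:n], sol[n+1:end]        # Newton direction Δx, multiplier ν
end

# Hypothetical example: f(x) = x'P x / 2 + q'x with constraint x₁ + x₂ = 1
P = [3.0 1.0; 1.0 2.0]
q = [-1.0, 1.0]
grad(x) = P * x + q
hess(x) = P
A = [1.0 1.0]
b = [1.0]

x = [1.0, 0.0]                           # feasible start: A * x = b
Δx, ν = newton_direction(grad, hess, A, x)
x += Δx                                  # for a quadratic f, one full step is exact
@assert norm(A * x - b) < 1e-10          # primal feasibility is preserved
@assert norm(grad(x) + A' * ν) < 1e-8    # dual feasibility holds at the new point
```
\n", "\n", "Since $\\mathbf{A} \\Delta \\mathbf{x} = \\mathbf{0}$, damped steps $\\mathbf{x} + s \\Delta \\mathbf{x}$ remain feasible for any step size $s$.\n", "\n", "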
Again from the KKT condition,\n", "\\begin{eqnarray*}\n", " \\begin{pmatrix}\n", " \\nabla^2 f(\\mathbf{x}) & \\mathbf{A}^T \\\\\n", " \\mathbf{A} & \\mathbf{0}\n", " \\end{pmatrix} \\begin{pmatrix}\n", " \\Delta \\mathbf{x} \\\\ \\omega\n", " \\end{pmatrix} = - \\begin{pmatrix} \\nabla f(\\mathbf{x}) \\\\ \\mathbf{A} \\mathbf{x} - \\mathbf{b}\n", " \\end{pmatrix}.\n", "\\end{eqnarray*}\n", "Writing the updated dual variable as $\\omega = \\nu + \\Delta \\nu$, we have the equivalent form in terms of the primal update $\\Delta \\mathbf{x}$ and dual update $\\Delta \\nu$\n", "\\begin{eqnarray*}\n", " \\begin{pmatrix}\n", " \\nabla^2 f(\\mathbf{x}) & \\mathbf{A}^T \\\\\n", " \\mathbf{A} & \\mathbf{0}\n", " \\end{pmatrix} \\begin{pmatrix}\n", " \\Delta \\mathbf{x} \\\\ \\Delta \\nu\n", " \\end{pmatrix} = - \\begin{pmatrix} \\nabla f(\\mathbf{x}) + \\mathbf{A}^T \\nu \\\\ \\mathbf{A} \\mathbf{x} - \\mathbf{b}\n", " \\end{pmatrix}.\n", "\\end{eqnarray*}\n", "The right-hand side is recognized as the primal and dual residuals. Therefore the infeasible Newton step can also be interpreted as a **primal-dual method**.\n", "\n", "* It can be shown that the norm of the residual decreases along the Newton direction. Therefore line search is based on the norm of the residual." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Inequality constraint - interior point method" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* We consider the constrained optimization problem\n", "\\begin{eqnarray*}\n", " &\\text{minimize}& f_0(\\mathbf{x}) \\\\\n", " &\\text{subject to}& f_i(\\mathbf{x}) \\le 0, \\quad i = 1,\\ldots,m \\\\\n", " & & \\mathbf{A} \\mathbf{x} = \\mathbf{b},\n", "\\end{eqnarray*}\n", "where $f_0, \\ldots, f_m: \\mathbb{R}^n \\mapsto \\mathbb{R}$ are convex and twice continuously differentiable, and $\\mathbf{A}$ has full row rank. 
\n", "\n", "* We assume the problem is solvable with optimal point $\\mathbf{x}^\\star$ and and optimal value $f_0(\\mathbf{x}^\\star) = p^\\star$. \n", "\n", "* KKT condition:\n", "\\begin{eqnarray*}\n", " \\mathbf{A} \\mathbf{x}^\\star = \\mathbf{b}, f_i(\\mathbf{x}^\\star) &\\le& 0, i = 1,\\ldots,m \\quad (\\text{primal feasibility}) \\\\\n", " \\lambda^\\star &\\succeq& \\mathbf{0} \\\\\n", " \\nabla f_0(\\mathbf{x}^\\star) + \\sum_{i=1}^m \\lambda_i^\\star \\nabla f_i(\\mathbf{x}^\\star) + \\mathbf{A}^T \\nu^\\star &=& \\mathbf{0} \\quad \\quad \\quad \\quad \\quad \\quad (\\text{dual feasibility}) \\\\\n", " \\lambda_i^\\star f_i(\\mathbf{x}^\\star) &=& 0, \\quad i = 1,\\ldots,m.\n", "\\end{eqnarray*}\n", "\n", "### Barrier method\n", "\n", "* Alternative form makes inequality constraints implicit in the objective\n", "\\begin{eqnarray*}\n", " &\\text{minimize}& f_0(\\mathbf{x}) + \\sum_{i=1}^m I_-(f_i(\\mathbf{x})) \\\\\n", " &\\text{subject to}& \\mathbf{A} \\mathbf{x} = \\mathbf{b},\n", "\\end{eqnarray*}\n", "where\n", "\\begin{eqnarray*}\n", " I_-(u) = \\begin{cases}\n", " 0 & u \\le 0 \\\\\n", " \\infty & u > 0\n", " \\end{cases}.\n", "\\end{eqnarray*}\n", "\n", "* The idea of the barrier method is to approximate $I_-$ by a differentiable function\n", "\\begin{eqnarray*}\n", " \\hat I_-(u) = - (1/t) \\log (-u), \\quad u < 0,\n", "\\end{eqnarray*}\n", "where $t>0$ is a parameter tuning the approximation accuracy. 
As $t$ increases, the approximation becomes more accurate.\n", "\n", "\n", "\n", "* The **barrier method** solves a sequence of equality-constrained problems\n", "\\begin{eqnarray*}\n", " &\\text{minimize}& t f_0(\\mathbf{x}) - \\sum_{i=1}^m \\log(-f_i(\\mathbf{x})) \\\\\n", " &\\text{subject to}& \\mathbf{A} \\mathbf{x} = \\mathbf{b},\n", "\\end{eqnarray*}\n", "increasing the parameter $t$ at each step and starting each Newton minimization at the solution for the previous value of $t$.\n", "\n", "* The function $\\phi(\\mathbf{x}) = - \\sum_{i=1}^m \\log (-f_i(\\mathbf{x}))$ is called the **logarithmic barrier** or **log barrier** function.\n", " \n", "* Denote the solution at $t$ by $\\mathbf{x}^\\star(t)$. Using duality theory, it can be shown that\n", "\\begin{eqnarray*}\n", " f_0(\\mathbf{x}^\\star(t)) - p^\\star \\le m / t.\n", "\\end{eqnarray*}\n", "\n", "\n", "\n", "* **Feasibility and phase I methods**. The barrier method has to start from a **strictly feasible point**. We can find such a point by solving\n", "\\begin{eqnarray*}\n", " &\\text{minimize}& s \\\\\n", " &\\text{subject to}& f_i(\\mathbf{x}) \\le s, \\quad i = 1,\\ldots,m \\\\\n", " & & \\mathbf{A} \\mathbf{x} = \\mathbf{b}\n", "\\end{eqnarray*}\n", "by the barrier method. A point $(\\mathbf{x}, s)$ feasible for this problem with $s < 0$ gives a strictly feasible $\\mathbf{x}$ for the original problem." 
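, "\n", "\n", "The barrier outer iteration can be sketched as follows. This is a minimal illustration for a problem with inequality constraints only (no equality constraint), using a damped Newton inner loop; the test problem, the tolerances, and the multiplier mu = 10 are hypothetical choices, not from the text.\n", "\n", "
```julia
using LinearAlgebra

# Barrier method sketch: minimize f0(x) s.t. fi(x) <= 0, i = 1,...,m.
# Each outer iteration centers t*f0(x) - sum_i log(-fi(x)) by Newton,
# then multiplies t by mu; the bound m/t serves as the stopping criterion.
function barrier_method(grad_t, hess_t, strictly_feasible, x;
                        m = 1, t = 1.0, mu = 10.0, tol = 1e-6)
    while m / t > tol
        for _ in 1:50                    # inner Newton (centering) loop
            g = grad_t(x, t)
            norm(g) < 1e-8 && break
            Δx = -(hess_t(x, t) \ g)
            s = 1.0                      # halve the step until strictly feasible
            while !strictly_feasible(x + s * Δx)
                s /= 2
            end
            x += s * Δx
        end
        t *= mu                          # tighten the barrier
    end
    return x
end

# Hypothetical example: minimize x1 + x2 subject to x1^2 + x2^2 <= 1,
# with log barrier phi(x) = -log(1 - x'x).
grad_t(x, t) = t * [1.0, 1.0] + 2x / (1 - dot(x, x))
hess_t(x, t) = (2(1 - dot(x, x)) * I(2) + 4x * x') / (1 - dot(x, x))^2
strictly_feasible(x) = dot(x, x) < 1

xt = barrier_method(grad_t, hess_t, strictly_feasible, [0.0, 0.0])
# the true minimizer is (-1/sqrt(2), -1/sqrt(2))
```
\n", "\n", "Each centering subproblem is warm-started at the solution for the previous $t$; with an equality constraint present, the inner loop would instead solve the equality-constrained KKT system from the previous section."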
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Primal-dual interior-point method\n", "\n", "* Difference from barrier method: no double loop.\n", "\n", "* In the barrier method, it can be show that a point $\\mathbf{x}$ is equal to $\\mathbf{x}^\\star(t)$ if and only if \n", "\\begin{eqnarray*}\n", " \\nabla f_0(\\mathbf{x}) + \\sum_{i=1}^m \\lambda_i \\nabla f_i(\\mathbf{x}) + \\mathbf{A}^T \\nu &=& \\mathbf{0} \\\\\n", " - \\lambda_i f_i(\\mathbf{x}) &=& 1/t, \\quad i = 1,\\ldots,m \\\\\n", " \\mathbf{A} \\mathbf{x} &=& \\mathbf{b}.\n", "\\end{eqnarray*}\n", "\n", "* We define the KKT residual \n", "\\begin{eqnarray*}\n", " r_t(\\mathbf{x}, \\lambda, \\nu) = \\begin{pmatrix}\n", " \\nabla f_0(\\mathbf{x}) + Df(\\mathbf{x})^T \\lambda + \\mathbf{A}^T \\nu \\\\\n", " - \\text{diag}(\\lambda) f(\\mathbf{x}) - (1/t) \\mathbf{1} \\\\\n", " \\mathbf{A} \\mathbf{x} - \\mathbf{b}\n", " \\end{pmatrix} \\triangleq \\begin{pmatrix}\n", " r_{\\text{dual}} \\\\\n", " r_{\\text{cent}} \\\\\n", " r_{\\text{pri}}\n", " \\end{pmatrix},\n", "\\end{eqnarray*}\n", "where\n", "\\begin{eqnarray*}\n", " f(\\mathbf{x}) = \\begin{pmatrix}\n", " f_1(\\mathbf{x}) \\\\\n", " \\vdots \\\\\n", " f_m(\\mathbf{x})\n", " \\end{pmatrix}, \\quad Df(\\mathbf{x}) = \\begin{pmatrix}\n", " \\nabla f_1(\\mathbf{x})^T \\\\\n", " \\vdots \\\\\n", " \\nabla f_m(\\mathbf{x})^T\n", " \\end{pmatrix}.\n", "\\end{eqnarray*}\n", "\n", "* Denote the current point and Newton step as \n", "\\begin{eqnarray*}\n", " \\mathbf{y} = (\\mathbf{x}, \\lambda, \\nu), \\quad \\Delta \\mathbf{y} = (\\Delta \\mathbf{x}, \\Delta \\lambda, \\Delta \\nu).\n", "\\end{eqnarray*}\n", "In view of the linear equation\n", "\\begin{eqnarray*}\n", " r_t(\\mathbf{y} + \\Delta \\mathbf{y}) \\approx r_t(\\mathbf{y}) + Dr_t(\\mathbf{y}) \\Delta \\mathbf{y} = \\mathbf{0},\n", "\\end{eqnarray*}\n", "we solve $\\Delta \\mathbf{y} = - D r_t(\\mathbf{y})^{-1} r_t(\\mathbf{y})$, i.e.,\n", "\\begin{eqnarray*}\n", " 
\\begin{pmatrix}\n", " \\nabla^2 f_0(\\mathbf{x}) + \\sum_{i=1}^m \\lambda_i \\nabla^2 f_i(\\mathbf{x}) & Df(\\mathbf{x})^T & \\mathbf{A}^T \\\\\n", " - \\text{diag}(\\lambda) Df(\\mathbf{x}) & - \\text{diag}(f(\\mathbf{x})) & \\mathbf{0} \\\\\n", " \\mathbf{A} & \\mathbf{0} & \\mathbf{0} \n", " \\end{pmatrix} \\begin{pmatrix}\n", " \\Delta \\mathbf{x} \\\\\n", " \\Delta \\lambda \\\\\n", " \\Delta \\nu\n", " \\end{pmatrix} = - \\begin{pmatrix}\n", " r_{\\text{dual}} \\\\\n", " r_{\\text{cent}} \\\\\n", " r_{\\text{pri}} \n", " \\end{pmatrix}\n", "\\end{eqnarray*}\n", "for the **primal-dual search direction**." ] } ], "metadata": { "@webio": { "lastCommId": null, "lastKernelId": null }, "kernelspec": { "display_name": "Julia 1.1.0", "language": "julia", "name": "julia-1.1" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.1.0" }, "toc": { "colors": { "hover_highlight": "#DAA520", "running_highlight": "#FF0000", "selected_highlight": "#FFD700" }, "moveMenuLeft": true, "nav_menu": { "height": "68px", "width": "252px" }, "navigate_menu": true, "number_sections": true, "sideBar": true, "skip_h1_title": true, "threshold": 4, "toc_cell": true, "toc_position": { "height": "370.6000061035156px", "left": "0px", "right": "651.4000244140625px", "top": "61.400001525878906px", "width": "116.5999984741211px" }, "toc_section_display": "block", "toc_window_display": true, "widenNotebook": false } }, "nbformat": 4, "nbformat_minor": 2 }