{ "cells": [ { "cell_type": "markdown", "metadata": { "toc": "true" }, "source": [ "# Table of Contents\n", "

- 1  Newton's Method for Constrained Optimization (BV Chapters 10, 11)
    - 1.1  Equality constraint
        - 1.1.1  KKT condition
        - 1.1.2  Newton algorithm
    - 1.2  Inequality constraint - interior point method
        - 1.2.1  Barrier method
        - 1.2.2  Primal-dual interior-point method
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Newton's Method for Constrained Optimization (BV Chapters 10, 11)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We only consider convex optimization in this lecture.\n", "\n", "## Equality constraint\n", "\n", "* Consider equality constrained optimization\n", "\\begin{eqnarray*}\n", " &\\text{minimize}& f(\\mathbf{x}) \\\\\n", " &\\text{subject to}& \\mathbf{A} \\mathbf{x} = \\mathbf{b},\n", "\\end{eqnarray*}\n", "where $f$ is convex.\n", "\n", "### KKT condition\n", "\n", "* The Langrangian function is\n", "\\begin{eqnarray*}\n", " L(\\mathbf{x}, \\lambda) = f(\\mathbf{x}) + \\nu^T(\\mathbf{A} \\mathbf{x} - \\mathbf{b}),\n", "\\end{eqnarray*}\n", "where $\\nu$ is the vector of Langrange multipliers. \n", "\n", "* Setting the gradient of Langrangian function to zero yields the optimality condition (**Karush-Kuhn-Tucker condition**)\n", "\\begin{eqnarray*}\n", " \\mathbf{A} \\mathbf{x}^\\star &=& \\mathbf{b} \\quad \\quad (\\text{primal feasibility condition}) \\\\\n", " \\nabla f(\\mathbf{x}^\\star) + \\mathbf{A}^T \\nu^\\star &=& \\mathbf{0} \\quad \\quad (\\text{dual feasibility condition})\n", "\\end{eqnarray*}\n", "\n", "### Newton algorithm\n", "\n", "* Let $\\mathbf{x}$ be a feasible point, i.e., $\\mathbf{A} \\mathbf{x} = \\mathbf{b}$, and denote **Newton direction** by $\\Delta \\mathbf{x}$. 
By a second-order Taylor expansion,\n", "\\begin{eqnarray*}\n", "\tf(\\mathbf{x} + \\Delta \\mathbf{x}) \\approx f(\\mathbf{x}) + \\nabla f(\\mathbf{x})^T \\Delta \\mathbf{x} + \\frac 12 \\Delta \\mathbf{x}^T \\nabla^2 f(\\mathbf{x}) \\Delta \\mathbf{x}.\n", "\\end{eqnarray*}\n", "To minimize the quadratic approximation subject to the constraint $\\mathbf{A}(\\mathbf{x} + \\Delta \\mathbf{x}) = \\mathbf{b}$, we solve the KKT equation\n", "\\begin{eqnarray*}\n", " \\begin{pmatrix}\n", " \\nabla^2 f(\\mathbf{x}) & \\mathbf{A}^T \\\\\n", " \\mathbf{A} & \\mathbf{0}\n", " \\end{pmatrix} \\begin{pmatrix}\n", " \\Delta \\mathbf{x} \\\\\n", " \\nu\n", " \\end{pmatrix} = \\begin{pmatrix}\n", " - \\nabla f(\\mathbf{x}) \\\\\n", " \\mathbf{0}\n", " \\end{pmatrix}\n", "\\end{eqnarray*}\n", "for the Newton direction. \n", "\n", "* When $\\nabla^2 f(\\mathbf{x})$ is positive definite and $\\mathbf{A}$ has full row rank, the KKT matrix is nonsingular, so the Newton direction is uniquely determined.\n", "\n", "* Line search is similar to the unconstrained case." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* **Infeasible Newton step**. So far we have assumed a feasible starting point. How do we derive a Newton step from an infeasible point $\\mathbf{x}$? 
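\n", "\n", "Before turning to the infeasible case, the feasible-start Newton step above can be sketched in code. This is a minimal illustration: the quadratic test problem, the constraint, and the helper name `newton_direction` are hypothetical, not from the text.\n", "\n", "
```julia
using LinearAlgebra

# Newton direction for: minimize f(x) subject to A x = b, from a feasible x.
# Solve the KKT system [∇²f(x) Aᵀ; A 0] [Δx; ν] = [-∇f(x); 0].
function newton_direction(grad, hess, A, x)
    n, m = length(x), size(A, 1)
    K = [hess(x) A'; A zeros(m, m)]
    sol = K \ [-grad(x); zeros(m)]
    return sol[1:n], sol[n+1:end]        # Newton direction Δx, multiplier ν
end

# Hypothetical example: f(x) = x'P x / 2 + q'x with constraint x₁ + x₂ = 1
P = [3.0 1.0; 1.0 2.0]
q = [-1.0, 1.0]
grad(x) = P * x + q
hess(x) = P
A = [1.0 1.0]
b = [1.0]

x = [1.0, 0.0]                           # feasible start: A * x = b
Δx, ν = newton_direction(grad, hess, A, x)
x += Δx                                  # for a quadratic f, one full step is exact
@assert norm(A * x - b) < 1e-10          # primal feasibility is preserved
@assert norm(grad(x) + A' * ν) < 1e-8    # dual feasibility holds at the new point
```
\n", "\n", "Since $\\mathbf{A} \\Delta \\mathbf{x} = \\mathbf{0}$, damped steps $\\mathbf{x} + s \\Delta \\mathbf{x}$ remain feasible for any step size $s$.\n", "\n", "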
Again from the KKT condition,\n", "\\begin{eqnarray*}\n", " \\begin{pmatrix}\n", " \\nabla^2 f(\\mathbf{x}) & \\mathbf{A}^T \\\\\n", " \\mathbf{A} & \\mathbf{0}\n", " \\end{pmatrix} \\begin{pmatrix}\n", " \\Delta \\mathbf{x} \\\\ \\omega\n", " \\end{pmatrix} = - \\begin{pmatrix} \\nabla f(\\mathbf{x}) \\\\ \\mathbf{A} \\mathbf{x} - \\mathbf{b}\n", " \\end{pmatrix}.\n", "\\end{eqnarray*}\n", "Writing the updated dual variable as $\\omega = \\nu + \\Delta \\nu$, we have the equivalent form in terms of the primal update $\\Delta \\mathbf{x}$ and dual update $\\Delta \\nu$\n", "\\begin{eqnarray*}\n", " \\begin{pmatrix}\n", " \\nabla^2 f(\\mathbf{x}) & \\mathbf{A}^T \\\\\n", " \\mathbf{A} & \\mathbf{0}\n", " \\end{pmatrix} \\begin{pmatrix}\n", " \\Delta \\mathbf{x} \\\\ \\Delta \\nu\n", " \\end{pmatrix} = - \\begin{pmatrix} \\nabla f(\\mathbf{x}) + \\mathbf{A}^T \\nu \\\\ \\mathbf{A} \\mathbf{x} - \\mathbf{b}\n", " \\end{pmatrix}.\n", "\\end{eqnarray*}\n", "The right-hand side is recognized as the primal and dual residuals. Therefore the infeasible Newton step can also be interpreted as a **primal-dual method**.\n", "\n", "* It can be shown that the norm of the residual decreases along the Newton direction. Therefore line search is based on the norm of the residual." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Inequality constraint - interior point method" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* We consider the constrained optimization problem\n", "\\begin{eqnarray*}\n", " &\\text{minimize}& f_0(\\mathbf{x}) \\\\\n", " &\\text{subject to}& f_i(\\mathbf{x}) \\le 0, \\quad i = 1,\\ldots,m \\\\\n", " & & \\mathbf{A} \\mathbf{x} = \\mathbf{b},\n", "\\end{eqnarray*}\n", "where $f_0, \\ldots, f_m: \\mathbb{R}^n \\mapsto \\mathbb{R}$ are convex and twice continuously differentiable, and $\\mathbf{A}$ has full row rank. 
\n", "\n", "* We assume the problem is solvable with optimal point $\\mathbf{x}^\\star$ and and optimal value $f_0(\\mathbf{x}^\\star) = p^\\star$. \n", "\n", "* KKT condition:\n", "\\begin{eqnarray*}\n", " \\mathbf{A} \\mathbf{x}^\\star = \\mathbf{b}, f_i(\\mathbf{x}^\\star) &\\le& 0, i = 1,\\ldots,m \\quad (\\text{primal feasibility}) \\\\\n", " \\lambda^\\star &\\succeq& \\mathbf{0} \\\\\n", " \\nabla f_0(\\mathbf{x}^\\star) + \\sum_{i=1}^m \\lambda_i^\\star \\nabla f_i(\\mathbf{x}^\\star) + \\mathbf{A}^T \\nu^\\star &=& \\mathbf{0} \\quad \\quad \\quad \\quad \\quad \\quad (\\text{dual feasibility}) \\\\\n", " \\lambda_i^\\star f_i(\\mathbf{x}^\\star) &=& 0, \\quad i = 1,\\ldots,m.\n", "\\end{eqnarray*}\n", "\n", "### Barrier method\n", "\n", "* Alternative form makes inequality constraints implicit in the objective\n", "\\begin{eqnarray*}\n", " &\\text{minimize}& f_0(\\mathbf{x}) + \\sum_{i=1}^m I_-(f_i(\\mathbf{x})) \\\\\n", " &\\text{subject to}& \\mathbf{A} \\mathbf{x} = \\mathbf{b},\n", "\\end{eqnarray*}\n", "where\n", "\\begin{eqnarray*}\n", " I_-(u) = \\begin{cases}\n", " 0 & u \\le 0 \\\\\n", " \\infty & u > 0\n", " \\end{cases}.\n", "\\end{eqnarray*}\n", "\n", "* The idea of the barrier method is to approximate $I_-$ by a differentiable function\n", "\\begin{eqnarray*}\n", " \\hat I_-(u) = - (1/t) \\log (-u), \\quad u < 0,\n", "\\end{eqnarray*}\n", "where $t>0$ is a parameter tuning the approximation accuracy. 
As $t$ increases, the approximation becomes more accurate.\n", "\n", "\n", "\n", "* The **barrier method** solves a sequence of equality-constrained problems\n", "\\begin{eqnarray*}\n", " &\\text{minimize}& t f_0(\\mathbf{x}) - \\sum_{i=1}^m \\log(-f_i(\\mathbf{x})) \\\\\n", " &\\text{subject to}& \\mathbf{A} \\mathbf{x} = \\mathbf{b},\n", "\\end{eqnarray*}\n", "increasing the parameter $t$ at each step and starting each Newton minimization at the solution for the previous value of $t$.\n", "\n", "* The function $\\phi(\\mathbf{x}) = - \\sum_{i=1}^m \\log (-f_i(\\mathbf{x}))$ is called the **logarithmic barrier** or **log barrier** function.\n", " \n", "* Denote the solution at $t$ by $\\mathbf{x}^\\star(t)$. Using duality theory, it can be shown that\n", "\\begin{eqnarray*}\n", " f_0(\\mathbf{x}^\\star(t)) - p^\\star \\le m / t.\n", "\\end{eqnarray*}\n", "\n", "\n", "\n", "* **Feasibility and phase I methods**. The barrier method has to start from a **strictly feasible point**. We can find such a point by solving\n", "\\begin{eqnarray*}\n", " &\\text{minimize}& s \\\\\n", " &\\text{subject to}& f_i(\\mathbf{x}) \\le s, \\quad i = 1,\\ldots,m \\\\\n", " & & \\mathbf{A} \\mathbf{x} = \\mathbf{b}\n", "\\end{eqnarray*}\n", "by the barrier method. A point $(\\mathbf{x}, s)$ feasible for this problem with $s < 0$ gives a strictly feasible $\\mathbf{x}$ for the original problem." 
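, "\n", "\n", "The barrier outer iteration can be sketched as follows. This is a minimal illustration for a problem with inequality constraints only (no equality constraint), using a damped Newton inner loop; the test problem, the tolerances, and the multiplier mu = 10 are hypothetical choices, not from the text.\n", "\n", "
```julia
using LinearAlgebra

# Barrier method sketch: minimize f0(x) s.t. fi(x) <= 0, i = 1,...,m.
# Each outer iteration centers t*f0(x) - sum_i log(-fi(x)) by Newton,
# then multiplies t by mu; the bound m/t serves as the stopping criterion.
function barrier_method(grad_t, hess_t, strictly_feasible, x;
                        m = 1, t = 1.0, mu = 10.0, tol = 1e-6)
    while m / t > tol
        for _ in 1:50                    # inner Newton (centering) loop
            g = grad_t(x, t)
            norm(g) < 1e-8 && break
            Δx = -(hess_t(x, t) \ g)
            s = 1.0                      # halve the step until strictly feasible
            while !strictly_feasible(x + s * Δx)
                s /= 2
            end
            x += s * Δx
        end
        t *= mu                          # tighten the barrier
    end
    return x
end

# Hypothetical example: minimize x1 + x2 subject to x1^2 + x2^2 <= 1,
# with log barrier phi(x) = -log(1 - x'x).
grad_t(x, t) = t * [1.0, 1.0] + 2x / (1 - dot(x, x))
hess_t(x, t) = (2(1 - dot(x, x)) * I(2) + 4x * x') / (1 - dot(x, x))^2
strictly_feasible(x) = dot(x, x) < 1

xt = barrier_method(grad_t, hess_t, strictly_feasible, [0.0, 0.0])
# the true minimizer is (-1/sqrt(2), -1/sqrt(2))
```
\n", "\n", "Each centering subproblem is warm-started at the solution for the previous $t$; with an equality constraint present, the inner loop would instead solve the equality-constrained KKT system from the previous section."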
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Primal-dual interior-point method\n", "\n", "* Difference from barrier method: no double loop.\n", "\n", "* In the barrier method, it can be show that a point $\\mathbf{x}$ is equal to $\\mathbf{x}^\\star(t)$ if and only if \n", "\\begin{eqnarray*}\n", " \\nabla f_0(\\mathbf{x}) + \\sum_{i=1}^m \\lambda_i \\nabla f_i(\\mathbf{x}) + \\mathbf{A}^T \\nu &=& \\mathbf{0} \\\\\n", " - \\lambda_i f_i(\\mathbf{x}) &=& 1/t, \\quad i = 1,\\ldots,m \\\\\n", " \\mathbf{A} \\mathbf{x} &=& \\mathbf{b}.\n", "\\end{eqnarray*}\n", "\n", "* We define the KKT residual \n", "\\begin{eqnarray*}\n", " r_t(\\mathbf{x}, \\lambda, \\nu) = \\begin{pmatrix}\n", " \\nabla f_0(\\mathbf{x}) + Df(\\mathbf{x})^T \\lambda + \\mathbf{A}^T \\nu \\\\\n", " - \\text{diag}(\\lambda) f(\\mathbf{x}) - (1/t) \\mathbf{1} \\\\\n", " \\mathbf{A} \\mathbf{x} - \\mathbf{b}\n", " \\end{pmatrix} \\triangleq \\begin{pmatrix}\n", " r_{\\text{dual}} \\\\\n", " r_{\\text{cent}} \\\\\n", " r_{\\text{pri}}\n", " \\end{pmatrix},\n", "\\end{eqnarray*}\n", "where\n", "\\begin{eqnarray*}\n", " f(\\mathbf{x}) = \\begin{pmatrix}\n", " f_1(\\mathbf{x}) \\\\\n", " \\vdots \\\\\n", " f_m(\\mathbf{x})\n", " \\end{pmatrix}, \\quad Df(\\mathbf{x}) = \\begin{pmatrix}\n", " \\nabla f_1(\\mathbf{x})^T \\\\\n", " \\vdots \\\\\n", " \\nabla f_m(\\mathbf{x})^T\n", " \\end{pmatrix}.\n", "\\end{eqnarray*}\n", "\n", "* Denote the current point and Newton step as \n", "\\begin{eqnarray*}\n", " \\mathbf{y} = (\\mathbf{x}, \\lambda, \\nu), \\quad \\Delta \\mathbf{y} = (\\Delta \\mathbf{x}, \\Delta \\lambda, \\Delta \\nu).\n", "\\end{eqnarray*}\n", "In view of the linear equation\n", "\\begin{eqnarray*}\n", " r_t(\\mathbf{y} + \\Delta \\mathbf{y}) \\approx r_t(\\mathbf{y}) + Dr_t(\\mathbf{y}) \\Delta \\mathbf{y} = \\mathbf{0},\n", "\\end{eqnarray*}\n", "we solve $\\Delta \\mathbf{y} = - D r_t(\\mathbf{y})^{-1} r_t(\\mathbf{y})$, i.e.,\n", "\\begin{eqnarray*}\n", " 
\\begin{pmatrix}\n", " \\nabla^2 f_0(\\mathbf{x}) + \\sum_{i=1}^m \\lambda_i \\nabla^2 f_i(\\mathbf{x}) & Df(\\mathbf{x})^T & \\mathbf{A}^T \\\\\n", " - \\text{diag}(\\lambda) Df(\\mathbf{x}) & - \\text{diag}(f(\\mathbf{x})) & \\mathbf{0} \\\\\n", " \\mathbf{A} & \\mathbf{0} & \\mathbf{0} \n", " \\end{pmatrix} \\begin{pmatrix}\n", " \\Delta \\mathbf{x} \\\\\n", " \\Delta \\lambda \\\\\n", " \\Delta \\nu\n", " \\end{pmatrix} = - \\begin{pmatrix}\n", " r_{\\text{dual}} \\\\\n", " r_{\\text{cent}} \\\\\n", " r_{\\text{pri}} \n", " \\end{pmatrix}\n", "\\end{eqnarray*}\n", "for the **primal-dual search direction**." ] } ], "metadata": { "@webio": { "lastCommId": null, "lastKernelId": null }, "kernelspec": { "display_name": "Julia 1.1.0", "language": "julia", "name": "julia-1.1" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.1.0" }, "toc": { "colors": { "hover_highlight": "#DAA520", "running_highlight": "#FF0000", "selected_highlight": "#FFD700" }, "moveMenuLeft": true, "nav_menu": { "height": "68px", "width": "252px" }, "navigate_menu": true, "number_sections": true, "sideBar": true, "skip_h1_title": true, "threshold": 4, "toc_cell": true, "toc_position": { "height": "370.6000061035156px", "left": "0px", "right": "651.4000244140625px", "top": "61.400001525878906px", "width": "116.5999984741211px" }, "toc_section_display": "block", "toc_window_display": true, "widenNotebook": false } }, "nbformat": 4, "nbformat_minor": 2 }