{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "无约束最优化问题\n", "\\begin{align*} \\\\& \\min_{x \\in R^{n}} f\\left(x\\right)\\end{align*} \n", "其中$x^{*}$为目标函数的极小点。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "设$f\\left(x\\right)$具有二阶连续偏导数,若第$k$次迭代值为$x^{\\left(k\\right)}$,则可将$f\\left(x\\right)$在$x^{\\left(k\\right)}$附近进行二阶泰勒展开\n", "\\begin{align*} \\\\& f\\left(x\\right) = f\\left(x^{\\left(k\\right)}\\right)+g_{k}^{T}\\left(x-x^{\\left(k\\right)}\\right)+\\dfrac{1}{2}\\left(x-x^{\\left(k\\right)}\\right)^{T} H\\left(x^{\\left(k\\right)}\\right)\\left(x-x^{\\left(x\\right)}\\right)\\end{align*} \n", "其中,$g_{k}=g\\left(x^{\\left(k\\right)}\\right)=\\nabla f\\left(x^{\\left(k\\right)}\\right)$是$f\\left(x\\right)$的梯度向量在点$x^{\\left(k\\right)}$的值,$H\\left(x^{\\left(k\\right)}\\right)$是$f\\left(x\\right)$的海赛矩阵\n", "\\begin{align*} \\\\& H\\left(x\\right)=\\left[\\dfrac{\\partial^{2}f}{\\partial x_{i} \\partial x_{j}}\\right]_{n \\times n}\\end{align*} \n", "在点$x^{\\left(k\\right)}$的值。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "函数$f\\left(x\\right)$有极值的必要条件是在极值点处一阶导数为0,即梯度向量为0。特别的当$H\\left(x^{\\left(k\\right)}\\right)$是正定矩阵时,函数$f\\left(x\\right)$的极值为极小值。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "假设$x^{\\left(k+1\\right)}$满足\n", "\\begin{align*} \\\\& \\nabla f\\left(x^{\\left(k+1\\right)}\\right)=0\\end{align*} \n", "根据二阶泰勒展开,得\n", "\\begin{align*} \\\\& \\nabla f\\left(x\\right)=g_{k}+H_{k}\\left(x-x^{\\left(x\\right)}\\right)\\end{align*} \n", "其中,$H_{k}=H\\left(x^{\\left(k\\right)}\\right)$,则\n", "\\begin{align*} \\\\& g_{k}+H_{k}\\left(x^{\\left(k+1\\right)}-x^{\\left(x\\right)}\\right)=0\n", "\\\\ & x^{\\left(k+1\\right)}=x^{\\left(k\\right)}-H_{k}^{-1}g_{k}\\end{align*} \n", "令\n", "\\begin{align*} \\\\& H_{k}p_{k}=-g_{k}\\end{align*} \n", "则\n", "\\begin{align*} \\\\& x^{\\left(k+1\\right)}=x^{\\left(k\\right)}+p_{k}\\end{align*} " ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "牛顿法: \n", "输入:目标函数$f\\left(x\\right)$,梯度$g\\left(x\\right)=\\nabla f\\left(x\\right)$,海赛矩阵$H\\left(x\\right)$,精度要求$\\varepsilon$ \n", "输出:$f\\left(x\\right)$的极小点$x^{*}$ \n", "1. 取初始点$x^{\\left(0\\right)}$,置$k=0$ \n", "2. 计算$g_{k}=g\\left(x^{\\left(k\\right)}\\right)$ \n", "3. 若$\\|g_{k}\\| < \\varepsilon$则停止计算,得近似解$x^{*}=x^{\\left(k\\right)}$ \n", "4. 计算$H_{k}=H\\left(x^{\\left(k\\right)}\\right)$,并求$p_{k}$\n", "\\begin{align*} \\\\& H_{k}p_{k}=-g_{k}\\end{align*}\n", "5. 置$x^{\\left(k+1\\right)}=x^{\\left(k\\right)}+p_{k}$\n", "6. 置$k=k+1$,转2." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "设牛顿法搜索方向是$p_{k}=-\\lambda g_{k}$\n", "由\n", "\\begin{align*} \\\\& x^{\\left(k+1\\right)}=x^{\\left(k\\right)}-H_{k}^{-1}g_{k} \\end{align*} \n", "有\n", "\\begin{align*} \\\\& x=x^{\\left(k\\right)}-\\lambda H_{k}^{-1} g_{k}=x^{\\left(k\\right)}+\\lambda p_{k} \\end{align*}\n", "则$f\\left(x\\right)$在$x^{\\left(k\\right)}$的泰勒展开可近似为\n", "\\begin{align*} \\\\& f\\left(x\\right)=f\\left(x^{\\left(k\\right)}\\right)-\\lambda g_{k}^{T} H_{k}^{-1} g_{k}\\end{align*}\n", "由于$H_{k}^{-1}$正定,故$g_{k}^{T} H_{k}^{-1} g_{k} > 0$。当$\\lambda$为一个充分小的正数时,有$f\\left(x\\right) < f\\left(x^{\\left(x\\right)}\\right)$,即搜索方向$p_{k}$是下降方向。" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "取$x=x^{\\left(k+1\\right)}$,由\n", "\\begin{align*} \\\\& \\nabla f\\left(x\\right)=g_{k}+H_{k}\\left(x-x^{\\left(x\\right)}\\right)\\end{align*} \n", "得\n", "\\begin{align*} \\\\& g_{k+1}-g_{k}=H_{k}\\left(x^{\\left(k+1\\right)}-x^{\\left(x\\right)}\\right)\\end{align*} \n", "记$y_{k}=g_{k+1}-g_{k},\\delta_{k}=x^{\\left(k+1\\right)}-x^{\\left(k\\right)}$,则\n", "\\begin{align*} \\\\& y_{k}=H_{k}\\delta_{k}\n", "\\\\ & H_{k}^{-1}y_{k}=\\delta_{k}\\end{align*} \n", "称为拟牛顿条件。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "DFP算法中选择$G_{k}$作为$H_{k}^{-1}$的近似,假设每一步迭代中矩阵$G_{k+1}$是由$G_{k}$加上两个附加项构成,即\n", "\\begin{align*} \\\\& G_{k+1}=G_{k}+P_{k}+Q_{k}\\end{align*}\n", "其中,$P_{k}$与$Q_{k}$是待定矩阵。则\n", "\\begin{align*} \\\\& G_{k+1}y_{k}=G_{k}y_{k}+P_{k}y_{k}+Q_{k}y_{k}\\end{align*}\n", "为使$G_{k+1}$满足拟牛顿条件,可使$P_{k}$与$Q_{k}$满足\n", "\\begin{align*} \\\\& P_{k}y_{k}=\\delta_{k}\n", "\\\\ & Q_{k}y_{k}=-G_{k}y_{k}\\end{align*}\n", "可取\n", "\\begin{align*} \\\\& P_{k}= \\dfrac{\\delta_{k} \\delta_{k}^{T}}{\\delta_{k}^{T} y_{k}}\n", "\\\\ & Q_{k}=- \\dfrac{G_{k}y_{k}y_{k}^{T}G_{k}}{y_{k}^{T}G_{k}y_{k}}\\end{align*}\n", "可得矩阵$G_{k+1}$的迭代公式\n", "\\begin{align*} \\\\& G_{k+1}=G_{k}+\\dfrac{\\delta_{k} \\delta_{k}^{T}}{\\delta_{k}^{T} y_{k}}- \\dfrac{G_{k}y_{k}y_{k}^{T}G_{k}}{y_{k}^{T}G_{k}y_{k}}\\end{align*}\n", "可以证明,如果初始矩阵$G_{0}$是正定的,则迭代过程中的每个矩阵$G_{k}$都是正定的。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "DFP算法: \n", "输入:目标函数$f\\left(x\\right)$,梯度$g\\left(x\\right)=\\nabla f\\left(x\\right)$,精度要求$\\varepsilon$ \n", "输出:$f\\left(x\\right)$的极小点$x^{*}$ \n", "1. 取初始点$x^{\\left(0\\right)}$,取$G_{0}为正定矩阵,$置$k=0$ \n", "2. 计算$g_{k}=g\\left(x^{\\left(k\\right)}\\right)$ 若$\\|g_{k}\\| < \\varepsilon$则停止计算,得近似解$x^{*}=x^{\\left(k\\right)}$;否则,转3. \n", "3. 置$p_{k}=-G_{k}g_{k}$\n", "4. 一维搜索,求$\\lambda_{k}$使\n", "\\begin{align*} \\\\& f\\left(x^{\\left(k\\right)}+\\lambda_{k}p_{k}\\right)=\\min_{\\lambda \\geq 0} f\\left(x^{\\left(k\\right)}+\\lambda p_{k}\\right)\\end{align*}\n", "5. 置$x^{\\left(k+1\\right)}=x^{\\left(k\\right)}+\\lambda p_{k}$\n", "6. 计算$g_{k+1}=g\\left(x^{\\left(k+1\\right)}\\right)$,若$\\|g_{k+1}\\| < \\varepsilon$,则停止计算,的近似解$x^{*}=x^{\\left(k+1\\right)}$;否则,计算\n", "\\begin{align*} \\\\& G_{k+1}=G_{k}+\\dfrac{\\delta_{k} \\delta_{k}^{T}}{\\delta_{k}^{T} y_{k}}- \\dfrac{G_{k}y_{k}y_{k}^{T}G_{k}}{y_{k}^{T}G_{k}y_{k}}\\end{align*}\n", "7. 置$k=k+1$,转3." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "BFGS算法中选择$B_{k}$逼近海赛矩阵$H_{k}$,相应的拟牛顿条件\n", "\\begin{align*} \\\\& B_{k+1} \\delta_{k} = y_{k}\\end{align*}\n", "假设每一步迭代中矩阵$B_{k+1}$是由$B_{k}$加上两个附加项构成,即\n", "\\begin{align*} \\\\& B_{k+1}=B_{k}+P_{k}+Q_{k}\\end{align*}\n", "其中,$P_{k}$与$Q_{k}$是待定矩阵。则\n", "\\begin{align*} \\\\& B_{k+1}y_{k}=B_{k}y_{k}+P_{k}y_{k}+Q_{k}y_{k}\\end{align*}\n", "为使$B_{k+1}$满足拟牛顿条件,可使$P_{k}$与$Q_{k}$满足\n", "\\begin{align*} \\\\& P_{k}\\delta_{k}=y_{k}\n", "\\\\ & Q_{k}\\delta_{k}=-B_{k}y_{k}\\delta_{k}\\end{align*}\n", "可取\n", "\\begin{align*} \\\\& P_{k}= \\dfrac{y_{k}y_{k}^{T}}{y_{k}^{T}\\delta_{k} }\n", "\\\\ & Q_{k}=- \\dfrac{B_{k}\\delta_{k}\\delta_{k}^{T}B_{k}}{\\delta_{k}^{T}B_{k}\\delta_{k}}\\end{align*}\n", "可得矩阵$B_{k+1}$的迭代公式\n", "\\begin{align*} \\\\& B_{k+1}=B_{k}+\\dfrac{y_{k}y_{k}^{T}}{y_{k}^{T}\\delta_{k} }- \\dfrac{B_{k}\\delta_{k}\\delta_{k}^{T}B_{k}}{\\delta_{k}^{T}B_{k}\\delta_{k}}\\end{align*}\n", "可以证明,如果初始矩阵$B_{0}$是正定的,则迭代过程中的每个矩阵$B_{k}$都是正定的。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "BFGS算法: \n", "输入:目标函数$f\\left(x\\right)$,梯度$g\\left(x\\right)=\\nabla f\\left(x\\right)$,精度要求$\\varepsilon$ \n", "输出:$f\\left(x\\right)$的极小点$x^{*}$ \n", "1. 取初始点$x^{\\left(0\\right)}$,取$B_{0}为正定矩阵,$置$k=0$ \n", "2. 计算$g_{k}=g\\left(x^{\\left(k\\right)}\\right)$ 若$\\|g_{k}\\| < \\varepsilon$则停止计算,得近似解$x^{*}=x^{\\left(k\\right)}$;否则,转3. \n", "3. 由$B_{k}p_{k}=-g_{k}$求出$p_{k}$\n", "4. 一维搜索,求$\\lambda_{k}$使\n", "\\begin{align*} \\\\& f\\left(x^{\\left(k\\right)}+\\lambda_{k}p_{k}\\right)=\\min_{\\lambda \\geq 0} f\\left(x^{\\left(k\\right)}+\\lambda p_{k}\\right)\\end{align*}\n", "5. 置$x^{\\left(k+1\\right)}=x^{\\left(k\\right)}+\\lambda p_{k}$\n", "6. 计算$g_{k+1}=g\\left(x^{\\left(k+1\\right)}\\right)$,若$\\|g_{k+1}\\| < \\varepsilon$,则停止计算,的近似解$x^{*}=x^{\\left(k+1\\right)}$;否则,计算\n", "\\begin{align*} \\\\& B_{k+1}=B_{k}+\\dfrac{y_{k}y_{k}^{T}}{y_{k}^{T}\\delta_{k} }- \\dfrac{B_{k}\\delta_{k}\\delta_{k}^{T}B_{k}}{\\delta_{k}^{T}B_{k}\\delta_{k}}\\end{align*}\n", "7. 置$k=k+1$,转3." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "记\n", "\\begin{align*} \\\\& G_{k}=B_{k}^{-1},\\quad G_{k+1}=B_{k+1}^{-1}\\end{align*}\n", "两次应用Sherman-Morrison公式,得\n", "\\begin{align*} \\\\& G_{k+1}=\\left(I- \\dfrac{\\delta_{k}y_{k}^{T}}{\\delta_{k}^{T}y_{k}}\\right)G_{k}\\left(I-\\dfrac{\\delta_{k}y_{k}^{T}}{\\delta_{k}^{T}y_{k}}\\right)^{T}+\\dfrac{\\delta_{k}\\delta_{k}^{T}}{\\delta_{k}^{T}y_{k}}\\end{align*}\n", "称为BFGS算法关于$G_{k}$的迭代公式。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "令由DFP算法$G_{k}$的迭代公式得到的$G_{k+1}$记作$G^{DFP}$,由BFGS算法$G_{k}$的迭代公式得到的$G_{k+1}$记作$G^{BFGS}$,\n", "由于$G^{DFP}$和$G^{BFGS}$均满足拟牛顿条件,\n", "则两者的线性组合\n", "\\begin{align*} \\\\& G_{k+1}=\\alpha G^{DFP}+\\left(1-\\alpha\\right) G^{BFGS}\\end{align*}\n", "也满足拟牛顿条件,而且是正定的。其中,$0 \\leq \\alpha \\leq 1$。该类算法称为Broyden类算法。" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.11" } }, "nbformat": 4, "nbformat_minor": 0 }