{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "***\n", "# Problem Sheet 2 Part A - Solutions\n", "***\n", "$\\newcommand{\\vct}[1]{\\mathbf{#1}}$\n", "$\\newcommand{\\mtx}[1]{\\mathbf{#1}}$\n", "$\\newcommand{\\e}{\\varepsilon}$\n", "$\\newcommand{\\norm}[1]{\\|#1\\|}$\n", "$\\newcommand{\\minimize}{\\text{minimize}\\quad}$\n", "$\\newcommand{\\maximize}{\\text{maximize}\\quad}$\n", "$\\newcommand{\\subjto}{\\quad\\text{subject to}\\quad}$\n", "$\\newcommand{\\R}{\\mathbb{R}}$\n", "$\\newcommand{\\trans}{T}$\n", "$\\newcommand{\\ip}[2]{\\langle {#1}, {#2} \\rangle}$\n", "$\\newcommand{\\zerovct}{\\vct{0}}$\n", "$\\newcommand{\\diff}[1]{\\mathrm{d}{#1}}$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Solution to Problem 2.1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(a) We apply the bound inductively,\n", "\n", "\\begin{equation*}\n", " \\norm{\\vct{x}_k-\\vct{x}^*}\\leq r\\cdot \\norm{\\vct{x}_{k-1}-\\vct{x}^*}\\leq r \\cdot (r \\cdot \\norm{\\vct{x}_{k-2}-\\vct{x}^*})\\leq \\cdots \\leq r^k\\cdot \\norm{\\vct{x}_{0}-\\vct{x}^*}.\n", " \\end{equation*}\n", "\n", "(b) Let $\\e>0$. We are guaranteed to have an error bounded by $\\e$ if $r^N\\cdot M<\\e$, by Part (a).\n", "Taking logarithms of this inequality,\n", "\n", "\\begin{equation*}\n", " N\\ln(r) + \\ln(M)\\leq \\ln(\\e).\n", "\\end{equation*}\n", "\n", "Negating this, we get\n", "\n", "\\begin{equation*}\n", " -N\\ln(r) - \\ln(M)=N\\ln(1/r) - \\ln(M)\\geq \\ln(1/\\e).\n", "\\end{equation*}\n", "\n", "Dividing by $\\ln(1/r)$ gives\n", "\n", "\\begin{equation*}\n", " N \\geq \\frac{1}{\\ln(1/r)} \\left(\\ln(1/\\e)+\\ln(M)\\right) > \\frac{r}{1-r}\\left(\\ln(M)+\\ln(1/\\e)\\right),\n", "\\end{equation*}\n", "\n", "where we used the inequality $\\ln(1/r)<1/r-1 = (1-r)/r$. \n", "\n", "(c) For quadratic convergence, the bound is derived in exactly the same way as for linear convergence. To determine the number of steps, we start with the bound\n", "\n", "\\begin{equation*}\n", " C^N\\cdot M^{2^N}\\leq \\e.\n", "\\end{equation*}\n", "\n", "Taking logarithms and negating,\n", "\n", "\\begin{equation*}\n", " N\\ln(1/C)-2^N\\ln(M) \\geq \\ln(1/\\e).\n", "\\end{equation*}\n", "\n", "Taking logarithms again, we get\n", "\n", "\\begin{equation*}\n", " N\\cdot \\left(\\frac{\\log_2(N)}{N}+\\log_2(\\ln(1/c)-\\ln(M))\\right) \\geq \\log_2 (\\ln(1/\\e)),\n", "\\end{equation*}\n", "\n", "so that if \n", "\n", "\\begin{equation*}\n", " N> C'\\cdot \\ln\\ln(1/\\e)\n", "\\end{equation*}\n", "\n", "for a constant $C'$, we are guaranteed an error below $\\e$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Solution to Problem 2.2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We first compute the derivatives,\n", "\n", "\\begin{align*}\n", " f(x) &= \\sqrt{x^2+1}\\\\\n", " f'(x) &= \\frac{x}{\\sqrt{x^2+1}}\\\\\n", " f''(x) &= \\frac{1}{(x^2+1)^{3/2}}.\n", "\\end{align*}\n", "\n", "Note that the second derivative is always positive.\n", "Newton's method then has the following form\n", "\n", "\\begin{equation*}\n", " x_{k+1} = x_k - \\frac{f'(x_k)}{f''(x_k)} = x_k - x_k(1+x_k^2) = -x_k^3.\n", "\\end{equation*}\n", "\n", "For $|x_0|<1$ this clearly converges to $0$, while for $x_0>1$ this diverges. For $|x_0|=1$ the sequence alternates between $1$ and $-1$.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Solution to Problem 2.3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We first have to think about what it means to be a steepest descent direction with respect to a norm. If we look for a vector $\\vct{p}$ with $\\norm{\\vct{p}}_\\infty=1$ such that $\\ip{\\vct{p}}{\\nabla f(\\vct{x})}$ is minimal. \n", "Set $\\vct{v}=\\nabla f(\\vct{x})$, to ease notation. Since $\\norm{\\vct{p}}_\\infty = 1$ is the same as saying that $\\max_{1\\leq i\\leq n} |p_i| = 1$, this amounts to solving the minimization problem\n", "\n", "\\begin{align*}\n", " \\minimize& \\ip{\\vct{p}}{\\vct{v}}\\\\\n", " \\subjto & -1 \\leq p_i \\leq 1, \\quad 1\\leq i\\leq n.\n", "\\end{align*}\n", "\n", "Suppose that a minimizer is found and the minimum has the form\n", "\n", "\\begin{equation*}\n", " \\sum_{i=1}^d p_i v_i.\n", "\\end{equation*}\n", "\n", "If $p_iv_i>0$, then we can decrease the objective function further by changing the sign of $p_i$, and then even further by setting $p_i=-1$ if $\\mathrm{sign} \\ v_i=1$ and $p_i=1$ otherwise. Therefore, the optimizer has the form\n", "\n", "\\begin{equation*}\n", " p_i = -\\mathrm{sign} \\ \\nabla f(\\vct{x})_i\n", "\\end{equation*}\n", "\n", "for $1\\leq i\\leq n$.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python [py27]", "language": "python", "name": "Python [py27]" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.12" } }, "nbformat": 4, "nbformat_minor": 0 }