{ "cells": [ { "cell_type": "markdown", "id": "4159be0e", "metadata": { "hideCode": false, "hidePrompt": false, "slideshow": { "slide_type": "slide" } }, "source": [ "# 2023-02-01 Convergence classes\n", "\n", "* Office hours: Monday 9-10pm, Tuesday 2-3pm, Thursday 2-3pm\n", "* This week will stay virtual. Plan is to start in-person the following Monday (Jan 31)\n", "\n", "## Last time\n", "\n", "* Forward and backward error\n", "* Computing volume of a polygon\n", "* Rootfinding examples\n", "* Use [Roots.jl](https://juliapackages.com/p/roots) to solve\n", "* Introduce Bisection\n", "\n", "## Today\n", "\n", "* Discuss rootfinding as a modeling tool\n", "* Limitations of bisection\n", "* Convergence classes\n", "* Intro to Newton methods" ] }, { "cell_type": "code", "execution_count": 136, "id": "eb781a7a", "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "using Plots\n", "default(linewidth=4)" ] }, { "cell_type": "markdown", "id": "6b384a37", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Stability demo: Volume of a polygon" ] }, { "cell_type": "code", "execution_count": 156, "id": "f45333ac", "metadata": { "cell_style": "split", "scrolled": true }, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n" ] }, "execution_count": 156, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X = ([1 0; 2 1; 1 3; 0 1; -1 1.5; -2 -1; .5 -2; 1 0])\n", "R(θ) = [cos(θ) -sin(θ); sin(θ) cos(θ)]\n", "Y = X * R(deg2rad(10))' .+ [1e4 1e4]\n", "plot(Y[:,1], Y[:,2], seriestype=:shape, aspect_ratio=:equal)" ] }, { "cell_type": "code", "execution_count": 157, "id": "1ea4925d", "metadata": { "cell_style": "split" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "pvolume(Y) = 9.249999989226126\n" ] }, { "data": { "text/plain": [ "8×2 Matrix{Float64}:\n", " 1.0 0.0\n", " 2.0 1.0\n", " 1.0 3.0\n", " 0.0 1.0\n", " -1.0 1.5\n", " -2.0 -1.0\n", " 0.5 -2.0\n", " 1.0 0.0" ] }, "execution_count": 157, "metadata": {}, "output_type": "execute_result" } ], "source": [ "using LinearAlgebra\n", "function pvolume(X)\n", " n = size(X, 1)\n", " vol = sum(det(X[i:i+1, :]) / 2 for i in 1:n-1)\n", "end\n", "\n", "@show pvolume(Y)\n", "[det(Y[i:i+1, :]) for i in 1:size(Y, 1)-1]\n", "A = Y[2:3, :]\n", "sum([A[1,1] * A[2,2], - A[1,2] * A[2,1]])\n", "X" ] }, { "cell_type": "markdown", "id": "7f7fce01", "metadata": {}, "source": [ "* Why don't we even get the first digit correct? These numbers are only $10^8$ and $\\epsilon_{\\text{machine}} \\approx 10^{-16}$ so shouldn't we get about 8 digits correct?" ] }, { "cell_type": "markdown", "id": "cc20233c", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Rootfinding\n", "\n", "> Given $f(x)$, find $x$ such that $f(x) = 0$.\n", "\n", "We'll work with scalars ($f$ and $x$ are just numbers) for now, and revisit later when they vector-valued." ] }, { "cell_type": "markdown", "id": "eb9574ab", "metadata": { "cell_style": "split" }, "source": [ "## Change inputs to outputs\n", "\n", "* $f(x; b) = x^2 - b$\n", " * $x(b) = \\sqrt{b}$\n", "* $f(x; b) = \\tan x - b$\n", " * $x(b) = \\arctan b$\n", "* $f(x) = \\cos x + x - b$\n", " * $x(b) = ?$" ] }, { "cell_type": "markdown", "id": "b7c09eb2", "metadata": { "cell_style": "split", "slideshow": { "slide_type": "" } }, "source": [ "We aren't given $f(x)$, but rather an algorithm `f(x)` that approximates it.\n", "* Sometimes we get extra information, like `fp(x)` that approximates $f'(x)$\n", "* If we have source code for `f(x)`, maybe it can be transformed \"automatically\"" ] }, { "cell_type": "markdown", "id": "3156c885", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Nesting matryoshka dolls of rootfinding" ] }, { "cell_type": "markdown", "id": "d1cfa8e7", "metadata": { "cell_style": "split" }, "source": [ "* I have a function (pressure, temperature) $\\mapsto$ (density, energy)\n", " * I need (density, energy) $\\mapsto$ (pressure, temperature)\n", "* I know what condition is satisfied across waves, but not the state on each side. \n", "\n", "![](../img/Toro-ShockOrRarefaction-2.19.png)" ] }, { "cell_type": "markdown", "id": "cafab69f", "metadata": { "cell_style": "split" }, "source": [ "* Given a proposed state, I can measure how much (mass, momentum, energy) is not conserved.\n", " * Find the solution that conserves exactly\n", "* I can compute lift given speed and angle of attack\n", " * How much can the plane lift?\n", " * How much can it lift on a short runway?\n", "![](../img/Boeing_A2_isoQspeed2_lowRes.png)" ] }, { "cell_type": "markdown", "id": "388e9918", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Discuss: rootfinding and inverse problems\n", "\n", "* Inverse kinematics for robotics\n", " * (statics) how much does each joint an robotic arm need to move to grasp an object\n", " * (with momentum) fastest way to get there (motors and arms have finite strength)\n", " * similar for virtual reality\n", "* Infer a light source (or lenses/mirrors) from an iluminated scene\n", "* Imaging\n", " * radiologist seeing an x-ray is mentally inferring 3D geometry from 2D image\n", " * computed tomography (CT) reconstructs 3D model from multiple 2D x-ray images\n", " * seismology infers subsurface/deep earth structure from seismograpms" ] }, { "cell_type": "markdown", "id": "0c4ecccb", "metadata": { "cell_style": "center", "slideshow": { "slide_type": "slide" } }, "source": [ "\n", "\n", "# Iterative bisection" ] }, { "cell_type": "code", "execution_count": 158, "id": "76489e23", "metadata": { "cell_style": "split" }, "outputs": [ { "data": { "text/plain": [ "bisect_iter (generic function with 1 method)" ] }, "execution_count": 158, "metadata": {}, "output_type": "execute_result" } ], "source": [ "f(x) = cos(x) - x\n", "hasroot(f, a, b) = f(a) * f(b) < 0\n", "function bisect_iter(f, a, b, tol)\n", " hist = Float64[]\n", " while abs(b - a) > tol\n", " mid = (a + b) / 2\n", " push!(hist, mid)\n", " if hasroot(f, a, mid)\n", " b = mid\n", " else\n", " a = mid\n", " end\n", " end\n", " hist\n", "end" ] }, { "cell_type": "code", "execution_count": 159, "id": "3bc94db0", "metadata": { "cell_style": "split" }, "outputs": [ { "data": { "text/plain": [ "56" ] }, "execution_count": 159, "metadata": {}, "output_type": "execute_result" } ], "source": [ "length(bisect_iter(f, -1, 3, 1e-20))" ] }, { "cell_type": "markdown", "id": "6a970a26", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Let's plot the error\n", "\n", "$$ \\lvert \\texttt{bisect}^k(f, a, b) - r \\rvert, \\quad k = 1, 2, \\dotsc $$\n", "\n", "where $r$ is the true root, $f(r) = 0$." ] }, { "cell_type": "code", "execution_count": 160, "id": "2ab5c98b", "metadata": { "cell_style": "split" }, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n" ] }, "execution_count": 160, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hist = bisect_iter(f, -1, 3, 1e-10)\n", "r = hist[end] # What are we trusting?\n", "hist = hist[1:end-1]\n", "scatter( abs.(hist .- r), yscale=:log10)\n", "ks = 1:length(hist)\n", "plot!(ks, 4 * (.5 .^ ks))" ] }, { "cell_type": "markdown", "id": "4b5d448c", "metadata": { "cell_style": "split", "slideshow": { "slide_type": "fragment" } }, "source": [ "Evidently the error $e_k = x_k - x_*$ after $k$ bisections satisfies the bound\n", "$$ |e^k| \\le c 2^{-k} . $$" ] }, { "cell_type": "markdown", "id": "0809feac", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Convergence classes" ] }, { "cell_type": "markdown", "id": "63b0c695", "metadata": { "cell_style": "split", "slideshow": { "slide_type": "" } }, "source": [ "A convergent rootfinding algorithm produces a sequence of approximations $x_k$ such that $$\\lim_{k \\to \\infty} x_k \\to x_*$$ where $f(x_*) = 0$. For analysis, it is convenient to define the errors $e_k = x_k - x_*$. We say that an iterative algorithm is **$q$-linearly convergent** if $$\\lim_{k \\to \\infty} |e_{k+1}| / |e_k| = \\rho < 1.$$ (The $q$ is for \"quotient\".) A smaller convergence factor $\\rho$ represents faster convergence. A slightly weaker condition ($r$-linear convergence or just **linear convergence**) is that\n", "$$ |e_k| \\le \\epsilon_k $$\n", "for all sufficiently large $k$ when the sequence $\\{\\epsilon_k\\}$ converges $q$-linearly to 0." ] }, { "cell_type": "code", "execution_count": 121, "id": "f26cdf35", "metadata": { "cell_style": "split", "slideshow": { "slide_type": "" } }, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n" ] }, "execution_count": 121, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ρ = 0.8\n", "errors = [1.]\n", "for i in 1:30\n", " next_e = errors[end] * ρ\n", " push!(errors, next_e)\n", "end\n", "plot(errors, yscale=:log10, ylims=(1e-10, 1))" ] }, { "cell_type": "code", "execution_count": 125, "id": "b26a49a4", "metadata": { "cell_style": "split", "slideshow": { "slide_type": "" } }, "outputs": [ { "data": { "image/svg+xml": [ "\n", "\n", "\n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", "\n", "\n", "\n", " \n", " \n", " \n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n" ] }, "execution_count": 125, "metadata": {}, "output_type": "execute_result" } ], "source": [ "e = hist .- r\n", "scatter(abs.(errors[2:end] ./ errors[1:end-1]), ylims=(0,1))" ] }, { "cell_type": "markdown", "id": "fa100b09", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "## Bisection: A = q-linearly convergent, B = r-linearly convergent, C = neither" ] }, { "cell_type": "markdown", "id": "923cb558", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Remarks on bisection\n", "\n", "* Specifying an interval is often inconvenient\n", "* An interval in which the function changes sign guarantees convergence (robustness)\n", "* No derivative information is required\n", "* If bisection works for $f(x)$, then it works and gives the same accuracy for $f(x) \\sigma(x)$ where $\\sigma(x) > 0$.\n", "* Roots of even degree are problematic\n", "* A bound on the solution error is directly available\n", "* The convergence rate is modest -- one iteration per bit of accuracy" ] }, { "cell_type": "markdown", "id": "ca87ee9c", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Newton-Raphson Method\n", "\n", "Much of numerical analysis reduces to [Taylor series](https://en.wikipedia.org/wiki/Taylor_series), the approximation\n", "$$ f(x) = f(x_0) + f'(x_0) (x-x_0) + f''(x_0) (x - x_0)^2 / 2 + \\underbrace{\\dotsb}_{O((x-x_0)^3)} $$\n", "centered on some reference point $x_0$.\n", "\n", "In numerical computation, it is exceedingly rare to look beyond the first-order approximation\n", "$$ \\tilde f_{x_0}(x) = f(x_0) + f'(x_0)(x - x_0) . $$\n", "Since $\\tilde f_{x_0}(x)$ is a linear function, we can explicitly compute the unique solution of $\\tilde f_{x_0}(x) = 0$ as\n", "$$ x = x_0 - f(x_0) / f'(x_0) . $$\n", "This is Newton's Method (aka Newton-Raphson or Newton-Raphson-Simpson) for finding the roots of differentiable functions." ] }, { "cell_type": "markdown", "id": "d1dc047b", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# An implementation" ] }, { "cell_type": "code", "execution_count": 126, "id": "03a8b234", "metadata": { "cell_style": "center" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1] x=1 f(x)=-0.45969769413186023 f'(x)=-1.8414709848078965\n", "[2] x=0.7503638678402439 f(x)=-0.018923073822117442 f'(x)=-1.6819049529414878\n", "[3] x=0.7391128909113617 f(x)=-4.6455898990771516e-5 f'(x)=-1.6736325442243012\n", "[4] x=0.739085133385284 f(x)=-2.847205804457076e-10 f'(x)=-1.6736120293089505\n", "[5] x=0.7390851332151607 f(x)=0.0 f'(x)=-1.6736120291832148\n" ] }, { "data": { "text/plain": [ "(0.7390851332151607, 0.0, 5)" ] }, "execution_count": 126, "metadata": {}, "output_type": "execute_result" } ], "source": [ "function newton(f, fp, x0; tol=1e-8, verbose=false)\n", " x = x0\n", " for k in 1:100 # max number of iterations\n", " fx = f(x)\n", " fpx = fp(x)\n", " if verbose\n", " println(\"[$k] x=$x f(x)=$fx f'(x)=$fpx\")\n", " end\n", " if abs(fx) < tol\n", " return x, fx, k\n", " end\n", " x = x - fx / fpx\n", " end \n", "end\n", "\n", "f(x) = cos(x) - x\n", "fp(x) = -sin(x) - 1\n", "newton(f, fp, 1; tol=1e-15, verbose=true)" ] } ], "metadata": { "@webio": { "lastCommId": "d005758a2c014970b1ecb25bdfe0e5ce", "lastKernelId": "6b2d1577-a396-467b-b12c-4c035d941eea" }, "celltoolbar": "Slideshow", "hide_code_all_hidden": false, "kernelspec": { "display_name": "Julia 1.7.2", "language": "julia", "name": "julia-1.7" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.8.4" }, "rise": { "enable_chalkboard": true } }, "nbformat": 4, "nbformat_minor": 5 }