{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Iterative Methods for Solving Linear Equations\n", "\n", "## Introduction\n", "\n", "So far we have considered direct methods for solving linear equations. \n", "\n", "* **Direct methods** (flops fixed _a priori_) vs **iterative methods**:\n", " - Direct method (GE/LU, Cholesky, QR, SVD): good for dense, small to moderate sized $\\mathbf{A}$ \n", " - Iterative methods (Jacobi, Gauss-Seidal, SOR, conjugate-gradient, GMRES): good for large, sparse, structured linear system, parallel computing, warm start\n", " \n", "\n", " \n", "## PageRank problem\n", "\n", "\n", "\n", "* $\\mathbf{A} \\in \\{0,1\\}^{n \\times n}$ the connectivity matrix of webpages with entries\n", "$$\n", "\\begin{eqnarray*}\n", "\ta_{ij} = \\begin{cases}\n", "\t1 & \\text{if page $i$ links to page $j$} \\\\\n", "\t0 & \\text{otherwise}\n", "\t\\end{cases}. \n", "\\end{eqnarray*}\n", "$$\n", "$n \\approx 10^9$ in May 2017.\n", "\n", "* $r_i = \\sum_j a_{ij}$ is the *out-degree* of page $i$. \n", "\n", "* [Larry Page](https://en.wikipedia.org/wiki/PageRank) imagines a random surfer wandering on internet according to following rules:\n", " - From a page $i$ with $r_i>0$\n", " * with probability $p$, (s)he randomly chooses a link on page $i$ (uniformly) and follows that link to the next page \n", " * with probability $1-p$, (s)he randomly chooses one page from the set of all $n$ pages (uniformly) and proceeds to that page \n", " - From a page $i$ with $r_i=0$ (a dangling page), (s)he randomly chooses one page from the set of all $n$ pages (uniformly) and proceeds to that page\n", " \n", "The process defines a Markov chain on the space of $n$ pages. Stationary distribution of this Markov chain gives the ranks (probabilities) of each page.\n", "\n", "* Stationary distribution is the top left eigenvector of the transition matrix $\\mathbf{P}$ corresponding to eigenvalue 1. Equivalently it can be cast as a linear equation.\n", "$$\n", " (\\mathbf{I} - \\mathbf{P}^T) \\mathbf{x} = \\mathbf{0}.\n", "$$\n", "\n", "* Gene Golub: Largest matrix computation in world. \n", "\n", "* GE/LU will take $2 \\times (10^9)^3/3/10^{12} \\approx 6.66 \\times 10^{14}$ seconds $\\approx 20$ million years on a tera-flop supercomputer!\n", "\n", "* Iterative methods come to the rescue." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Jacobi method\n", "\n", "\n", "\n", "Solve $\\mathbf{A} \\mathbf{x} = \\mathbf{b}$.\n", "\n", "* Jacobi iteration: \n", "$$\n", "x_i^{(t+1)} = \\frac{b_i - \\sum_{j=1}^{i-1} a_{ij} x_j^{(t)} - \\sum_{j=i+1}^n a_{ij} x_j^{(t)}}{a_{ii}}.\n", "$$\n", "\n", "* With splitting: $\\mathbf{A} = \\mathbf{L} + \\mathbf{D} + \\mathbf{U}$, Jacobi iteration can be written as\n", "$$\n", " \\mathbf{D} \\mathbf{x}^{(t+1)} = - (\\mathbf{L} + \\mathbf{U}) \\mathbf{x}^{(t)} + \\mathbf{b},\n", "$$\n", "i.e., \n", "$$\n", "\t\\mathbf{x}^{(t+1)} = - \\mathbf{D}^{-1} (\\mathbf{L} + \\mathbf{U}) \\mathbf{x}^{(t)} + \\mathbf{D}^{-1} \\mathbf{b} = - \\mathbf{D}^{-1} \\mathbf{A} \\mathbf{x}^{(t)} + \\mathbf{x}^{(t)} + \\mathbf{D}^{-1} \\mathbf{b}.\n", "$$\n", "\n", "* One round costs $2n^2$ flops with an **unstructured** $\\mathbf{A}$. Gain over GE/LU if converges in $o(n)$ iterations. Saving is huge for **sparse** or **structured** $\\mathbf{A}$. By structured, we mean the matrix-vector multiplication $\\mathbf{A} \\mathbf{v}$ is fast ($O(n)$ or $O(n \\log n)$)." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Gauss-Seidel method\n", "\n", "\n", "\n", "\n", "\n", "* Gauss-Seidel (GS) iteration:\n", "$$\n", "x_i^{(t+1)} = \\frac{b_i - \\sum_{j=1}^{i-1} a_{ij} x_j^{(t+1)} - \\sum_{j=i+1}^n a_{ij} x_j^{(t)}}{a_{ii}}.\n", "$$\n", "\n", "* With splitting, $(\\mathbf{D} + \\mathbf{L}) \\mathbf{x}^{(t+1)} = - \\mathbf{U} \\mathbf{x}^{(t)} + \\mathbf{b}$, i.e., \n", "$$\n", "\\mathbf{x}^{(t+1)} = - (\\mathbf{D} + \\mathbf{L})^{-1} \\mathbf{U} \\mathbf{x}^{(t)} + (\\mathbf{D} + \\mathbf{L})^{-1} \\mathbf{b}.\n", "$$\n", "\n", "* GS converges for any $\\mathbf{x}^{(0)}$ for symmetric and pd $\\mathbf{A}$.\n", "\n", "* Convergence rate of Gauss-Seidel is the spectral radius of the $(\\mathbf{D} + \\mathbf{L})^{-1} \\mathbf{U}$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Successive over-relaxation (SOR)\n", "\n", "* SOR: \n", "$$\n", "x_i^{(t+1)} = \\omega \\left( b_i - \\sum_{j=1}^{i-1} a_{ij} x_j^{(t+1)} - \\sum_{j=i+1}^n a_{ij} x_j^{(t)} \\right) / a_{ii} + (1-\\omega) x_i^{(t)},\n", "$$\n", "i.e., \n", "$$\n", "\\mathbf{x}^{(t+1)} = (\\mathbf{D} + \\omega \\mathbf{L})^{-1} [(1-\\omega) \\mathbf{D} - \\omega \\mathbf{U}] \\mathbf{x}^{(t)} + (\\mathbf{D} + \\omega \\mathbf{L})^{-1} (\\mathbf{D} + \\mathbf{L})^{-1} \\omega \\mathbf{b}.\n", "$$\n", "\n", "* Need to pick $\\omega \\in [0,1]$ beforehand, with the goal of improving convergence rate." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Conjugate gradient method\n", "\n", "* **Conjugate gradient and its variants are the top-notch iterative methods for solving huge, structured linear systems.**\n", "\n", "* A UCLA invention! \n", "\n", "* Solving $\\mathbf{A} \\mathbf{x} = \\mathbf{b}$ is equivalent to minimizing the quadratic function $\\frac{1}{2} \\mathbf{x}^T \\mathbf{A} \\mathbf{x} - \\mathbf{b}^T \\mathbf{x}$. \n", "\n", "* To do later, when we study optimization. \n", "\n", "[Kershaw's results](http://www.sciencedirect.com/science/article/pii/0021999178900980?via%3Dihub) for a fusion problem.\n", "\n", "| Method | Number of Iterations |\n", "|----------------------------------------|----------------------|\n", "| Gauss Seidel | 208,000 |\n", "| Block SOR methods | 765 |\n", "| Incomplete Cholesky conjugate gradient | 25 |" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Numerical examples\n", "\n", "The [`IterativeSolvers.jl`](https://github.com/JuliaMath/IterativeSolvers.jl) package implements many commonly used iterative solvers." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "typeof(A) = SparseMatrixCSC{Float64,Int64}\n", "typeof(Afull) = Array{Float64,2}\n" ] }, { "data": { "text/plain": [ "0.00425125" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "using BenchmarkTools, IterativeSolvers\n", "\n", "srand(280)\n", "\n", "n = 4000\n", "# a sparse pd matrix, about 0.5% non-zero entries\n", "A = sprand(n, n, 0.002)\n", "A = A + A' + (2n) * I\n", "@show typeof(A)\n", "b = randn(n)\n", "# dense matrix representation of A\n", "Afull = full(A)\n", "@show typeof(Afull)\n", "# actual sparsity level\n", "countnz(A) / length(A)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "BenchmarkTools.Trial: \n", " memory estimate: 31.33 KiB\n", " allocs estimate: 2\n", " --------------\n", " minimum time: 5.618 ms (0.00% GC)\n", " median time: 7.455 ms (0.00% GC)\n", " mean time: 7.712 ms (0.00% GC)\n", " maximum time: 14.812 ms (0.00% GC)\n", " --------------\n", " samples: 646\n", " evals/sample: 1" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# dense matrix-vector multiplication\n", "@benchmark Afull * b" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "BenchmarkTools.Trial: \n", " memory estimate: 31.33 KiB\n", " allocs estimate: 2\n", " --------------\n", " minimum time: 66.647 μs (0.00% GC)\n", " median time: 69.379 μs (0.00% GC)\n", " mean time: 74.066 μs (2.24% GC)\n", " maximum time: 1.427 ms (92.48% GC)\n", " --------------\n", " samples: 10000\n", " evals/sample: 1" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# sparse matrix-vector multiplication\n", "@benchmark A * b" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "BenchmarkTools.Trial: \n", " memory estimate: 122.10 MiB\n", " allocs estimate: 11\n", " --------------\n", " minimum time: 277.813 ms (2.90% GC)\n", " median time: 279.444 ms (2.87% GC)\n", " mean time: 284.541 ms (3.83% GC)\n", " maximum time: 335.722 ms (17.85% GC)\n", " --------------\n", " samples: 18\n", " evals/sample: 1" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# dense solve via Cholesky\n", "xchol = cholfact(Afull) \\ b\n", "@benchmark cholfact(Afull) \\ b" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "vecnorm(xjacobi - xchol) = 0.009481824791564454\n" ] }, { "data": { "text/plain": [ "0.009481824791564454" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Jacobi solution is fairly close to Cholesky solution\n", "xjacobi, = jacobi(A, b)\n", "@show vecnorm(xjacobi - xchol)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "BenchmarkTools.Trial: \n", " memory estimate: 94.36 KiB\n", " allocs estimate: 13\n", " --------------\n", " minimum time: 1.034 ms (0.00% GC)\n", " median time: 1.061 ms (0.00% GC)\n", " mean time: 1.100 ms (0.51% GC)\n", " maximum time: 3.343 ms (58.19% GC)\n", " --------------\n", " samples: 4537\n", " evals/sample: 1" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@benchmark jacobi(A, b)" ] }, { 
"cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.009481824791564454" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Gauss-Seidel solution is fairly close to Cholesky solution\n", "xgs, = gauss_seidel(A, b)\n", "vecnorm(xgs - xchol)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "BenchmarkTools.Trial: \n", " memory estimate: 63.11 KiB\n", " allocs estimate: 13\n", " --------------\n", " minimum time: 1.116 ms (0.00% GC)\n", " median time: 1.215 ms (0.00% GC)\n", " mean time: 1.232 ms (0.26% GC)\n", " maximum time: 3.518 ms (65.23% GC)\n", " --------------\n", " samples: 4050\n", " evals/sample: 1" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@benchmark gauss_seidel(A, b)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.009481824791561674" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# symmetric SOR with ω=0.75\n", "xsor, = ssor(A, b, 0.75)\n", "vecnorm(xsor - xchol)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "BenchmarkTools.Trial: \n", " memory estimate: 94.81 KiB\n", " allocs estimate: 38\n", " --------------\n", " minimum time: 1.123 ms (0.00% GC)\n", " median time: 1.165 ms (0.00% GC)\n", " mean time: 1.210 ms (0.39% GC)\n", " maximum time: 2.644 ms (55.17% GC)\n", " --------------\n", " samples: 4121\n", " evals/sample: 1" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@benchmark sor(A, b, 0.75)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.009481824791658372" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# conjugate gradient\n", "xcg, = cg(A, b)\n", "vecnorm(xcg - xchol)" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "BenchmarkTools.Trial: \n", " memory estimate: 126.86 KiB\n", " allocs estimate: 31\n", " --------------\n", " minimum time: 229.348 μs (0.00% GC)\n", " median time: 237.998 μs (0.00% GC)\n", " mean time: 254.097 μs (3.40% GC)\n", " maximum time: 2.293 ms (81.10% GC)\n", " --------------\n", " samples: 10000\n", " evals/sample: 1" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@benchmark cg(A, b)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Julia Version 0.6.2\n", "Commit d386e40c17 (2017-12-13 18:08 UTC)\n", "Platform Info:\n", " OS: macOS (x86_64-apple-darwin14.5.0)\n", " CPU: Intel(R) Core(TM) i7-6920HQ CPU @ 2.90GHz\n", " WORD_SIZE: 64\n", " BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)\n", " LAPACK: libopenblas64_\n", " LIBM: libopenlibm\n", " LLVM: libLLVM-3.9.1 (ORCJIT, skylake)\n" ] } ], "source": [ "versioninfo()" ] } ], "metadata": { "kernelspec": { "display_name": "Julia 0.6.2", "language": "julia", "name": "julia-0.6" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "0.6.2" }, "toc": { "colors": { "hover_highlight": "#DAA520", "running_highlight": "#FF0000", "selected_highlight": "#FFD700" }, "moveMenuLeft": true, "nav_menu": { "height": "143px", 
"width": "252px" }, "navigate_menu": true, "number_sections": false, "sideBar": true, "threshold": 4, "toc_cell": false, "toc_section_display": "block", "toc_window_display": false, "widenNotebook": false } }, "nbformat": 4, "nbformat_minor": 2 }