{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Iterative Methods for Solving Linear Equations\n",
"\n",
"## Introduction\n",
"\n",
"So far we have considered direct methods for solving linear equations. \n",
"\n",
"* **Direct methods** (flops fixed _a priori_) vs **iterative methods**:\n",
" - Direct method (GE/LU, Cholesky, QR, SVD): good for dense, small to moderate sized $\\mathbf{A}$ \n",
" - Iterative methods (Jacobi, Gauss-Seidal, SOR, conjugate-gradient, GMRES): good for large, sparse, structured linear system, parallel computing, warm start\n",
" \n",
"\n",
" \n",
"## PageRank problem\n",
"\n",
"\n",
"\n",
"* $\\mathbf{A} \\in \\{0,1\\}^{n \\times n}$ the connectivity matrix of webpages with entries\n",
"$$\n",
"\\begin{eqnarray*}\n",
"\ta_{ij} = \\begin{cases}\n",
"\t1 & \\text{if page $i$ links to page $j$} \\\\\n",
"\t0 & \\text{otherwise}\n",
"\t\\end{cases}. \n",
"\\end{eqnarray*}\n",
"$$\n",
"$n \\approx 10^9$ in May 2017.\n",
"\n",
"* $r_i = \\sum_j a_{ij}$ is the *out-degree* of page $i$. \n",
"\n",
"* [Larry Page](https://en.wikipedia.org/wiki/PageRank) imagines a random surfer wandering on internet according to following rules:\n",
" - From a page $i$ with $r_i>0$\n",
" * with probability $p$, (s)he randomly chooses a link on page $i$ (uniformly) and follows that link to the next page \n",
" * with probability $1-p$, (s)he randomly chooses one page from the set of all $n$ pages (uniformly) and proceeds to that page \n",
" - From a page $i$ with $r_i=0$ (a dangling page), (s)he randomly chooses one page from the set of all $n$ pages (uniformly) and proceeds to that page\n",
" \n",
"The process defines a Markov chain on the space of $n$ pages. Stationary distribution of this Markov chain gives the ranks (probabilities) of each page.\n",
"\n",
"* Stationary distribution is the top left eigenvector of the transition matrix $\\mathbf{P}$ corresponding to eigenvalue 1. Equivalently it can be cast as a linear equation.\n",
"$$\n",
" (\\mathbf{I} - \\mathbf{P}^T) \\mathbf{x} = \\mathbf{0}.\n",
"$$\n",
"\n",
"* Gene Golub: Largest matrix computation in world. \n",
"\n",
"* GE/LU will take $2 \\times (10^9)^3/3/10^{12} \\approx 6.66 \\times 10^{14}$ seconds $\\approx 20$ million years on a tera-flop supercomputer!\n",
"\n",
"* Iterative methods come to the rescue."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Jacobi method\n",
"\n",
"\n",
"\n",
"Solve $\\mathbf{A} \\mathbf{x} = \\mathbf{b}$.\n",
"\n",
"* Jacobi iteration: \n",
"$$\n",
"x_i^{(t+1)} = \\frac{b_i - \\sum_{j=1}^{i-1} a_{ij} x_j^{(t)} - \\sum_{j=i+1}^n a_{ij} x_j^{(t)}}{a_{ii}}.\n",
"$$\n",
"\n",
"* With splitting: $\\mathbf{A} = \\mathbf{L} + \\mathbf{D} + \\mathbf{U}$, Jacobi iteration can be written as\n",
"$$\n",
" \\mathbf{D} \\mathbf{x}^{(t+1)} = - (\\mathbf{L} + \\mathbf{U}) \\mathbf{x}^{(t)} + \\mathbf{b},\n",
"$$\n",
"i.e., \n",
"$$\n",
"\t\\mathbf{x}^{(t+1)} = - \\mathbf{D}^{-1} (\\mathbf{L} + \\mathbf{U}) \\mathbf{x}^{(t)} + \\mathbf{D}^{-1} \\mathbf{b} = - \\mathbf{D}^{-1} \\mathbf{A} \\mathbf{x}^{(t)} + \\mathbf{x}^{(t)} + \\mathbf{D}^{-1} \\mathbf{b}.\n",
"$$\n",
"\n",
"* One round costs $2n^2$ flops with an **unstructured** $\\mathbf{A}$. Gain over GE/LU if converges in $o(n)$ iterations. Saving is huge for **sparse** or **structured** $\\mathbf{A}$. By structured, we mean the matrix-vector multiplication $\\mathbf{A} \\mathbf{v}$ is fast ($O(n)$ or $O(n \\log n)$)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Gauss-Seidel method\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"* Gauss-Seidel (GS) iteration:\n",
"$$\n",
"x_i^{(t+1)} = \\frac{b_i - \\sum_{j=1}^{i-1} a_{ij} x_j^{(t+1)} - \\sum_{j=i+1}^n a_{ij} x_j^{(t)}}{a_{ii}}.\n",
"$$\n",
"\n",
"* With splitting, $(\\mathbf{D} + \\mathbf{L}) \\mathbf{x}^{(t+1)} = - \\mathbf{U} \\mathbf{x}^{(t)} + \\mathbf{b}$, i.e., \n",
"$$\n",
"\\mathbf{x}^{(t+1)} = - (\\mathbf{D} + \\mathbf{L})^{-1} \\mathbf{U} \\mathbf{x}^{(t)} + (\\mathbf{D} + \\mathbf{L})^{-1} \\mathbf{b}.\n",
"$$\n",
"\n",
"* GS converges for any $\\mathbf{x}^{(0)}$ for symmetric and pd $\\mathbf{A}$.\n",
"\n",
"* Convergence rate of Gauss-Seidel is the spectral radius of the $(\\mathbf{D} + \\mathbf{L})^{-1} \\mathbf{U}$."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Successive over-relaxation (SOR)\n",
"\n",
"* SOR: \n",
"$$\n",
"x_i^{(t+1)} = \\omega \\left( b_i - \\sum_{j=1}^{i-1} a_{ij} x_j^{(t+1)} - \\sum_{j=i+1}^n a_{ij} x_j^{(t)} \\right) / a_{ii} + (1-\\omega) x_i^{(t)},\n",
"$$\n",
"i.e., \n",
"$$\n",
"\\mathbf{x}^{(t+1)} = (\\mathbf{D} + \\omega \\mathbf{L})^{-1} [(1-\\omega) \\mathbf{D} - \\omega \\mathbf{U}] \\mathbf{x}^{(t)} + (\\mathbf{D} + \\omega \\mathbf{L})^{-1} (\\mathbf{D} + \\mathbf{L})^{-1} \\omega \\mathbf{b}.\n",
"$$\n",
"\n",
"* Need to pick $\\omega \\in [0,1]$ beforehand, with the goal of improving convergence rate."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Conjugate gradient method\n",
"\n",
"* **Conjugate gradient and its variants are the top-notch iterative methods for solving huge, structured linear systems.**\n",
"\n",
"* A UCLA invention! \n",
"\n",
"* Solving $\\mathbf{A} \\mathbf{x} = \\mathbf{b}$ is equivalent to minimizing the quadratic function $\\frac{1}{2} \\mathbf{x}^T \\mathbf{A} \\mathbf{x} - \\mathbf{b}^T \\mathbf{x}$. \n",
"\n",
"* To do later, when we study optimization. \n",
"\n",
"[Kershaw's results](http://www.sciencedirect.com/science/article/pii/0021999178900980?via%3Dihub) for a fusion problem.\n",
"\n",
"| Method | Number of Iterations |\n",
"|----------------------------------------|----------------------|\n",
"| Gauss Seidel | 208,000 |\n",
"| Block SOR methods | 765 |\n",
"| Incomplete Cholesky conjugate gradient | 25 |"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Numerical examples\n",
"\n",
"The [`IterativeSolvers.jl`](https://github.com/JuliaMath/IterativeSolvers.jl) package implements many commonly used iterative solvers."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"typeof(A) = SparseMatrixCSC{Float64,Int64}\n",
"typeof(Afull) = Array{Float64,2}\n"
]
},
{
"data": {
"text/plain": [
"0.00425125"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"using BenchmarkTools, IterativeSolvers\n",
"\n",
"srand(280)\n",
"\n",
"n = 4000\n",
"# a sparse pd matrix, about 0.5% non-zero entries\n",
"A = sprand(n, n, 0.002)\n",
"A = A + A' + (2n) * I\n",
"@show typeof(A)\n",
"b = randn(n)\n",
"# dense matrix representation of A\n",
"Afull = full(A)\n",
"@show typeof(Afull)\n",
"# actual sparsity level\n",
"countnz(A) / length(A)"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 31.33 KiB\n",
" allocs estimate: 2\n",
" --------------\n",
" minimum time: 5.618 ms (0.00% GC)\n",
" median time: 7.455 ms (0.00% GC)\n",
" mean time: 7.712 ms (0.00% GC)\n",
" maximum time: 14.812 ms (0.00% GC)\n",
" --------------\n",
" samples: 646\n",
" evals/sample: 1"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# dense matrix-vector multiplication\n",
"@benchmark Afull * b"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 31.33 KiB\n",
" allocs estimate: 2\n",
" --------------\n",
" minimum time: 66.647 μs (0.00% GC)\n",
" median time: 69.379 μs (0.00% GC)\n",
" mean time: 74.066 μs (2.24% GC)\n",
" maximum time: 1.427 ms (92.48% GC)\n",
" --------------\n",
" samples: 10000\n",
" evals/sample: 1"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# sparse matrix-vector multiplication\n",
"@benchmark A * b"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 122.10 MiB\n",
" allocs estimate: 11\n",
" --------------\n",
" minimum time: 277.813 ms (2.90% GC)\n",
" median time: 279.444 ms (2.87% GC)\n",
" mean time: 284.541 ms (3.83% GC)\n",
" maximum time: 335.722 ms (17.85% GC)\n",
" --------------\n",
" samples: 18\n",
" evals/sample: 1"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# dense solve via Cholesky\n",
"xchol = cholfact(Afull) \\ b\n",
"@benchmark cholfact(Afull) \\ b"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"vecnorm(xjacobi - xchol) = 0.009481824791564454\n"
]
},
{
"data": {
"text/plain": [
"0.009481824791564454"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Jacobi solution is fairly close to Cholesky solution\n",
"xjacobi, = jacobi(A, b)\n",
"@show vecnorm(xjacobi - xchol)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 94.36 KiB\n",
" allocs estimate: 13\n",
" --------------\n",
" minimum time: 1.034 ms (0.00% GC)\n",
" median time: 1.061 ms (0.00% GC)\n",
" mean time: 1.100 ms (0.51% GC)\n",
" maximum time: 3.343 ms (58.19% GC)\n",
" --------------\n",
" samples: 4537\n",
" evals/sample: 1"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@benchmark jacobi(A, b)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.009481824791564454"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Gauss-Seidel solution is fairly close to Cholesky solution\n",
"xgs, = gauss_seidel(A, b)\n",
"vecnorm(xgs - xchol)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 63.11 KiB\n",
" allocs estimate: 13\n",
" --------------\n",
" minimum time: 1.116 ms (0.00% GC)\n",
" median time: 1.215 ms (0.00% GC)\n",
" mean time: 1.232 ms (0.26% GC)\n",
" maximum time: 3.518 ms (65.23% GC)\n",
" --------------\n",
" samples: 4050\n",
" evals/sample: 1"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@benchmark gauss_seidel(A, b)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.009481824791561674"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# symmetric SOR with ω=0.75\n",
"xsor, = ssor(A, b, 0.75)\n",
"vecnorm(xsor - xchol)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 94.81 KiB\n",
" allocs estimate: 38\n",
" --------------\n",
" minimum time: 1.123 ms (0.00% GC)\n",
" median time: 1.165 ms (0.00% GC)\n",
" mean time: 1.210 ms (0.39% GC)\n",
" maximum time: 2.644 ms (55.17% GC)\n",
" --------------\n",
" samples: 4121\n",
" evals/sample: 1"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@benchmark sor(A, b, 0.75)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0.009481824791658372"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# conjugate gradient\n",
"xcg, = cg(A, b)\n",
"vecnorm(xcg - xchol)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 126.86 KiB\n",
" allocs estimate: 31\n",
" --------------\n",
" minimum time: 229.348 μs (0.00% GC)\n",
" median time: 237.998 μs (0.00% GC)\n",
" mean time: 254.097 μs (3.40% GC)\n",
" maximum time: 2.293 ms (81.10% GC)\n",
" --------------\n",
" samples: 10000\n",
" evals/sample: 1"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@benchmark cg(A, b)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Julia Version 0.6.2\n",
"Commit d386e40c17 (2017-12-13 18:08 UTC)\n",
"Platform Info:\n",
" OS: macOS (x86_64-apple-darwin14.5.0)\n",
" CPU: Intel(R) Core(TM) i7-6920HQ CPU @ 2.90GHz\n",
" WORD_SIZE: 64\n",
" BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)\n",
" LAPACK: libopenblas64_\n",
" LIBM: libopenlibm\n",
" LLVM: libLLVM-3.9.1 (ORCJIT, skylake)\n"
]
}
],
"source": [
"versioninfo()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Julia 0.6.2",
"language": "julia",
"name": "julia-0.6"
},
"language_info": {
"file_extension": ".jl",
"mimetype": "application/julia",
"name": "julia",
"version": "0.6.2"
},
"toc": {
"colors": {
"hover_highlight": "#DAA520",
"running_highlight": "#FF0000",
"selected_highlight": "#FFD700"
},
"moveMenuLeft": true,
"nav_menu": {
"height": "143px",
"width": "252px"
},
"navigate_menu": true,
"number_sections": false,
"sideBar": true,
"threshold": 4,
"toc_cell": false,
"toc_section_display": "block",
"toc_window_display": false,
"widenNotebook": false
}
},
"nbformat": 4,
"nbformat_minor": 2
}