--- title: Gaussian Elimination and LU Decomposition subtitle: Biostat/Biomath M257 author: Dr. Hua Zhou @ UCLA date: today format: html: theme: cosmo embed-resources: true number-sections: true toc: true toc-depth: 4 toc-location: left code-fold: false jupyter: jupytext: formats: 'ipynb,qmd' text_representation: extension: .qmd format_name: quarto format_version: '1.0' jupytext_version: 1.14.5 kernelspec: display_name: Julia (8 threads) 1.8.5 language: julia name: julia-_8-threads_-1.8 --- System information (for reproducibility): ```{julia} versioninfo() ``` Load packages: ```{julia} using Pkg Pkg.activate(pwd()) Pkg.instantiate() Pkg.status() ``` * Goal: solve linear equation $$ \mathbf{A} \mathbf{x} = \mathbf{b}. $$ For simplicity we consider a square matrix $\mathbf{A} \in \mathbb{R}^{n \times n}$. * History: Chinese mathematical text [The Nine Chapters on the Mathematical Art](https://en.wikipedia.org/wiki/The_Nine_Chapters_on_the_Mathematical_Art), Issac Newton and Carl Friedrich Gauss.

* A [toy example](https://en.wikipedia.org/wiki/Gaussian_elimination#Example_of_the_algorithm). ```{julia} A = [2.0 1.0 -1.0; -3.0 -1.0 2.0; -2.0 1.0 2.0] ``` ```{julia} b = [8.0, -11.0, -3.0] ``` ```{julia} # Julia way to solve linear equation # equivalent to `solve(A, b)` in R A \ b ``` What happens when we call `A \ b` to solve a linear equation? ## Elementary operator matrix * **Elementary operator matrix** is the identity matrix with the 0 in position $(j,k)$ replaced by $c$ $$ \mathbf{E}_{jk}(c) = \begin{pmatrix} 1 & & \\ & \ddots & \\ & & 1 & \\ & & & \ddots & \\ & & c & & 1 & \\ & & & & & \ddots \\ & & & & & & 1 \end{pmatrix} = \mathbf{I} + c \mathbf{e}_j \mathbf{e}_k^T. $$ * $\mathbf{E}_{jk}(c)$ is unit triangular, full rank, and its inverse is $$ \mathbf{E}_{jk}^{-1}(c) = \mathbf{E}_{jk}(-c). $$ * $\mathbf{E}_{jk}(c)$ left-multiplies an $n \times m$ matrix $\mathbf{X}$ effectively replace its $j$-th row $\mathbf{x}_{j\cdot}$ by $c \mathbf{x}_{k \cdot} + \mathbf{x}_{j \cdot}$ $$ \mathbf{E}_{jk}(c) \times \mathbf{X} = \mathbf{E}_{jk}(c) \times \begin{pmatrix} & & \\ \cdots & \mathbf{x}_{k\cdot} & \cdots \\ & & \\ \cdots & \mathbf{x}_{j\cdot} & \cdots \\ & & \end{pmatrix} = \begin{pmatrix} & & \\ \cdots & \mathbf{x}_{k\cdot} & \cdots \\ & & \\ \cdots & c \mathbf{x}_{k\cdot} + \mathbf{x}_{j\cdot} & \cdots \\ & & \end{pmatrix}. $$ $2m$ flops. * Gaussian elimination applies a sequence of elementary operator matrices to transform the linear system $\mathbf{A} \mathbf{x} = \mathbf{b}$ to an upper triangular system $$ \begin{eqnarray*} \mathbf{E}_{n,n-1}(c_{n,n-1}) \cdots \mathbf{E}_{21}(c_{21}) \mathbf{A} \mathbf{x} &=& \mathbf{E}_{n,n-1}(c_{n,n-1}) \cdots \mathbf{E}_{21}(c_{21}) \mathbf{b} \\ \mathbf{U} \mathbf{x} &=& \mathbf{b}_{\text{new}}. \end{eqnarray*} $$ Column 1: ```{julia} E21 = [1.0 0.0 0.0; 1.5 1.0 0.0; 0.0 0.0 1.0] ``` ```{julia} # zero (2, 1) entry E21 * A ``` ```{julia} E31 = [1.0 0.0 0.0; 0.0 1.0 0.0; 1.0 0.0 1.0] ``` ```{julia} # zero (3, 1) entry E31 * E21 * A ``` Column 2: ```{julia} E32 = [1.0 0.0 0.0; 0.0 1.0 0.0; 0.0 -4.0 1.0] ``` ```{julia} # zero (3, 2) entry E32 * E31 * E21 * A ``` ## Gauss transformations * For the first column, $$ \mathbf{M}_1 = \mathbf{E}_{n1}(c_{n1}) \cdots \mathbf{E}_{31}(c_{31}) \mathbf{E}_{21}(c_{21}) = \begin{pmatrix} 1 & \\ c_{21} & \\ & \ddots & \\ c_{n1} & & 1 \end{pmatrix} $$ For the $k$-th column, $$ \mathbf{M}_k = \mathbf{E}_{nk}(c_{nk}) \cdots \mathbf{E}_{k+1,k}(c_{k+1,k}) = \begin{pmatrix} 1 & \\ & \ddots \\ & & 1 & \\ & & c_{k+1,k} & 1\\ & & \vdots & & \ddots \\ & & c_{n,k} & & & 1 \end{pmatrix}. $$ * $\mathbf{M}_1, \ldots, \mathbf{M}_{n-1}$ are called the **Gauss transformations**. ```{julia} M1 = E31 * E21 ``` ```{julia} M2 = E32 ``` * Gauss transformations $\mathbf{M}_k$ are unit triangular, full rank, with inverse $$ \mathbf{M}_k^{-1} = \mathbf{E}_{k+1,k}^{-1}(c_{k+1,k}) \cdots \mathbf{E}_{nk}^{-1}(c_{nk}) = \begin{pmatrix} 1 & \\ & \ddots \\ & & 1 & \\ & & - c_{k+1,k}\\ & & \vdots & & \ddots \\ & & - c_{n,k} & & & 1 \end{pmatrix}. $$ ```{julia} inv(M1) ``` ```{julia} inv(M2) ``` ## LU decomposition Gaussian elimination does $$ \mathbf{M}_{n-1} \cdots \mathbf{M}_1 \mathbf{A} = \mathbf{U}. $$ Let \begin{equation*} \mathbf{L} = \mathbf{M}_1^{-1} \cdots \mathbf{M}_{n-1}^{-1} = \begin{pmatrix} 1 & & & & \\ \,- c_{21} & \ddots & & & \\ & & 1 & & \\ \, - c_{k+1,1} & & - c_{k+1,k} & & \\ & & \vdots & & \ddots \\ \,- c_{n1} & & - c_{nk} & & & 1 \end{pmatrix}. \end{equation*} Thus we have the **LU decomposition** $$ \mathbf{A} = \mathbf{L} \mathbf{U}, $$ where $\mathbf{L}$ is unit lower triangular and $\mathbf{U}$ is upper triangular. ```{julia} # collect negative multipliers into a unit lower triangular matrix L = [1 0 0; -3/2 1 0; -1 4 1] ``` ```{julia} # upper triangular matrix after Gaussian elimination U = [2 1 -1; 0 1/2 1/2; 0 0 -1] ``` ```{julia} # recovers original matrix L * U ``` * The whole LU algorithm is done in place, i.e., $\mathbf{A}$ is overwritten by $\mathbf{L}$ and $\mathbf{U}$. * LU decomposition exists if the principal sub-matrix $\mathbf{A}[1:k, 1:k]$ is non-singular for $k=1,\ldots,n-1$. * If the LU decomposition exists and $\mathbf{A}$ is non-singular, then the LU decmpositon is unique and $$ \det(\mathbf{A}) = \det(\mathbf{L}) \det(\mathbf{U}) = \prod_{k=1}^n u_{kk}. $$ * The LU decomposition costs $$ 2(n-1)^2 + 2(n-2)^2 + \cdots + 2 \cdot 1^2 \approx \frac 23 n^3 \quad \text{flops}. $$

* Actual implementations can differ: outer product LU ($kij$ loop), block outer product LU (higher level-3 fraction), Crout's algorithm ($jki$ loop). * Given LU decomposition $\mathbf{A} = \mathbf{L} \mathbf{U}$, solving $(\mathbf{L} \mathbf{U}) \mathbf{x} = \mathbf{b}$ for one right hand side costs $2n^2$ flops: - One forward substitution ($n^2$ flops) to solve $$ \mathbf{L} \mathbf{y} = \mathbf{b} $$ - One back substitution ($n^2$ flops) to solve $$ \mathbf{U} \mathbf{x} = \mathbf{y} $$ * Therefore to solve $\mathbf{A} \mathbf{x} = \mathbf{b}$ via LU, we need a total of $$ \frac 23 n^3 + 2n^2 \quad \text{flops}. $$ * If there are multiple right hand sides, LU only needs to be done once. ## Matrix inversion * For matrix inversion, we need to solve $\mathbf{A} \mathbf{X} = \mathbf{I}$, or $\mathbf{A} \mathbf{x}_i = \mathbf{e}_i$ for $i=1,\ldots,n$. There are $n$ right hand sides $\mathbf{e}_1, \ldots, \mathbf{e}_n$. Naively, we may need $\frac 23 n^3 + 2n^3$ flops. But taking advantage of 0s reduces the second term $2n^3$ to $\frac 43 n^3$. * So **matrix inversion via LU** costs $$ \frac 23 n^3 + \frac 43 n^3 = 2n^3 \quad \text{flops}. $$ This is 3x more expensive than just solving one equation. * **No inversion mentality**: > **Whenever we see matrix inverse, we should think in terms of solving linear equations.** * We do not compute matrix inverse unless 1. it is necessary to compute standard errors 2. number of right hand sides is much larger than $n$ 3. $n$ is small ## Pivoting * Let $$ \mathbf{A} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \\ \end{pmatrix}. $$ Is there a solution to $\mathbf{A} \mathbf{x} = \mathbf{b}$ for an arbitrary $\mathbf{b}$? Does GE/LU work for $\mathbf{A}$? * What if, during LU procedure, the **pivot** $a_{kk}$ is 0 or nearly 0 due to underflow? Solution: pivoting. * **Partial pivoting**: before zeroing the $k$-th column, the row with $\max_{i=k}^n |a_{ik}|$ is moved into the $k$-th row. * LU with partial pivoting yields $$ \mathbf{P} \mathbf{A} = \mathbf{L} \mathbf{U}, $$ where $\mathbf{P}$ is a permutation matrix, $\mathbf{L}$ is unit lower triangular with $|\ell_{ij}| \le 1$, and $\mathbf{U}$ is upper triangular. * Complete pivoting: Do both row and column interchanges so that the largest entry in the sub matrix `A[k:n, k:n]` is permuted to the $(k,k)$-th entry. This yields the decomposition $$ \mathbf{P} \mathbf{A} \mathbf{Q} = \mathbf{L} \mathbf{U}, $$ where $|\ell_{ij}| \le 1$. * LU decomposition with partial pivoting is the most commonly used methods for solving **general** linear systems. Complete pivoting is the most stable but costs more computation. Partial pivoting is stable most of times. ## LAPACK and Julia implementation * LAPACK: [?getrf](http://www.netlib.org/lapack/explore-html/dd/d9a/group__double_g_ecomputational_ga0019443faea08275ca60a734d0593e60.html#ga0019443faea08275ca60a734d0593e60) does $\mathbf{P} \mathbf{A} = \mathbf{L} \mathbf{U}$ (LU decomposition with partial pivoting) in place. * R: `solve()` implicitly performs LU decomposition (wrapper of LAPACK routine `dgesv`). `solve()` allows specifying a single or multiple right hand sides. If none, it computes the matrix inverse. The `matrix` package contains `lu()` function that outputs `L`, `U`, and `pivot`. * Julia: - [LinearAlgebra.lu](https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/#LinearAlgebra.lu). - [LinearAlgebra.lu!](https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/#LinearAlgebra.lu!). In-place version. Input matrix gets overwritten with L and U. - Or call LAPACK wrapper function [getrf!](https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/#LinearAlgebra.LAPACK.getrf!) directly. - Other LU-related LAPACK wrapper functions: [gesv](https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/#LinearAlgebra.LAPACK.gesv!), [gesvx](https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/#LinearAlgebra.LAPACK.gesvx!), [trtri](https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/#LinearAlgebra.LAPACK.trtri!) (inverse of triangular matrix), [trtrs](https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/#LinearAlgebra.LAPACK.trtrs!). ```{julia} A ``` ```{julia} using LinearAlgebra # second argument indicates partial pivoting (default) or not alu = lu(A) typeof(alu) ``` ```{julia} dump(alu) ``` ```{julia} alu.L ``` ```{julia} alu.U ``` ```{julia} alu.p ``` ```{julia} alu.P ``` ```{julia} alu.L * alu.U ``` ```{julia} A[alu.p, :] ``` ```{julia} # this is doing two triangular solves, 2n^2 flops alu \ b ``` ```{julia} det(A) # this does LU! O(n^3) ``` ```{julia} det(alu) # this is cheap O(n) ``` ```{julia} inv(A) # this does LU! O(n^3) ``` ```{julia} inv(alu) # this is cheap O(n^2) ``` ## Further reading * Sections II.5.2 and II.5.3 of [Computational Statistics](http://ucla.worldcat.org/title/computational-statistics/oclc/437345409&referer=brief_results) by James Gentle (2010). * A definite reference is Chapter 3 of the book [Matrix Computation](http://catalog.library.ucla.edu/vwebv/holdingsInfo?bibId=7122088) by Gene Golub and Charles Van Loan.