{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "using LinearAlgebra # load this package for most of Julia's linear algebra functions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Gaussian Elimination = Matrix multiplications\n", "\n", "Let's look more closely at the process of Gaussian elimination in matrix form, using the matrix from lecture 1." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3×3 Matrix{Int64}:\n", " 1 3 1\n", " 1 1 -1\n", " 3 11 6" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A = [1 3 1\n", " 1 1 -1\n", " 3 11 6]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Gaussian elimination produces the matrix $U$, which we can compute in Julia as in lecture 1:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3×3 Matrix{Float64}:\n", " 1.0 3.0 1.0\n", " 0.0 -2.0 -2.0\n", " 0.0 0.0 1.0" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# LU factorization (Gaussian elimination) of the matrix A, \n", "# passing NoPivot() to prevent row re-ordering (\"pivoting\")\n", "L, U = lu(A, NoPivot()) \n", "U # just show U" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, let's go through **Gaussian elimination in matrix form**, by **expressing the elimination steps as matrix multiplications.** In Gaussian elimination, we make linear combinations of *rows* to cancel elements below the pivot, and we now know that this corresponds to multiplying on the *left* by some *elimination matrix* $E$.\n", "\n", "The first step is to eliminate in the first column of $A$. The pivot is the 1 in the upper-left-hand corner. For this $A$, we need to:\n", "\n", "1. Leave the first row alone.\n", "2. Subtract the first row from the second row to get the new second row.\n", "3. 
Subtract $3 \\times {}$ first row from the third row to get the new third row.\n", "\n", "This corresponds to multiplying $A$ on the left by the matrix `E1`. As above (in the \"row × matrix\" picture), the three rows of `E1` correspond exactly to the three row operations listed above:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3×3 Matrix{Int64}:\n", " 1 0 0\n", " -1 1 0\n", " -3 0 1" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "E1 = [ 1 0 0\n", " -1 1 0\n", " -3 0 1]" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3×3 Matrix{Int64}:\n", " 1 3 1\n", " 0 -2 -2\n", " 0 2 3" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "E1*A" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As desired, this introduced zeros below the diagonal in the first column." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It's fun and perhaps illuminating to try acting `E1` on a matrix of *symbolic* variables, using the [Symbolics.jl](https://github.com/JuliaSymbolics/Symbolics.jl) computer-algebra (symbolic-math) package for Julia." 
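] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(A quick added check, beyond the notes: we can rebuild each row of `E1*A` \"by hand\" from the three row operations listed above, confirming that multiplying by `E1` on the left takes exactly those linear combinations of the rows of `A`.)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "r1 = A[1,:]'              # first row, unchanged\n", "r2 = (A[2,:] - A[1,:])'   # second row minus first row\n", "r3 = (A[3,:] - 3*A[1,:])' # third row minus 3 × first row\n", "E1*A == [r1; r2; r3]      # stacking these rows should reproduce E1*A"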
] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "using Symbolics" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll define a matrix `S` whose entries are *variables* instead of numbers, and then act `E1` on it.\n", "\n", "We *could* use boring variables like $s_{11}$ etcetera:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/latex": [ "\\begin{equation}\n", "\\left[\n", "\\begin{array}{ccc}\n", "s{{_1}}ˏ{_1} & s{{_1}}ˏ{_2} & s{{_1}}ˏ{_3} \\\\\n", "s{{_2}}ˏ{_1} & s{{_2}}ˏ{_2} & s{{_2}}ˏ{_3} \\\\\n", "s{{_3}}ˏ{_1} & s{{_3}}ˏ{_2} & s{{_3}}ˏ{_3} \\\\\n", "\\end{array}\n", "\\right]\n", "\\end{equation}\n" ], "text/plain": [ "3×3 Matrix{Num}:\n", " s[1, 1] s[1, 2] s[1, 3]\n", " s[2, 1] s[2, 2] s[2, 3]\n", " s[3, 1] s[3, 2] s[3, 3]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@variables s[1:3, 1:3] # declare a bunch of symbolic variables\n", "S = collect(s) # & collect them into a matrix" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/latex": [ "\\begin{equation}\n", "\\left[\n", "\\begin{array}{ccc}\n", "s{{_1}}ˏ{_1} & s{{_1}}ˏ{_2} & s{{_1}}ˏ{_3} \\\\\n", " - s{{_1}}ˏ{_1} + s{{_2}}ˏ{_1} & - s{{_1}}ˏ{_2} + s{{_2}}ˏ{_2} & - s{{_1}}ˏ{_3} + s{{_2}}ˏ{_3} \\\\\n", " - 3 s{{_1}}ˏ{_1} + s{{_3}}ˏ{_1} & - 3 s{{_1}}ˏ{_2} + s{{_3}}ˏ{_2} & - 3 s{{_1}}ˏ{_3} + s{{_3}}ˏ{_3} \\\\\n", "\\end{array}\n", "\\right]\n", "\\end{equation}\n" ], "text/plain": [ "3×3 Matrix{Num}:\n", " s[1, 1] s[1, 2] s[1, 3]\n", " s[2, 1] - s[1, 1] s[2, 2] - s[1, 2] s[2, 3] - s[1, 3]\n", " s[3, 1] - 3s[1, 1] s[3, 2] - 3s[1, 2] s[3, 3] - 3s[1, 3]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "E1 * S" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But it's hard to visually distinguish all of these subscripts, and nowadays on a computer we have a 
lot more symbols to choose from.\n", "\n", "Let's instead make a symbolic matrix full of emoji:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/latex": [ "\\begin{equation}\n", "\\left[\n", "\\begin{array}{ccc}\n", "😀 & 🥸 & 😭 \\\\\n", "🐸 & 🐶 & 🐨 \\\\\n", "🚗 & 🚛 & 🚜 \\\\\n", "\\end{array}\n", "\\right]\n", "\\end{equation}\n" ], "text/plain": [ "3×3 Matrix{Num}:\n", " 😀 🥸 😭\n", " 🐸 🐶 🐨\n", " 🚗 🚛 🚜" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@variables 😀 🥸 😭 🐸 🐶 🐨 🚗 🚛 🚜 # declare \"moar funner\" symbolic variables\n", "\n", "S = [ 😀 🥸 😭 # first row = faces\n", " 🐸 🐶 🐨 # second row = animals\n", " 🚗 🚛 🚜 ] # third row = Cars™" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/latex": [ "\\begin{equation}\n", "\\left[\n", "\\begin{array}{ccc}\n", "😀 & 🥸 & 😭 \\\\\n", "🐸 - 😀 & 🐶 - 🥸 & 🐨 - 😭 \\\\\n", "🚗 - 3 😀 & 🚛 - 3 🥸 & 🚜 - 3 😭 \\\\\n", "\\end{array}\n", "\\right]\n", "\\end{equation}\n" ], "text/plain": [ "3×3 Matrix{Num}:\n", " 😀 🥸 😭\n", " 🐸 - 😀 🐶 - 🥸 🐨 - 😭\n", " 🚗 - 3😀 🚛 - 3🥸 🚜 - 3😭" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "E1 * S" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, we need to eliminate the 2 below the diagonal in the *second* column of `E1*A`. 
Our new pivot is $-2$ (in the second row), and we just add the second row of `E1*A` to the third row to make the new third row.\n", "\n", "This corresponds to multiplying on the left by the matrix `E2`, which leaves the first two rows alone and makes the new third row by adding the second and third rows:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3×3 Matrix{Int64}:\n", " 1 0 0\n", " 0 1 0\n", " 0 1 1" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "E2 = [1 0 0\n", " 0 1 0\n", " 0 1 1]" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/latex": [ "\\begin{equation}\n", "\\left[\n", "\\begin{array}{ccc}\n", "😀 & 🥸 & 😭 \\\\\n", "🐸 & 🐶 & 🐨 \\\\\n", "🐸 + 🚗 & 🐶 + 🚛 & 🐨 + 🚜 \\\\\n", "\\end{array}\n", "\\right]\n", "\\end{equation}\n" ], "text/plain": [ "3×3 Matrix{Num}:\n", " 😀 🥸 😭\n", " 🐸 🐶 🐨\n", " 🐸 + 🚗 🐶 + 🚛 🐨 + 🚜" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "E2 * S" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3×3 Matrix{Int64}:\n", " 1 3 1\n", " 0 -2 -2\n", " 0 0 1" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "E2*E1*A" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/latex": [ "\\begin{equation}\n", "\\left[\n", "\\begin{array}{ccc}\n", "😀 & 🥸 & 😭 \\\\\n", "🐸 - 😀 & 🐶 - 🥸 & 🐨 - 😭 \\\\\n", "🐸 + 🚗 - 4 😀 & 🐶 + 🚛 - 4 🥸 & 🐨 + 🚜 - 4 😭 \\\\\n", "\\end{array}\n", "\\right]\n", "\\end{equation}\n" ], "text/plain": [ "3×3 Matrix{Num}:\n", " 😀 🥸 😭\n", " 🐸 - 😀 🐶 - 🥸 🐨 - 😭\n", " 🐸 + 🚗 - 4😀 🐶 + 🚛 - 4🥸 🐨 + 🚜 - 4😭" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "E2*E1*S" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As expected, this is upper triangular, and 
in fact the same as the `U` matrix returned by the Julia `lu` function above:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "true" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "E2*E1*A == U" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Thus, we have arrived at the formula:\n", "$$\n", "\\underbrace{E_2 E_1}_E A = U\n", "$$\n", "Notice that we multiplied $A$ by the elimination matrices from *right to left* in the order of the steps: it is $E_2 E_1 A$, *not* $E_1 E_2 A$. Because matrix multiplication is generally [not commutative](https://en.wikipedia.org/wiki/Commutative_property), $E_2 E_1$ and $E_1 E_2$ give *different* matrices:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3×3 Matrix{Int64}:\n", " 1 0 0\n", " -1 1 0\n", " -4 1 1" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "E = E2*E1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(Notice that when we multiply $E_2 E_1$ to get $E$, the elimination steps get \"mixed up together\", and we get new numbers like `-4` (above) that didn't appear as multipliers in the elimination steps. As discussed below, this makes $E$ a bit inconvenient as a way to work with Gaussian elimination, because it requires some pointless effort to compute $E$.)" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3×3 Matrix{Int64}:\n", " 1 0 0\n", " -1 1 0\n", " -3 1 1" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "E1*E2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Elimination without row swaps: U = (lower triangular) * A\n", "\n", "Notice, furthermore, that the matrices $E_1$ and $E_2$ are both *lower-triangular matrices*. 
This is a consequence of the structure of Gaussian elimination (assuming no row re-ordering): we always add the pivot row to rows *below* it, never *above* it.\n", "\n", "The *product* of lower-triangular matrices is always lower-triangular too. (In homework, you will explore a similar property for upper-triangular matrices.) In consequence, the product $E = E_2 E_1$ is lower-triangular, and Gaussian elimination can be viewed as yielding $EA=U$ where $E$ is lower triangular and $U$ is upper triangular.\n", "\n", "*Caveat:* there is a slight complication if we have to do *row swap* steps, since row swaps are not lower triangular. We will defer this case until later, since row swaps don't arise in hand calculations unless you are unlucky." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# The other way around: A = LU\n", "\n", "There are two problems with writing $U = EA$.\n", "\n", "1. The matrix $E$ is annoying to compute — the different elimination steps are mixed up together, and we don't *actually* want to multiply those matrices together.\n", "\n", "2. We want to think of this process as **simplifying A**, i.e. we want to *replace* $A$ with $A = \\mbox{simpler stuff}$.\n", "\n", "Both of these concerns are addressed by thinking about Gaussian elimination **in reverse**: *How do we get A back from U*? Again, this consists of row operations on $U$, and will result in\n", "\n", "$$\n", "A = LU\n", "$$\n", "\n", "where $L$ is a lower-triangular matrix with 1's on the diagonal, and the **multipliers from the elimination steps** written below the diagonal.\n", "\n", "See handwritten notes on:\n", "\n", "1. How to get $L$: just a record of the elimination steps, no extra computation!\n", "2. 
How to *use* $A = LU$ to solve $Ax=b$ for any right-hand side(s) we want.\n", "\n", "Check:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "LU{Float64, Matrix{Float64}}\n", "L factor:\n", "3×3 Matrix{Float64}:\n", " 1.0 0.0 0.0\n", " 1.0 1.0 0.0\n", " 3.0 -1.0 1.0\n", "U factor:\n", "3×3 Matrix{Float64}:\n", " 1.0 3.0 1.0\n", " 0.0 -2.0 -2.0\n", " 0.0 0.0 1.0" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "lu(A, NoPivot())" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3×3 Matrix{Float64}:\n", " 1.0 0.0 0.0\n", " 1.0 1.0 0.0\n", " 3.0 -1.0 1.0" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "E^-1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# The Identity Matrix\n", "\n", "A very important special matrix is the *identity* matrix $I$. Here is a $ 5 \\times 5$ identity matrix:\n", "\n", "$$\n", "I = \\begin{pmatrix} 1&0&0&0&0 \\\\ 0&1&0&0&0 \\\\ 0&0&1&0&0 \\\\ 0&0&0&1&0 \\\\ 0&0&0&0&1 \\end{pmatrix}\n", "= \\begin{pmatrix} e_1 & e_2 & e_3 & e_4 & e_5 \\end{pmatrix}\n", "$$\n", "\n", "where the columns $e_1 \\cdots e_5$ of $I$ are the **unit vectors** in each component.\n", "\n", "The identity matrix, which can be constructed by `I(5)` in Julia, has the property that $Ix=x$ for any $x$, and hence $IA = A$ for any (here $5 \\times 5$) matrix $A$:" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5×5 Matrix{Int64}:\n", " 4 -2 -7 -4 -8\n", " 9 -6 -6 -1 -5\n", " -2 -9 3 -5 2\n", " 9 7 -9 5 -8\n", " -1 6 -3 9 6" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A = [4 -2 -7 -4 -8\n", " 9 -6 -6 -1 -5\n", " -2 -9 3 -5 2\n", " 9 7 -9 5 -8\n", " -1 6 -3 9 6] # a randomly chosen 5x5 matrix" ] }, { "cell_type": "code", "execution_count": 21, 
"metadata": {}, "outputs": [ { "data": { "text/plain": [ "5-element Vector{Int64}:\n", " -7\n", " 2\n", " 4\n", " -4\n", " -7" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "b = [-7,2,4,-4,-7] # a randomly chosen right-hand side" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5×5 Diagonal{Bool, Vector{Bool}}:\n", " 1 ⋅ ⋅ ⋅ ⋅\n", " ⋅ 1 ⋅ ⋅ ⋅\n", " ⋅ ⋅ 1 ⋅ ⋅\n", " ⋅ ⋅ ⋅ 1 ⋅\n", " ⋅ ⋅ ⋅ ⋅ 1" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "I(5)" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "true" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "I(5) * b == b" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "true" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "I(5) * A == A" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "true" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A * I(5) == A" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Notational ambiguity\n", "\n", "In principle, \"$I$\" is a little ambiguous in linear algebra, because there are $m \\times m$ identity matrices **of every size** $m$, and $I$ could refer to *any* of them. 
In practice, you usually infer the size of $I$ **from context**: if you see an expression like $Ix$ or $IA$ or $A + I$, then $I$ refers to the identity matrix of the corresponding size.\n", "\n", "There is actually a built-in constant called `I` in Julia that represents just that: an identity matrix whose size is inferred from context:" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "UniformScaling{Bool}\n", "true*I" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "I" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5-element Vector{Int64}:\n", " -7\n", " 2\n", " 4\n", " -4\n", " -7" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "I * b" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "true" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "I * A == A == A * I" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2-element Vector{Int64}:\n", " 2\n", " 3" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "I * [2, 3]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When there is an ambiguity, I will sometimes write $I_m$ for the $m \\times m$ identity matrix.\n", "\n", "For example, if $B$ is a $3\\times2$ matrix, then\n", "\n", "$$\n", "I_3 B I_2 = B\n", "$$\n", "\n", "where we need to multiply by a different-size identity matrix on the left and right." 
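] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(An added aside: because `I` is a `UniformScaling` object rather than a stored matrix, expressions like $A + 5I$ also work without ever specifying a size; Julia simply adds the scalar along the diagonal:)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "[1 2\n", " 3 4] + 5I # adds 5 along the diagonal, giving [6 2; 3 9]"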
] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3×2 Matrix{Int64}:\n", " 2 3\n", " 4 5\n", " 6 7" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "B = [2 3\n", " 4 5\n", " 6 7]" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3×2 Matrix{Int64}:\n", " 2 3\n", " 4 5\n", " 6 7" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "I * B * I" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3×2 Matrix{Int64}:\n", " 2 3\n", " 4 5\n", " 6 7" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "I(3) * B * I(2)" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "ename": "LoadError", "evalue": "DimensionMismatch(\"second dimension of D, 2, does not match the first of B, 3\")", "output_type": "error", "traceback": [ "DimensionMismatch(\"second dimension of D, 2, does not match the first of B, 3\")", "", "Stacktrace:", " [1] lmul!(D::Diagonal{Bool, Vector{Bool}}, B::Matrix{Int64})", " @ LinearAlgebra /Applications/Julia-1.7.app/Contents/Resources/julia/share/julia/stdlib/v1.7/LinearAlgebra/src/diagonal.jl:241", " [2] *(D::Diagonal{Bool, Vector{Bool}}, A::Matrix{Int64})", " @ LinearAlgebra /Applications/Julia-1.7.app/Contents/Resources/julia/share/julia/stdlib/v1.7/LinearAlgebra/src/diagonal.jl:224", " [3] _tri_matmul(A::Diagonal{Bool, Vector{Bool}}, B::Matrix{Int64}, C::Diagonal{Bool, Vector{Bool}}, δ::Nothing)", " @ LinearAlgebra /Applications/Julia-1.7.app/Contents/Resources/julia/share/julia/stdlib/v1.7/LinearAlgebra/src/matmul.jl:1132", " [4] _tri_matmul", " @ /Applications/Julia-1.7.app/Contents/Resources/julia/share/julia/stdlib/v1.7/LinearAlgebra/src/matmul.jl:1124 [inlined]", " [5] *(A::Diagonal{Bool, Vector{Bool}}, B::Matrix{Int64}, 
C::Diagonal{Bool, Vector{Bool}})", " @ LinearAlgebra /Applications/Julia-1.7.app/Contents/Resources/julia/share/julia/stdlib/v1.7/LinearAlgebra/src/matmul.jl:1120", " [6] top-level scope", " @ In[33]:1", " [7] eval", " @ ./boot.jl:373 [inlined]", " [8] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)", " @ Base ./loading.jl:1196" ] } ], "source": [ "I(2) * B * I(3) # should give an error: wrong-shape matrices!" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3×2 Matrix{Int64}:\n", " 2 3\n", " 4 5\n", " 6 7" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "I * B * I" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Matrix Inverses\n", "\n", "It is often conceptually convenient to talk about the *inverse* $A^{-1}$ of a matrix $A$, which exists **for any non-singular square matrix**. This is the matrix such that $x = A^{-1}b$ solves $Ax = b$ for any $b$. The inverse is convenient because it allows us to move matrices around in equations *almost* like numbers (except that matrices don't commute!)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Equivalently, the inverse matrix $A^{-1}$ is the matrix such that $A^{-1} A = A A^{-1} = I$.\n", "\n", "Why does this correspond to solving $Ax=b$? 
Multiplying both sides on the *left* by $A^{-1}$ (multiplying on the *right* would make no sense: we can't multiply vector×matrix!), we get \n", "\n", "$$\n", "A^{-1}Ax = Ix = x = A^{-1}b\n", "$$\n", "\n", "In Julia, you can just do `inv(A)` to get the inverse of a matrix:" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5×5 Matrix{Float64}:\n", " 0.0109991 0.529789 -0.908341 -0.635197 -0.0879927\n", " 0.131989 0.35747 -0.900092 -0.622365 -0.055912\n", " -0.235564 -0.179652 0.370302 0.353804 -0.11549\n", " -0.301558 -0.69172 1.48701 1.16499 0.0791323\n", " 0.2044 0.678582 -1.29667 -1.05408 0.0314696" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "inv(A)" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [], "source": [ "# round each entry to 3 digits (note the digits= keyword)\n", "round.(inv(A) * A, digits=3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(This is the identity matrix up to **roundoff errors**: computers only keep about 15 significant digits when they are doing most arithmetic.)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Inverses and products\n", "\n", "Inverses have a special relationship to matrix products:\n", "$$\n", "(AB)^{-1} = B^{-1} A^{-1}\n", "$$\n", "\n", "The reason for this is that we must have $(AB)^{-1} AB = I$, and it is easy to see that $B^{-1} A^{-1}$ does the trick. Equivalently, $AB$ is the matrix that first multiplies by $B$ then by $A$; to invert this, we must *reverse the steps*: first multiply by the inverse of $A$ and then by the inverse of $B$." 
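] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To spell out why $B^{-1} A^{-1}$ \"does the trick\" in one line:\n", "$$\n", "(B^{-1} A^{-1})(AB) = B^{-1} (A^{-1} A) B = B^{-1} I B = B^{-1} B = I\n", "$$"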
] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "4×4 Matrix{Float64}:\n", " 2.22127 1.98224 -3.57568 -1.32172\n", " 2.93196 -1.87908 1.29332 -1.45601\n", " 1.02713 1.25965 -0.0833894 -2.21414\n", " -5.81069 -0.203271 2.05968 4.76163" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "C = rand(4,4)\n", "D = rand(4,4)\n", "inv(C*D)" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "4×4 Matrix{Float64}:\n", " 2.22127 1.98224 -3.57568 -1.32172\n", " 2.93196 -1.87908 1.29332 -1.45601\n", " 1.02713 1.25965 -0.0833894 -2.21414\n", " -5.81069 -0.203271 2.05968 4.76163" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "inv(D)*inv(C)" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "4×4 Matrix{Float64}:\n", " 0.0647283 -3.91299 4.92046 -3.07784\n", " -1.34403 -0.547912 -0.499501 3.0236\n", " -1.23397 -0.192993 3.08178 -1.93372\n", " 1.8898 3.55496 -5.53689 2.42184" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "inv(C)*inv(D) # this is not the inverse of C*D" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Gaussian Elimination, Inverses, and LU Factorization\n", "\n", "Recall from the last lecture that we were looking at the Gaussian-elimination steps to convert a matrix $A$ into **upper-triangular form** via a sequence of **row operations**:\n", "\n", "$$\n", "\\underbrace{\\begin{pmatrix}\n", "\\boxed{1} & 3 & 1 \\\\\n", "1 & 1 & -1 \\\\\n", "3 & 11 & 6 \n", "\\end{pmatrix}}_A\\to\n", "\\underbrace{\\begin{pmatrix}\n", "\\boxed{1} & 3 & 1 \\\\\n", "0 & \\boxed{-2} & -2 \\\\\n", "0 & 2 & 3 \n", "\\end{pmatrix}}_{E_1 A}\\to\n", "\\underbrace{\\begin{pmatrix}\n", "\\boxed{1} & 3 & 1 \\\\\n", "0 & \\boxed{-2} & -2 \\\\\n", "0 & 0 & \\boxed{1} \n", 
"\\end{pmatrix}}_{E_2 E_1 A}\n", "$$\n", "\n", "Since row operations correspond to **multiplying by matrices on the *left***, we constructed the \"elimination matrices\" $E_1$ and $E_2$ corresponding to each of these elimination steps (under the first and second pivots)." ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3×3 Matrix{Int64}:\n", " 1 0 0\n", " 0 1 0\n", " 0 1 1" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A = [1 3 1\n", " 1 1 -1\n", " 3 11 6]\n", "E1 = [ 1 0 0\n", " -1 1 0\n", " -3 0 1]\n", "E2 = [1 0 0\n", " 0 1 0\n", " 0 1 1]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Thus, we arrived at the formula:\n", "$$\n", "\\underbrace{E_2 E_1}_E A = U\n", "$$\n", "Notice that we multiplied $A$ by the elimination matrices from *right to left* in the order of the steps: it is $E_2 E_1 A$, *not* $E_1 E_2 A$. Because matrix multiplication is generally [not commutative](https://en.wikipedia.org/wiki/Commutative_property), $E_2 E_1$ and $E_1 E_2$ give *different* matrices:" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3×3 Matrix{Int64}:\n", " 1 0 0\n", " -1 1 0\n", " -4 1 1" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "E2*E1" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3×3 Matrix{Int64}:\n", " 1 0 0\n", " -1 1 0\n", " -3 1 1" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "E1*E2" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3×3 Matrix{Float64}:\n", " 1.0 0.0 0.0\n", " 1.0 1.0 0.0\n", " 3.0 -1.0 1.0" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "inv(E2*E1)" ] }, { "cell_type": "code", "execution_count": 44, 
"metadata": {}, "outputs": [ { "data": { "text/plain": [ "3×3 Matrix{Int64}:\n", " 1 0 0\n", " -1 1 0\n", " -3 0 1" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "E1" ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3×3 Matrix{Int64}:\n", " 1 0 0\n", " 0 1 0\n", " 0 1 1" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "E2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice, furthermore, that the matrices $E_1$ and $E_2$ are both **lower-triangular matrices**. This is a consequence of the structure of Gaussian elimination (assuming no row re-ordering): we always add the pivot row to rows *below* it, never *above* it.\n", "\n", "The *product* of lower-triangular matrices is always lower-triangular too. (In homework, you will explore a similar property for upper-triangular matrices.) In consequence, the product $E = E_2 E_1$ is lower-triangular, and Gaussian elimination can be viewed as yielding\n", "\n", "$$EA=U$$\n", "\n", "where $E$ is lower triangular and $U$ is upper triangular!\n", "\n", "# Inverse elimination: LU factors\n", "\n", "However, in practice, it turns out to be more useful to write this as\n", "\n", "$$A= E^{-1} U = E_1^{-1}E_2^{-1}U$$\n", "\n", "where $E^{-1}$ is the [inverse of the matrix](http://mathworld.wolfram.com/MatrixInverse.html) $E$. We will have more to say about how to compute matrix inverses later in 18.06, but for now we just need to know that it is the matrix that **reverses the steps** of Gaussian elimination, taking us back from $U$ to $A$. Computing matrix inverses is laborious in general, but in this particular case it is easy. We just need to *reverse the steps one by one* starting with the *last* elimination step and working back to the *first* one. 
\n", "\n", "Hence, just as for our general rule about inverses of products above, we need to reverse (invert) $E_2$ *first* on $U$, and *then* reverse (invert) $E_1$: $A = E_1^{-1} E_2^{-1} U$. But reversing an individual elimination step like $E_2$ is easy: we just **flip the signs below the diagonal**, so that wherever we *added* the pivot row we *subtract* and vice-versa. That is:\n", "$$\n", "\\begin{pmatrix} 1 & 0 & 0 \\\\ 0 & 1 & 0 \\\\ 0 & 1 & 1 \\end{pmatrix}^{-1} =\n", "\\begin{pmatrix} 1 & 0 & 0 \\\\ 0 & 1 & 0 \\\\ 0 & -1 & 1 \\end{pmatrix}\n", "$$\n", "(The last elimination step was adding the second row to the third row, so we reverse it by *subtracting* the second row from the third row of $U$.)\n", "\n", "Julia can compute matrix inverses for us with the `inv` function. (It doesn't know the trick of flipping the sign, which only works for very special matrices, but it can compute it the \"hard way\" so quickly (for such a small matrix) that it doesn't matter.) Of course that gives the same result:" ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3×3 Matrix{Float64}:\n", " 1.0 0.0 0.0\n", " 0.0 1.0 0.0\n", " 0.0 -1.0 1.0" ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "inv(E2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Similarly for $E_1$:" ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3×3 Matrix{Float64}:\n", " 1.0 0.0 0.0\n", " 1.0 1.0 0.0\n", " 3.0 0.0 1.0" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "inv(E1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we didn't make any mistakes, then $E_1^{-1} E_2^{-1} U$ should give $A$, and it does:" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [], "source": [ "# recompute U; NoPivot() is the current syntax (replacing the old Val{false})\n", "L, U = lu(A, NoPivot())\n", "U" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3×3 Matrix{Float64}:\n", " 1.0 3.0 1.0\n", " 1.0 1.0 -1.0\n", " 3.0 11.0 6.0" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "inv(E1)*inv(E2)*U" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We call the *inverse* elimination matrix $L = E^{-1} = E_1^{-1} 
E_2^{-1}$. Since the inverses of each elimination matrix were lower-triangular (with flipped signs), their product $L$ is also lower-triangular:" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3×3 Matrix{Float64}:\n", " 1.0 0.0 0.0\n", " 1.0 1.0 0.0\n", " 3.0 -1.0 1.0" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "L = inv(E1)*inv(E2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As mentioned above, this is the same as the inverse of $E = E_2 E_1$:" ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3×3 Matrix{Float64}:\n", " 1.0 0.0 0.0\n", " 1.0 1.0 0.0\n", " 3.0 -1.0 1.0" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "inv(E2*E1)" ] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3×3 Matrix{Int64}:\n", " 1 0 0\n", " -1 1 0\n", " -3 0 1" ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "E1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The final result, therefore, is that Gaussian elimination (without row swaps) can be viewed as a *factorization* of the original matrix $A$\n", "$$\n", "A = LU\n", "$$\n", "into a **product of lower- and upper-triangular matrices**. (Furthermore, although we didn't comment on this above, $L$ always has 1's along its diagonal.) This factorization is called the [LU factorization](https://en.wikipedia.org/wiki/LU_decomposition) of $A$. (It's why we used the `lu` function in Julia above.) When a computer performs Gaussian elimination, what it computes are the $L$ and $U$ factors.\n", "\n", "What this accomplishes is to break a complicated matrix $A$ into **much simpler pieces** $L$ and $U$. 
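For example, solving $Ax=b$ takes just two fast triangular solves once you have the factors. Here is a sketch, reusing the 3×3 `A`, `L`, and `U` from above with a made-up right-hand side `b`:\n", "\n", "```julia\n", "b = [9.0, -1.0, 35.0]   # a made-up right-hand side\n", "c = L \\ b               # forward-substitution: solve Lc = b\n", "x = U \\ c               # back-substitution: solve Ux = c\n", "# then A*x ≈ b, up to roundoff\n", "```\n", "\n", "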
It may not seem at first that $L$ and $U$ are *that* much simpler than $A$, but they are: lots of operations that are very difficult with $A$, like solving equations or computing the determinant, become *easy* once you know $L$ and $U$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# WARNING: For MOST triangular matrices, negation ≠ inverse\n", "\n", "Above, we used the trick that we can invert a *single* elimination matrix $E$ just by flipping the signs below the diagonal (to reverse the steps). **This doesn't work for a general lower (or upper) triangular matrix**.\n", "\n", "For example, even for the $L$ matrix computed above, it doesn't work:" ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3×3 Matrix{Float64}:\n", " 1.0 0.0 0.0\n", " 1.0 1.0 0.0\n", " 3.0 -1.0 1.0" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "L" ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3×3 Matrix{Float64}:\n", " 1.0 0.0 0.0\n", " -1.0 1.0 0.0\n", " -4.0 1.0 1.0" ] }, "execution_count": 54, "metadata": {}, "output_type": "execute_result" } ], "source": [ "inv(L)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that the 3 in the lower-left corner of $L$ changes to a -4 in the lower-left of $L^{-1}$!\n", "\n", "The reason it works for the \"elementary\" $E$ matrices representing elimination of a *single column* of $A$, but not for $L$ (or other lower-triangular matrices in general), is essentially that operations on different columns affect one another (the elimination steps don't commute).\n", "\n", "In fact, there is *no easy trick to invert a general lower/upper triangular matrix* quickly. But you shouldn't have to! If you see an expression like $L^{-1} b$, you shouldn't compute $L^{-1}$ explicitly! 
Instead, solve $Lc = b$ by **forward-substitution.**\n", "\n", "Similarly, if you see $U^{-1} c$ for an upper-triangular matrix, don't try to compute $U^{-1}$. Instead, solve $Ux=c$ by **back-substitution.**\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Finding Matrix Inverses in General\n", "\n", "How do we find $A^{-1}$? The key is the equation $A A^{-1} = I$, which looks just like $AX=B$ for the **right-hand sides consisting of the columns of the identity matrix**, i.e. the unit vectors. So, we just solve $Ax=e_i$ for each column $i=1,\\ldots,n$ (here $n=5$ for the matrix below), or equivalently do `A \\ I(5)` in Julia. Of course, Julia comes with a built-in function `inv(A)` for computing $A^{-1}$ as well:" ] }, { "cell_type": "code", "execution_count": 55, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5×5 Matrix{Int64}:\n", " 4 -2 -7 -4 -8\n", " 9 -6 -6 -1 -5\n", " -2 -9 3 -5 2\n", " 9 7 -9 5 -8\n", " -1 6 -3 9 6" ] }, "execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A = [4 -2 -7 -4 -8\n", " 9 -6 -6 -1 -5\n", " -2 -9 3 -5 2\n", " 9 7 -9 5 -8\n", " -1 6 -3 9 6] # a randomly chosen 5x5 matrix" ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5×5 Matrix{Float64}:\n", " 0.0109991 0.529789 -0.908341 -0.635197 -0.0879927\n", " 0.131989 0.35747 -0.900092 -0.622365 -0.055912\n", " -0.235564 -0.179652 0.370302 0.353804 -0.11549\n", " -0.301558 -0.69172 1.48701 1.16499 0.0791323\n", " 0.2044 0.678582 -1.29667 -1.05408 0.0314696" ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Ainv = A \\ I(5)" ] }, { "cell_type": "code", "execution_count": 57, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5×5 Matrix{Float64}:\n", " -1.73472e-18 -2.22045e-16 0.0 -1.11022e-16 0.0\n", " 2.77556e-17 5.55112e-17 -1.11022e-16 -2.22045e-16 1.38778e-17\n", " 2.77556e-17 1.11022e-16 -2.22045e-16 -1.11022e-16 1.38778e-17\n", " 0.0 -1.11022e-16 0.0 0.0 
-1.38778e-17\n", " 0.0 -1.11022e-16 0.0 0.0 0.0" ] }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Ainv - inv(A)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The difference is just roundoff errors. We can also check approximate equality (ignoring roundoff differences) with `≈` (typed by `\\approx` followed by a *tab*):" ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "true" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Ainv ≈ inv(A)" ] }, { "cell_type": "code", "execution_count": 59, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5×5 Matrix{Float64}:\n", " 1.0 1.11022e-16 1.72085e-15 -6.66134e-16 1.88738e-15\n", " -1.70697e-15 1.0 1.9984e-15 -6.66134e-16 1.33227e-15\n", " 4.44089e-16 1.11022e-15 1.0 4.44089e-16 -4.44089e-16\n", " 1.19349e-15 -5.55112e-17 2.77556e-17 1.0 1.72085e-15\n", " -1.45023e-15 -1.05471e-15 -8.04912e-16 -2.77556e-16 1.0" ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Ainv * A" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(Again, we get $I$ up to roundoff errors because the computer does arithmetic only to 15–16 significant digits.)" ] }, { "cell_type": "code", "execution_count": 60, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5×5 Matrix{Float64}:\n", " 1.0 -8.88178e-16 1.77636e-15 0.0 -5.55112e-17\n", " 0.0 1.0 -8.88178e-16 0.0 -5.55112e-17\n", " -1.66533e-16 4.44089e-16 1.0 2.66454e-15 1.38778e-17\n", " 0.0 -8.88178e-16 0.0 1.0 -5.55112e-17\n", " 6.66134e-16 1.77636e-15 -3.55271e-15 -3.55271e-15 1.0" ] }, "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A * Ainv" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Normally, $AB \\ne BA$ for two matrices $A$ and $B$. 
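For example, with two small made-up $2\\times 2$ matrices:\n", "\n", "```julia\n", "B = [0 1; 1 0]; C = [1 2; 3 4]   # made-up matrices (named to avoid clobbering A)\n", "B*C == C*B   # false: B*C swaps the rows of C, while C*B swaps its columns\n", "```\n", "\n", "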
Why can we multiply $A$ by $A^{-1}$ on either the left or right and get the same answer $I$? It is fairly easy to see why:\n", "\n", "$$\n", "A A^{-1} = I \\implies A A^{-1} A = IA = A = A (A^{-1} A)\n", "$$\n", "\n", "Since $A (A^{-1} A) = A = AI$, and $A$ is non-singular (so the equation $AX = A$ has a *unique* solution $X$), we must have $A^{-1} A = I$." ] }, { "cell_type": "code", "execution_count": 61, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5×2 Matrix{Float64}:\n", " 0.505958 0.505958\n", " -0.928506 -0.928506\n", " 2.16407 2.16407\n", " 1.46166 1.46166\n", " -1.26428 -1.26428" ] }, "execution_count": 61, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[A\\b Ainv*b] # print the two results side-by-side" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# (Almost) Never Compute Inverses!\n", "\n", "Matrix inverses are funny, however:\n", "\n", "* Inverse matrices are very convenient in *analytical* manipulations, because they allow you to move matrices from one side to the other of equations easily.\n", "\n", "* Inverse matrices are **almost never computed** in \"serious\" numerical calculations. Whenever you see $A^{-1} B$ (or $A^{-1} b$), when you go to *implement* it on a computer you should *read* $A^{-1} B$ as \"solve $AX = B$ by some method.\" e.g. solve it by `A \\ B` or by first computing the LU factorization of $A$ and then using it to solve $AX = B$.\n", "\n", "One reason that you don't usually compute inverse matrices is that it is wasteful: once you have $A=LU$ (later we will generalize this to \"$PA = LU$\"), you can solve $AX=B$ directly without bothering to find $A^{-1}$, and computing $A^{-1}$ is much more work than necessary if you only have to solve a few right-hand sides.\n", "\n", "Another reason is that for many special matrices, there are ways to solve $AX=B$ *much* more quickly than you can find $A^{-1}$. 
For example, many large matrices in practice are [sparse](https://en.wikipedia.org/wiki/Sparse_matrix) (mostly zero), and often for sparse matrices you can arrange for $L$ and $U$ to be sparse too. Sparse matrices are much more efficient to work with than general \"dense\" matrices because you don't have to multiply (or even store) the zeros. Even if $A$ is sparse, however, $A^{-1}$ is usually non-sparse, so you lose the special efficiency of sparsity if you compute the inverse matrix." ] } ], "metadata": { "@webio": { "lastCommId": null, "lastKernelId": null }, "kernelspec": { "display_name": "Julia 1.7.1", "language": "julia", "name": "julia-1.7" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.7.1" } }, "nbformat": 4, "nbformat_minor": 2 }