{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Julia Version 1.6.2\n",
"Commit 1b93d53fc4 (2021-07-14 15:36 UTC)\n",
"Platform Info:\n",
" OS: macOS (x86_64-apple-darwin18.7.0)\n",
" CPU: Intel(R) Core(TM) i5-8279U CPU @ 2.40GHz\n",
" WORD_SIZE: 64\n",
" LIBM: libopenlibm\n",
" LLVM: libLLVM-11.0.1 (ORCJIT, skylake)\n"
]
}
],
"source": [
"versioninfo()"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\u001b[32m\u001b[1m Activating\u001b[22m\u001b[39m environment at `~/Dropbox/class/M1399.000200/2021/M1399_000200-2021fall/Project.toml`\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[32m\u001b[1m Status\u001b[22m\u001b[39m `~/Dropbox/class/M1399.000200/2021/M1399_000200-2021fall/Project.toml`\n",
" \u001b[90m [7d9fca2a] \u001b[39mArpack v0.5.3\n",
" \u001b[90m [6e4b80f9] \u001b[39mBenchmarkTools v1.1.4\n",
" \u001b[90m [1e616198] \u001b[39mCOSMO v0.8.1\n",
" \u001b[90m [f65535da] \u001b[39mConvex v0.14.13\n",
" \u001b[90m [a93c6f00] \u001b[39mDataFrames v1.2.2\n",
" \u001b[90m [31a5f54b] \u001b[39mDebugger v0.6.8\n",
" \u001b[90m [31c24e10] \u001b[39mDistributions v0.24.18\n",
" \u001b[90m [e2685f51] \u001b[39mECOS v0.12.3\n",
" \u001b[90m [f6369f11] \u001b[39mForwardDiff v0.10.19\n",
" \u001b[90m [28b8d3ca] \u001b[39mGR v0.58.1\n",
" \u001b[90m [c91e804a] \u001b[39mGadfly v1.3.3\n",
" \u001b[90m [bd48cda9] \u001b[39mGraphRecipes v0.5.7\n",
" \u001b[90m [82e4d734] \u001b[39mImageIO v0.5.8\n",
" \u001b[90m [6218d12a] \u001b[39mImageMagick v1.2.1\n",
" \u001b[90m [916415d5] \u001b[39mImages v0.24.1\n",
" \u001b[90m [b6b21f68] \u001b[39mIpopt v0.7.0\n",
" \u001b[90m [42fd0dbc] \u001b[39mIterativeSolvers v0.9.1\n",
" \u001b[90m [4076af6c] \u001b[39mJuMP v0.21.9\n",
" \u001b[90m [b51810bb] \u001b[39mMatrixDepot v1.0.4\n",
" \u001b[90m [1ec41992] \u001b[39mMosekTools v0.9.4\n",
" \u001b[90m [76087f3c] \u001b[39mNLopt v0.6.3\n",
" \u001b[90m [47be7bcc] \u001b[39mORCA v0.5.0\n",
" \u001b[90m [a03496cd] \u001b[39mPlotlyBase v0.4.3\n",
" \u001b[90m [f0f68f2c] \u001b[39mPlotlyJS v0.15.0\n",
" \u001b[90m [91a5bcdd] \u001b[39mPlots v1.21.2\n",
" \u001b[90m [438e738f] \u001b[39mPyCall v1.92.3\n",
" \u001b[90m [d330b81b] \u001b[39mPyPlot v2.9.0\n",
" \u001b[90m [dca85d43] \u001b[39mQuartzImageIO v0.7.3\n",
" \u001b[90m [6f49c342] \u001b[39mRCall v0.13.12\n",
" \u001b[90m [ce6b1742] \u001b[39mRDatasets v0.7.5\n",
" \u001b[90m [c946c3f1] \u001b[39mSCS v0.7.1\n",
" \u001b[90m [276daf66] \u001b[39mSpecialFunctions v1.6.1\n",
" \u001b[90m [2913bbd2] \u001b[39mStatsBase v0.33.10\n",
" \u001b[90m [b8865327] \u001b[39mUnicodePlots v2.0.1\n",
" \u001b[90m [0f1e0344] \u001b[39mWebIO v0.8.15\n",
" \u001b[90m [8f399da3] \u001b[39mLibdl\n",
" \u001b[90m [2f01184e] \u001b[39mSparseArrays\n",
" \u001b[90m [10745b16] \u001b[39mStatistics\n"
]
}
],
"source": [
"using Pkg\n",
"Pkg.activate(\"../..\")\n",
"Pkg.status()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Cholesky Decomposition\n",
"\n",
"\n",
"\n",
"[André-Louis Cholesky, 1875-1918](https://en.wikipedia.org/wiki/André-Louis_Cholesky).\n",
"\n",
"* A basic tenet in numerical analysis: \n",
"\n",
"> **The structure should be exploited whenever possible in solving a problem.** \n",
"\n",
" Common structures include: symmetry, positive (semi)definiteness, sparsity, Kronecker product, low rank, ...\n",
"\n",
"* LU decomposition (Gaussian Elimination) is **not** used in statistics very often because most of time statisticians deal with positive (semi)definite matrices.\n",
"\n",
"* Recall that a matrix $M$ is positive (semi)definite if\n",
"$$\n",
" \\mathbf{x}^T\\mathbf{M}\\mathbf{x} > 0 ~(\\ge 0), \\quad \\forall \\mathbf{x}\\neq\\mathbf{0}.\n",
"$$\n",
"\n",
"* Example: normal equation \n",
"$$\n",
" \\mathbf{X}^T \\mathbf{X} \\beta = \\mathbf{X}^T \\mathbf{y}\n",
"$$\n",
"for linear regression, the coefficient matrix $\\mathbf{X}^T \\mathbf{X}$ is symmetric and positive semidefinite. How to exploit this structure?\n",
"\n",
"* Most of time statisticians deal with positive (semi)definite matrices."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Cholesky decomposition\n",
"\n",
"* **Theorem**: Let $\\mathbf{A} \\in \\mathbb{R}^{n \\times n}$ be symmetric and positive definite. Then $\\mathbf{A} = \\mathbf{L} \\mathbf{L}^T$, where $\\mathbf{L}$ is lower triangular with positive diagonal entries and is unique. \n",
"**Proof** (by induction): \n",
"If $n=1$, then $0 < a = \\sqrt{a}\\sqrt{a}$. For $n>1$, the block equation\n",
"$$\n",
"\\begin{eqnarray*}\n",
"\\begin{bmatrix}\n",
"a_{11} & \\mathbf{a}^T \\\\ \\mathbf{a} & \\mathbf{A}_{22}\n",
"\\end{bmatrix} =\n",
"\\begin{bmatrix}\n",
"\t\\ell_{11} & \\mathbf{0}_{n-1}^T \\\\ \\mathbf{b} & \\mathbf{L}_{22}\n",
"\\end{bmatrix}\n",
"\\begin{bmatrix}\n",
"\t\\ell_{11} & \\mathbf{b}^T \\\\ \\mathbf{0}_{n-1} & \\mathbf{L}_{22}^T\n",
"\\end{bmatrix}\n",
"\\end{eqnarray*}\n",
"$$\n",
"entails\n",
"$$\n",
"\\begin{eqnarray*}\n",
"\ta_{11} &=& \\ell_{11}^2 \\\\\n",
"\t\\mathbf{a} &=& \\ell_{11} \\mathbf{b}\t\\\\\n",
"\t\\mathbf{A}_{22} &=& \\mathbf{b} \\mathbf{b}^T + \\mathbf{L}_{22} \\mathbf{L}_{22}^T\n",
" .\n",
"\\end{eqnarray*}\n",
"$$ \n",
"Since $a_{11}>0$ (why?), $\\ell_{11}=\\sqrt{a_{11}}$ and $\\mathbf{b}=a_{11}^{-1/2}\\mathbf{a}$ are uniquely determined. \n",
"$\\mathbf{A}_{22} - \\mathbf{b} \\mathbf{b}^T = \\mathbf{A}_{22} - a_{11}^{-1} \\mathbf{a} \\mathbf{a}^T$ is positive definite of size $(n-1)\\times(n-1)$ because $\\mathbf{A}_{22}$ is positive definite (why? homework). By induction hypothesis, lower triangular $\\mathbf{L}_{22}$ such that $\\mathbf{A}_{22} - \\mathbf{b} \\mathbf{b}^T = \\mathbf{L}_{22}^T\\mathbf{L}_{22}$ exists and is unique."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* This proof is constructive and completely specifies the algorithm: \n",
"```Julia\n",
"for k=1:n\n",
" for j=k+1:n\n",
" a_jk_div_a_kk = A[j, k] / A[k, k] \n",
" for i=j:n\n",
" A[i, j] -= A[i, k] * a_jk_div_a_kk # L_22\n",
" end\n",
" end\n",
" sqrt_akk = sqrt(A[k, k])\n",
" for i=k:n\n",
" A[i, k] /= sqrt_akk # l_11 and b\n",
" end\n",
"end\n",
"```\n",
"\n",
"\n",
"\n",
"Source: \n",
"\n",
"* Computational cost: \n",
"$$\n",
"\\frac{1}{2} [2(n-1)^2 + 2(n-2)^2 + \\cdots + 2 \\cdot 1^2] \\approx \\frac{1}{3} n^3 \\quad \\text{flops}\n",
"$$ \n",
"plus $n$ square roots. **Half** the cost of LU decomposition.\n",
"\n",
"* In general Cholesky decomposition is very stable. Failure of the decomposition simply means $\\mathbf{A}$ is not positive definite. **It is an efficient way to test positive definiteness.**\n",
"\n",
"\n",
"## Pivoting\n",
"\n",
"* Pivoting is only needed if you want the Cholesky factor of a positive *semidefinite* matrix.\n",
"\n",
"* When $\\mathbf{A}$ does not have full rank, e.g., $\\mathbf{X}^T \\mathbf{X}$ with a non-full column rank $\\mathbf{X}$, we encounter $a_{kk} = 0$ during the procedure.\n",
"\n",
"* **Symmetric pivoting**. At each stage $k$, we permute both row and column such that $\\max_{k \\le i \\le n} a_{ii}$ becomes the pivot. If we encounter $\\max_{k \\le i \\le n} a_{ii} = 0$, then $\\mathbf{A}[k:n,k:n] = \\mathbf{0}$ (why?) and the algorithm terminates.\n",
"\n",
"* With symmetric pivoting: \n",
"$$\n",
"\\mathbf{P} \\mathbf{A} \\mathbf{P}^T = \\mathbf{L} \\mathbf{L}^T,\n",
"$$\n",
"where $\\mathbf{P}$ is a permutation matrix and $\\mathbf{L} \\in \\mathbb{R}^{n \\times r}$, $r = \\text{rank}(\\mathbf{A})$.\n",
"\n",
"## Implementation\n",
"\n",
"* LAPACK functions: [`?potrf`](http://www.netlib.org/lapack/explore-html/d1/d7a/group__double_p_ocomputational_ga2f55f604a6003d03b5cd4a0adcfb74d6.html#ga2f55f604a6003d03b5cd4a0adcfb74d6) (without pivoting), [`?pstrf`](http://www.netlib.org/lapack/explore-html/da/dba/group__double_o_t_h_e_rcomputational_ga31cdc13a7f4ad687f4aefebff870e1cc.html#ga31cdc13a7f4ad687f4aefebff870e1cc) (with pivoting).\n",
"\n",
"* Julia functions: [`cholesky`](https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/#LinearAlgebra.cholesky), [`cholesky!`](https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/#LinearAlgebra.cholesky!), or call LAPACK wrapper functions [`potrf!`](https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/#LinearAlgebra.LAPACK.potrf!) and [`pstrf!`](https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/#LinearAlgebra.LAPACK.pstrf!)"
]
},
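{
"cell_type": "markdown",
"metadata": {},
"source": [
"The next cell is a minimal, runnable sketch of the constructive algorithm above, written as a plain triple loop rather than a call to LAPACK. The function name `chol_unblocked!` is just an illustrative label (not a library routine); it overwrites the lower triangle of its argument with $\\mathbf{L}$ and is checked against `cholesky` on a small positive definite matrix."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"using LinearAlgebra\n",
"\n",
"# unblocked, in-place Cholesky following the constructive proof (illustrative sketch)\n",
"function chol_unblocked!(A::Matrix{Float64})\n",
"    n = size(A, 1)\n",
"    for k in 1:n\n",
"        A[k, k] > 0 || error(\"matrix is not positive definite\")\n",
"        # update the trailing block: A22 <- A22 - a * a' / a11\n",
"        for j in (k + 1):n\n",
"            a_jk_div_a_kk = A[j, k] / A[k, k]\n",
"            for i in j:n\n",
"                A[i, j] -= A[i, k] * a_jk_div_a_kk\n",
"            end\n",
"        end\n",
"        # scale the current column: l11 = sqrt(a11), b = a / sqrt(a11)\n",
"        sqrt_akk = sqrt(A[k, k])\n",
"        for i in k:n\n",
"            A[i, k] /= sqrt_akk\n",
"        end\n",
"    end\n",
"    LowerTriangular(A)\n",
"end\n",
"\n",
"Atest = Float64.([4 12 -16; 12 37 -43; -16 -43 98])\n",
"chol_unblocked!(copy(Atest)) ≈ cholesky(Atest).L   # should be true"
]
},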
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example: positive definite matrix"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"3×3 Matrix{Float64}:\n",
" 4.0 12.0 -16.0\n",
" 12.0 37.0 -43.0\n",
" -16.0 -43.0 98.0"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"using LinearAlgebra\n",
"\n",
"A = Float64.([4 12 -16; 12 37 -43; -16 -43 98]) # check out this is pd!"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Cholesky{Float64, Matrix{Float64}}\n",
"U factor:\n",
"3×3 UpperTriangular{Float64, Matrix{Float64}}:\n",
" 2.0 6.0 -8.0\n",
" ⋅ 1.0 5.0\n",
" ⋅ ⋅ 3.0"
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Cholesky without pivoting\n",
"Achol = cholesky(Symmetric(A))"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Cholesky{Float64, Matrix{Float64}}"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"typeof(Achol)"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(:factors, :uplo, :info)"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"fieldnames(typeof(Achol))"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"3×3 LowerTriangular{Float64, Matrix{Float64}}:\n",
" 2.0 ⋅ ⋅ \n",
" 6.0 1.0 ⋅ \n",
" -8.0 5.0 3.0"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# retrieve the lower triangular Cholesky factor\n",
"Achol.L"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"3×3 UpperTriangular{Float64, Matrix{Float64}}:\n",
" 2.0 6.0 -8.0\n",
" ⋅ 1.0 5.0\n",
" ⋅ ⋅ 3.0"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# retrieve the upper triangular Cholesky factor\n",
"Achol.U"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"3-element Vector{Float64}:\n",
" 28.58333333333338\n",
" -7.666666666666679\n",
" 1.3333333333333353"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"b = [1.0; 2.0; 3.0]\n",
"A \\ b # this does LU; wasteful!; 2/3 n^3 + 2n^2"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"3-element Vector{Float64}:\n",
" 28.583333333333332\n",
" -7.666666666666666\n",
" 1.3333333333333333"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Achol \\ b # two triangular solves; only 2n^2 flops"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"35.99999999999994"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"det(A) # this actually does LU; wasteful!"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"36.0"
]
},
"execution_count": 39,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"det(Achol) # cheap"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"3×3 Matrix{Float64}:\n",
" 49.3611 -13.5556 2.11111\n",
" -13.5556 3.77778 -0.555556\n",
" 2.11111 -0.555556 0.111111"
]
},
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"inv(A) # this does LU!"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"3×3 Matrix{Float64}:\n",
" 49.3611 -13.5556 2.11111\n",
" -13.5556 3.77778 -0.555556\n",
" 2.11111 -0.555556 0.111111"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"inv(Achol) # 2n triangular solves"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example: positive *semi*definite matrix"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"5×5 Matrix{Float64}:\n",
" 1.97375 2.0722 1.71191 0.253774 -0.544089\n",
" 2.0722 5.86947 3.01646 0.93344 -1.50292\n",
" 1.71191 3.01646 2.10156 0.21341 -0.965213\n",
" 0.253774 0.93344 0.21341 0.393107 -0.0415803\n",
" -0.544089 -1.50292 -0.965213 -0.0415803 0.546021"
]
},
"execution_count": 42,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"using Random\n",
"\n",
"Random.seed!(123) # seed\n",
"A = randn(5, 3)\n",
"A = A * transpose(A) # A has rank 3"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [
{
"ename": "LoadError",
"evalue": "RankDeficientException(1)",
"output_type": "error",
"traceback": [
"RankDeficientException(1)",
"",
"Stacktrace:",
" [1] chkfullrank",
" @ /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/cholesky.jl:639 [inlined]",
" [2] #cholesky!#131",
" @ /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/cholesky.jl:295 [inlined]",
" [3] cholesky!(A::Matrix{Float64}, ::Val{true}; tol::Float64, check::Bool)",
" @ LinearAlgebra /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/cholesky.jl:321",
" [4] #cholesky#135",
" @ /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/cholesky.jl:402 [inlined]",
" [5] cholesky(A::Matrix{Float64}, ::Val{true})",
" @ LinearAlgebra /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/cholesky.jl:402",
" [6] top-level scope",
" @ In[43]:1",
" [7] eval",
" @ ./boot.jl:360 [inlined]",
" [8] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)",
" @ Base ./loading.jl:1116"
]
}
],
"source": [
"Achol = cholesky(A, Val(true)) # 2nd argument requests pivoting"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"CholeskyPivoted{Float64, Matrix{Float64}}\n",
"U factor with rank 4:\n",
"5×5 UpperTriangular{Float64, Matrix{Float64}}:\n",
" 2.4227 0.855329 0.38529 -0.620349 1.24508\n",
" ⋅ 1.11452 -0.0679895 -0.0121011 0.580476\n",
" ⋅ ⋅ 0.489935 0.4013 -0.463002\n",
" ⋅ ⋅ ⋅ 1.49012e-8 0.0\n",
" ⋅ ⋅ ⋅ ⋅ 0.0\n",
"permutation:\n",
"5-element Vector{Int64}:\n",
" 2\n",
" 1\n",
" 4\n",
" 5\n",
" 3"
]
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Achol = cholesky(A, Val(true), check=false) # turn off checking pd"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"4"
]
},
"execution_count": 45,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"rank(Achol) # determine rank from Cholesky factor"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"3"
]
},
"execution_count": 46,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"rank(A) # determine rank from SVD, which is more numerically stable"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"5×5 LowerTriangular{Float64, Matrix{Float64}}:\n",
" 2.4227 ⋅ ⋅ ⋅ ⋅ \n",
" 0.855329 1.11452 ⋅ ⋅ ⋅ \n",
" 0.38529 -0.0679895 0.489935 ⋅ ⋅ \n",
" -0.620349 -0.0121011 0.4013 1.49012e-8 ⋅ \n",
" 1.24508 0.580476 -0.463002 0.0 0.0"
]
},
"execution_count": 47,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Achol.L"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"5×5 UpperTriangular{Float64, Matrix{Float64}}:\n",
" 2.4227 0.855329 0.38529 -0.620349 1.24508\n",
" ⋅ 1.11452 -0.0679895 -0.0121011 0.580476\n",
" ⋅ ⋅ 0.489935 0.4013 -0.463002\n",
" ⋅ ⋅ ⋅ 1.49012e-8 0.0\n",
" ⋅ ⋅ ⋅ ⋅ 0.0"
]
},
"execution_count": 48,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Achol.U"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"5-element Vector{Int64}:\n",
" 2\n",
" 1\n",
" 4\n",
" 5\n",
" 3"
]
},
"execution_count": 49,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Achol.p"
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"5×5 Matrix{Float64}:\n",
" 0.0 1.0 0.0 0.0 0.0\n",
" 1.0 0.0 0.0 0.0 0.0\n",
" 0.0 0.0 0.0 0.0 1.0\n",
" 0.0 0.0 1.0 0.0 0.0\n",
" 0.0 0.0 0.0 1.0 0.0"
]
},
"execution_count": 50,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Achol.P # this returns P'"
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"2.2398766718569015e-16"
]
},
"execution_count": 51,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# P A P' = L U\n",
"norm(transpose(Achol.P) * A * Achol.P - Achol.L * Achol.U)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Applications\n",
"\n",
"* **No inversion** mentality: Whenever we see matrix inverse, we should think in terms of solving linear equations. If the matrix is positive (semi)definite, use Cholesky decomposition, which is twice cheaper than LU decomposition.\n",
"\n",
"### Multivariate normal density \n",
"\n",
"Multivariate normal density $\\mathcal{N}(0, \\Sigma)$, where $\\Sigma$ is $n\\times n$ positive definite, is\n",
"$$\n",
" \\frac{1}{(2\\pi)^{n/2}\\det(\\Sigma)^{1/2}}\\exp\\left(-\\frac{1}{2}\\mathbf{y}^T\\Sigma^{-1}\\mathbf{y}\\right).\n",
"$$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Hence the log likelihood is\n",
"$$\n",
"- \\frac{n}{2} \\log (2\\pi) - \\frac{1}{2} \\log \\det \\Sigma - \\frac{1}{2} \\mathbf{y}^T \\Sigma^{-1} \\mathbf{y}.\n",
"$$\n",
"\n",
"* Method 1: \n",
" 1. compute explicit inverse $\\Sigma^{-1}$ ($2n^3$ flops), \n",
" 2. compute quadratic form ($2n^2 + 2n$ flops), \n",
" 3. compute determinant ($2n^3/3$ flops).\n",
" \n",
"* Method 2: \n",
" 1. Cholesky decomposition $\\Sigma = \\mathbf{L} \\mathbf{L}^T$ ($n^3/3$ flops), \n",
" 2. Solve $\\mathbf{L} \\mathbf{x} = \\mathbf{y}$ by forward substitutions ($n^2$ flops), \n",
" 3. compute quadratic form $\\mathbf{x}^T \\mathbf{x}$ ($2n$ flops)\n",
" 4. compute determinant from Cholesky factor ($n$ flops). \n",
"\n",
"**Which method is better?**"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"logpdf_mvn_3 (generic function with 1 method)"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# word-by-word transcription of mathematical expression\n",
"function logpdf_mvn_1(y::Vector, Σ::Matrix)\n",
" n = length(y)\n",
" - (n//2) * log(2π) - (1//2) * logdet(Σ) - (1//2) * transpose(y) * inv(Σ) * y\n",
"end\n",
"\n",
"# you learnt why you should avoid inversion\n",
"function logpdf_mvn_2(y::Vector, Σ::Matrix)\n",
" n = length(y)\n",
" Σchol = cholesky(Symmetric(Σ))\n",
" - (n//2) * log(2π) - (1//2) * logdet(Σchol) - (1//2) * dot(y, Σchol \\ y)\n",
"end\n",
"\n",
"# this is for the efficiency-concerned \n",
"function logpdf_mvn_3(y::Vector, Σ::Matrix)\n",
" n = length(y)\n",
" Σchol = cholesky(Symmetric(Σ))\n",
" - (n//2) * log(2π) - sum(log.(diag(Σchol.L))) - (1//2) * sum(abs2, Σchol.L \\ y)\n",
"end"
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"logpdf_mvn_1(y, Σ) = -4878.375103770505\n",
"logpdf_mvn_2(y, Σ) = -4878.375103770553\n",
"logpdf_mvn_3(y, Σ) = -4878.375103770555\n"
]
}
],
"source": [
"using BenchmarkTools, Distributions, Random\n",
"\n",
"Random.seed!(123) # seed\n",
"\n",
"n = 1000\n",
"# a pd matrix\n",
"Σ = convert(Matrix{Float64}, Symmetric([i * (n - j + 1) for i in 1:n, j in 1:n]))\n",
"y = rand(MvNormal(Σ)) # one random sample from N(0, Σ)\n",
"\n",
"# at least they give same answer\n",
"@show logpdf_mvn_1(y, Σ)\n",
"@show logpdf_mvn_2(y, Σ)\n",
"@show logpdf_mvn_3(y, Σ);"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 131 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m31.915 ms\u001b[22m\u001b[39m … \u001b[35m51.792 ms\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 8.27%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m37.008 ms \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m38.295 ms\u001b[22m\u001b[39m ± \u001b[32m 4.086 ms\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m2.88% ± 3.96%\n",
"\n",
" \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m█\u001b[39m▂\u001b[39m▃\u001b[39m▂\u001b[34m \u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[32m \u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \n",
" \u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▁\u001b[39m▃\u001b[39m▃\u001b[39m▅\u001b[39m▃\u001b[39m▆\u001b[39m▆\u001b[39m▇\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[34m▆\u001b[39m\u001b[39m▇\u001b[39m▇\u001b[39m█\u001b[39m▇\u001b[32m▇\u001b[39m\u001b[39m▇\u001b[39m▅\u001b[39m▃\u001b[39m▄\u001b[39m▄\u001b[39m▃\u001b[39m▃\u001b[39m▁\u001b[39m▁\u001b[39m▃\u001b[39m▁\u001b[39m▅\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▁\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▁\u001b[39m▅\u001b[39m▄\u001b[39m▁\u001b[39m▃\u001b[39m▁\u001b[39m▁\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▁\u001b[39m▃\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▃\u001b[39m▃\u001b[39m \u001b[39m▃\n",
" 31.9 ms\u001b[90m Histogram: frequency by time\u001b[39m 50.6 ms \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m15.78 MiB\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m10\u001b[39m."
]
},
"execution_count": 53,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@benchmark logpdf_mvn_1($y, $Σ)"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 630 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m5.933 ms\u001b[22m\u001b[39m … \u001b[35m18.414 ms\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m0.00% … 0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m7.527 ms \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m7.934 ms\u001b[22m\u001b[39m ± \u001b[32m 1.511 ms\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m7.17% ± 12.28%\n",
"\n",
" \u001b[39m \u001b[39m \u001b[39m▂\u001b[39m \u001b[39m▃\u001b[39m▃\u001b[39m▁\u001b[39m▂\u001b[39m█\u001b[39m▆\u001b[39m▆\u001b[39m▂\u001b[39m▇\u001b[39m▅\u001b[34m█\u001b[39m\u001b[39m▃\u001b[39m \u001b[39m▂\u001b[32m \u001b[39m\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \n",
" \u001b[39m▄\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[34m█\u001b[39m\u001b[39m█\u001b[39m█\u001b[39m█\u001b[32m█\u001b[39m\u001b[39m▇\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▆\u001b[39m▇\u001b[39m▇\u001b[39m▅\u001b[39m█\u001b[39m▅\u001b[39m▄\u001b[39m▄\u001b[39m▄\u001b[39m▄\u001b[39m▄\u001b[39m▅\u001b[39m▄\u001b[39m▄\u001b[39m▃\u001b[39m▃\u001b[39m▄\u001b[39m▆\u001b[39m▄\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▂\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▃\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m▁\u001b[39m▂\u001b[39m▂\u001b[39m▄\u001b[39m \u001b[39m▄\n",
" 5.93 ms\u001b[90m Histogram: frequency by time\u001b[39m 12.5 ms \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m7.64 MiB\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m5\u001b[39m."
]
},
"execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@benchmark logpdf_mvn_2($y, $Σ)"
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: 471 samples with 1 evaluation.\n",
" Range \u001b[90m(\u001b[39m\u001b[36m\u001b[1mmin\u001b[22m\u001b[39m … \u001b[35mmax\u001b[39m\u001b[90m): \u001b[39m\u001b[36m\u001b[1m 8.495 ms\u001b[22m\u001b[39m … \u001b[35m15.879 ms\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmin … max\u001b[90m): \u001b[39m 0.00% … 21.68%\n",
" Time \u001b[90m(\u001b[39m\u001b[34m\u001b[1mmedian\u001b[22m\u001b[39m\u001b[90m): \u001b[39m\u001b[34m\u001b[1m10.529 ms \u001b[22m\u001b[39m\u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmedian\u001b[90m): \u001b[39m 0.00%\n",
" Time \u001b[90m(\u001b[39m\u001b[32m\u001b[1mmean\u001b[22m\u001b[39m ± \u001b[32mσ\u001b[39m\u001b[90m): \u001b[39m\u001b[32m\u001b[1m10.603 ms\u001b[22m\u001b[39m ± \u001b[32m 1.304 ms\u001b[39m \u001b[90m┊\u001b[39m GC \u001b[90m(\u001b[39mmean ± σ\u001b[90m): \u001b[39m13.33% ± 12.45%\n",
"\n",
" \u001b[39m \u001b[39m \u001b[39m▃\u001b[39m▅\u001b[39m▃\u001b[39m▁\u001b[39m \u001b[39m█\u001b[39m▂\u001b[39m▃\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m▂\u001b[39m▁\u001b[39m▃\u001b[39m▁\u001b[39m▅\u001b[39m▇\u001b[34m▅\u001b[39m\u001b[32m▁\u001b[39m\u001b[39m▁\u001b[39m▂\u001b[39m▆\u001b[39m▂\u001b[39m▂\u001b[39m▁\u001b[39m▁\u001b[39m \u001b[39m \u001b[39m▄\u001b[39m \u001b[39m \u001b[39m▃\u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \u001b[39m \n",
" \u001b[39m▄\u001b[39m▅\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m▇\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m▅\u001b[39m▇\u001b[39m▆\u001b[39m▇\u001b[39m▇\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[34m█\u001b[39m\u001b[32m█\u001b[39m\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m█\u001b[39m▇\u001b[39m▅\u001b[39m█\u001b[39m▆\u001b[39m▇\u001b[39m█\u001b[39m▇\u001b[39m▃\u001b[39m▆\u001b[39m▅\u001b[39m▆\u001b[39m▃\u001b[39m▄\u001b[39m▃\u001b[39m▄\u001b[39m▃\u001b[39m▅\u001b[39m▄\u001b[39m▄\u001b[39m▄\u001b[39m▁\u001b[39m▄\u001b[39m▃\u001b[39m▅\u001b[39m▁\u001b[39m▃\u001b[39m▁\u001b[39m▃\u001b[39m \u001b[39m▅\n",
" 8.5 ms\u001b[90m Histogram: frequency by time\u001b[39m 14 ms \u001b[0m\u001b[1m<\u001b[22m\n",
"\n",
" Memory estimate\u001b[90m: \u001b[39m\u001b[33m22.91 MiB\u001b[39m, allocs estimate\u001b[90m: \u001b[39m\u001b[33m13\u001b[39m."
]
},
"execution_count": 55,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"@benchmark logpdf_mvn_3($y, $Σ)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* To evaluate same multivariate normal density at many observations $y_1, y_2, \\ldots$, we pre-compute the Cholesky decomposition ($n^3/3$ flops), then each evaluation costs $n^2$ flops."
]
},
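{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of this idea, reusing the $\\Sigma$ simulated above: factorize once, then evaluate the log-density at several observations using only triangular solves. The helper name `logpdf_mvn_precomp` is illustrative, not a library function."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"using Distributions, LinearAlgebra\n",
"\n",
"# evaluate the N(0, Σ) log-density reusing a pre-computed Cholesky factor (sketch)\n",
"function logpdf_mvn_precomp(y::Vector, Σchol::Cholesky)\n",
"    n = length(y)\n",
"    - (n//2) * log(2π) - (1//2) * logdet(Σchol) - (1//2) * sum(abs2, Σchol.L \\ y)\n",
"end\n",
"\n",
"Σchol = cholesky(Symmetric(Σ))    # n^3/3 flops, done only once\n",
"Y = rand(MvNormal(Σ), 10)         # 10 observations, one per column\n",
"[logpdf_mvn_precomp(Y[:, j], Σchol) for j in 1:10]"
]
},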
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Linear regression by Cholesky\n",
"\n",
"* Cholesky decomposition is **one** approach to solve linear regression. Assume $\\mathbf{X} \\in \\mathbb{R}^{n \\times p}$ and $\\mathbf{y} \\in \\mathbb{R}^n$. \n",
" - Compute $\\mathbf{X}^T \\mathbf{X}$: $np^2$ flops \n",
" - Compute $\\mathbf{X}^T \\mathbf{y}$: $2np$ flops \n",
" - Cholesky decomposition of $\\mathbf{X}^T \\mathbf{X}$: $\\frac{1}{3} p^3$ flops \n",
" - Solve normal equation $\\mathbf{X}^T \\mathbf{X} \\beta = \\mathbf{X}^T \\mathbf{y}$: $2p^2$ flops \n",
" - If need standard errors, another $(4/3)p^3$ flops\n",
"\n",
"Total computational cost is $np^2 + (1/3) p^3$ (without s.e.) or $np^2 + (5/3) p^3$ (with s.e.) flops."
]
},
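{
"cell_type": "markdown",
"metadata": {},
"source": [
"A worked sketch of this recipe on simulated data (the matrix `X`, response `y`, and variable names below are illustrative, not part of the notes): form the normal equations, factorize $\\mathbf{X}^T \\mathbf{X}$ by Cholesky, solve, and compare against Julia's QR-based `X \\ y`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"using LinearAlgebra, Random\n",
"\n",
"Random.seed!(123)\n",
"n, p = 100, 5\n",
"X = randn(n, p)                     # full column rank with probability 1\n",
"y = X * ones(p) + randn(n)\n",
"\n",
"XtX = transpose(X) * X              # n p^2 flops (Gram matrix)\n",
"Xty = transpose(X) * y              # 2 n p flops\n",
"XtXchol = cholesky(Symmetric(XtX))  # p^3 / 3 flops\n",
"β̂ = XtXchol \\ Xty                   # two triangular solves, 2 p^2 flops\n",
"\n",
"β̂ ≈ X \\ y                           # agrees with the QR-based least squares solve"
]
},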
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Further reading\n",
"\n",
"* Section 7.7 of [Numerical Analysis for Statisticians](https://link.springer.com/book/10.1007/978-1-4419-5945-4) of Kenneth Lange (2010).\n",
"\n",
"* Section II.5.3 of [Computational Statistics](https://link.springer.com/book/10.1007%2F978-0-387-98144-4) by James Gentle (2010).\n",
"\n",
"* Section 4.2 of [Matrix Computation](https://www.amazon.com/Computations-Hopkins-Studies-Mathematical-Sciences/dp/1421407949/ref=sr_1_1?keywords=matrix+computation+golub&qid=1567157884&s=gateway&sr=8-1) by Gene Golub and Charles Van Loan (2013)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Acknowledgment\n",
"\n",
"Many parts of this lecture note is based on [Dr. Hua Zhou](http://hua-zhou.github.io)'s 2019 Spring Statistical Computing course notes available at ."
]
}
],
"metadata": {
"@webio": {
"lastCommId": null,
"lastKernelId": null
},
"kernelspec": {
"display_name": "Julia 1.6.2",
"language": "julia",
"name": "julia-1.6"
},
"language_info": {
"file_extension": ".jl",
"mimetype": "application/julia",
"name": "julia",
"version": "1.6.2"
},
"toc": {
"colors": {
"hover_highlight": "#DAA520",
"running_highlight": "#FF0000",
"selected_highlight": "#FFD700"
},
"moveMenuLeft": true,
"nav_menu": {
"height": "118px",
"width": "252px"
},
"navigate_menu": true,
"number_sections": true,
"sideBar": true,
"skip_h1_title": true,
"threshold": 4,
"toc_cell": true,
"toc_section_display": "block",
"toc_window_display": true,
"widenNotebook": false
}
},
"nbformat": 4,
"nbformat_minor": 4
}