{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "toc": "true"
   },
   "source": [
    "# Table of Contents\n",
    " <p><div class=\"lev1 toc-item\"><a href=\"#Cholesky-Decomposition\" data-toc-modified-id=\"Cholesky-Decomposition-1\"><span class=\"toc-item-num\">1&nbsp;&nbsp;</span>Cholesky Decomposition</a></div><div class=\"lev2 toc-item\"><a href=\"#Cholesky-decomposition\" data-toc-modified-id=\"Cholesky-decomposition-11\"><span class=\"toc-item-num\">1.1&nbsp;&nbsp;</span>Cholesky decomposition</a></div><div class=\"lev2 toc-item\"><a href=\"#Pivoting\" data-toc-modified-id=\"Pivoting-12\"><span class=\"toc-item-num\">1.2&nbsp;&nbsp;</span>Pivoting</a></div><div class=\"lev2 toc-item\"><a href=\"#Implementation\" data-toc-modified-id=\"Implementation-13\"><span class=\"toc-item-num\">1.3&nbsp;&nbsp;</span>Implementation</a></div><div class=\"lev3 toc-item\"><a href=\"#Example:-positive-definite-matrix.\" data-toc-modified-id=\"Example:-positive-definite-matrix.-131\"><span class=\"toc-item-num\">1.3.1&nbsp;&nbsp;</span>Example: positive definite matrix.</a></div><div class=\"lev3 toc-item\"><a href=\"#Example:-positive-semi-definite-matrix.\" data-toc-modified-id=\"Example:-positive-semi-definite-matrix.-132\"><span class=\"toc-item-num\">1.3.2&nbsp;&nbsp;</span>Example: positive semi-definite matrix.</a></div><div class=\"lev2 toc-item\"><a href=\"#Applications\" data-toc-modified-id=\"Applications-14\"><span class=\"toc-item-num\">1.4&nbsp;&nbsp;</span>Applications</a></div><div class=\"lev3 toc-item\"><a href=\"#Multivariate-normal-density\" data-toc-modified-id=\"Multivariate-normal-density-141\"><span class=\"toc-item-num\">1.4.1&nbsp;&nbsp;</span>Multivariate normal density</a></div><div class=\"lev3 toc-item\"><a href=\"#Linear-regression\" data-toc-modified-id=\"Linear-regression-142\"><span class=\"toc-item-num\">1.4.2&nbsp;&nbsp;</span>Linear regression</a></div><div class=\"lev2 toc-item\"><a href=\"#Further-reading\" data-toc-modified-id=\"Further-reading-15\"><span class=\"toc-item-num\">1.5&nbsp;&nbsp;</span>Further reading</a></div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Julia Version 1.1.0\n",
      "Commit 80516ca202 (2019-01-21 21:24 UTC)\n",
      "Platform Info:\n",
      "  OS: macOS (x86_64-apple-darwin14.5.0)\n",
      "  CPU: Intel(R) Core(TM) i7-6920HQ CPU @ 2.90GHz\n",
      "  WORD_SIZE: 64\n",
      "  LIBM: libopenlibm\n",
      "  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)\n",
      "Environment:\n",
      "  JULIA_EDITOR = code\n"
     ]
    }
   ],
   "source": [
    "versioninfo()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Cholesky Decomposition\n",
    "\n",
    "<img src=\"http://www.cerebralmastication.com/wp-content/uploads/2010/09/39-cholesky-250x300.jpg\" width=\"200\" align=\"center\"/>\n",
    "\n",
    "* A basic tenet in numerical analysis: \n",
    "\n",
    "> **The structure should be exploited whenever solving a problem.** \n",
    "\n",
    "  Common structures include: symmetry, positive (semi)definiteness, sparsity, Kronecker product, low rank, ...\n",
    "\n",
    "* LU decomposition (Gaussian Elimination) is **not** used in statistics so often because most of time statisticians deal with positive (semi)definite matrix. (That's why I hate to see `solve()` in R code.)\n",
    "\n",
    "* For example, in the normal equation \n",
    "$$\n",
    "    \\mathbf{X}^T \\mathbf{X} \\beta = \\mathbf{X}^T \\mathbf{y}\n",
    "$$\n",
    "for linear regression, the coefficient matrix $\\mathbf{X}^T \\mathbf{X}$ is symmetric and positive semidefinite. How to exploit this structure?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Cholesky decomposition\n",
    "\n",
    "* **Theorem**: Let $\\mathbf{A} \\in \\mathbb{R}^{n \\times n}$ be symmetric and positive definite. Then $\\mathbf{A} = \\mathbf{L} \\mathbf{L}^T$, where $\\mathbf{L}$ is lower triangular with positive diagonal entries and is unique.  \n",
    "**Proof** (by induction):  \n",
    "If $n=1$, then $\\ell = \\sqrt{a}$. For $n>1$, the block equation\n",
    "$$\n",
    "\\begin{eqnarray*}\n",
    "\\begin{pmatrix}\n",
    "a_{11} & \\mathbf{a}^T \\\\ \\mathbf{a} & \\mathbf{A}_{22}\n",
    "\\end{pmatrix} =\n",
    "\\begin{pmatrix}\n",
    "\t\\ell_{11} & \\mathbf{0}_{n-1}^T \\\\ \\mathbf{l} & \\mathbf{L}_{22}\n",
    "\\end{pmatrix}\n",
    "\\begin{pmatrix}\n",
    "\t\\ell_{11} & \\mathbf{l}^T \\\\ \\mathbf{0}_{n-1} & \\mathbf{L}_{22}^T\n",
    "\\end{pmatrix}\n",
    "\\end{eqnarray*}\n",
    "$$\n",
    "has solution\n",
    "$$\n",
    "\\begin{eqnarray*}\n",
    "\t\\ell_{11} &=& \\sqrt{a_{11}} \\\\\n",
    "\t\\mathbf{l} &=& \\ell_{11}^{-1} \\mathbf{a}\t\\\\\n",
    "\t\\mathbf{L}_{22} \\mathbf{L}_{22}^T &=& \\mathbf{A}_{22}  - \\mathbf{l} \\mathbf{l}^T = \\mathbf{A}_{22}  - a_{11}^{-1} \\mathbf{a} \\mathbf{a}^T.\n",
    "\\end{eqnarray*}\n",
    "$$  \n",
    "Now $a_{11}>0$ (why?), so $\\ell_{11}$ and $\\mathbf{l}$ are uniquely determined. $\\mathbf{A}_{22} - a_{11}^{-1} \\mathbf{a} \\mathbf{a}^T$ is positive definite because $\\mathbf{A}$ is positive definite (why?). By induction hypothesis, $\\mathbf{L}_{22}$ exists and is unique.\n",
    "\n",
    "* The constructive proof completely specifies the algorithm: \n",
    "\n",
    "<img src=\"http://www.netlib.org/utk/papers/factor/_25826_figure440.gif\" width=\"500\" align=\"center\"/>\n",
    "\n",
    "* Computational cost: \n",
    "$$\n",
    "\\frac{1}{2} [2(n-1)^2 + 2(n-2)^2 + \\cdots + 2 \\cdot 1^2] \\approx \\frac{1}{3} n^3 \\quad \\text{flops}\n",
    "$$ \n",
    "plus $n$ square roots. Half the cost of LU decomposition by utilizing symmetry.\n",
    "\n",
    "* In general Cholesky decomposition is very stable. Failure of the decomposition simply means $\\mathbf{A}$ is not positive definite. It is an efficient way to test positive definiteness.  \n",
    "\n",
    "\n",
    "## Pivoting\n",
    "\n",
    "* When $\\mathbf{A}$ does not have full rank, e.g., $\\mathbf{X}^T \\mathbf{X}$ with a non-full column rank $\\mathbf{X}$, we encounter $a_{kk} = 0$ during the procedure.\n",
    "\n",
    "* **Symmetric pivoting**. At each stage $k$, we permute both row and column such that $\\max_{k \\le i \\le n} a_{ii}$ becomes the pivot. If we encounter $\\max_{k \\le i \\le n} a_{ii} = 0$, then $\\mathbf{A}[k:n,k:n] = \\mathbf{0}$ (why?) and the algorithm terminates.\n",
    "\n",
    "* With symmetric pivoting: \n",
    "$$\n",
    "\\mathbf{P} \\mathbf{A} \\mathbf{P}^T = \\mathbf{L} \\mathbf{L}^T,\n",
    "$$\n",
    "where $\\mathbf{P}$ is a permutation matrix and $\\mathbf{L} \\in \\mathbb{R}^{n \\times r}$, $r = \\text{rank}(\\mathbf{A})$.\n",
    "\n",
    "## Implementation\n",
    "\n",
    "* LAPACK functions: [`?potrf`](http://www.netlib.org/lapack/explore-html/d1/d7a/group__double_p_ocomputational_ga2f55f604a6003d03b5cd4a0adcfb74d6.html#ga2f55f604a6003d03b5cd4a0adcfb74d6) (without pivoting), [`?pstrf`](http://www.netlib.org/lapack/explore-html/da/dba/group__double_o_t_h_e_rcomputational_ga31cdc13a7f4ad687f4aefebff870e1cc.html#ga31cdc13a7f4ad687f4aefebff870e1cc) (with pivoting).\n",
    "\n",
    "* Julia functions: [`cholesky`](https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/#LinearAlgebra.cholesky), [`cholesky!`](https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/#LinearAlgebra.cholesky!), or call LAPACK wrapper functions [`potrf!`](https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/#LinearAlgebra.LAPACK.potrf!) and [`pstrf!`](https://docs.julialang.org/en/v1/stdlib/LinearAlgebra/#LinearAlgebra.LAPACK.pstrf!)\n",
    "\n",
    "### Example: positive definite matrix."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3×3 Array{Float64,2}:\n",
       "   4.0   12.0  -16.0\n",
       "  12.0   37.0  -43.0\n",
       " -16.0  -43.0   98.0"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "using LinearAlgebra\n",
    "\n",
    "A = Float64.([4 12 -16; 12 37 -43; -16 -43 98])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Cholesky{Float64,Array{Float64,2}}\n",
       "U factor:\n",
       "3×3 UpperTriangular{Float64,Array{Float64,2}}:\n",
       " 2.0  6.0  -8.0\n",
       "  ⋅   1.0   5.0\n",
       "  ⋅    ⋅    3.0"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Cholesky without pivoting\n",
    "Achol = cholesky(Symmetric(A))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Cholesky{Float64,Array{Float64,2}}"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "typeof(Achol)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(:factors, :uplo, :info)"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "fieldnames(typeof(Achol))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3×3 LowerTriangular{Float64,Array{Float64,2}}:\n",
       "  2.0   ⋅    ⋅ \n",
       "  6.0  1.0   ⋅ \n",
       " -8.0  5.0  3.0"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# retrieve the lower triangular Cholesky factor\n",
    "Achol.L"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3×3 UpperTriangular{Float64,Array{Float64,2}}:\n",
       " 2.0  6.0  -8.0\n",
       "  ⋅   1.0   5.0\n",
       "  ⋅    ⋅    3.0"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# retrieve the upper triangular Cholesky factor\n",
    "Achol.U"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3-element Array{Float64,1}:\n",
       " 28.58333333333338  \n",
       " -7.666666666666679 \n",
       "  1.3333333333333353"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "b = [1.0; 2.0; 3.0]\n",
    "A \\ b # this does LU; wasteful!; 2/3 n^3 + 2n^2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3-element Array{Float64,1}:\n",
       " 28.583333333333332 \n",
       " -7.666666666666666 \n",
       "  1.3333333333333333"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "Achol \\ b # two triangular solves; only 2n^2 flops"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "35.99999999999994"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "det(A) # this actually does LU; wasteful!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "36.0"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "det(Achol) # cheap"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3×3 Array{Float64,2}:\n",
       "  49.3611   -13.5556     2.11111 \n",
       " -13.5556     3.77778   -0.555556\n",
       "   2.11111   -0.555556   0.111111"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "inv(A) # this does LU!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3×3 Array{Float64,2}:\n",
       "  49.3611   -13.5556     2.11111 \n",
       " -13.5556     3.77778   -0.555556\n",
       "   2.11111   -0.555556   0.111111"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "inv(Achol) "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Example: positive semi-definite matrix."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 Array{Float64,2}:\n",
       "  1.97375    2.0722    1.71191    0.253774   -0.544089 \n",
       "  2.0722     5.86947   3.01646    0.93344    -1.50292  \n",
       "  1.71191    3.01646   2.10156    0.21341    -0.965213 \n",
       "  0.253774   0.93344   0.21341    0.393107   -0.0415803\n",
       " -0.544089  -1.50292  -0.965213  -0.0415803   0.546021 "
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "using Random\n",
    "\n",
    "Random.seed!(123) # seed\n",
    "A = randn(5, 3)\n",
    "A = A * transpose(A) # A has rank 3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "ename": "RankDeficientException",
     "evalue": "RankDeficientException(1)",
     "output_type": "error",
     "traceback": [
      "RankDeficientException(1)",
      "",
      "Stacktrace:",
      " [1] chkfullrank at /Users/osx/buildbot/slave/package_osx64/build/usr/share/julia/stdlib/v1.1/LinearAlgebra/src/cholesky.jl:498 [inlined]",
      " [2] #cholesky!#98(::Float64, ::Bool, ::Function, ::Hermitian{Float64,Array{Float64,2}}, ::Val{true}) at /Users/osx/buildbot/slave/package_osx64/build/usr/share/julia/stdlib/v1.1/LinearAlgebra/src/cholesky.jl:195",
      " [3] #cholesky! at ./none:0 [inlined]",
      " [4] #cholesky!#100(::Float64, ::Bool, ::Function, ::Array{Float64,2}, ::Val{true}) at /Users/osx/buildbot/slave/package_osx64/build/usr/share/julia/stdlib/v1.1/LinearAlgebra/src/cholesky.jl:221",
      " [5] #cholesky#102 at ./none:0 [inlined]",
      " [6] cholesky(::Array{Float64,2}, ::Val{true}) at /Users/osx/buildbot/slave/package_osx64/build/usr/share/julia/stdlib/v1.1/LinearAlgebra/src/cholesky.jl:296",
      " [7] top-level scope at In[15]:1"
     ]
    }
   ],
   "source": [
    "Achol = cholesky(A, Val(true)) # 2nd argument requests pivoting"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "CholeskyPivoted{Float64,Array{Float64,2}}\n",
       "U factor with rank 4:\n",
       "5×5 UpperTriangular{Float64,Array{Float64,2}}:\n",
       " 2.4227  0.855329   0.38529    -0.620349     1.24508 \n",
       "  ⋅      1.11452   -0.0679895  -0.0121011    0.580476\n",
       "  ⋅       ⋅         0.489935    0.4013      -0.463002\n",
       "  ⋅       ⋅          ⋅          1.49012e-8   0.0     \n",
       "  ⋅       ⋅          ⋅           ⋅           0.0     \n",
       "permutation:\n",
       "5-element Array{Int64,1}:\n",
       " 2\n",
       " 1\n",
       " 4\n",
       " 5\n",
       " 3"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "Achol = cholesky(A, Val(true), check=false) # turn off checking pd"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "rank(Achol) # determine rank from Cholesky factor"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "rank(A) # determine rank from SVD, which is more numerically stable"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 LowerTriangular{Float64,Array{Float64,2}}:\n",
       "  2.4227      ⋅           ⋅         ⋅           ⋅ \n",
       "  0.855329   1.11452      ⋅         ⋅           ⋅ \n",
       "  0.38529   -0.0679895   0.489935   ⋅           ⋅ \n",
       " -0.620349  -0.0121011   0.4013    1.49012e-8   ⋅ \n",
       "  1.24508    0.580476   -0.463002  0.0         0.0"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "Achol.L"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 UpperTriangular{Float64,Array{Float64,2}}:\n",
       " 2.4227  0.855329   0.38529    -0.620349     1.24508 \n",
       "  ⋅      1.11452   -0.0679895  -0.0121011    0.580476\n",
       "  ⋅       ⋅         0.489935    0.4013      -0.463002\n",
       "  ⋅       ⋅          ⋅          1.49012e-8   0.0     \n",
       "  ⋅       ⋅          ⋅           ⋅           0.0     "
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "Achol.U"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5-element Array{Int64,1}:\n",
       " 2\n",
       " 1\n",
       " 4\n",
       " 5\n",
       " 3"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "Achol.p"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "7.5285903934693295"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# P A P' = L U\n",
    "norm(Achol.P * A * Achol.P - Achol.L * Achol.U)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Applications\n",
    "\n",
    "* **No inversion** mentality: Whenever we see matrix inverse, we should think in terms of solving linear equations. If the matrix is positive (semi)definite, use Cholesky decomposition, which is twice cheaper than LU decomposition.\n",
    "\n",
    "### Multivariate normal density \n",
    "\n",
    "Multivariate normal density $\\text{MVN}(0, \\Sigma)$, where $\\Sigma$ is p.d., is"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$$\n",
    "- \\frac{n}{2} \\log (2\\pi) - \\frac{1}{2} \\log \\det \\Sigma - \\frac{1}{2} \\mathbf{y}^T \\Sigma^{-1} \\mathbf{y}.\n",
    "$$\n",
    "\n",
    "* Method 1: (a) compute explicit inverse $\\Sigma^{-1}$ ($2n^3$ flops), (b) compute quadratic form ($2n^2 + 2n$ flops), (c) compute determinant ($2n^3/3$ flops).\n",
    "    \n",
    "* Method 2: (a) Cholesky decomposition $\\Sigma = \\mathbf{L} \\mathbf{L}^T$ ($n^3/3$ flops), (b) Solve $\\mathbf{L} \\mathbf{x} = \\mathbf{y}$ by forward substitutions ($n^2$ flops), (c) compute quadratic form $\\mathbf{x}^T \\mathbf{x}$ ($2n$ flops), and (d) compute determinant from Cholesky factor ($n$ flops).  \n",
    "\n",
    "**Which method is better?**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "logpdf_mvn_3 (generic function with 1 method)"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# this is a person w/o numerical analsyis training\n",
    "function logpdf_mvn_1(y::Vector, Σ::Matrix)\n",
    "    n = length(y)\n",
    "    - (n//2) * log(2π) - (1//2) * logdet(Σ) - (1//2) * transpose(y) * inv(Σ) * y\n",
    "end\n",
    "\n",
    "# this is an efficiency-savvy person \n",
    "function logpdf_mvn_2(y::Vector, Σ::Matrix)\n",
    "    n = length(y)\n",
    "    Σchol = cholesky(Symmetric(Σ))\n",
    "    - (n//2) * log(2π) - (1//2) * logdet(Σchol) - (1//2) * sum(abs2, Σchol.L \\ y)\n",
    "end\n",
    "\n",
    "# better memory efficiency\n",
    "function logpdf_mvn_3(y::Vector, Σ::Matrix)\n",
    "    n = length(y)\n",
    "    Σchol = cholesky!(Symmetric(Σ))\n",
    "    - (n//2) * log(2π) - (1//2) * logdet(Σchol) - (1//2) * dot(y, Σchol \\ y)\n",
    "end"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "logpdf_mvn_1(y, Σ) = -4878.375103770505\n",
      "logpdf_mvn_2(y, Σ) = -4878.375103770553\n",
      "logpdf_mvn_3(y, Σ) = -4878.375103770553\n"
     ]
    }
   ],
   "source": [
    "using BenchmarkTools, Distributions, Random\n",
    "\n",
    "Random.seed!(123) # seed\n",
    "\n",
    "n = 1000\n",
    "# a pd matrix\n",
    "Σ = convert(Matrix{Float64}, Symmetric([i * (n - j + 1) for i in 1:n, j in 1:n]))\n",
    "y = rand(MvNormal(Σ)) # one random sample from N(0, Σ)\n",
    "\n",
    "# at least they give same answer\n",
    "@show logpdf_mvn_1(y, Σ)\n",
    "@show logpdf_mvn_2(y, Σ)\n",
    "@show logpdf_mvn_3(y, Σ);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "BenchmarkTools.Trial: \n",
       "  memory estimate:  15.78 MiB\n",
       "  allocs estimate:  14\n",
       "  --------------\n",
       "  minimum time:     37.747 ms (0.00% GC)\n",
       "  median time:      41.845 ms (3.56% GC)\n",
       "  mean time:        42.982 ms (3.71% GC)\n",
       "  maximum time:     85.951 ms (54.02% GC)\n",
       "  --------------\n",
       "  samples:          117\n",
       "  evals/sample:     1"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "@benchmark logpdf_mvn_1(y, Σ)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "BenchmarkTools.Trial: \n",
       "  memory estimate:  15.27 MiB\n",
       "  allocs estimate:  10\n",
       "  --------------\n",
       "  minimum time:     7.946 ms (0.00% GC)\n",
       "  median time:      9.517 ms (16.00% GC)\n",
       "  mean time:        9.518 ms (12.95% GC)\n",
       "  maximum time:     59.474 ms (82.87% GC)\n",
       "  --------------\n",
       "  samples:          525\n",
       "  evals/sample:     1"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "@benchmark logpdf_mvn_2(y, Σ)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "BenchmarkTools.Trial: \n",
       "  memory estimate:  7.64 MiB\n",
       "  allocs estimate:  8\n",
       "  --------------\n",
       "  minimum time:     6.353 ms (0.00% GC)\n",
       "  median time:      6.571 ms (0.00% GC)\n",
       "  mean time:        7.318 ms (9.51% GC)\n",
       "  maximum time:     54.287 ms (88.08% GC)\n",
       "  --------------\n",
       "  samples:          682\n",
       "  evals/sample:     1"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "@benchmark logpdf_mvn_3(y, Σ)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* To evaluate same multivariate normal density at many observations $y_1, y_2, \\ldots$, we pre-compute the Cholesky decomposition ($n^3/3$ flops), then each evaluation costs $n^2$ flops."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Linear regression\n",
    "\n",
    "* Cholesky decomposition is **one** approach to solve linear regression. Assume $\\mathbf{X} \\in \\mathbb{R}^{n \\times p}$ and $\\mathbf{y} \\in \\mathbb{R}^n$.  \n",
    "    - Compute $\\mathbf{X}^T \\mathbf{X}$: $np^2$ flops  \n",
    "    - Compute $\\mathbf{X}^T \\mathbf{y}$: $2np$ flops  \n",
    "    - Cholesky decomposition of $\\mathbf{X}^T \\mathbf{X}$: $\\frac{1}{3} p^3$ flops  \n",
    "    - Solve normal equation $\\mathbf{X}^T \\mathbf{X} \\beta = \\mathbf{X}^T \\mathbf{y}$: $2p^2$ flops  \n",
    "    - If need standard errors, another $(4/3)p^3$ flops\n",
    "\n",
    "Total computational cost is $np^2 + (1/3) p^3$ (without s.e.) or $np^2 + (5/3) p^3$ (with s.e.) flops."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Further reading\n",
    "\n",
    "* Section 7.7 of [Numerical Analysis for Statisticians](http://ucla.worldcat.org/title/numerical-analysis-for-statisticians/oclc/793808354&referer=brief_results) of Kenneth Lange (2010).\n",
    "\n",
    "* Section II.5.3 of [Computational Statistics](http://ucla.worldcat.org/title/computational-statistics/oclc/437345409&referer=brief_results) by James Gentle (2010).\n",
    "\n",
    "* Section 4.2 of [Matrix Computation](http://catalog.library.ucla.edu/vwebv/holdingsInfo?bibId=7122088) by Gene Golub and Charles Van Loan (2013)."
   ]
  }
 ],
 "metadata": {
  "@webio": {
   "lastCommId": null,
   "lastKernelId": null
  },
  "kernelspec": {
   "display_name": "Julia 1.1.0",
   "language": "julia",
   "name": "julia-1.1"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "1.1.0"
  },
  "toc": {
   "colors": {
    "hover_highlight": "#DAA520",
    "running_highlight": "#FF0000",
    "selected_highlight": "#FFD700"
   },
   "moveMenuLeft": true,
   "nav_menu": {
    "height": "118px",
    "width": "252px"
   },
   "navigate_menu": true,
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": true,
   "threshold": 4,
   "toc_cell": true,
   "toc_section_display": "block",
   "toc_window_display": true,
   "widenNotebook": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}