{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "toc": "true"
   },
   "source": [
    "# Table of Contents\n",
    " <p><div class=\"lev1 toc-item\"><a href=\"#A-list-of-&quot;easy&quot;-linear-systems\" data-toc-modified-id=\"A-list-of-&quot;easy&quot;-linear-systems-1\"><span class=\"toc-item-num\">1&nbsp;&nbsp;</span>A list of \"easy\" linear systems</a></div><div class=\"lev2 toc-item\"><a href=\"#Diagonal-matrix\" data-toc-modified-id=\"Diagonal-matrix-11\"><span class=\"toc-item-num\">1.1&nbsp;&nbsp;</span>Diagonal matrix</a></div><div class=\"lev2 toc-item\"><a href=\"#Bidiagonal,-tridiagonal,-and-banded-matrices\" data-toc-modified-id=\"Bidiagonal,-tridiagonal,-and-banded-matrices-12\"><span class=\"toc-item-num\">1.2&nbsp;&nbsp;</span>Bidiagonal, tridiagonal, and banded matrices</a></div><div class=\"lev2 toc-item\"><a href=\"#Triangular-matrix\" data-toc-modified-id=\"Triangular-matrix-13\"><span class=\"toc-item-num\">1.3&nbsp;&nbsp;</span>Triangular matrix</a></div><div class=\"lev2 toc-item\"><a href=\"#Block-diagonal-matrix\" data-toc-modified-id=\"Block-diagonal-matrix-14\"><span class=\"toc-item-num\">1.4&nbsp;&nbsp;</span>Block diagonal matrix</a></div><div class=\"lev2 toc-item\"><a href=\"#Kronecker-product\" data-toc-modified-id=\"Kronecker-product-15\"><span class=\"toc-item-num\">1.5&nbsp;&nbsp;</span>Kronecker product</a></div><div class=\"lev2 toc-item\"><a href=\"#Sparse-matrix\" data-toc-modified-id=\"Sparse-matrix-16\"><span class=\"toc-item-num\">1.6&nbsp;&nbsp;</span>Sparse matrix</a></div><div class=\"lev2 toc-item\"><a href=\"#Easy-plus-low-rank\" data-toc-modified-id=\"Easy-plus-low-rank-17\"><span class=\"toc-item-num\">1.7&nbsp;&nbsp;</span>Easy plus low rank</a></div><div class=\"lev2 toc-item\"><a href=\"#Easy-plus-border\" data-toc-modified-id=\"Easy-plus-border-18\"><span class=\"toc-item-num\">1.8&nbsp;&nbsp;</span>Easy plus border</a></div><div class=\"lev2 toc-item\"><a href=\"#Orthogonal-matrix\" data-toc-modified-id=\"Orthogonal-matrix-19\"><span class=\"toc-item-num\">1.9&nbsp;&nbsp;</span>Orthogonal matrix</a></div><div class=\"lev2 toc-item\"><a href=\"#Toeplitz-matrix\" data-toc-modified-id=\"Toeplitz-matrix-110\"><span class=\"toc-item-num\">1.10&nbsp;&nbsp;</span>Toeplitz matrix</a></div><div class=\"lev2 toc-item\"><a href=\"#Circulant-matrix\" data-toc-modified-id=\"Circulant-matrix-111\"><span class=\"toc-item-num\">1.11&nbsp;&nbsp;</span>Circulant matrix</a></div><div class=\"lev2 toc-item\"><a href=\"#Vandermonde-matrix\" data-toc-modified-id=\"Vandermonde-matrix-112\"><span class=\"toc-item-num\">1.12&nbsp;&nbsp;</span>Vandermonde matrix</a></div><div class=\"lev2 toc-item\"><a href=\"#Cauchy-like-matrix\" data-toc-modified-id=\"Cauchy-like-matrix-113\"><span class=\"toc-item-num\">1.13&nbsp;&nbsp;</span>Cauchy-like matrix</a></div><div class=\"lev2 toc-item\"><a href=\"#Structured-rank-matrix\" data-toc-modified-id=\"Structured-rank-matrix-114\"><span class=\"toc-item-num\">1.14&nbsp;&nbsp;</span>Structured-rank matrix</a></div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# A list of \"easy\" linear systems\n",
    "\n",
    "Consider $\\mathbf{A} \\mathbf{x} = \\mathbf{b}$, $\\mathbf{A} \\in \\mathbb{R}^{n \\times n}$. Or, consider matrix inverse (if you want). $\\mathbf{A}$ can be huge. Keep massive data in mind: 1000 Genome Project, NetFlix, Google PageRank, finance, spatial statistics, ... We should be alert to many easy linear systems. \n",
    "\n",
    "Don't blindly use `A \\ b` and `inv` in Julia or `solve` function in R. **Don't waste computing resources by bad choices of algorithms!**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Diagonal matrix\n",
    "\n",
    "Diagonal $\\mathbf{A}$: $n$ flops. Use `Diagonal` type of Julia."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "\\(A::<b>AbstractArray{T,2} where T</b>, B::<b>Union{AbstractArray{T,1}, AbstractArray{T,2}} where T</b>) at <a href=\"https://github.com/JuliaLang/julia/tree/d386e40c17d43b79fc89d3e579fc04547241787c/base/linalg/generic.jl#L820\" target=\"_blank\">linalg/generic.jl:820</a>"
      ],
      "text/plain": [
       "\\(A::AbstractArray{T,2} where T, B::Union{AbstractArray{T,1}, AbstractArray{T,2}} where T) in Base.LinAlg at linalg/generic.jl:820"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "using BenchmarkTools\n",
    "\n",
    "srand(280)\n",
    "n = 1000\n",
    "A = diagm(randn(n)) # a full matrix\n",
    "b = randn(n)\n",
    "\n",
    "@which A \\ b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "BenchmarkTools.Trial: \n",
       "  memory estimate:  15.95 KiB\n",
       "  allocs estimate:  5\n",
       "  --------------\n",
       "  minimum time:     729.697 μs (0.00% GC)\n",
       "  median time:      743.364 μs (0.00% GC)\n",
       "  mean time:        778.562 μs (0.06% GC)\n",
       "  maximum time:     2.414 ms (62.58% GC)\n",
       "  --------------\n",
       "  samples:          6392\n",
       "  evals/sample:     1"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# check `istril(A)` and `istriu(A)`, then call `Diagonal(A) \\ b`\n",
    "@benchmark A \\ b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "BenchmarkTools.Trial: \n",
       "  memory estimate:  15.95 KiB\n",
       "  allocs estimate:  5\n",
       "  --------------\n",
       "  minimum time:     3.791 μs (0.00% GC)\n",
       "  median time:      4.701 μs (0.00% GC)\n",
       "  mean time:        5.935 μs (14.60% GC)\n",
       "  maximum time:     267.824 μs (95.83% GC)\n",
       "  --------------\n",
       "  samples:          10000\n",
       "  evals/sample:     8"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# O(n) computation, no extra array allocation\n",
    "@benchmark Diagonal(A) \\ b"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Bidiagonal, tridiagonal, and banded matrices\n",
    "\n",
    "Bidiagonal, tridiagonal, or banded $\\mathbf{A}$: Band LU, band Cholesky, ... roughly $O(n)$ flops.   \n",
    "* Use [`Bidiagonal`](https://docs.julialang.org/en/stable/stdlib/linalg/#Base.LinAlg.Bidiagonal), [`Tridiagonal`](https://docs.julialang.org/en/stable/stdlib/linalg/#Base.LinAlg.Tridiagonal), [`SymTridiagonal`](https://docs.julialang.org/en/stable/stdlib/linalg/#Base.LinAlg.SymTridiagonal) types of Julia."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1000×1000 SymTridiagonal{Float64}:\n",
       " 0.126238   0.618244    ⋅        …    ⋅          ⋅          ⋅      \n",
       " 0.618244  -2.34688    1.10206        ⋅          ⋅          ⋅      \n",
       "  ⋅         1.10206    1.91661        ⋅          ⋅          ⋅      \n",
       "  ⋅          ⋅        -0.447244       ⋅          ⋅          ⋅      \n",
       "  ⋅          ⋅          ⋅             ⋅          ⋅          ⋅      \n",
       "  ⋅          ⋅          ⋅        …    ⋅          ⋅          ⋅      \n",
       "  ⋅          ⋅          ⋅             ⋅          ⋅          ⋅      \n",
       "  ⋅          ⋅          ⋅             ⋅          ⋅          ⋅      \n",
       "  ⋅          ⋅          ⋅             ⋅          ⋅          ⋅      \n",
       "  ⋅          ⋅          ⋅             ⋅          ⋅          ⋅      \n",
       "  ⋅          ⋅          ⋅        …    ⋅          ⋅          ⋅      \n",
       "  ⋅          ⋅          ⋅             ⋅          ⋅          ⋅      \n",
       "  ⋅          ⋅          ⋅             ⋅          ⋅          ⋅      \n",
       " ⋮                               ⋱                                 \n",
       "  ⋅          ⋅          ⋅             ⋅          ⋅          ⋅      \n",
       "  ⋅          ⋅          ⋅             ⋅          ⋅          ⋅      \n",
       "  ⋅          ⋅          ⋅        …    ⋅          ⋅          ⋅      \n",
       "  ⋅          ⋅          ⋅             ⋅          ⋅          ⋅      \n",
       "  ⋅          ⋅          ⋅             ⋅          ⋅          ⋅      \n",
       "  ⋅          ⋅          ⋅             ⋅          ⋅          ⋅      \n",
       "  ⋅          ⋅          ⋅             ⋅          ⋅          ⋅      \n",
       "  ⋅          ⋅          ⋅        …    ⋅          ⋅          ⋅      \n",
       "  ⋅          ⋅          ⋅            0.709662    ⋅          ⋅      \n",
       "  ⋅          ⋅          ⋅            0.542815  -0.363838    ⋅      \n",
       "  ⋅          ⋅          ⋅           -0.363838  -1.04034    0.321148\n",
       "  ⋅          ⋅          ⋅             ⋅         0.321148  -0.276537"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "srand(280) \n",
    "\n",
    "n  = 1000\n",
    "dv = randn(n)\n",
    "ev = randn(n - 1)\n",
    "b  = randn(n)\n",
    "# symmetric tridiagonal matrix\n",
    "A  = SymTridiagonal(dv, ev)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "BenchmarkTools.Trial: \n",
       "  memory estimate:  7.65 MiB\n",
       "  allocs estimate:  8\n",
       "  --------------\n",
       "  minimum time:     12.552 ms (0.00% GC)\n",
       "  median time:      13.446 ms (0.00% GC)\n",
       "  mean time:        13.404 ms (4.16% GC)\n",
       "  maximum time:     16.647 ms (0.00% GC)\n",
       "  --------------\n",
       "  samples:          373\n",
       "  evals/sample:     1"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# convert to a full matrix\n",
    "Afull = full(A)\n",
    "\n",
    "# LU decomposition (2/3) n^3 flops!\n",
    "@benchmark Afull \\ b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "BenchmarkTools.Trial: \n",
       "  memory estimate:  23.89 KiB\n",
       "  allocs estimate:  6\n",
       "  --------------\n",
       "  minimum time:     12.511 μs (0.00% GC)\n",
       "  median time:      14.714 μs (0.00% GC)\n",
       "  mean time:        17.326 μs (8.28% GC)\n",
       "  maximum time:     2.081 ms (97.27% GC)\n",
       "  --------------\n",
       "  samples:          10000\n",
       "  evals/sample:     1"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# specialized algorithm for tridiagonal matrix, O(n) flops\n",
    "@benchmark A \\ b"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Triangular matrix\n",
    "\n",
    "Triangular $\\mathbf{A}$: $n^2$ flops to solve linear system."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "BenchmarkTools.Trial: \n",
       "  memory estimate:  7.95 KiB\n",
       "  allocs estimate:  2\n",
       "  --------------\n",
       "  minimum time:     762.721 μs (0.00% GC)\n",
       "  median time:      946.558 μs (0.00% GC)\n",
       "  mean time:        1.051 ms (0.02% GC)\n",
       "  maximum time:     7.309 ms (0.00% GC)\n",
       "  --------------\n",
       "  samples:          4721\n",
       "  evals/sample:     1"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "srand(280)\n",
    "\n",
    "n = 1000\n",
    "A = tril(randn(n, n))\n",
    "b = randn(n)\n",
    "\n",
    "# check istril() then triangular solve\n",
    "@benchmark A \\ b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "BenchmarkTools.Trial: \n",
       "  memory estimate:  7.97 KiB\n",
       "  allocs estimate:  3\n",
       "  --------------\n",
       "  minimum time:     320.558 μs (0.00% GC)\n",
       "  median time:      380.211 μs (0.00% GC)\n",
       "  mean time:        430.062 μs (0.07% GC)\n",
       "  maximum time:     4.834 ms (0.00% GC)\n",
       "  --------------\n",
       "  samples:          10000\n",
       "  evals/sample:     1"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# triangular solve directly; save the cost of istril()\n",
    "@benchmark LowerTriangular(A) \\ b"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Block diagonal matrix\n",
    "\n",
    "Block diagonal: Suppose $n = \\sum_i n_i$. $(\\sum_i n_i)^3$ vs $\\sum_i n_i^3$.  \n",
    "\n",
    "Julia has a [`blkdiag`](https://docs.julialang.org/en/stable/stdlib/linalg/#Base.SparseArrays.blkdiag) function that generates a **sparse** matrix. Anyone interested writing a `BlockDiagonal.jl` package?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1000×1000 SparseMatrixCSC{Float64,Int64} with 969 stored entries:\n",
       "  [31  ,    1]  =  2.07834\n",
       "  [53  ,    1]  =  -1.11883\n",
       "  [58  ,    1]  =  -0.66448\n",
       "  [14  ,    4]  =  1.11793\n",
       "  [96  ,    5]  =  1.22813\n",
       "  [81  ,    8]  =  -0.919643\n",
       "  [48  ,    9]  =  1.0185\n",
       "  [49  ,    9]  =  -0.996332\n",
       "  [15  ,   10]  =  1.30841\n",
       "  [28  ,   10]  =  -0.818757\n",
       "  ⋮\n",
       "  [956 ,  987]  =  -0.900804\n",
       "  [967 ,  987]  =  -0.438788\n",
       "  [971 ,  991]  =  0.176756\n",
       "  [929 ,  992]  =  -1.17384\n",
       "  [974 ,  993]  =  1.59235\n",
       "  [967 ,  994]  =  0.542169\n",
       "  [994 ,  995]  =  0.627832\n",
       "  [998 ,  997]  =  0.60382\n",
       "  [935 ,  998]  =  0.342675\n",
       "  [947 ,  998]  =  0.482228\n",
       "  [975 , 1000]  =  0.991598"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "srand(280)\n",
    "\n",
    "B  = 10 # number of blocks\n",
    "ni = 100\n",
    "A  = blkdiag([sprandn(ni, ni, 0.01) for b in 1:B]...)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "## Kronecker product\n",
    "\n",
    "Use\n",
    "$$\n",
    "\\begin{eqnarray*}\n",
    "    (\\mathbf{A} \\otimes \\mathbf{B})^{-1} &=& \\mathbf{A}^{-1} \\otimes \\mathbf{B}^{-1} \\\\\n",
    "    (\\mathbf{C}^T \\otimes \\mathbf{A}) \\text{vec}(\\mathbf{B}) &=& \\text{vec}(\\mathbf{A} \\mathbf{B} \\mathbf{C}).\n",
    "\\end{eqnarray*}    \n",
    "$$\n",
    "to avoid forming and doing computation on the potentially huge Kronecker $\\mathbf{A} \\otimes \\mathbf{B}$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Sparse matrix\n",
    "\n",
    "Sparsity: sparse matrix decomposition or iterative method.  \n",
    "* The easiest recognizable structure. Familiarize yourself with the sparse matrix computation tools in Julia, Matlab, R (`Matrix` package), MKL (sparse BLAS), ... as much as possible."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.005096"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "srand(280)\n",
    "\n",
    "n = 1000\n",
    "# a sparse pd matrix, about 0.5% non-zero entries\n",
    "A = sprand(n, n, 0.002)\n",
    "A = A + A' + (2n) * I\n",
    "b = randn(n)\n",
    "Afull = full(A)\n",
    "countnz(A) / length(A) # sparsity"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "BenchmarkTools.Trial: \n",
       "  memory estimate:  7.94 KiB\n",
       "  allocs estimate:  1\n",
       "  --------------\n",
       "  minimum time:     100.168 μs (0.00% GC)\n",
       "  median time:      186.690 μs (0.00% GC)\n",
       "  mean time:        201.362 μs (0.00% GC)\n",
       "  maximum time:     2.580 ms (0.00% GC)\n",
       "  --------------\n",
       "  samples:          10000\n",
       "  evals/sample:     1"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# dense matrix-vector multiplication\n",
    "@benchmark Afull * b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "BenchmarkTools.Trial: \n",
       "  memory estimate:  7.94 KiB\n",
       "  allocs estimate:  1\n",
       "  --------------\n",
       "  minimum time:     6.943 μs (0.00% GC)\n",
       "  median time:      7.599 μs (0.00% GC)\n",
       "  mean time:        8.241 μs (2.32% GC)\n",
       "  maximum time:     483.364 μs (95.82% GC)\n",
       "  --------------\n",
       "  samples:          10000\n",
       "  evals/sample:     5"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# sparse matrix-vector multiplication\n",
    "@benchmark A * b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "BenchmarkTools.Trial: \n",
       "  memory estimate:  7.63 MiB\n",
       "  allocs estimate:  8\n",
       "  --------------\n",
       "  minimum time:     9.110 ms (0.00% GC)\n",
       "  median time:      11.609 ms (0.00% GC)\n",
       "  mean time:        11.827 ms (8.51% GC)\n",
       "  maximum time:     68.000 ms (85.06% GC)\n",
       "  --------------\n",
       "  samples:          422\n",
       "  evals/sample:     1"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# dense Cholesky decomposition\n",
    "@benchmark cholfact(Afull)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "BenchmarkTools.Trial: \n",
       "  memory estimate:  1.33 MiB\n",
       "  allocs estimate:  53\n",
       "  --------------\n",
       "  minimum time:     3.326 ms (0.00% GC)\n",
       "  median time:      3.434 ms (0.00% GC)\n",
       "  mean time:        3.631 ms (0.99% GC)\n",
       "  maximum time:     7.742 ms (0.00% GC)\n",
       "  --------------\n",
       "  samples:          1375\n",
       "  evals/sample:     1"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# sparse Cholesky decomposition\n",
    "@benchmark cholfact(A)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "BenchmarkTools.Trial: \n",
       "  memory estimate:  7.64 MiB\n",
       "  allocs estimate:  10\n",
       "  --------------\n",
       "  minimum time:     10.132 ms (0.00% GC)\n",
       "  median time:      11.326 ms (0.00% GC)\n",
       "  mean time:        11.478 ms (5.48% GC)\n",
       "  maximum time:     16.560 ms (0.00% GC)\n",
       "  --------------\n",
       "  samples:          435\n",
       "  evals/sample:     1"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# solve via dense Cholesky\n",
    "xchol = cholfact(Afull) \\ b\n",
    "@benchmark cholfact(Afull) \\ b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "4.020546956752686e-18"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# solve via sparse Cholesky\n",
    "xcholsp = cholfact(A) \\ b\n",
    "vecnorm(xchol - xcholsp)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "BenchmarkTools.Trial: \n",
       "  memory estimate:  1.36 MiB\n",
       "  allocs estimate:  64\n",
       "  --------------\n",
       "  minimum time:     3.825 ms (0.00% GC)\n",
       "  median time:      3.928 ms (0.00% GC)\n",
       "  mean time:        4.094 ms (0.92% GC)\n",
       "  maximum time:     6.442 ms (0.00% GC)\n",
       "  --------------\n",
       "  samples:          1220\n",
       "  evals/sample:     1"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "@benchmark cholfact(A) \\ b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\u001b[1m\u001b[36mINFO: \u001b[39m\u001b[22m\u001b[36mRecompiling stale cache file /Users/huazhou/.julia/lib/v0.6/IterativeSolvers.ji for module IterativeSolvers.\n",
      "\u001b[39m"
     ]
    },
    {
     "data": {
      "text/plain": [
       "0.022750177100322466"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# sparse solve via conjugate gradient\n",
    "using IterativeSolvers\n",
    "\n",
    "xcg, = cg(A, b)\n",
    "vecnorm(xcg - xchol)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "BenchmarkTools.Trial: \n",
       "  memory estimate:  33.30 KiB\n",
       "  allocs estimate:  27\n",
       "  --------------\n",
       "  minimum time:     31.181 μs (0.00% GC)\n",
       "  median time:      34.950 μs (0.00% GC)\n",
       "  mean time:        40.390 μs (8.18% GC)\n",
       "  maximum time:     2.927 ms (94.92% GC)\n",
       "  --------------\n",
       "  samples:          10000\n",
       "  evals/sample:     1"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "@benchmark cg(A, b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "## Easy plus low rank\n",
    "\n",
    "Easy plus low rank: $\\mathbf{U} \\in \\mathbb{R}^{n \\times r}$, $\\mathbf{V} \\in \\mathbb{R}^{r \\times n}$, $r \\ll n$. Woodbury formula\n",
    "$$\n",
    "\t(\\mathbf{A} + \\mathbf{U} \\mathbf{V}^T)^{-1} = \\mathbf{A}^{-1} - \\mathbf{A}^{-1} \\mathbf{U} (\\mathbf{I}_r + \\mathbf{V}^T \\mathbf{A}^{-1} \\mathbf{U})^{-1} \\mathbf{V}^T \\mathbf{A}^{-1},\n",
    "$$\n",
    "\n",
    "* Keep HW2 Q2 in mind.  \n",
    "* [`WoodburyMatrices.jl`](https://github.com/timholy/WoodburyMatrices.jl) package can be useful."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1000×1000 Symmetric{Float64,Array{Float64,2}}:\n",
       "  1.8571     0.513107    0.872146   …   0.764278    -0.241331    0.54921 \n",
       "  0.513107   4.57505    -0.636972      -1.86465     -1.92237    -1.72569 \n",
       "  0.872146  -0.636972    4.81387        1.99357      1.99337     3.66327 \n",
       " -0.516414  -0.996711   -0.0919924      0.262832     0.612402    0.621834\n",
       "  0.193686   1.68244    -0.770028      -0.723437    -1.4868     -1.32247 \n",
       "  1.6567     0.0634435  -0.901968   …  -0.241872    -0.0356772  -0.39826 \n",
       "  0.553996  -0.274515    2.21265        0.219437     2.20382     2.60902 \n",
       "  0.402356   1.89288    -1.13032       -0.771441    -1.96862    -1.93483 \n",
       " -1.07744   -1.63881     1.78016        0.96551      1.7292      1.91326 \n",
       " -2.21617   -2.90695    -2.55971       -0.47867      0.855389   -0.933916\n",
       "  1.29975    0.779828    4.12459    …   1.87358      0.737112    2.84136 \n",
       " -0.80833    1.44882     1.67581       -0.139063    -0.107873    0.818132\n",
       " -2.32469   -4.83109    -2.31796       -0.0346402    2.65564     0.591075\n",
       "  ⋮                                 ⋱                                    \n",
       "  0.334995   0.0914829   1.60026        0.0937123    1.40804     1.92919 \n",
       "  0.390722   2.36157    -1.58383       -1.92435     -1.3291     -1.88114 \n",
       "  1.19218    0.845478    1.9362     …  -0.0890571    1.36046     2.01013 \n",
       "  0.915526   0.889395   -0.606443      -0.428052    -0.817555   -1.2017  \n",
       " -0.778472   2.1444      1.50696        0.00689644  -1.28104    -0.141234\n",
       "  0.275366  -0.25866     0.416593       0.481534     0.0874531   0.316543\n",
       " -0.289749  -1.16146     0.550825       0.698152     0.789054    0.949917\n",
       "  0.439213  -0.379216    1.0951     …   0.626399     0.624574    0.8568  \n",
       " -0.592534  -0.235847   -1.11586       -0.601497    -0.0651787  -0.573737\n",
       "  0.764278  -1.86465     1.99357        3.21548      0.932773    1.97505 \n",
       " -0.241331  -1.92237     1.99337        0.932773     3.43991     2.8539  \n",
       "  0.54921   -1.72569     3.66327        1.97505      2.8539      4.85234 "
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "using BenchmarkTools, WoodburyMatrices\n",
    "\n",
    "srand(280)\n",
    "n = 1000\n",
    "r = 5\n",
    "\n",
    "A = Diagonal(rand(n))\n",
    "B = randn(n, r)\n",
    "D = Diagonal(rand(r))\n",
    "b = randn(n)\n",
    "# W = A + B * D * B'\n",
    "W = SymWoodbury(A, B, D)\n",
    "Wfull = Symmetric(full(W))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "BenchmarkTools.Trial: \n",
       "  memory estimate:  7.64 MiB\n",
       "  allocs estimate:  9\n",
       "  --------------\n",
       "  minimum time:     8.913 ms (0.00% GC)\n",
       "  median time:      9.119 ms (0.00% GC)\n",
       "  mean time:        9.485 ms (6.82% GC)\n",
       "  maximum time:     12.387 ms (15.54% GC)\n",
       "  --------------\n",
       "  samples:          527\n",
       "  evals/sample:     1"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# solve via Cholesky\n",
    "@benchmark cholfact(Wfull) \\ b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "BenchmarkTools.Trial: \n",
       "  memory estimate:  76.16 KiB\n",
       "  allocs estimate:  26\n",
       "  --------------\n",
       "  minimum time:     19.164 μs (0.00% GC)\n",
       "  median time:      24.648 μs (0.00% GC)\n",
       "  mean time:        34.932 μs (24.29% GC)\n",
       "  maximum time:     4.617 ms (98.44% GC)\n",
       "  --------------\n",
       "  samples:          10000\n",
       "  evals/sample:     1"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# solve using Woodbury formula\n",
    "@benchmark W \\ b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "BenchmarkTools.Trial: \n",
       "  memory estimate:  7.94 KiB\n",
       "  allocs estimate:  1\n",
       "  --------------\n",
       "  minimum time:     193.856 μs (0.00% GC)\n",
       "  median time:      203.398 μs (0.00% GC)\n",
       "  mean time:        205.489 μs (0.00% GC)\n",
       "  maximum time:     531.061 μs (0.00% GC)\n",
       "  --------------\n",
       "  samples:          10000\n",
       "  evals/sample:     1"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# multiplication without using Woodbury structure\n",
    "@benchmark Wfull * b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "BenchmarkTools.Trial: \n",
       "  memory estimate:  24.06 KiB\n",
       "  allocs estimate:  5\n",
       "  --------------\n",
       "  minimum time:     4.910 μs (0.00% GC)\n",
       "  median time:      6.397 μs (0.00% GC)\n",
       "  mean time:        8.787 μs (20.79% GC)\n",
       "  maximum time:     374.038 μs (96.19% GC)\n",
       "  --------------\n",
       "  samples:          10000\n",
       "  evals/sample:     7"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# multiplication using Woodbury structure\n",
    "@benchmark W * b"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Easy plus border\n",
    "\n",
    "Easy plus border: For $\\mathbf{A}$ pd and $\\mathbf{V}$ full row rank,\n",
    "$$\n",
    "\t\\begin{pmatrix}\n",
    "\t\\mathbf{A} & \\mathbf{V}^T \\\\\n",
    "\t\\mathbf{V} & \\mathbf{0}\n",
    "\t\\end{pmatrix}^{-1} = \\begin{pmatrix}\n",
    "\t\\mathbf{A}^{-1} - \\mathbf{A}^{-1} \\mathbf{V}^T (\\mathbf{V} \\mathbf{A}^{-1} \\mathbf{V}^T)^{-1} \\mathbf{V} \\mathbf{A}^{-1} & \\mathbf{A}^{-1} \\mathbf{V}^T (\\mathbf{V} \\mathbf{A}^{-1} \\mathbf{V}^T)^{-1} \\\\\n",
    "\t(\\mathbf{V} \\mathbf{A}^{-1} \\mathbf{V}^T)^{-1} \\mathbf{V} \\mathbf{A}^{-1} & - (\\mathbf{V} \\mathbf{A}^{-1} \\mathbf{V}^T)^{-1}\n",
    "\t\\end{pmatrix}.\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Orthogonal matrix\n",
    "\n",
    "Orthogonal $\\mathbf{A}$: $n^2$ flops **at most**. Permutation matrix, Householder matrix, Jacobi matrix, ... take less."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Toeplitz matrix\n",
    "\n",
    "Toeplitz systems:\n",
    "$$\n",
    "\t\\mathbf{T} = \\begin{pmatrix}\n",
    "\tr_0 & r_1 & r_2 & r_3 \\\\\n",
    "\tr_{-1} & r_0 & r_1 & r_2 \\\\\n",
    "\tr_{-2} & r_{-1} & r_0 & r_1 \\\\\n",
    "\tr_{-3} & r_{-2} & r_{-1} & r_0\n",
    "\t\\end{pmatrix}.\n",
    "$$\n",
    "$\\mathbf{T} \\mathbf{x} = \\mathbf{b}$, where $\\mathbf{T}$ is pd and Toeplitz, can be solved in $O(n^2)$ flops. Durbin algorithm (Yule-Walker equation), Levinson algorithm (general $\\mathbf{b}$), Trench algorithm (inverse). These matrices occur in auto-regressive models and econometrics.\n",
    "\n",
    "* [`ToeplitzMatrices.jl`](https://github.com/JuliaMatrices/ToeplitzMatrices.jl) package can be useful."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Circulant matrix\n",
    "\n",
    "Circulant systems: Toeplitz matrix with wraparound\n",
    "$$\n",
    "\tC(\\mathbf{z}) = \\begin{pmatrix}\n",
    "\tz_0 & z_4 & z_3 & z_2 & z_1 \\\\\n",
    "\tz_1 & z_0 & z_4 & z_3 & z_2 \\\\\n",
    "\tz_2 & z_1 & z_0 & z_4 & z_3 \\\\\n",
    "\tz_3 & z_2 & z_1 & z_0 & z_4 \\\\\n",
    "\tz_4 & z_3 & z_2 & z_1 & z_0\n",
    "\t\\end{pmatrix},\n",
    "$$\n",
    "FFT type algorithms: DCT (discrete cosine transform) and DST (discrete sine transform)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Vandermonde matrix\n",
    "\n",
    "Vandermonde matrix: such as in interpolation and approximation problems\n",
    "$$\n",
    "\t\\mathbf{V}(x_0,\\ldots,x_n) = \\begin{pmatrix}\n",
    "\t1 & 1 & \\cdots & 1 \\\\\n",
    "\tx_0 & x_1 & \\cdots & x_n \\\\\n",
    "\t\\vdots & \\vdots & & \\vdots \\\\\n",
    "\tx_0^n & x_1^n & \\cdots & x_n^n\n",
    "\t\\end{pmatrix}.\n",
    "$$\n",
    "$\\mathbf{V} \\mathbf{x} = \\mathbf{b}$ or $\\mathbf{V}^T \\mathbf{x} = \\mathbf{b}$ can be solved in $O(n^2)$ flops."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Cauchy-like matrix\n",
    "\n",
    "Cauchy-like matrices:\n",
    "$$\n",
    "\t\\Omega \\mathbf{A} - \\mathbf{A} \\Lambda = \\mathbf{R} \\mathbf{S}^T,\n",
    "$$\n",
    "where $\\Omega = \\text{diag}(\\omega_1,\\ldots,\\omega_n)$ and $\\Lambda = \\text{diag}(\\lambda_1,\\ldots, \\lambda_n)$. $O(n)$ flops for LU and QR."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Structured-rank matrix\n",
    "\n",
    "Structured-rank problems: semiseparable matrices (LU and QR takes $O(n)$ flops), quasiseparable matrices, ..."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Julia Version 0.6.2\n",
      "Commit d386e40c17 (2017-12-13 18:08 UTC)\n",
      "Platform Info:\n",
      "  OS: macOS (x86_64-apple-darwin14.5.0)\n",
      "  CPU: Intel(R) Core(TM) i7-6920HQ CPU @ 2.90GHz\n",
      "  WORD_SIZE: 64\n",
      "  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)\n",
      "  LAPACK: libopenblas64_\n",
      "  LIBM: libopenlibm\n",
      "  LLVM: libLLVM-3.9.1 (ORCJIT, skylake)\n"
     ]
    }
   ],
   "source": [
    "versioninfo()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Julia 0.6.2",
   "language": "julia",
   "name": "julia-0.6"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "0.6.2"
  },
  "toc": {
   "colors": {
    "hover_highlight": "#DAA520",
    "running_highlight": "#FF0000",
    "selected_highlight": "#FFD700"
   },
   "moveMenuLeft": true,
   "nav_menu": {
    "height": "30px",
    "width": "252px"
   },
   "navigate_menu": true,
   "number_sections": true,
   "sideBar": true,
   "threshold": 4,
   "toc_cell": true,
   "toc_section_display": "block",
   "toc_window_display": true,
   "widenNotebook": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}