{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Transpose, Permutations, and Orthogonality\n",
    "\n",
    "One special type of matrix for which we can solve problems much more quickly is a permutation matrix, introduced as $PA=LU$ factorization."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "permutation_matrix (generic function with 1 method)"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# construct a permutation matrix P from the permutation vector p\n",
    "function permutation_matrix(p)\n",
    "    P = zeros(Int, length(p),length(p))\n",
    "    for i = 1:length(p)\n",
    "        P[i,p[i]] = 1\n",
    "    end\n",
    "    return P\n",
    "end"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 Array{Int64,2}:\n",
       " 0  1  0  0  0\n",
       " 0  0  0  1  0\n",
       " 1  0  0  0  0\n",
       " 0  0  0  0  1\n",
       " 0  0  1  0  0"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "P =\n",
    "permutation_matrix([2,4,1,5,3])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\"ONE-HOT VECTOR\""
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "\"ONE-HOT VECTOR\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 Array{Float64,2}:\n",
       " 1.0  0.0  0.0  0.0  0.0\n",
       " 0.0  1.0  0.0  0.0  0.0\n",
       " 0.0  0.0  1.0  0.0  0.0\n",
       " 0.0  0.0  0.0  1.0  0.0\n",
       " 0.0  0.0  0.0  0.0  1.0"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "I₅ = eye(5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 Array{Float64,2}:\n",
       " 0.0  1.0  0.0  0.0  0.0\n",
       " 0.0  0.0  0.0  1.0  0.0\n",
       " 1.0  0.0  0.0  0.0  0.0\n",
       " 0.0  0.0  0.0  0.0  1.0\n",
       " 0.0  0.0  1.0  0.0  0.0"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "P * I₅"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The inverse of any permutation matrix $P$ turns out to be its [transpose](https://en.wikipedia.org/wiki/Transpose) $P^T$: we just swap rows and columns.  In Julia, this is denoted `P'` (technically, this is the conjugate transpose, and `P.'` is the transpose, but the two are the same for real-number matrices where complex conjugation does nothing)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 Array{Int64,2}:\n",
       " 0  1  0  0  0\n",
       " 0  0  0  1  0\n",
       " 1  0  0  0  0\n",
       " 0  0  0  0  1\n",
       " 0  0  1  0  0"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "P"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 Array{Int64,2}:\n",
       " 1  0  0  0  0\n",
       " 0  1  0  0  0\n",
       " 0  0  1  0  0\n",
       " 0  0  0  1  0\n",
       " 0  0  0  0  1"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "P'*P"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 Array{Int64,2}:\n",
       " 1  0  0  0  0\n",
       " 0  1  0  0  0\n",
       " 0  0  1  0  0\n",
       " 0  0  0  1  0\n",
       " 0  0  0  0  1"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "P*P'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The reason this works is that $P^T P$ computes the dot products of *all the columns* of $P$ with *all of the columns*, and the columns of $P$ are [orthonormal](https://en.wikipedia.org/wiki/Orthonormality) (orthogonal with length 1).  We say that $P$ is an example of an [\"orthogonal\" matrix or a \"unitary\" matrix](https://en.wikipedia.org/wiki/Unitary_matrix).  We will have much to say about such matrices later in 18.06."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3×5 Array{Float64,2}:\n",
       " 0.656093   0.0785021  0.0874409  0.741967  0.401168 \n",
       " 0.0552685  0.0664353  0.369016   0.527357  0.0764656\n",
       " 0.133683   0.416092   0.52589    0.838358  0.628186 "
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "A =rand(3,5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×3 Array{Float64,2}:\n",
       " 0.656093   0.0552685  0.133683\n",
       " 0.0785021  0.0664353  0.416092\n",
       " 0.0874409  0.369016   0.52589 \n",
       " 0.741967   0.527357   0.838358\n",
       " 0.401168   0.0764656  0.628186"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "A'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "my_transpose (generic function with 1 method)"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "function my_transpose(A)\n",
    "    m,n=size(A)\n",
    "    B = zeros(n,m)\n",
    "    for i=1:m, j=1:n\n",
    "        B[j,i]=A[i,j]\n",
    "    end\n",
    "    B\n",
    "end"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3×5 Array{Float64,2}:\n",
       " 0.656093   0.0785021  0.0874409  0.741967  0.401168 \n",
       " 0.0552685  0.0664353  0.369016   0.527357  0.0764656\n",
       " 0.133683   0.416092   0.52589    0.838358  0.628186 "
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "A"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×3 Array{Float64,2}:\n",
       " 0.656093   0.0552685  0.133683\n",
       " 0.0785021  0.0664353  0.416092\n",
       " 0.0874409  0.369016   0.52589 \n",
       " 0.741967   0.527357   0.838358\n",
       " 0.401168   0.0764656  0.628186"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "my_transpose(A)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3×3 Array{Float64,2}:\n",
       " 0.0657944  0.133402  0.10374 \n",
       " 0.974459   0.506739  0.981765\n",
       " 0.591419   0.472668  1.34456 "
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "A = rand(3,3)\n",
    "B = rand(3,3)\n",
    "(A*B)'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3×3 Array{Float64,2}:\n",
       " 0.0657944  0.133402  0.10374 \n",
       " 0.974459   0.506739  0.981765\n",
       " 0.591419   0.472668  1.34456 "
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "B' * A'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Transposes and products\n",
    "\n",
    "Transposes are important in linear algebra because they have a special relationship to matrix and vector products:\n",
    "$$\n",
    "(AB)^T = B^T A^T\n",
    "$$\n",
    "and hence for a dot product (inner product) $x^T y$\n",
    "$$\n",
    "x \\cdot (Ay) = x \\mbox{ \"dot\" } (Ay) = x^T (Ay) = (A^T x)^T y = (A^T x) \\mbox{ \"dot\" } y = (A^T x) \\cdot y\n",
    "$$\n",
    "We can even turn the second step around and use this as the *definition* of a transpose: a transpose is *what \"moves\" a matrix from one side to the other of a dot product.*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3-element Array{Float64,1}:\n",
       " 0.289551\n",
       " 0.257849\n",
       " 0.976208"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x = rand(3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3-element Array{Float64,1}:\n",
       " 0.284907\n",
       " 0.180932\n",
       " 0.28499 "
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y  = rand(3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.6837202268882089"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x ⋅ (A*y) # The transpose of A is the unique matrix such that\n",
    "# x ⋅ (A*y) = (A'x)⋅y for all x and y"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.6837202268882088"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "(A'x)⋅y"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "true"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "C = rand(-9:9, 4,4)\n",
    "D = rand(-9:9, 4,4)\n",
    "(C*D)' == D'*C'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Transposes and inverses\n",
    "\n",
    "From the above property, we have:\n",
    "$$\n",
    "(A A^{-1})^T = (A^{-1})^T A^T = I^T = I\n",
    "$$\n",
    "and it follows that:\n",
    "$$\n",
    "(A^{-1})^T = (A^T)^{-1}\n",
    "$$\n",
    "The *transpose of the inverse* is the *inverse of the transpose*."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 Array{Int64,2}:\n",
       "  4  -2  -7  -4  -8\n",
       "  9  -6  -6  -1  -5\n",
       " -2  -9   3  -5   2\n",
       "  9   7  -9   5  -8\n",
       " -1   6  -3   9   6"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "A = [4  -2  -7  -4  -8\n",
    "     9  -6  -6  -1  -5\n",
    "    -2  -9   3  -5   2\n",
    "     9   7  -9   5  -8\n",
    "    -1   6  -3   9   6]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 Array{Float64,2}:\n",
       "  0.0109991   0.131989  -0.235564  -0.301558    0.2044   \n",
       "  0.529789    0.35747   -0.179652  -0.69172     0.678582 \n",
       " -0.908341   -0.900092   0.370302   1.48701    -1.29667  \n",
       " -0.635197   -0.622365   0.353804   1.16499    -1.05408  \n",
       " -0.0879927  -0.055912  -0.11549    0.0791323   0.0314696"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "inv(A')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 Array{Float64,2}:\n",
       "  0.0109991   0.131989  -0.235564  -0.301558    0.2044   \n",
       "  0.529789    0.35747   -0.179652  -0.69172     0.678582 \n",
       " -0.908341   -0.900092   0.370302   1.48701    -1.29667  \n",
       " -0.635197   -0.622365   0.353804   1.16499    -1.05408  \n",
       " -0.0879927  -0.055912  -0.11549    0.0791323   0.0314696"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "inv(A)'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As expected, they match!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Transposes and LU factors\n",
    "\n",
    "If $A = LU$, then $A^T = U^T L^T$.  Note that $U^T$ is *lower* triangular, and $L^T$ is *upper* trangular.  That means, that once we have the LU factorization of $A$, we immediately have a similar factorization of $A^T$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "([1.0 0.0 … 0.0 0.0; 1.0 1.0 … 0.0 0.0; … ; -0.111111 0.410256 … 1.0 0.0; -0.222222 -0.794872 … 0.0242696 1.0], [9.0 -6.0 … -1.0 -5.0; 0.0 13.0 … 6.0 -3.0; … ; 0.0 0.0 … 8.67894 9.95297; 0.0 0.0 … 0.0 -0.771206], [2, 4, 1, 5, 3])"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "L,U,p = lu(A)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5-element Array{Int64,1}:\n",
       " 2\n",
       " 4\n",
       " 1\n",
       " 5\n",
       " 3"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "p"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 Array{Float64,2}:\n",
       " 1.0  1.0  0.444444   -0.111111  -0.222222 \n",
       " 0.0  1.0  0.0512821   0.410256  -0.794872 \n",
       " 0.0  0.0  1.0         0.582822   0.171779 \n",
       " 0.0  0.0  0.0         1.0        0.0242696\n",
       " 0.0  0.0  0.0         0.0        1.0      "
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "L'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 Array{Float64,2}:\n",
       "  9.0   0.0   0.0      0.0       0.0     \n",
       " -6.0  13.0   0.0      0.0       0.0     \n",
       " -6.0  -3.0  -4.17949  0.0       0.0     \n",
       " -1.0   6.0  -3.86325  8.67894   0.0     \n",
       " -5.0  -3.0  -5.62393  9.95297  -0.771206"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "U'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In particular, suppose we know the $PA = LU$ factorization for $A$, but we want to solve $A^T x = b$.  We can:\n",
    "\n",
    "* Write $A = P^T L U \\implies A^T = U^T L^T P$\n",
    "* Substitute this in to $A^T x = b$ to obtain $U^T L^T P x = b$\n",
    "* Parenthesize and solve from the \"outside in\": $U^T (L^T (P x)) = b$:\n",
    "    - First solve $U^T c = b$ for $c$ by forward-substitution\n",
    "    - Then solve $L^T d = c$ by backsubstitution\n",
    "    - Then solve $P x = d$ for $x = P^T d$ (i.e. just reversing the permutation)\n",
    "\n",
    "Let's try it:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5-element Array{Float64,1}:\n",
       "   1.28873 \n",
       "   6.07363 \n",
       " -11.9273  \n",
       "  -8.92392 \n",
       "  -0.643141"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "b = [4,2,1,-2,3] # \"randomly\" chosen right-hand side\n",
    "A' \\ b # correct solution to Aᵀx = b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5-element Array{Float64,1}:\n",
       "   1.28873 \n",
       "   6.07363 \n",
       " -11.9273  \n",
       "  -8.92392 \n",
       "  -0.643141"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "c = U' \\ b # forward-substitution\n",
    "d = L' \\ c # backsubstitution\n",
    "permutation_matrix(p)' * d"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As usual, the `lufact(A)` object (which encapsulates $L$, $U$, and $P$) does all this for you (in a more efficient way because it makes sure to take advantage of the special structure of these matrices, which we didn't above):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5-element Array{Float64,1}:\n",
       "   1.28873 \n",
       "   6.07363 \n",
       " -11.9273  \n",
       "  -8.92392 \n",
       "  -0.643141"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "LU = lufact(A)\n",
    "LU' \\ b"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Symmetric matrices\n",
    "\n",
    "A *very* important type of matrix that arises frequently in real problems (we will have *much more* to say about this later in the course, after exam 2) is a [symmetric matrix](https://en.wikipedia.org/wiki/Symmetric_matrix): a matrix $S$ that is equal to its transpose $S = S^T$.\n",
    "\n",
    "Given any matrix $A$, we can make a symmetric matrix out of it very easily in two ways:\n",
    "* $A + A^T$ (or often we write the \"symmetric part\" of $A$ as $\\frac{A + A^T}{2}$).  (For square matrices only.)\n",
    "* $A^T A$ or $A A^T$.  (This even works for *non-square* matrix.)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 Array{Int64,2}:\n",
       "  4  -2  -7  -4  -8\n",
       "  9  -6  -6  -1  -5\n",
       " -2  -9   3  -5   2\n",
       "  9   7  -9   5  -8\n",
       " -1   6  -3   9   6"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "A"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 Array{Int64,2}:\n",
       "  8    7   -9    5  -9\n",
       "  7  -12  -15    6   1\n",
       " -9  -15    6  -14  -1\n",
       "  5    6  -14   10   1\n",
       " -9    1   -1    1  12"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "S = A' + A"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "true"
      ]
     },
     "execution_count": 33,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "S' == (A')'+ A' == A +A' == A' + A"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 Array{Int64,2}:\n",
       "  183   13  -166   21  -159\n",
       "   13  206   -58  148     8\n",
       " -166  -58   184  -53   146\n",
       "   21  148   -53  148    41\n",
       " -159    8   146   41   193"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "S = A' * A"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "true"
      ]
     },
     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "S' == (A' *A )' == A' * A"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The ordinary LU factorization of a symmetric $S$, however, seems to have nothing to do with the symmetry of $S$.  Is there any special relationship between $L$ and $U$ in this case?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "([1.0 0.0 … 0.0 0.0; 0.0710383 1.0 … 0.0 0.0; … ; 0.114754 0.714408 … 1.0 0.0; -0.868852 0.0940872 … 1.11804 1.0], [183.0 13.0 … 21.0 -159.0; 0.0 205.077 … 146.508 19.2951; … ; 0.0 0.0 … 40.8852 45.7112; 0.0 0.0 … 0.0 0.303428], [1, 2, 3, 4, 5])"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "L, U = lu(S, Val{false}) # LU without pivoting"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 Array{Float64,2}:\n",
       "  1.0         0.0         0.0        0.0      0.0\n",
       "  0.0710383   1.0         0.0        0.0      0.0\n",
       " -0.907104   -0.225319    1.0        0.0      0.0\n",
       "  0.114754    0.714408   -0.0408412  1.0      0.0\n",
       " -0.868852    0.0940872   0.265894   1.11804  1.0"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "L"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 Array{Float64,2}:\n",
       " 183.0   13.0    -166.0      21.0       -159.0     \n",
       "   0.0  205.077   -46.2077  146.508       19.2951  \n",
       "   0.0    0.0      23.0093   -0.939727     6.11804 \n",
       "   0.0    0.0       0.0      40.8852      45.7112  \n",
       "   0.0    0.0       0.0       0.0          0.303428"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "U"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "$U$ and $L$ *seem* quite different because $L$ has 1's along the diagonal, but $U$ has some other numbers (the pivots).  We can extract these with `diag(U)` in Julia:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5-element Array{Float64,1}:\n",
       " 183.0     \n",
       " 205.077   \n",
       "  23.0093  \n",
       "  40.8852  \n",
       "   0.303428"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "diag(U)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We could make $U$ look more like $L$ if we *divided each row* of $U$ by these pivots.  That corresponds to multiplying $D^{-1} U$, where $D$ is the *diagonal matrix* of the pivots:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 Array{Float64,2}:\n",
       " 183.0    0.0     0.0      0.0     0.0     \n",
       "   0.0  205.077   0.0      0.0     0.0     \n",
       "   0.0    0.0    23.0093   0.0     0.0     \n",
       "   0.0    0.0     0.0     40.8852  0.0     \n",
       "   0.0    0.0     0.0      0.0     0.303428"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "D = diagm(diag(U)) # diagm makes a diagonal matrix from a 1d array"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Since a diagonal matrix just multiplies each row by a single number, the inverse of a diagonal matrix simply *divides* each row by the *reciprocal* of that number:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 Array{Float64,2}:\n",
       " 0.00546448  0.0         0.0        0.0        0.0    \n",
       " 0.0         0.00487623  0.0        0.0        0.0    \n",
       " 0.0         0.0         0.0434607  0.0        0.0    \n",
       " 0.0         0.0         0.0        0.0244587  0.0    \n",
       " 0.0         0.0         0.0        0.0        3.29568"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "inv(D)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 Array{Float64,2}:\n",
       " 1.0  0.0710383  -0.907104   0.114754   -0.868852 \n",
       " 0.0  1.0        -0.225319   0.714408    0.0940872\n",
       " 0.0  0.0         1.0       -0.0408412   0.265894 \n",
       " 0.0  0.0         0.0        1.0         1.11804  \n",
       " 0.0  0.0         0.0        0.0         1.0      "
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "inv(D) * U"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Wait a minute, now the entries look *exactly* like those of $L$, except above the diagonal rather than below.  In fact, this is precisely the *transpose* of $L$:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 Array{Float64,2}:\n",
       " 1.0  0.0710383  -0.907104   0.114754   -0.868852 \n",
       " 0.0  1.0        -0.225319   0.714408    0.0940872\n",
       " 0.0  0.0         1.0       -0.0408412   0.265894 \n",
       " 0.0  0.0         0.0        1.0         1.11804  \n",
       " 0.0  0.0         0.0        0.0         1.0      "
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "L'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 Array{Int64,2}:\n",
       "  183   13  -166   21  -159\n",
       "   13  206   -58  148     8\n",
       " -166  -58   184  -53   146\n",
       "   21  148   -53  148    41\n",
       " -159    8   146   41   193"
      ]
     },
     "execution_count": 44,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "S"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 Array{Float64,2}:\n",
       "  1.0         0.0         0.0        0.0      0.0\n",
       "  0.0710383   1.0         0.0        0.0      0.0\n",
       " -0.907104   -0.225319    1.0        0.0      0.0\n",
       "  0.114754    0.714408   -0.0408412  1.0      0.0\n",
       " -0.868852    0.0940872   0.265894   1.11804  1.0"
      ]
     },
     "execution_count": 45,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "L\n",
    "D = abs.(D)\n",
    "S =  L*abs.(D)*L'\n",
    "U = L'\n",
    "L"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Since $D^{-1} U = L^T$, we have $U = D L^T$, and hence $S = LU = L D L^T$.\n",
    "\n",
    "This fact is so important that it has its own name: we have constructed the **LDLᵀ factorization** of our symmetric matrix $S$.   This factorization is useful for two reasons:\n",
    "\n",
    "* It preserves the special structure of a symmetric matrix, which is important if we are to do subsequent algebraic manipulations:  $(LDL^T)^T = LDL^T$.\n",
    "\n",
    "* Clever implementations can save roughly a factor of two in the number of operations by exploiting the symmetry."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Cholesky factorization\n",
    "\n",
    "Finally, we should mention another very important variation on this theme.\n",
    "\n",
    "Suppose that we have a symmetric matrix $S$ in which **all the pivots are positive**.  This is called a [positive-definite matrix](https://en.wikipedia.org/wiki/Positive-definite_matrix), and turns out to be the case *whenever* you construct $S$ from $A^T A$ or $A A^T$ (for real $A$), as above.   We will have much more to say about such matrices later in the course.\n",
    "\n",
    "In that case, we can take the *square roots* of the pivots to write $D = KK$ where $K$ is a diagonal matrix of the square roots of the pivots:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 Array{Float64,2}:\n",
       "  13.5277     0.0       0.0       0.0      0.0     \n",
       "   0.960988  14.3205    0.0       0.0      0.0     \n",
       " -12.2711    -3.22668   4.7968    0.0      0.0     \n",
       "   1.55236   10.2307   -0.195907  6.39416  0.0     \n",
       " -11.7536     1.34738   1.27544   7.14891  0.550843"
      ]
     },
     "execution_count": 46,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "K = L* diagm(sqrt.(diag(D)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 Array{Float64,2}:\n",
       " 0.0  0.0   0.0          0.0           0.0        \n",
       " 0.0  0.0   7.10543e-15  0.0          -1.77636e-15\n",
       " 0.0  0.0   2.84217e-14  7.10543e-15   0.0        \n",
       " 0.0  0.0   0.0          0.0           0.0        \n",
       " 0.0  0.0  -2.84217e-14  0.0           2.84217e-14"
      ]
     },
     "execution_count": 47,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "K*K' - S # kind of a matrix sqrt: cholesky factorization"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then we can write $S = LDL^T = LKKL^T = (LK)(LK)^T$.  The matrix $\\hat{L} = LK$ is also a lower-triangular matrix, it is $L$ with the *columns* scaled by $K$.  So, we can write *any* symmetric positive-definite (SPD) matrix as:\n",
    "\n",
    "$$\n",
    "S = \\hat{L} \\hat{L}^T\n",
    "$$\n",
    "\n",
    "This is called the [Cholesky factorization](https://en.wikipedia.org/wiki/Cholesky_decomposition) of $S$, and it usually the most efficient way to solve SPD systems (half the operations, and often half the storage, compared to LU).  In Julia, it is computed by `chol` (which returns $\\hat{L}^T$) or `cholfact`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {},
   "outputs": [
    {
     "ename": "LoadError",
     "evalue": "\u001b[91mArgumentError: matrix is not symmetric/Hermitian. This error can be avoided by calling chol(Hermitian(A)) which will ignore either the upper or lower triangle of the matrix.\u001b[39m",
     "output_type": "error",
     "traceback": [
      "\u001b[91mArgumentError: matrix is not symmetric/Hermitian. This error can be avoided by calling chol(Hermitian(A)) which will ignore either the upper or lower triangle of the matrix.\u001b[39m",
      "",
      "Stacktrace:",
      " [1] \u001b[1mchol\u001b[22m\u001b[22m\u001b[1m(\u001b[22m\u001b[22m::Array{Float64,2}\u001b[1m)\u001b[22m\u001b[22m at \u001b[1m./linalg/cholesky.jl:185\u001b[22m\u001b[22m",
      " [2] \u001b[1minclude_string\u001b[22m\u001b[22m\u001b[1m(\u001b[22m\u001b[22m::String, ::String\u001b[1m)\u001b[22m\u001b[22m at \u001b[1m./loading.jl:522\u001b[22m\u001b[22m",
      " [3] \u001b[1mexecute_request\u001b[22m\u001b[22m\u001b[1m(\u001b[22m\u001b[22m::ZMQ.Socket, ::IJulia.Msg\u001b[1m)\u001b[22m\u001b[22m at \u001b[1m/Users/stevenj/.julia/v0.6/IJulia/src/execute_request.jl:193\u001b[22m\u001b[22m",
      " [4] \u001b[1m(::Compat.#inner#14{Array{Any,1},IJulia.#execute_request,Tuple{ZMQ.Socket,IJulia.Msg}})\u001b[22m\u001b[22m\u001b[1m(\u001b[22m\u001b[22m\u001b[1m)\u001b[22m\u001b[22m at \u001b[1m/Users/stevenj/.julia/v0.6/Compat/src/Compat.jl:332\u001b[22m\u001b[22m",
      " [5] \u001b[1meventloop\u001b[22m\u001b[22m\u001b[1m(\u001b[22m\u001b[22m::ZMQ.Socket\u001b[1m)\u001b[22m\u001b[22m at \u001b[1m/Users/stevenj/.julia/v0.6/IJulia/src/eventloop.jl:8\u001b[22m\u001b[22m",
      " [6] \u001b[1m(::IJulia.##13#16)\u001b[22m\u001b[22m\u001b[1m(\u001b[22m\u001b[22m\u001b[1m)\u001b[22m\u001b[22m at \u001b[1m./task.jl:335\u001b[22m\u001b[22m"
     ]
    }
   ],
   "source": [
    "chol(S)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Whoops, the problem was that $S = A^T A$ is supposed to be symmetric, but it is not *exactly* symmetric due to roundoff errors:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 Array{Float64,2}:\n",
       " 0.0   0.0           0.0           0.0           0.0        \n",
       " 0.0   0.0          -7.10543e-15   0.0           1.77636e-15\n",
       " 0.0   7.10543e-15   0.0          -7.10543e-15  -2.84217e-14\n",
       " 0.0   0.0           7.10543e-15   0.0           0.0        \n",
       " 0.0  -1.77636e-15   2.84217e-14   0.0           0.0        "
      ]
     },
     "execution_count": 49,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "S - S'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So, we have to tell Julia explicitly that it is symmetric, so that it should ignore these tiny differences between the upper and lower triangles:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 UpperTriangular{Float64,Array{Float64,2}}:\n",
       " 13.5277   0.960988  -12.2711    1.55236   -11.7536  \n",
       "   ⋅      14.3205     -3.22668  10.2307      1.34738 \n",
       "   ⋅        ⋅          4.7968   -0.195907    1.27544 \n",
       "   ⋅        ⋅           ⋅        6.39416     7.14891 \n",
       "   ⋅        ⋅           ⋅         ⋅          0.550843"
      ]
     },
     "execution_count": 50,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "chol(Symmetric(S))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This is the same as our $K^T$ matrix from above:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "5×5 Array{Float64,2}:\n",
       " 13.5277   0.960988  -12.2711    1.55236   -11.7536  \n",
       "  0.0     14.3205     -3.22668  10.2307      1.34738 \n",
       "  0.0      0.0         4.7968   -0.195907    1.27544 \n",
       "  0.0      0.0         0.0       6.39416     7.14891 \n",
       "  0.0      0.0         0.0       0.0         0.550843"
      ]
     },
     "execution_count": 51,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "K'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One interesting fact about Cholesky factorization of SPD matrices is that **row swaps are never required**, even when concerns about roundoff errors are included, so there is no $P$ matrix.\n",
    "\n",
    "We won't do much with Cholesky factorizations in 18.06, but they are extremely useful in many applications of linear algebra.  If your are solving a system of equations involving $A^TA$, and hence SPD, then you gain a factor of 2 or more by telling the computer to use Cholesky factorization instead of LU, and there are many other uses of Cholesky factorization as well."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Julia 0.6.3",
   "language": "julia",
   "name": "julia-0.6"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "0.6.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}