{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "using LinearAlgebra" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Diagonalization\n", "\n", "If we can find a solution $x \\ne 0$ to\n", "\n", "$$\n", "Ax = \\lambda x\n", "$$\n", "\n", "then, for this vector, the matrix $A$ **acts like a scalar**. $x$ is called an **eigenvector** of $A$, and $\\lambda$ is called an **eigenvalue**.\n", "\n", "In fact, for an $m \\times m$ matrix $A$, we typically find $m$ linearly independendent eigenvectors $x_1,x_2,\\ldots,x_m$ and $m$ corresponding eigenvalues $\\lambda_1, \\lambda_2, \\ldots, \\lambda_m$. Such a matrix is called **diagonalizable**. Most matrices are diagonalizable; we will deal with the rare \"defective\" (non-diagonalizable) case later.\n", "\n", "Given such a **basis of eigenvectors**, the key idea for using them is:\n", "\n", "1. Take any vector $x$ and expand it in this basis: $x = c_1 x_1 + \\cdots c_m x_n$, or $x = Xc$ or $c = X^{-1}x$ where $X$ is the matrix whose *columns are the eigenvectors*.\n", "\n", "2. For each eigenvector $x_k$, the matrix $A$ acts like a scalar $\\lambda_k$. Multiplication or division corresponds to multiplying/dividing $x_k$ by $\\lambda_k$. **Solve your problem for each eigenvector by treating A as the scalar λ**.\n", "\n", "3. Add up the solution to your problem (sum the basis of the eigenvectors). That is, multiply the new coefficients by $X$.\n", "\n", "This process of expanding in the eigenvectors, multiplying (or whatever) by λ, and then summing up the eigenvectors times their new coefficients, is expressed algebraically as the following **diagonalization** of the matrix $A$:\n", "\n", "$$\n", "A = X \\Lambda X^{-1}\n", "$$\n", "\n", "where $\\Lambda$ is the **diagonal matrix of the eigenvalues** and $X = \\begin{pmatrix} x_1 & x_2 & \\cdots & x_m \\end{pmatrix}$ from above." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Expanding in an Eigenvector Basis\n", "\n", "For example, consider the matrix\n", "\n", "$$\n", "A = \\begin{pmatrix} 1 & 1 \\\\ -2 & 4 \\end{pmatrix}\n", "$$\n", "\n", "whose eigenvalues are $\\lambda_1 = 2$ and $\\lambda_2 = 3$ and whose eigenvectors are $x_1 = \\begin{pmatrix}1\\\\1\\end{pmatrix}$ and $x_2 = \\begin{pmatrix}1\\\\2\\end{pmatrix}$.\n", "\n", "We put these eigenvectors into a matrix $X = \\begin{pmatrix} x_1 & x_2 \\end{pmatrix} = \\begin{pmatrix} 1 & 1 \\\\ 1 & 2 \\end{pmatrix}$. The matrix is invertible: $A$ is *diagonalizable*, since the eigenvectors form a *basis*. " ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2-element Vector{Float64}:\n", " 2.0\n", " 3.0" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A = [1 1\n", " -2 4]\n", "eigvals(A)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2×2 Matrix{Int64}:\n", " 1 1\n", " 1 2" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X = [1 1\n", " 1 2]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To write any vector $x$ in the basis of eigenvectors, we just want $x = Xc$ or $c = X^{-1} x$. 
For example:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2-element Vector{Float64}:\n", " 2.0\n", " -1.0" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = [1,0]\n", "c = X \\ x" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "i.e. $x = \\begin{pmatrix} 1 \\\\ 0 \\end{pmatrix} = 2 \\begin{pmatrix} 1 \\\\ 1 \\end{pmatrix} - \\begin{pmatrix} 1 \\\\ 2 \\end{pmatrix}$, which is obviously correct." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$Ax = \\lambda_1 c_1 x_1 + \\lambda_2 c_2 x_2$, or equivalently\n", "$$\n", "Ax = X \\begin{pmatrix} \\lambda_1 c_1 \\\\ \\lambda_2 c_2 \\end{pmatrix} = \n", " X \\underbrace{\\begin{pmatrix} \\lambda_1 & \\\\ & \\lambda_2 \\end{pmatrix}}_\\lambda c\n", " = X \\Lambda X^{-1} x\n", "$$" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2-element Vector{Int64}:\n", " 1\n", " -2" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A*x" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2×2 Diagonal{Int64, Vector{Int64}}:\n", " 2 ⋅\n", " ⋅ 3" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Λ = Diagonal([2, 3])" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2-element Vector{Float64}:\n", " 1.0\n", " -2.0" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X*Λ*c" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2-element Vector{Float64}:\n", " 1.0\n", " -2.0" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X * Λ * inv(X) * x" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since this is true for *every* $x$, it means $\\boxed{A = X \\Lambda X^{-1}}$:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2×2 Matrix{Float64}:\n", " 1.0 1.0\n", " -2.0 4.0" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X * Λ / X # / X is equivalent to multiplying by inv(X), but is more efficient" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another way to see this is to consider $$AX = \\begin{pmatrix} Ax_1 & Ax_2 \\end{pmatrix} = \\begin{pmatrix} \\lambda_1 x_1 & \\lambda_2 x_2 \\end{pmatrix} = X \\Lambda$$" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2×2 Matrix{Int64}:\n", " 2 3\n", " 2 6" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A*X" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2×2 Matrix{Int64}:\n", " 2 3\n", " 2 6" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X*Λ" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It follows that $A = X\\Lambda X^{-1}$ or $\\boxed{\\Lambda = X^{-1} A X}$:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2×2 Matrix{Float64}:\n", " 2.0 0.0\n", " 0.0 3.0" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X \\ A * X" ] }, { "cell_type": 
"markdown", "metadata": {}, "source": [ "The key thing is that this works for **any matrix, as long as the eigenvectors form a basis**. Such a matrix is called **diagonalizable**, and it turns out that this is true for almost all matrices; we will deal with the rare exceptions.\n", "\n", "Thus, eigenproblems join LU factorization (Gaussian elimination) and QR factorization (Gram–Schmidt): the eigensolutions are **equivalent to a matrix factorization**. This is extremely useful in helping us think *algebraically* about using eigenvalues and eigenvectors, because it lets us work with them *all at once*." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Change of Basis and Similar Matrices" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A key concept in linear algebra, *especially* for eigenproblems, is a *change of basis*. If $S$ is an $m\\times m$ invertible matrix, then we know its columns form a *basis* for $\\mathbb{R}^m$. Writing any $x$ in this basis is simply $x=Sc$, i.e. coordinates $c = S^{-1} x$ in the new basis (the new \"coordinate system\").\n", "\n", "If we have a matrix $A$ representing a linear operation $y=Ax$, we can also try to write the *same* linear operation in a *new* coordinate system. That is, if we write $x=Sc$ and $y=Sd$, then what is the matrix relating $c$ and $d$? This is easy to compute:\n", "$$\n", "d = S^{-1} y = S^{-1} Ax = S^{-1} A S c,\n", "$$\n", "so the matrix $\\boxed{B = S^{-1} A S}$ is represent the same operation as $A$ but in the new coordinate system.\n", "\n", "In linear algebra, we say that $A$ and $B$ are **similar matrices**. That is, B is similar to A if and only if $B = S^{-1} A S$ for some invertible matrix S. It also follows that $A = SBS^{-1} = (S^{-1})^{-1} B (S^{-1})$, i.e. 
if B is similar to A using S, then A is similar to B using $S^{-1}$.\n", "\n", "For example, we can choose a random 2×2 S to write our matrix A from above in a new coordinate system:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2×2 Matrix{Float64}:\n", " 0.34504 1.3915\n", " -0.486365 1.30546" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "S = randn(2,2)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1.1272143106064274" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "det(S) # a random S is almost certainly nonsingular" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2×2 Matrix{Float64}:\n", " 3.08981 0.11279\n", " -0.867718 1.91019" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "B = S \\ A * S # same as inv(S) * A * S, but more efficient since it avoids the explicit inverse" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A key fact is that **similar matrices have the same eigenvalues** (but different eigenvectors):" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2-element Vector{Float64}:\n", " 1.9999999999999991\n", " 3.000000000000001" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "eigvals(B)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is easy to show in a variety of ways.\n", "\n", "For example, if $Ax=\\lambda x$, since $A=SBS^{-1}$, we have $ SBS^{-1} x = \\lambda x$, or\n", "$$\n", "B(S^{-1}x) = \\lambda (S^{-1}x)\n", "$$\n", "so **S⁻¹x is an eigenvector of B with the *same* eigenvalue λ**!\n", "\n", "In contrast, multiplying $A$ only on *one* side by some matrix $S$ will typically change the eigenvalues:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2-element Vector{ComplexF64}:\n", " 1.1487471858694138 - 2.3331664678277177im\n", " 1.1487471858694138 + 2.3331664678277177im" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "eigvals(A * S)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another way of seeing this is by looking at the characteristic polynomial:\n", "\n", "$$\n", "\\det(A - \\lambda I) = \\det(SBS^{-1} - \\lambda I) = \\det \\left[ S (B - \\lambda I) S^{-1} \\right]\n", "= \\det(S) \\det(B - \\lambda I) \\det(S^{-1}) = \\det(B - \\lambda I)\n", "$$\n", "\n", "i.e. **similar matrices have the same characteristic polynomial**.\n", "\n", "(Here, we used the product rule for determinants and the fact that $\\det(S^{-1}) = 1/\\det(S)$.)\n", "\n", "If we set λ=0, we see that $\\det A = \\det B$, i.e. **similar matrices have the same determinant**." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(6.0, 5.999999999999999)" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "det(A), det(B)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Determinant = Product of λ’s\n", "\n", "In the new language, we see that diagonalization $A = X \\Lambda X^{-1}$ is the same thing as saying that **A is similar to a diagonal matrix (of eigenvalues)**. 
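\n", "\n", "Two quick numerical checks of these similarity facts (a sketch, with `A`, `X`, `Λ`, `S`, and `B` as defined in the cells above):\n", "\n", "```julia\n", "X \\ A * X ≈ Λ                         # the eigenvector basis diagonalizes A\n", "B * (S \\ [1, 1]) ≈ 2 * (S \\ [1, 1])   # S⁻¹x₁ is an eigenvector of B with the same λ = 2\n", "```\n", "\n", "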
That is, there is a basis (coordinate system) in which A is diagonal.\n", "\n", "From above, $\\det A = \\det \\Lambda = \\lambda_1 \\lambda_2 \\cdots \\lambda_m$. That is, the **determinant is the product of the eigenvalues**.\n", "\n", "(Technically, we have only shown this for diagonalizable matrices, but it turns out to be true in general.)\n", "\n", "Let's try it:" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "6.0" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "det(A)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is the same as the product of A's eigenvalues:" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "6" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "2 * 3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Equivalently, using Julia's `prod` function (which computes the product of the elements of a list):" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "6.0" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "prod(eigvals(A))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also try it for some large matrices:" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1.6680449397144425e78" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Abig = randn(100,100)\n", "det(Abig)" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1.6680449397143506e78 - 1.3679861476883116e62im" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "prod(eigvals(Abig))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It also follows that the log of the determinant is the sum of the logs of the eigenvalues.\n", "\n", "Julia has a built-in function `logabsdet` to compute the log of the absolute value of the determinant, which should be the sum of the logs of the absolute values of the eigenvalues. 
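\n", "\n", "For our small matrix $A$ above (eigenvalues 2 and 3), a quick sketch of this identity (all λ here are positive, so no absolute values are needed):\n", "\n", "```julia\n", "log(det(A)) ≈ sum(log.(eigvals(A)))   # log 6 = log 2 + log 3\n", "```\n", "\n", "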
(There is also a `logdet` function to compute the log without the absolute value, but then we need to deal with complex numbers.)" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(180.11328949938405, 1.0)" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "logabsdet(Abig) # the log of the absolute value of the determinant, and the sign" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "180.113289499384" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s = sum(log.(abs.(eigvals(Abig)))) # the sum of the logs of |λ|" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we try to compute the determinant of an even bigger matrix, we run into a problem: in the default precision, a computer can't represent numbers bigger than $10^{308}$ or so, and instead gives `Inf` (for \"infinity\"):" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "-Inf" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Abigger = randn(1000,1000)\n", "det(Abigger)" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1.7976931348623157e308" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "floatmax(Float64) # the problem is that there is a maximum value the computer can represent, beyond which it gives \"Inf\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But the log is fine:" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(2954.073853516206, -1.0)" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "logabsdet(Abigger) # but the log is fine" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2954.0738535162086" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s = sum(log.(abs.(eigvals(Abigger)))) # the sum of the logs of |λ|" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Trace = Sum of λ’s" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another important quantity for a square matrix, which we haven't covered yet, is the [trace](https://en.wikipedia.org/wiki/Trace_(linear_algebra)). The trace is defined as the **sum of the diagonal elements** of any matrix. 
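\n", "\n", "For example, with the `tr` function from the LinearAlgebra package:\n", "\n", "```julia\n", "tr([1 2\n", "    3 4])   # = 1 + 4 = 5, the sum of the diagonal entries\n", "```\n", "\n", "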
By plugging in the definition of matrix multiplication, one can quickly show that the trace has a crucial property:\n", "\n", "$$\n", "\\operatorname{trace} (AB) = \\operatorname{trace}(BA)\n", "$$\n", "\n", "It follows that **similar matrices have the same trace**, since if $A=SBS^{-1}$ then \n", "\n", "$$\n", "\\operatorname{trace} (A) = \\operatorname{trace}(SBS^{-1}) = \\operatorname{trace}(S^{-1}SB) = \\operatorname{trace}(B)\n", "$$\n", "\n", "In particular, since A and Λ are similar, this means that the **trace of a matrix is the sum of the eigenvalues**!\n", "\n", "Let's check:" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tr(A)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Yup, this is the sum of the eigenvalues:" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "2 + 3" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5.0" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sum(eigvals(A))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's try it for our bigger example:" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "4.625235324319982" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tr(Abig)" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "4.625235324319927 + 1.2434497875801753e-14im" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sum(eigvals(Abig))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Diagonalization and Matrix Powers\n", "\n", "We've already seen that if $Ax = \\lambda x$ then $A^n x = \\lambda^n x$ (for both positive and negative $n$): **if x is an eigenvector of A, then it is also an eigenvector of Aⁿ with the eigenvalue raised to the n-th power**.\n", "\n", "There is another cute way to see this for diagonalizable matrices. If $A = X\\Lambda X^{-1}$, then for $n \\ge 0$\n", "\n", "$$\n", "A^n = \\underbrace{AAA\\cdots A}_{n\\mbox{ times}} \n", " = \\underbrace{X\\Lambda X^{-1}X\\Lambda X^{-1}X\\Lambda X^{-1}\\cdots X\\Lambda X^{-1}}_{n\\mbox{ times}} = X\\Lambda^n X^{-1}\n", "$$\n", "because all of the $X$ terms *cancel* except the first and last ones. $\\Lambda^n$ is just the diagonal matrix with $\\lambda_1^n, \\lambda_2^n, \\ldots$ on the diagonal.\n", "\n", "So, since we have the diagonalization of $A^n$, we immediately see that its eigenvectors are $X$ (same as for $A$) and its eigenvalues are $\\lambda^n$.\n", "\n", "Since $A^{-1}x = \\lambda^{-1} x$ for an eigenvector $x$, it immediately follows that $A^{-1} = X \\Lambda^{-1} X^{-1}$ where $\\Lambda^{-1}$ is the diagonal matrix with $\\lambda_k^{-1}$ on the diagonal. Similarly for $A^{-n} = X \\Lambda^{-n} X^{-1}$." 
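, "\n", "\n", "For instance, a quick numerical check (a sketch, with `A`, `X`, and `Λ` as defined above):\n", "\n", "```julia\n", "inv(A) ≈ X * inv(Λ) / X   # A⁻¹ = X Λ⁻¹ X⁻¹\n", "```"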
] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2×2 Matrix{Int64}:\n", " 1 1\n", " -2 4" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A = [1 1\n", " -2 4]" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2-element Vector{Float64}:\n", " 2.0\n", " 3.0" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "eigvals(A)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's check that the eigenvalues of $A^4$ are $2^4 = 16$ and $3^4 = 81$:" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2-element Vector{Float64}:\n", " 16.0\n", " 81.0" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "eigvals(A^4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Remember, these are the eigenvectors $X$ of $A$ with the first component normalized to 1 (via the `./` \"broadcasting\" operator; similar operations can be found in [Python](https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html) and [Matlab](https://www.mathworks.com/help/matlab/ref/bsxfun.html)):" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2×2 Matrix{Float64}:\n", " 1.0 1.0\n", " 1.0 2.0" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X = eigvecs(A)\n", "X = X ./ X[1,:]'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's check that the eigenvectors $X'$ of $A^4$ are the same:" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2×2 Matrix{Float64}:\n", " 1.0 1.0\n", " 1.0 2.0" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X′ = eigvecs(A^4)\n", "X′ = X′ ./ X′[1,:]'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Similarly for $A^{-1}$, whose eigenvalues should be 1/2 and 1/3:" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2-element Vector{Float64}:\n", " 0.33333333333333337\n", " 0.4999999999999999" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "eigvals(inv(A))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that a **matrix is invertible if and only if its eigenvalues are all nonzero**.\n", "\n", "In fact an eigenvector of $A$ for $\\lambda = 0$ has another name we've already seen: it is a **nonzero** vector in the **nullspace** $N(A)$. Only matrices with $N(A) = \\{\\vec{0}\\}$ (remember, $\\vec{0}$ is not allowed as an eigenvector) are invertible." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Matrix square roots\n", "\n", "One really cool thing that diagonalization allows us to do easily is to compute $A^n$ for **non-integer powers n**. For example, we can now see how to find the [square root of a matrix](https://en.wikipedia.org/wiki/Square_root_of_a_matrix), at least for diagonalizable matrices.\n", "\n", "If $A$ is a square matrix, its square root $\\sqrt{A} = A^{1/2}$ is just a matrix so that $A^{1/2} A^{1/2} = A$. But how would we find such a matrix?\n", "\n", "As usual, for eigenvalues it is easy: if $Ax=\\lambda x$, then we obviously want $A^{1/2} x = \\lambda^{1/2} x$. 
If $A$ is diagonalizable and we do this for *every* eigenvalue/vector, we get the diagonalization:\n", "\n", "$$\n", "\\sqrt{A} = A^{1/2} = X \\underbrace{\\begin{pmatrix} \\sqrt{\\lambda_1} & & & \\\\ & \\sqrt{\\lambda_2} & & \\\\ & & \\ddots & \\\\ & & & \\sqrt{\\lambda_m} \\end{pmatrix}}_{\\sqrt{\\Lambda}} X^{-1}\n", "$$\n", "(Obviously, this may give a complex matrix if any eigenvalues are negative or complex.)\n", "\n", "Does this have the property that $A^{1/2} A^{1/2} = A$? Yes! $X \\sqrt{\\Lambda} X^{-1} X \\sqrt{\\Lambda} X^{-1} = X \\sqrt{\\Lambda} \\sqrt{\\Lambda} X^{-1} = A$, since obviously $\\sqrt{\\Lambda} \\sqrt{\\Lambda} = \\Lambda$ from the definition above.\n", "\n", "Let's try it:" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2-element Vector{Float64}:\n", " 1.4142135623730951\n", " 1.7320508075688772" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sqrt.(eigvals(A)) # takes the square root (elementwise) of each eigenvalue" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2×2 Diagonal{Float64, Vector{Float64}}:\n", " 1.41421 ⋅ \n", " ⋅ 1.73205" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Diagonal(sqrt.(eigvals(A))) # the diagonal matrix √Λ" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2×2 Matrix{Float64}:\n", " 1.09638 0.317837\n", " -0.635674 2.04989" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Asqrt = X * Diagonal(sqrt.(eigvals(A))) / X # the √A matrix" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2×2 Matrix{Float64}:\n", " 1.0 1.0\n", " -2.0 4.0" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Asqrt * Asqrt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Hooray, it squared to $A$ as desired!\n", "\n", "Julia's `sqrt` function will compute the matrix square root for you if you give it a square-matrix argument. (It is more general than our approach above, because it works even for non-diagonalizable matrices.)" ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2×2 Matrix{Float64}:\n", " 1.09638 0.317837\n", " -0.635674 2.04989" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sqrt(A)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's double-check that it is the same as we got from an explicit square root of the eigenvalues:" ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "true" ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sqrt(A) ≈ Asqrt" ] } ], "metadata": { "@webio": { "lastCommId": null, "lastKernelId": null }, "kernelspec": { "display_name": "Julia 1.8.0", "language": "julia", "name": "julia-1.8" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.8.1" } }, "nbformat": 4, "nbformat_minor": 2 }