{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "This notebook is our introduction to the concept of [eigenvalues and eigenvectors](https://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors) in 18.06. Unlike the textbook, however, I'm going to approach the subject a little differently. I'm going to work *backwards*: starting from what we would *like* to obtain (**make matrices act like scalars**) and tracing back to the methods and conditions that achieve it." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Scalars are easy, matrices are hard?\n", "\n", "Multiplying a vector by a scalar is easy to understand. For example, if we multiply a vector $x$ by 0.1 over and over again, we get\n", "\n", "$$\n", "0.1^n x \\to 0\n", "$$\n", "\n", "The direction remains the same, but the magnitude decreases by a factor of 10 each time we multiply, asymptotically going to zero.\n", "\n", "In contrast, multiplying by a *matrix* is a complicated operation. Easy for a computer, but hard to *understand* if the matrix is big. If we multiply a vector $x$ by a matrix $A$ over and over, what happens?\n", "\n", "$$\n", "A^n x \\to ???\n", "$$\n", "\n", "Hard to say at a glance, even if I tell you what $A$ is!\n", "\n", "Also, lots of things that are easy with scalars $\\lambda$ are hard with matrices $A$. For example:\n", "\n", "* **Solving** $\\lambda x = b$ is easy: $x = \\lambda^{-1} b$ (unless $\\lambda=0$). Solving $Ax=b$ is hard; even if $A$ is nonsingular, $A^{-1} b$ is a lot of work to compute. Inverting matrices is harder than inverting numbers!\n", "\n", "* It's easy to tell **whether** $\\lambda x = b$ **has a solution**, and whether it is unique: unique solutions if $\\lambda \\ne 0$, and otherwise if $\\lambda=0$ then there are (infinitely many) solutions only for $b=0$. 
For $Ax=b$, we need to work out the rank, nullspace, etcetera.\n", "\n", "* Repeated multiplication (**powers**): $\\lambda^n$ is easy to compute and understand, $A^n$ is hard.\n", "\n", " - $\\lambda^n$ will go to zero as $n \\to \\infty$ if $|\\lambda| < 1$, and will blow up if $|\\lambda| > 1$. What about $A^n$?\n", "\n", "* Solving the **ordinary differential equation (ODE)** $\\frac{dx}{dt} = \\lambda x$ is easy: $x(t) = e^{\\lambda t} x(0)$. Solving the *system* of ODEs $\\frac{dx}{dt} = A x$ seems hard. Maybe $x(t) = e^{A t} x(0)$, but what the heck does $e^A$ even mean?\n", "\n", " - The solutions $\\sim e^{\\lambda t}$ will go to zero as $t \\to \\infty$ for real $\\lambda < 0$ and will blow up for $\\lambda > 0$. What about for $A$?\n", "\n", "* ...many other tasks..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Eigenvectors: Where matrices act like scalars\n", "\n", "Suppose that we could find *some* $x \\ne 0$ for which\n", "\n", "$$\n", "\\boxed{Ax = \\lambda x}\n", "$$\n", "\n", "for some scalar λ. **For *that* x, the matrix A would act like a scalar λ.** Multiplying, dividing, etcetera by $A$ would be easy for that vector (and multiples thereof)!\n", "\n", "We call such an $x$ an **eigenvector** of $A$, and $\\lambda$ the corresponding **eigenvalue**. Of course, $\\alpha x$ for any scalar $\\alpha$ is an eigenvector with the same $\\lambda$, but these multiples form a *subspace*, and we only need to find a *basis* vector $x$.\n", "\n", "But why should such magical solutions even exist?"
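, "\n",
"\n",
"Numerically, at least, they do. Here is a quick check with a small example; this is just a sketch (the matrix `A` below is an arbitrary choice, and `eigen` comes from Julia's standard `LinearAlgebra` library — older Julia versions used `eig`):\n",
"\n",
"```julia\n",
"using LinearAlgebra\n",
"\n",
"A = [2.0 1.0\n",
"     1.0 2.0]\n",
"λ, X = eigen(A)   # λ[k] are eigenvalues; the columns of X are eigenvectors\n",
"x = X[:, 1]       # pick the first eigenvector\n",
"A*x ≈ λ[1]*x      # true: for this x, multiplying by A is just scaling by λ[1]\n",
"```\n",
"\n",
"For this particular $A$, the eigenvalues come out to $1$ and $3$: along one special direction $A$ acts like the scalar 1, and along another it acts like 3."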
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Diagonal matrices are almost as easy as scalars\n", "\n", "If we have an $m \\times m$ [diagonal matrix](https://en.wikipedia.org/wiki/Diagonal_matrix) $Λ$, it is almost as easy to work with as a scalar:\n", "\n", "$$\n", "Λ = \\begin{pmatrix} \\lambda_1 & 0 & 0 & 0 \\\\\n", " 0 & \\lambda_2 & 0 & 0 \\\\\n", " 0 & 0 & \\ddots & 0 \\\\\n", " 0 & 0 & 0 & \\lambda_m \\end{pmatrix}\n", "$$\n", "\n", "In fact, it is clear that the diagonal elements $\\lambda_k$ are exactly the eigenvalues, and the corresponding eigenvectors are the [standard basis vectors](https://en.wikipedia.org/wiki/Standard_basis): the columns of the $m \\times m$ identity matrix $I$.\n", "\n", "For example, consider the $4\\times4$ matrix with $\\lambda_1 = 0.1$, $\\lambda_2 = 0.2$, $\\lambda_3 = 0.4$, and $\\lambda_4 = 0.5$:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "4×4 Array{Float64,2}:\n", " 0.1 0.0 0.0 0.0\n", " 0.0 0.2 0.0 0.0\n", " 0.0 0.0 0.4 0.0\n", " 0.0 0.0 0.0 0.5" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Λ = diagm([0.1, 0.2, 0.4, 0.5])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This has four eigenvectors\n", "$$\n", "\\begin{pmatrix} 1 \\\\ 0 \\\\ 0 \\\\ 0 \\end{pmatrix}, \n", "\\begin{pmatrix} 0 \\\\ 1 \\\\ 0 \\\\ 0 \\end{pmatrix}, \n", "\\begin{pmatrix} 0 \\\\ 0 \\\\ 1 \\\\ 0 \\end{pmatrix}, \n", "\\begin{pmatrix} 0 \\\\ 0 \\\\ 0 \\\\ 1 \\end{pmatrix}\n", "$$\n", "(along with any multiples of these vectors: again, we only need a basis for the eigenvectors)."
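, "\n",
"\n",
"We can verify the eigenvector property directly: multiplying $Λ$ by a standard basis vector just scales it by the corresponding diagonal entry. A minimal sketch (in recent Julia versions `diagm` lives in the standard `LinearAlgebra` library and takes an `offset => values` pair, with `0 =>` for the main diagonal):\n",
"\n",
"```julia\n",
"using LinearAlgebra\n",
"\n",
"Λ = diagm(0 => [0.1, 0.2, 0.4, 0.5])   # same matrix as above\n",
"e2 = [0, 1, 0, 0]   # second standard basis vector\n",
"Λ*e2 == 0.2*e2      # true: e2 is an eigenvector of Λ with eigenvalue 0.2\n",
"```"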
] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "4-element Array{Float64,1}:\n", " 1.0\n", " 0.0\n", " 0.0\n", " 0.0" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Λ * [10\n", " 0\n", " 0\n", " 0]" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "4-element Array{Float64,1}:\n", " 0.0\n", " 2.0\n", " 0.0\n", " 0.0" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Λ * [0\n", " 10\n", " 0\n", " 0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we multiply a vector $x \\in \\mathbb{R}^4$ by $Λ$, then it multiplies the first component by 0.1, the second by 0.2, etcetera:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "4-element Array{Float64,1}:\n", " 0.1\n", " 0.2\n", " 0.4\n", " 0.5" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = [1,1,1,1]\n", "Λ * x" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Multiplying 10 times by $Λ$ just multiplies the first component by $0.1^{10}$, the second by $0.2^{10}$, and so on:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "4-element Array{Float64,1}:\n", " 1.0e-10 \n", " 1.024e-7 \n", " 0.000104858\n", " 0.000976563" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Λ^10 * x" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can think about this as writing $x$ in the *basis* of the eigenvectors, and then for *each* eigenvector, the matrix Λ acts like a number λ:\n", "\n", "$$\n", "\\Lambda^n x = \\lambda_1^n \\begin{pmatrix} 1 \\\\ 0 \\\\ 0 \\\\ 0 \\end{pmatrix} +\n", "\\lambda_2^n 
\\begin{pmatrix} 0 \\\\ 1 \\\\ 0 \\\\ 0 \\end{pmatrix} +\n", "\\lambda_3^n \\begin{pmatrix} 0 \\\\ 0 \\\\ 1 \\\\ 0 \\end{pmatrix} +\n", "\\lambda_4^n \\begin{pmatrix} 0 \\\\ 0 \\\\ 0 \\\\ 1 \\end{pmatrix} \\; .\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since each $|\\lambda_k| < 1$, it is clear that $\\Lambda^n x \\to 0$ as $n \\to \\infty$.\n", "\n", "Equivalently, the matrix $$ Λ^n = \\begin{pmatrix} \\lambda_1^n & & & \\\\ & \\lambda_2^n & & \\\\ & & \\lambda_3^n & \\\\ & & & \\lambda_4^n \\end{pmatrix}$$ itself must go to zero as $n$ increases:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "4×4 Array{Float64,2}:\n", " 1.0e-100 0.0 0.0 0.0 \n", " 0.0 1.26765e-70 0.0 0.0 \n", " 0.0 0.0 1.60694e-40 0.0 \n", " 0.0 0.0 0.0 7.88861e-31" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Λ^100" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Morever, if we multiply $Λ$ by a vector $x$ repeatedly, it will quickly become **nearly parallel** to the last eigenvector $(0,0,0,1)$, since the $0.5^n$ term **decays the most slowly**. It is expecially easy to see this if we look at the **unit vector** (length = 1)\n", "\n", "$$\n", "\\frac{Λ^n x}{\\Vert Λ^n x \\Vert}\n", "$$" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "