{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "from traitlets.config.manager import BaseJSONConfigManager\n",
    "import jupyter_core\n",
    "#path = \"/home/damian/miniconda3/envs/rise_latest/etc/jupyter/nbconfig\"\n",
    "path = \"/Users/i.oseledets/anaconda2/envs/teaching/etc/jupyter/nbconfig\"\n",
    "cm = BaseJSONConfigManager(config_dir=path)\n",
    "cm.update(\"livereveal\", {\n",
    "              \"theme\": \"sky\",\n",
    "              \"transition\": \"zoom\",\n",
    "              \"start_slideshow_at\": \"selected\",\n",
    "               \"scroll\": True\n",
    "})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Lecture 7: Matrix decompositions and how we compute them"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Recap of the previous lecture\n",
    "- Eigenvectors and eigenvalues\n",
    "- Characteristic polynomial and why it is a bad idea\n",
    "- Power method: $x := Ax$\n",
    "- Gershgorin theorem\n",
    "- Schur theorem: $A = U T U^*$ \n",
    "- Normal matrices: $A^* A = A A^*$\n",
    "- Advanced topic: pseudospectrum"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Today lecture\n",
    "Today we will talk about **matrix factorizations**\n",
    "\n",
    "The basic **matrix factorizations** in numerical linear algebra:\n",
    "\n",
    "- LU decomposition and Gaussian elimination — already covered\n",
    "- QR decomposition and Gram-Schmidt algorithm \n",
    "- Schur decomposition and QR-algorithm\n",
    "- Few words about the SVD decomposition\n",
    "\n",
    "We already introduced QR decomposition some time ago, but now we are going to discuss it in more details."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## General concept of matrix factorization\n",
    "\n",
    "In numerical linear algebra we need to solve different tasks, for example:\n",
    "\n",
    "- Solve linear systems $Ax = f$\n",
    "- Compute eigenvalues / eigenvectors\n",
    "- Compute singular values / singular vectors\n",
    "- Compute inverses, even sometimes determinants \n",
    "- Compute **matrix functions** like $\\exp(A), \\cos(A)$ (these are not elementwise functions)\n",
    "\n",
    "In order to do this, we represent the matrix as a sum and/or product of matrices with **simpler structure**,  \n",
    "such that we can solve mentioned tasks faster / in a more stable form.\n",
    "\n",
    "What is a **simpler structure**?\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## What is a simpler structure\n",
    "We already encountered several classes of matrices **with structure**. \n",
    "\n",
    "For dense matrices the most important classes are\n",
    "\n",
    "- **unitary matrices**\n",
    "- **upper/lower triangular matrices** \n",
    "- **diagonal matrices**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Other classes of structured matrices\n",
    "\n",
    "- For **sparse matrices** the sparse constraints are often included in the factorizations.\n",
    "- For **Toeplitz matrices** an important class of matrices is the class of matrices with **low displacement rank**, which is based on the low-rank matrices. \n",
    "\n",
    "- The class of **low-rank matrices** and **block low-rank matrices** is everywhere."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Plan\n",
    "The plan for today's lecture is to discuss the decompositions one-by-one and point out:\n",
    "- How to compute a particular decomposition\n",
    "- When the decomposition exists \n",
    "- What is done in **real life** (LAPACK)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Decompositions we want to discuss today\n",
    "- LU factorization & Cholesky factorization — quick reminder, already done.\n",
    "- QR-decomposition and Gram-Schmidt algorithm\n",
    "- One slide about the SVD (more details on Thursday)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## PLU decomposition\n",
    "\n",
    "Any nonsingular matrix can be factored as \n",
    "\n",
    "$$\n",
    "A = P L U,\n",
    "$$\n",
    "\n",
    "where $P$ is a permutation matrix, $L$ is a **lower triangular matrix**, $U$ is an **upper triangular**\n",
    "\n",
    "**Main goal** of the LU decomposition is to solve linear systems, because\n",
    "\n",
    "$$\n",
    "    A^{-1} f = (L U)^{-1} f = U^{-1} L^{-1} f, \n",
    "$$\n",
    "\n",
    "and this reduces to the solution of two linear systems \n",
    "\n",
    "$$\n",
    "     L y = f,  \\quad U x = y\n",
    "$$\n",
    "\n",
    "with lower and upper triangular matrices respectively."
   ]
  },
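  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "As an illustration (a sketch, not part of the original slides): in SciPy the factor-once / solve-twice pattern looks as follows, with `scipy.linalg.lu_factor` computing the PLU factorization and `lu_solve` performing the two triangular solves."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import scipy.linalg\n",
    "n = 5\n",
    "A = np.random.randn(n, n)\n",
    "f = np.random.randn(n)\n",
    "# factor once: lu packs L and U together, piv encodes the permutation P\n",
    "lu, piv = scipy.linalg.lu_factor(A)\n",
    "# two triangular solves: L y = P^T f, then U x = y\n",
    "x = scipy.linalg.lu_solve((lu, piv), f)\n",
    "print('Residual:', np.linalg.norm(A.dot(x) - f))"
   ]
  },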
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Positive definite matrices and Cholesky decomposition, reminder\n",
    "\n",
    "If the matrix is hermitian positive definite, i.e. \n",
    "\n",
    "$$\n",
    "A = A^*, \\quad (Ax, x) > 0, \\quad x \\ne 0,\n",
    "$$\n",
    "\n",
    "then it can be factored as \n",
    "\n",
    "$$\n",
    "A = RR^*,\n",
    "$$\n",
    "\n",
    "where $R$ is lower triangular.\n",
    "\n",
    "We will need this for the QR decomposition.\n",
    "\n"
   ]
  },
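  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "A quick numerical sanity check (a sketch): `np.linalg.cholesky` returns the lower triangular factor, matching the convention $A = RR^*$ above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "n = 5\n",
    "B = np.random.randn(n, n)\n",
    "A = B.dot(B.T) + n * np.eye(n)  # Hermitian positive definite by construction\n",
    "R = np.linalg.cholesky(A)  # lower triangular factor\n",
    "print('||A - R R^*|| =', np.linalg.norm(A - R.dot(R.T)))"
   ]
  },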
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## QR decomposition\n",
    "The next decomposition: **QR** decomposition. Again from the name it is clear that a matrix is represented as a product  \n",
    "\n",
    "$$\n",
    "    A = Q R, \n",
    "$$\n",
    "\n",
    "where $Q$ is an **column orthogonal (unitary)** matrix and $R$ is **upper triangular**.  \n",
    "\n",
    "The matrix sizes: $Q$ is $n \\times m$, $R$ is $m \\times m$ if $n\\geq m$. See our [poster](../decompositions.pdf) for visualization of QR decomposition\n",
    "\n",
    "QR decomposition is defined for **any rectangular matrix**."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## QR decomposition: applications\n",
    "\n",
    "This decomposition plays a crucial role in many problems:\n",
    "- Computing orthogonal bases in a linear space\n",
    "- Used in the preprocessing step for the SVD\n",
    "- QR-algorithm for the computation of eigenvectors and eigenvalues ([one of the 10 most important algorithms of the 20th century](https://archive.siam.org/pdf/news/637.pdf)) is based on the QR decomposition\n",
    "- Solving overdetermined systems of linear equations"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## QR decomposition and least squares, reminder\n",
    "\n",
    "Suppose we need to solve\n",
    "\n",
    "$$\n",
    "\\Vert A x - f \\Vert_2 \\rightarrow \\min_x,\n",
    "$$\n",
    "\n",
    "where $A$ is $n \\times m$, $n \\geq m$.\n",
    "\n",
    "Then we factorize\n",
    "\n",
    "$$\n",
    "A = Q R,\n",
    "$$\n",
    "\n",
    "and use equation for pseudo-inverse matrix in the case of the full rank matrix $A$:\n",
    "\n",
    "$$\n",
    "x = A^{\\dagger}b = (A^*A)^{-1}A^*b = ((QR)^*(QR))^{-1}(QR)^*b = (R^*Q^*QR)^{-1}R^*Q^*b = R^{-1}Q^*b. \n",
    "$$\n",
    "thus $x$ can be recovered from \n",
    "\n",
    "$$R x = Q^*b$$\n",
    "\n",
    "Note that this is a square system of linear equations with lower triangular matrix. What is the complexity of solving this system?"
   ]
  },
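  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "A minimal sketch (assuming $A$ has full column rank) of least squares via QR, checked against `np.linalg.lstsq`; the triangular solve is $\\mathcal{O}(m^2)$ by back substitution."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import scipy.linalg\n",
    "n, m = 100, 5\n",
    "A = np.random.randn(n, m)\n",
    "f = np.random.randn(n)\n",
    "q, r = np.linalg.qr(A)  # reduced QR: q is n x m, r is m x m upper triangular\n",
    "x_qr = scipy.linalg.solve_triangular(r, q.T.dot(f))  # back substitution, O(m^2)\n",
    "x_ls = np.linalg.lstsq(A, f, rcond=None)[0]\n",
    "print('Difference:', np.linalg.norm(x_qr - x_ls))"
   ]
  },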
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Short side note: randomized least squares\n",
    "\n",
    "- One of the efficient ways to solve really overdetermined ($n\\gg m$) system of linear equations is to use **Kaczmarz method**.\n",
    "\n",
    "- Instead of solving all equations, pick one randomly, which reads\n",
    "\n",
    "$$a^{\\top}_i x = f_i,$$\n",
    "\n",
    "and given an approximation $x_k$ try to find $x_{k+1}$ as \n",
    "\n",
    "$$x_{k+1} = \\arg \\min_x \\Vert x - x_k \\Vert, \\quad \\mbox{s.t.} \\quad  a^{\\top}_i x = f_i.$$\n",
    "\n",
    "- A simple analysis gives \n",
    "\n",
    "$$x_{k+1} = x_k - \\frac{(a_i, x_k) - b_i}{(a_i, a_i)} a_i. $$\n",
    "\n",
    "- A cheap update, but the analysis is quite complicated."
   ]
  },
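  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "A minimal sketch of the Kaczmarz update above on a consistent system; here rows are picked uniformly at random (the classical randomized analysis samples them proportionally to $\\Vert a_i \\Vert^2$)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "n, m = 500, 10\n",
    "A = np.random.randn(n, m)\n",
    "x_true = np.random.randn(m)\n",
    "f = A.dot(x_true)  # consistent right-hand side\n",
    "x = np.zeros(m)\n",
    "for k in range(5000):\n",
    "    i = np.random.randint(n)  # pick a random equation\n",
    "    a = A[i]\n",
    "    # project the current iterate onto the hyperplane a^T x = f_i\n",
    "    x = x - (a.dot(x) - f[i]) / a.dot(a) * a\n",
    "print('Error:', np.linalg.norm(x - x_true))"
   ]
  },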
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Existence of QR decomposition\n",
    "**Theorem.**\n",
    "Every rectangular $n \\times m$  matrix has a QR decomposition. \n",
    "\n",
    "\n",
    "There are several ways to prove it and compute it:\n",
    "\n",
    "- Theoretical: using the Gram matrices and Cholesky factorization\n",
    "- Geometrical: using the Gram-Schmidt orthogonalization\n",
    "- Practical: using Householder/Givens transformations "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Proof using Cholesky decomposition\n",
    "If we have the representation of the form\n",
    "\n",
    "$$A = QR,$$\n",
    "\n",
    "then $A^* A = ( Q R)^* (QR)  = R^* (Q^* Q) R = R^* R$, the matrix $A^* A$ is called **Gram matrix**, and its elements are scalar products of the columns of $A$.  "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Proof using Cholesky decomposition (full column rank)\n",
    "\n",
    "Assume that $A$ has **full column rank**. Then, it is simple to show that $A^* A$ is positive definite:\n",
    "\n",
    "$$\n",
    "   (A^* A y, y) = (Ay, Ay) = \\Vert Ay \\Vert^2  > 0, \\quad y\\not = 0.\n",
    "$$\n",
    "\n",
    "Therefore, $A^* A = R^* R$ always exists.\n",
    "\n",
    "Then the matrix $A R^{-1}$ is unitary:  \n",
    "\n",
    "$$\n",
    "   (A R^{-1})^* (AR^{-1})= R^{-*} A^* A R^{-1} = R^{-*} R^* R R^{-1} = I.\n",
    "$$\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Proof using Cholesky decomposition (rank-deficient case)\n",
    "\n",
    "- When an $n \\times m$ matrix does not have full column rank, it is said to be **rank-deficient**.  \n",
    "\n",
    "- The QR decomposition, however, also exists.\n",
    "\n",
    "- For any **rank-deficient matrix** there is a sequence of full-column rank matrices $A_k$ such that $A_k \\rightarrow A$ (why?).\n",
    "\n",
    "- Each $A_k$ can be decomposed as $A_k = Q_k R_k$. \n",
    "\n",
    "- The set of all unitary matrices is compact, thus there exists a converging subsequence $Q_{n_k} \\rightarrow Q$ (why?), and $Q^* A_k \\rightarrow Q^* A = R$, which is triangular."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Stability of QR decomposition via Cholesky decomposition\n",
    "So, the simplest way to compute QR decomposition is then\n",
    "\n",
    "$$A^* A = R^* R,$$\n",
    "\n",
    "and\n",
    "\n",
    "$$Q = A R^{-1}.$$\n",
    "\n",
    "It is a **bad idea** for numerical stability. Let us do some demo (for a submatrix of the Hilbert matrix)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "n = 20\n",
    "r = 4\n",
    "a = [[1.0 / (i + j + 0.5) for i in range(r)] for j in range(n)]\n",
    "a = np.array(a)\n",
    "q, Rmat = np.linalg.qr(a)\n",
    "e = np.eye(r)\n",
    "print('Built-in QR orth', np.linalg.norm(np.dot(q.T, q) - e))\n",
    "gram_matrix = a.T.dot(a)\n",
    "Rmat1 = np.linalg.cholesky(gram_matrix)\n",
    "q1 = np.dot(a, np.linalg.inv(Rmat1.T))\n",
    "print('Via Cholesky:', np.linalg.norm(np.dot(q1.T, q1) - e))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Second way: Gram-Schmidt orthogonalization\n",
    "\n",
    "- QR decomposition is a mathematical way of writing down the Gram-Schmidt orthogonalization process.  \n",
    "- Given a sequence of vectors $a_1, \\ldots, a_m$ we want to find orthogonal basis $q_1, \\ldots, q_m$ such that every $a_i$ is a linear combination of such vectors.  \n",
    "\n",
    "**Gram-Schmidt:**\n",
    "1. $q_1 := a_1/\\Vert a_1 \\Vert$\n",
    "2. $q_2 := a_2 - (a_2, q_1) q_1, \\quad q_2 := q_2/\\Vert q_2 \\Vert$\n",
    "3. $q_3 := a_3 - (a_3, q_1) q_1 - (a_3, q_2) q_2, \\quad q_3 := q_3/\\Vert q_3 \\Vert$\n",
    "4. And so on  \n",
    "\n",
    "Note that the transformation from $Q$ to $A$ has triangular structure, since from the $k$-th vector we subtract only the previous ones. It follows from the fact that the product of triangular matrices is a triangular matrix."
   ]
  },
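  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "A minimal sketch of the classical Gram-Schmidt process described above (no safeguards against rank deficiency)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "def gram_schmidt(A):\n",
    "    \"\"\"Classical Gram-Schmidt: orthonormalize the columns of A.\"\"\"\n",
    "    n, m = A.shape\n",
    "    Q = np.zeros((n, m))\n",
    "    for k in range(m):\n",
    "        q = A[:, k].copy()\n",
    "        for j in range(k):\n",
    "            # coefficients are computed from the original column a_k\n",
    "            q -= Q[:, j].dot(A[:, k]) * Q[:, j]\n",
    "        Q[:, k] = q / np.linalg.norm(q)\n",
    "    return Q\n",
    "\n",
    "A = np.random.randn(20, 5)\n",
    "Q = gram_schmidt(A)\n",
    "print('Loss of orthogonality:', np.linalg.norm(Q.T.dot(Q) - np.eye(5)))"
   ]
  },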
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Modified Gram-Schmidt\n",
    "- Gram-Schmidt can be very unstable (i.e., the produced vectors will be not orthogonal, especially if $q_k$ has small norm).  \n",
    "This is called **loss of orthogonality**.  \n",
    "\n",
    "- There is a remedy, called **modified Gram-Schmidt** method. Instead of doing \n",
    "\n",
    "$$q_k := a_k - (a_k, q_1) q_1 - \\ldots - (a_k, q_{k-1}) q_{k-1}$$\n",
    "\n",
    "we do it step-by-step. First we set $q_k := a_k$ and orthogonalize sequentially:\n",
    "\n",
    "$$\n",
    "   q_k := q_k - (q_k, q_1)q_1, \\quad q_k := q_{k} - (q_k,q_2)q_2, \\ldots\n",
    "$$\n",
    "\n",
    "- In exact arithmetic, it is the same. In floating point it is absolutely different!\n",
    "\n",
    "- Note that the complexity is $\\mathcal{O}(nm^2)$ operations"
   ]
  },
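  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "A minimal sketch of modified Gram-Schmidt: the only change from the classical version is that the projection coefficients are computed from the current, partially orthogonalized vector. On an ill-conditioned matrix the difference in the loss of orthogonality should be visible (this cell reuses `gram_schmidt` from the sketch above)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "def modified_gram_schmidt(A):\n",
    "    \"\"\"Modified Gram-Schmidt: orthogonalize step-by-step against each q_j.\"\"\"\n",
    "    n, m = A.shape\n",
    "    Q = np.zeros((n, m))\n",
    "    for k in range(m):\n",
    "        q = A[:, k].copy()\n",
    "        for j in range(k):\n",
    "            q -= Q[:, j].dot(q) * Q[:, j]  # coefficient uses the current q, not a_k\n",
    "        Q[:, k] = q / np.linalg.norm(q)\n",
    "    return Q\n",
    "\n",
    "# ill-conditioned test matrix of the same Hilbert type as in the demo above\n",
    "A = np.array([[1.0 / (i + j + 0.5) for i in range(8)] for j in range(40)])\n",
    "Qc = gram_schmidt(A)  # classical version from the previous sketch\n",
    "Qm = modified_gram_schmidt(A)\n",
    "print('CGS loss of orthogonality:', np.linalg.norm(Qc.T.dot(Qc) - np.eye(8)))\n",
    "print('MGS loss of orthogonality:', np.linalg.norm(Qm.T.dot(Qm) - np.eye(8)))"
   ]
  },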
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## QR decomposition: the (almost) practical way\n",
    "\n",
    "If $A = QR$, then  \n",
    "$$\n",
    "R = Q^* A,\n",
    "$$\n",
    "and we need to find a certain orthogonal matrix $Q$ that brings a matrix into upper triangular form.  \n",
    "For simplicity, we will look for an $n \\times n$ matrix such that\n",
    "$$ Q^* A = \\begin{bmatrix} * & * & *  \\\\ 0 & * & * \\\\ 0 & 0 & * \\\\ & 0_{(n-m) \\times m} \\end{bmatrix} $$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "We will do it column-by-column.  \n",
    "First, we find a Householder matrix $H_1 = (I - 2 uu^{\\top})$ such that (we illustrate on a $4 \\times 3$ matrix)\n",
    "\n",
    "$$ H_1 A = \\begin{bmatrix} * & * & * \\\\ 0 & * & * \\\\ 0 & * & * \\\\ 0 & * & * \\end{bmatrix} $$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Then, \n",
    "$$ H_2 H_1 A = \\begin{bmatrix} * & * & * \\\\ 0 & * & * \\\\ 0 & 0 & * \\\\ 0 & 0 & * \\end{bmatrix}, $$\n",
    "where\n",
    "$$ H_2 = \\begin{bmatrix} 1 & 0 \\\\ 0 & H'_2, \\end{bmatrix} $$\n",
    "and $H'_2$ is a $3 \\times 3$ Householder matrix."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Finally, \n",
    "$$ H_3 H_2 H_1 A = \\begin{bmatrix} * & * & * \\\\ 0 & * & * \\\\ 0 & 0 & * \\\\ 0 & 0 & 0 \\end{bmatrix}, $$\n",
    "\n",
    "You can try to implement it yourself, it is simple."
   ]
  },
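  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "The promised sketch: a minimal, unblocked Householder QR for real matrices (an illustration, not a production implementation). The sign choice in the reflection vector avoids cancellation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "def householder_qr(A):\n",
    "    \"\"\"QR via Householder reflections (real, dense, unblocked).\"\"\"\n",
    "    R = A.astype(float).copy()\n",
    "    n, m = R.shape\n",
    "    Q = np.eye(n)\n",
    "    for k in range(m):\n",
    "        x = R[k:, k]\n",
    "        nrm = np.linalg.norm(x)\n",
    "        if nrm == 0.0:\n",
    "            continue  # column is already zero below the diagonal\n",
    "        u = x.copy()\n",
    "        u[0] += nrm if x[0] >= 0 else -nrm  # sign choice avoids cancellation\n",
    "        u /= np.linalg.norm(u)\n",
    "        # apply H_k = I - 2 u u^T to the trailing block, accumulate Q = H_1 ... H_m\n",
    "        R[k:, k:] -= 2.0 * np.outer(u, u.dot(R[k:, k:]))\n",
    "        Q[:, k:] -= 2.0 * np.outer(Q[:, k:].dot(u), u)\n",
    "    return Q, R\n",
    "\n",
    "A = np.random.randn(4, 3)\n",
    "Q, R = householder_qr(A)\n",
    "print('||QR - A|| =', np.linalg.norm(Q.dot(R) - A))\n",
    "print('||Q^T Q - I|| =', np.linalg.norm(Q.T.dot(Q) - np.eye(4)))"
   ]
  },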
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## QR decomposition: real life\n",
    "\n",
    "- In reality, since this is a dense matrix factorization, you should implement the algorithm in terms of blocks (why?).  \n",
    "\n",
    "- Instead of using Householder transformation, we use **block Householder** transformation of the form\n",
    "\n",
    "$$H = (I - 2UU^*), $$\n",
    "\n",
    "where $U^* U = I$.\n",
    "\n",
    "- This allows us to use BLAS-3 operations. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Rank-revealing QR-decomposition\n",
    "\n",
    "- The QR-decomposition can be also used to compute the (numerical) rank of the matrix, see \n",
    "[Rank-Revealing QR Factorizations and the Singular Value Decomposition, Y. P. Hong; C.-T. Pan](http://scgroup.hpclab.ceid.upatras.gr/class/SCII/Various/HongPan.pdf)\n",
    "\n",
    "- It is done via so-called **rank-revealing** factorization.  \n",
    "\n",
    "- It is based on the representation\n",
    "\n",
    "$$P A = Q R,$$\n",
    "\n",
    "where $P$ is the permutation matrix (it permutes columns), and  $R$ has the block form\n",
    "\n",
    "\n",
    "$$R = \\begin{bmatrix} R_{11} & R_{12} \\\\ 0 & R_{22}\\end{bmatrix}.$$\n",
    "\n",
    "- The goal is to find $P$ such that the norm of $R_{22}$ is **small**, so you can find the numerical rank by looking at it.\n",
    "\n",
    "- An estimate is $\\sigma_{r+1} \\leq \\Vert R_{22} \\Vert_2$ (check why)."
   ]
  },
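  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "In SciPy, column-pivoted QR is available as `scipy.linalg.qr(..., pivoting=True)`. A small illustration (a sketch) on a numerically low-rank matrix: the diagonal of $R$ drops sharply after the first $r$ entries, revealing the rank."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import scipy.linalg\n",
    "n, r = 50, 3\n",
    "# numerically rank-r matrix: exact rank r plus tiny noise\n",
    "A = np.random.randn(n, r).dot(np.random.randn(r, n)) + 1e-10 * np.random.randn(n, n)\n",
    "q, R, perm = scipy.linalg.qr(A, pivoting=True)  # perm holds the column permutation\n",
    "print('|diag(R)|, first 6 entries:', np.abs(np.diag(R))[:6])"
   ]
  },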
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Summary\n",
    "\n",
    "- LU and QR decompositions can be computed using **direct methods** in finite amount of operations.\n",
    "\n",
    "- What about Schur form and SVD?  \n",
    "\n",
    "- They can not be computed by **direct methods** (why?) they can only be computed by **iterative methods**.\n",
    "\n",
    "- Although iterative methods still have the same $\\mathcal{O}(n^3)$ complexity in floating point arithmetic thanks to fast convergence."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Schur form\n",
    "\n",
    "Recall that every matrix can be written in the **Schur form**\n",
    "\n",
    "$$A = Q T Q^*,$$\n",
    "\n",
    "with upper triangular $T$ and unitary $Q$\n",
    "and this decomposition gives eigenvalues of the matrix (they are on the diagonal of $T$).\n",
    "\n",
    "The first and the main algorithm for computing the Schur form is the **QR algorithm**.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## QR algorithm\n",
    "The QR algorithm was independently proposed in 1961 by Kublanovskaya and Francis.   \n",
    "\n",
    "<font color='red'> Do not **mix** QR algorithm and QR decomposition! </font>\n",
    "\n",
    "QR decomposition is the representation of a matrix, whereas QR algorithm uses QR decomposition to compute the eigenvalues!\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Way to QR algorithm\n",
    "\n",
    "Consider the equation\n",
    "\n",
    "$$A = Q T Q^*,$$\n",
    "\n",
    "and rewrite it in the form\n",
    "\n",
    "$$\n",
    "   Q T = A Q.\n",
    "$$\n",
    "\n",
    "On the left we can see QR factorization of the matrix $AQ$.\n",
    "\n",
    "We can use this to derive fixed-point iteration for the Schur form, also known as **QR algorithm.**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Derivation of the QR algorithm as fixed-point iteration\n",
    "We can write down the iterative process \n",
    "\n",
    "$$\n",
    "    Q_{k+1} R_{k+1} = A Q_k, \\quad Q_{k+1}^* A = R_{k+1} Q^*_k\n",
    "$$\n",
    "\n",
    "Introduce\n",
    "\n",
    "$$A_k = Q^* _k A Q_k = Q^*_k Q_{k+1} R_{k+1} = \\widehat{Q}_k R_{k+1}$$\n",
    "\n",
    "and the new approximation reads\n",
    "\n",
    "$$A_{k+1} = Q^*_{k+1} A Q_{k+1} = ( Q_{k+1}^* A = R_{k+1} Q^*_k)  = R_{k+1} \\widehat{Q}_k.$$\n",
    "\n",
    "So we arrive at the standard form of the QR algorithm."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "The final formulas are then written in the classical **QRRQ**-form:\n",
    "\n",
    "1. Start from $A_0 = A$.\n",
    "2. Compute QR factorization of $A_k = Q_k R_k$.\n",
    "3. Set $A_{k+1} = R_k Q_k$.\n",
    "\n",
    "Iterate until $A_k$ is **triangular enough** (e.g. norm of subdiagonal part is small enough).\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## What about the convergence and complexity\n",
    "\n",
    "**Statement**\n",
    "\n",
    "Matrices $A_k$ are unitary similar to $A$\n",
    "\n",
    "$$A_k = Q^*_{k-1} A_{k-1} Q_{k-1} = (Q_{k-1} \\ldots Q_1)^* A (Q_{k-1} \\ldots Q_1)$$\n",
    "\n",
    "and the product of unitary matrices is a unitary matrix.\n",
    "\n",
    "Complexity of each step is $\\mathcal{O}(n^3)$, if a general QR decomposition is done.\n",
    "\n",
    "Our hope is that $A_k$ will be **very close to the triangular  matrix** for suffiently large $k$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "n = 4\n",
    "a = [[1.0/(i + j + 0.5) for i in range(n)] for j in range(n)]\n",
    "niters = 200\n",
    "for k in range(niters):\n",
    "    q, rmat = np.linalg.qr(a)\n",
    "    a = rmat.dot(q)\n",
    "print('Leading 3x3 block of a:')\n",
    "print(a[:3, :3])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Convergence and complexity of the QR algorithm\n",
    "\n",
    "- The convergence of the QR algorithm is from the **largest eigenvalues** to the smallest.\n",
    "\n",
    "- At least 2-3 iterations is needed for an eigenvalue. \n",
    "\n",
    "- Each step is one QR factorization and one matrix-by-matrix product, as a result $\\mathcal{O}(n^3)$ complexity.\n",
    "\n",
    "**Q**: does it mean $\\mathcal{O}(n^4)$ complexity totally? \n",
    "\n",
    "**A**: fortunately, not. \n",
    "\n",
    "- We can speedup the QR algorithm by using **shifts**, since $A_k - \\lambda I$ has the same Schur vectors.\n",
    "\n",
    "- We will discuss these details in the next lecture"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## A few words about the SVD\n",
    "\n",
    "Last but not least: the **singular value decomposition** of matrices.\n",
    "\n",
    "$$A = U \\Sigma V^*.$$\n",
    "\n",
    "We can compute it via eigendecomposition of \n",
    "\n",
    "$$A^* A = V^* \\Sigma^2 V,$$\n",
    "\n",
    "and/or\n",
    "\n",
    "$$AA^* = U^* \\Sigma^2 U$$\n",
    "\n",
    "with QR algorithm, but it is a **bad idea** (c.f. Gram matrices).\n",
    "\n",
    "We will discuss methods for computing SVD later."
   ]
  },
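  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "A small illustration (a sketch) of why the Gram-matrix route is dangerous: singular values are squared in $A^*A$, so the small ones drown in rounding errors, while `np.linalg.svd` recovers them."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "# build a matrix with known, rapidly decaying singular values\n",
    "n = 10\n",
    "U = np.linalg.qr(np.random.randn(n, n))[0]\n",
    "V = np.linalg.qr(np.random.randn(n, n))[0]\n",
    "s = 10.0 ** (-np.arange(n, dtype=float))  # 1, 1e-1, ..., 1e-9\n",
    "A = U.dot(np.diag(s)).dot(V.T)\n",
    "sv_svd = np.linalg.svd(A, compute_uv=False)\n",
    "eigs = np.linalg.eigvalsh(A.T.dot(A))[::-1]  # Gram-matrix eigenvalues, descending\n",
    "sv_gram = np.sqrt(np.abs(eigs))  # abs() guards tiny negative eigenvalues\n",
    "print('smallest, via SVD :', sv_svd[-3:])\n",
    "print('smallest, via Gram:', sv_gram[-3:])"
   ]
  },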
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Summary of today's lecture\n",
    "\n",
    "- QR decomposition and Gram-Schmidt algorithm, reduction to a simpler form by Householder transformations\n",
    "- Schur decomposition and QR algorithm"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Next lecture\n",
    "- Efficient implementation of QR algorithm, convergence properties\n",
    "- Efficient computation of the SVD: 4 algorithms\n",
    "- More applications of the SVD"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "##### Questions?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "from IPython.core.display import HTML\n",
    "def css_styling():\n",
    "    styles = open(\"./styles/custom.css\", \"r\").read()\n",
    "    return HTML(styles)\n",
    "css_styling()"
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "celltoolbar": "Slideshow",
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.6"
  },
  "latex_envs": {
   "LaTeX_envs_menu_present": true,
   "bibliofile": "biblio.bib",
   "cite_by": "apalike",
   "current_citInitial": 1,
   "eqLabelWithNumbers": true,
   "eqNumInitial": 1,
   "labels_anchors": false,
   "latex_user_defs": false,
   "report_style_numbering": false,
   "user_envs_cfg": false
  },
  "nav_menu": {},
  "toc": {
   "navigate_menu": true,
   "number_sections": false,
   "sideBar": true,
   "threshold": 6,
   "toc_cell": false,
   "toc_section_display": "block",
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}