{ "cells": [ { "attachments": {}, "cell_type": "markdown", "id": "5c33e8d6", "metadata": {}, "source": [ "# Householder Reflections and Givens Rotations\n", "\n", "
\n", "\n", "Using rotations and reflections to introduce zeros in a matrix is a powerful paradigm in numerical linear algebra. \n", "\n", "
\n", "\n", "Reflections and rotations are orthogonal transformations. They preserve the Euclidean lengths of real vectors. \n", "\n", "
\n", "\n", "They don't shrink and they don't stretch too much - these transformations have the perfect condition number, $\\kappa=1$ (at least, in the 2-norm and other unitarily invariant norms like the Frobenius norm).\n", "\n", "
\n", "\n", "This notebook walks through the basic construction and application of two important variants. \n", "\n", "* **Householder reflections:** roughly speaking, these are used for dense problems, introducing zeros column by column.\n", "\n", "* **Givens rotations:** These are often used for data-sparse problems, for instance, introducing zeros one entry at a time in a sparse matrix." ] }, { "cell_type": "code", "execution_count": null, "id": "a275e098", "metadata": {}, "outputs": [], "source": [ "using LinearAlgebra\n", "using SparseArrays" ] }, { "attachments": {}, "cell_type": "markdown", "id": "c2782277", "metadata": {}, "source": [ "## Householder Reflections\n", "\n", "The Householder reflector $F_v$, constructed from a real $k$-dimensional vector $v$, is a rank-one modification of the identity: \n", "\n", "$$F_v = I - 2\\alpha xx^*, \\quad\\text{where}\\quad x = {\\rm sign}(v_1)\\|v\\|e_1+v \\quad\\text{and}\\quad \\alpha = \\|x\\|^2.$$\n", "\n", "It looks like the orthogonal projection onto subspace orthogonal to $x$, but it is actually a _reflection across_ the subspace orthogonal to $x$. 
The orthogonal projection has rank $k-1$, but the reflection has full rank $k$, with $F_v^{-1}=F_v^T=F_v$.\n", "\n", "The vector $x$ is chosen so that the reflection takes the vector $v$ to the first coordinate axis:\n", "\n", "$$ F_v\\begin{pmatrix}v_1 \\\\ v_2 \\\\ \\vdots \\\\ v_k \\end{pmatrix} = \\begin{pmatrix}-{\\rm sign}(v_1)\\|v\\| \\\\ 0 \\\\ \\vdots \\\\ 0\\end{pmatrix}.$$" ] }, { "cell_type": "code", "execution_count": null, "id": "a3fdac43", "metadata": {}, "outputs": [], "source": [ "# compute the Householder reflector\n", "function hhr(v)\n", "\n", " x = copy(v) # copy v to x\n", " x[1] = x[1] + sign(x[1])*norm(x) # modify first entry of x (this sign choice avoids cancellation)\n", " return x\n", "end\n", "\n", "\n", "v = randn(6)\n", "v = v / norm(v)\n", "x = hhr(v)\n", "Fv = I-2*x*x' / (x'*x)\n", "\n", "display(Fv) # here's the full reflection matrix\n", "display(norm(I - Fv'*Fv)) # orthogonal transformations have transpose = inverse\n", "display(Fv*v) # x is chosen so that Fv*v = -sign(v1)*||v|| * e1 (on the computer, remember that rounding errors occur)" ] }, { "attachments": {}, "cell_type": "markdown", "id": "5b9eef81", "metadata": {}, "source": [ "In practice, we never form the matrix $F_v$ explicitly. We _apply_ it to vectors as a linear transformation by computing\n", "\n", "$$ F_vw = w - 2\\alpha x (x^*w). $$\n", "\n", "The arithmetic cost is a dot product, a scalar-vector multiplication, and a vector subtraction - much better than building the whole matrix and multiplying by a vector.\n", "\n", "We also only need to store the reflection vector $x$ (you could also store the scalar to avoid calculating $\\alpha = 1/\\|x\\|^2$ again if you want)."
] }, { "cell_type": "code", "execution_count": null, "id": "6de72dfb", "metadata": {}, "outputs": [], "source": [ "function apply_hhr(x, w)\n", " \n", " return w - 2 * x * (x' * w) / (x' * x)\n", "end\n", "\n", "w = randn(6)\n", "Fvw = apply_hhr(x, w) # we can use the computed reflector x to apply Fv to any vector without forming the full reflection matrix\n", "display(Fvw) # vectors other than v get reflected across the same subspace, but don't usually end up along a coordinate axis (mostly nonzeros)\n", "display(norm(Fvw)-norm(w)) # but, reflection means norm is preserved for _any_ vector" ] }, { "attachments": {}, "cell_type": "markdown", "id": "33046caa", "metadata": {}, "source": [ "Householder reflections really come into their strength when we use them to introduce zeros and factor matrices. Here's a toy implementation for a familiar example: $A=QR$." ] }, { "cell_type": "code", "execution_count": null, "id": "d6b49962", "metadata": {}, "outputs": [], "source": [ "function hqr(A, k)\n", " # Householder triangularization on the first k columns of A\n", " \n", " R = copy(A)\n", " n = size(A,2)\n", " X = zeros(size(A))\n", " for j in 1:k\n", " X[j:end,j] = hhr(R[j:end,j]') # get Householder reflector\n", " R[j:end,j:end] = apply_hhr(X[j:end,j],R[j:end,j:end]) # introduce zeros in n-j x n-j lower right submatrix\n", " end\n", " return X, R # return reflectors (for orthogonal Q) and upper triangular R\n", "end\n", "\n", "A = randn(8,5)\n", "F = hqr(A,5)\n", "display(abs.(F[2]).>1e-14) # R is now (numerically) zero below the diagonal\n", "display(abs.(F[1]).>1e-14) # F[1] contains Householder vectors of decreasing size" ] }, { "attachments": {}, "cell_type": "markdown", "id": "22d1b2fe", "metadata": {}, "source": [ "In practice, the Householder reflectors are usually stored in a compact blocked format that enjoys better storage properties and enables faster matrix-matrix multiplication implementations. 
\n", "\n", "The point is, we store the reflectors, and not the full reflection matrix!\n", "\n", "Next week, Householder reflectors will play a starring role in the \"crown jewel\" of numerical analysis: the QR algorithm for computing eigenvalues (not to be mistaken with the QR factorization)." ] }, { "attachments": {}, "cell_type": "markdown", "id": "6dca2d2e", "metadata": {}, "source": [ "## Givens rotations\n", "\n", "Householder reflections naturally operate on columns of $A$. But what if most column entries in $A$ are nonzero? We can introduce zeros one entry at a time with Givens rotations.\n", "\n", "
\n", "\n", "You can see the idea clearest in two dimensions first, where the vector $x = (x_1,\\,\\,x_2)^T$ is rotated counterclockwise by an angle $\\theta$ into the vector $y = (y_1,\\,\\,y_2)^T$ by\n", "\n", "$$\n", "\\begin{pmatrix} \n", "y_1 \\\\ y_2\n", "\\end{pmatrix}\n", "=\n", "\\begin{pmatrix}\n", "\\cos(\\theta) & -\\sin(\\theta) \\\\\n", "\\sin(\\theta) & \\cos(\\theta)\n", "\\end{pmatrix}\n", "\\begin{pmatrix} \n", "x_1 \\\\ x_2\n", "\\end{pmatrix}.\n", "$$\n", "\n", "Given $x$, how should we chose $\\theta$ so that $y_2 = 0$? We need to rotate $x$ counterclockwise so that it lies along the $e_1=(1,\\,\\,0)^T$ coordinate axis! If we choose $\\cos(\\theta) = x_1/\\|x\\|$ and $\\sin(\\theta) = - x_2/\\|x\\|$, then\n", "\n", "$$\n", "\\begin{pmatrix} \n", "\\|x\\| \\\\ 0\n", "\\end{pmatrix}\n", "=\n", "\\frac{1}{\\|x\\|}\\begin{pmatrix}\n", "x_1 & x_2 \\\\\n", "-x_2 & x_1\n", "\\end{pmatrix}\n", "\\begin{pmatrix} \n", "x_1 \\\\ x_2\n", "\\end{pmatrix}.\n", "$$\n", "\n", "The matrix we constructed is a rotation - an orthogonal transformation - that zeros out the second entry of the special vector $x$. A Givens rotation is the $2\\times 2$ rotation analogue of a Housholder reflection!\n", "\n", "
\n", "\n", "In numerical linear algebra, we are usually concerned with more than just two dimensions. But we can use Givens rotations to introduce one zero at a time by, e.g., mixing two rows at a time. Conceptually, this means embedding the Givens rotation matrix into a larger identity matrix. For example if we want to use the first entry of a $5$-dimensional vector to zero out its last entry, we could write\n", "\n", "$$\n", "\\begin{pmatrix} \n", "\\sqrt{x_1^2+x_5^2} \\\\ x_2 \\\\ x_3 \\\\ x_4 \\\\ 0\n", "\\end{pmatrix}\n", "=\n", "\\begin{pmatrix}\n", "c & 0 & 0 & 0 & s \\\\\n", "0 & 1 & 0 & 0 & 0 \\\\\n", "0 & 0 & 1 & 0 & 0 \\\\\n", "0 & 0 & 0 & 1 & 0 \\\\\n", "-s & 0 & 0 & 0 & c\n", "\\end{pmatrix}\n", "\\begin{pmatrix} \n", "x_1 \\\\ x_2 \\\\ x_3 \\\\ x_4 \\\\ x_5\n", "\\end{pmatrix}, \n", "\\qquad\\text{where}\\qquad\n", "c = \\frac{x_1}{\\sqrt{x_1^2+x_5^2}}\n", "\\qquad\\text{and}\\qquad\n", "s = \\frac{x_5}{\\sqrt{x_1^2+x_5^2}}.\n", "$$\n", "\n", "Just as with Householder reflections, we never form the full rotation matrix on the computer. We store $c$ and $s$ and apply the Givens rotation as a linear transformation that combines two rows (when applied from the left)." 
] }, { "cell_type": "code", "execution_count": null, "id": "eb47426d", "metadata": {}, "outputs": [], "source": [ "function toy_givens(x, j, k)\n", " \n", " r = sqrt(x[j]^2 + x[k]^2)\n", " c = x[j] / r\n", " s = x[k] / r\n", "\n", " return c, s\n", "end\n", "\n", "function apply_givens(v, c, s, j, k)\n", "\n", " w = copy(v)\n", "\n", " w[j,:] = c*v[j,:] + s*v[k,:]\n", " w[k,:] = -s*v[j,:] + c*v[k,:]\n", "\n", " return w\n", "end\n", "\n", "N = 10\n", "A = diagm(-1 => -ones(N-1), 0 => 2*ones(N), 1 => -ones(N-1))\n", "g = toy_givens(A[:,1], 1, 2) # compute Givens rotation to zero out first subdiagonal entry\n", "B = apply_givens(A, g[1], g[2], 1, 2) # apply Givens rotation to mix first two rows of A\n", "display(sparse(A)) # display matrix before Givens\n", "display(sparse(B)) # display matrix after Givens\n", "display( norm(A[:,2]) - norm(B[:,2]) ) # column norm is preserved" ] }, { "attachments": {}, "cell_type": "markdown", "id": "effffdd1", "metadata": {}, "source": [ "There are a number of subtle points to get right when doing Givens on the computer. Luckily for us, Julia has a great ready-to-use function for computing and working with Givens rotations. Let's test it out on the same example we used for our \"toy\" Givens rotation code." ] }, { "cell_type": "code", "execution_count": null, "id": "b6844880", "metadata": {}, "outputs": [], "source": [ "N = 10\n", "A = diagm(-1 => -ones(N-1), 0 => 2*ones(N), 1 => -ones(N-1))\n", "G = givens(A, 1, 2, 1)\n", "B = G[1]*A\n", "display(sparse(A))\n", "display(sparse(B))\n", "display( norm(A[:,2]) - norm(B[:,2]) )" ] }, { "attachments": {}, "cell_type": "markdown", "id": "8359af8a", "metadata": {}, "source": [ "It looks like our toy code did okay on this simple example. Now, let's string a few more Givens rotations together to triangularize the tridiagonal matrix $A$ above. 
We'll also let triQR apply the same Givens rotations to a second matrix $B$, which is useful if we want to express the orthogonal factor $Q$ as a matrix, or if we want to compute $Q^Tb$ for least-squares problems." ] }, { "cell_type": "code", "execution_count": null, "id": "1ce4a736", "metadata": {}, "outputs": [], "source": [ "function triQR(A,B)\n", " # compute the QR decomposition of a tridiagonal matrix using Givens rotations\n", " \n", " R = copy(A)\n", " QTB = copy(B)\n", " n = size(R,2)\n", " for j in 1:n-1\n", " G = givens(R, j, j + 1, j)\n", " R = G[1]*R\n", " QTB = G[1]*QTB\n", " end\n", "\n", " return R, QTB\n", "end\n", "\n", "F = triQR(A,diagm(0 => ones(N)))\n", "display(abs.(F[1]).>1e-14) # F[1] is the triangular factor - it is banded, with two nonzero superdiagonals\n", "display(norm(F[2]'*F[2]-I)) # F[2] is the transpose of the orthogonal factor" ] }, { "attachments": {}, "cell_type": "markdown", "id": "8a864530", "metadata": {}, "source": [ "Banded matrices are very special. What happens for other types of sparse matrices? Let's replace the first row of the difference matrix with a row of ones and see how it affects the $QR$ factorization.\n", "\n", " $$\n", " A = \\begin{pmatrix}\n", " 1 & 1 & 1 & 1 & \\cdots & 1 \\\\\n", " -1 & 2 & -1 & 0 & \\cdots & 0 \\\\\n", " 0 & -1 & 2 & -1 & \\cdots & 0 \\\\\n", " \\vdots & & \\ddots & \\ddots & \\ddots & \\vdots \\\\\n", " 0 & \\cdots & & 0 & -1 & 2\n", " \\end{pmatrix}\n", " $$" ] }, { "cell_type": "code", "execution_count": null, "id": "ad523721", "metadata": {}, "outputs": [], "source": [ "A[1,:] = ones(1,N)\n", "F = triQR(A,diagm(0 => ones(N)))\n", "display(abs.(F[1]).>1e-14) # F[1] is the triangular factor - is it still banded?\n", "display(norm(F[2]'*F[2]-I)) # F[2] is the transpose of the orthogonal factor" ] }, { "attachments": {}, "cell_type": "markdown", "id": "56ff1f07", "metadata": {}, "source": [ "The upper triangular factor is now completely dense... what happened?\n", "\n", "
\n", "\n", "We can explore by running Householder QR on $A$ one column at a time, stopping to visualize how the upper triangular factor fills in." ] }, { "cell_type": "code", "execution_count": null, "id": "d379f452", "metadata": {}, "outputs": [], "source": [ "F = hqr(A, 0)\n", "display(abs.(F[2]).>1e-14) # R is now (numerically) zero below the diagonal" ] }, { "attachments": {}, "cell_type": "markdown", "id": "0f93d48e", "metadata": {}, "source": [ "Just as we saw with Gaussian elimination and $A=LU$, the factors of $A$ can suffer from _fill-in_: many more nonzero entries than in the original matrix." ] } ], "metadata": { "kernelspec": { "display_name": "Julia 1.6.3", "language": "julia", "name": "julia-1.6" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.6.3" } }, "nbformat": 4, "nbformat_minor": 5 }