{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# QR Factorization " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "AUTHOR \n", "\n", "MD SHORIFUL ISLAM; \n", "EMAIL- ISLAM5@BYU.EDU\n", "\n", "MODIFIED\n", "\n", "DALLIN SKOUSON\n", "\n", "changelog \n", "December 2020:\n", "Updated code to work with python3 by default\n", "Added information about other methods of calculating the QR factorization\n", "Added code example for calcuation using Gram-Schmidt\n", "Clarified some language\n", "Added additional exploration to existing code" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction \n", "In linear algebra, a QR factorization of a matrix is a factorization of a matrix A into a product $A = QR$ of an orthogonal matrix Q and an upper triangular matrix R. QR factorization is often used to solve the linear least squares problem and is the basis for a particular eigenvalue algorithm, the QR algorithm.\n", "\n", "The Q in the QR factorization will be unitrary (orthogonal when dealing with real numbers). What this means is that $Q^H Q = I$. Additionally it will be shown that $QR$ does yield A.\n", "\n", "One of the key benefits of using QR factorization over other methods for solving linear least squares is that it is more numerically stable, albeit at the expense of being slower to execute. Hence if we are performing a large quantity of regressions as part of a trading backtest, for instance, we need to consider very extensively whether QR factorization is the best fit.\n", "\n", "# Methods for Computing the QR factorization\n", "\n", "## Calculating the QR matrix using Householder transformations\n", "\n", "For a square matrix A the QR factorization converts A into the product of an orthogonal matrix Q (i.e. $QTQ=I$) and an upper triangular matrix R. Hence:\n", "\n", "$A=QR$\n", "The first step is to create the vector x, which is the k-th column of the matrix A, for step k. We define $α=−sgn(xk)(||x||)$ where sgn is the sign operator. The norm $||⋅||$ used here is the Euclidean norm. Given the first column vector of the identity matrix, I of equal size to A, $e1=(1,0,...,0)^T$, we create the vector u:\n", "\n", "$u=x+αe1$\n", "Once we have the vector u, we need to convert it to a unit vector, which we denote as v:\n", "\n", "$v=u/||u||$\n", "Now we form the matrix Q out of the identity matrix I and the vector multiplication of v:\n", "\n", "$Q=I−2vvT$\n", "Q is now an m×m Householder matrix, with $Qx=(α,0,...,0)T$. We will use Q to transform A to upper triangular form, giving us the matrix R. We denote Q as Qk and, since k=1 in this first step, we have Q1 as our first Householder matrix. \n", "Hence, we define Qk as the block matrix $Q_{k}$:\n", "\n", "\\begin{bmatrix}\n", "I_{k-1} & 0\\\\\n", "0 & Q_{k}\n", "\\end{bmatrix}\n", "\n", "$Q_{k}=(I_{k}−100Q′_{k})$\n", "Once we have carried out t iterations of this process we have R as an upper triangular matrix:\n", "\n", "$R=Q_{t}...Q_{2}Q_{1}A$\n", "Q is then fully defined as the multiplication of the transposes of each Qk:\n", "\n", "$Q=Q^{T}_{1}Q^{T}_{2}...Q^{T}_{t}$\n", "\n", "This gives $A=QR$, the QR factorization of A.\n", "\n", "## Gram-Schmidt and modified Gram-Schmidt algorithms\n", "\n", "This method is the most poorly numerically conditioned method of those highlighted here, however it is also the least computationally intensive method of calculating the QR factorization.\n", "\n", "The Gram-Schmidt method creates an orthonormal basis to span the columns of A. This orthonormal basis can be used to create an upper triangular R matrix and by extension a unitary Q matrix.\n", "\n", "## Givens rotations\n", "\n", "Givens rotations are similar to householder transformations, but they zero out individual elements of the matrix one at a time instead of zeroing out portions of columns all at once. Additionally, givens rotations can be implemented using CORDIC rotations which allows the solution to be parallelized and pipleined. Carefully selected rotations can aid in simplifying computational complexity, by ensuring that mulitplications are powers of 2 and that fewer multiplications need to be done.\n" ] }, { "source": [ "# Finding the QR factorization in python\n", "\n", "The QR factorization is found using a common library. In order to compare the result for each method the same matrix A is used for each.\n", "\n", "A = \n", "\n", "\\[\\[12, -51, 4],\n", "\n", "\\[6, 167, -68],\n", "\n", "\\[-4, 24, -41]]" ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "#Finding the QR factorization using SciPy\n", "#Print A, Q and R at the output\n", "import pprint" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import scipy" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "import scipy.linalg # SciPy Linear Algebra Library" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "#Create an matrix\n", "A = scipy.array([[12, -51, 4], [6, 167, -68], [-4, 24, -41]]) # From the Wikipedia Article on QR factorization\n", "Q, R = scipy.linalg.qr(A)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "A:\narray([[ 12, -51, 4],\n [ 6, 167, -68],\n [ -4, 24, -41]])\n" ] } ], "source": [ "print(\"A:\")\n", "pprint.pprint(A)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Q:\narray([[-0.85714286, 0.39428571, 0.33142857],\n [-0.42857143, -0.90285714, -0.03428571],\n [ 0.28571429, -0.17142857, 0.94285714]])\n" ] } ], "source": [ "print (\"Q:\")\n", "pprint.pprint(Q)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "R:\narray([[ -14., -21., 14.],\n [ 0., -175., 70.],\n [ 0., 0., -35.]])\n" ] } ], "source": [ "print (\"R:\")\n", "pprint.pprint(R)" ] }, { "source": [ "$QR$ is in fact a factorization of $A$ as demonstrated by multiplying the matricies togehter" ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "QR:\narray([[ 12., -51., 4.],\n [ 6., 167., -68.],\n [ -4., 24., -41.]])\n" ] } ], "source": [ "# demonstrate that QR yields A\n", "def mult_matrix(X, Y):\n", " \"\"\"Multiply square matrices X and Y of same dimension\"\"\"\n", " # Nested list comprehension to calculate matrix multiplication\n", " return [[sum(a*b for a,b in zip(X_row,Y_col)) for Y_col in zip(*Y)] for X_row in X]\n", " \n", "print(\"QR:\")\n", "pprint.pprint(scipy.array(mult_matrix(Q,R)))" ] }, { "source": [ "Additionally $Q$ is a unitary, and in this case orthogonal matrix as shown by multiplying $Q^T Q$ and $Q Q^T$ both of which should yield $I$" ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Q tranpose(Q)\narray([[ 1.00000000e+00, 8.84708973e-17, 0.00000000e+00],\n [ 8.84708973e-17, 1.00000000e+00, -3.46944695e-17],\n [ 0.00000000e+00, -3.46944695e-17, 1.00000000e+00]])\ntranpose(Q) Q\narray([[ 1.00000000e+00, -2.08166817e-17, -5.55111512e-17],\n [-2.08166817e-17, 1.00000000e+00, 2.77555756e-17],\n [-5.55111512e-17, 2.77555756e-17, 1.00000000e+00]])\n" ] } ], "source": [ "#show that Q is unitary\n", "def mult_matrix(X, Y):\n", " \"\"\"Multiply square matrices X and Y of same dimension\"\"\"\n", " # Nested list comprehension to calculate matrix multiplication\n", " return [[sum(a*b for a,b in zip(X_row,Y_col)) for Y_col in zip(*Y)] for X_row in X]\n", "\n", "def trans_matrix(M):\n", " \"\"\"Take the transpose of a matrix.\"\"\"\n", " n = len(M)\n", " return [[ M[i][j] for i in range(n)] for j in range(n)]\n", "\n", "print(\"Q tranpose(Q)\")\n", "pprint.pprint(scipy.array(mult_matrix(Q,trans_matrix(Q))))\n", "print(\"tranpose(Q) Q\")\n", "pprint.pprint(scipy.array(mult_matrix(trans_matrix(Q),Q)))" ] }, { "source": [ "Note that although the values are not exactly zero off the diagonal this can be explained by floating point math error." ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "markdown", "metadata": {}, "source": [ "# QR factorization using householder transformations\n", "\n", "The QR factorization is computed using householder transformations." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "from math import sqrt\n", "from pprint import pprint" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "def mult_matrix(X, Y):\n", " \"\"\"Multiply square matrices X and Y of same dimension\"\"\"\n", " # Nested list comprehension to calculate matrix multiplication\n", " return [[sum(a*b for a,b in zip(X_row,Y_col)) for Y_col in zip(*Y)] for X_row in X]" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "def trans_matrix(M):\n", " \"\"\"Take the transpose of a matrix.\"\"\"\n", " n = len(M)\n", " return [[ M[i][j] for i in range(n)] for j in range(n)]\n" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "def norm(x):\n", " \"\"\"Return the Euclidean norm of the vector x.\"\"\"\n", " return sqrt(sum([x_i**2 for x_i in x]))\n" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "def Q_i(Q_min, i, j, k):\n", " \"\"\"Construct the Q_t matrix by left-top padding the matrix Q \n", " with elements from the identity matrix.\"\"\"\n", " if i < k or j < k:\n", " return float(i == j)\n", " else:\n", " return Q_min[i-k][j-k]\n" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "def householder(A):\n", " \"\"\"Performs a Householder Reflections based QR Decomposition of the \n", " matrix A. The function returns Q, an orthogonal matrix and R, an \n", " upper triangular matrix such that A = QR.\"\"\"\n", " n = len(A)\n", "\n", " # Set R equal to A, and create Q as a zero matrix of the same size\n", " R = A\n", " Q = [[0.0] * n for i in range(n)]\n", "\n", " # The Householder procedure\n", " for k in range(n-1): # We don't perform the procedure on a 1x1 matrix, so we reduce the index by 1\n", " # Create identity matrix of same size as A \n", " I = [[float(i == j) for i in range(n)] for j in range(n)]\n", "\n", " # Create the vectors x, e and the scalar alpha\n", " # find the sign of the values. used -cmp() in python2\n", " x = [row[k] for row in R[k:]]\n", " e = [row[k] for row in I[k:]]\n", " alpha = -((x[0] > 0) - (x[0] < 0)) * norm(x)#-cmp(x[0],0) * norm(x)\n", "\n", " # Using anonymous functions, we create u and v\n", " u = list(map(lambda p,q: p + alpha * q, x, e))\n", " norm_u = norm(u)\n", " v = list(map(lambda p: p/norm_u, u))\n", "\n", " # Create the Q minor matrix\n", " Q_min = [ [float(i==j) - 2.0 * v[i] * v[j] for i in range(n-k)] for j in range(n-k) ]\n", "\n", " # \"Pad out\" the Q minor matrix with elements from the identity\n", " Q_t = [[ Q_i(Q_min,i,j,k) for i in range(n)] for j in range(n)]\n", "\n", " # If this is the first run through, right multiply by A,\n", " # else right multiply by Q\n", " if k == 0:\n", " Q = Q_t\n", " R = mult_matrix(Q_t,A)\n", " else:\n", " Q = mult_matrix(Q_t,Q)\n", " R = mult_matrix(Q_t,R)\n", "\n", " # Since Q is defined as the product of transposes of Q_t,\n", " # we need to take the transpose upon returning it\n", " return trans_matrix(Q), R" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": true }, "outputs": [], "source": [ "#creating a matrix \n", "A = [[12, -51, 4], [6, 167, -68], [-4, 24, -41]]\n", "#finding the QR factorization\n", "Q, R = householder(A)" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "A:\n[[12, -51, 4], [6, 167, -68], [-4, 24, -41]]\n" ] } ], "source": [ "print (\"A:\")\n", "pprint(A)" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Q:\n[[0.8571428571428571, 0.39428571428571435, -0.33142857142857135],\n [0.4285714285714286, -0.9028571428571429, 0.034285714285714114],\n [-0.28571428571428575, -0.17142857142857126, -0.942857142857143]]\n" ] } ], "source": [ "print (\"Q:\")\n", "pprint(Q)" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "R:\n[[13.999999999999998, 21.00000000000001, -14.000000000000004],\n [-5.506706202140776e-16, -175.00000000000003, 70.0],\n [3.0198066269804245e-16, -3.552713678800501e-14, 35.000000000000014]]\n" ] } ], "source": [ "print (\"R:\")\n", "pprint(R)" ] }, { "source": [ "This result is comparable to the result achieved with scipy, being off by some rounding error. Casting the resulting matricies to scipy.array will aid in comparison because it will print with fewer significant digits." ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Q:\narray([[ 0.85714286, 0.39428571, -0.33142857],\n [ 0.42857143, -0.90285714, 0.03428571],\n [-0.28571429, -0.17142857, -0.94285714]])\nR:\narray([[ 1.40000000e+01, 2.10000000e+01, -1.40000000e+01],\n [-5.50670620e-16, -1.75000000e+02, 7.00000000e+01],\n [ 3.01980663e-16, -3.55271368e-14, 3.50000000e+01]])\n" ] } ], "source": [ "import scipy\n", "print(\"Q:\")\n", "pprint(scipy.array(Q))\n", "print(\"R:\")\n", "pprint(scipy.array(R))" ] }, { "source": [ "# QR factorization with Gram-Schmidt" ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Q = [[ 0.85714286 -0.39428571 -0.33142857]\n [ 0.42857143 0.90285714 0.03428571]\n [-0.28571429 0.17142857 -0.94285714]]\nR = [[ 14. 21. -14.]\n [ 0. 175. -70.]\n [ 0. 0. 35.]]\n" ] } ], "source": [ "import numpy as np\n", "def gram_schmidt(A):\n", " m = A.shape[0]\n", " n = A.shape[1]\n", " R=np.zeros([m,n])\n", " Q=np.zeros([m,m])\n", " R[0,0]=np.linalg.norm(A[:,0])\n", " if R[0, 0] == 0:\n", " print (\"Could not compute the QR factorization\")\n", " else:\n", " Q[:,0]=A[:,0]/R[0,0]\n", " for i in range(1,n):\n", " Q[:,i]=A[:,i]\n", " for j in range(i):\n", " R[j,i]=np.dot(Q[:,j].T,Q[:,i])\n", " Q[:,i]-=R[j,i]*Q[:,j]\n", " R[i,i]=np.linalg.norm(Q[:,i])\n", " if R[i,i]==0:\n", " print(\"Could not compute the QR factorization\")\n", " else:\n", " Q[:, i] = Q[:, i] / R[i, i]\n", " return Q,R\n", "\n", "A = np.array([[12, -51, 4], [6, 167, -68], [-4, 24, -41]])\n", "\n", "Q, R = gram_schmidt(A)\n", "\n", "print(\"Q = \", Q)\n", "print(\"R = \", R)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Linear Equations\n", "\n", "In numerical analysis, different factorization are used to implement efficient matrix algorithms. For instance, when solving a system of linear equations , the matrix A can be factorized via the LU factorization. The LU factorization factors a matrix into a lower triangular matrix L and an upper triangular matrix U. Solving the system will require fewer additions and multiplications in the LU factorization, though one might require significantly more digits in inexact arithmetic such as floating point.\n", "Similarly, the QR factorization expresses A as QR with Q a unitary matrix and R an upper triangular matrix. The system $Q(Rx) = b$ is solved by $Rx = Q^{T}b = c$, and the system $Rx = c$ is solved by 'back substitution'. The number of additions and multiplications required is about twice that of using the LU solver, but because the QR factorization is numerically stable, no additional significant digits are required in inexact arithmetic." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Eigenvalues\n", "\n", "Even though QR factorization can be used to solve a system of Linear Equations and is in fact better than Gaussian Elimination in terms of stability, it is rarely used for the same because of its cost of computation being twice that of Gaussian elimination.\n", "Another important use of QR factorization is inspiring the QR algorithm for calculation of eigen value factorization." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Conclusion \n", "\n", "Householder transformations and Gram-Schmidt are an appropriate methods for calculating the QR factorization of a matrix. Gram-Schmidt is much simpler compuatationally but it is less robust to poorly conditioned matricies. All methods yield similar results as well" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "3.8.5-final" } }, "nbformat": 4, "nbformat_minor": 2 }