{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Numerical Methods\n", "\n", "# Lecture 5: Numerical Linear Algebra I" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import scipy.linalg as sl" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Learning objectives:\n", "\n", "* Manipulation of matrices and matrix equations in Python\n", "* Reminder on properties of matrices (from MM1): determinants, singularity etc\n", "* Algorithms for the solution of linear systems\n", "* Gaussian elimination, including back substitution" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Introduction - Linear (matrix) systems\n", "\n", "Recall from your Mathematical Methods I course that the we can re-write a system of simultaneous (linear) equations in matrix form. For example, in week 4 of MM1 you considered the following example:\n", "\n", "\\begin{eqnarray*}\n", " 2x + 3y &=& 7 \\\\\n", " x - 4y &=& 3\n", "\\end{eqnarray*} \n", "\n", "and it was noted that this can be written in matrix form as \n", "\n", "$$\n", "\\left(\n", " \\begin{array}{rr}\n", " 2 & 3 \\\\\n", " 1 & -4 \\\\\n", " \\end{array}\n", "\\right)\\left(\n", " \\begin{array}{c}\n", " x \\\\\n", " y \\\\\n", " \\end{array}\n", "\\right) = \\left(\n", " \\begin{array}{c}\n", " 7 \\\\\n", " 3 \\\\\n", " \\end{array}\n", "\\right)\n", "$$\n", "\n", "More generally, consider the arbitrary system of $n$ linear equations for $n$ unknowns\n", "\n", "\\begin{eqnarray*}\n", " A_{11}x_1 + A_{12}x_2 + \\dots + A_{1n}x_n &=& b_1 \\\\ \n", " A_{21}x_1 + A_{22}x_2 + \\dots + A_{2n}x_n &=& b_2 \\\\ \n", " \\vdots &=& \\vdots \\\\ \n", " A_{n1}x_1 + A_{n2}x_2 + \\dots + A_{nn}x_n &=& b_n\n", "\\end{eqnarray*}\n", "\n", "where $A_{ij}$ are the constant coefficients of the linear system, $x_j$ are the unknown variables, and $b_i$\n", "are the terms on the right hand side (RHS). Here the index $i$ is referring to the equation number\n", "(the row in the matrix below), with the index $j$ referring to the component of the unknown\n", "vector $\\pmb{x}$ (the column of the matrix).\n", "\n", "This system of equations can be represented as the matrix equation $A\\pmb{x}=\\pmb{b}$: \n", "\n", "$$\n", "\\left(\n", " \\begin{array}{cccc}\n", " A_{11} & A_{12} & \\dots & A_{1n} \\\\\n", " A_{21} & A_{22} & \\dots & A_{2n} \\\\\n", " \\vdots & \\vdots & \\ddots & \\vdots \\\\\n", " A_{n1} & A_{n2} & \\dots & A_{nn} \\\\\n", " \\end{array}\n", "\\right)\\left(\n", " \\begin{array}{c}\n", " x_1 \\\\\n", " x_2 \\\\\n", " \\vdots \\\\\n", " x_n \\\\\n", " \\end{array}\n", " \\right) = \\left(\n", " \\begin{array}{c}\n", " b_1 \\\\\n", " b_2 \\\\\n", " \\vdots \\\\\n", " b_n \\\\\n", " \\end{array}\n", " \\right)\n", "$$\n", "\n", "\n", "We can easily solve the above $2 \\times 2$ example of two equations and two unknowns using substitution (e.g. multiply the second equation by 2 and subtract the first equation from the resulting equation to eliminate $x$ and hence allowing us to find $y$, then we could compute $x$ from the first equation). We find:\n", "\n", "$$ x=37/11, \\quad y=1/11.$$\n", "\n", "In MM1 you also considered $3 \\times 3$ examples which were a little more complicated but still doable. This lecture considers the case of $n \\times n$ where $n$ could easily be billions! This case arises when you solve a differential equation numerically on a discrete mesh or grid. 
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Matrices in Python\n", "\n", "We have already used numpy arrays to store one-dimensional vectors of numbers.\n", "\n", "The convention is that these are generally considered to be *column* vectors and have shape $n \\times 1$.\n", "\n", "We can extend to higher dimensions through the introduction of matrices as two-dimensional arrays (more generally, vectors and matrices are just two examples of tensors). \n", "\n", "We use subscript indices to identify each component of the array or matrix, i.e. we can identify each component of the vector $\\pmb{v}$ by $v_i$, and each component of the matrix $A$ by $A_{ij}$. \n", "\n", "Note that it is a convention that vectors are either underlined or bold, and generally lower case letters, whereas matrices are plain capital letters.\n", "\n", "The *dimension* or *shape* of a vector/matrix is the number of rows and columns it possesses, i.e. $n \\times 1$ and $m \\times n$ for the examples above.\n", "\n", "Here is an example of how we can extend our use of the numpy array object to two dimensions in order to define a matrix $A$, together with some examples of operations we can perform on it." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[10. 2. 1.]\n", " [ 6. 5. 4.]\n", " [ 1. 4. 7.]]\n" ] } ], "source": [ "A = np.array([[10., 2., 1.],[6., 5., 4.],[1., 4., 7.]])\n", "\n", "print(A)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "9\n" ] } ], "source": [ "# the total size of the array storing A - here 9 for a 3x3 matrix\n", "print(np.size(A))" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2\n" ] } ], "source": [ "# the number of dimensions of the matrix A\n", "print(np.ndim(A))" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(3, 3)\n" ] } ], "source": [ "# the shape of the matrix A\n", "print(np.shape(A))" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[10. 6. 1.]\n", " [ 2. 5. 4.]\n", " [ 1. 4. 
7.]]\n" ] } ], "source": [ "# the transpose of the matrix A\n", "print(A.T)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# the inverse of the matrix A - computed using a scipy algorithm\n", "print(sl.inv(A))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# the determinant of the matrix A - computed using a scipy algorithm\n", "print(sl.det(A))" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 1.00000000e+00 -1.11022302e-16 5.55111512e-17]\n", " [-2.22044605e-16 1.00000000e+00 2.22044605e-16]\n", " [-8.32667268e-17 -3.33066907e-16 1.00000000e+00]]\n" ] } ], "source": [ "# Multiply A with its inverse using the @ matrix multiplication operator. \n", "# Note that due to roundoff errors the off diagonal values are not exactly zero.\n", "print(A @ sl.inv(A))" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 1.00000000e+00 -1.11022302e-16 5.55111512e-17]\n", " [-2.22044605e-16 1.00000000e+00 2.22044605e-16]\n", " [-8.32667268e-17 -3.33066907e-16 1.00000000e+00]]\n" ] } ], "source": [ "# The @ operator is fairly new, you may also see use of numpy.dot:\n", "print(np.dot(A,sl.inv(A)))" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 1.00000000e+00 -1.11022302e-16 5.55111512e-17]\n", " [-2.22044605e-16 1.00000000e+00 2.22044605e-16]\n", " [-8.32667268e-17 -3.33066907e-16 1.00000000e+00]]\n" ] } ], "source": [ "# same way to achieve the same thing\n", "print(A.dot(sl.inv(A)))" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 1.42857143 -0.15037594 0.02255639]\n", " [-1.71428571 2.59398496 -1.02255639]\n", " [ 0.14285714 -1.14285714 2. ]]\n" ] } ], "source": [ "# note that the * operator simply does operations element-wise - here this\n", "# is not what we want!\n", "print(A*sl.inv(A))" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0. 0. 0.]\n" ] } ], "source": [ "# how to initialise a vector of zeros \n", "print(np.zeros(3))" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[0. 0. 0.]\n", " [0. 0. 0.]\n", " [0. 0. 0.]]\n" ] } ], "source": [ "# how to initialise a matrix of zeros \n", "print(np.zeros((3,3)))" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[[0. 0. 0.]\n", " [0. 0. 0.]\n", " [0. 0. 0.]]\n", "\n", " [[0. 0. 0.]\n", " [0. 0. 0.]\n", " [0. 0. 0.]]\n", "\n", " [[0. 0. 0.]\n", " [0. 0. 0.]\n", " [0. 0. 0.]]]\n" ] } ], "source": [ "# how to initialise a 3rd-order tensor of zeros\n", "print(np.zeros((3,3,3)))" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[1. 0. 0.]\n", " [0. 1. 0.]\n", " [0. 0. 1.]]\n" ] } ], "source": [ "# how to initialise the identity matrix, I or Id\n", "print(np.eye(3))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's quickly consider the $2 \\times 2$ case from MM1 recreated above where we claimed that $x=37/11$ and $y=1/11$. 
\n", "\n", "To solve the matrix equation \n", "\n", "$$ A\\pmb{x}=\\pmb{b} $$\n", "\n", "we can simply multiply both sides by the inverse of the matrix $A$ (if $A$ is invertible and if we know what the inverse is of course!):\n", "\n", "\\begin{align}\n", "A\\pmb{x} & = \\pmb{b}\\\\\n", "\\implies A^{-1}A\\pmb{x} & = A^{-1}\\pmb{b}\\\\\n", "\\implies I\\pmb{x} & = A^{-1}\\pmb{b}\\\\\n", "\\implies \\pmb{x} & = A^{-1}\\pmb{b}\n", "\\end{align}\n", "\n", "so we can find the solution $\\pmb{x}$ by multiplying the inverse of $A$ with the RHS vector $\\pmb{b}$.\n", "\n", "### Exercise 5.1: Solving a linear system \n", "Formulate and solve the linear system and check you get the answer quoted above." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Aside: matrix objects \n", "\n", "Note that numpy does possess a matrix object as a sub-class of the numpy array. We can cast the above two-dimensional arrays into matrix objects and then the star operator does yield the expected matrix product:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "# A is an n-dimensional array (n-2 here)\n", "A = np.array([[10., 2., 1.],[6., 5., 4.],[1., 4., 7.]])\n", "print(type(A))" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "# this casts the array A into the matrix class\n", "print(type(np.mat(A)))" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 1.00000000e+00 -1.11022302e-16 5.55111512e-17]\n", " [-2.22044605e-16 1.00000000e+00 2.22044605e-16]\n", " [-8.32667268e-17 -3.33066907e-16 1.00000000e+00]]\n" ] } ], "source": [ "# for these objects * is standard matrix multuiplication \n", "# and we can check that A*A^{-1}=I as expected\n", "print(np.mat(A)*np.mat(sl.inv(A)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Slicing \n", "\n", "Remember from last year's Python module that just as for arrays or lists, we can use *slicing* in order to extract components of matrices, for example:" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[10. 2. 1.]\n", " [ 6. 5. 4.]\n", " [ 1. 4. 7.]]\n" ] } ], "source": [ "A=np.array([[10., 2., 1.],[6., 5., 4.],[1., 4., 7.]])\n", "print(A)" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2.0\n" ] } ], "source": [ "# single entry, first row, second column\n", "print(A[0,1])" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[10. 2. 1.]\n" ] } ], "source": [ "# first row\n", "print(A[0,:])" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1. 4. 7.]\n" ] } ], "source": [ "# last row\n", "print(A[-1,:])" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[2. 5. 
4.]\n" ] } ], "source": [ "# second column\n", "print(A[:,1])" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[5. 4.]\n", " [4. 7.]]\n" ] } ], "source": [ "# extract a 2x2 sub-matrix\n", "print(A[1:3,1:3])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise 5.2: Matrix manipulation in Python \n", "\n", "Let\n", "$$\n", "A = \\left(\n", " \\begin{array}{ccc}\n", " 1 & 2 & 3 \\\\\n", " 4 & 5 & 6 \\\\\n", " 7 & 8 & 9 \\\\\n", " \\end{array}\n", "\\right)\n", "\\mathrm{\\quad\\quad and \\quad\\quad}\n", "b = \\left(\n", " \\begin{array}{c}\n", " 2 \\\\\n", " 4 \\\\\n", " 6 \\\\\n", " \\end{array}\n", "\\right)\n", "$$\n", "\n", "- Store $A$ and $b$ in NumPy array structures. Print them.\n", "- Print their shape and size. What is the difference ?\n", "- Create a NumPy array $I$ containing the identity matrix $I_3$. Do $A = A+I$.\n", "- Substitute the 3rd column of $A$ with $b$.\n", "- Solve $Ax=b$." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Properties of matrices: determinants, singularity, solvability of linear systems, etc\n", "\n", "Consider $N$ linear equations in $N$ unknowns, $A\\pmb{x}=\\pmb{b}$. \n", "\n", "From MM1 you learnt that this system has a *unique solution* provided that the determinant of A, $\\det(A)$, is non-zero. In this case the matrix is said to be *non-singular*.\n", "\n", "If $\\det(A)=0$ (with $A$ then termed a *singular matrix*), then the linear system does *not* have a unique solution, it may have either infinite *or* no solutions.\n", "\n", "For example, consider\n", "\n", "$$\n", "\\left(\n", " \\begin{array}{rr}\n", " 2 & 3 \\\\\n", " 4 & 6 \\\\\n", " \\end{array}\n", "\\right)\\left(\n", " \\begin{array}{c}\n", " x \\\\\n", " y \\\\\n", " \\end{array}\n", "\\right) = \\left(\n", " \\begin{array}{c}\n", " 4 \\\\\n", " 8 \\\\\n", " \\end{array}\n", "\\right)\n", "$$\n", "\n", "The second equation is simply twice the first, and hence a solution to the first equation is also automatically a solution to the second equation. \n", "\n", "We hence only have one *linearly-independent* equation, and our problem is under-constrained: we effectively only have one eqution for two unknowns with infinitely many possibly solutions. \n", "\n", "If we replaced the RHS vector with $(4,7)^T$, then the two equations would be contradictory: in this case we have no solutions.\n", "\n", "Note that a set of vectors where one can be written as a linear sum of the others are termed *linearly-dependent*. When this is not the case the vectors are termed *linearly-independent*.\n", "\n", "The following properties of a square $n\\times n$ matrix are equivalent:\n", "\n", "\n", "* $\\det(A)\\ne 0$ - A is non-singular\n", "\n", "\n", "* The columns of $A$ are linearly independent\n", "\n", "\n", "* The rows of $A$ are linearly independent\n", "\n", "\n", "* The columns of A *span* $n$-dimensional space (recall MM1 - we can reach any point in $\\mathbb{R}^N$ through a linear combination of these vectors - note that this is simply what the operation $A\\pmb{x}$ is doing of course if you write it out)\n", "\n", "\n", "* $A$ is invertible, i.e. 
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Gaussian elimination - method\n", "\n", "\n", "The Gaussian elimination algorithm is simply a systematic implementation of the method of equation substitution we used above to solve the $2\\times 2$ system (i.e. where we \"multiply the second equation by 2 and subtract the first equation from the resulting equation to *eliminate* $x$, hence allowing us to find $y$; we can then compute $x$ from the first equation\").\n", "\n", "It is called *Gaussian* elimination as the method is attributed to the mathematician *Gauss* (although it was certainly known before his time), and *elimination* as we seek to eliminate unknowns.\n", "\n", "To perform this method for arbitrarily large systems (on paper) we form the so-called *augmented matrix*\n", "\n", "$$\n", "[A|\\pmb{b}] = \n", "\\left[\n", " \\begin{array}{rr|r}\n", " 2 & 3 & 7 \\\\\n", " 1 & -4 & 3 \\\\\n", " \\end{array}\n", "\\right]\n", "$$\n", "\n", "Note that this encodes the equations (including the RHS values); we can perform so-called *row operations*, and as long as we are consistent with what we do for the LHS and RHS components then what this system is describing will not change - a solution to the updated system will be the same as the solution to the original system. \n", "\n", "Our task is to update the system so we can easily read off the solution - of course this is exactly what we do via the substitution approach:\n", "\n", "First we multiplied the second equation by 2; this yields the updated augmented matrix:\n", "\n", "$$\n", "\\left[\n", " \\begin{array}{rr|r}\n", " 2 & 3 & 7 \\\\\n", " 2 & -8 & 6 \\\\\n", " \\end{array}\n", "\\right]\n", "$$\n", "\n", "We can use the following notation to describe this operation:\n", "\n", "$$ Eq. (2) \\leftarrow 2\\times Eq. (2) $$\n", "\n", "Note importantly that this does not change anything about what this pair of equations is telling us about the unknown solution vector $\\pmb{x}$, which, although it does not appear explicitly, is implicitly defined by this augmented system.\n", "\n", "The next step was to subtract the first equation from the updated second ($ Eq. (2) \\leftarrow Eq. (2) - Eq. (1) $):\n", "\n", "$$\n", "\\left[\n", " \\begin{array}{rr|r}\n", " 2 & 3 & 7 \\\\\n", " 0 & -11 & -1 \\\\\n", " \\end{array}\n", "\\right]\n", "$$\n", "\n", "The square matrix that is now in the $A$ position of this augmented system is an example of an *upper-triangular* matrix - all entries below the diagonal are zero. \n", "\n", "For such a matrix we can perform back substitution - starting at the bottom to solve trivially for the final unknown ($y$ here, which clearly takes the value $-1/-11 = 1/11$), and then using this knowledge working our way up to solve for each remaining unknown in turn, here just $x$ (solving $2x + 3\\times (1/11) = 7$).\n", "\n", "Note that we could perform a similar substitution if we had a lower-triangular matrix, first finding the first unknown and then working our way forwards through the remaining unknowns - hence in that case *forward substitution*.\n", "\n", "Note that if we wished we could of course continue working on the augmented matrix to make the $A$ component diagonal: divide the second equation by 11 and multiply by 3 ($ Eq. (2) \\leftarrow (3/11)\\times Eq. (2) $) and add it to the first ($ Eq. (1) \\leftarrow Eq. (1) + Eq. (2) $):\n", "\n", "$$\n", "\\left[\n", " \\begin{array}{rr|r}\n", " 2 & 0 & 7-3/11\\\\\n", " 0 & -3 & -3/11 \\\\\n", " \\end{array}\n", "\\right]\n", "$$\n", "\n", "and we can further make it the identity by dividing the rows by $2$ and $-3$ respectively ($ Eq. (1) \\leftarrow (1/2)\\times Eq. (1) $, $ Eq. (2) \\leftarrow (-1/3)\\times Eq. (2) $):\n", "\n", "$$\n", "\\left[\n", " \\begin{array}{rr|r}\n", " 1 & 0 & (7-3/11)/2 \\\\\n", " 0 & 1 & 1/11 \\\\\n", " \\end{array}\n", "\\right]\n", "$$\n", "\n", "Each of these augmented matrices encodes exactly the same information as the original matrix system in terms of the unknown vector $\\pmb{x}$, and hence this is telling us that\n", "\n", "$$ \\pmb{x} = I \\pmb{x} = \\left[\n", " \\begin{array}{c}\n", " (7-3/11)/2 \\\\\n", " 1/11 \\\\\n", " \\end{array}\n", "\\right]\n", "$$\n", "\n", "i.e. exactly the solution we found when we performed back substitution from the upper-triangular form of the augmented system.\n" ] },
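{ "cell_type": "markdown", "metadata": {}, "source": [ "These row operations are easy to mirror in code. The following is a minimal sketch (the name `Aug` is purely illustrative) that builds the augmented matrix for the $2 \\times 2$ example, applies the same two row operations, and then reads off the solution by back substitution. The general algorithm is developed properly below." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# A sketch mirroring the row operations above on the augmented matrix [A|b]\n", "# (purely illustrative - the general algorithm and exercises follow below).\n", "Aug = np.array([[2., 3., 7.],\n", "                [1., -4., 3.]])\n", "\n", "Aug[1, :] = 2. * Aug[1, :]         # Eq. (2) <- 2 x Eq. (2)\n", "Aug[1, :] = Aug[1, :] - Aug[0, :]  # Eq. (2) <- Eq. (2) - Eq. (1)\n", "print(Aug)                         # the A part is now upper triangular\n", "\n", "# back substitution on the 2x2 system\n", "y = Aug[1, 2] / Aug[1, 1]\n", "x = (Aug[0, 2] - Aug[0, 1] * y) / Aug[0, 0]\n", "print(x, y)                        # 37/11 and 1/11 as before" ] },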
{ "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise 5.3: Gaussian elimination $3 \\times 3$ example (by hand) \n", "\n", "Consider the system of linear equations\n", "\n", "\\begin{align*}\n", " 2x + 3y - 4z &= 5 \\\\\n", " 6x + 8y + 2z &= 3 \\\\\n", " 4x + 8y - 6z &= 19\n", "\\end{align*}\n", "\n", "Write this in matrix form, form the corresponding augmented system, perform row operations until you get to upper-triangular form, and then find the solution using back substitution (**do this all with pen and paper**).\n", "\n", "Write some code to check your answer using `sl.inv(A) @ b`.\n", "\n", "You should find $x=-6$, $y=5$, $z=-1/2$.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Gaussian elimination - algorithm and code\n", "\n", "Notice that we are free to perform the following operations on the augmented system without changing the corresponding solution:\n", "\n", "\n", "* Exchange two rows (refer to the section on *partial pivoting* next week)\n", "\n", "\n", "* Multiply a row by a non-zero constant ($Eq. (i)\\leftarrow \\lambda \\times Eq.(i)$)\n", "\n", "\n", "* Subtract a (non-zero) multiple of one row from another ($Eq. (i)\\leftarrow Eq. (i) - \\lambda \\times Eq.(j)$)\n", "\n", "\n", "\n", "Note that the equation/row being subtracted is termed the *pivot*.\n", "\n", "\n", "Let's consider the algorithm mid-way through working on an arbitrary matrix system, i.e. assume that the first $k$ rows (i.e. 
above the horizontal dashed line in the matrix below) have already been transformed into upper-triangular form, while the equations/rows below are not yet in this form.\n", "\n", "The augmented equation in this case can be assumed to look like\n", "\n", "$$\n", "\\left[\n", " \\begin{array}{rrrrrrrrr|r}\n", " A_{11} & A_{12} & A_{13} & \\cdots & A_{1k} & \\cdots & A_{1j} & \\cdots & A_{1n} & b_1 \\\\\n", " 0 & A_{22} & A_{23} & \\cdots & A_{2k} & \\cdots & A_{2j} & \\cdots & A_{2n} & b_2 \\\\\n", " 0 & 0 & A_{33} & \\cdots & A_{3k} & \\cdots & A_{3j} & \\cdots & A_{3n} & b_3 \\\\\n", " \\vdots & \\vdots & \\vdots & \\ddots & \\vdots & \\ddots & \\vdots & \\ddots & \\vdots & \\vdots \\\\\n", " 0 & 0 & 0 & \\cdots & A_{kk} & \\cdots & A_{kj} & \\cdots & A_{kn} & b_k \\\\ \n", "\\hdashline\n", " \\vdots & \\vdots & \\vdots & \\ddots & \\vdots & \\ddots & \\vdots & \\ddots & \\vdots & \\vdots \\\\\n", " 0 & 0 & 0 & \\cdots & A_{ik} & \\cdots & A_{ij} & \\cdots & A_{in} & b_i \\\\\n", " \\vdots & \\vdots & \\vdots & \\ddots & \\vdots & \\ddots & \\vdots & \\ddots & \\vdots & \\vdots \\\\\n", " 0 & 0 & 0 & \\cdots & A_{nk} & \\cdots & A_{nj} & \\cdots & A_{nn} & b_n \\\\\n", "\\end{array}\n", "\\right]\n", "$$\n", "\n", "Remember that, as we are mid-way through the algorithm, the $A$'s and $b$'s in the above are **not** the same as in the original system!\n", "\n", "Our aim as a next step in the algorithm is to use row $k$ (the \"pivot row\") to *eliminate* $A_{ik}$, and we need to do this for all of the rows $i$ below the pivot, i.e. for all $i>k$.\n", "\n", "Note that the zeros to the left of the leading term in the pivot row mean that these operations will not mess up the fact that we already have all the zeros we are looking for in the lower-left part of the matrix.\n", "\n", "To eliminate $A_{ik}$ for a single row $i$ we need to perform the operation \n", "$$ Eq. (i)\\leftarrow Eq. (i) - \\frac{A_{ik}}{A_{kk}} \\times Eq.(k) $$\n", "\n", "or equivalently\n", "\n", "\\begin{align}\n", "A_{ij} &\\leftarrow A_{ij} - \\frac{A_{ik}}{A_{kk}} A_{kj}, \\quad j=k,k+1,\\ldots,n\\\\\n", "b_i &\\leftarrow b_i - \\frac{A_{ik}}{A_{kk}} b_{k}\n", "\\end{align}\n", "\n", "$j$ only needs to run from $k$ upwards as we can assume that the earlier entries in row $i$ have already been set to zero, and also that the corresponding entries of the pivot row are zero (we don't need to perform operations that we know just involve the addition of zeros!).\n", "\n", "And to eliminate these entries for all rows below the pivot we need to repeat for all $i>k$." ] },
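{ "cell_type": "markdown", "metadata": {}, "source": [ "To make the update concrete, here is a small sketch of a *single* elimination operation applied to the $3 \\times 3$ system from Exercise 5.3, using the first row as the pivot ($k=0$ in Python's zero-based indexing) to eliminate the entry below it in row $i=1$. The names `A_demo` and `b_demo` are purely illustrative; the exercise below asks you to wrap updates like this in the appropriate loops." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# A sketch of a single elimination operation (not a full implementation).\n", "A_demo = np.array([[2., 3., -4.],\n", "                   [6., 8., 2.],\n", "                   [4., 8., -6.]])\n", "b_demo = np.array([5., 3., 19.])\n", "\n", "k = 0  # the pivot row\n", "i = 1  # the row whose leading entry A_demo[i, k] we want to eliminate\n", "s = A_demo[i, k] / A_demo[k, k]                    # the multiplier A_ik / A_kk\n", "A_demo[i, k:] = A_demo[i, k:] - s * A_demo[k, k:]\n", "b_demo[i] = b_demo[i] - s * b_demo[k]\n", "\n", "print(A_demo)  # the entry in row 1, column 0 is now zero\n", "print(b_demo)" ] },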
{ "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise 5.4: Gaussian elimination\n", "\n", "Write some code that takes a matrix $A$ and a vector $\\pmb{b}$ and converts it into upper-triangular form using the above algorithm. For the $2 \\times 2$ and $3\\times 3$ examples from above, compare the resulting $A$ and $\\pmb{b}$ you obtain following elimination with those you derived by hand.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def upper_triangle(A, b):\n", "    \"\"\" A function to convert A into upper triangular form through row operations.\n", "    The same row operations are performed on the vector b.\n", "    \n", "    Note that this implementation does not use partial pivoting, which is introduced next week.\n", "    \n", "    Also note that A and b are overwritten, and hence we do not need to return anything\n", "    from the function.\n", "    \"\"\"\n", "    n = np.size(b)\n", "    rows, cols = np.shape(A)\n", "    # check A is square\n", "    assert(rows == cols)\n", "    # and check A has the same number of rows as the size of the vector b\n", "    assert(rows == n)\n", "\n", "    # Loop over each pivot row - all but the last row, which we will never need to use as a pivot\n", "    for k in range(n-1):\n", "        # Loop over each row below the pivot row, including the last row which we do need to update\n", "        for i in range(.................\n", "            ...........\n", "            ...........\n", "\n", "# Test our code on our 2x2 and 3x3 examples from above\n", "\n", "A = .......\n", "b = .......\n", "\n", "upper_triangle(A, b)\n", "\n", "# Here is a new trick for you - \"pretty print\"\n", "from pprint import pprint\n", "\n", "print('\\nOur A matrix following row operations to transform it into upper-triangular form:')\n", "pprint(A)\n", "print('The correspondingly updated b vector:')\n", "pprint(b)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Back substitution\n", "\n", "Now we have an augmented system in the upper-triangular form\n", "\n", "$$\n", "\\left[\n", " \\begin{array}{rrrrr|r}\n", " A_{11} & A_{12} & A_{13} & \\cdots & A_{1n} & b_1 \\\\\n", " 0 & A_{22} & A_{23} & \\cdots & A_{2n} & b_2 \\\\\n", " 0 & 0 & A_{33} & \\cdots & A_{3n} & b_3 \\\\\n", " \\vdots & \\vdots & \\vdots & \\ddots & \\vdots & \\vdots \\\\\n", " 0 & 0 & 0 & \\cdots & A_{nn} & b_n \\\\ \n", "\\end{array}\n", "\\right]\n", "$$\n", "\n", "where the solution $\\pmb{x}$ of the original system also satisfies $A\\pmb{x}=\\pmb{b}$ for the $A$ and $\\pmb{b}$ in the above upper-triangular form (rather than the original $A$ and $\\pmb{b}$!).\n", "\n", "We can solve the final equation row to yield\n", "\n", "$$x_n = \\frac{b_n}{A_{nn}}$$\n", "\n", "The second to last equation then yields (note we've introduced a comma in the subscripts here simply to make the more complex double indices easier to read)\n", "\n", "\\begin{align}\n", "A_{n-1,n-1}x_{n-1} + A_{n-1,n}x_n &= b_{n-1}\\\\\n", "\\implies x_{n-1} &= \\frac{b_{n-1} - A_{n-1,n}x_n}{A_{n-1,n-1}}\\\\\n", "\\implies x_{n-1} &= \\frac{b_{n-1} - A_{n-1,n}\\frac{b_n}{A_{nn}}}{A_{n-1,n-1}}\n", "\\end{align}\n", "\n", "and so on to row $k$, which yields\n", "\n", "\\begin{align}\n", "A_{k,k}x_{k} + A_{k,k+1}x_{k+1} +\\cdots + A_{k,n}x_n &= b_{k}\\\\\n", "\\iff A_{k,k}x_{k} + \\sum_{j=k+1}^{n}A_{kj}x_j &= b_{k}\\\\\n", "\\implies x_{k} &= \\left( b_k - \\sum_{j=k+1}^{n}A_{kj}x_j\\right)\\frac{1}{A_{kk}}\n", "\\end{align}" ] },
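{ "cell_type": "markdown", "metadata": {}, "source": [ "This formula is exactly what the next exercise asks you to implement. As an independent check, note that SciPy also provides `scipy.linalg.solve_triangular` for solving triangular systems directly; below is a minimal sketch applying it to the upper-triangular $2 \\times 2$ system derived earlier (the names `U` and `c` are purely illustrative)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# A sketch: solve an upper-triangular system directly with scipy.linalg.solve_triangular.\n", "U = np.array([[2., 3.],\n", "              [0., -11.]])   # the upper-triangular form found above\n", "c = np.array([7., -1.])      # the correspondingly updated RHS vector\n", "print(sl.solve_triangular(U, c, lower=False))   # should give [37/11, 1/11]" ] },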
{ "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise 5.5: Back substitution\n", "\n", "Extend your code to perform back substitution and hence obtain the final solution $\\pmb{x}$. Check against the solutions found earlier. Come up with some random $n\\times n$ matrices (you can use `np.random.rand` for that) and check your code against `sl.inv(A)@b` (remember to use the original $A$ and $\\pmb{b}$ here of course!)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# This function assumes that A is already an upper triangular matrix, \n", "# e.g. we have already run our upper_triangle function if needed.\n", "\n", "def back_substitution(A, b):\n", "    \"\"\" Function to perform back substitution on the system Ax=b.\n", "    \n", "    Returns the solution x.\n", "    \n", "    Assumes that A is in upper triangular form.\n", "    \"\"\"\n", "    n = np.size(b)\n", "    # Check A is square and its number of rows and columns same as size of the vector b\n", "    rows, cols = np.shape(A)\n", "    assert(rows == cols)\n", "    assert(rows == n)\n", "    # We can/should check that A is upper triangular using np.triu, which extracts the \n", "    # upper triangular part of a matrix - if A is already upper triangular, then\n", "    # it should of course match its own upper-triangular part!\n", "    assert(np.allclose(A, np.triu(A)))\n", "    \n", "    x = np.zeros(n)\n", "    # start at the end (row n-1) and work backwards\n", "    for k in range(............\n", "        ..................\n", "\n", "    return x\n", "\n", "\n", "# This A is the upper triangular matrix carried forward\n", "# from the Python box above, and b the correspondingly updated b vector.\n", "A = ..........\n", "b = ..........\n", "\n", "# print the solution using our codes\n", "x = back_substitution(A, b)\n", "print('Our solution: ', x)\n", "\n", "# Reinitialise A and b !\n", "# remember our functions overwrote them\n", "A = np.array([[2., 3., -4.],\n", "              [6., 8., 2.],\n", "              [4., 8., -6.]])\n", "b = np.array([5., 3., 19.])\n", "\n", "# check our answer against what SciPy gives us by multiplying b by A inverse \n", "print('SciPy solution: ', sl.inv(A) @ b)\n", "\n", "print('Success: ', np.allclose(x, sl.inv(A) @ b))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Gauss-Jordan elimination\n", "\n", "Recall that for the augmented matrix example we did by hand above we continued past the upper-triangular form until the augmented matrix had the identity matrix in the $A$ location. This algorithm is called Gauss-Jordan elimination, but note that it requires more operations than the conversion to upper-triangular form followed by back substitution, and so is mainly of academic interest." ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "### Matrix inversion\n", "\n", "Note that if we were to form the augmented equation with the full identity matrix in the place of the vector $\\pmb{b}$, i.e. $[A|I]$, and performed row operations exactly as above until $A$ is transformed into the identity matrix $I$, then we would be left with the inverse of $A$ in the original $I$ location, i.e.\n", "\n", "$$ [A|I] \\rightarrow [I|A^{-1}] $$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise 5.6: Matrix inversion\n", "\n", "Try updating your code to construct the inverse matrix. Check your answer against the inverse matrix given by a built-in function.\n", "\n", "Hint: Once you have performed your Gaussian elimination to transform $A$ into an upper triangular matrix, perform another elimination \"from bottom to top\" to transform $A$ into a diagonal matrix."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Updated version of the upper_triangular function that\n", "# assumes that a matrix, B, is in the old vector location\n", "# in the augmented system, and applies the same operations to\n", "# B as to A\n", "def upper_triangle2(A, B):\n", " # your code here\n", "\n", " \n", "# Function which transforms the matrix into lower triangular \n", "# form - the point here is that if you give it a \n", "# matrix that is already in upper triangular form, then the \n", "# result will be a diagonal matrix\n", "def lower_triangle2(A, B):\n", " # your code here\n", "\n", " \n", "# Let's redefine A as our matrix above\n", "A = # your code here\n", "# and B is the identity of the corresponding size\n", "B = # your code here\n", "\n", "# transform A into upper triangular form (and perform the same operations on B)\n", "# your code here\n", "\n", "# now make this updated A lower triangular as well (the result should be diagonal)\n", "# your code here\n", "\n", "# final step is just to divide each row through by the value of the diagonal\n", "# to end up with the identity in the place of A\n", "# your code here\n", "\n", "# print final A and B and check solution\n", "# your code here" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise 5.6: Zeros on the diagonal\n", "\n", "You may have noticed above that we have no way of guaranteeing that the $A_{kk}$ we divide through by in the Guassian elimination or back substitution algorithms is non-zero (or not very small which will also lead to computational problems).\n", "Note also that we commented that we are free to exchange two rows in our augmented system - how could you use this fact to build robustness into our algorithms in order to deal with matrices for which our algorithms do lead to very small or zero $A_{kk}$ values? \n", "\n", "See if you can figure out how to do this - more on this next week!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.6" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": false, "sideBar": false, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": false, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 1 }