{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Matrix\n", "\n", "> Marcos Duarte \n", "> Laboratory of Biomechanics and Motor Control ([http://demotu.org/](http://demotu.org/)) \n", "> Federal University of ABC, Brazil" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A matrix is a square or rectangular array of numbers or symbols (termed elements), arranged in rows and columns. For instance:\n", "\n", "$$ \n", "\\mathbf{A} = \n", "\\begin{bmatrix} \n", "a_{1,1} & a_{1,2} & a_{1,3} \\\\\n", "a_{2,1} & a_{2,2} & a_{2,3} \n", "\\end{bmatrix}\n", "$$\n", "\n", "$$ \n", "\\mathbf{A} = \n", "\\begin{bmatrix} \n", "1 & 2 & 3 \\\\\n", "4 & 5 & 6 \n", "\\end{bmatrix}\n", "$$\n", "\n", "The matrix $\\mathbf{A}$ above has two rows and three columns, it is a 2x3 matrix.\n", "\n", "In Numpy:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2017-12-30T09:04:21.861732Z", "start_time": "2017-12-30T09:04:21.790233Z" }, "collapsed": true }, "outputs": [], "source": [ "# Import the necessary libraries\n", "import numpy as np\n", "from IPython.display import display\n", "np.set_printoptions(precision=4) # number of digits of precision for floating point" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2017-12-30T09:04:21.880788Z", "start_time": "2017-12-30T09:04:21.863856Z" } }, "outputs": [ { "data": { "text/plain": [ "array([[1, 2, 3],\n", " [4, 5, 6]])" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A = np.array([[1, 2, 3], [4, 5, 6]])\n", "A" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To get information about the number of elements and the structure of the matrix (in fact, a Numpy array), we can use:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "ExecuteTime": { "end_time": "2017-12-30T09:04:21.888230Z", "start_time": "2017-12-30T09:04:21.884175Z" } }, "outputs": [ { 
"name": "stdout", "output_type": "stream", "text": [ "A:\n", " [[1 2 3]\n", " [4 5 6]]\n", "len(A) = 2\n", "np.size(A) = 6\n", "np.shape(A) = (2, 3)\n", "np.ndim(A) = 2\n" ] } ], "source": [ "print('A:\\n', A)\n", "print('len(A) = ', len(A))\n", "print('np.size(A) = ', np.size(A))\n", "print('np.shape(A) = ', np.shape(A))\n", "print('np.ndim(A) = ', np.ndim(A))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We could also have accessed this information with the correspondent methods:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "ExecuteTime": { "end_time": "2017-12-30T09:04:21.895794Z", "start_time": "2017-12-30T09:04:21.889423Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "A.size = 6\n", "A.shape = (2, 3)\n", "A.ndim = 2\n" ] } ], "source": [ "print('A.size = ', A.size)\n", "print('A.shape = ', A.shape)\n", "print('A.ndim = ', A.ndim)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We used the array function in Numpy to represent a matrix. A [Numpy array is in fact different than a matrix](http://www.scipy.org/NumPy_for_Matlab_Users), if we want to use explicit matrices in Numpy, we have to use the function `mat`:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "ExecuteTime": { "end_time": "2017-12-30T09:04:21.904229Z", "start_time": "2017-12-30T09:04:21.897071Z" } }, "outputs": [ { "data": { "text/plain": [ "matrix([[1, 2, 3],\n", " [4, 5, 6]])" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "B = np.mat([[1, 2, 3], [4, 5, 6]])\n", "B" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Both array and matrix types work in Numpy, but you should choose only one type and not mix them; the array is preferred because it is [the standard vector/matrix/tensor type of Numpy](http://www.scipy.org/NumPy_for_Matlab_Users). So, let's use the array type for the rest of this text." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Addition and multiplication\n", "\n", "The sum of two m-by-n matrices $\\mathbf{A}$ and $\\mathbf{B}$ is another m-by-n matrix:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$ \n", "\\mathbf{A} = \n", "\\begin{bmatrix} \n", "a_{1,1} & a_{1,2} & a_{1,3} \\\\\n", "a_{2,1} & a_{2,2} & a_{2,3} \n", "\\end{bmatrix}\n", "\\;\\;\\; \\text{and} \\;\\;\\;\n", "\\mathbf{B} =\n", "\\begin{bmatrix} \n", "b_{1,1} & b_{1,2} & b_{1,3} \\\\\n", "b_{2,1} & b_{2,2} & b_{2,3} \n", "\\end{bmatrix}\n", "$$\n", "\n", "$$\n", "\\mathbf{A} + \\mathbf{B} = \n", "\\begin{bmatrix} \n", "a_{1,1}+b_{1,1} & a_{1,2}+b_{1,2} & a_{1,3}+b_{1,3} \\\\\n", "a_{2,1}+b_{2,1} & a_{2,2}+b_{2,2} & a_{2,3}+b_{2,3} \n", "\\end{bmatrix}\n", "$$\n", "\n", "In Numpy:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "ExecuteTime": { "end_time": "2017-12-30T09:04:21.912708Z", "start_time": "2017-12-30T09:04:21.905887Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "A:\n", " [[1 2 3]\n", " [4 5 6]]\n", "B:\n", " [[ 7 8 9]\n", " [10 11 12]]\n", "A + B:\n", " [[ 8 10 12]\n", " [14 16 18]]\n" ] } ], "source": [ "A = np.array([[1, 2, 3], [4, 5, 6]])\n", "B = np.array([[7, 8, 9], [10, 11, 12]])\n", "print('A:\\n', A)\n", "print('B:\\n', B)\n", "print('A + B:\\n', A+B);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The multiplication of the m-by-n matrix $\\mathbf{A}$ by the n-by-p matrix $\\mathbf{B}$ is a m-by-p matrix:\n", "\n", "$$ \n", "\\mathbf{A} = \n", "\\begin{bmatrix} \n", "a_{1,1} & a_{1,2} \\\\\n", "a_{2,1} & a_{2,2} \n", "\\end{bmatrix}\n", "\\;\\;\\; \\text{and} \\;\\;\\;\n", "\\mathbf{B} =\n", "\\begin{bmatrix} \n", "b_{1,1} & b_{1,2} & b_{1,3} \\\\\n", "b_{2,1} & b_{2,2} & b_{2,3} \n", "\\end{bmatrix}\n", "$$\n", "\n", "$$\n", "\\mathbf{A} \\mathbf{B} = \n", "\\begin{bmatrix} \n", "a_{1,1}b_{1,1} + a_{1,2}b_{2,1} & a_{1,1}b_{1,2} + a_{1,2}b_{2,2} & a_{1,1}b_{1,3} + 
a_{1,2}b_{2,3} \\\\\n", "a_{2,1}b_{1,1} + a_{2,2}b_{2,1} & a_{2,1}b_{1,2} + a_{2,2}b_{2,2} & a_{2,1}b_{1,3} + a_{2,2}b_{2,3}\n", "\\end{bmatrix}\n", "$$\n", "\n", "In Numpy:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "ExecuteTime": { "end_time": "2017-12-30T09:04:21.920526Z", "start_time": "2017-12-30T09:04:21.914053Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "A:\n", " [[1 2]\n", " [3 4]]\n", "B:\n", " [[ 5 6 7]\n", " [ 8 9 10]]\n", "A x B:\n", " [[21 24 27]\n", " [47 54 61]]\n" ] } ], "source": [ "A = np.array([[1, 2], [3, 4]])\n", "B = np.array([[5, 6, 7], [8, 9, 10]])\n", "print('A:\\n', A)\n", "print('B:\\n', B)\n", "print('A x B:\\n', np.dot(A, B));" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that because the array type is not truly a matrix type, we used the dot product to calculate matrix multiplication. \n", "We can use the matrix type to show the equivalent:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "ExecuteTime": { "end_time": "2017-12-30T09:04:21.928349Z", "start_time": "2017-12-30T09:04:21.921989Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "A:\n", " [[1 2]\n", " [3 4]]\n", "B:\n", " [[ 5 6 7]\n", " [ 8 9 10]]\n", "A x B:\n", " [[21 24 27]\n", " [47 54 61]]\n" ] } ], "source": [ "A = np.mat(A)\n", "B = np.mat(B)\n", "print('A:\\n', A)\n", "print('B:\\n', B)\n", "print('A x B:\\n', A*B);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Same result as before.\n", "\n", "The order in multiplication matters, $\\mathbf{AB} \\neq \\mathbf{BA}$:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "ExecuteTime": { "end_time": "2017-12-30T09:04:21.936387Z", "start_time": "2017-12-30T09:04:21.929914Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "A:\n", " [[1 2]\n", " [3 4]]\n", "B:\n", " [[5 6]\n", " [7 8]]\n", "A x B:\n", " [[19 22]\n", " [43 50]]\n", "B x A:\n", " [[23 34]\n", " 
[31 46]]\n" ] } ], "source": [ "A = np.array([[1, 2], [3, 4]])\n", "B = np.array([[5, 6], [7, 8]])\n", "print('A:\\n', A)\n", "print('B:\\n', B)\n", "print('A x B:\\n', np.dot(A, B))\n", "print('B x A:\\n', np.dot(B, A));" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The addition or multiplication of a scalar (a single number) to a matrix is performed over all the elements of the matrix:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "ExecuteTime": { "end_time": "2017-12-30T09:04:21.950456Z", "start_time": "2017-12-30T09:04:21.937986Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "A:\n", " [[1 2]\n", " [3 4]]\n", "c:\n", " 10\n", "c + A:\n", " [[11 12]\n", " [13 14]]\n", "cA:\n", " [[10 20]\n", " [30 40]]\n" ] } ], "source": [ "A = np.array([[1, 2], [3, 4]])\n", "c = 10\n", "print('A:\\n', A)\n", "print('c:\\n', c)\n", "print('c + A:\\n', c+A)\n", "print('cA:\\n', c*A);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Transposition\n", "\n", "The transpose of the matrix $\\mathbf{A}$ is the matrix $\\mathbf{A^T}$ turning all the rows of matrix $\\mathbf{A}$ into columns (or columns into rows):\n", "\n", "$$ \n", "\\mathbf{A} = \n", "\\begin{bmatrix} \n", "a & b & c \\\\\n", "d & e & f \\end{bmatrix}\n", "\\;\\;\\;\\;\\;\\;\\iff\\;\\;\\;\\;\\;\\;\n", "\\mathbf{A^T} = \n", "\\begin{bmatrix} \n", "a & d \\\\\n", "b & e \\\\\n", "c & f\n", "\\end{bmatrix} $$\n", "\n", "In NumPy, the transpose operator can be used as a method or function:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "ExecuteTime": { "end_time": "2017-12-30T09:04:21.958825Z", "start_time": "2017-12-30T09:04:21.952372Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "A:\n", " [[1 2]\n", " [3 4]]\n", "A.T:\n", " [[1 3]\n", " [2 4]]\n", "np.transpose(A):\n", " [[1 3]\n", " [2 4]]\n" ] } ], "source": [ "A = np.array([[1, 2], [3, 4]])\n", "print('A:\\n', A)\n", "print('A.T:\\n', 
A.T)\n", "print('np.transpose(A):\\n', np.transpose(A));" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Determinant\n", "\n", "The determinant is a number associated with a square matrix.\n", "\n", "The determinant of the following matrix: \n", "\n", "$$ \\left[ \\begin{array}{ccc}\n", "a & b & c \\\\\n", "d & e & f \\\\\n", "g & h & i \\end{array} \\right] $$\n", "\n", "is written as:\n", "\n", "$$ \\left| \\begin{array}{ccc}\n", "a & b & c \\\\\n", "d & e & f \\\\\n", "g & h & i \\end{array} \\right| $$\n", "\n", "and has the value:\n", "\n", "$$ (aei + bfg + cdh) - (ceg + bdi + afh) $$\n", "\n", "One way to manually calculate the determinant of a 3x3 matrix is to use the [rule of Sarrus](http://en.wikipedia.org/wiki/Rule_of_Sarrus): we repeat the first two columns (all columns but the last one) on the right side of the matrix and calculate the sum of the products along the three north-west to south-east diagonals, minus the sum of the products along the three south-west to north-east diagonals, as illustrated in the following figure: \n", "
\n", "
Figure. Rule of Sarrus: the sum of the products of the solid diagonals minus the sum of the products of the dashed diagonals (image from Wikipedia).
\n", "\n", "In Numpy, the determinant is computed with the `linalg.det` function:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "ExecuteTime": { "end_time": "2017-12-30T09:04:21.968686Z", "start_time": "2017-12-30T09:04:21.960863Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "A:\n", " [[1 2]\n", " [3 4]]\n" ] } ], "source": [ "A = np.array([[1, 2], [3, 4]])\n", "print('A:\\n', A);" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "ExecuteTime": { "end_time": "2017-12-30T09:04:21.977200Z", "start_time": "2017-12-30T09:04:21.970304Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Determinant of A:\n", " -2.0\n" ] } ], "source": [ "print('Determinant of A:\\n', np.linalg.det(A))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Identity\n", "\n", "The identity matrix $\\mathbf{I}$ is a matrix with ones in the main diagonal and zeros otherwise. The 3x3 identity matrix is: \n", "\n", "$$ \\mathbf{I} = \n", "\\begin{bmatrix} \n", "1 & 0 & 0 \\\\\n", "0 & 1 & 0 \\\\\n", "0 & 0 & 1 \\end{bmatrix} $$\n", "\n", "In Numpy, instead of manually creating this matrix we can use the function `eye`:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "ExecuteTime": { "end_time": "2017-12-30T09:04:21.986815Z", "start_time": "2017-12-30T09:04:21.979026Z" } }, "outputs": [ { "data": { "text/plain": [ "array([[ 1., 0., 0.],\n", " [ 0., 1., 0.],\n", " [ 0., 0., 1.]])" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.eye(3) # identity 3x3 array" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Inverse\n", "\n", "The inverse of the matrix $\\mathbf{A}$ is the matrix $\\mathbf{A^{-1}}$ such that the product between these two matrices is the identity matrix:\n", "\n", "$$ \\mathbf{A}\\cdot\\mathbf{A^{-1}} = \\mathbf{I} $$\n", "\n", "The calculation of the inverse of a matrix is usually not simple (the inverse 
of the matrix $\\mathbf{A}$ is not $1/\\mathbf{A}$; there is no division operation between matrices). The Numpy function `linalg.inv` computes the inverse of a square matrix: \n", "\n", "    numpy.linalg.inv(a)\n", "    Compute the (multiplicative) inverse of a matrix.\n", "    Given a square matrix a, return the matrix ainv satisfying dot(a, ainv) = dot(ainv, a) = eye(a.shape[0])." ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "ExecuteTime": { "end_time": "2017-12-30T09:04:21.995227Z", "start_time": "2017-12-30T09:04:21.988785Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "A:\n", " [[1 2]\n", " [3 4]]\n", "Inverse of A:\n", " [[-2. 1. ]\n", " [ 1.5 -0.5]]\n" ] } ], "source": [ "A = np.array([[1, 2], [3, 4]])\n", "print('A:\\n', A)\n", "Ainv = np.linalg.inv(A)\n", "print('Inverse of A:\\n', Ainv);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Pseudo-inverse\n", "\n", "For a non-square matrix, the inverse is not defined. However, we can calculate what is known as the pseudo-inverse. \n", "Consider a non-square matrix, $\\mathbf{A}$. To obtain its pseudo-inverse, note that the following manipulation results in the identity matrix:\n", "\n", "$$ \\mathbf{A} \\mathbf{A}^T (\\mathbf{A}\\mathbf{A}^T)^{-1} = \\mathbf{I} $$\n", "\n", "The product $\\mathbf{A} \\mathbf{A}^T$ is a square matrix and is invertible (also called [nonsingular](https://en.wikipedia.org/wiki/Invertible_matrix)) if the rows of $\\mathbf{A}$ are [linearly independent](https://en.wikipedia.org/wiki/Linear_independence). 
\n", "In that case, the matrix $\\mathbf{A}^T(\\mathbf{A}\\mathbf{A}^T)^{-1}$ is known as the [generalized inverse or Moore–Penrose pseudoinverse](https://en.wikipedia.org/wiki/Moore%E2%80%93Penrose_pseudoinverse) of the matrix $\\mathbf{A}$, a generalization of the inverse matrix.\n", "\n", "A naive approach to compute the Moore–Penrose pseudoinverse in Python is:\n", "```python\n", "from numpy.linalg import inv\n", "Apinv = A.T @ inv(A @ A.T)\n", "```\n", "But both Numpy and Scipy have functions to calculate the pseudoinverse, which might give greater numerical stability (but read [Inverses and pseudoinverses. Numerical issues, speed, symmetry](http://vene.ro/blog/inverses-pseudoinverses-numerical-issues-speed-symmetry.html)). Of note, [numpy.linalg.pinv](http://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.pinv.html) calculates the pseudoinverse of a matrix using its singular-value decomposition (SVD) and including all large singular values (using the [LAPACK (Linear Algebra Package)](https://en.wikipedia.org/wiki/LAPACK) routine gesdd), whereas [scipy.linalg.pinv](http://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.pinv.html#scipy.linalg.pinv) originally calculated the pseudoinverse using a least-squares solver (the LAPACK routine gelsd) and [scipy.linalg.pinv2](http://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.pinv2.html) used SVD (also via the LAPACK routine gesdd). Since SciPy 1.7, `scipy.linalg.pinv` is itself SVD-based and `pinv2` is deprecated (it was removed in SciPy 1.9). \n", "\n", "For example:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "ExecuteTime": { "end_time": "2017-12-30T09:04:22.022917Z", "start_time": "2017-12-30T09:04:21.996885Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Matrix A:\n", " [[1 0 0]\n", " [0 1 0]]\n", "Pseudo-inverse of A:\n", " [[ 1. 0.]\n", " [ 0. 1.]\n", " [ 0. 0.]]\n", "A x Apinv:\n", " [[ 1. 0.]\n", " [ 0. 
1.]]\n" ] } ], "source": [ "# scipy.linalg.pinv2 was removed in SciPy 1.9; scipy.linalg.pinv is also SVD-based\n", "from scipy.linalg import pinv\n", "\n", "A = np.array([[1, 0, 0], [0, 1, 0]])\n", "Apinv = pinv(A)\n", "print('Matrix A:\\n', A)\n", "print('Pseudo-inverse of A:\\n', Apinv)\n", "print('A x Apinv:\\n', A@Apinv)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Orthogonality\n", "\n", "A square matrix is said to be orthogonal if:\n", "\n", "1. Its columns (and also its rows) are mutually perpendicular, i.e., the dot product between any two different columns (or rows) is zero. \n", "2. Its columns (and also its rows) are unit vectors (versors), so that they form an orthonormal basis.\n", "\n", "As a consequence:\n", "\n", "1. Its determinant is equal to 1 or -1.\n", "2. Its inverse is equal to its transpose.\n", "\n", "However, keep in mind that not all matrices with determinant equal to one are orthogonal. For example, the matrix:\n", "\n", "$$ \\begin{bmatrix}\n", "3 & 2 \\\\\n", "4 & 3 \n", "\\end{bmatrix} $$\n", "\n", "has determinant equal to one but is not orthogonal (its columns and rows do not have norm equal to one)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Linear equations\n", "\n", "> A linear equation is an algebraic equation in which each term is either a constant or the product of a constant and (the first power of) a single variable ([Wikipedia](http://en.wikipedia.org/wiki/Linear_equation)).\n", "\n", "We are interested in solving a set of linear equations where two or more variables are unknown, for instance:\n", "\n", "$$ x + 2y = 4 $$\n", "\n", "$$ 3x + 4y = 10 $$\n", "\n", "Let's see how to employ the matrix formalism to solve these equations (even though we already know that the solution is `x=2` and `y=1`). 
\n", "Let's express this set of equations in matrix form:\n", "\n", "$$ \n", "\\begin{bmatrix} \n", "1 & 2 \\\\\n", "3 & 4 \\end{bmatrix}\n", "\\begin{bmatrix} \n", "x \\\\\n", "y \\end{bmatrix}\n", "= \\begin{bmatrix} \n", "4 \\\\\n", "10 \\end{bmatrix}\n", "$$\n", "\n", "And for the general case:\n", "\n", "$$ \\mathbf{Av} = \\mathbf{c} $$\n", "\n", "where $\\mathbf{A, v, c}$ are the matrices above and we want to find the values `x` and `y` that form the column matrix $\\mathbf{v}$. \n", "Because there is no division of matrices, we can use the inverse of $\\mathbf{A}$ to solve for $\\mathbf{v}$:\n", "\n", "$$ \\mathbf{A}^{-1}\\mathbf{Av} = \\mathbf{A}^{-1}\\mathbf{c} \\implies $$\n", "\n", "$$ \\mathbf{v} = \\mathbf{A}^{-1}\\mathbf{c} $$\n", "\n", "As we know how to compute the inverse of $\\mathbf{A}$, the solution is:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "ExecuteTime": { "end_time": "2017-12-30T09:04:22.029712Z", "start_time": "2017-12-30T09:04:22.024546Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "v:\n", " [ 2. 1.]\n" ] } ], "source": [ "A = np.array([[1, 2], [3, 4]])\n", "Ainv = np.linalg.inv(A)\n", "c = np.array([4, 10])\n", "v = np.dot(Ainv, c)\n", "print('v:\\n', v)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is the solution we expected.\n", "\n", "However, explicitly computing the inverse of a matrix to solve equations is computationally inefficient and usually less accurate numerically than solving the system directly. \n", "Instead, we should use `linalg.solve` for a determined system (same number of equations and unknowns) or `linalg.lstsq` otherwise. \n", "From the help for `solve`: \n", "\n", "    numpy.linalg.solve(a, b)\n", "    Solve a linear matrix equation, or system of linear scalar equations.\n", "    Computes the “exact” solution, x, of the well-determined, i.e., full rank, linear matrix equation ax = b." 
] }, { "cell_type": "code", "execution_count": 18, "metadata": { "ExecuteTime": { "end_time": "2017-12-30T09:04:22.037442Z", "start_time": "2017-12-30T09:04:22.031280Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Using solve:\n", "v:\n", " [ 2. 1.]\n" ] } ], "source": [ "v = np.linalg.solve(A, c)\n", "print('Using solve:')\n", "print('v:\\n', v)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And from the help for `lstsq`:\n", "\n", "    numpy.linalg.lstsq(a, b, rcond=-1)\n", "    Return the least-squares solution to a linear matrix equation.\n", "    Solves the equation a x = b by computing a vector x that minimizes the Euclidean 2-norm || b - a x ||^2. The equation may be under-, well-, or over- determined (i.e., the number of linearly independent rows of a can be less than, equal to, or greater than its number of linearly independent columns). If a is square and of full rank, then x (but for round-off error) is the “exact” solution of the equation." ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "ExecuteTime": { "end_time": "2017-12-30T09:04:22.045250Z", "start_time": "2017-12-30T09:04:22.038964Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Using lstsq:\n", "v:\n", " [ 2. 1.]\n" ] } ], "source": [ "v = np.linalg.lstsq(A, c)[0]\n", "print('Using lstsq:')\n", "print('v:\\n', v)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Same solutions, of course.\n", "\n", "When a system of equations has a unique solution, the determinant of the **square** matrix associated with this system of equations is nonzero. 
\n", "When the determinant is zero, there are either no solutions or infinitely many solutions to the system of equations.\n", "\n", "Now consider an overdetermined system (more equations than unknowns):\n", "\n", "$$ x + 2y = 4 $$\n", "\n", "$$ 3x + 4y = 10 $$\n", "\n", "$$ 5x + 6y = 15 $$\n", "\n", "(Note that this set of equations has no exact solution: `x=2` and `y=1` satisfy the first two equations, but then the left side of the last equation equals 16, not 15.)\n", "\n", "Let's try to solve it:" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "ExecuteTime": { "end_time": "2017-12-30T09:04:22.053792Z", "start_time": "2017-12-30T09:04:22.046928Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "A:\n", " [[1 2]\n", " [3 4]\n", " [5 6]]\n", "c:\n", " [ 4 10 15]\n" ] } ], "source": [ "A = np.array([[1, 2], [3, 4], [5, 6]])\n", "print('A:\\n', A)\n", "c = np.array([4, 10, 15])\n", "print('c:\\n', c);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Because the matrix $\\mathbf{A}$ is not square, we can calculate its pseudo-inverse or use the function `linalg.lstsq`:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "ExecuteTime": { "end_time": "2017-12-30T09:04:22.062397Z", "start_time": "2017-12-30T09:04:22.055640Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Using lstsq:\n", "v:\n", " [ 1.3333 1.4167]\n" ] } ], "source": [ "v = np.linalg.lstsq(A, c)[0]\n", "print('Using lstsq:')\n", "print('v:\\n', v)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The functions `inv` and `solve` would fail here because the matrix $\\mathbf{A}$ is not square (overdetermined system). The function `lstsq` not only handles an overdetermined system but also finds the best approximate solution in the least-squares sense.\n", "\n", "And if the set of equations were underdetermined (fewer equations than unknowns), `lstsq` would also work. 
For instance, consider the system:\n", "\n", "$$ x + 2y + 2z = 10 $$\n", "\n", "$$ 3x + 4y + z = 13 $$\n", "\n", "And in matrix form:\n", "\n", "$$ \n", "\\begin{bmatrix} \n", "1 & 2 & 2 \\\\\n", "3 & 4 & 1 \\end{bmatrix}\n", "\\begin{bmatrix} \n", "x \\\\\n", "y \\\\\n", "z \\end{bmatrix}\n", "= \\begin{bmatrix} \n", "10 \\\\\n", "13 \\end{bmatrix}\n", "$$\n", "\n", "A possible solution would be `x=2,y=1,z=3`, but other values would also satisfy this set of equations.\n", "\n", "Let's try to solve using `lstsq`:" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "ExecuteTime": { "end_time": "2017-12-30T09:04:22.070100Z", "start_time": "2017-12-30T09:04:22.064266Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "A:\n", " [[1 2 2]\n", " [3 4 1]]\n", "c:\n", " [10 13]\n" ] } ], "source": [ "A = np.array([[1, 2, 2], [3, 4, 1]])\n", "print('A:\\n', A)\n", "c = np.array([10, 13])\n", "print('c:\\n', c);" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "ExecuteTime": { "end_time": "2017-12-30T09:04:22.085271Z", "start_time": "2017-12-30T09:04:22.071761Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Using lstsq:\n", "v:\n", " [ 0.8 2. 2.6]\n" ] } ], "source": [ "v = np.linalg.lstsq(A, c)[0]\n", "print('Using lstsq:')\n", "print('v:\\n', v);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This solution satisfies both equations exactly. As explained in the help of `lstsq`, the solution `v` minimizes the squared Euclidean norm $|| \\mathbf{c - A v} ||^2$, which is zero here; among the infinitely many vectors that do so, `lstsq` returns the one with the smallest norm $|| \\mathbf{v} ||$." 
] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" }, "varInspector": { "cols": { "lenName": 16, "lenType": 16, "lenVar": 40 }, "kernels_config": { "python": { "delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())" }, "r": { "delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) " } }, "types_to_exclude": [ "module", "function", "builtin_function_or_method", "instance", "_Feature" ], "window_display": false }, "widgets": { "state": {}, "version": "1.1.2" } }, "nbformat": 4, "nbformat_minor": 1 }