{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "1ouzTiKEWJbR"
   },
   "source": [
    "# Topic 14: Matrix Inverses and Pseudo Inverses\n",
    "Authors:Evan Chrisney, echrisney@gmail.com (Fall 2018), Trey Henrichsen, treyhenrichsen@gmail.com (Winter 2020)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "CX1H5eeAWWPy"
   },
   "source": [
    "# Introduction\n",
    " \n",
    "Matrix inverses are matrices which when multiplied by their oringinal matrices equate to the identity matrix. Only square matrices are invertible, but rectangular matrices may have left or right inverses. A matrix $\\mathbf{A}$ has a left inverse if there exists a matrix $\\mathbf{B}$ such that $\\mathbf{BA} = I$, and a right inverse if there exists a matrix $\\mathbf{C}$ such that $\\mathbf{AC} = I$ (Moon and Stirling, 2000). Square matrices that are not invertible are called singular matrices. A matrix is invertible if it has both a right and left inverse.\n",
    "\n",
    "Matrix inverses are important because they provide solutions to the ubiquitous equation $\\mathbf{A}x = b$. There are many conditions that satisfy the invertibility of a matrix which will be explained in detail below. If a true matrix inverse does not exist for $\\mathbf{A}$, or if $\\mathbf{A}$ is not square, then the left and right pseudo inverses may be used to solve $\\mathbf{A}x = b$. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Explanation of the theory"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Left and Right Inverses\n",
    "Let $\\mathbf{A}$ be an $m \\times n$ matrix.\n",
    "\n",
    "Matrix $\\mathbf{B}$ is the left inverse of $\\mathbf{A}$ iff $\\mathbf{BA} = I$. The left inverse can only exist when $m \\geq n$, i.e. $\\mathbf{A}$ is either tall or square. Additionally, for $\\mathbf{A}$ to have a left inverse, the columns of $\\mathbf{A}$ must be linearly independant. If $\\mathbf{A}$ has a left inverse, then $\\mathbf{A}x=b$ has at most one solution.\n",
    "\n",
    "Likewise, matrix $\\mathbf{C}$ is the right inverse of matrix $\\mathbf{A}$ iff $\\mathbf{AC} = I$. The right inverse can only exist when $m \\leq n$, i.e. the matrix is either wide or square. Additionally, the columns of $\\mathbf{A}$ must span the whole space. In other words, $\\text{rank}(\\mathbf{A})=m$. If $\\mathbf{A}$ has a right inverse, then $\\mathbf{A}x=b$ has at least one solution.\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Neither left nor right inverses are necessarily unique. For example, take the matrices\n",
    "$$ \\mathbf{A} = \\begin{bmatrix} 1 & 0 & 0 \\\\ 0 & 1 & 0 \\end{bmatrix}$$\n",
    "$$ \\mathbf{C} = \\begin{bmatrix} 1 & 0 \\\\ 0 & 1 \\\\ c_1 & c_2 \\end{bmatrix}$$\n",
    "The product $\\mathbf{AC}$ is\n",
    "$$ \\mathbf{AC} = \\begin{bmatrix} 1 & 0 \\\\ 0 & 1 \\end{bmatrix}$$\n",
    "Thus $\\mathbf{C}$ is a right inverse of $\\mathbf{A}$, for all $c_1$ and $c_2$.\n",
    "\n",
    "Likewise, for the matrix $$ A = \\begin{bmatrix} 1 & 0 \\\\ 0 & 1 \\\\ 0  & 0 \\end{bmatrix}, $$ it can be shown the the left inverse is\n",
    "$$ B = \\begin{bmatrix} 1 & 0 & b_1 \\\\ 0 & 1 & b_2 \\end{bmatrix}$$ for all values $b_1$ and $b_2$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "b9u2BgA5Wi1I"
   },
   "source": [
    "## Matrix Inverses\n",
    "\n",
    "If matrix $\\mathbf{A}$ has both a left and a right inverse, those inverses are the same, and are called the inverse of $\\mathbf{A}$, and $\\mathbf{A}$ is invertible. As can be seen from the requirements for a left inverse ($m \\geq n$) and right inverse($m \\leq n$), only square matrices can be invertible.\n",
    "\n",
    "If $\\mathbf{A}$ has an inverse $\\mathbf{B}$, then:\n",
    "$$ \\mathbf{AB} = \\mathbf{BA} = I $$\n",
    "$\\mathbf{B}$ is unique, and hereby notated as $\\mathbf{A^{-1}}$. \n",
    "\n",
    "Below is a list of properties equivalent to invertibility for square matrices. In other words, all invertible matrices have the following properties, and any $n \\times n$ matrix $\\mathbf{A}$ that has any one of these properties is invertible and has all other properties on this list.\n",
    "\n",
    "1. The null space of $\\mathbf{A}$ is the null vector {0}. \n",
    "2. $\\mathbf{A}$ is full rank, meaning the rank of $\\mathbf{A}$ is n. \n",
    "3. $\\mathbf{A}x = 0$ only for that $x = 0$. \n",
    "4. The rows and columns of $\\mathbf{A}$ are linearly independent.  \n",
    "5. The determinant of $\\mathbf{A}$ is non zero. \n",
    "6. $\\mathbf{A}$ has no zero eigenvalues. \n",
    "7. $\\mathbf{A^HA}$ is positive definite\n",
    "8. $\\mathbf{A}$ is nonsingular. \n",
    "9. $\\mathbf{A}$ has n pivots. \n",
    "10. The transpose of $\\mathbf{A}$, $\\mathbf{A}^T$, is invertible. \n",
    "11. The conjugate transpose of $\\mathbf{A}$, $\\mathbf{A}^H$ is invertible.\n",
    "12. $\\mathbf{A}$ has both a left inversve and a right inverse $\\mathbf{B}$ and $\\mathbf{C}$ such that $\\mathbf{B}$ = $\\mathbf{C}$ = $\\mathbf{A}^{-1}$. \n",
    "13. $(\\mathbf{A}^{-1})^{-1} = \\mathbf{A}$\n",
    "14. $(\\mathbf{A}^{T})^{-1} = (\\mathbf{A}^{-1})^{T}$\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Calculating the Matrix Inverse\n",
    "There are many methods for calculation matrix inverses. These methods vary in complexity and speed. In some situations, such as diagonal matrices, matrix inversion may be simplified. In general, it is best to use existing implementations, such as those in [NumPy](https://numpy.org/doc/stable/reference/generated/numpy.linalg.inv.html), [MatLab](https://www.mathworks.com/help/matlab/ref/inv.html), and [Eigen](https://eigen.tuxfamily.org/dox/group__TutorialLinearAlgebra.html#title3). There may be times where you have to implement matrix inversion yourself, and so two are given below.\n",
    "\n",
    "#### Cramer's Rule\n",
    "The adjugate of $\\mathbf{A}$ can be used to determine $\\mathbf{A}^{-1}$ as\n",
    "$$ \\mathbf{A}^{-1} = \\frac{1}{det(\\mathbf{A})}adj(\\mathbf{A}), $$\n",
    "where $adj$ denotes the adjugate and $det$ denotes the determinant.\n",
    "This method is easy to understand, but can be much slower than other methods, especially for large matrices. In particular, some naive implementations can be particularly slow.\n",
    "\n",
    "#### Diagonalization\n",
    "Invertible matrices can be diagonalized as $\\mathbf{A} = \\mathbf{S \\Lambda S^{-1}}$, where $\\mathbf{\\Lambda}$ is a diagonal matrix of the eigenvalues of $\\mathbf{A}$ and $\\mathbf{S}$ is formed by stacking the column eigenvectors of $\\mathbf{A}$. The inverse of $\\mathbf{A}$ can be calculated as \n",
    "$$\\mathbf{A}^{-1} = \\mathbf{S \\Lambda}^{-1} \\mathbf{S}^{-1}$$\n",
    "See the page on [eigenvalues](t20_eigenvalues.ipynb) for more information on diagonalization.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Calculating Right and Left inverses\n",
    "A non-square (n x m) matrix may have a left or right inverse, as explained above. These inverses may or may not be unique. The simplest way to calculate a right or left inverse is with the Moore-Penrose pseudoinverse.\n",
    "$$ \\mathbf{B}  = (\\mathbf{A}^H\\mathbf{A})^{-1}\\mathbf{A}^H \\quad \\text{(left inverse)}$$\n",
    "$$ \\mathbf{C}  = \\mathbf{A}^H(\\mathbf{A}^H\\mathbf{A})^{-1} \\quad \\text{(right inverse)}$$\n",
    "\n",
    "In general the pseudo inverse of $\\mathbf{A}$ is written as $\\mathbf{A}^{\\dagger}$, whether it is a left or right inverse.\n",
    "Note that the Moore-Penrose pseudo inverse requires a matrix inversion. Matrices which do not have a pseudo inverse will cause this inversion to fail."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "zNIUvElqWwdh"
   },
   "source": [
    "# Simple Numerical Example\n",
    "\n",
    "Here is some simple python code that shows the concepts of matrix inversion for a square matrix, using Cramer's rule, as well as computation of the left and right inverses, which all should be equal for this example. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 168
    },
    "colab_type": "code",
    "id": "OU8eOXxwW9ET",
    "outputId": "47d57def-d903-45f5-949b-de80af39e9d7"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[-0.26666667 -0.6        -0.13333333]\n",
      " [-0.23333333 -0.4        -0.36666667]\n",
      " [-0.13333333  0.2        -0.06666667]]\n",
      "[[-0.26666667 -0.6        -0.13333333]\n",
      " [-0.23333333 -0.4        -0.36666667]\n",
      " [-0.13333333  0.2        -0.06666667]]\n",
      "[[-0.26666667 -0.6        -0.13333333]\n",
      " [-0.23333333 -0.4        -0.36666667]\n",
      " [-0.13333333  0.2        -0.06666667]]\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "\n",
    "# numpy includes its own inverse, but this function drives home the topics above. \n",
    "######################################################\n",
    "# Compute the matrix inverse of a square matrix using the adjugate method\n",
    "\n",
    "\n",
    "# Declare a matrix that we know is invertible\n",
    "A = np.matrix([[-3, 2, -5], [-1, 0, 2], [3, -4, 1]])\n",
    "\n",
    "# get the determinant\n",
    "m = np.linalg.det(A)\n",
    "\n",
    "# Grab length of A which we know is nxn\n",
    "l = len(A) \n",
    "\n",
    "# Initialize the adjugate matrix to 0's\n",
    "C = np.zeros((l,l))\n",
    "\n",
    "# Loop through A and compute the adjugate\n",
    "for i in range(l):\n",
    "        for j in range(l):\n",
    "                # Single out the rows/cols to compute minor determinants\n",
    "                #\n",
    "                # I realize i and j are swapped, this saves the step of\n",
    "                # transposing C later. \n",
    "                temp = np.delete(A, (j), axis = 0)\n",
    "                temp = np.delete(temp, (i), axis = 1)\n",
    "                # Compute minor determinant\n",
    "                M = np.linalg.det(temp)\n",
    "                # Compute the adjugate\n",
    "                C[i][j] = (-1)**(i+j)*M\n",
    "\n",
    "# Finish by computing the adjugate\n",
    "A_inv = 1/m*C\n",
    "# Print out matrix inverse\n",
    "print(A_inv)\n",
    "\n",
    "#####################################################\n",
    "# Now find the left inverse, which should be the same\n",
    "A_tran = A.getH()\n",
    "Grammian_l = np.linalg.inv(A_tran*A)\n",
    "A_dagger_l = Grammian_l*A_tran\n",
    "print(A_dagger_l)\n",
    "\n",
    "\n",
    "#####################################################\n",
    "# Now find the right inverse which is the same\n",
    "Grammian_r = np.linalg.inv(A*A_tran)\n",
    "A_dagger_r = A_tran*Grammian_r\n",
    "print(A_dagger_r)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Engineering Applications"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "wPb-J4i3XNXo"
   },
   "source": [
    "## Remote Sensing\n",
    "\n",
    "As written above, the matrix inverses provide a solution to the ubiquitous equation $\\mathbf{A}x = b$. An engineering example used in my research quite frequently is using least squares (LS) to fit a line to some data, where we'll solve for $s$ in $\\mathbf{A}s = x$ using a left pseudo inverse. \n",
    "\n",
    "As a brief introduction to this application, I will fit normalized radar cross section (RCS) $\\sigma^0$ data vs incidence angle from a satellite radar at C band know as the advanced scatteromer, or ASCAT. ASCAT is a fan beam scatterometer that observes the earth surface at a variable range of incidence angles, from about 30 to 60 degrees. The purpose of this application is to determine what the $\\sigma^0$ incidence angle dependence ($s$) is at C band in order to normalize ASCAT measurements to one incidence angle for cross calibration purposes.  \n",
    "\n",
    "Previous studies have shown that $\\sigma^0$ exhibits a log-linear dependence with incidence angle over tropical rainforests over the mid incidence angle range from about 30 to 60 degrees incidence (N. Madsen, BYU Masters Thesis, 2015). Due to the log-linear dependence, the dependence is easily estimated as the slope $s$ of the first order polynomial \n",
    "\n",
    "$s_1\\theta + s_2 = \\sigma^0, $\n",
    "\n",
    "where $s_1$ and $s_2$ are the coefficients of $s$ and $s_1$ is the slope which we're solving for, $\\theta$ is the ASCAT incidence angle data, and $\\sigma^0$ is the ASCAT RCS data. In matrix form, this equation is\n",
    "\n",
    "$ [\\mathbf{\\theta}, \\mathbf{1}] [s_1, s_2]^{T} = \\sigma^0$,\n",
    "\n",
    "which easily seen of the form $\\mathbf{A}s = x$. \n",
    "\n",
    "To determine the ASCAT $\\sigma^0$ incidence angle dependence, we create the LS equation\n",
    "\n",
    "$ s = \\mathbf{A}^{\\dagger}\\mathbf{x}, $\n",
    "\n",
    "where $s$ is the dB/degree dependence, $\\mathbf{A}^{\\dagger}$ is the left psuedo inverse of the ASCAT incidence angle data, and $x$ is $\\sigma^0$ RCS data. Again, only the first coefficient of $s$ is of interest, as it gives the slope of the line.\n",
    "\n",
    "In our application, we will only use a small portion of the actual ASCAT data, since there are millions of measurements in a given day. To simplify this, I truncated millions of measurements for a day into 21 measurements to give a feel for what the approximate incidence angle dependence $s$ is. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 50
    },
    "colab_type": "code",
    "id": "FkE9JDD1Xc_B",
    "outputId": "8fcce972-a4bc-4420-81da-29c86ac366e1"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[-0.07582702]\n",
      " [-5.07314663]]\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "\n",
    "##############################################################################################\n",
    "# Evan Chrisney\n",
    "# 671 Application code for Pseudo Inverses\n",
    "# This script will use a left inverse to solve a 1st order poylnomial equation as described above\n",
    "\n",
    "# Create A matrix as described above using actual ASCAT incdience angle values\n",
    "A = np.matrix([[59, 48.63, 50.14, 54.22, 39.23, 52.36, 52.6, 44.27, 55.07, 59.06\\\n",
    ",58.32, 58.5, 37.49, 45.61, 52.13, 42.04, 54.74, 58.53, 38.19, 45.74, 59.31], \\\n",
    "[1,1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])\n",
    "\n",
    "# Create sigma_0 row vector using actual ASCAT RCS values\n",
    "sigma_0 = np.array([-9.5103, -8.2828, -8.0002, -8.2415, -8.4277, -9.0573, \\\n",
    "-8.6784, -8.5909, -9.0481, -9.6975, -8.8720, -9.7875, -8.3998, -10.0404, \\\n",
    "-10.0735, -6.8197, -9.2742, -9.5769, -7.5610, -8.9871, -10.3787])\n",
    "\n",
    "# Get length of sigma_0 for use in reshape the row vector to a column vec\n",
    "l = len(sigma_0)\n",
    "\n",
    "# Transpose A into a 21x2 matrix instead of 2x21\n",
    "A = A.transpose()\n",
    "# Transpose sigma_0 into a column vector\n",
    "sigma_0 = np.reshape(sigma_0, (l, 1))\n",
    "\n",
    "# Get A hermitian, although transpose would also work since these are reals\n",
    "A_tran = A.getH()\n",
    "# Create the left inverse using the Grammian as above\n",
    "Grammian_l = np.linalg.inv(A_tran*A)\n",
    "# Compute the left inverse of A as above\n",
    "A_dagger_l = Grammian_l*A_tran\n",
    "# Solve for s using the left inverse\n",
    "s = A_dagger_l*sigma_0\n",
    "# Print out s, where the first coefficient is the approximate sigma^0 incidence angle dependence for ASCAT, \n",
    "# which is -0.0758 dB/degree \n",
    "print(s)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Robotic Arm Inverse Kinematics\n",
    "\n",
    "When using a robotic arm, (known more formally as a serial link manipulator), it is often necesary to find a set of joint angles that put the end manipulator in the correct pose (position and orientation). The process of finding the joint angles to reach a given pose is known as inverse kinematics. Arms with many joints are often too complex for an analytical solution in every case, and so a numerical solution can be computed using the pseudo inverse. This method is a form of gradient descent optimization. For more on gradient descent, see [Gradient Descent](t27_gradient_descent.ipynb).\n",
    "\n",
    "For this application, the vector $q$ contains the angles of each joint in the robot. $J(q)$ is the jacobian of the robot in a given position, such that $J(q) \\dot{q} = v$, where $v$ is a 6-vector containing the linear and angular velocities of the end effector. The calculation of $J(q)$ is beyond the scope of this work, but is solved problem.\n",
    "\n",
    "The general method for inverse kinematics is as such:\n",
    "1. Set $q$ to some initial values\n",
    "2. Calculate the error $e$ between the end effector pose and the desired pose with joint angles $q$\n",
    "3. Update the joint angles: $q_{n+1} = q_n + \\alpha J(q)^\\dagger e$, where $\\alpha$ is a learning rate.\n",
    "4. Repeat steps 2 & 3 until the error $e$ is sufficiently small\n",
    "\n",
    "Using the pseudo inverse works well because it is flexible enough to work with underactuated arms. A problem with this direct approach is that $J(q)$ is not guaranteed to meet the requirements to have a pseudo inverse, often in configurations such as the arm reaching the edge of its workspace, or in situations similar to gimbal lock. To prevent this, a damped pseudo inverse is often used, $(J(q)^T J(q) + \\lambda I)^{-1}J(q)^T$, where $\\lambda$ is a damping coefficient. This change guarantees that a pseudo inverse exists."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "colab": {
   "collapsed_sections": [],
   "name": "Untitled0.ipynb",
   "provenance": [],
   "version": "0.3.2"
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}