{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "### Singular Value Decomposition\n", "**Learning Objectives: using external libraries, matrix manipulation, SVD.**\n", "\n", "Singular value decomposition is a method for finding the pseudo-inverse of non-square and singular matrices. It provides a straightforward way of performing least-squares parameter fitting when the model is linear in its parameters. A useful introduction can be found in _Numerical Recipes_ (W.H Press et al.)\n", "\n", "**Theorem**:\n", "The NxM matrix X can be decomposed as\n", "$$\\mathsf{X} = \\mathsf{UsV},$$\n", "where $\\mathsf{U}$ is NxM and column orthogonal, $\\mathsf{s}$ is MxM and diagonal with non-negative entries and $\\mathsf{V}$ is MxM and row orthogonal.\n", "The pseudo-inverse of $\\mathsf{X}$ is then\n", "$$\\mathsf{X^{-1}} = \\mathsf{V^Ts^{-1}U^T},$$\n", "with the caveat that the entries of $\\mathsf{s^{-1}}$ are 0 if the corresponding value in s is 0.\n" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Singular values = [ 2.54624074e+01 1.29066168e+00 2.35731836e-15]\n" ] } ], "source": [ "import numpy as np\n", "import copy\n", "\n", "#create a matrix, X\n", "X = np.array([[1.0,2.0,3.0],[4.0,5.0,6.0],[7.0,8.0,9.0],[10.0,11.0,12.0]])\n", "\n", "def pseudo_inverse(X):\n", " #Perform SVD, storing the arrays as U, s, V\n", " U, s, V = np.linalg.svd(X, full_matrices=0)\n", " #full_matrices=False gives the above definition of SVD\n", " print \"Singular values = \" + str(s)\n", " \n", " #calculate inverses of the matrices\n", " U_T = np.transpose(U)\n", " V_T = np.transpose(V)\n", " #inverse of s...\n", " inv_s = []\n", " for singular_value in s:\n", " if singular_value < 1.0e-12: #some small cutoff\n", " inv_s.append(0.0)\n", " else:\n", " inv_s.append(1.0/singular_value)\n", " inv_s = np.diag(inv_s) #convert from list to diagonal matrix\n", " \n", " #pseudo-inverse of X...\n", " inv_X = np.dot(V_T, np.dot(inv_s, U_T))\n", " return inv_X\n", "\n", "\n", "inv_X = pseudo_inverse(X)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The smallest singular value is effectively zero, it is of the order of the computer's rounding error. In order to find the pseudo-inverse of $\\mathsf{A}$, _this singular value's reciprocal must be set to zero._ The function `pseudo_inverse` includes this step." ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[-0.48333333 -0.24444444 -0.00555556 0.23333333]\n", " [-0.03333333 -0.01111111 0.01111111 0.03333333]\n", " [ 0.41666667 0.22222222 0.02777778 -0.16666667]]\n" ] } ], "source": [ "print inv_X" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The pseudo-inverse of the matrix $\\mathsf{A}$ has now been calculated for an overdetermined system. The least-squares solution (see below) to\n", "$$\\mathsf{y} = \\mathsf{X}.\\mathsf{a}$$\n", "is simply\n", "$$\\mathsf{a} = \\mathsf{X^{-1}}.\\mathsf{y},$$\n", "where $\\mathsf{X^{-1}}$ is the _pseudo-inverse_ of $\\mathsf{X}$. This vector $\\mathsf{a}$ is the solution in the sense that $$(\\mathsf{y}-\\mathsf{X}.\\mathsf{a})^2$$ \n", "is minimised. This property makes singular value decomposition useful for least-squares problems. \n", "\n", "Suppose that $\\mathsf{a}$ are the parameters of the model $\\mathsf{y} = \\mathsf{X}.\\mathsf{a}$, $\\mathsf{y}$ are measurements of the independent variable and $\\mathsf{X}$ is a matrix of functions of the dependent variables. In the case of equal Gaussian errors on the $y_i$, we then seek $\\mathsf{a}$ which minimises\n", "$$\\chi^2 = (\\mathsf{y}-\\mathsf{X}.\\mathsf{a})^2.$$\n", "Clearly, this can be done using singular value decomposition. In the case that the errors, $\\sigma_i,$ on the measurements are different, the matrix $\\mathsf{X}$ must be replaced by $\\mathsf{A}$, the design matrix, in order to weight the measurements:\n", "$$A_{ij} = \\frac{X_{ij}}{\\sigma_{i}},$$\n", "the vector $\\mathsf{y}$ is replaced by $\\mathsf{b}$:\n", "$$b_{ij} = \\frac{y_{ij}}{\\sigma_{i}}.$$\n", "The pseudo-inverse of $\\mathsf{A}$ can then be used to find the parameters $\\mathsf{a}$:\n", "$$\\mathsf{a} = \\mathsf{A^{-1}}.\\mathsf{b},$$" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Singular values = [ 1.02921777e+02 3.74863557e+00 3.62710615e-15]\n", "Parameters = [ 1.14826608 2.01942306 2.89058005]\n", "Chi-statistic = 2.36148830442\n" ] } ], "source": [ "#Now include some errors\n", "#choose some vector of independent variables, b, and their errors, sigma\n", "#the system is overdetermined, so this does not necessarily give an exact set of parameters!\n", "y = [14.3, 31.8, 49.1, 68.5]\n", "sigma = [0.4, 0.3, 0.6, 0.2]\n", "b = [y[i]/sigma[i] for i in range(len(y))]\n", "\n", "#create the design matrix\n", "A = np.zeros(X.shape)\n", "for i in range(X.shape[0]):\n", " for j in range(X.shape[1]):\n", " A[i][j] = X[i][j]/sigma[i]\n", "\n", "#pseudo_inverse of design matrix\n", "inv_A = pseudo_inverse(A)\n", "\n", "#parameters which minimise chi-squared\n", "a = np.dot(inv_A, b)\n", "print \"Parameters = \" + str(a)\n", "\n", "chi = np.linalg.norm(b - np.dot(A, a))\n", "print \"Chi-statistic = \" + str(chi) " ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "#### Investigation: low temperature heat capacity of a metal\n", "At low temperatures, the heat capacity of a metal can be modelled as\n", "$$C = a_0 T + a_1 T^3,$$\n", "where T is temperature and the parameters a_0 and a_1 reflect the contributions of conduction electrons and lattice vibrations, respectively.\n", "\n", "Given the temperature (K) and heat capacity (JK$^{-1}$kg$^{-1}$) data below, find the model's parameters.\n", "\n", "Plot the model and data as a straight line.\n", "(answers: $a_0$ = 2.043 JK$^{-2}$kg$^{-1}$, $a_1$ = 5.789e-3 JK$^{-4}$kg$^{-1}$)" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "collapsed": false }, "outputs": [], "source": [ "T = [0.1, 2.1, 4.1, 6.1, 8.1, 10.1, 12.1, 14.1, 16.1, 18.1, 20.1]\n", "C = [0.368, 4.565, 10.119, 13.472, 20.375, 26.356, 32.908, 45.0275, 56.946, 70.921, 89.089]\n", "C_errors = [0.163, 0.389, 0.504, 0.593, 0.669, 0.735, 0.795, 0.850, 0.902, 0.950, 0.996]" ] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.11" } }, "nbformat": 4, "nbformat_minor": 0 }