{ "metadata": { "name": "", "signature": "sha256:7df01695821e952444d75a1f6033770ef2ccfabba84bac36e7ba685ac1b0a039" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Least-squares fitting without explicit inverses\n", "\n", "I want to show you in this notebook how you can solve a least-squares problem without having recourse to a direct application of the normal equation. If you have to use [np.linalg.inv](http://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.inv.html) somewhere in your code, it should ring a bell. Explicit matrix inverses can almost always be avoided. And they should be avoided ! Here's how" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Synthetic example" ] }, { "cell_type": "code", "collapsed": false, "input": [ "from sklearn import linear_model, datasets\n", "A, b, coef = datasets.make_regression(n_samples=1000, n_features=1,\n", " n_informative=1, noise=10,\n", " coef=True, random_state=0)\n", "plt.plot(A[:,0], b, '.')" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 1, "text": [ "[]" ] }, { "metadata": {}, "output_type": "display_data", "png": "iVBORw0KGgoAAAANSUhEUgAAAX0AAAEACAYAAABfxaZOAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3XtwXOWZ5/GvZEm2ZFlq+YYs32SMGTCXyFgYMuVUlEpM\nPKrsWPlDuU1NAaGqq0jNJLu1iwmkdnFq55JA7c6EScEWlQyEBMM4nsHBQwyxBwscCHZCjG1sjGwx\nFshgZGPLxlyML9o/nvNyTre61d3qVnef7t+nqqu7T190oOA5bz/v8z4viIiIiIiIiIiIiIiIiIiI\niIiIiIhIkZsEbAdeBvYBf+8dnwpsBnqB3wCRwGfuAA4A+4Eb8namIiKSE3XefRXwIrAcuBtY7R2/\nHfiB93gxdoGoBlqBg0Blvk5URERypw74PXAFNoq/yDve7D0HG+XfHvjMU8D1+TpBEZFyl4tRdiU2\nen8H2ArsxQL+O97r7+BfAFqAgcBnB4DZOTgHERFJQ1UOvuMC0AY0Ak8Dn4t7fdi7JTPaayIikkO5\nCPrOSeBJYCk2um8GjgCzgEHvPYeBuYHPzPGOxVi4cOFwX19fDk9NRKQs9AGXjPaGbNM70/Erc2qB\nFcBO4AngRu/4jcAG7/ETwNeAGmABsAjYMeKs+/oYHh4O7e2uu+4q+DmU47nr/At/0/kX9gYsTBW0\nsx3pzwJ+hl08KoGfA/+BBf51wC3AIeAr3vv3ecf3AeeAb6H0johI3mQb9PcA1yQ4fhz4QpLP/J13\nExGRPFON/Djo6Ogo9CmMWZjPHXT+habzL34VhT6BJIa9/JSIiKSpoqICUsR1jfRFRMqIgr6ISBlR\n0BcRKSMK+iIiZURBX0SkjCjoi4iUEQV9EZEyoqAvIlJGFPRFRMqIgr6ISBlR0BcRKSMK+iIiZURB\nX0SkjCjoi4iUEQV9EZEyoqAvIlJGFPRFRMqIgr6ISBlR0BcRKULRKHR0QGcnDA3l7nsV9EVEilBv\nLzz7LGzaZBeAXKnK3VeJiEg2olEL9nV1UF1tx9rb4YEHcvc3Rt01vYCGh4eHC30OIiJ51dFho3uA\nri4L/A88AJFIep+vqKiAFHFdI30RkSJRV2f37e3w4IPpB/tMaKQvIlIkhoYsxZPJ6D4onZG+gr6I\nSAEE8/dr1+ZmVJ9O0Ff1johIAYxXdU4q2Qb9ucBWYC/wCvBt7/hUYDPQC/wGCF7D7gAOAPuBG7L8\n+yIioRTM3+eyOieVbNM7zd7tZaAeeAnoAm4GjgF3A7cDTcB3gcXAWuBaYDawBbgUuBD3vUrviEhJ\nyzZ/n0ghcvobgB97t88C72AXhR7gMmyUfwH4off+p4A1wItx36OgLyKhMh45+kzlu2SzFVgCbAcu\nwgI+3v1F3uMWYgP8ADbiFxEJNZejB7sARCKFvwgkkqugXw/8K/Ad4L2414a9WzIa0otIqCQa1cfn\n6Lu6ivMikIugX40F/J9j6R3w0zpHgFnAoHf8MDb568zxjo2wZs2aTx53dHTQ0dGRg1MVEcle/Kh+\n3ToL5NEo1NZawN+7115PdhFYty778+jp6aGnpyejz2Sb068Afga8C/y3wPG7vWM/xCZwI8RO5C7D\nn8i9hJGjfeX0RaRodXZaqWV7O2zeHDtqD7ZSmDMH9uyx10f7TK7kYyJ3OfAcsBs/cN8B7ADWAfOA\nQ8BXANcc9E7gm8A5LB30dILvVdAXkaI1WuXN3LkwMACNjbBrF8yfn/ozuaIVuSIieeLy/Lt2+f3v\nu7tzk8ZJl4K+iMg4c8F+9244ccI/Pp5pnGTUZVNEZBTp1Na79/T1WaqmoSH2vcFJXYAlS2DePHjo\noeIp0wxS0BeRspWoCifIHTt50p4PDIx8ryvVbGuD1tbxa4mcKwr6IlK2UvW/6e31A35VFZw7Z++t\nrYVZs+DMGbj6ali1qnhH9vHUZVNEytbatTbZmiz37i4KTU3w0kv+e/v74cgRy+E/+yxs357f886G\nJnJFROK4PH51NUyePHIU72rug/JdqZOIqndERJIYbRI3uMBqwQKbmA2+b2gIbr4ZXngBBgcLU6mT\niKp3RESSSNQgbeNGy9NXeonv9naYOHHkZG8kAjNmwKJF9t716wsf8NOlnL6IlJVo1Eby8b1xNm70\n8/TvvmstFDZvthLN4Puc3l54/nn7zG235f0fY8w00heRkhZM48yYAb/6lV+RU1dnI/lvfAM++sj/\nTGOj3zPHNVKLb59QqJ2vsqWcvoiUtGB+fuJES9+ABfYrr7TROsDMmZafj++Zk0w+eulkShuji0hZ\ncambzk6//01wRH4hsDFrfT3s22eP29pgxw6rwDl0KHXABwv0Lr8fJgr6IlIy3OTspk12AYDYWvzJ\nk/33fvih3yuntdUCfRiDeKYU9EWkZMTn2aNR27zkd7+DL30JPvjAXp8yBWpq7HF1Nbz9duyvg1Km\nnL6IlIz4PHswnx9v2jSr0gkqhgVW2dDiLBEpG/FVOv39VpZ57JiVXZ46ZZO0J0/C9OnWR2doyH+t\nWBZYZUNBX0TKRrIqnZYWWzl7221wzz12/9ZbftVOZ6fl+mtr7UJR6I3Ls6HqHREpGYkqc4KC+fza\nWv/4smX+JK27Dy64euQRO9bfP3ISuBRppC8ioRAcyXd320g82DsH/Hx+dzds2WIbmjzzzMhRe6Ia\n+3xsXD7elN4RkdBzuXqXn3dBuasr9iLgJmCjUau/7+uDF19Mr+YeinOxVaYU9EUk9IIj/DlzrD3C\n6tXW5OzECVtYtXWrHYvfq3b6dLj22vDm6DOlLpsiEhrJ9qJ1ufrKSjh6FC69FI4fh/Pn7Xhrq5/q\nCZZnTp5svwxcjj7MpZi5pKAvIkUhGLSDe9HOmGGPL1ywipyjR/3PNDXZnrTRqI3wg+rq4P33w9cQ\nbbwp6ItIUejri30+ebKlad56K/H7q6pg505/lO9SOs6119p3hDlHPx5UsikieZeo/DJ+wvX9960C\nJ/5iALbI6uBB/zMuBTRlit1Pngwff6yAn4iCvojkVTRqe866mvibb7bj/f3+e6q8HMSECbBwofXH\nCTp9Gm680b9guKZqe/bY5K27YMycCStWlEdPnXQp6ItIXvX2wtmz/vPhYbsQHDniH3Mj9vPnbeVs\nU1Psd5w/bxcNt4hq9WrrhX/rrVbN45w9a8G/lBdbZUo5fRHJm/gJ10jEOmAePWrB3wmO7KdNs83J\nh4YsZePU11tbBYidBF61Cpqb/YvIkiWayA3SSF9E8iY44VpbaxU5g4OxAb+hwTY0mTfPAv65c7B9\nuwX85mY/9XP6tL83bbAFw0MPwauvWvDv6kq8Irec5SLo/zPwDrAncGwqsBnoBX4DBP+V3wEcAPYD\nN+Tg74tIkUnWJycYnK+5xrpbBk2YYBeAhQvhjTes9bHbz7a93YL5ihX+czeCD26UEonYbcMGePxx\nBfx4uViR+xngNPAwcJV37G7gmHd/O9AEfBdYDKwFrgVmA1uAS4ELsV+pFbkiYRbfJ8ctjBoasmDf\n0gKvvWaLp6qqbDQficDKlfDYY7HfVV9vF4sdO6xapxTaJYyXfHXZ3AbEVcjy58DPvMc/A7q8x6uA\nR4GzwCHgILAsB+cgIkUimLd3tfZutB+JWNrm+ect4NfV2egebFerJ5+M/a5IxNI4g4N+Kiese9MW\ni/HK6V+EpXzw7i/yHrcAA4H3DWAjfhEpAa7dgcvbu9LJ5maYOtVSM26Str7eHru+9x9/DO+953/X\ntm3w6U/bY62qzZ18VO8Me7fRXh9hzZo1nzzu6Oigo6MjpyclIrm3caOfg3dpm6oqC+xnztgFoKoK\nKipsBJ9MZyc8/LDl/JubrbmaRvYj9fT00NPTk9FnctVlsxXYiJ/T3w90AEeAWcBW4DIsrw/wA+/+\nKeAuYHvc9ymnLxJCU6f6o/wVK2zVbH+/Vekk0tYGhw7FTva6HvjJWidLcoXcOesJ4Ebv8Y3AhsDx\nrwE1wAJgEbBjnM5BRHJktF2rolGYNcsCvhurtbXZsePHYwO+y9/X1dl3bd1qaR6wBVmdnX6JZbDS\nR6md3MlFeudR4LPAdOBN4H9hI/l1wC3YhO1XvPfu847vA84B32L01I+IFIHg4qf4NsW/+AV8+GHs\n+998Ew4f9lM9zvnzMGmSbXLi+ubMn29dNd97zyZ+XRpn7VpV6YyHXAT9ryc5/oUkx//Ou4lISCQb\ndUejIwM+WH19IvX18Morsc3VgvvVBr/bVelIbmnnLBFJKVgb73ao+sMfrDonU11dtmgq0XdrRJ8d\nbZcoIim5HavcBuOjBV7XITPYMC1TnZ0j6/ElN7RdooikFJ+vj0Ss9PLMGeuPs2CBVeDMn2+5+HQC\nflWVfTZYd+/Et0mW/FLQFylz8fn6ri6/Q2Vw56qBgcSfT+S66yzwB/esBdv85KGHsj5lyYK6bIqU\nuWCzstWrY1sfV3oRoiLDRLDrgT9zpn+ssRF27VLevtAU9EXKlKuvv/hif0FVsPVxdbVfY5/pFFtb\nm21Y/tpr9sth1SpbhBW/JaLknyZyRcpUsBMm2Gj/9GnbwrCqymrqx/K/YUsL7N2rEX0hFHJFrogU\nOZfLBxuZP/AAzJjh98wZS8Bvb7eAf/31FvRnzIjd+1YKTyN9kTI1NGSbkg8P2+RqJDJy9J+J4Ag/\nEvFX486ZYyt0ZfypZFNEAL8W//e/t5F8TY0trnKLpNzrL76Y/Duuusp2rjp3zj/mfhVMmQIvvOCn\ndFxZZl0d/Pa34/PPJGOj9I5IGXC1+B98YO2Kjx2D5ctHvu562zszZ9oFoqrKGqrFd8ucNMnu33vP\n3+QE7IIyZ05sjx0pDkrviJSBzk6boHUqK2HnTvjxj20h1rFjNmKvrLTAPmGCTeSmq6kJXn9dk7eF\npjYMIgJY/n7RIgvujgvwiVRXp155O2WKjfCbmuwCohF94SmnL1KGXH6+r88C8Ysv2mRt/DgqWcCH\n9FotuAodNUoLFwV9kZCLD/L79vkLrDJpnZAON3ELNkmr1sfho/SOSMhlU2aZSkWF/wuhocG2Mnz2\nWavr37pVI/xio8VZImXALbJqbLT7mprcfGdnp7/BCVi1z4YNtnJXAT+8NNIXCbmhIbjmGiuv3L07\n8U5Wo7niCjh6FAYHbWer06fteHe3pYm2bPE3K1egL24a6YuUiNE2Jo9EYN482L49/YDvNigHuOQS\na4zW3e2P7Bsb4Z574Je/tOMK+KVDI32REAjm7bu7/QlUN4m7bdvo1ThBlZVWZvnuu9YrZ/NmP6Av\nX25tkeP/joSDSjZFSkRwo5PaWmuJfPy4LaDKZBGVW3T17rswcSKsXx87gnfN0dxIX0qP0jsiIbB2\nrW1bOHEiPPaY7Wz18ceZBfzmZvjc5/znZ87Etk4Af4HVyZMjX5PSoKAvEgKrV1vN/fPPW7Afi8pK\ny9E3N9vz6dNtK8TgPIHL6butE6X0KKcvUmSiUX9j8qVLrWXxr37ltyoeq+uus6Zp999vo/i33hqZ\nvx8asr+vVbbhpN47IiEUv9gqnT44qTQ0WHdN8AO8a8IWP5kr4aWgLxJCc+fmrn1CcEUtxK6k1ai+\n9Kh6RyQEXNllXZ1N2I41Z59I/NiptdUP8JGISjLLkUb6InkWbJB29iy8847/WvzIPFsNDbbRyeCg\n0jjloJhH+iuBfwQmAD8Bflig8xDJq2jU9qNNlqPPdcDfvdtq7pXGEacQI/0JwGvAF4DDwO+BrwOv\nBt6jkb6UpFx3xIzf4eqqq2xOoLra3+xcykexjvSXAQeBQ97zx4BVxAZ9kZLU1xf7PNt0TlOT7YbV\n0GAtFB55RIFeRleIxVmzgTcDzwe8YyIl7bLLrDY+aCwBv77e7tvabAPy7m5rn/Dkkwr4klohRvpp\n/We+Zs2aTx53dHTQ0dExTqcjknvxu1k1NMDbb6ffFM2J/yXQ2Qn33WeLq1yOXhU45aunp4eenp6M\nPlOInP71wBpsMhfgDuACsZO5yulLKF12mfXFef99f1tBp6Ymu3LMbdsshRNf4qnRvTjF2k//D8Ai\noBWoAb4KPFGA8xDJuddft3YJLuBXV/v3ixfbyH2s7r3X7nt7bTJ40ya7AIhkohDpnXPAXwFPY5U8\nP0WTuFIigj9QKyutK+bZs3Z7+eWxf2+w1XGwzbKaokmmtDhLJEvBdMupU34Ts1wI5vTVFE1SUe8d\nkXEQP0m7b5/tJQu2T+2pU/DRR7n9m1pNK+lQ0BcZB7leYBVv1Sr7ftfjvqUF9u5VwJfUinUiVyTU\nXE7dCW4ynkqq9y5fDhs22MgeYMkSBXzJLY30RVKIT+fU1kJPz8iSzLGoqLCbq9+fN8+2RayutkVY\nDz6ogC/pK9Y2DCKh4kokwe9zP2lSekE/vjdOZaUF8YkTrX1CXZ2/I1Zjo/XNcX+ru1sBX3JP6R2R\nFFw6x+0fW19vWxmmY8qU2Nr8Cxfg+HG7nT3rB/ymJti1S3vUyvhTekeExG0T3GpXVyI5YQKsX599\nWqey0m7ue4ITtSrHlGyoekckTYkqclxdfDQKv/gFfPhh7v/ukiXwzDMK8JIbqt4RSVOiFM6JE3Dj\njRb4cx3wIxErzVTAl3zTSF/KXjRqC6z6+uBP/xQefzx3O1jV18Pp05azv/pq+zXR1AQ7d1oaSSSX\nlN4RSSIahY0bbUJ2eNhfCFVZmXn742RmzoQdO/w2yO7vKl8v40VBXySJRDn8qVOtqiYXLr8cXnhB\nwV3ySzl9kSTiV9WOVbJWyX/yJ37Aj0btItPZ6f+iECkUjfSlZI222cjQEHzqU/DGG8k/71I9me5j\nG98cLfirwlUEiYwHrciVshZcSXv55fDqq34gjkRS72LlRvGZBPyWlpHdMNX/XoqJRvpSsiZPhg8+\n8J8vWGC9bdwCrO3bs1toVVVlvxZeesmeX3UVPPfcyDy+FlxJvmgiV8paY6P1tg8+d20PgrKp2Onq\nsl8CFRVqjiaFp/SOlCWXy49fUJUo4EPmAd81UWtvV6CX8NFIX0pOXV1uV9BWVtoiK/erYcUKC/RK\n10ixUcmmlKVUE7SZqKqy1bPLltnzJUus+mbdOgV8CScFfSk5uQzGX/qStU/45S+t3FK9ciTslN6R\nktPfD62t/vNM6+yDOXttRi5hovSOlA236nXuXPiLv4h9LZ2A7/aubWqCP/7RRvUK+FKKNNKX0Ak2\nS1u61BZEPfqo7UQ1FtXV1l2zqUnVOBJuqtOXkhTfLK2qauyLrIKpn1WrYMMG/7XR2jiIFCPV6UtJ\ncUF4797Y42MN+A0NFvRd/X5887RgG4doVD1zpDQopy9FJ1lXyo0bLQgfOwY1NXasagzDlkmTbFTf\n3w/XXmvHliyx1E6QeuZIKVJ6R4pOsq6UEyf6NfhTp8JnPmM9648eTf+76+pslyy3a9VofXHUM0fC\nZryrd7qBvcB54Jq41+4ADgD7gRsCx5cCe7zXfpTF35YS1tdn942NcM899jgajV10dfy4VdmcOBH7\nWVeFE6+x0S4av/td7DaFkUjyhVajvSYSVtkE/T3Al4Hn4o4vBr7q3a8E7sO/8twP3AIs8m4rs/j7\nUoKiUb/dwcmTllqZOxf+5V9GvvfNN2Pz+a6+3nGbnE+ebN915gz8zd+M37mLhEE2QX8/0Jvg+Crg\nUeAscAg4CFwHzAKmADu89z0MdGXx96UEbdzoB/0JEyx/PzAQ2y0zmWDAb2mB3bstPfTpT9sx5eZF\nxmcitwUYCDwfAGYnOH7YOy5lKn7CNhqNzc9PnWr31dWpvyuY1qmrg8WLLaWzbp3fQkGLrURSl2xu\nBpoTHL8T2Jj70/GtWbPmk8cdHR10dHSM55+TAogvifz3f/dH6xUVFsinTYN3303+HXV1tlHK+fM2\nuj92zJ5v2QI33WR19y43L1Jqenp66OnpyegzqYL+ijGcx2FgbuD5HGyEf9h7HDx+ONmXBIO+hNdo\nC5ziSyKnT/dfGx6GI0dSf7+b3G1rg61b4eKL/WPJNi0XKRXxA+Lvf//7KT+Tq/RO8H+vJ4CvATXA\nAmzCdgdwBDiF5fcrgL8ENiAlzY3mN22yC0DQ2rWxaRd3QahM87/KCRP8idzWVvv80qX2PFHdvYhk\nV6f/ZeBeYDpwEtgJ/Jn32p3AN4FzwHeAp73jS4GHgFrg18C3k3y36vRLRGenBfz2dsuz9/fbCH/G\njJGPAQ4csPe3tY1slDZ5MtTW+ouzamutKmfJEr/lsWrrpZyp944UXDAId3X5Ofzqar9B2sSJVk7p\nTJtmn3P5/YkT4fOfh0cesefXXGN5flfRM2+ebXquHjlS7tR7RwouOInqcviQPOCDBXSXj7/iCvjt\nb/1AHo3awiwX8JuarI5fPXJE0qORvuTN0BBcfrlN0E6bZumbkydj6+vjzZwJK1f6qaBTp+D55+21\n6mpLB916q59CUlmmlDOld6Rg4qt2Vq+259XVtsn4Cy/A4GDiz8a3Sp4+3fL4AM3NdtFoaoIvfhHe\nftv/TvXCl3Kn9I4UTLAGf/p0G9VfuGDPFywYGfCDOf5z5/y0T3u7BfItW+zx+vVw220j5wi6uxXw\nRdKh1soyLoL5+/Pn/YAPI5ukga3MbfaWAba3w2uv+eWcwRW18+f7TdDU+lgkc0rvyLgI5u9TiURs\nwra21soyH3oovVG7yjNFYim9I3kVzOPPmAELF1qlTbAlMsRuUThpkgV8NzmbSZpG7RVEMqegLykl\nm5SNr4vfuDHxyL6mxpqnPf20tTZ+5hkry3Qbmtx6q71PaRqR8af0jqQUv5PVtm1+cJ8/31og1NXZ\nBiXB7Q2Durrg8cftcX8/LF9u9ffz5ytNI5IrSu9IWkZrigaxE6a1tbHtj0+fjl1lm0zwGj5/vm2A\n4ihNI5I/qt6RUZuiQWxjtP5+fzFVTY3/eMIEv+TSqa+3+7Y2m5wVkcJT0JeUpY/BvWLde2tqrJbe\npXOCbYybmy2d88orVpM/eTJ84xvJUz8ikj/K6UtGOXX33rfe8ituamqsCufUKbjySltUNTho6Z6F\nC2H7dntfd7fSOCLjSW0YJKeCuf+zZ22VbFMTXHqpH9hnzoSPPvIbornyTPXFERl/6QR9pXckRjQK\ns2ZZieXs2VZl4/awDeb+6+tt5P766/5etmAj/I8+8p8PD8OcOQr4IsVC1TsSo7fXL8c8ccLSOGAX\nA5fPr6+HHTtsEvfii+Hqq22EPzhoI/qf/tQ2NrlwARoaYlsji0hhKb0jn4hGraGZ641TWWmB2+0/\nC5bKCZZsOqtWWW7fzQssXx67yla5fJHxpzp9yUhvrx/wKyr8JmktLf5Ivb3d0jsNDX7efsmSkf1y\nGhr892uVrUjxUE5fPhEs3Wxs9I8HF125mv3du21039Xl708bFL/puYgUB6V3ylywIuf++/1e9d3d\nVp3jUjsK3CLFTyWbklKwr86CBbbJePwFINOAn6qtg4iMD+X0JaO+OhMn+heA224b++RrcNcsbVQu\nUlyU0y9x6fbVWbzY2hyDTcxmM/mqHa1EipeCfolLt69Of79fuTNvXnYpGU3iihQvBf0SFo1aWWVz\nM1xyiVXauNW18YIXh2w7YgYbtIlIcdFEbgkLTtJWVcG5c/Y40WIpbWQiEn6q3ikjiSZsOzv9Pjmn\nT9v7mpqsX44Cu0jpUcO1MpJownbtWivDnDDBnkcisHOnH/CjUfs1kCzlIyKlJ5ugfw/wKrAL+Dcg\nsIaTO4ADwH7ghsDxpcAe77UfZfG3JU6iCdtIxCZlT56055/9rG1V6KSq7BGR0pNN0P8NcAXwKaAX\nC/QAi4Gvevcrgfvwf27cD9wCLPJuK7P4+xIQXzHjRvF799rriSZoVVopUn6yCfqbAa8lF9uBOd7j\nVcCjwFngEHAQuA6YBUwBdnjvexjoyuLvS0B8xYwbxR87lryfvUorRcpPrnL63wR+7T1uAQYCrw0A\nsxMcP+wdl3EQHMXv2ZM4qKu0UqT8pGrDsBloTnD8TmCj9/h7wMfA2hyel2TIVe/09VnevrbW6vIf\nfFBBXUR8qYL+ihSv3wR0Ap8PHDsMzA08n4ON8A/jp4Dc8cPJvnjNmjWfPO7o6KCjoyPFqZS3YL+b\nAe/3VHe3Ar5IKevp6aGnpyejz2RTp78S+D/AZ4FjgeOLsVH/Mix9swW4BBjGcv/fxvL6TwL3Ak8l\n+G7V6WfI1eQ3Nlq1jjYiFyk/412n/09APZYC2olV6QDsA9Z595uAb2EBH+/xT7CSzYMkDvgSJ516\nejcpu2uXJmdFJDmtyA2BYDsF7TcrIsmoDUOJmDvX8vSNjTaSDy6wEhFx1IahCKVK1SR63QX5kydt\ncxMRkbFS0M+zVK0PEr3e0GD3WjkrItlS0M+zVK0PEr2ulbMikisK+nkWDOCrV8OsWTB1KqxYYemc\nGTPsFgzuWjkrIrmiidwCClblgF0MBgdjK3UikdE3NhcRcTSRW+RcKgf8zcjj0ztqfywiuaSRfgEN\nDcFNN0FFhd8jJ37bQrfSVitsRSQV1emXAO1dKyLpUtAXESkjyumLiEgMBX0RkTKioC8iUkYU9EVE\nyoiCvohIGVHQFxEpIwr6IiJlREFfRKSMKOjnQDp72IqIFAMF/RxQUzQRCQsF/RxItTGKiEixUO+d\nHFBTNBEpBmq4JiJSRtRwTUREYijoi4iUEQV9EZEyoqAvIlJGFPRFRMpINkH/fwO7gJeB/wDmBl67\nAzgA7AduCBxfCuzxXvtRFn9bRETGIJugfzfwKaAN2ADc5R1fDHzVu18J3IdfQnQ/cAuwyLutzOLv\nF62enp5Cn8KYhfncQedfaDr/4pdN0H8v8LgeOOY9XgU8CpwFDgEHgeuAWcAUYIf3voeBriz+ftEK\n8384YT530PkXms6/+FVl+fm/Bf4S+BBY5h1rAV4MvGcAmI1dBAYCxw97x0VEJE9SjfQ3Yzn4+Nt/\n8V7/HjAPeBD4x3E6RxERKTLzgFe8x9/1bs5TWHqnGXg1cPzrwP9L8n0HgWHddNNNN90yuh1kHC0K\nPP5r4Ofe48VYRU8NsADow5/I3Y5dACqAX1OiE7kiIqVoPZbqeRn4V2Bm4LU7sSvOfuCLgeOuZPMg\ncG9+TlN6ExzwAAACgElEQVRERERERIrOfwcuAFMLfSIZGm3hWhjcg82/7AL+DWgs7OlkrBvYC5wH\nrinwuWRiJfbr+ABwe4HPJVP/DLyD/ZIPo7nAVuy/m1eAbxf2dDIyCUudvwzsA/6+sKczdnOxSeD/\nJHxBf0rg8V8DPynUiYzRCvzKrh94tzC5DLgU+584LEF/Apb2bAWqsf+BLy/kCWXoM8ASwhv0m7GF\npmDrjl4jXP/+vf37qMJK5pcne2Mx9975v8DqQp/EGCVbuBYWm7FfWGAjiDkFPJex2A/0FvokMrQM\nC/qHsDUtj2ELHcNiG3Ci0CeRhSPYhRbgNPZLt6Vwp5OxD7z7GmwAcTzZG4s16K/CFnLtLvSJZOFv\ngTeAGwnfSDnom1illYyv2cCbgeduUaPkXyv2q2V7gc8jE5XYResd7BfuvmRvzHZFbjY2Yz+p4n0P\na9gWbNRWjNs6Jjv/O4GN2D/H97A1C/8A3Jy/U0tLqvMHO/+PgbX5OqkMpHP+YTJc6BMQwH6Zrwe+\ng434w+IClp5qBJ4GOoCeAp5PRq7Erlb/6d1cD5+Zo3ymmAUXroXJTcDz2CRRWIUpp389Nofl3EH4\nJnNbCW9OH2wu5Wngvxb6RLL0P4H/UeiTyEYYJ3KTLVwLi5VYFcP0Qp9IlrZia0PCoApbyNiK5WXD\nNpEL4Q76FVgTyH8o9ImMwXQg4j2uBZ4DPl+408ne64Qv6I+2cC0MDgD9wE7vdl9hTydjX8by4x9i\nE3SbCns6afszrGrkIDbSD5NHgbeAM9i/+2JLZ6ayHEuRvIz/331YOgZcBfwRO/fdwG2FPR0RERER\nERERERERERERERERERERERERERERkRL0/wFYs4+qL3aXbwAAAABJRU5ErkJggg==\n", "text": [ "" ] } ], "prompt_number": 1 }, { "cell_type": "code", "collapsed": false, "input": [ "print coef" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "82.1903908408\n" ] } ], "prompt_number": 2 }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Normal equations\n", "\n", "The solution to the least-squares problem can be found through the direct application of the normal equations:\n", "\\begin{align*}\n", "\\mathbf{A}^\\top\\mathbf{A}\\mathbf{x} = \\mathbf{A}^\\top\\mathbf{b} \\\\\n", "\\mathbf{x} = (\\mathbf{A}^\\top\\mathbf{A})^{-1}\\mathbf{A}^\\top\\mathbf{b}\n", "\\end{align*}\n", "\n", "where $\\mathbf{A}^\\top\\mathbf{A}$ is symmetric. Let's try it:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "coef_inv = np.linalg.inv(A.T.dot(A)).dot(A.T).dot(b)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 3 }, { "cell_type": "code", "collapsed": false, "input": [ "print coef_inv" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "[ 82.11934255]\n" ] } ], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": [ "But does it mean that you should compute the inverse directly ? No ! Although the inverse appears in the normal equations, we can avoid computing it. This is almost always the case in matrix computations. It is not only computationally costly, it is also numerically unstable !" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# QR factorization\n", "\n", "Given the QR factorization of our design matrix, we can find the least-squares solution as follow:\n", "\\begin{align*}\n", "\\mathbf{A}\\mathbf{x} = \\mathbf{b} \\\\\n", "(\\mathbf{Q}\\mathbf{R})\\mathbf{x} = \\mathbf{b} \\\\\n", "\\mathbf{Q}^\\top\\mathbf{Q}\\mathbf{R}\\mathbf{x} = \\mathbf{Q}^\\top\\mathbf{b} \\\\\n", "\\mathbf{R}\\mathbf{x} = \\mathbf{Q}^\\top\\mathbf{b}\n", "\\end{align*}\n", "\n", "where the last line follows from the fact that $\\mathbf{Q}$ is othogonal. Since $\\mathbf{R}$ is triangular, we can efficiently solve the last system of equations through backward substitution:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "from scipy.linalg import solve_triangular\n", "Q,R = np.linalg.qr(A)\n", "solve_triangular(R, Q.T.dot(b))" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 5, "text": [ "array([ 82.11934255])" ] } ], "prompt_number": 5 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The complexity of QR is $\\frac{2}{3}n^3+n^2+\\frac{1}{3}n-2=O(n^3)$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# SVD factorization\n", "\n", "What if the inverse of $\\mathbf{A}$ does not exist (not a square matrix for example) ? We can instead take the pseudo-inverse $\\mathbf{A}^+$ through the [SVD decomposition](http://www.cis.upenn.edu/~cis610/cis61009sl10.pdf):\n", "\n", "\\begin{equation}\n", "\\mathbf{A} = \\mathbf{U}\\mathbf{D}\\mathbf{V}^\\top\n", "\\end{equation}\n", "\n", "The pseudo-inverse is then given as:\n", "\\begin{equation}\n", "\\mathbf{A}^+ = \\mathbf{U}\\mathbf{D}^+\\mathbf{V}^\\top\n", "\\end{equation}\n", "\n", "The matrix $\\mathbf{D}^+$ is a diagonal matrix of the inverses of the singular values of $\\mathbf{A}$ of rank $r$:\n", "\\begin{equation}\n", "\\mathbf{D}^+ = \\mbox{diag}(1/\\lambda_1, ..., 1/\\lambda_r,..., 0, 0)\n", "\\end{equation}\n", "\n", "Note that the singular values of $\\mathbf{A}$ are the eigenvalues of $\\mathbf{A}^\\top\\mathbf{A}$. It can be shown that the least-squares solution can be obtained through:\n", "\\begin{equation}\n", "\\mathbf{x}^+ = \\mathbf{A}^+\\mathbf{b}\n", "\\end{equation}" ] }, { "cell_type": "code", "collapsed": false, "input": [ "U, s, Vt = np.linalg.svd(A, full_matrices=False)\n", "Apinv = U.dot(np.diag(1./s)).dot(Vt)\n", "print Apinv.T.dot(b)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "[ 82.11934255]\n" ] } ], "prompt_number": 6 }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Standard numpy functions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If $\\mathbf{A}$ is full-rank, calling [numpy.linalg.solve (http://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.solve.html) will invoke the LAPACK routine gesv which according to [cvxopt](http://cvxopt.org/userguide/lapack.html) uses the LU factorization." ] }, { "cell_type": "code", "collapsed": false, "input": [ "np.linalg.solve(A.T.dot(A), A.T.dot(b))" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 7, "text": [ "array([ 82.11934255])" ] } ], "prompt_number": 7 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Looking at the [source code](https://github.com/numpy/numpy/blob/v1.8.1/numpy/linalg/linalg.py#L1733) for [numpy.linalg.lstq](http://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.lstsq.html), we see that the LAPACK function dgelsd is being invoked under the hood. The dgelsd function is based on the SVD method that we have seen above." ] }, { "cell_type": "code", "collapsed": false, "input": [ "np.linalg.lstsq(A, b)[0]" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 8, "text": [ "array([ 82.11934255])" ] } ], "prompt_number": 8 } ], "metadata": {} } ] }