{ "metadata": { "name": "matrix-intro" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "Array Indexing in NumPy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This notebook presents how to manipulate matrices in NumPy. [NumPy](http://www.numpy.org/) is the fundamental package for scientific computing with Python." ] }, { "cell_type": "code", "collapsed": false, "input": [ "import numpy as np" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "NumPy provides the type *N-dimensional array*." ] }, { "cell_type": "code", "collapsed": false, "input": [ "a = np.array([[1, 2], [3, 4]])" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 2 }, { "cell_type": "code", "collapsed": false, "input": [ "type(a)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 3, "text": [ "numpy.ndarray" ] } ], "prompt_number": 3 }, { "cell_type": "code", "collapsed": false, "input": [ "a" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 4, "text": [ "array([[1, 2],\n", " [3, 4]])" ] } ], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can *multiply* two `numpy.ndarray`s:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "a * a" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 5, "text": [ "array([[ 1, 4],\n", " [ 9, 16]])" ] } ], "prompt_number": 5 }, { "cell_type": "markdown", "metadata": {}, "source": [ "But you can see that arrays are not exactly mathematical matrices. The expected result for *matrix multiplication* would be `np.array([[7, 10], [15, 22]])`. Operators act *element-wise* on arrays. 
Since matrix addition is element-wise too, arrays behave like matrices when you add them:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "a + a" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 6, "text": [ "array([[2, 4],\n", " [6, 8]])" ] } ], "prompt_number": 6 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Adding a scalar to an array adds the scalar to every element:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "a + 1" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 7, "text": [ "array([[2, 3],\n", " [4, 5]])" ] } ], "prompt_number": 7 }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "A Useful Toolbox" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "NumPy provides the usual functions to apply to arrays." ] }, { "cell_type": "code", "collapsed": false, "input": [ "np.sum(a) # Sums all elements" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 8, "text": [ "10" ] } ], "prompt_number": 8 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Partial sum along the first (index 0) axis:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "np.sum(a, 0) # | (axis=0)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 9, "text": [ "array([4, 6])" ] } ], "prompt_number": 9 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Partial sum along the second (index 1) axis:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "np.sum(a, 1) # --> (axis=1)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 10, "text": [ "array([3, 7])" ] } ], "prompt_number": 10 }, { "cell_type": "code", "collapsed": false, "input": [ "np.sum(a, axis=1)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 
11, "text": [ "array([3, 7])" ] } ], "prompt_number": 11 }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Exercise**\n", "\n", "What does `np.sum(a, 2)` do?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Likewise, we can call the `mean` function:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "np.mean(a)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 12, "text": [ "2.5" ] } ], "prompt_number": 12 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The mean can be taken over the entire array, as above, or along a single axis:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "np.mean(a, axis=0)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 13, "text": [ "array([ 2., 3.])" ] } ], "prompt_number": 13 }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can look for values in your array:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "np.where(a == 2)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 14, "text": [ "(array([0]), array([1]))" ] } ], "prompt_number": 14 }, { "cell_type": "markdown", "metadata": {}, "source": [ "This returns the coordinates of the element equal to 2 as a tuple: one array of row indices and one array of column indices. The (row, column) order follows the mathematical matrix convention, although indices start at 0. Coordinates in this format can be used directly to access elements; trivially:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "coord = np.where(a == 1)\n", "a[coord]" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 15, "text": [ "array([1])" ] } ], "prompt_number": 15 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that everything is stored in arrays. 
If several elements match, the result looks like this:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "a1 = np.ones((2, 2))\n", "print \"Array is\", a1\n", "coord1 = np.where(a1 == 1)\n", "print \"Ones are found at (row, column) =\", coord1\n", "print \"Values at these coordinates\", a1[coord1]\n", "print \"Values of former array at these coordinates\", a[coord1]" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Array is [[ 1. 1.]\n", " [ 1. 1.]]\n", "Ones are found at (row, column) = (array([0, 0, 1, 1]), array([0, 1, 0, 1]))\n", "Values at these coordinates [ 1. 1. 1. 1.]\n", "Values of former array at these coordinates [1 2 3 4]\n" ] } ], "prompt_number": 16 }, { "cell_type": "markdown", "metadata": {}, "source": [ "If no element matches, say:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "np.where(a == 0)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 17, "text": [ "(array([], dtype=int64), array([], dtype=int64))" ] } ], "prompt_number": 17 }, { "cell_type": "markdown", "metadata": {}, "source": [ "There is no element equal to 0 in `a`, so we get a tuple of empty arrays." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Exercise**\n", "\n", "Access the element of `a` that lies in the second row and first column." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To select the entire second row, use a [slice](../../gloss.html#slice) as follows:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "a[1, :]" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 18, "text": [ "array([3, 4])" ] } ], "prompt_number": 18 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Slices illustrate the power of high-level programming. Do not write [for loops](../../gloss.html#for-loop). 
Let the computer worry about how to do the element-by-element operations!" ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Matrix Multiplication" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Matrix multiplication can be expressed by the dot product of 2D arrays:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "np.dot(a, a)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 19, "text": [ "array([[ 7, 10],\n", " [15, 22]])" ] } ], "prompt_number": 19 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The dot product of 1D arrays is the inner product of vectors:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "b = np.array([1, 2, 3])" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 20 }, { "cell_type": "code", "collapsed": false, "input": [ "np.dot(b, b) # 1*1 + 2*2 + 3*3" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 21, "text": [ "14" ] } ], "prompt_number": 21 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Of course, `dot()` only works on arrays with compatible shapes. 
For example, the following call fails:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "np.dot(a, b)" ], "language": "python", "metadata": {}, "outputs": [ { "ename": "ValueError", "evalue": "objects are not aligned", "output_type": "pyerr", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdot\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mb\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mValueError\u001b[0m: objects are not aligned" ] } ], "prompt_number": 22 }, { "cell_type": "code", "collapsed": false, "input": [ "a.shape" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 23, "text": [ "(2, 2)" ] } ], "prompt_number": 23 }, { "cell_type": "code", "collapsed": false, "input": [ "b.shape" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 24, "text": [ "(3,)" ] } ], "prompt_number": 24 }, { "cell_type": "markdown", "metadata": {}, "source": [ "You cannot multiply a 2-by-2 matrix by a vector of size 3." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Exercise**\n", "\n", "Create an object named `c` such that you can compute `np.dot(a, c)`.\n", "Try `np.dot(c, a)`. What do you notice?" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# NumPy does not distinguish between row and column vectors." ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 25 }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you want to use `*` as the operator for matrix multiplication, you need to create matrix objects." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "m = np.matrix([[1, 2], [3, 4]])" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 26 }, { "cell_type": "code", "collapsed": false, "input": [ "m * m" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 27, "text": [ "matrix([[ 7, 10],\n", " [15, 22]])" ] } ], "prompt_number": 27 }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can convert an array into a matrix and vice versa." ] }, { "cell_type": "code", "collapsed": false, "input": [ "am = np.matrix(a)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 28 }, { "cell_type": "code", "collapsed": false, "input": [ "ma = np.array(m)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 29 }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Prefer Array or Matrix?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `matrix` type is a subclass of `ndarray`, so it has all the features of an array, but it is always two-dimensional. Consider the matrix type if your problem is linear algebra: vectors are then 1-by-N matrices." ] }, { "cell_type": "code", "collapsed": false, "input": [ "bm = np.matrix(b)\n", "bm.shape" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 30, "text": [ "(1, 3)" ] } ], "prompt_number": 30 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Transpose it to get a column vector:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "np.transpose(bm)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 31, "text": [ "matrix([[1],\n", " [2],\n", " [3]])" ] } ], "prompt_number": 31 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Otherwise, typically when you are representing multi-dimensional grids, you should use arrays. 
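
As a rough sketch of that advice (not part of the original notebook), plain arrays can still play the role of row and column vectors if you give them an explicit 2D shape. The names `v`, `row` and `col` below are illustrative assumptions, and `a` is redefined so that the snippet runs on its own.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
v = np.array([1, 2])        # 1D array: neither a row nor a column

col = v.reshape(2, 1)       # explicit 2-by-1 column vector
row = v.reshape(1, 2)       # explicit 1-by-2 row vector

print(np.dot(a, col))       # 2-by-1 result
print(np.dot(row, a))       # 1-by-2 result
```

Keeping the shape explicit means the result of `np.dot` stays two-dimensional, much like with `np.matrix`.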
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Always remember to look at the docs: [http://docs.scipy.org/doc/numpy/](http://docs.scipy.org/doc/numpy/)\n", "\n", "You will find implementations for `conjugate`, `convolve`, `correlate`, `diagonal`, `fft`, `gradient`, ... These functions are faster than anything you could easily write and ... someone else has tested and debugged them! :)" ] } ], "metadata": {} } ] }