{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Lesson 6 - Fancy indexing and index tricks" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "NumPy offers more indexing facilities than regular Python sequences. In addition\n", "to indexing by integers and slices, as we saw before, arrays can be indexed by\n", "**arrays of integers** and **arrays of booleans**. " ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Indexing with Arrays of Indices" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create an array of the first 12 square numbers:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "a = np.arange(12)**2" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([ 0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121])" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create an array of indices:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": true }, "outputs": [], "source": [ "i = np.array([1, 1, 3, 8, 5]) " ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([1, 1, 3, 8, 5])" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "i" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now the cool part - index the elements of `a` at the positions `i`:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([ 1, 1, 9, 64, 25])" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } 
], "source": [ "a[i]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's try a different, two-dimensional array of indices:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [], "source": [ "j = np.array([[3, 4], [9, 7]])" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[ 9, 16],\n", " [81, 49]])" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[j]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When the indexed array `a` is multidimensional, a single array of indices refers to the first dimension of `a`. The following example shows this behavior by converting an image of labels into a color image using a palette." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [], "source": [ "palette = np.array(\n", " [[0, 0, 0], # black\n", " [255, 0, 0], # red\n", " [0, 255, 0], # green\n", " [0, 0, 255], # blue\n", " [255, 255, 255]] # white\n", ") " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create an image whose entries are indices into the palette, e.g. 0: black, 1: red, etc." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [], "source": [ "image = np.array(\n", " [[0, 1, 2, 0 ],\n", " [0, 3, 4, 0 ]]\n", ")" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[[ 0, 0, 0],\n", " [255, 0, 0],\n", " [ 0, 255, 0],\n", " [ 0, 0, 0]],\n", "\n", " [[ 0, 0, 0],\n", " [ 0, 0, 255],\n", " [255, 255, 255],\n", " [ 0, 0, 0]]])" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "palette[image]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also give indices for more than one dimension. 
The arrays of indices for\n", "each dimension must have the same shape. " ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": true }, "outputs": [], "source": [ "a = np.arange(12).reshape(3,4)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[ 0, 1, 2, 3],\n", " [ 4, 5, 6, 7],\n", " [ 8, 9, 10, 11]])" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# indices for the first dim (row) of a\n", "i = np.array(\n", " [[0,1],\n", " [1,2]]\n", ")" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# indices for the second dim (column) of a\n", "j = np.array(\n", " [[2,1],\n", " [3,1]]\n", ")" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[2, 5],\n", " [7, 9]])" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# i and j must have equal shape\n", "a[i,j]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To elaborate more on the `a[i,j]` part:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We are trying to create a 2 by 2 matrix from matrix `a`." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\n", "\\begin{bmatrix}\n", "a[(i_{11}, j_{11})] & a[(i_{12}, j_{12})] \\\\\n", "a[(i_{21}, j_{21})] & a[(i_{22}, j_{22})] \\\\\n", "\\end{bmatrix}\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "where $i_{rc}$ and $j_{rc}$ are the entries of `i` and `j` at row $r$, column $c$. This becomes:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\n", "\\begin{bmatrix}\n", "a[(0, 2)] & a[(1, 1)] \\\\\n", "a[(1, 3)] & a[(2, 1)] \\\\\n", "\\end{bmatrix}\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Which resolves to:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\n", "\\begin{bmatrix}\n", "2 & 5 \\\\\n", "7 & 9 \\\\\n", "\\end{bmatrix}\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's try something else: use the rows defined by `i`, and pick only the column at index 2 (the third column) of `a`." ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[ 2, 6],\n", " [ 6, 10]])" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[i, 2]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now try something even more interesting." 
] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[ 0, 1, 2, 3],\n", " [ 4, 5, 6, 7],\n", " [ 8, 9, 10, 11]])" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# recall what a is\n", "a" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[2, 1],\n", " [3, 1]])" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# recall what j is\n", "j" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[[ 2, 1],\n", " [ 3, 1]],\n", "\n", " [[ 6, 5],\n", " [ 7, 5]],\n", "\n", " [[10, 9],\n", " [11, 9]]])" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Use all 3 rows of a, and pick columns defined by j\n", "a[:,j]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Naturally, we can put `i` and `j` in a sequence (say a list) and then do the indexing with the list. 
" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[ 0, 1, 2, 3],\n", " [ 4, 5, 6, 7],\n", " [ 8, 9, 10, 11]])" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# recall what a is\n", "a" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[0, 1],\n", " [1, 2]])" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# recall what i is\n", "i" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[2, 1],\n", " [3, 1]])" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# recall what j is\n", "j" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "collapsed": true }, "outputs": [], "source": [ "l = a[i, j]" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[2, 5],\n", " [7, 9]])" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "l" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, we can not do this by putting `i` and `j` into an array, because this array will be interpreted as indexing the first dimension of `a`." 
] }, { "cell_type": "code", "execution_count": 26, "metadata": { "collapsed": false }, "outputs": [], "source": [ "s = np.array([i, j])" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[[0, 1],\n", " [1, 2]],\n", "\n", " [[2, 1],\n", " [3, 1]]])" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# This fails: s is used to index only the first dimension of a,\n", "# where the value 3 is out of bounds (IndexError)\n", "# a[s]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Converting `s` to a tuple works, though." ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "(array([[0, 1],\n", " [1, 2]]), array([[2, 1],\n", " [3, 1]]))" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tuple(s)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That is, a tuple of two arrays: the first element is `i` (rows) and the second element is `j` (columns)." ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[2, 5],\n", " [7, 9]])" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[tuple(s)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Cool hey?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another common use of indexing with arrays is searching for the maximum value\n", "of time-dependent series:" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# time scale (5 evenly spaced points from 20 to 145 inclusive)\n", "time = np.linspace(20, 145, 5)" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([ 20. 
, 51.25, 82.5 , 113.75, 145. ])" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "time" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "array([ 0. , 0.84147098, 0.90929743, 0.14112001, -0.7568025 ,\n", " -0.95892427, -0.2794155 , 0.6569866 , 0.98935825, 0.41211849,\n", " -0.54402111, -0.99999021, -0.53657292, 0.42016704, 0.99060736,\n", " 0.65028784, -0.28790332, -0.96139749, -0.75098725, 0.14987721])" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# just to visualize what this is\n", "np.sin(np.arange(20))" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# 4 time-dependent series\n", "data = np.sin(np.arange(20)).reshape(5,4)" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "array([[ 0. 
, 0.84147098, 0.90929743, 0.14112001],\n", " [-0.7568025 , -0.95892427, -0.2794155 , 0.6569866 ],\n", " [ 0.98935825, 0.41211849, -0.54402111, -0.99999021],\n", " [-0.53657292, 0.42016704, 0.99060736, 0.65028784],\n", " [-0.28790332, -0.96139749, -0.75098725, 0.14987721]])" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data" ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# index of the maximum for each series (column)\n", "ind = data.argmax(axis=0) " ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([2, 0, 3, 1], dtype=int64)" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ind" ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# times corresponding to the maxima\n", "time_max = time[ind]" ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([ 82.5 , 20. , 113.75, 51.25])" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "time_max" ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# => data[ind[0],0], data[ind[1],1]...\n", "data_max = data[ind, xrange(data.shape[1])]" ] }, { "cell_type": "code", "execution_count": 41, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([ 0.98935825, 0.84147098, 0.99060736, 0.6569866 ])" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data_max" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note the use of `xrange()`. 
In Python 2 it is generally preferable to `range()`, according to [this StackOverflow question](http://stackoverflow.com/questions/135041/should-you-always-favor-xrange-over-range) (in Python 3, `xrange()` no longer exists and `range()` itself behaves this way). Here is an answer by Brian that I like a lot:\n", "\n", "> `range(n)` creates a list containing all the integers `0 ... n-1`. This is a problem if you do `range(1000000)`, because you'll end up with a 4MB+ list. `xrange()` deals with this by returning an object that pretends to be a list, but just works out the number needed from the index asked for, and returns that." ] }, { "cell_type": "code", "execution_count": 42, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "all(data_max == data.max(axis=0))" ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([ 0.98935825, 0.84147098, 0.99060736, 0.6569866 ])" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data.max(axis=0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For simplicity, `data.max(axis=0)` returns the same maxima directly." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also use indexing with arrays as a target to assign to: " ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "collapsed": true }, "outputs": [], "source": [ "a = np.arange(5)" ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([0, 1, 2, 3, 4])" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a" ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "collapsed": true }, "outputs": [], "source": [ "a[[1,3,4]] = 0" ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([0, 0, 2, 0, 0])" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, when the list of indices contains repetitions, the assignment is done\n", "several times, leaving behind the last value: " ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "collapsed": true }, "outputs": [], "source": [ "a = np.arange(5)" ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([0, 1, 2, 3, 4])" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "collapsed": true }, "outputs": [], "source": [ "a[[0,0,2]]=[1,2,3]" ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([2, 1, 3, 3, 4])" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is reasonable enough, but watch out if you want to use Python's += construct, as it 
may not do what you expect: " ] }, { "cell_type": "code", "execution_count": 52, "metadata": { "collapsed": true }, "outputs": [], "source": [ "a = np.arange(5)" ] }, { "cell_type": "code", "execution_count": 53, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([0, 1, 2, 3, 4])" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a" ] }, { "cell_type": "code", "execution_count": 54, "metadata": { "collapsed": true }, "outputs": [], "source": [ "a[[0,0,2]]+=1" ] }, { "cell_type": "code", "execution_count": 55, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "array([1, 1, 3, 3, 4])" ] }, "execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Even though 0 occurs twice in the list of indices, the 0th element is only\n", "incremented once. This is because Python requires \"a+=1\" to be equivalent to\n", "\"a=a+1\". " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Indexing with Boolean Arrays" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When we index arrays with arrays of (integer) indices we are providing the list of indices to pick. 
\n", "\n", "With boolean indices the approach is different; we explicitly choose which items in the array we want and which ones we don't.\n", "\n", "The most natural way one can think of for boolean indexing is to use boolean\n", "arrays that have the same shape as the original array: " ] }, { "cell_type": "code", "execution_count": 56, "metadata": { "collapsed": true }, "outputs": [], "source": [ "a = np.arange(12).reshape(3,4)" ] }, { "cell_type": "code", "execution_count": 57, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[ 0, 1, 2, 3],\n", " [ 4, 5, 6, 7],\n", " [ 8, 9, 10, 11]])" ] }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a" ] }, { "cell_type": "code", "execution_count": 58, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# b is a boolean with a's shape\n", "b = a > 4" ] }, { "cell_type": "code", "execution_count": 59, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[False, False, False, False],\n", " [False, True, True, True],\n", " [ True, True, True, True]], dtype=bool)" ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "b" ] }, { "cell_type": "code", "execution_count": 60, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([ 5, 6, 7, 8, 9, 10, 11])" ] }, "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# 1d array with the selected elements\n", "a[b]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This property can be very useful in assignments: " ] }, { "cell_type": "code", "execution_count": 61, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# All elements of 'a' higher than 4 become 0\n", "a[b] = 0" ] }, { "cell_type": "code", "execution_count": 62, "metadata": { "collapsed": true }, "outputs": [], "source": [ "a[b] = 0" ] }, { "cell_type": "code", "execution_count": 63, 
"metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[0, 1, 2, 3],\n", " [4, 0, 0, 0],\n", " [0, 0, 0, 0]])" ] }, "execution_count": 63, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can look at the Mandelbrot set example to see how to use boolean indexing to\n", "generate an image of the Mandelbrot set. \n", "\n", "The second way of indexing with booleans is more similar to integer indexing; for each dimension of the array we give a 1D boolean array selecting the slices we want. " ] }, { "cell_type": "code", "execution_count": 64, "metadata": { "collapsed": true }, "outputs": [], "source": [ "a = np.arange(12).reshape(3,4)" ] }, { "cell_type": "code", "execution_count": 65, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[ 0, 1, 2, 3],\n", " [ 4, 5, 6, 7],\n", " [ 8, 9, 10, 11]])" ] }, "execution_count": 65, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a" ] }, { "cell_type": "code", "execution_count": 66, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# first dimension selection (row)\n", "b1 = np.array([False,True,True])" ] }, { "cell_type": "code", "execution_count": 67, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# second dimension selection (column)\n", "b2 = np.array([True,False,True,False]) " ] }, { "cell_type": "code", "execution_count": 68, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[ 4, 5, 6, 7],\n", " [ 8, 9, 10, 11]])" ] }, "execution_count": 68, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# selecting rows\n", "a[b1,:]" ] }, { "cell_type": "code", "execution_count": 69, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[ 4, 5, 6, 7],\n", " [ 8, 9, 10, 11]])" ] }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], 
"source": [ "a[b1] # same thing" ] }, { "cell_type": "code", "execution_count": 70, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[ 0, 2],\n", " [ 4, 6],\n", " [ 8, 10]])" ] }, "execution_count": 70, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# selecting columns\n", "a[:,b2]" ] }, { "cell_type": "code", "execution_count": 71, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[ 0, 2],\n", " [ 4, 6],\n", " [ 8, 10]])" ] }, "execution_count": 71, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[...,b2] # same thing" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The next example is less obvious. When both dimensions are indexed with boolean arrays, each boolean array is first converted to the integer positions of its `True` entries (`b1` gives `[1, 2]`, `b2` gives `[0, 2]`), and those positions are then paired element by element. The result is therefore `[a[1, 0], a[2, 2]]`, which works here because `b1` and `b2` both have two `True` entries." ] }, { "cell_type": "code", "execution_count": 72, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([ 4, 10])" ] }, "execution_count": 72, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# a weird thing to do: pairs the True positions, giving a[1,0] and a[2,2]\n", "a[b1,b2]" ] }, { "cell_type": "code", "execution_count": 73, "metadata": { "collapsed": false }, "outputs": [], "source": [ "s = tuple([b1,b2])" ] }, { "cell_type": "code", "execution_count": 74, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "(array([False, True, True], dtype=bool),\n", " array([ True, False, True, False], dtype=bool))" ] }, "execution_count": 74, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s" ] }, { "cell_type": "code", "execution_count": 75, "metadata": { "collapsed": false, "scrolled": false }, "outputs": [ { "data": { "text/plain": [ "array([ 4, 10])" ] }, "execution_count": 75, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[s]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The ix_() function" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `ix_` function can be used to combine different vectors so as to obtain the result for each n-tuple.\n", 
"\n", "For example, if you want to compute all the `a + b*c` for all the triplets taken from each of the vectors `a`, `b` and `c`: " ] }, { "cell_type": "code", "execution_count": 76, "metadata": { "collapsed": true }, "outputs": [], "source": [ "a = np.array([2,3,4,5])\n", "b = np.array([8,5,4])\n", "c = np.array([5,4,6,8,3])" ] }, { "cell_type": "code", "execution_count": 77, "metadata": { "collapsed": false }, "outputs": [], "source": [ "ax,bx,cx = np.ix_(a,b,c)" ] }, { "cell_type": "code", "execution_count": 78, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[[2]],\n", "\n", " [[3]],\n", "\n", " [[4]],\n", "\n", " [[5]]])" ] }, "execution_count": 78, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ax" ] }, { "cell_type": "code", "execution_count": 79, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[[8],\n", " [5],\n", " [4]]])" ] }, "execution_count": 79, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bx" ] }, { "cell_type": "code", "execution_count": 80, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[[5, 4, 6, 8, 3]]])" ] }, "execution_count": 80, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cx" ] }, { "cell_type": "code", "execution_count": 81, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "((4L, 1L, 1L), (1L, 3L, 1L), (1L, 1L, 5L))" ] }, "execution_count": 81, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ax.shape, bx.shape, cx.shape" ] }, { "cell_type": "code", "execution_count": 82, "metadata": { "collapsed": true }, "outputs": [], "source": [ "result = ax + bx*cx" ] }, { "cell_type": "code", "execution_count": 83, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[[42, 34, 50, 66, 26],\n", " [27, 22, 32, 42, 17],\n", " [22, 18, 26, 34, 14]],\n", "\n", " [[43, 35, 51, 67, 27],\n", " [28, 23, 33, 43, 
18],\n", " [23, 19, 27, 35, 15]],\n", "\n", " [[44, 36, 52, 68, 28],\n", " [29, 24, 34, 44, 19],\n", " [24, 20, 28, 36, 16]],\n", "\n", " [[45, 37, 53, 69, 29],\n", " [30, 25, 35, 45, 20],\n", " [25, 21, 29, 37, 17]]])" ] }, "execution_count": 83, "metadata": {}, "output_type": "execute_result" } ], "source": [ "result" ] }, { "cell_type": "code", "execution_count": 84, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "17" ] }, "execution_count": 84, "metadata": {}, "output_type": "execute_result" } ], "source": [ "result[3,2,4]" ] }, { "cell_type": "code", "execution_count": 85, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "17" ] }, "execution_count": 85, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[3]+b[2]*c[4]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You could also implement the reduce as follows: " ] }, { "cell_type": "code", "execution_count": 86, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def ufunc_reduce(ufct, *vectors):\n", " vs = np.ix_(*vectors)\n", " r = ufct.identity\n", " for v in vs:\n", " r = ufct(r,v)\n", " return r" ] }, { "cell_type": "code", "execution_count": 87, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[[15, 14, 16, 18, 13],\n", " [12, 11, 13, 15, 10],\n", " [11, 10, 12, 14, 9]],\n", "\n", " [[16, 15, 17, 19, 14],\n", " [13, 12, 14, 16, 11],\n", " [12, 11, 13, 15, 10]],\n", "\n", " [[17, 16, 18, 20, 15],\n", " [14, 13, 15, 17, 12],\n", " [13, 12, 14, 16, 11]],\n", "\n", " [[18, 17, 19, 21, 16],\n", " [15, 14, 16, 18, 13],\n", " [14, 13, 15, 17, 12]]])" ] }, "execution_count": 87, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ufunc_reduce(np.add,a,b,c)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The advantage of this version of reduce compared to the normal ufunc.reduce is\n", "that it makes use of the Broadcasting Rules in order to avoid 
creating an argument array the size of the output times the number of vectors." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**New Concepts!**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that `ufct.identity` in the function above is the identity element of the ufunc (0 for `np.add`, 1 for `np.multiply`). It is not the same thing as [numpy.identity](http://docs.scipy.org/doc/numpy/reference/generated/numpy.identity.html), which creates an identity array." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The use of `*args` and `**kwargs` is explained in this [StackOverflow question](http://stackoverflow.com/questions/4306574/python-method-function-arguments-starting-with-asterisk-and-dual-asterisk)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`repr()` is a Python built-in function, documented [here](https://docs.python.org/2/library/functions.html#func-repr): it returns a string containing a printable representation of an object." ] }, { "cell_type": "code", "execution_count": 88, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def printlist(*args):\n", "    print repr(args)" ] }, { "cell_type": "code", "execution_count": 89, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(1, 2, 3, 4, 5)\n" ] } ], "source": [ "printlist(1, 2, 3, 4, 5) # or as many more arguments as I'd like" ] }, { "cell_type": "code", "execution_count": 90, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def printdict(**kwargs):\n", "    print repr(kwargs)" ] }, { "cell_type": "code", "execution_count": 91, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'john': 10, 'jill': 12, 'david': 15}\n" ] } ], "source": [ "printdict(john=10, jill=12, david=15)" ] }, { "cell_type": "code", "execution_count": 92, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# `first` takes the first positional argument, so it does not appear in args\n", "def printlistdict(first, *args, **kwargs):\n", "    print repr(args)\n", "    print repr(kwargs)" ] }, { "cell_type": "code", "execution_count": 93, "metadata": { 
"collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(2, 3, 4, 5)\n", "{'john': 10, 'jill': 12, 'david': 15}\n" ] } ], "source": [ "printlistdict(1, 2, 3, 4, 5, john=10, jill=12, david=15)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Indexing with Strings" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "See [RecordArrays](http://docs.scipy.org/doc/numpy/user/basics.rec.html). " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Numpy provides powerful capabilities to create arrays of structs or records. These arrays permit one to manipulate the data by the structs or by fields of the struct. A simple example will show what is meant.:" ] }, { "cell_type": "code", "execution_count": 94, "metadata": { "collapsed": true }, "outputs": [], "source": [ "x = np.zeros((2,),dtype=('i4,f4,a10'))" ] }, { "cell_type": "code", "execution_count": 95, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([(0, 0.0, ''), (0, 0.0, '')], \n", " dtype=[('f0', '