{ "cells": [ { "cell_type": "markdown", "metadata": { "internals": { "slide_helper": "subslide_end", "slide_type": "subslide" }, "slide_helper": "subslide_end", "slideshow": { "slide_type": "slide" } }, "source": [ "# The NumPy array - and a few other things\n", "\n", "## A structure for efficient numerical computation\n", "\n", "\n", "Most material here is from: Stéfan van der Walt (@stefanvdwalt)
\n", "Some from Nicolas Rougier's book (\"From Python to Numpy\")
\n", "Some additions by JB Poline\n", "\n", "
\n", "\n", "\n", "Nipype Workshop, Boston\n", "\n", "\n", "Accompanying **problem sets** are on the Wiki at:\n", " https://python.g-node.org/wiki/numpy" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_helper": "subslide_end", "slide_type": "subslide" }, "slide_helper": "subslide_end", "slideshow": { "slide_type": "subslide" } }, "source": [ "" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_helper": "subslide_end", "slide_type": "subslide" }, "slide_helper": "subslide_end", "slideshow": { "slide_type": "subslide" } }, "source": [ "## Num-what?\n", "\n", "This talk discusses some of the more advanced NumPy features. If you’ve\n", "never seen NumPy before, you may have more fun doing the tutorial at\n", "\n", "http://mentat.za.net/numpy/intro/intro.html\n", "\n", "or studying the course notes at\n", "\n", "http://scipy-lectures.github.io\n", "\n", "You can always catch up by reading:\n", "\n", "\n", "> **The NumPy array: a structure for efficient numerical computation**. Stéfan\n", "> van der Walt, S. Chris Colbert and Gaël Varoquaux. In IEEE Computing in\n", "> Science Engineering, March/April 2011." ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_type": "subslide" }, "slideshow": { "slide_type": "subslide" } }, "source": [ "## Setup" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "import numpy as np # We always use this convention,\n", " # also in the documentation" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false, "internals": { "slide_helper": "subslide_end" }, "slide_helper": "subslide_end", "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1.11.2\n" ] } ], "source": [ "print(np.__version__) # version 1.7 or greater\n", " # Version 1.9 released 7th of September!" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_type": "subslide" }, "slideshow": { "slide_type": "subslide" } }, "source": [ "ATLAS is a fast implementation of BLAS (Basic Linear Algebra Routines). On\n", "OSX you have Accelerate; students can get Intel's MKL for free. On Ubuntu,\n", "install ``libatlas3-base``.\n", "\n", "Make use of IPython's powerful features! TAB-completion, documentation, source\n", "inspection, timing, cpaste, etc." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false, "internals": { "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "lapack_opt_info:\n", " define_macros = [('HAVE_CBLAS', None)]\n", " language = c\n", " library_dirs = ['/usr/local/lib']\n", " libraries = ['openblas', 'openblas']\n", "openblas_info:\n", " define_macros = [('HAVE_CBLAS', None)]\n", " language = c\n", " library_dirs = ['/usr/local/lib']\n", " libraries = ['openblas', 'openblas']\n", "blas_opt_info:\n", " define_macros = [('HAVE_CBLAS', None)]\n", " language = c\n", " library_dirs = ['/usr/local/lib']\n", " libraries = ['openblas', 'openblas']\n", "blas_mkl_info:\n", " NOT AVAILABLE\n", "openblas_lapack_info:\n", " define_macros = [('HAVE_CBLAS', None)]\n", " language = c\n", " library_dirs = ['/usr/local/lib']\n", " libraries = ['openblas', 'openblas']\n", "None\n" ] } ], "source": [ "print(np.show_config ()) # got ATLAS / Accelerate / MKL?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Ultra fast primer" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "l*2: [1, 2, 1, 2] ; a*2: [2 4]\n", "l+l+l: [1, 2, 1, 2, 1, 2] ; a+a+a: [3 6]\n" ] } ], "source": [ "# Array are not list\n", "l = [1,2]\n", "a = np.asarray(l)\n", "print('l*2: ', l*2, '; a*2: ', a*2)\n", "#\n", "print('l+l+l: ', l+l+l, '; a+a+a: ',a+a+a)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Arrays are mutable\n", "\n", "So, if you pass an array as an argument and change it in a function..." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0 1 2 3]\n", "[-10 1 2 3]\n" ] } ], "source": [ "a = np.arange(4)\n", "def change_my_array(arr):\n", " arr[0] = -10\n", " return None\n", " \n", "print(a)\n", "change_my_array(a)\n", "print(a)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(200000000,)\n", "100000000 loops, best of 3: 7.2 ns per loop\n", "100000000 loops, best of 3: 7.15 ns per loop\n" ] } ], "source": [ "# READABILITY\n", "\n", "def function_1(seq, sub):\n", " return [i for i in range(len(seq) - len(sub)) if seq[i:i+len(sub)] == sub]\n", "\n", "def function_2(seq, sub):\n", " target = np.dot(sub, sub)\n", " candidates = np.where(np.correlate(seq, sub, mode='valid') == target)[0]\n", " check = candidates[:, np.newaxis] + np.arange(len(sub))\n", " mask = np.all((np.take(seq, check) == sub), axis=-1)\n", " return candidates[mask]\n", "\n", "from timeit import timeit\n", "n, m = 5000, 200\n", "s = np.kron(np.arange(n),np.eye(m)).reshape(n*m*m)\n", "print(s.shape)\n", "\n", "%timeit(\"function_1(s, [6,7,8,9])\")\n", "%timeit(\"function_2(s, [6,7,8,9])\")" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## The NumPy ndarray\n", "\n", "### Revision: Structure of an array" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_helper": "subslide_end" }, "slide_helper": "subslide_end", "slideshow": { "slide_type": "-" } }, "source": [ "" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_helper": "subslide_end", "slide_type": "subslide" }, "slide_helper": "subslide_end", "slideshow": { "slide_type": "subslide" } }, "source": [ "" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_helper": "subslide_end", "slide_type": "subslide" }, "slide_helper": "subslide_end", "slideshow": { "slide_type": "subslide" } }, "source": [ "Taking a look at ``numpy/core/include/numpy/ndarraytypes.h``:\n", "\n", "```cpp\n", "ndarray typedef struct PyArrayObject {\n", " PyObject_HEAD\n", " char *data; /* pointer to data buffer */\n", " int nd; /* number of dimensions */\n", " npy_intp *dimensions; /* size in each dimension */\n", " npy_intp *strides; /* bytes to jump to get\n", " * to the next element in\n", " * each dimension\n", " */\n", " PyObject *base; /* Pointer to original array\n", " /* Decref this object */\n", " /* upon deletion. */\n", " PyArray_Descr *descr; /* Pointer to type struct */\n", " int flags; /* Flags */\n", " PyObject *weakreflist; /* For weakreferences */\n", "} PyArrayObject ;\n", "```" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 1. 1. 1. 1.]\n", " [ 1. 1. 1. 1.]\n", " [ 1. 1. 1. 1.]]\n", "['T', '__abs__', '__add__', '__and__', '__array__', '__array_finalize__', '__array_interface__', '__array_prepare__', '__array_priority__', '__array_struct__', '__array_wrap__', '__bool__', '__class__', '__contains__', '__copy__', '__deepcopy__', '__delattr__', '__delitem__', '__dir__', '__divmod__', '__doc__', '__eq__', '__float__', '__floordiv__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__iadd__', '__iand__', '__ifloordiv__', '__ilshift__', '__imatmul__', '__imod__', '__imul__', '__index__', '__init__', '__int__', '__invert__', '__ior__', '__ipow__', '__irshift__', '__isub__', '__iter__', '__itruediv__', '__ixor__', '__le__', '__len__', '__lshift__', '__lt__', '__matmul__', '__mod__', '__mul__', '__ne__', '__neg__', '__new__', '__or__', '__pos__', '__pow__', '__radd__', '__rand__', '__rdivmod__', '__reduce__', '__reduce_ex__', '__repr__', '__rfloordiv__', '__rlshift__', '__rmatmul__', '__rmod__', '__rmul__', '__ror__', '__rpow__', '__rrshift__', '__rshift__', '__rsub__', '__rtruediv__', '__rxor__', '__setattr__', '__setitem__', '__setstate__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__truediv__', '__xor__', 'all', 'any', 'argmax', 'argmin', 'argpartition', 'argsort', 'astype', 'base', 'byteswap', 'choose', 'clip', 'compress', 'conj', 'conjugate', 'copy', 'ctypes', 'cumprod', 'cumsum', 'data', 'diagonal', 'dot', 'dtype', 'dump', 'dumps', 'fill', 'flags', 'flat', 'flatten', 'getfield', 'imag', 'item', 'itemset', 'itemsize', 'max', 'mean', 'min', 'nbytes', 'ndim', 'newbyteorder', 'nonzero', 'partition', 'prod', 'ptp', 'put', 'ravel', 'real', 'repeat', 'reshape', 'resize', 'round', 'searchsorted', 'setfield', 'setflags', 'shape', 'size', 'sort', 'squeeze', 'std', 'strides', 'sum', 'swapaxes', 'take', 'tobytes', 'tofile', 'tolist', 'tostring', 'trace', 'transpose', 'var', 'view']\n" ] } ], "source": [ "a = np.ones((3,4))\n", "print(a)\n", "print(dir(a))\n" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_type": "subslide" }, "slideshow": { "slide_type": "subslide" } }, "source": [ "## A homogeneous container" ] }, { "cell_type": "markdown", "metadata": { "internals": {}, "slideshow": { "slide_type": "-" } }, "source": [ "```cpp\n", "char *data; /* pointer to data buffer */\n", "```" ] }, { "cell_type": "markdown", "metadata": { "internals": {}, "slideshow": { "slide_type": "-" } }, "source": [ "Data is just a pointer to bytes in memory:" ] }, { "cell_type": "markdown", "metadata": { "internals": {}, "slideshow": { "slide_type": "-" } }, "source": [ "" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "x = np.array([1, 2, 3])" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "dtype('int64')" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x.dtype" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "(32558752, False)" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x.__array_interface__['data']" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "print(x.data)" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_type": "subslide" }, "slideshow": { "slide_type": "subslide" } }, "source": [ "## Dimensions" ] }, { "cell_type": "markdown", "metadata": { "internals": {}, "slideshow": { "slide_type": "-" } }, "source": [ "```cpp\n", "int nd;\n", "npy_intp *dimensions; /* number of dimensions */\n", " /* size in each dimension */\n", "```" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_helper": "subslide_end" }, "slide_helper": "subslide_end", "slideshow": { "slide_type": "-" } }, "source": [ "\n", "
\n", "
\n", "\n", "\n", "
\n" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false, "internals": { "slide_type": "subslide" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "x = np.array([])" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "()" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.array(0).shape" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "(3, 2, 3, 3)" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = np.random.random((3, 2, 3, 3))\n", "x.shape" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_type": "subslide" }, "slideshow": { "slide_type": "subslide" } }, "source": [ "### Data type descriptors" ] }, { "cell_type": "markdown", "metadata": { "internals": {}, "slideshow": { "slide_type": "-" } }, "source": [ "```cpp\n", "PyArray_Descr * descr ; /* Pointer to type struct */\n", "```" ] }, { "cell_type": "markdown", "metadata": { "internals": {}, "slideshow": { "slide_type": "-" } }, "source": [ "Common types in include ``int``, ``float``, ``bool``:" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([-1, 0, 1])" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.array ([-1, 0, 1], dtype=int)" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([-1., 0., 1.])" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.array([-1, 0, 1], dtype=float)" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([ True, False, True], dtype=bool)" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.array([-1 , 0, 1], dtype=bool)" ] }, { "cell_type": "markdown", "metadata": { "internals": {}, "slideshow": { "slide_type": "-" } }, "source": [ "Each item in the array has to have the same type (occupy a fixed nr of bytes\n", "in memory), but that does not mean a type has to consist of a single item:" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([(0, True), (1, False)], \n", " dtype=[('value', '" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_helper": "subslide_end", "slide_type": "subslide" }, "slide_helper": "subslide_end", "slideshow": { "slide_type": "subslide" } }, "source": [ "### Example: Taking the transpose\n", "\n", "We can take the transpose of the array *without* copying any data by simply manipulating ``shape`` and ``strides``. What should they be?\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## View versus copy\n", "\n", "### A copy will allocate new memory, a view will not\n", "### A change in the copy will not affect the original array, a change in the view will\n", "### Some numpy operation return copies, others return views\n", " - flatten return a copy\n", " - slicing will return a view\n", " - ravel will attempt to return a view\n", " - reshape returns a view \n", "### Check if array is a copy or a view with flags" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "collapsed": false, "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "Z = np.zeros((5,5))\n", "Z.ravel().base is Z == True\n", "#\n", "Z[::2,::2].ravel().base is Z == True\n", "#\n", "assert(Z.flatten().base is not Z)\n", "#\n", "assert(Z.reshape((25,1)).base is Z)" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1 3 5 7]\n" ] } ], "source": [ "Z1 = np.arange(10)\n", "Z2 = Z1[1:-1:2]\n", "print(Z2)" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_type": "subslide" }, "slideshow": { "slide_type": "subslide" } }, "source": [ "### Flags" ] }, { "cell_type": "markdown", "metadata": { "internals": {}, "slideshow": { "slide_type": "-" } }, "source": [ "```cpp\n", "int flags; /* Flags */\n", "```" ] }, { "cell_type": "code", "execution_count": 113, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1 3]\n", " C_CONTIGUOUS : True\n", " F_CONTIGUOUS : True\n", " OWNDATA : True\n", " WRITEABLE : True\n", " ALIGNED : True\n", " UPDATEIFCOPY : False\n" ] } ], "source": [ "x = np.array([1, 2, 3])\n", "z = x[::2]\n", "print(z)\n", "print(x.flags)" ] }, { "cell_type": "code", "execution_count": 114, "metadata": { "collapsed": false, "internals": { "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ " C_CONTIGUOUS : False\n", " F_CONTIGUOUS : False\n", " OWNDATA : False\n", " WRITEABLE : True\n", " ALIGNED : True\n", " UPDATEIFCOPY : False" ] }, "execution_count": 114, "metadata": {}, "output_type": "execute_result" } ], "source": [ "z.flags" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Temporary copy\n", "\n", " >>> X = np.ones(10, dtype=np.int)\n", " >>> Y = np.ones(10, dtype=np.int)\n", " >>> A = 2*X + 2*Y\n", " \n", " Versus:\n", " \n", " >>> X = np.ones(10, dtype=np.int)\n", " >>> Y = np.ones(10, dtype=np.int)\n", " >>> np.multiply(X, 2, out=X)\n", " >>> np.multiply(Y, 2, out=Y)\n", " >>> np.add(X, Y, out=X)" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Structured arrays\n", "\n", "Like we saw earlier, each item in an array has the same type, but that does not\n", "mean a type has to consist of a single item:" ] }, { "cell_type": "code", "execution_count": 179, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([(0, True), (1, False)], \n", " dtype=[('value', '" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_type": "subslide" }, "slideshow": { "slide_type": "subslide" } }, "source": [ "Reading this kind of data can be troublesome:" ] }, { "cell_type": "markdown", "metadata": { "internals": {}, "slideshow": { "slide_type": "-" } }, "source": [ "```matlab\n", "while ((count > 0) && (n <= NumPoints))\n", " % get time - I8 [ ms ]\n", " [lw, count] = fread(fid, 1, 'uint32');\n", " if (count > 0) % then carry on\n", " uw = fread(fid, 1, 'int32');\n", " t(1, n) = (lw + uw * 2^32) / 1000;\n", "\n", " % get number of bytes of data\n", " numbytes = fread(fid, 1, 'uint32');\n", "\n", " % read sMEASUREMENTPOSITIONINFO (11 bytes)\n", " m(1, n) = fread(fid, 1, 'float32'); % az [ rad ]\n", " m(2, n) = fread(fid, 1, 'float32'); % el [ rad ]\n", " m(3, n) = fread(fid, 1, 'uint8'); % region type\n", " m(4, n) = fread(fid, 1, 'uint16'); % region ID\n", " g(1, n) = fread(fid, 1, 'uint8');\n", "\n", " numsamples = ( numbytes -12)/2; % 2 byte int\n", " ...\n", "```" ] }, { "cell_type": "markdown", "metadata": { "internals": {}, "slideshow": { "slide_type": "subslide" } }, "source": [ "Using structured arrays:\n", "\n", "```python\n", "dt = np.dtype([('time', np.uint64),\n", " ('size', np.uint32),\n", " ('position', [('az', np.float32),\n", " ('el', np.float32),\n", " ('region_type', np.uint8),\n", " ('region_ID', np.uint16)]),\n", " ('gain', np.uint8),\n", " ('samples', (np.int16, 2048))])\n", "\n", "data = np.fromfile(f, dtype=dt)\n", "```" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "-" } }, "source": [ "We can then access this structured array as before:\n", "\n", "```python\n", "data['position']['az']\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Array functions" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 0.64488785 29.78139454]\n", "[[ 1.69439088 0.72536413 -2.02163215 -1.00667092 1.8816507 -0.03026824\n", " -0.93375708 0.14278375 -0.09448514 0.41436576]\n", " [ 1.715446 1.08301563 1.59472585 0.50601605 -0.5222702 -0.53497445\n", " -2.05725302 -1.86172097 0.56197185 0.49018729]\n", " [-0.06686053 -0.1104309 -1.20906256 -0.70505399 1.41420279 1.49779434\n", " -0.22250591 -0.09190751 0.67891195 0.01016829]]\n", "[[6 5 0 4 7 9 1 9 6 7]\n", " [2 5 3 4 4 4 1 1 2 3]\n", " [5 6 7 8 6 2 4 5 6 6]]\n" ] } ], "source": [ "a = np.random.normal((3,30))\n", "print(a)\n", "a = np.random.normal(0,1, size=(3,10))\n", "print(a)\n", "a = np.random.randint(0,10, size=(3,10))\n", "print(a)\n" ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "a.min(), a.max(), a.mean(), a.std(), a.sum()\n", "1 6 3.5 1.70782512766 21\n", "[ 0 1 3 6 10 15 21]\n", "[ 1 2 6 24 120 720]\n" ] } ], "source": [ "print(\"a.min(), a.max(), a.mean(), a.std(), a.sum()\")\n", "print(a.min(), a.max(), a.mean(), a.std(), a.sum())\n", "a = np.arange(7)\n", "print(a.cumsum())\n", "a = np.arange(1,7)\n", "print(a.cumprod())" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Broadcasting\n", "\n", "Combining of differently shaped arrays without creating large intermediate\n", "arrays:" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0 1 2 3]\n", "[3 4 5 6]\n" ] } ], "source": [ "x = np.arange(4)\n", "print(x)\n", "print(x + 3)" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_helper": "subslide_end" }, "slide_helper": "subslide_end", "slideshow": { "slide_type": "-" } }, "source": [ "![Broadcasting in 1D](images/broadcast_1D.png)\n", "\n", "See docstring of ``np.doc.broadcasting`` for more detail." ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_type": "subslide" }, "slideshow": { "slide_type": "subslide" } }, "source": [ "## Broadcasting (2D)" ] }, { "cell_type": "code", "execution_count": 180, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 0 1 2 3]\n", " [ 4 5 6 7]\n", " [ 8 9 10 11]]\n", "\n", "[[1]\n", " [2]\n", " [3]]\n" ] } ], "source": [ "a = np.arange(12).reshape((3, 4))\n", "b = np.array([1, 2, 3])[:, np.newaxis]\n", "\n", "print(a)\n", "print()\n", "print(b)" ] }, { "cell_type": "code", "execution_count": 181, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([[ 1, 2, 3, 4],\n", " [ 6, 7, 8, 9],\n", " [11, 12, 13, 14]])" ] }, "execution_count": 181, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a + b" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_helper": "subslide_end" }, "slide_helper": "subslide_end", "slideshow": { "slide_type": "-" } }, "source": [ "" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_type": "subslide" }, "slideshow": { "slide_type": "subslide" } }, "source": [ "## Broadcasting (3D)" ] }, { "cell_type": "code", "execution_count": 41, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(3, 5)\n", "(3, 5, 1)\n", "(3, 5, 8)\n" ] } ], "source": [ "x = np.zeros((3, 5))\n", "print(x.shape)\n", "y = np.zeros(8)\n", "z = x[..., np.newaxis]\n", "print(z.shape)\n", "w = z + y\n", "print(w.shape)" ] }, { "cell_type": "markdown", "metadata": { "internals": {}, "slideshow": { "slide_type": "-" } }, "source": [ "```python\n", "x = np.zeros((3, 5))\n", "y = np.zeros(8)\n", "z = x[..., np.newaxis]\n", "```\n", "\n", "```\n", "X shape = (3, 5, 1)\n", "Y shape = ( 8)\n", "Z shape = (3, 5, 8)\n", "```" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_helper": "subslide_end" }, "slide_helper": "subslide_end", "slideshow": { "slide_type": "-" } }, "source": [ "" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_type": "subslide" }, "slideshow": { "slide_type": "subslide" } }, "source": [ "## Broadcasting rules\n", "\n", "The broadcasting rules are straightforward: compare dimensions, starting from\n", "the last. Match when either dimension is one or ``None``, or if dimensions are\n", "equal:" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_helper": "subslide_end" }, "slide_helper": "subslide_end", "slideshow": { "slide_type": "-" } }, "source": [ "```\n", "Scalar 2D 3D Bad\n", "\n", "( ,) (3, 4) (3, 5, 1) (3, 5, 2)\n", "(3,) (3, 1) ( 8) ( 8)\n", "---- ------ --------- ---------\n", "(3,) (3, 4) (3, 5, 8) XXX\n", "```" ] }, { "cell_type": "code", "execution_count": 204, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "(3, 7, 5, 4)" ] }, "execution_count": 204, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a = np.random.normal(size=(3,1,5,1))\n", "b = np.random.normal(size=(7,1,4))\n", "\n", "c=a+b\n", "c.shape" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_type": "subslide" }, "slideshow": { "slide_type": "subslide" } }, "source": [ "## Explicit broadcasting" ] }, { "cell_type": "code", "execution_count": 205, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "(3, 5, 8)" ] }, "execution_count": 205, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = np.zeros((3, 5, 1))\n", "y = np.zeros((3, 5, 8))\n", "\n", "xx, yy = np.broadcast_arrays(x, y)\n", "\n", "xx.shape" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_helper": "subslide_end" }, "slide_helper": "subslide_end", "slideshow": { "slide_type": "-" } }, "source": [ "```python\n", "np.broadcast_arrays([1, 2, 3],\n", " [[1], [2], [3]])\n", "```\n", "\n", "```\n", "[[1 2 3] [[1 1 1]\n", " [1 2 3] [2 2 2]\n", " [1 2 3]] [3 3 3]]\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Some useful array manipulations \n" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[False False False False]\n", " [False False True False]\n", " [False False False False]]\n", "[[False False False False]\n", " [False False True True]\n", " [False False False False]]\n" ] }, { "data": { "text/plain": [ "array([6, 7])" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# meshgrid ? \n", "# np.where\n", "# boolean operation\n", "\n", "a = np.arange(12).reshape(3,4)\n", "b = (a == 6) \n", "print(b)\n", "\n", "b = np.logical_or(a == 6, a == 7)\n", "print(b)\n", "a[b]" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 0 1 2 3]\n", " [ 4 5 6 7]\n", " [ 8 9 10 11]]\n", "(array([1, 1, 1, 1, 2, 2, 2, 2]), array([0, 1, 2, 3, 0, 1, 2, 3]))\n", "[ 4 5 6 7 8 9 10 11]\n", "[[False False False False]\n", " [ True True True True]\n", " [ True True True True]]\n" ] } ], "source": [ "print(a)\n", "print(np.where(a>3))\n", "print(a[np.where(a>3)])\n", "print(a > 3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Axis operation" ] }, { "cell_type": "code", "execution_count": 219, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# reshape\n", "# flatten / ravel\n", "# transpose\n", "# axis rotation\n" ] }, { "cell_type": "code", "execution_count": 229, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(1, 3, 2, 4)\n", "(3, 1, 2, 4)\n", "(1, 2, 3, 4)\n" ] } ], "source": [ "a = np.arange(24).reshape(1,2,3,4)\n", "a.T.shape\n", "b = np.rollaxis(a, 2, 1)\n", "print(b.shape)\n", "b = np.rollaxis(a, 2, 0)\n", "print(b.shape)\n", "b = np.rollaxis(a, 2, 3)\n", "print(b.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## hstack - vstack" ] }, { "cell_type": "code", "execution_count": 53, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(4, 10)\n", "(10, 4)\n", "(4, 10)\n", "(10, 4)\n" ] } ], "source": [ "a1 = np.ones((1,10))\n", "a3 = np.ones((3,10))\n", "\n", "# vstack\n", "a4 = np.vstack((a1,a3))\n", "print(a4.shape)\n", "\n", "# hstack\n", "a10 = np.hstack((a1.T, a3.T))\n", "print(a10.shape)\n", "\n", "print(np.concatenate((a1, a3), axis=0).shape)\n", "print(np.concatenate((a1.T, a3.T), axis=1).shape)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## dot product: at the basis of so many things :)" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 1.38102561 0.27409858 1.26081074 0.2600873 ]\n", " [ 0.86771584 0.85718828 0.21770531 0.94686422]\n", " [ 5.09592854 0.25703762 2.07068062 0.36530305]\n", " [ 1.4798213 0.37511234 7.01844686 2.6445728 ]]\n", "True\n", "0.0\n" ] } ], "source": [ "I = np.eye(4)\n", "a = np.random.lognormal(size=(4,4))\n", "print(a)\n", "b = a.dot(I)\n", "print(np.allclose(b, a))\n", "print(((a-b)**2).sum())" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[ 1.38102561, 0.27409858, 1.26081074, 0.2600873 ],\n", " [ 0.86771584, 0.85718828, 0.21770531, 0.94686422],\n", " [ 5.09592854, 0.25703762, 2.07068062, 0.36530305],\n", " [ 1.4798213 , 0.37511234, 7.01844686, 2.6445728 ]])" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "b@I" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_helper": "subslide_end", "slide_type": "subslide" }, "slide_helper": "subslide_end", "slideshow": { "slide_type": "slide" } }, "source": [ "## Fancy indexing\n", "\n", "An ndarray can be indexed in two ways:\n", "\n", " - Using slices and scalars\n", " - Using ndarrays (\"fancy indexing\")\n", "\n", "For example:" ] }, { "cell_type": "code", "execution_count": 184, "metadata": { "collapsed": false, "internals": { "slide_type": "subslide" }, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/plain": [ "array([[0, 1, 2],\n", " [3, 4, 5],\n", " [6, 7, 8]])" ] }, "execution_count": 184, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = np.arange(9).reshape((3, 3))\n", "x" ] }, { "cell_type": "code", "execution_count": 185, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([[1, 1, 2],\n", " [4, 4, 5],\n", " [7, 7, 8]])" ] }, "execution_count": 185, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x[:, [1, 1, 2]]" ] }, { "cell_type": "markdown", "metadata": { "internals": {}, "slideshow": { "slide_type": "-" } }, "source": [ "Which is equivalent to:" ] }, { "cell_type": "code", "execution_count": 186, "metadata": { "collapsed": false, "internals": { "slide_helper": "subslide_end" }, "slide_helper": "subslide_end", "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([[1, 1, 2],\n", " [4, 4, 5],\n", " [7, 7, 8]])" ] }, "execution_count": 186, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.array((x[:, 1], x[:, 1], x[:, 2])).T" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_type": "subslide" }, "slideshow": { "slide_type": "subslide" } }, "source": [ "### Output shape of an indexing op\n", "\n", "1. Broadcast all index arrays against one another.\n", "2. Use the dimensions of slices as-is." ] }, { "cell_type": "code", "execution_count": 188, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[0 1 2]\n", " [3 4 5]\n", " [6 7 8]]\n" ] } ], "source": [ "x = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])\n", "print(x)" ] }, { "cell_type": "code", "execution_count": 189, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "(3, 3)" ] }, "execution_count": 189, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x.shape" ] }, { "cell_type": "markdown", "metadata": { "internals": {}, "slideshow": { "slide_type": "-" } }, "source": [ "But what would happen if we do" ] }, { "cell_type": "code", "execution_count": 191, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([[0, 4],\n", " [3, 7]])" ] }, "execution_count": 191, "metadata": {}, "output_type": "execute_result" } ], "source": [ "idx0 = np.array([[0, 1],\n", " [1, 2]]) # row indices\n", "\n", "idx1 = np.array([[0, 1]]) # column indices\n" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_helper": "subslide_end" }, "slide_helper": "subslide_end", "slideshow": { "slide_type": "-" } }, "source": [ "```python\n", "x[idx0, idx1]\n", "```" ] }, { "cell_type": "code", "execution_count": 192, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[0, 4],\n", " [3, 7]])" ] }, "execution_count": 192, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x[idx0, idx1]\n" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_type": "subslide" }, "slideshow": { "slide_type": "subslide" } }, "source": [ "### Output shape of indexing op (cont'd)" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "((2, 2), (1, 2))" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "idx0.shape, idx1.shape" ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([[0, 1],\n", " [1, 2]])" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a, b = np.broadcast_arrays(idx0, idx1)\n", "a" ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([[0, 1],\n", " [0, 1]])" ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "b" ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([[0, 1, 2],\n", " [3, 4, 5],\n", " [6, 7, 8]])" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x" ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "collapsed": false, "internals": { "slide_helper": "subslide_end" }, "slide_helper": "subslide_end", "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([[0, 4],\n", " [3, 7]])" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x[idx0, idx1]\n" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_type": "subslide" }, "slideshow": { "slide_type": "subslide" } }, "source": [ "## Output shape of an indexing op (cont'd)" ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "x = np.random.random((15, 12, 16, 3))\n", "\n", "index_one = np.array([[0, 1], [2, 3], [4, 5]])\n", "index_two = np.array([[0, 1]])" ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(3, 2)\n", "(1, 2)\n" ] } ], "source": [ "print(index_one.shape)\n", "print(index_two.shape)" ] }, { "cell_type": "markdown", "metadata": { "internals": {}, "slideshow": { "slide_type": "-" } }, "source": [ "Predict the output shape of" ] }, { "cell_type": "markdown", "metadata": { "internals": {}, "slideshow": { "slide_type": "-" } }, "source": [ "```python\n", "x[5:10, index_one, :, index_two]\n", "```" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_helper": "subslide_end" }, "slide_helper": "subslide_end", "slideshow": { "slide_type": "-" } }, "source": [ "Warning! When mixing slicing and fancy indexing, the\n", "*order* of the output dimensions is less easy to predict.\n", "Play it safe and **don't mix the two!**" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_type": "subslide" }, "slideshow": { "slide_type": "subslide" } }, "source": [ "## Output shape of an indexing op (cont'd)" ] }, { "cell_type": "code", "execution_count": 193, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "x = np.random.random((15, 12, 16, 3))\n", "index_one = np.array([[0, 1], [2, 3], [4, 5]])\n", "index_two = np.array([[0, 1]])" ] }, { "cell_type": "code", "execution_count": 194, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(3, 2)\n", "(1, 2)\n" ] } ], "source": [ "print(index_one.shape)\n", "print(index_two.shape)" ] }, { "cell_type": "markdown", "metadata": { "internals": {}, "slideshow": { "slide_type": "-" } }, "source": [ "Broadcast ``index1`` against ``index2``:" ] }, { "cell_type": "markdown", "metadata": { "internals": {}, "slideshow": { "slide_type": "-" } }, "source": [ "```\n", "(3, 2) # shape of index1\n", "(1, 2) # shape of index2\n", "------\n", "(3, 2)\n", "```" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_helper": "subslide_end" }, "slide_helper": "subslide_end", "slideshow": { "slide_type": "-" } }, "source": [ "The shape of ``x[5:10, index_one, :, index_two]`` is\n", "\n", "``(3, 2, 5, 16)``" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_type": "subslide" }, "slideshow": { "slide_type": "subslide" } }, "source": [ "## Jack's dilemma\n", "\n", "Indexing and broadcasting are intertwined, as we’ll see in the following\n", "example. One of my favourites from the NumPy mailing list:" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_helper": "subslide_end" }, "slide_helper": "subslide_end", "slideshow": { "slide_type": "-" } }, "source": [ "```email\n", "Date: Wed, 16 Jul 2008 16:45:37 -0500\n", "From: Jack Cook\n", "To: \n", "Subject: Numpy Advanced Indexing Question\n", "```\n", "\n", "Greetings,\n", "\n", "I have an I,J,K 3D volume of amplitude values at regularly sampled\n", "time intervals. I have an I,J 2D slice which contains a time (K)\n", "value at each I, J location. What I would like to do is extract a\n", "subvolume at a constant +/- K window around the slice. Is there an\n", "easy way to do this using advanced indexing or some other method?\n", "Thanks in advanced for your help.\n", "\n", "-- Jack" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_helper": "subslide_end", "slide_type": "subslide" }, "slide_helper": "subslide_end", "slideshow": { "slide_type": "-" } }, "source": [ "\"Jack's" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_type": "subslide" }, "slideshow": { "slide_type": "subslide" } }, "source": [ "### Test setup" ] }, { "cell_type": "code", "execution_count": 55, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "ni, nj, nk = (10, 15, 20) # dimensions" ] }, { "cell_type": "markdown", "metadata": { "internals": {}, "slideshow": { "slide_type": "-" } }, "source": [ "Make a fake data block such that ``block[i, j, k] == k`` for all i, j, k." ] }, { "cell_type": "code", "execution_count": 56, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "block = np.empty((ni, nj, nk) , dtype=int)\n", "block[:] = np.arange(nk)[np.newaxis, np.newaxis, :]" ] }, { "cell_type": "markdown", "metadata": { "internals": {}, "slideshow": { "slide_type": "-" } }, "source": [ "Pick out a random fake horizon in ``k``:" ] }, { "cell_type": "code", "execution_count": 57, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([[ 5, 14, 11, 13, 8, 12, 11, 8, 13, 8, 7, 10, 12, 8, 10],\n", " [ 7, 5, 6, 8, 9, 10, 10, 14, 5, 14, 14, 12, 5, 8, 9],\n", " [ 7, 14, 8, 14, 6, 6, 13, 9, 6, 5, 6, 13, 13, 11, 7],\n", " [14, 9, 6, 12, 14, 14, 8, 10, 14, 10, 13, 12, 12, 11, 9],\n", " [12, 11, 14, 14, 13, 5, 7, 8, 8, 14, 14, 5, 7, 8, 12],\n", " [ 5, 7, 10, 11, 12, 12, 8, 9, 14, 13, 8, 6, 13, 8, 6],\n", " [ 8, 6, 12, 14, 5, 12, 5, 13, 14, 13, 10, 14, 9, 7, 6],\n", " [10, 6, 14, 14, 11, 9, 11, 8, 11, 5, 13, 13, 12, 14, 13],\n", " [ 5, 11, 9, 5, 6, 8, 7, 11, 10, 9, 14, 5, 13, 9, 6],\n", " [13, 12, 6, 12, 8, 13, 5, 11, 14, 9, 13, 13, 12, 10, 12]])" ] }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "k = np.random.randint(5, 15, size=(ni, nj))\n", "k" ] }, { "cell_type": "code", "execution_count": 58, "metadata": { "collapsed": false, "internals": { "slide_helper": "subslide_end" }, "slide_helper": "subslide_end", "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "half_width = 3" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_type": "subslide" }, "slideshow": { "slide_type": "subslide" } }, "source": [ "### Solution\n", "\n", "These two indices assure that we take a slice at each (i, j) position" ] }, { "cell_type": "code", "execution_count": 59, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "idx_i = np.arange(ni)[:, np.newaxis, np.newaxis]\n", "idx_j = np.arange(nj)[np.newaxis, :, np.newaxis]" ] }, { "cell_type": "markdown", "metadata": { "internals": {}, "slideshow": { "slide_type": "-" } }, "source": [ "This is the substantitive part that picks out the window:" ] }, { "cell_type": "code", "execution_count": 60, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "idx_k = k[:, :, np.newaxis] + np.arange(-half_width, half_width + 1)\n", "slices = block[idx_i, idx_j, idx_k] # Slice" ] }, { "cell_type": "markdown", "metadata": { "internals": {}, "slideshow": { "slide_type": "-" } }, "source": [ "Apply the broadcasting rules:" ] }, { "cell_type": "code", "execution_count": 64, "metadata": { "collapsed": false, "internals": { "slide_helper": "subslide_end" }, "slide_helper": "subslide_end", "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "(10, 15, 7)" ] }, "execution_count": 64, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(ni, 1, 1 ) # idx_i\n", "(1, nj, 1 ) # idx_j\n", "(ni, nj, 2 * half_width + 1) # idx_k\n", "# ---\n", "(ni, nj, 7) # <-- this is what we wanted!" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_type": "subslide" }, "slideshow": { "slide_type": "subslide" } }, "source": [ "### Solution verification" ] }, { "cell_type": "code", "execution_count": 65, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "(10, 15, 7)" ] }, "execution_count": 65, "metadata": {}, "output_type": "execute_result" } ], "source": [ "slices = block[idx_i, idx_j, idx_k]\n", "slices.shape\n", "(10, 15, 7)" ] }, { "cell_type": "markdown", "metadata": { "internals": {}, "slideshow": { "slide_type": "-" } }, "source": [ "Now verify that our window is centered on ``k`` everywhere:" ] }, { "cell_type": "code", "execution_count": 66, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([[ 5, 14, 11, 13, 8, 12, 11, 8, 13, 8, 7, 10, 12, 8, 10],\n", " [ 7, 5, 6, 8, 9, 10, 10, 14, 5, 14, 14, 12, 5, 8, 9],\n", " [ 7, 14, 8, 14, 6, 6, 13, 9, 6, 5, 6, 13, 13, 11, 7],\n", " [14, 9, 6, 12, 14, 14, 8, 10, 14, 10, 13, 12, 12, 11, 9],\n", " [12, 11, 14, 14, 13, 5, 7, 8, 8, 14, 14, 5, 7, 8, 12],\n", " [ 5, 7, 10, 11, 12, 12, 8, 9, 14, 13, 8, 6, 13, 8, 6],\n", " [ 8, 6, 12, 14, 5, 12, 5, 13, 14, 13, 10, 14, 9, 7, 6],\n", " [10, 6, 14, 14, 11, 9, 11, 8, 11, 5, 13, 13, 12, 14, 13],\n", " [ 5, 11, 9, 5, 6, 8, 7, 11, 10, 9, 14, 5, 13, 9, 6],\n", " [13, 12, 6, 12, 8, 13, 5, 11, 14, 9, 13, 13, 12, 10, 12]])" ] }, "execution_count": 66, "metadata": {}, "output_type": "execute_result" } ], "source": [ "slices[:, :, 3]" ] }, { "cell_type": "code", "execution_count": 67, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 67, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.all(slices[:, :, 3] == k)" ] }, { "cell_type": "markdown", "metadata": { "internals": {}, "slideshow": { "slide_type": "slide" } }, "source": [ "## The \\_\\_array\\_interface\\_\\_\n", "\n", "Any object that exposes a suitable dictionary named\n", "``__array_interface__`` may be converted to a NumPy array. This is\n", "handy for exchanging data with external libraries. The array interface\n", "has the following important keys (see\n", "http://docs.scipy.org/doc/numpy/reference/arrays.interface):\n", "\n", " - **shape**\n", " - **typestr**\n", " - **data**: (20495857, True); 2-tuple—pointer to data and boolean to\n", "indicate whether memory is read-only\n", " - **strides**\n", " - **version**: 3" ] }, { "cell_type": "markdown", "metadata": { "internals": {}, "slideshow": { "slide_type": "subslide" } }, "source": [ "## Example: Repeating sequence" ] }, { "cell_type": "code", "execution_count": 71, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(4,) (8,)\n" ] } ], "source": [ "data = np.array([1, 2, 3, 4])\n", "print(data.shape, data.strides)" ] }, { "cell_type": "code", "execution_count": 72, "metadata": { "collapsed": false, "internals": {}, "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "N = 400000\n", "repeat = np.lib.stride_tricks.as_strided(data, shape=(N, 4), strides=(0, 8))" ] }, { "cell_type": "code", "execution_count": 73, "metadata": { "collapsed": false, "internals": { "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "array([[1, 2, 3, 4],\n", " [1, 2, 3, 4],\n", " [1, 2, 3, 4],\n", " ..., \n", " [1, 2, 3, 4],\n", " [1, 2, 3, 4],\n", " [1, 2, 3, 4]])" ] }, "execution_count": 73, "metadata": {}, "output_type": "execute_result" } ], "source": [ "repeat" ] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Questions, discussion and exercizes" ] }, { "cell_type": "markdown", "metadata": { "internals": {}, "slideshow": { "slide_type": "-" } }, "source": [ "---\n", "
" ] }, { "cell_type": "code", "execution_count": 195, "metadata": { "collapsed": false, "internals": { "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%reload_ext load_style\n", "%load_style css/notebook.css" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.1+" } }, "nbformat": 4, "nbformat_minor": 0 }