{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# The NumPy array - and a few other things\n",
"\n",
"## A structure for efficient numerical computation\n",
"\n",
"\n",
"Most material here is from: Stéfan van der Walt (@stefanvdwalt)
\n",
"Some from Nicolas Rougier's book (\"From Python to Numpy\")
\n",
"Some additions by JB Poline\n",
"\n",
"\n",
"\n",
"\n",
"Nipype Workshop, Boston\n",
"\n",
"\n",
"Accompanying **problem sets** are on the Wiki at:\n",
" https://python.g-node.org/wiki/numpy"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"## Num-what?\n",
"\n",
"This talk discusses some of the more advanced NumPy features. If you’ve\n",
"never seen NumPy before, you may have more fun doing the tutorial at\n",
"\n",
"http://mentat.za.net/numpy/intro/intro.html\n",
"\n",
"or studying the course notes at\n",
"\n",
"http://scipy-lectures.github.io\n",
"\n",
"You can always catch up by reading:\n",
"\n",
"\n",
"> **The NumPy array: a structure for efficient numerical computation**. Stéfan\n",
"> van der Walt, S. Chris Colbert and Gaël Varoquaux. In IEEE Computing in\n",
"> Science Engineering, March/April 2011."
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"## Setup"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [],
"source": [
"import numpy as np # We always use this convention,\n",
" # also in the documentation"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false,
"internals": {
"slide_helper": "subslide_end"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1.11.2\n"
]
}
],
"source": [
"print(np.__version__) # version 1.7 or greater\n",
" # Version 1.9 released 7th of September!"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"ATLAS is a fast implementation of BLAS (Basic Linear Algebra Routines). On\n",
"OSX you have Accelerate; students can get Intel's MKL for free. On Ubuntu,\n",
"install ``libatlas3-base``.\n",
"\n",
"Make use of IPython's powerful features! TAB-completion, documentation, source\n",
"inspection, timing, cpaste, etc."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false,
"internals": {
"slide_helper": "subslide_end"
},
"slide_helper": "slide_end",
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"lapack_opt_info:\n",
" define_macros = [('HAVE_CBLAS', None)]\n",
" language = c\n",
" library_dirs = ['/usr/local/lib']\n",
" libraries = ['openblas', 'openblas']\n",
"openblas_info:\n",
" define_macros = [('HAVE_CBLAS', None)]\n",
" language = c\n",
" library_dirs = ['/usr/local/lib']\n",
" libraries = ['openblas', 'openblas']\n",
"blas_opt_info:\n",
" define_macros = [('HAVE_CBLAS', None)]\n",
" language = c\n",
" library_dirs = ['/usr/local/lib']\n",
" libraries = ['openblas', 'openblas']\n",
"blas_mkl_info:\n",
" NOT AVAILABLE\n",
"openblas_lapack_info:\n",
" define_macros = [('HAVE_CBLAS', None)]\n",
" language = c\n",
" library_dirs = ['/usr/local/lib']\n",
" libraries = ['openblas', 'openblas']\n",
"None\n"
]
}
],
"source": [
"print(np.show_config ()) # got ATLAS / Accelerate / MKL?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Ultra fast primer"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"l*2: [1, 2, 1, 2] ; a*2: [2 4]\n",
"l+l+l: [1, 2, 1, 2, 1, 2] ; a+a+a: [3 6]\n"
]
}
],
"source": [
"# Array are not list\n",
"l = [1,2]\n",
"a = np.asarray(l)\n",
"print('l*2: ', l*2, '; a*2: ', a*2)\n",
"#\n",
"print('l+l+l: ', l+l+l, '; a+a+a: ',a+a+a)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Arrays are mutable\n",
"\n",
"So, if you pass an array as an argument and change it in a function..."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[0 1 2 3]\n",
"[-10 1 2 3]\n"
]
}
],
"source": [
"a = np.arange(4)\n",
"def change_my_array(arr):\n",
" arr[0] = -10\n",
" return None\n",
" \n",
"print(a)\n",
"change_my_array(a)\n",
"print(a)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(200000000,)\n",
"100000000 loops, best of 3: 7.2 ns per loop\n",
"100000000 loops, best of 3: 7.15 ns per loop\n"
]
}
],
"source": [
"# READABILITY\n",
"\n",
"def function_1(seq, sub):\n",
" return [i for i in range(len(seq) - len(sub)) if seq[i:i+len(sub)] == sub]\n",
"\n",
"def function_2(seq, sub):\n",
" target = np.dot(sub, sub)\n",
" candidates = np.where(np.correlate(seq, sub, mode='valid') == target)[0]\n",
" check = candidates[:, np.newaxis] + np.arange(len(sub))\n",
" mask = np.all((np.take(seq, check) == sub), axis=-1)\n",
" return candidates[mask]\n",
"\n",
"from timeit import timeit\n",
"n, m = 5000, 200\n",
"s = np.kron(np.arange(n),np.eye(m)).reshape(n*m*m)\n",
"print(s.shape)\n",
"\n",
"%timeit(\"function_1(s, [6,7,8,9])\")\n",
"%timeit(\"function_2(s, [6,7,8,9])\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## The NumPy ndarray\n",
"\n",
"### Revision: Structure of an array"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_helper": "subslide_end"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "-"
}
},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"Taking a look at ``numpy/core/include/numpy/ndarraytypes.h``:\n",
"\n",
"```cpp\n",
"ndarray typedef struct PyArrayObject {\n",
" PyObject_HEAD\n",
" char *data; /* pointer to data buffer */\n",
" int nd; /* number of dimensions */\n",
" npy_intp *dimensions; /* size in each dimension */\n",
" npy_intp *strides; /* bytes to jump to get\n",
" * to the next element in\n",
" * each dimension\n",
" */\n",
" PyObject *base; /* Pointer to original array\n",
" /* Decref this object */\n",
" /* upon deletion. */\n",
" PyArray_Descr *descr; /* Pointer to type struct */\n",
" int flags; /* Flags */\n",
" PyObject *weakreflist; /* For weakreferences */\n",
"} PyArrayObject ;\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[ 1. 1. 1. 1.]\n",
" [ 1. 1. 1. 1.]\n",
" [ 1. 1. 1. 1.]]\n",
"['T', '__abs__', '__add__', '__and__', '__array__', '__array_finalize__', '__array_interface__', '__array_prepare__', '__array_priority__', '__array_struct__', '__array_wrap__', '__bool__', '__class__', '__contains__', '__copy__', '__deepcopy__', '__delattr__', '__delitem__', '__dir__', '__divmod__', '__doc__', '__eq__', '__float__', '__floordiv__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__iadd__', '__iand__', '__ifloordiv__', '__ilshift__', '__imatmul__', '__imod__', '__imul__', '__index__', '__init__', '__int__', '__invert__', '__ior__', '__ipow__', '__irshift__', '__isub__', '__iter__', '__itruediv__', '__ixor__', '__le__', '__len__', '__lshift__', '__lt__', '__matmul__', '__mod__', '__mul__', '__ne__', '__neg__', '__new__', '__or__', '__pos__', '__pow__', '__radd__', '__rand__', '__rdivmod__', '__reduce__', '__reduce_ex__', '__repr__', '__rfloordiv__', '__rlshift__', '__rmatmul__', '__rmod__', '__rmul__', '__ror__', '__rpow__', '__rrshift__', '__rshift__', '__rsub__', '__rtruediv__', '__rxor__', '__setattr__', '__setitem__', '__setstate__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__truediv__', '__xor__', 'all', 'any', 'argmax', 'argmin', 'argpartition', 'argsort', 'astype', 'base', 'byteswap', 'choose', 'clip', 'compress', 'conj', 'conjugate', 'copy', 'ctypes', 'cumprod', 'cumsum', 'data', 'diagonal', 'dot', 'dtype', 'dump', 'dumps', 'fill', 'flags', 'flat', 'flatten', 'getfield', 'imag', 'item', 'itemset', 'itemsize', 'max', 'mean', 'min', 'nbytes', 'ndim', 'newbyteorder', 'nonzero', 'partition', 'prod', 'ptp', 'put', 'ravel', 'real', 'repeat', 'reshape', 'resize', 'round', 'searchsorted', 'setfield', 'setflags', 'shape', 'size', 'sort', 'squeeze', 'std', 'strides', 'sum', 'swapaxes', 'take', 'tobytes', 'tofile', 'tolist', 'tostring', 'trace', 'transpose', 'var', 'view']\n"
]
}
],
"source": [
"a = np.ones((3,4))\n",
"print(a)\n",
"print(dir(a))\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"## A homogeneous container"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"source": [
"```cpp\n",
"char *data; /* pointer to data buffer */\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"source": [
"Data is just a pointer to bytes in memory:"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"source": [
""
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [],
"source": [
"x = np.array([1, 2, 3])"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"dtype('int64')"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x.dtype"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"(32558752, False)"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x.__array_interface__['data']"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n"
]
}
],
"source": [
"print(x.data)"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"## Dimensions"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"source": [
"```cpp\n",
"int nd;\n",
"npy_intp *dimensions; /* number of dimensions */\n",
" /* size in each dimension */\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_helper": "subslide_end"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "-"
}
},
"source": [
"\n",
"
\n",
"\n",
"
\n",
"
\n",
"
\n"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"collapsed": false,
"internals": {
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"x = np.array([])"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"()"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.array(0).shape"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"(3, 2, 3, 3)"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x = np.random.random((3, 2, 3, 3))\n",
"x.shape"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Data type descriptors"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"source": [
"```cpp\n",
"PyArray_Descr * descr ; /* Pointer to type struct */\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"source": [
"Common types in include ``int``, ``float``, ``bool``:"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"array([-1, 0, 1])"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.array ([-1, 0, 1], dtype=int)"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"array([-1., 0., 1.])"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.array([-1, 0, 1], dtype=float)"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"array([ True, False, True], dtype=bool)"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.array([-1 , 0, 1], dtype=bool)"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"source": [
"Each item in the array has to have the same type (occupy a fixed nr of bytes\n",
"in memory), but that does not mean a type has to consist of a single item:"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"array([(0, True), (1, False)], \n",
" dtype=[('value', '"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Example: Taking the transpose\n",
"\n",
"We can take the transpose of the array *without* copying any data by simply manipulating ``shape`` and ``strides``. What should they be?\n",
"\n",
""
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## View versus copy\n",
"\n",
"### A copy will allocate new memory, a view will not\n",
"### A change in the copy will not affect the original array, a change in the view will\n",
"### Some numpy operation return copies, others return views\n",
" - flatten return a copy\n",
" - slicing will return a view\n",
" - ravel will attempt to return a view\n",
" - reshape returns a view \n",
"### Check if array is a copy or a view with flags"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"Z = np.zeros((5,5))\n",
"Z.ravel().base is Z == True\n",
"#\n",
"Z[::2,::2].ravel().base is Z == True\n",
"#\n",
"assert(Z.flatten().base is not Z)\n",
"#\n",
"assert(Z.reshape((25,1)).base is Z)"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1 3 5 7]\n"
]
}
],
"source": [
"Z1 = np.arange(10)\n",
"Z2 = Z1[1:-1:2]\n",
"print(Z2)"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Flags"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"source": [
"```cpp\n",
"int flags; /* Flags */\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 113,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[1 3]\n",
" C_CONTIGUOUS : True\n",
" F_CONTIGUOUS : True\n",
" OWNDATA : True\n",
" WRITEABLE : True\n",
" ALIGNED : True\n",
" UPDATEIFCOPY : False\n"
]
}
],
"source": [
"x = np.array([1, 2, 3])\n",
"z = x[::2]\n",
"print(z)\n",
"print(x.flags)"
]
},
{
"cell_type": "code",
"execution_count": 114,
"metadata": {
"collapsed": false,
"internals": {
"slide_helper": "subslide_end"
},
"slide_helper": "slide_end",
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
" C_CONTIGUOUS : False\n",
" F_CONTIGUOUS : False\n",
" OWNDATA : False\n",
" WRITEABLE : True\n",
" ALIGNED : True\n",
" UPDATEIFCOPY : False"
]
},
"execution_count": 114,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"z.flags"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Temporary copy\n",
"\n",
" >>> X = np.ones(10, dtype=np.int)\n",
" >>> Y = np.ones(10, dtype=np.int)\n",
" >>> A = 2*X + 2*Y\n",
" \n",
" Versus:\n",
" \n",
" >>> X = np.ones(10, dtype=np.int)\n",
" >>> Y = np.ones(10, dtype=np.int)\n",
" >>> np.multiply(X, 2, out=X)\n",
" >>> np.multiply(Y, 2, out=Y)\n",
" >>> np.add(X, Y, out=X)"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Structured arrays\n",
"\n",
"Like we saw earlier, each item in an array has the same type, but that does not\n",
"mean a type has to consist of a single item:"
]
},
{
"cell_type": "code",
"execution_count": 179,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"array([(0, True), (1, False)], \n",
" dtype=[('value', '"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"Reading this kind of data can be troublesome:"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"source": [
"```matlab\n",
"while ((count > 0) && (n <= NumPoints))\n",
" % get time - I8 [ ms ]\n",
" [lw, count] = fread(fid, 1, 'uint32');\n",
" if (count > 0) % then carry on\n",
" uw = fread(fid, 1, 'int32');\n",
" t(1, n) = (lw + uw * 2^32) / 1000;\n",
"\n",
" % get number of bytes of data\n",
" numbytes = fread(fid, 1, 'uint32');\n",
"\n",
" % read sMEASUREMENTPOSITIONINFO (11 bytes)\n",
" m(1, n) = fread(fid, 1, 'float32'); % az [ rad ]\n",
" m(2, n) = fread(fid, 1, 'float32'); % el [ rad ]\n",
" m(3, n) = fread(fid, 1, 'uint8'); % region type\n",
" m(4, n) = fread(fid, 1, 'uint16'); % region ID\n",
" g(1, n) = fread(fid, 1, 'uint8');\n",
"\n",
" numsamples = ( numbytes -12)/2; % 2 byte int\n",
" ...\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {},
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"Using structured arrays:\n",
"\n",
"```python\n",
"dt = np.dtype([('time', np.uint64),\n",
" ('size', np.uint32),\n",
" ('position', [('az', np.float32),\n",
" ('el', np.float32),\n",
" ('region_type', np.uint8),\n",
" ('region_ID', np.uint16)]),\n",
" ('gain', np.uint8),\n",
" ('samples', (np.int16, 2048))])\n",
"\n",
"data = np.fromfile(f, dtype=dt)\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_helper": "subslide_end"
},
"slide_helper": "slide_end",
"slideshow": {
"slide_type": "-"
}
},
"source": [
"We can then access this structured array as before:\n",
"\n",
"```python\n",
"data['position']['az']\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Array functions"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[ 0.64488785 29.78139454]\n",
"[[ 1.69439088 0.72536413 -2.02163215 -1.00667092 1.8816507 -0.03026824\n",
" -0.93375708 0.14278375 -0.09448514 0.41436576]\n",
" [ 1.715446 1.08301563 1.59472585 0.50601605 -0.5222702 -0.53497445\n",
" -2.05725302 -1.86172097 0.56197185 0.49018729]\n",
" [-0.06686053 -0.1104309 -1.20906256 -0.70505399 1.41420279 1.49779434\n",
" -0.22250591 -0.09190751 0.67891195 0.01016829]]\n",
"[[6 5 0 4 7 9 1 9 6 7]\n",
" [2 5 3 4 4 4 1 1 2 3]\n",
" [5 6 7 8 6 2 4 5 6 6]]\n"
]
}
],
"source": [
"a = np.random.normal((3,30))\n",
"print(a)\n",
"a = np.random.normal(0,1, size=(3,10))\n",
"print(a)\n",
"a = np.random.randint(0,10, size=(3,10))\n",
"print(a)\n"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"a.min(), a.max(), a.mean(), a.std(), a.sum()\n",
"1 6 3.5 1.70782512766 21\n",
"[ 0 1 3 6 10 15 21]\n",
"[ 1 2 6 24 120 720]\n"
]
}
],
"source": [
"print(\"a.min(), a.max(), a.mean(), a.std(), a.sum()\")\n",
"print(a.min(), a.max(), a.mean(), a.std(), a.sum())\n",
"a = np.arange(7)\n",
"print(a.cumsum())\n",
"a = np.arange(1,7)\n",
"print(a.cumprod())"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Broadcasting\n",
"\n",
"Combining of differently shaped arrays without creating large intermediate\n",
"arrays:"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[0 1 2 3]\n",
"[3 4 5 6]\n"
]
}
],
"source": [
"x = np.arange(4)\n",
"print(x)\n",
"print(x + 3)"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_helper": "subslide_end"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "-"
}
},
"source": [
"![Broadcasting in 1D](images/broadcast_1D.png)\n",
"\n",
"See docstring of ``np.doc.broadcasting`` for more detail."
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"## Broadcasting (2D)"
]
},
{
"cell_type": "code",
"execution_count": 180,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[ 0 1 2 3]\n",
" [ 4 5 6 7]\n",
" [ 8 9 10 11]]\n",
"\n",
"[[1]\n",
" [2]\n",
" [3]]\n"
]
}
],
"source": [
"a = np.arange(12).reshape((3, 4))\n",
"b = np.array([1, 2, 3])[:, np.newaxis]\n",
"\n",
"print(a)\n",
"print()\n",
"print(b)"
]
},
{
"cell_type": "code",
"execution_count": 181,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 1, 2, 3, 4],\n",
" [ 6, 7, 8, 9],\n",
" [11, 12, 13, 14]])"
]
},
"execution_count": 181,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"a + b"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_helper": "subslide_end"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "-"
}
},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"## Broadcasting (3D)"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(3, 5)\n",
"(3, 5, 1)\n",
"(3, 5, 8)\n"
]
}
],
"source": [
"x = np.zeros((3, 5))\n",
"print(x.shape)\n",
"y = np.zeros(8)\n",
"z = x[..., np.newaxis]\n",
"print(z.shape)\n",
"w = z + y\n",
"print(w.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"source": [
"```python\n",
"x = np.zeros((3, 5))\n",
"y = np.zeros(8)\n",
"z = x[..., np.newaxis]\n",
"```\n",
"\n",
"```\n",
"X shape = (3, 5, 1)\n",
"Y shape = ( 8)\n",
"Z shape = (3, 5, 8)\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_helper": "subslide_end"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "-"
}
},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"## Broadcasting rules\n",
"\n",
"The broadcasting rules are straightforward: compare dimensions, starting from\n",
"the last. Match when either dimension is one or ``None``, or if dimensions are\n",
"equal:"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_helper": "subslide_end"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "-"
}
},
"source": [
"```\n",
"Scalar 2D 3D Bad\n",
"\n",
"( ,) (3, 4) (3, 5, 1) (3, 5, 2)\n",
"(3,) (3, 1) ( 8) ( 8)\n",
"---- ------ --------- ---------\n",
"(3,) (3, 4) (3, 5, 8) XXX\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 204,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"(3, 7, 5, 4)"
]
},
"execution_count": 204,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"a = np.random.normal(size=(3,1,5,1))\n",
"b = np.random.normal(size=(7,1,4))\n",
"\n",
"c=a+b\n",
"c.shape"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"## Explicit broadcasting"
]
},
{
"cell_type": "code",
"execution_count": 205,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"(3, 5, 8)"
]
},
"execution_count": 205,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x = np.zeros((3, 5, 1))\n",
"y = np.zeros((3, 5, 8))\n",
"\n",
"xx, yy = np.broadcast_arrays(x, y)\n",
"\n",
"xx.shape"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_helper": "subslide_end"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "-"
}
},
"source": [
"```python\n",
"np.broadcast_arrays([1, 2, 3],\n",
" [[1], [2], [3]])\n",
"```\n",
"\n",
"```\n",
"[[1 2 3] [[1 1 1]\n",
" [1 2 3] [2 2 2]\n",
" [1 2 3]] [3 3 3]]\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Some useful array manipulations \n"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[False False False False]\n",
" [False False True False]\n",
" [False False False False]]\n",
"[[False False False False]\n",
" [False False True True]\n",
" [False False False False]]\n"
]
},
{
"data": {
"text/plain": [
"array([6, 7])"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# meshgrid ? \n",
"# np.where\n",
"# boolean operation\n",
"\n",
"a = np.arange(12).reshape(3,4)\n",
"b = (a == 6) \n",
"print(b)\n",
"\n",
"b = np.logical_or(a == 6, a == 7)\n",
"print(b)\n",
"a[b]"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[ 0 1 2 3]\n",
" [ 4 5 6 7]\n",
" [ 8 9 10 11]]\n",
"(array([1, 1, 1, 1, 2, 2, 2, 2]), array([0, 1, 2, 3, 0, 1, 2, 3]))\n",
"[ 4 5 6 7 8 9 10 11]\n",
"[[False False False False]\n",
" [ True True True True]\n",
" [ True True True True]]\n"
]
}
],
"source": [
"print(a)\n",
"print(np.where(a>3))\n",
"print(a[np.where(a>3)])\n",
"print(a > 3)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Axis operation"
]
},
{
"cell_type": "code",
"execution_count": 219,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# reshape\n",
"# flatten / ravel\n",
"# transpose\n",
"# axis rotation\n"
]
},
{
"cell_type": "code",
"execution_count": 229,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(1, 3, 2, 4)\n",
"(3, 1, 2, 4)\n",
"(1, 2, 3, 4)\n"
]
}
],
"source": [
"a = np.arange(24).reshape(1,2,3,4)\n",
"a.T.shape\n",
"b = np.rollaxis(a, 2, 1)\n",
"print(b.shape)\n",
"b = np.rollaxis(a, 2, 0)\n",
"print(b.shape)\n",
"b = np.rollaxis(a, 2, 3)\n",
"print(b.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## hstack - vstack"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(4, 10)\n",
"(10, 4)\n",
"(4, 10)\n",
"(10, 4)\n"
]
}
],
"source": [
"a1 = np.ones((1,10))\n",
"a3 = np.ones((3,10))\n",
"\n",
"# vstack\n",
"a4 = np.vstack((a1,a3))\n",
"print(a4.shape)\n",
"\n",
"# hstack\n",
"a10 = np.hstack((a1.T, a3.T))\n",
"print(a10.shape)\n",
"\n",
"print(np.concatenate((a1, a3), axis=0).shape)\n",
"print(np.concatenate((a1.T, a3.T), axis=1).shape)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## dot product: at the basis of so many things :)"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[ 1.38102561 0.27409858 1.26081074 0.2600873 ]\n",
" [ 0.86771584 0.85718828 0.21770531 0.94686422]\n",
" [ 5.09592854 0.25703762 2.07068062 0.36530305]\n",
" [ 1.4798213 0.37511234 7.01844686 2.6445728 ]]\n",
"True\n",
"0.0\n"
]
}
],
"source": [
"I = np.eye(4)\n",
"a = np.random.lognormal(size=(4,4))\n",
"print(a)\n",
"b = a.dot(I)\n",
"print(np.allclose(b, a))\n",
"print(((a-b)**2).sum())"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 1.38102561, 0.27409858, 1.26081074, 0.2600873 ],\n",
" [ 0.86771584, 0.85718828, 0.21770531, 0.94686422],\n",
" [ 5.09592854, 0.25703762, 2.07068062, 0.36530305],\n",
" [ 1.4798213 , 0.37511234, 7.01844686, 2.6445728 ]])"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"b@I"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Fancy indexing\n",
"\n",
"An ndarray can be indexed in two ways:\n",
"\n",
" - Using slices and scalars\n",
" - Using ndarrays (\"fancy indexing\")\n",
"\n",
"For example:"
]
},
{
"cell_type": "code",
"execution_count": 184,
"metadata": {
"collapsed": false,
"internals": {
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"array([[0, 1, 2],\n",
" [3, 4, 5],\n",
" [6, 7, 8]])"
]
},
"execution_count": 184,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x = np.arange(9).reshape((3, 3))\n",
"x"
]
},
{
"cell_type": "code",
"execution_count": 185,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"array([[1, 1, 2],\n",
" [4, 4, 5],\n",
" [7, 7, 8]])"
]
},
"execution_count": 185,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x[:, [1, 1, 2]]"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"source": [
"Which is equivalent to:"
]
},
{
"cell_type": "code",
"execution_count": 186,
"metadata": {
"collapsed": false,
"internals": {
"slide_helper": "subslide_end"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"array([[1, 1, 2],\n",
" [4, 4, 5],\n",
" [7, 7, 8]])"
]
},
"execution_count": 186,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.array((x[:, 1], x[:, 1], x[:, 2])).T"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Output shape of an indexing op\n",
"\n",
"1. Broadcast all index arrays against one another.\n",
"2. Use the dimensions of slices as-is."
]
},
{
"cell_type": "code",
"execution_count": 188,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[0 1 2]\n",
" [3 4 5]\n",
" [6 7 8]]\n"
]
}
],
"source": [
"x = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])\n",
"print(x)"
]
},
{
"cell_type": "code",
"execution_count": 189,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"(3, 3)"
]
},
"execution_count": 189,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x.shape"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"source": [
"But what would happen if we do"
]
},
{
"cell_type": "code",
"execution_count": 191,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"array([[0, 4],\n",
" [3, 7]])"
]
},
"execution_count": 191,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"idx0 = np.array([[0, 1],\n",
" [1, 2]]) # row indices\n",
"\n",
"idx1 = np.array([[0, 1]]) # column indices\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_helper": "subslide_end"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "-"
}
},
"source": [
"```python\n",
"x[idx0, idx1]\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 192,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([[0, 4],\n",
" [3, 7]])"
]
},
"execution_count": 192,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x[idx0, idx1]\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Output shape of indexing op (cont'd)"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"((2, 2), (1, 2))"
]
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"idx0.shape, idx1.shape"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"array([[0, 1],\n",
" [1, 2]])"
]
},
"execution_count": 45,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"a, b = np.broadcast_arrays(idx0, idx1)\n",
"a"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"array([[0, 1],\n",
" [0, 1]])"
]
},
"execution_count": 46,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"b"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"array([[0, 1, 2],\n",
" [3, 4, 5],\n",
" [6, 7, 8]])"
]
},
"execution_count": 47,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {
"collapsed": false,
"internals": {
"slide_helper": "subslide_end"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"array([[0, 4],\n",
" [3, 7]])"
]
},
"execution_count": 48,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x[idx0, idx1]\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"## Output shape of an indexing op (cont'd)"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [],
"source": [
"x = np.random.random((15, 12, 16, 3))\n",
"\n",
"index_one = np.array([[0, 1], [2, 3], [4, 5]])\n",
"index_two = np.array([[0, 1]])"
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(3, 2)\n",
"(1, 2)\n"
]
}
],
"source": [
"print(index_one.shape)\n",
"print(index_two.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"source": [
"Predict the output shape of"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"source": [
"```python\n",
"x[5:10, index_one, :, index_two]\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_helper": "subslide_end"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "-"
}
},
"source": [
"Warning! When mixing slicing and fancy indexing, the\n",
"*order* of the output dimensions is less easy to predict.\n",
"Play it safe and **don't mix the two!**"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"## Output shape of an indexing op (cont'd)"
]
},
{
"cell_type": "code",
"execution_count": 193,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [],
"source": [
"x = np.random.random((15, 12, 16, 3))\n",
"index_one = np.array([[0, 1], [2, 3], [4, 5]])\n",
"index_two = np.array([[0, 1]])"
]
},
{
"cell_type": "code",
"execution_count": 194,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(3, 2)\n",
"(1, 2)\n"
]
}
],
"source": [
"print(index_one.shape)\n",
"print(index_two.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"source": [
"Broadcast ``index1`` against ``index2``:"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"source": [
"```\n",
"(3, 2) # shape of index1\n",
"(1, 2) # shape of index2\n",
"------\n",
"(3, 2)\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_helper": "subslide_end"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "-"
}
},
"source": [
"The shape of ``x[5:10, index_one, :, index_two]`` is\n",
"\n",
"``(3, 2, 5, 16)``"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"## Jack's dilemma\n",
"\n",
"Indexing and broadcasting are intertwined, as we’ll see in the following\n",
"example. One of my favourites from the NumPy mailing list:"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_helper": "subslide_end"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "-"
}
},
"source": [
"```email\n",
"Date: Wed, 16 Jul 2008 16:45:37 -0500\n",
"From: Jack Cook\n",
"To: \n",
"Subject: Numpy Advanced Indexing Question\n",
"```\n",
"\n",
"Greetings,\n",
"\n",
"I have an I,J,K 3D volume of amplitude values at regularly sampled\n",
"time intervals. I have an I,J 2D slice which contains a time (K)\n",
"value at each I, J location. What I would like to do is extract a\n",
"subvolume at a constant +/- K window around the slice. Is there an\n",
"easy way to do this using advanced indexing or some other method?\n",
"Thanks in advanced for your help.\n",
"\n",
"-- Jack"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_helper": "subslide_end",
"slide_type": "subslide"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "-"
}
},
"source": [
""
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Test setup"
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [],
"source": [
"ni, nj, nk = (10, 15, 20) # dimensions"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"source": [
"Make a fake data block such that ``block[i, j, k] == k`` for all i, j, k."
]
},
{
"cell_type": "code",
"execution_count": 56,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [],
"source": [
"block = np.empty((ni, nj, nk) , dtype=int)\n",
"block[:] = np.arange(nk)[np.newaxis, np.newaxis, :]"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"source": [
"Pick out a random fake horizon in ``k``:"
]
},
{
"cell_type": "code",
"execution_count": 57,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 5, 14, 11, 13, 8, 12, 11, 8, 13, 8, 7, 10, 12, 8, 10],\n",
" [ 7, 5, 6, 8, 9, 10, 10, 14, 5, 14, 14, 12, 5, 8, 9],\n",
" [ 7, 14, 8, 14, 6, 6, 13, 9, 6, 5, 6, 13, 13, 11, 7],\n",
" [14, 9, 6, 12, 14, 14, 8, 10, 14, 10, 13, 12, 12, 11, 9],\n",
" [12, 11, 14, 14, 13, 5, 7, 8, 8, 14, 14, 5, 7, 8, 12],\n",
" [ 5, 7, 10, 11, 12, 12, 8, 9, 14, 13, 8, 6, 13, 8, 6],\n",
" [ 8, 6, 12, 14, 5, 12, 5, 13, 14, 13, 10, 14, 9, 7, 6],\n",
" [10, 6, 14, 14, 11, 9, 11, 8, 11, 5, 13, 13, 12, 14, 13],\n",
" [ 5, 11, 9, 5, 6, 8, 7, 11, 10, 9, 14, 5, 13, 9, 6],\n",
" [13, 12, 6, 12, 8, 13, 5, 11, 14, 9, 13, 13, 12, 10, 12]])"
]
},
"execution_count": 57,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"k = np.random.randint(5, 15, size=(ni, nj))\n",
"k"
]
},
{
"cell_type": "code",
"execution_count": 58,
"metadata": {
"collapsed": false,
"internals": {
"slide_helper": "subslide_end"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "-"
}
},
"outputs": [],
"source": [
"half_width = 3"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Solution\n",
"\n",
"These two indices assure that we take a slice at each (i, j) position"
]
},
{
"cell_type": "code",
"execution_count": 59,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [],
"source": [
"idx_i = np.arange(ni)[:, np.newaxis, np.newaxis]\n",
"idx_j = np.arange(nj)[np.newaxis, :, np.newaxis]"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"source": [
"This is the substantitive part that picks out the window:"
]
},
{
"cell_type": "code",
"execution_count": 60,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [],
"source": [
"idx_k = k[:, :, np.newaxis] + np.arange(-half_width, half_width + 1)\n",
"slices = block[idx_i, idx_j, idx_k] # Slice"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"source": [
"Apply the broadcasting rules:"
]
},
{
"cell_type": "code",
"execution_count": 64,
"metadata": {
"collapsed": false,
"internals": {
"slide_helper": "subslide_end"
},
"slide_helper": "subslide_end",
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"(10, 15, 7)"
]
},
"execution_count": 64,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(ni, 1, 1 ) # idx_i\n",
"(1, nj, 1 ) # idx_j\n",
"(ni, nj, 2 * half_width + 1) # idx_k\n",
"# ---\n",
"(ni, nj, 7) # <-- this is what we wanted!"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Solution verification"
]
},
{
"cell_type": "code",
"execution_count": 65,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"(10, 15, 7)"
]
},
"execution_count": 65,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"slices = block[idx_i, idx_j, idx_k]\n",
"slices.shape\n",
"(10, 15, 7)"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"source": [
"Now verify that our window is centered on ``k`` everywhere:"
]
},
{
"cell_type": "code",
"execution_count": 66,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 5, 14, 11, 13, 8, 12, 11, 8, 13, 8, 7, 10, 12, 8, 10],\n",
" [ 7, 5, 6, 8, 9, 10, 10, 14, 5, 14, 14, 12, 5, 8, 9],\n",
" [ 7, 14, 8, 14, 6, 6, 13, 9, 6, 5, 6, 13, 13, 11, 7],\n",
" [14, 9, 6, 12, 14, 14, 8, 10, 14, 10, 13, 12, 12, 11, 9],\n",
" [12, 11, 14, 14, 13, 5, 7, 8, 8, 14, 14, 5, 7, 8, 12],\n",
" [ 5, 7, 10, 11, 12, 12, 8, 9, 14, 13, 8, 6, 13, 8, 6],\n",
" [ 8, 6, 12, 14, 5, 12, 5, 13, 14, 13, 10, 14, 9, 7, 6],\n",
" [10, 6, 14, 14, 11, 9, 11, 8, 11, 5, 13, 13, 12, 14, 13],\n",
" [ 5, 11, 9, 5, 6, 8, 7, 11, 10, 9, 14, 5, 13, 9, 6],\n",
" [13, 12, 6, 12, 8, 13, 5, 11, 14, 9, 13, 13, 12, 10, 12]])"
]
},
"execution_count": 66,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"slices[:, :, 3]"
]
},
{
"cell_type": "code",
"execution_count": 67,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 67,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.all(slices[:, :, 3] == k)"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {},
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## The \\_\\_array\\_interface\\_\\_\n",
"\n",
"Any object that exposes a suitable dictionary named\n",
"``__array_interface__`` may be converted to a NumPy array. This is\n",
"handy for exchanging data with external libraries. The array interface\n",
"has the following important keys (see\n",
"http://docs.scipy.org/doc/numpy/reference/arrays.interface):\n",
"\n",
" - **shape**\n",
" - **typestr**\n",
" - **data**: (20495857, True); 2-tuple—pointer to data and boolean to\n",
"indicate whether memory is read-only\n",
" - **strides**\n",
" - **version**: 3"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {},
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"## Example: Repeating sequence"
]
},
{
"cell_type": "code",
"execution_count": 71,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(4,) (8,)\n"
]
}
],
"source": [
"data = np.array([1, 2, 3, 4])\n",
"print(data.shape, data.strides)"
]
},
{
"cell_type": "code",
"execution_count": 72,
"metadata": {
"collapsed": false,
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"outputs": [],
"source": [
"N = 400000\n",
"repeat = np.lib.stride_tricks.as_strided(data, shape=(N, 4), strides=(0, 8))"
]
},
{
"cell_type": "code",
"execution_count": 73,
"metadata": {
"collapsed": false,
"internals": {
"slide_helper": "subslide_end"
},
"slide_helper": "slide_end",
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/plain": [
"array([[1, 2, 3, 4],\n",
" [1, 2, 3, 4],\n",
" [1, 2, 3, 4],\n",
" ..., \n",
" [1, 2, 3, 4],\n",
" [1, 2, 3, 4],\n",
" [1, 2, 3, 4]])"
]
},
"execution_count": 73,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"repeat"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {
"slide_type": "subslide"
},
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Questions, discussion and exercizes"
]
},
{
"cell_type": "markdown",
"metadata": {
"internals": {},
"slideshow": {
"slide_type": "-"
}
},
"source": [
"---\n",
""
]
},
{
"cell_type": "code",
"execution_count": 195,
"metadata": {
"collapsed": false,
"internals": {
"slide_helper": "subslide_end"
},
"slide_helper": "slide_end",
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"data": {
"text/html": [
""
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%reload_ext load_style\n",
"%load_style css/notebook.css"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"celltoolbar": "Slideshow",
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.1+"
}
},
"nbformat": 4,
"nbformat_minor": 0
}