# Numpy Arrays

## Goals

* For beginners, get a sense of how an array can be used.
* For more experienced practitioners, fill in a deeper understanding of how arrays work and perhaps see one or two useful new things.

In [1]:
import numpy as np

In [3]:
a = np.arange(15).reshape(3, 5)
a

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

## Items and slices

In [46]:
a[1, 1]

6

In [29]:
a[0]

array([0, 1, 2, 3, 4])

In [30]:
a[:, 0]

array([ 0, 5, 10])

In [47]:
a[0:2, 0:2]

array([[0, 1],
       [5, 6]])

What does this do?

In [49]:
a[10:1000]

array([], shape=(0, 5), dtype=int64)
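
The result above is empty because a slice is clamped to the bounds of the axis instead of raising an error, while the second axis is left untouched. A minimal sketch contrasting this with an out-of-range integer index, which does raise:

```python
import numpy as np

a = np.arange(15).reshape(3, 5)

# A slice is clamped to the valid range of the axis, so no rows are selected.
empty = a[10:1000]
print(empty.shape)  # (0, 5)

# slice.indices() reports the clamped (start, stop, step) for a given length.
clamped = slice(10, 1000).indices(a.shape[0])
print(clamped)  # (3, 3, 1)

# An out-of-range integer index, by contrast, raises IndexError.
try:
    a[10]
    raised = False
except IndexError as err:
    raised = True
    print(f"IndexError: {err}")
```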

## Arrays with different dimensions can be combined via "broadcasting"

In [34]:
a * 100

array([[   0,  100,  200,  300,  400],
       [ 500,  600,  700,  800,  900],
       [1000, 1100, 1200, 1300, 1400]])

In [35]:
a - a[0]

array([[ 0,  0,  0,  0,  0],
       [ 5,  5,  5,  5,  5],
       [10, 10, 10, 10, 10]])

In [55]:
a - a[:, 0] # nope!

ValueError: operands could not be broadcast together with shapes (3,5) (3,) 

Quoting the broadcasting rules from https://jakevdp.github.io/PythonDataScienceHandbook/02.05-computation-on-arrays-broadcasting.html:

> Rule 1: If the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded with ones on its leading (left) side.

> Rule 2: If the shape of the two arrays does not match in any dimension, the array with shape equal to 1 in that dimension is stretched to match the other shape.

> Rule 3: If in any dimension the sizes disagree and neither is equal to 1, an error is raised.


In [61]:
print(a.shape)
print(a[:, 0].shape)

(3, 5)
(3,)


In [62]:
a[:, 0]

array([ 0, 5, 10])

In [58]:
a[:, 0, np.newaxis]

array([[ 0],
       [ 5],
       [10]])

In [63]:
print(a.shape)
print(a[:, 0, np.newaxis].shape)

(3, 5)
(3, 1)


In [56]:
a - a[:, 0, np.newaxis]

array([[0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4]])
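
The three quoted rules can be checked on shapes alone, without doing any arithmetic. The sketch below assumes NumPy 1.20 or later, where `np.broadcast_shapes` is available, and replays the three cases from the cells above:

```python
import numpy as np

a = np.arange(15).reshape(3, 5)

# (3, 5) with (5,): Rule 1 pads (5,) to (1, 5); Rule 2 stretches it to (3, 5).
row_case = np.broadcast_shapes(a.shape, a[0].shape)
print(row_case)  # (3, 5)

# (3, 5) with (3,): Rule 1 pads (3,) to (1, 3). The last dimension is 5 vs 3
# and neither is 1, so Rule 3 raises.
try:
    np.broadcast_shapes(a.shape, a[:, 0].shape)
    failed = False
except ValueError as err:
    failed = True
    print("ValueError:", err)

# (3, 5) with (3, 1): same number of dimensions; Rule 2 stretches the 1 to 5.
col_case = np.broadcast_shapes(a.shape, a[:, 0, np.newaxis].shape)
print(col_case)  # (3, 5)
```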

Slices can be created on their own and reused.

In [79]:
every2 = np.s_[::2]
every10 = np.s_[::10]
b = np.arange(15)
print(f"{b = }")
print(f"{every2 = }")
print(f"{every10 = }")
print(f"{b[every2] = }")
print(f"{b[every10] = }")

b = array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])
every2 = slice(None, None, 2)
every10 = slice(None, None, 10)
b[every2] = array([ 0, 2, 4, 6, 8, 10, 12, 14])
b[every10] = array([ 0, 10])
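
The same idea extends to more than one dimension. As a small sketch, `np.s_` with comma-separated entries builds a tuple of slices that can be reused on any array with enough dimensions (the names `corner` and `c` are only for illustration):

```python
import numpy as np

a = np.arange(15).reshape(3, 5)

# Several comma-separated entries produce a tuple of slice objects.
corner = np.s_[0:2, 0:2]
print(corner)  # (slice(0, 2, None), slice(0, 2, None))

# Indexing with the tuple is the same as writing a[0:2, 0:2].
print(a[corner])

# The index object is independent of any particular array.
c = a * 100
print(c[corner])
```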


Great reference on slices in Python in general and multi-dimensional slicing in particular: https://quansight-labs.github.io/ndindex/slices.html

## Anatomy of an Array

In [4]:
a.shape

(3, 5)

In [6]:
a.ndim

2

In [7]:
a.size

15

In [9]:
a.nbytes

120

In [10]:
a.dtype

dtype('int64')

In [15]:
a.tolist()

[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13, 14]]
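
These attributes are not independent of one another. A quick sketch of how they relate, with the dtype pinned to `int64` so the numbers match the outputs above on any platform:

```python
import numpy as np

a = np.arange(15, dtype=np.int64).reshape(3, 5)

# ndim is the length of shape.
print(a.ndim == len(a.shape))  # True

# size is the product of the entries in shape.
print(a.size == 3 * 5)  # True

# nbytes is the element count times the bytes per element (8 for int64).
print(a.nbytes == a.size * a.dtype.itemsize)  # True
```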

## Peeking under the hood, just for a moment

An array is a block of memory plus rules for "striding" through it and interpreting its contents.

In [50]:
a.dtype.itemsize

8

In [51]:
a.shape

(3, 5)

In [22]:
a.strides

(40, 8)

In [33]:
print(a[0])
print(a[:, 0])

[0 1 2 3 4]
[ 0 5 10]


In [23]:
a.data



In [28]:
a.data.hex()

'00000000000000000100000000000000020000000000000003000000000000000400000000000000050000000000000006000000000000000700000000000000080000000000000009000000000000000a000000000000000b000000000000000c000000000000000d000000000000000e00000000000000'
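
With strides of `(40, 8)`, element `a[i, j]` sits at byte offset `i * 40 + j * 8` in that buffer. The sketch below reads items straight out of the raw bytes to illustrate this; `item_at` is a hypothetical helper written for this example, not a NumPy function:

```python
import numpy as np

a = np.arange(15, dtype=np.int64).reshape(3, 5)
raw = a.tobytes()  # a copy of the array's bytes in row-major order

def item_at(i, j):
    """Read a[i, j] from the raw bytes using the strides."""
    offset = i * a.strides[0] + j * a.strides[1]
    chunk = raw[offset:offset + a.dtype.itemsize]
    return np.frombuffer(chunk, dtype=a.dtype)[0]

print(item_at(1, 1))  # 6
print(item_at(2, 4))  # 14

# A transpose reuses the same memory; only shape and strides are swapped.
print(a.T.strides)               # (8, 40)
print(np.shares_memory(a, a.T))  # True
```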

## Limitations and Coping Strategies

* No way to label dimensions; you have to keep track of which is which.
 * Pass around an object, like a dict, that acts as a key mapping each dimension name to its axis.
 * Consider using xarray.
 * Resist the temptation to subclass! If you want to go down that general path, look at [Writing custom array containers](https://numpy.org/doc/stable/user/basics.dispatch.html).
* No way to include coordinates ("tick labels")
 * Pass around a simple object, like a dict, containing multiple numpy arrays.
 * Consider using xarray.
* No built-in support for units
 * Use a library like pint.
 * Numpy has added support for custom data types...
   * https://numpy.org/neps/nep-0042-new-dtypes.html
   * https://github.com/numpy/numpy-user-dtypes
 * ...which can be used to implement units!
   * https://github.com/seberg/unitdtype
 * Numpy's unit support is not "mainstream" yet, but it is growing.
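
As a rough sketch of the "pass around a dict" strategies above, using only NumPy: the dimension names (`time`, `channel`) and the coordinate values are made up for illustration.

```python
import numpy as np

a = np.arange(15).reshape(3, 5)

# Hypothetical labels: map each dimension name to its axis number.
dims = {"time": 0, "channel": 1}

# Hypothetical coordinates: one 1-D array of "tick labels" per dimension.
coords = {
    "time": np.array([0.0, 0.5, 1.0]),
    "channel": np.array([10, 20, 30, 40, 50]),
}

# Reduce over a dimension by name instead of remembering the axis number.
time_mean = a.mean(axis=dims["time"])
print(time_mean)  # [5. 6. 7. 8. 9.]

# Select along a dimension by coordinate value rather than by position.
row = np.flatnonzero(coords["time"] == 0.5)[0]
print(a[row])  # [5 6 7 8 9]
```

Nothing keeps these dicts in sync with the array when it is transposed or sliced, which is the bookkeeping that xarray takes over.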