<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Objects-and-type:-a-quick-recap" data-toc-modified-id="Objects-and-type:-a-quick-recap-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Objects and type: a quick recap</a></span></li><li><span><a href="#Vectors,-Matrices-and-Arrays-using-NumPy" data-toc-modified-id="Vectors,-Matrices-and-Arrays-using-NumPy-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Vectors, Matrices and Arrays using NumPy</a></span></li><li><span><a href="#Libraries-in-Python:-introducing-NumPy-for-working-with-arrays" data-toc-modified-id="Libraries-in-Python:-introducing-NumPy-for-working-with-arrays-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Libraries in Python: introducing NumPy for working with arrays</a></span></li><li><span><a href="#Creating-vectors,-matrices-and-arrays-with-NumPy" data-toc-modified-id="Creating-vectors,-matrices-and-arrays-with-NumPy-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Creating vectors, matrices and arrays with NumPy</a></span></li><li><span><a href="#Further-ways-of-creating-NumPy-arrays" data-toc-modified-id="Further-ways-of-creating-NumPy-arrays-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Further ways of creating NumPy arrays</a></span></li><li><span><a href="#Next-steps" data-toc-modified-id="Next-steps-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Next steps</a></span></li></ul></div>

> All content here is under a Creative Commons Attribution [CC-BY 4.0](https://creativecommons.org/licenses/by/4.0/) and all source code is released under a [BSD-2 clause license](https://en.wikipedia.org/wiki/BSD_licenses). Parts of these materials were inspired by https://github.com/engineersCode/EngComp/ (CC-BY 4.0), L.A. Barba, N.C. Clementi.
>
>Please reuse, remix, revise, and [reshare this content](https://github.com/kgdunn/python-basic-notebooks) in any way, keeping this notice.

# Module 4: Overview 

We cover the following topics here:

1. Recap of objects and types
2. What do we mean by vectors, matrices and arrays
3. Using a Python library: introducing NumPy
4. Creating vectors, matrices and arrays with NumPy
5. Special matrices in NumPy (e.g. identity matrices, random numbers)

## Preparing for this module

* Not much, just a general understanding of scalars, matrices and arrays.

<img src="images/general/128px-Achtung.svg.png" style="width: 100px; float:right "/> 

This session appears lengthy, but it is a recap of very familiar topics. 

Quickly go over what you are comfortable with; we hope to get everyone to the same level of understanding.


## Objects and type: a quick recap

> <b>"Arrays store objects of the same type." </b>

There's a lot in that sentence:
* ***objects***: do you recall what an object is in Python?
* ***type***: do you recall what a type is?

A quick recap might be helpful (refer to [session 1](https://yint.org/pybasic01) for a refresher)
* ***everything in Python is an object***. For example, a numeric value, a string, a list, a tuple. These are all just objects. Objects can be assigned to a variable, and they can also be the inputs for a function.
* **``type(object)``** will tell you which type of object you have. For example ``type(45.2)`` will give ``float`` as a reply.

So now you should understand that an ***array*** is just a collection of these objects. Let's take a look with an example.

Here is a collection of floating point objects:

``[45.2, 91.2, 67.2, -23.78]``

The type of the object is ``float`` (we could have also used ``int`` (integer) objects). The 4 objects are collected in a list, and that list is also an object.

Remember you can always confirm the ***type*** of an ***object*** as follows. Try it:
```python
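# Wrap each call in print(...) so that all four results are shown,
# not only the last one
print(type(45.2))
print(type(42))
print(type('some text'))
print(type([45.2, 91.2, 67.2, -23.78]))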
```

## Vectors, Matrices and Arrays using NumPy

Let's quickly get a few definitions out of the way. We start by collecting some objects together: first singly (scalar), then in a list (vector), then as a 'spreadsheet' (matrix), and finally as an array (3-dimensional, or higher dimensional).


### Scalars

If our collection of (numeric) objects coincidentally is only a single number, we call that a ***scalar***.
```python
scalar_1 = 45.2
scalar_2 = 0
scalar_3 = -12
```

### Vectors

A collection of scalars in a single row, or column, is very much like a `list` in regular Python. This collection we then call a ***vector***.
```python
list_1 = [1, 2, 6, -2, 0]
list_2 = [0, 0, 0, 0, 0, 0, 0, 0]
list_3 = [254.2, 501, 368.4, 697, 476.5, 188.2, 525.6, 451, 514]
```


We say this collection has a single dimension: a single row of numbers, or a single column of numbers. If there coincidentally is 1 number in the collection, we simply call that a scalar. But in theory we can store as many numbers as we like in our vector.

Think, for example, of the impeller speed of a batch reactor, measured every minute for the duration of a batch. This 1-dimensional sequence is called a vector.

### Matrix

If we take several 1-dimensional vectors, but **each one of the same length**, and put them together, side-by-side then we get a ***matrix***. 

```python
matrix_1 = [ [1, 2, 6, -2], [4, 3, 2, 1] ]       # has 2 rows and 4 columns
matrix_2 = [ [0, 0, 0], [0, 0, 0], [0, 0, 0] ]   # has 3 rows and 3 columns
matrix_3 = [ [9, 8, 7, 6], [5, 4, 4, 3] ]        # also has 2 rows and 4 columns
```

You could crudely store, as we showed above, a matrix by using a list of lists, where the main list (the outside list) contains objects which themselves are lists. This is perfectly valid in Python: remember that a list can contain objects of any type, including other lists. But while this "list-of-lists" approach can store your data, it would not be great for calculations.

**Try this:** (the result is completely unintuitive for mathematical operations)
```python
matrix_1 + matrix_3
matrix_3 + 7
```
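
For reference, here is a minimal sketch of what plain Python does with those two statements. As a preview, it also shows the element-by-element result you would get from the NumPy arrays introduced later in this module. The `try`/`except` wrapper and the variable names `joined` and `summed` are only added here so that the whole block runs in one go.

```python
import numpy as np

matrix_1 = [[1, 2, 6, -2], [4, 3, 2, 1]]
matrix_3 = [[9, 8, 7, 6], [5, 4, 4, 3]]

# With lists, `+` joins the two outer lists: you get 4 rows, nothing is added up
joined = matrix_1 + matrix_3
print(joined)

# Adding a number to a list is not defined at all
try:
    matrix_3 + 7
except TypeError as err:
    print('TypeError:', err)

# Preview: NumPy arrays add element-by-element
summed = np.array(matrix_1) + np.array(matrix_3)
print(summed)                   # rows: [10 10 13 4] and [9 7 6 4]
print(np.array(matrix_3) + 7)   # 7 is added to every entry
```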
    
Another point to note is that a vector is simply a matrix, but where one of the dimensions is equal to 1: either 1 row, or 1 column. 

Matrices are widely used in engineering and data analysis. Often each row is an object, or a sample, or an observation. And each column represents some sort of value measured on that object or sample. For example:
<table>
  <tr>
    <th></th>
    <th>Measurement 1</th>
    <th>Measurement 2</th>
    <th>Measurement 3</th>
    <th>Measurement 4</th>
  </tr>
  <tr class="odd">
    <td>Sample 1</td>
    <td>5.5</td>
    <td>0.55</td>
    <td>-23.4</td>
    <td>561522.2</td>
  </tr>
  <tr class="odd">
    <td>Sample 2</td>
    <td>6.7</td>
    <td>0.44</td>
    <td>-22.2</td>
    <td>526616.4</td>
  </tr>
  <tr class="odd">
    <td>Sample 3</td>
    <td>4.9</td>
    <td>0.61</td>
    <td>-38.1</td>
    <td>612515.7</td>
  </tr>
</table>
This matrix would have 3 rows and 4 columns.


### Array

If we take several 2-dimensional matrices, but **each one with the same number of rows and columns**, and put them together, then we get a ***3-dimensional array***. 

A matrix was a list-of-lists. We can go up to a third dimension and make a list-of-lists-of-lists. 
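
A small sketch of such a list-of-lists-of-lists in plain Python is shown below. The name `array_3d` and the numbers in it are made up purely for illustration.

```python
# Two "layers", each one a matrix with 2 rows and 3 columns
array_3d = [
    [[1, 2, 3], [4, 5, 6]],        # layer 0
    [[7, 8, 9], [10, 11, 12]],     # layer 1
]

n_layers = len(array_3d)
n_rows = len(array_3d[0])
n_cols = len(array_3d[0][0])
print(n_layers, n_rows, n_cols)    # 2 2 3

# Index with three positions: layer, then row, then column
print(array_3d[1][0][2])           # 9
```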

Why stop there? We can go to higher and higher dimensions. We use a general name for such a collection of (numeric) objects: an ***array***. 

An array is an *n*-dimensional structure of numbers. You can therefore say:
* a vector is a 1-dimensional data structure
* a matrix is a 2-dimensional data structure
* an array is an *n*-dimensional data structure


<img src="images/numpy/batch-data-layers-into-page-3d-structure.png" style="width: 400px; float: right"/> 

For example, the 3-dimensional array shown here holds data collected in a lab. We perform the experiment several times (``N`` times): each experiment is one layer, each layer is a matrix, and the layers are stacked on top of each other. 

In each experiment we collect a matrix of data from several sensors. There are ``K`` sensors. We set the sensors to collect data on a regular interval, once every 3 seconds, for example, so that we end up with exactly the same number of samples per sensor, ``J`` values per sensor.


Storing the data like this is useful, because now you could perform calculations on all experiments over all time, for all sensors in array **X**.

*For example:* you can calculate the average in the direction of arrow ``J``, to reduce the *array* to a *matrix*. That matrix would be the average value of the sensor for the experiments. That reduced matrix would have ``N`` rows and ``K`` columns.
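
As a preview, that averaging step might look like the sketch below in NumPy (the library is introduced in the next section). The sizes, the random data, and the axis order `X[experiment, time sample, sensor]` are assumptions made for this illustration only; your own data might be arranged differently.

```python
import numpy as np

N, J, K = 4, 10, 3   # experiments, time samples, sensors (made-up sizes)

# Assumed layout: X[experiment, time sample, sensor], filled with random values
X = np.random.random(size=(N, J, K))

# Average over the time direction (axis 1 is the J direction in this layout)
X_mean = X.mean(axis=1)
print(X_mean.shape)   # (4, 3): N rows and K columns
```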

Engineering applications benefit from using *vectors*, or *matrices* or *arrays*: they are sequences of data all of the _same type_. Arrays behave a lot like lists in Python, except for the constraint that all elements have the same type. 


## Libraries in Python: introducing NumPy for working with arrays

There is an important Python library in science and engineering, called **NumPy**, 
that provides support for _n-dimensional array_ data structures (a.k.a. `ndarray`).

Later on we will learn about the library called ``pandas`` (Python Data Analysis Library), which is better suited than NumPy for many situations. But underneath each pandas dataframe (we will define that term later), exists a NumPy array. So understanding NumPy is key to understanding pandas. Learning NumPy is also an easy step for people coming from MATLAB.

Let us import NumPy and get started.

### Importing libraries

First, a word on importing libraries to expand your running Python session. Because libraries are large collections of code and are for special purposes, they are not loaded automatically when you launch Python (or IPython, or Jupyter). You have to import a library using the `import` command. For example, to import **NumPy**, you can enter:

```python
import numpy
```

Once you execute that command, you can call any NumPy function using the dot notation, prepending the library name. For example, some commonly used functions are:

* [`numpy.sqrt()`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.sqrt.html)
* [`numpy.ones()`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ones.html#numpy.ones)
* [`numpy.zeros()`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.zeros.html#numpy.zeros)
* [`numpy.copy()`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.copy.html#numpy.copy)
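
A short sketch of the dot notation with those four functions follows. The variable names `original` and `duplicate` are only for illustration.

```python
import numpy

print(numpy.sqrt(16))        # 4.0
print(numpy.ones(3))         # [1. 1. 1.]
print(numpy.zeros(2))        # [0. 0.]

original = numpy.ones(3)
duplicate = numpy.copy(original)   # an independent copy of the array
duplicate[0] = 5
print(original)              # [1. 1. 1.]  (unchanged)
print(duplicate)             # [5. 1. 1.]
```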


Part of the community effort that goes into creating Python libraries is an effort to maintain excellent documentation. 

### To try:
>Click and read one of those links to explore the documentation - the pages each have the same layout, so once you know where to look, you can quickly search and refer to the documentation for other functions.

> Also try: ``dir(numpy)``. Do you remember what the ``dir(...)`` function does?

The ``dir(...)`` function applies to any ***object*** in Python, and ``numpy`` here, once imported, is also an object.

> What ***type*** is ``numpy`` ?

### Importing libraries as an alias

You will find _a lot_ of source code that uses a different syntax for importing. Most often you will see:
>```python
>import numpy as np
>```

All this does is create an alias for `numpy` with the shorter string `np`, so you then would call a **NumPy** function like this: `np.linspace()` instead of the lengthier ``numpy.linspace()``.

 This is just an alternative way of doing it. It is arguably better that you are explicit (using the full ``numpy.``), but practicality, code reuse, and screen real-estate often dictate that people write it simply as ``np``. Both are fine.

```python
import numpy
import numpy as np    # the same library, now also available under the shorter name `np`
```

### Creating your first array .... well vector, to be specific

To create a NumPy array from an existing Python ``list`` of numbers, we use **`numpy.array()`**, like this:

```python
my_list = [3, 4, 7, -2, 11]
np.array(my_list)

# or more compactly, without the intermediate variable:
np.array([3, 4, 7, -2, 11])
```

Try it yourself:

>Create an array of 11 numbers below, some negative, some positive, some integers, some floating point

```python
# Create a vector of 11 numbers
import numpy as np
eleven = np.array([ ... ])
print(eleven)
print(len(eleven))  # verify the length
```

Python allows you to create lists of mixed types, for example, strings, floating point, integers, etc.
What happens if you try creating a NumPy array from a mixed list of object types?

***What happens?***

In this list there are 3 objects, of 3 different types. Try running the code below to verify:
```python
my_list = ['abc', 123, 456.7] 
np.array(my_list)
```
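
One way to investigate the result is to look at the array's ``.dtype`` attribute, which reports the single type shared by all the elements. The sketch below assumes NumPy's usual conversion rules; the exact string width that is reported may differ on your system.

```python
import numpy as np

mixed = np.array(['abc', 123, 456.7])
print(mixed)           # every entry has been converted to a string
print(mixed.dtype)     # a Unicode string type, for example <U32

numbers = np.array([123, 456.7])
print(numbers.dtype)   # float64: the integer was converted to a float
```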

## Creating vectors, matrices and arrays with NumPy

NumPy offers many [ways to create arrays](https://docs.scipy.org/doc/numpy/reference/routines.array-creation.html#routines-array-creation). Also read [this overview](https://docs.scipy.org/doc/numpy/user/basics.creation.html#arrays-creation).


###  Creating your first vector with NumPy
> 1. Scroll through the first link above to see just how many ways there are.
> 2. One of the simplest vectors we can create is a vector of just ones (1's). Try the `numpy.ones()` command below. We must tell NumPy how many array elements we would like. 
                                        
```python
# To try: change the '5' to some other integer number
import numpy as np
np.ones(shape=5)    # Using the explicit function call
np.ones(5)          # often we use this shortcut instead
```

There is also a command to create a vector of zeros:
```python
np.zeros(shape=3)
np.zeros(3)
```

> Here you see that Python functions can be called by specifying the function input name: in this example the single input ``shape`` is specified in ``np.zeros(shape=...)``.
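
A quick check that the two calling styles give the same result; the names `by_name` and `by_position` are only used for this illustration:

```python
import numpy as np

by_name = np.zeros(shape=3)      # input given by name
by_position = np.zeros(3)        # same input, given by position
print(np.array_equal(by_name, by_position))   # True
```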

### Creating your first matrix (a two-dimensional array)

For this we use the ``.ones()`` or ``.zeros()`` command, but we specify the ``shape`` argument differently. Instead of an integer, we provide a tuple.

```python
twoD = np.ones(shape=(5,7))
print(twoD)

# Verify that the shape is what you expect:
print(twoD.shape)
print('------------')

naughts = np.zeros((5,7))
print(naughts)
print(type(naughts))      # you have now created an object with type `numpy.ndarray`
```

Every NumPy array can be queried using the ``.shape`` attribute. That means, add ``.shape`` to the array, and you will ask Python to return the attribute of that array called ``shape``.

### Creating your first multi-dimensional array

Why stop at two-dimensions? Create a 3-dimensional array with 2 rows, 3 columns and 4 layers: in other words a $2 \times 3 \times 4$ array.

Just adjust the `tuple` provided to the `shape` argument:
```python
threeD = np.zeros(shape=(2,3,4))
print(threeD)
print(threeD.shape)
```

> Is this what you expected to see? You might have to imagine the 3rd dimension going in and out of the screen.
> 
> 1. Try to create a matrix with 4 rows and 5 columns, where every value in the matrix is the number 8. Do this by making a matrix of only ``.ones()`` and multiplying that matrix by the value of 8.
>
> 2. Now do the same thing, using the ``np.full`` command. If you need help, please see the [NumPy documentation for the ``np.full`` command](https://docs.scipy.org/doc/numpy/reference/generated/numpy.full.html#numpy.full).

```python

# Step 1:
eights = np.ones( ___ ) * ___
print(eights)

# Step 2:
eight_again = np.full(shape=___, fill_value=___)
print(eight_again)
```

### Summary so far

You have created vectors, matrices and arrays. These have a specific ``.shape`` attribute that you can check.

There are several attributes of interest, but one that you will find useful is ``.ndim`` (the number of dimensions). Try it on one of your prior arrays.

These objects are of the type ``numpy.ndarray``: an n-dimensional array.
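
A compact sketch that checks the type, ``.ndim`` and ``.shape`` for one example of each kind. The names `vector`, `matrix` and `cube` are chosen only for this illustration.

```python
import numpy as np

vector = np.ones(5)
matrix = np.ones((5, 7))
cube = np.zeros((2, 3, 4))

for item in (vector, matrix, cube):
    print(type(item).__name__, item.ndim, item.shape)
# ndarray 1 (5,)
# ndarray 2 (5, 7)
# ndarray 3 (2, 3, 4)
```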

## Further ways of creating NumPy arrays

In this section we will look at creating arrays, particularly matrices, in an efficient manner. 

1. Identity matrices: what if you need an [identity matrix](https://en.wikipedia.org/wiki/Identity_matrix) (a matrix with 1's on the diagonal)?
2. Random matrices: arrays filled with random numbers
3. Vector sequences: say you need a vector where the entries are ``[0, 1, 2, 3, 4, ..., 9]``
4. Matrix from a vector: take a vector (of say 12 entries) and [reshape](https://docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html) it into an array (of 3 rows and 4 columns)

Below we will look at each one of these.


### Identity matrices

A square matrix with 1's on the diagonal and zeros everywhere else is known as an identity matrix. For example a $4\times 4$ identity matrix is:  $$I_4 = \begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \end{bmatrix}$$

```python
import numpy as np

# Read the help text for the `identity` function:
help(np.identity)
id5 = np.identity(n=5)
print(id5)
```

A similar function to ``np.identity(...)`` is ``np.eye(...)``. It is a play on words, where ``eye`` refers to the uppercase letter $I$. The $4\times 4$ matrix above is often written as $I_4$ in mathematical notation.

**Try the following, to see what they produce:**

```python
also_id5 = np.eye(5)
print(also_id5)
print('-----')

yet_again = np.eye(5, 5)
print(yet_again)
print('-----')

another_id5 = np.eye(5, 5, 0)  # start the 1's in the 0th position (i.e. row 1 and column 1)
print(another_id5)
print('-----')

# What if we want diagonal ones, but not on the main diagonal,
# but starting in the first row and third column rather?
print(np.eye(5, 5, 2))
```

>After the above, can you explain the difference between ``np.identity()`` and ``np.eye()``?
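
If you would like to check your answer, here is one possible sketch. It uses the ``k`` keyword, which is the name of the third input of ``np.eye`` (the diagonal offset); the variable name `wide` is only for illustration.

```python
import numpy as np

# For a square matrix with no offset, the two functions give the same result
print(np.array_equal(np.identity(4), np.eye(4)))   # True

# np.eye also accepts a number of columns and a diagonal offset `k`,
# so the result does not have to be square
wide = np.eye(3, 5, k=1)
print(wide.shape)   # (3, 5)
print(wide)
```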

### Arrays of random numbers

For simulations it is often helpful to create and use arrays of random values. Each value might be a starting position or state. Or sometimes you just want to test a piece of code, not only with 1's and 0's, but any random values.

For this it is helpful to create arrays of any shape, filled with random values:

```python
import numpy as np

# Random floats between 0 (included) and 1 (not included)
rnd_matrix = np.random.random(size=(4,3))   
print(rnd_matrix)

# Or try a multi-dimensional array
rnd_array = np.random.random(size=(4, 2, 3))
print(rnd_array)
```

Sometimes we want random integers though, between a lower (``low``) and an upper (``high``) bound. The random values can include the ``low`` value, but they only go up to just below the ``high`` value specified.

```python
# Run this code a few times to verify that you get -3, but never a +7
rnd_int = np.random.randint(low=-3, high=7, size=(4, 5))
print(rnd_int)
```
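
Instead of rerunning the cell many times, you could also draw a large number of values at once and inspect the smallest and largest. This is only a sketch; the sample size of 10000 is an arbitrary choice, and because the values are random the minimum and maximum are expected (not guaranteed) to reach the bounds.

```python
import numpy as np

draws = np.random.randint(low=-3, high=7, size=10000)
print(draws.min())        # expected to be -3
print(draws.max())        # expected to be 6, and never 7
print(np.unique(draws))   # the distinct values that were drawn
```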

### Sequences

Vectors containing a sequence, such as ``[0, 1, 2, ... 9]`` or ``[2, 4, 6, 8, ... 12]`` are often used as a starting point for calculations. To create these we use the `numpy.arange()`  and `numpy.linspace()` commands.

*Syntax:*

`numpy.arange(start, stop, step)`

* `start` by default is zero
* `stop` is not inclusive (in other words, NumPy will stop just before this value), and 
* the `step` has a default value of 1.

As mentioned above, Python functions can be called by specifying the input arguments (``start`` and ``stop`` and ``step`` are the argument names).

Try it out below:
```python
import numpy as np
np.arange(4)

# We could have also written, but you will 
# agree that this is unnecessary, as the defaults
# are already good enough. But this is explicit:
np.arange(start=0, stop=4, step=1)

np.arange(start=2, stop=6, step=1)

# Leave `step` unspecified if it is just "1"
np.arange(start=2, stop=6)  

# Most common usage: leave out the argument names
np.arange(2, 6)             

# Jump in steps of 2
np.arange(start=2, stop=9, step=2)  
np.arange(2, 9, 2)
```

> We saw the built-in Python ``range`` function in [an earlier module](https://yint.org/pybasic02). So what is the difference between the NumPy library's ``np.arange`` function and the built-in ``range`` function?
>
> 1. Try replacing ``np.arange(...)`` with ``range`` and see what differences you notice.
> 2. Try using ``np.arange(...)``, but step in increments of 0.5, or 0.33333 instead. Note that you cannot do this with the ``range(...)`` function.
> 3. Create a sequence of values starting at $-4$ and ending just below $+4$, in steps of $1$
> 4. Create a sequence of values starting at $-2$ and ending just below $+2$, in steps of $0.5$. How many elements are in the sequence? Remember the ``len`` function? What about the ``.shape`` attribute?
>5. Start at $+2$ and step ***down*** in increments of $0.25$, until just before $-2$. How many elements are in the sequence?
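
To get you started on the first two points, here is a small sketch that uses different numbers from the exercises. The `try`/`except` wrapper is only there so that the block keeps running after the error.

```python
import numpy as np

print(type(range(4)))        # <class 'range'>
print(type(np.arange(4)))    # <class 'numpy.ndarray'>

# np.arange accepts a fractional step
quarters = np.arange(0, 1, 0.25)
print(quarters)              # [0.   0.25 0.5  0.75]

# The built-in range only accepts integers
try:
    range(0, 1, 0.25)
except TypeError as err:
    print('TypeError:', err)

# Calculations work on the whole array at once
print(np.arange(4) * 2)      # [0 2 4 6]
```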

There is also the `np.linspace()` command, which is similar to `np.arange()`. The differences are:
* you specify the length of your sequence, instead of a step size. 
* the `stop` value ***is included*** by default, but it can be removed.

It returns an array with evenly spaced numbers over the specified interval.  

*Syntax:*

`np.linspace(start, stop, num)`

where the default value of `num=50`. Type `help(np.linspace)` to see how you can either include or exclude the endpoint. 
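
A brief sketch of both behaviours, using ``num=5`` to keep the output short. The ``endpoint`` keyword is the input of ``np.linspace`` that controls whether ``stop`` is included.

```python
import numpy as np

with_end = np.linspace(0, 1, num=5)
print(with_end)       # [0.   0.25 0.5  0.75 1.  ]

without_end = np.linspace(0, 1, num=5, endpoint=False)
print(without_end)    # [0.  0.2 0.4 0.6 0.8]
```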

### To try:

>1. Confirm that you indeed get a sequence of 50 values when you do not specify ``num``. Also confirm that the ``stop`` value is the last value in the vector.
>2. Try to get a vector with fewer elements, say 6, instead of 50.
>3. Go backwards again: create a sequence where the numbers decrease in value.

### Reshaping

Once you have a sequence of numbers in a long vector, you might want to fold them up into a matrix, or a multi-dimensional array.

Use the ``reshape`` function of a NumPy array to do that.

```python
vector = np.arange(12)
matrix = vector.reshape((3, 4))
```

Note the order! NumPy will first fill each row, so the first row will be ``[0, 1, 2, 3]`` and then the next row will be ``[4, 5, 6, 7]``, and so on.

Try it:

```python
vector = np.arange(12)
print('This is a vector with a shape of: ' + str(vector.shape))
matrix = vector.reshape((4, 3))
matrix = vector.reshape((2, 6))
print('This is a matrix with a shape of: ' + str(matrix.shape))
matrix = vector.reshape((4, 4)) # intentional error
``` 
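
For reference, the sketch below shows the row-by-row filling order, and what the intentional error looks like when it is caught. The `try`/`except` wrapper is added here only so that the block runs to the end.

```python
import numpy as np

vector = np.arange(12)
matrix = vector.reshape((3, 4))
print(matrix)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

# 12 values cannot fill a 4 x 4 matrix (16 entries), so NumPy raises an error
try:
    vector.reshape((4, 4))
except ValueError as err:
    print('ValueError:', err)
```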

## Next steps

Above we have created vectors, matrices and arrays in all sorts of formats. With ones, zeros, diagonals, random numbers, and sequences of numbers. 

Next it is time to put these to use, and perform calculations on them. This is in the next module, module 5.

>***Feedback and comments about this worksheet?***
> Please provide any anonymous [comments, feedback and tips](https://docs.google.com/forms/d/1Fpo0q7uGLcM6xcLRyp4qw1mZ0_igSUEnJV6ZGbpG4C4/edit).
