# Lecture 3: Python Core and NumPy

# Contact persons
- If you experience any problems with the course, please turn to
  - the lecturers, the examiner, or
  - the **course representatives**
- The course representatives will hold two meetings about the course

# Lecture 3: Python Core and NumPy
- strings `str`
- printing `print`
- files `open` and `with`
- NumPy
- **Computer Assignment 1**

Reading material:
- [1] Sec. 2.4, 2.5.3 & 7.1 - 7.4
- [2] Sec. 3.1.2, chapter 7
- [4] Data types and array creation from the NumPy user manual

# Strings
- Creating strings `str(...)`, `'...'`, `"..."`, `"""..."""`, `'''...'''`, etc.
- String methods
- String formatting `'...'.format(...)`

## Making strings
- Strings are _immutable_.
- Strings can use both single `'...'` and double quotes `"..."` as delimiters
- This makes it possible to have double `"` or single `'` quotes in a string
- Special characters can also be "escaped" using a backslash, e.g. `'\''`

In [None]:
print('Single "quoted" string.')

In [None]:
print("Double 'quoted' string.")

In [None]:
print('It\'s possible to escape characters.')
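The immutability mentioned above can be checked directly. The following is a minimal sketch (the variable names are chosen only for illustration): assigning to a single character fails, so a "changed" string is always built as a new object.

```python
s = "immutable"

# Item assignment is not allowed on a str and raises a TypeError
try:
    s[0] = "I"
    assignment_worked = True
except TypeError:
    assignment_worked = False

# Instead, build a new string from pieces of the old one
t = "I" + s[1:]

print(assignment_worked)  # False
print(s, t)               # immutable Immutable
```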

## Multi-line strings
Use triple quotes `"""..."""` or `'''...'''` to construct multi-line strings.

In [None]:
print('''This is
a multi-
line string''')

In [None]:
print("""This is
another multi-
line string""")

## Indexing and slicing
Individual characters are accessed by indexing `string[idx]`
- starting from `0` from the left, and 
- starting from `-1` from the right

For slices, `string[start:end:step]`
- the `start` index is *inclusive*, 
- the `end` index is *exclusive*, i.e. the interval is `[start,end)`

In [None]:
e = "Another example string"
print(e[0])

In [None]:
print(e[-3])

In [None]:
print(e[8:15])

In [None]:
print(e[14:7:-1])

In [None]:
print(e[-14:-6])
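A short sketch of why the half-open interval `[start,end)` is convenient, reusing the example string from above: the length of a slice is simply `end - start`, and two adjacent slices join without overlap. The names `start`, `end` and `k` are only illustrative.

```python
e = "Another example string"

start, end = 8, 15
part = e[start:end]              # characters at indices 8, 9, ..., 14
print(part)                      # example
print(len(part) == end - start)  # True

# Because `end` is exclusive, two adjacent slices never overlap
k = 7
print(e[:k] + e[k:] == e)        # True

# A negative step walks backwards; omitting start and end reverses the string
print(e[::-1])                   # gnirts elpmaxe rehtonA
```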

## String methods
There are many string methods available:
- `.startswith(prefix[, start[, end]]) -> bool`
- `.endswith(suffix[, start[, end]]) -> bool`
- `.find(sub[, start[, end]]) -> int`
- `.lower() -> str`
- `.upper() -> str`
- `.replace(old, new[, count]) -> str`
- `.split(sep=None, maxsplit=-1) -> list of strings`
- `.strip([chars]) -> str`

## String methods: Examples

In [None]:
s = 'This is a test string!'
print(s.startswith("This"), s.endswith("string"))

In [None]:
print(s.upper(), s.lower())

In [None]:
print(s.replace("string","list"))

In [None]:
print(s)

*Remember strings are immutable and a **new** string is always returned when "modifying" a string!*
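The remaining methods from the list (`.find()`, `.strip()` and `.split()`) can be tried in the same way. This is a small sketch; the string `padded` is made up for the illustration.

```python
s = 'This is a test string!'

# .find() returns the index of the first match, or -1 if there is none
pos_found = s.find("test")
pos_missing = s.find("list")
print(pos_found, pos_missing)   # 10 -1

# .strip() removes leading/trailing whitespace, or the given characters
padded = "   some text \n"
stripped = padded.strip()
no_bang = s.strip("!")
print(repr(stripped))           # 'some text'
print(no_bang)                  # This is a test string

# .split() without arguments splits on any run of whitespace
words = s.split()
print(words)                    # ['This', 'is', 'a', 'test', 'string!']
```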

## Using `help` (and `dir`)
- To see all methods of an object use `dir()`.
- To get the method documentation use `help()`

In [None]:
print(', '.join(s for s in dir(str()) if '__' not in s))

In [None]:
help(str().split)

## String method `.split()`

When parsing text, use the `.split()` method to split a string into parts.

For example,
- let's say you have the string `"Blue, Green, Red, Yellow, Black"`
- and want the individual parts (colours) in a list of strings.

In [None]:
text = "Blue, Green, Red, Yellow, Black"
split_text = text.split(sep=', ')
print(split_text)

You can of course use any character to split on. (Also look at `rsplit()`)
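As a sketch of the difference between `.split()` and `.rsplit()`: with `maxsplit` set, `.split()` starts cutting from the left and `.rsplit()` from the right. The path string below is a made-up example.

```python
path = "results/run_01/data.tar.gz"   # hypothetical example string

# Split once from the left: first directory and everything after it
first_dir, rest = path.split("/", maxsplit=1)
print(first_dir, rest)      # results run_01/data.tar.gz

# Split once from the right: everything before the last dot and the extension
stem, extension = path.rsplit(".", maxsplit=1)
print(stem, extension)      # results/run_01/data.tar gz
```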

## String method `.format()`
- Strings can be used to provide a formatted view of some data using the **format()** method

In [None]:
'1: Hello {}, do you like {} tea?'.format( "John", "green")

In [None]:
'2: Hello {0}, do you like {1} tea? I like {1} tea!'.format( "John", "green")

In [None]:
'3: Hello {1}, do you like {0} tea?'.format( "John", "green")

In [None]:
'4: Hello {0[0]}, do you like {0[1][2]} tea?'.format(["John", ["black", "red", "green"]])

In [None]:
'5: Hello {name}, do you like {color} tea?'.format(color="green", name="John")

# Printing with `print`
- By default, `print` writes to the file `sys.stdout`, but the output can be redirected to other files as well.
- Here is the full syntax of the Python 3 `print` function:

In [None]:
help(print)

## `print` examples

In [None]:
print(1, 2, 3, 4, 5, sep=':')
print("First output", end=" | ")
print("Second output")

## Formatted `print`
There are two basic methods for formatting output in Python
- The old `%`-operator syntax from earlier versions of Python *(not recommended)*

In [None]:
print( "My integer is: %d, my float is %4.3f" % (42, 3.141592653589793) )

- or the string formatting `.format()` method in Python 3 **(recommended)**

In [None]:
print( "My integer is: {}, my float is {:4.3f}".format(42, 3.141592653589793) )

## Python 3 format specifier syntax
The format specifier (the part after the **:** inside the braces) has this syntax:

```python
format_spec ::=  [[fill]align][sign][#][0][width][,][.precision][type]
fill        ::=  <any character>
align       ::=  "<" | ">" | "=" | "^"
sign        ::=  "+" | "-" | " "
width       ::=  integer
precision   ::=  integer
type        ::=  "b" | "c" | "d" | "e" | "E" | "f" | "F" | "g" | "G" | "n" | "o" | "s" | "x" | "X" | "%"
```

For full information on the *str.format* formats, see https://docs.python.org/3/library/string.html#format-string-syntax

See also the nice focused guide with examples: https://pyformat.info/
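A few format specifiers in action may make the grammar above easier to read. This is only a small selection, with variable names chosen for the illustration:

```python
value = 3.141592653589793

right = "{:>10.3f}".format(value)     # width 10, right-aligned, 3 decimals
left = "{:<10.3f}|".format(value)     # width 10, left-aligned
center = "{:*^12.2e}".format(value)   # fill with '*', centred, exponent form
signed = "{:+d}".format(42)           # always show the sign
thousands = "{:,}".format(1234567)    # thousands separator
binary = "{:#b}".format(5)            # binary with the 0b prefix
percent = "{:.1%}".format(0.256)      # multiply by 100 and append %

print(repr(right))    # '     3.142'
print(repr(left))     # '3.142     |'
print(center)         # **3.14e+00**
print(signed, thousands, binary, percent)  # +42 1,234,567 0b101 25.6%
```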

## Format-strings in Python >= 3.6
Python 3.6 introduced a shorter means of formatting output:

In [None]:
a = 3
b = 2.5
print(f"a is equal to {a} and b is equal to {b}")

Here the "f" before the initial quote marks it as a _format_ string.
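The braces in a format string may contain expressions and the same format specifiers as `.format()`. A minimal sketch, reusing `a` and `b` from the cell above:

```python
a = 3
b = 2.5

# Expressions are evaluated, and a format specifier can follow after ':'
msg = f"a + b = {a + b}, b with three decimals: {b:.3f}, a padded: {a:04d}"
print(msg)  # a + b = 5.5, b with three decimals: 2.500, a padded: 0003
```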

# Files
- Almost all input and output goes through files
- The screen and keyboard are handled using the *default* files `sys.stdout` and `sys.stdin`
- `sys.stdout` and `sys.stdin` are available automatically (no explicit open needed)

## Files: How to?
- To open a file, use the `open()` function:<br> `file = open(filename, mode)`
* Typical `mode`s are 
  - `"r"` for reading from a file, 
  - `"w"` for writing to a file and 
  - `"a"` to append to an existing file
* After processing a file, it should be closed using the `close()` method
* See `help(open)` for details

In [None]:
help(open)
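The three modes can be compared in a small sketch. The file name is hypothetical and placed in a temporary directory so that nothing in the working directory is overwritten.

```python
import os
import tempfile

# Hypothetical file name inside a fresh temporary directory
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

file = open(path, "w")   # "w" creates the file, or truncates an existing one
file.write("first line\n")
file.close()

file = open(path, "a")   # "a" keeps the content and writes at the end
file.write("second line\n")
file.close()

file = open(path, "r")   # "r" opens for reading (this is the default mode)
content = file.read()
file.close()

print(content, end="")
```

Opening the same file with `"w"` once more would discard both lines, which is why `"a"` is the safer choice for log-like output.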

## File examples

In [None]:
text = """Hear hear!
This string will be saved to file for all eternity.
To read or not to read that is the question."""

with open('test_file.txt', mode='w') as file:
    file.write(text)

In [None]:
with open('test_file.txt', mode='r') as file:
    new_text = file.read()
    
print(new_text)

## Using `with` instead of `file.close()`
- Calling the `file.close()` method is not needed when using the `with` statement
- The file is closed automatically when the `with` block is exited, even if an exception is raised<br>
  (the name `file` still exists afterwards, but it refers to a closed file)

In [None]:
file = open('test_file.txt', mode='r')
print(file.read())
file.close() # I always forget to add this...

In [None]:
with open('test_file.txt', mode='r') as file:
     print(file.read())
# Here the file is automagically closed, nothing to forget! :D
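What the `with` statement does for files can be inspected through the `.closed` attribute. The sketch below uses a hypothetical file in a temporary directory, and the `try`/`finally` version is only a rough equivalent of what `with` does behind the scenes.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")  # hypothetical file

with open(path, "w") as file:
    file.write("some text\n")
    closed_inside = file.closed   # still open inside the block

closed_after = file.closed        # the name still exists, the file is closed
print(closed_inside, closed_after)  # False True

# Roughly the same thing written by hand
file = open(path, "r")
try:
    text = file.read()
finally:
    file.close()                  # runs even if read() raises an exception
print(text, end="")
```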

## More on reading from files
- You can read full lines or streams of characters from a file:
  - `.readline()` - reads a full line (from a text file)
  - `.read(n)` - reads up to $n$ characters from the file (fewer if less than $n$ remain)
- Or you can iterate on the file object itself!

The following two examples do exactly the same thing:

In [None]:
with open('test_file.txt', 'r') as file:
    line = file.readline()
    while line:
        print(line)
        line = file.readline()

In [None]:
with open('test_file.txt', 'r') as file:
    for line in file:
         print(line)

*Note that:* While the input file had no empty lines, there are empty lines in the output.
- This is because each line read from the file ends with a linebreak `\n`, and
- `print` adds another linebreak by default, i.e.
- `print(text)` is equal to `print(text, end='\n')`

To avoid this
- use one of the `strip()` or `rstrip()` string methods
- or set `end=''`

In [None]:
with open('test_file.txt', 'r') as file:
    for line in file:
         print(line.rstrip())

In [None]:
with open('test_file.txt', 'r') as file:
    for line in file:
         print(line, end='')

## A word on `.read()` and `.readlines()`
- The `.read()` method reads the entire file into a single string, while
- the `.readlines()` method reads the entire file into a list of strings
- Both are strongly discouraged for large files
- Large files lead to high memory usage (and slower code)
- It is much better to process files in chunks, e.g. line by line
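Two ways of processing a file in chunks are sketched below: line by line, and in blocks of a fixed number of characters. The file of numbers is generated just for this illustration, and the chunk size of 256 is an arbitrary choice.

```python
import os
import tempfile

# Create a hypothetical test file with the numbers 1..1000, one per line
path = os.path.join(tempfile.mkdtemp(), "numbers.txt")
with open(path, "w") as file:
    for i in range(1, 1001):
        print(i, file=file)

# Line by line: only one line is held in memory at a time
total = 0
with open(path, "r") as file:
    for line in file:
        total += int(line)
print(total)     # 500500

# Fixed-size chunks: .read(n) returns an empty string at the end of the file
n_chars = 0
with open(path, "r") as file:
    chunk = file.read(256)
    while chunk:
        n_chars += len(chunk)
        chunk = file.read(256)
print(n_chars)   # 3893
```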

## Writing to files `.write()`
* To write, simply use the `.write()` method
* Note that newlines have to be inserted explicitly!

In [None]:
with open('test_file_2.txt', mode='w') as file:
    file.write('Hello .write() world!\n')
    file.write('Bye bye!\n')    

In [None]:
with open('test_file_2.txt', mode='r') as file:
    for line in file:
        print(line, end='')

## Writing to files using `print`
- An alternative to `.write()` is `print()`
- `print()` can be redirected from `sys.stdout` to a file

In [None]:
with open('test_file_2.txt', mode='w') as file:
    print('Hello print() world!', file=file)
    print('Bye bye!', file=file)

In [None]:
with open('test_file_2.txt', mode='r') as file:
    print(file.read())

# Python modules

- Libraries are called **packages** in Python.
- [The Python Standard Library](https://docs.python.org/3/library/) is great!
- But what makes Python fantastic are all the external packages
- Python packages are distributed using the [Python Package Index](https://pypi.python.org/pypi)<br>
  (currently listing some 200'000 packages)

- The [Anaconda Python Distribution](https://www.anaconda.com/distribution/) contains many standard "extra" packages, like
  - NumPy, SciPy, Matplotlib, PyQt, etc.

## Package structure

- A Python package is made up of a directory and file hierarchy
- Each directory is also a *module* (*namespace*) in Python
- Consider the NumPy package
  - it has the main module `numpy` and 
  - e.g. the submodule `numpy.linalg`

In [None]:
import numpy as np

In [None]:
print(dir(np))

In [None]:
print(dir(np.linalg))

## Package directory structure
We can see that the submodule structure corresponds to folders by
- printing the module filename `.__file__`, or
- looking at the [NumPy source code](https://github.com/numpy/numpy/tree/master/numpy)

In [None]:
print(np.linalg.__file__)

## High quality packages are well documented
- Usually there are web page versions of the documentation, and
- also the good old `help` function is useful

In [None]:
help(np.zeros)

# NumPy

NumPy is the fundamental package for scientific computing with Python. 

It contains among other things:
- a powerful N-dimensional array object
- sophisticated (broadcasting) functions
- tools for integrating C/C++ and Fortran code
- useful linear algebra, Fourier transform, and random number capabilities

(source https://numpy.org/)

Documentation:
- [The NumPy User Guide](http://docs.scipy.org/doc/numpy/user)
- [The NumPy Reference Guide](http://docs.scipy.org/doc/numpy/reference)

## Why NumPy?
### Python `list`
- Python lacks a good built-in data type for numerics (better than `list`)
- unlike Matlab, the Python `list` type keeps *references* to objects instead of the data itself
- this makes the Python `list` **very flexible**, but it also gives
- poor performance and a large(r) memory footprint for multi-dimensional arrays

### NumPy `ndarray`
- The multi-dimensional array type `ndarray` in NumPy, on the other hand,
- **stores data** directly in its array object
- The data is stored **contiguously in memory**, which further improves performance

Quite a few other external libraries (like SciPy) expect data in NumPy format<br> to give good performance, or to work at all!
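The difference between the two containers can be made visible with a few attributes of the array object. This is a small sketch, assuming NumPy is installed; the example values are arbitrary.

```python
import numpy as np

# A list only stores references, so it can mix any kinds of objects
mixed = [1, 2.5, "three", [4]]
print([type(item).__name__ for item in mixed])  # ['int', 'float', 'str', 'list']

# An ndarray stores the numbers themselves, all with the same dtype
arr = np.array([1.0, 2.5, 3.0, 4.0])
print(arr.dtype)                   # float64
print(arr.itemsize)                # 8 (bytes per element)
print(arr.nbytes)                  # 32 (4 elements * 8 bytes)
print(arr.flags['C_CONTIGUOUS'])   # True
```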

## Python `list`: Element-wise operations
- Element-wise multiplication of two lists is not supported by `list`
- The multiplication has to be done "element by element" in a loop

In [None]:
a = [1, 3, 4]
b = [3, 2, 5]
c = a * b + 1 # Intended: multiply each element and add 1. Does _not_ work, raises TypeError!

In [None]:
c = []
for A, B in zip(a, b):
    C = A * B + 1
    c.append(C)
print(c)
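The loop above can also be written as a list comprehension. The sketch below additionally shows what the `*` and `+` operators actually do for lists, which explains why they cannot be used for arithmetic.

```python
a = [1, 3, 4]
b = [3, 2, 5]

# Same element-by-element computation as the loop, in one expression
c = [x * y + 1 for x, y in zip(a, b)]
print(c)       # [4, 7, 21]

# For lists, * with an integer repeats and + concatenates
print(a * 2)   # [1, 3, 4, 1, 3, 4]
print(a + b)   # [1, 3, 4, 3, 2, 5]
```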

## NumPy `ndarray`: Element-wise operations
- The `ndarray` supports element-wise operations (similar to Matlab)

In [None]:
import numpy as np
a = np.array([1, 3, 4])
b = np.array([3, 2, 5])
print(type(a), type(b))

In [None]:
c = a * b + 1 # Multiply each element in a and b and add 1
print(c)

## On the NumPy `ndarray` type and element `dtypes`
- The `ndarray` is the most prominent NumPy data-type 
- `ndarray` stands for **N**-**D**imensional **Array**
- All elements in the array are of the same type (set by the `dtype` argument)

- Supported `dtypes` include
  - `int8`, `int16`, `int32`, `int64`,<br> (`int == int64 or int32` depending on your platform)
  - `uint8`, `uint16`, `uint32`, `uint64`
  - `float16`, `float32`, `float64`, (`float == float64`)
  - `complex64`, `complex128`, (`complex == complex128`)

- Array creation functions like `np.zeros()`, `np.ones()` and `np.empty()` use `float` by default
- Methods taking one argument use the input type to determine the return type
- In operations with more than one data-type, automatic type promotion is performed
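The rules in the list above can be checked by printing the `dtype` of a few arrays. A minimal sketch, assuming NumPy is installed; the integer type printed depends on the platform, as noted above.

```python
import numpy as np

z = np.zeros(3)                  # creation function: float64 by default
i = np.array([1, 3, 4])          # all integers: int64 or int32
f = np.array([1, 3.5, 4])        # one float makes the whole array float64
print(z.dtype, i.dtype, f.dtype)

# Explicit dtypes, and promotion when they are mixed in an operation
small = np.array([3, 2, 5], dtype=np.int8)
single = np.array([0.5, 1.5, 2.5], dtype=np.float32)
print((small * single).dtype)    # float32
print((i * f).dtype)             # float64
```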

## Type promotion example

In [None]:
import numpy as np
a = np.array([1, 3, 4])
b = np.array([3, 2, 5], dtype=np.int8)
c = a * b + 1
print(c)

In [None]:
print(type(c))
print(type(c[0]))
print(c.dtype)

In the example above, we created a NumPy array from a Python list.

This is one way of creating a NumPy array; more will be shown later.

## The `shape` of an `ndarray`

- the `ndarray` can have any number of dimensions
- each dimension has a fixed size
- the sizes of all dimensions are together called the `shape` of the array
- the `shape` of an array is a `tuple`

In [None]:
a = np.array([
    [1., 3.],
    [2., 7.],
    [.4, 6.],
    ])
a.shape

In [None]:
print(a)
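A few related attributes are worth knowing together with `shape`. The sketch below reuses the array from above and adds a one-dimensional array `v` for comparison (assuming NumPy is installed).

```python
import numpy as np

a = np.array([
    [1., 3.],
    [2., 7.],
    [.4, 6.],
    ])

print(a.shape)   # (3, 2): 3 rows, 2 columns
print(a.ndim)    # 2: the number of dimensions, i.e. len(a.shape)
print(a.size)    # 6: the total number of elements
print(len(a))    # 3: the size of the first dimension
print(a[2, 1])   # 6.0: row index 2, column index 1

# The shape of a one-dimensional array is a tuple with a single entry
v = np.array([1., 2., 3.])
print(v.shape)   # (3,)
```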

## Array creation
- Create new arrays of a given size using a shape-`tuple`
- Useful NumPy functions: `np.zeros()`, `np.ones()`, and `np.empty()`

In [None]:
shape = (2, 4)
a = np.zeros(shape, dtype=int)
b = np.ones(shape, dtype=complex)
c = np.empty(shape, dtype=float)
print(a)
print(b)
print(c)

**Note:** `np.empty` gives uninitialized arrays (fast), but they have an **undefined initial value**.

## Equispaced ranges `np.arange`
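- To create a series of equally spaced numbers<br>
  use `np.arange([start,] stop[, step])` (defaults: `start=0`, `step=1`)
- **Note:** like for slicing, `start` is *inclusive*, `stop` is *exclusive*
- Works both for `int` and `float` (with a `float` step, rounding errors can affect the number of elements)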

In [None]:
a = np.arange(3)
b = np.arange(stop=10, start=24, step=-2)
f = np.arange(1., 3., 0.2)
print("a =", a)
print("b =", b)
print("f =", f)

- When you want to control the **number** of points (rather than the **step**)<br>
  use `np.linspace(start, stop, num=50)`
- **Note:** Here both ends are *inclusive*!

In [None]:
c = np.linspace(-1., 1., num=9)
print("c =", c)
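The two functions can be compared side by side on the same interval. A small sketch, assuming NumPy is installed; the step 0.25 is chosen because it is exactly representable as a float.

```python
import numpy as np

# arange: fixed step, the stop value is excluded
x_arange = np.arange(0.0, 1.0, 0.25)
print(x_arange)     # [0.   0.25 0.5  0.75]

# linspace: fixed number of points, the stop value is included
x_linspace = np.linspace(0.0, 1.0, num=5)
print(x_linspace)   # [0.   0.25 0.5  0.75 1.  ]

# linspace can also exclude the end point
x_open = np.linspace(0.0, 1.0, num=4, endpoint=False)
print(x_open)       # [0.   0.25 0.5  0.75]
```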

# Computer Assignment 1 (CA1)

The best way to learn a new language (computer or other) is of course to use it!

In the Computer Assignments you will have plenty of opportunity to practice. Please try to do as much as possible in your group.

In **CA1** you will be able to show your skills in:
* Basic Python language elements
* Writing your own functions
* Reading files
* Storing and using data in NumPy arrays
* Using Matplotlib to plot 2D data
* Finding and using algorithms in SciPy

Lecture contents
* Lectures 1-3 should be enough to get you started and to finish **Step 1**
* In Lecture 4 we will go through material for **Step 2** and **Step 3**
* Lecture 5 will introduce the SciPy package used in the last part of **Step 3**

## Groups

The computer assignments should be performed in groups of two.
- Find another student and join one of the pre-created groups on the DAT171 Canvas page
- If you cannot find a group partner 
  1. email the examiner,
  2. state your program (M, E, PhD,..), and 
  3. list any schedule clashes with lectures and computer labs
- If you do none of the above, you will be assigned a group on Thursday 30/1

Please, form groups on your own, whenever possible.

Read through the "General instructions for the computer assignments" and<br>the entire CA1 description (all three steps) before starting to code.

Deadline for CA1 is Sunday 9/2, 2020! (actually 07.00h Monday morning)

## Good luck!

# Lecture 3: The End

In [None]:
import antigravity