{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "STAT 479: Machine Learning (Fall 2019) \n", "Instructor: Sebastian Raschka (sraschka@wisc.edu) \n", "Course website: http://pages.stat.wisc.edu/~sraschka/teaching/stat479-fs2019/" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# L04: Scientific Computing in Python" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%load_ext watermark" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Sebastian Raschka \n", "\n", "CPython 3.7.3\n", "IPython 7.7.0\n", "\n", "numpy 1.16.4\n", "scipy 1.3.1\n", "matplotlib 3.1.1\n" ] } ], "source": [ "%watermark -v -a 'Sebastian Raschka' -p numpy,scipy,matplotlib" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## NumPy -- Working with Numerical Arrays" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Introduction to NumPy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This section offers a quick tour of the NumPy library for working with multi-dimensional arrays in Python. NumPy (short for Numerical Python) was created by Travis Oliphant in 2005, by merging Numarray into Numeric. Since then, the open source NumPy library has evolved into an essential library for scientific computing in Python and has become a building block of many other scientific libraries, such as SciPy, Scikit-learn, Pandas, and others.\n", "What makes NumPy so particularly attractive to the scientific community is that it provides a convenient Python interface for working with multi-dimensional array data structures efficiently; the NumPy array data structure is also called `ndarray`, which is short for *n*-dimensional array. 
\n", "\n", "In addition to being mostly implemented in C and using Python as a \"glue language,\" the main reason why NumPy is so efficient for numerical computations is that NumPy arrays use contiguous blocks of memory that can be efficiently cached by the CPU. In contrast, Python lists are arrays of pointers to objects in random locations in memory, which cannot be easily cached and come with a more expensive memory lookup. However, the computational efficiency and low memory footprint come at a cost: NumPy arrays have a fixed size and are homogeneous, which means that all elements must have the same type. Homogeneous `ndarray` objects have the advantage that NumPy can carry out operations using efficient C loops and avoid expensive type checks and other overheads of the Python API. While adding and removing elements from the end of a Python list is very efficient, altering the size of a NumPy array is very expensive since it requires creating a new array and carrying over the contents of the old array that we want to expand or shrink. \n", "\n", "Besides being more efficient for numerical computations than native Python code, NumPy can also be more elegant and readable due to vectorized operations and broadcasting, which are features that we will explore in this lecture. This material should be sufficient to follow the code examples in this course." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### N-dimensional Arrays" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "NumPy is built around [`ndarray`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.html) objects, which are high-performance multi-dimensional array data structures. Intuitively, we can think of a one-dimensional NumPy array as a data structure to represent a vector of elements -- you may think of it as a fixed-size Python list where all elements share the same type. 
Similarly, we can think of a two-dimensional array as a data structure to represent a matrix or a Python list of lists. While NumPy arrays can have up to 32 dimensions if it was compiled without alterations to the source code, we will focus on lower-dimensional arrays for the purpose of illustration in this introduction.\n", "\n", "Now, let us get started with NumPy by calling the `array` function to create a two-dimensional NumPy array, consisting of two rows and three columns, from a list of lists:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1, 2, 3],\n", " [4, 5, 6]])" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import numpy as np\n", "\n", "\n", "lst = [[1, 2, 3], [4, 5, 6]]\n", "ary2d = np.array(lst)\n", "ary2d" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![](./images/numpy-intro/array_1.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By default, NumPy infers the type of the array upon construction. Since we passed Python integers to the array, the `ndarray` object `ary2d` should be of type `int64` on a 64-bit machine, which we can confirm by accessing the `dtype` attribute:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "dtype('int64')" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary2d.dtype" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we want to construct NumPy arrays of different types, we can pass an argument to the `dtype` parameter of the `array` function, for example `np.int32` to create 32-bit arrays. For a full list of supported data types, please refer to the official [NumPy documentation](https://docs.scipy.org/doc/numpy/user/basics.types.html). 
Once an array has been constructed, we can downcast or recast its type via the `astype` method as shown in the following example:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1., 2., 3.],\n", " [4., 5., 6.]], dtype=float32)" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "float32_ary = ary2d.astype(np.float32)\n", "float32_ary" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "dtype('float32')" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "float32_ary.dtype" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `itemsize` attribute of the original array, `ary2d.itemsize`, returns `8`, which means that each element in the array (remember that `ndarray`s are homogeneous) takes up 8 bytes in memory. This result makes sense since the array `ary2d` has type `int64` (64-bit integer), which we determined earlier, and 8 bits equal 1 byte. 
(Note that `'int64'` is just a shorthand for `np.int64`.)\n", "\n", "To return the number of elements in an array, we can use the `size` attribute, as shown below:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "6" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary2d.size" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The number of dimensions of our array (intuitively, you may think of *dimensions* as the *rank* of a tensor) can be obtained via the `ndim` attribute:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary2d.ndim" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we are interested in the number of elements along each array dimension (in the context of NumPy arrays, we may also refer to them as *axes*), we can access the `shape` attribute as shown below:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(2, 3)" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary2d.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `shape` is always a tuple; in the code example above, the two-dimensional `ary2d` object has *two* rows and *three* columns, `(2, 3)`, if we think of it as a matrix representation.\n", "\n", "Similarly, the `shape` of a one-dimensional array only contains a single value:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(3,)" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.array([1, 2, 3]).shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Array Construction Routines" ] }, { "cell_type": "markdown", 
"metadata": {}, "source": [ "This section provides a non-comprehensive list of array construction functions. Simple yet useful functions exist to construct arrays containing ones or zeros:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1., 1., 1.],\n", " [1., 1., 1.],\n", " [1., 1., 1.]])" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.ones((3, 3))" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[0., 0., 0.],\n", " [0., 0., 0.],\n", " [0., 0., 0.]])" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.zeros((3, 3))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Creating arrays of ones or zeros can also be useful as placeholder arrays, in cases where we do not want to use the initial values for computations but want to fill it with other values right away. If we do not need the initial values (for instance, `'0.'` or `'1.'`), there is also `numpy.empty`, which follows the same syntax as `numpy.ones` and `np.zeros`. However, instead of filling the array with a particular value, the `empty` function creates the array with non-sensical values from memory. We can think of `zeros` as a function that creates the array via `empty` and then sets all its values to `0.` -- in practice, a difference in speed is not noticeable, though. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "NumPy also comes with functions to create identity matrices and diagonal matrices as `ndarrays` that can be useful in the context of linear algebra -- a topic that we will explore later in this section. 
" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1., 0., 0.],\n", " [0., 1., 0.],\n", " [0., 0., 1.]])" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.eye(3)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[3, 0, 0],\n", " [0, 3, 0],\n", " [0, 0, 3]])" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.diag((3, 3, 3))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Lastly, I want to mention two very useful functions for creating sequences of numbers within a specified range, namely, `arange` and `linspace`. NumPy's `arange` function follows the same syntax as Python's `range` objects: If two arguments are provided, the first argument represents the start value and the second value defines the stop value of a half-open interval:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([4., 5., 6., 7., 8., 9.])" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.arange(4., 10.)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Notice that `arange` also performs type inference similar to the `array` function. If we only provide a single function argument, the range object treats this number as the endpoint of the interval and starts at 0:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0, 1, 2, 3, 4])" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.arange(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Similar to Python's `range`, a third argument can be provided to define the *step* (the default step size is 1). 
For example, we can obtain an array of all odd values between one and ten as follows:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([1., 3., 5., 7., 9.])" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.arange(1., 11., 2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `linspace` function is especially useful if we want to create a particular number of evenly spaced values in a specified interval; in contrast to `arange`, the stop value is included by default:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0. , 0.25, 0.5 , 0.75, 1. ])" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.linspace(0., 1., num=5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Array Indexing" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this section, we will go over the basics of retrieving NumPy array elements via different indexing methods. Simple NumPy indexing and slicing work similarly to indexing and slicing Python lists, which we will demonstrate in the following code snippet, where we retrieve the first element of a one-dimensional array:" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary = np.array([1, 2, 3])\n", "ary[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Also, the same Python semantics apply to slicing operations. 
The following example shows how to fetch the first two elements in `ary`:" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([1, 2])" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary[:2] # equivalent to ary[0:2]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we work with arrays that have more than one dimension or axis, we separate our indexing or slicing operations by commas as shown in the series of examples below:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary = np.array([[1, 2, 3],\n", " [4, 5, 6]])\n", "\n", "ary[0, 0] # upper left" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "6" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary[-1, -1] # lower right" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary[0, 1] # first row, second column" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![](./images/numpy-intro/array_2.png)" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([1, 2, 3])" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary[0] # entire first row" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([1, 4])" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary[:, 0] # entire first column" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { 
"data": { "text/plain": [ "array([[1, 2],\n", " [4, 5]])" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary[:, :2] # first two columns" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary[0, 0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Array Math and Universal Functions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the previous sections, you learned how to create NumPy arrays and how to access different elements in an array. It is about time that we introduce one of the core features of NumPy that makes working with `ndarray` so efficient and convenient: vectorization. While we typically use for-loops if we want to perform arithmetic operations on sequence-like objects, NumPy provides vectorized wrappers for performing element-wise operations implicitly via so-called *ufuncs* -- short for universal functions.\n", "\n", "As of this writing, there are more than 60 ufuncs available in NumPy; ufuncs are implemented in compiled C code and very fast and efficient compared to vanilla Python. 
In this section, we will take a look at the most commonly used ufuncs, and I recommend checking out the [official documentation](https://docs.scipy.org/doc/numpy/reference/ufuncs.html#available-ufuncs) for a complete list.\n", "\n", "To provide an example of a simple ufunc for element-wise addition, consider the following example, where we add a scalar (here: 1) to each element in a nested Python list:" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[[2, 3, 4], [5, 6, 7]]" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "lst = [[1, 2, 3], [4, 5, 6]]\n", "\n", "for row_idx, row_val in enumerate(lst):\n", "    for col_idx, col_val in enumerate(row_val):\n", "        lst[row_idx][col_idx] += 1\n", "lst" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This for-loop approach is very verbose, and we could achieve the same goal more elegantly using list comprehensions:" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[[2, 3, 4], [5, 6, 7]]" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "lst = [[1, 2, 3], [4, 5, 6]]\n", "[[cell + 1 for cell in row] for row in lst]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can accomplish the same using NumPy's ufunc for element-wise scalar addition as shown below:" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[2, 3, 4],\n", " [5, 6, 7]])" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary = np.array([[1, 2, 3], [4, 5, 6]])\n", "ary = np.add(ary, 1)\n", "ary" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The ufuncs for basic arithmetic operations are `add`, `subtract`, `divide`, `multiply`, and `power` (exponentiation). 
However, NumPy uses operator overloading so that we can use mathematical operators (`+`, `-`, `/`, `*`, and `**`) directly:" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[3, 4, 5],\n", " [6, 7, 8]])" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary + 1" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 4, 9, 16],\n", " [25, 36, 49]])" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary**2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Above, we have seen examples of *binary* ufuncs, which are ufuncs that take two arguments as input. In addition, NumPy implements several useful *unary* ufuncs, such as `log` (natural logarithm), `log10` (base-10 logarithm), and `sqrt` (square root).\n", "\n", "Often, we want to compute the sum or product of array elements along a given axis. For this purpose, we can use a ufunc's `reduce` operation. By default, `reduce` applies an operation along the first axis (`axis=0`). In the case of a two-dimensional array, we can think of the first axis as the rows of a matrix. 
Thus, adding up elements along the first axis (that is, across the rows) yields the column sums of that matrix as shown below:" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([5, 7, 9])" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary = np.array([[1, 2, 3], \n", " [4, 5, 6]])\n", "\n", "np.add.reduce(ary) # column sums" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To compute the row sums of the array above, we can specify `axis=1`:" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 6, 15])" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.add.reduce(ary, axis=1) # row sums" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "While it can be more intuitive to use `reduce` as a more general operation, NumPy also provides shorthands for specific operations such as `prod` and `sum`. 
For example, `sum(axis=0)` is equivalent to `add.reduce`:" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([5, 7, 9])" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary.sum(axis=0) # column sums" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![](./images/numpy-intro/ufunc.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As a word of caution, keep in mind that `prod` and `sum` both compute the product or sum of the entire array if we do not specify an axis:" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "21" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary.sum()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Other useful functions for working with arrays (not ufuncs in the strict sense) are:\n", " \n", "- `np.mean` (computes arithmetic average)\n", "- `np.std` (computes the standard deviation)\n", "- `np.var` (computes variance)\n", "- `np.sort` (sorts an array)\n", "- `np.argsort` (returns indices that would sort an array)\n", "- `np.min` (returns the minimum value of an array)\n", "- `np.max` (returns the maximum value of an array)\n", "- `np.argmin` (returns the index of the minimum value)\n", "- `np.argmax` (returns the index of the maximum value)\n", "- `np.array_equal` (checks if two arrays have the same shape and elements)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Broadcasting" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A topic we glossed over in the previous section is broadcasting. Broadcasting allows us to perform vectorized operations between two arrays even if their dimensions do not match by creating implicit multidimensional grids. 
You already learned about ufuncs in the previous section where we performed element-wise addition between a scalar and a multidimensional array, which is just one example of broadcasting. \n", "\n", "\n", "![](./images/numpy-intro/broadcasting-1.png)\n", "\n", "Naturally, we can also perform element-wise operations between arrays of equal dimensions:" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([5, 7, 9])" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary1 = np.array([1, 2, 3])\n", "ary2 = np.array([4, 5, 6])\n", "\n", "ary1 + ary2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In contrast to what we are used to from linear algebra, we can also add arrays of different shapes. In the example below, we add a one-dimensional array to a two-dimensional array, where NumPy creates an implicit multidimensional grid from the one-dimensional array `ary1`:" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 5, 7, 9],\n", " [ 8, 10, 12]])" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary3 = np.array([[4, 5, 6], \n", " [7, 8, 9]])\n", "\n", "ary3 + ary1 # similarly, ary1 + ary3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![](./images/numpy-intro/broadcasting-2.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Advanced Indexing -- Memory Views and Copies" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the previous sections, we have used basic indexing and slicing routines. It is important to note that basic integer-based indexing and slicing create so-called *views* of NumPy arrays in memory. Working with views can be highly desirable since it avoids making unnecessary copies of arrays and thus saves memory resources. 
To illustrate the concept of memory views, let us walk through a simple example where we access the first row in an array, assign it to a variable, and modify that variable:" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[100, 101, 102],\n", " [ 4, 5, 6]])" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary = np.array([[1, 2, 3],\n", " [4, 5, 6]])\n", "\n", "first_row = ary[0]\n", "first_row += 99\n", "ary" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As we can see in the example above, changing the value of `first_row` also affected the original array. The reason for this is that `ary[0]` created a view of the first row in `ary`, and its elements were then incremented by 99. The same concept applies to slicing operations:" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[100, 101, 102],\n", " [ 4, 5, 6]])" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary = np.array([[1, 2, 3],\n", " [4, 5, 6]])\n", "\n", "first_row = ary[:1]\n", "first_row += 99\n", "ary" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 1, 101, 3],\n", " [ 4, 104, 6]])" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary = np.array([[1, 2, 3],\n", " [4, 5, 6]])\n", "\n", "center_col = ary[:, 1]\n", "center_col += 99\n", "ary" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we are working with NumPy arrays, it is always important to be aware that **slicing creates views** -- sometimes this is desirable since it can speed up our code by avoiding unnecessary copies in memory. 
However, in certain scenarios we want to force a copy of an array; we can do this via the `copy` method as shown below:" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1, 2, 3],\n", " [4, 5, 6]])" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary = np.array([[1, 2, 3],\n", " [4, 5, 6]])\n", "\n", "second_row = ary[1].copy()\n", "second_row += 99\n", "ary" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In addition to basic single-integer indexing and slicing operations, NumPy supports advanced indexing routines called *fancy* indexing. Via fancy indexing, we can use tuple or list objects of non-contiguous integer indices to return desired array elements. Since fancy indexing can be performed with non-contiguous sequences, it cannot return a view -- a contiguous slice from memory. Thus, fancy indexing always returns a copy of an array -- it is important to keep that in mind. 
The following code snippets show some fancy indexing examples:" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1, 3],\n", " [4, 6]])" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary = np.array([[1, 2, 3],\n", " [4, 5, 6]])\n", "\n", "ary[:, [0, 2]] # first and last column" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1, 2, 3],\n", " [4, 5, 6]])" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "this_is_a_copy = ary[:, [0, 2]]\n", "this_is_a_copy += 99\n", "ary" ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[3, 1],\n", " [6, 4]])" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary[:, [2, 0]] # last and first column" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we can also use Boolean masks for indexing -- that is, arrays of `True` and `False` values. 
Consider the following example, where we return all values in the array that are greater than 3:" ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[False, False, False],\n", " [ True, True, True]])" ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary = np.array([[1, 2, 3],\n", " [4, 5, 6]])\n", "\n", "greater3_mask = ary > 3\n", "greater3_mask" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using these masks, we can select elements given our desired criteria:" ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([4, 5, 6])" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary[greater3_mask]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also chain different selection criteria using the logical *and* operator '&' or the logical *or* operator '|'. The example below demonstrates how we can select array elements that are greater than 3 and divisible by 2:" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([4, 6])" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary[(ary > 3) & (ary % 2 == 0)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that indexing using Boolean arrays is also considered \"fancy indexing\" and thus returns a copy of the array." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Random Number Generators" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In machine learning and deep learning, we often have to generate arrays of random numbers -- for example, the initial values of our model parameters before optimization. NumPy has a `random` subpackage to create random numbers and samples from a variety of distributions conveniently. 
Again, I encourage you to browse through the [numpy.random documentation](https://docs.scipy.org/doc/numpy/reference/routines.random.html) for a more comprehensive list of functions for random sampling.\n", "\n", "To provide a brief overview of the pseudo-random number generators that we will use most commonly, let's start with drawing a random sample from a uniform distribution:" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0.69646919, 0.28613933, 0.22685145])" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.random.seed(123)\n", "np.random.rand(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the code snippet above, we first seeded NumPy's random number generator. Then, we drew three random samples from a uniform distribution via `random.rand` in the half-open interval [0, 1). I highly recommend the seeding step in practical applications as well as in research projects, since it ensures that our results are reproducible. If we run our code sequentially -- for example, if we execute a Python script -- it should be sufficient to seed the random number generator only once at the beginning to enforce reproducible outcomes between different runs. However, it is often useful to create separate `RandomState` objects for various parts of our code, so that we can test methods or functions reliably in unit tests. Working with multiple, separate `RandomState` objects can also be useful if we run our code in non-sequential order -- for example, if we are experimenting with our code in interactive sessions or Jupyter Notebook environments. 
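
A minimal sketch of why separate `RandomState` objects help: each generator object keeps its own internal state, so draws from the global `np.random` functions in between do not change what it returns. The names `rng_a`, `rng_b`, `sample_a`, and `sample_b` are made up for this illustration and are not used elsewhere in this lecture.

```python
import numpy as np

# Two independent generators seeded identically (illustrative names).
rng_a = np.random.RandomState(seed=123)
rng_b = np.random.RandomState(seed=123)

sample_a = rng_a.rand(3)

# Drawing from the global generator in between leaves rng_b untouched.
np.random.uniform(size=5)

sample_b = rng_b.rand(3)

print(np.array_equal(sample_a, sample_b))  # True
```

Because neither generator depends on the global state, a unit test can construct its own `RandomState` and obtain the same numbers regardless of which cells or tests ran before it.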
\n", "\n", "The example below shows how we can use a `RandomState` object to reproduce the results that we obtained from `np.random.rand(3)` after calling `np.random.seed(123)`:" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0.69646919, 0.28613933, 0.22685145])" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "rng1 = np.random.RandomState(seed=123)\n", "rng1.rand(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Reshaping Arrays" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In practice, we often run into situations where existing arrays do not have the *right* shape to perform certain computations. As you might remember from the beginning of this lecture, the size of NumPy arrays is fixed. Fortunately, this does not mean that we have to create new arrays and copy values from the old array to the new one if we want arrays of different shapes -- the size is fixed, but the shape is not. NumPy provides a `reshape` method that allows us to obtain a view of an array with a different shape (NumPy returns a view whenever the memory layout permits and falls back to a copy otherwise). 
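
As a small sketch of what "view" means here (the variable names `original`, `reshaped`, `shares_buffer`, and `mismatch_rejected` are chosen only for this illustration), writing through the reshaped array should also change the original, and a target shape with a different total number of elements should be rejected:

```python
import numpy as np

original = np.arange(6)            # shape (6,)
reshaped = original.reshape(2, 3)  # same data, new shape

# Writing through the reshaped array also changes the original,
# which is what we expect from a view rather than a copy
reshaped[0, 0] = 99
shares_buffer = np.shares_memory(original, reshaped)
print(original[0], shares_buffer)  # prints 99 True

# The total number of elements has to stay the same: 6 != 4 * 2
try:
    original.reshape(4, 2)
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
print(mismatch_rejected)  # prints True
```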
\n", "\n", "For example, we can reshape a one-dimensional array into a two-dimensional one using `reshape` as follows:" ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1, 2, 3],\n", " [4, 5, 6]])" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary1d = np.array([1, 2, 3, 4, 5, 6])\n", "ary2d_view = ary1d.reshape(2, 3)\n", "ary2d_view" ] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.may_share_memory(ary2d_view, ary1d)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When we specify the number of elements along each axis, we need to make sure that the reshaped array has the same total number of elements as the original one. However, we do not need to specify the number of elements along every axis; NumPy can infer how many elements to put along an axis if exactly one axis is left unspecified (by using the placeholder `-1`):" ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1, 2, 3],\n", " [4, 5, 6]])" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary1d.reshape(2, -1)" ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1, 2],\n", " [3, 4],\n", " [5, 6]])" ] }, "execution_count": 54, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary1d.reshape(-1, 2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can, of course, also use `reshape` to flatten an array:" ] }, { "cell_type": "code", "execution_count": 55, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([1, 2, 3, 4, 5, 6])" ] }, "execution_count": 55, "metadata": {}, "output_type": "execute_result" } 
], "source": [ "ary = np.array([[[1, 2, 3],\n", " [4, 5, 6]]])\n", "\n", "ary.reshape(-1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Sometimes, we are interested in merging different arrays. Unfortunately, there is no efficient way to do this without creating a new array, since NumPy arrays have a fixed size. While combining arrays should be avoided if possible -- for reasons of computational efficiency -- it is sometimes necessary. To combine two or more array objects, we can use NumPy's `concatenate` function as shown in the following examples:" ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([1, 2, 3, 1, 2, 3])" ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary = np.array([1, 2, 3])\n", "\n", "# stack along the first axis\n", "np.concatenate((ary, ary)) " ] }, { "cell_type": "code", "execution_count": 57, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1, 2, 3],\n", " [1, 2, 3]])" ] }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary = np.array([[1, 2, 3]])\n", "\n", "# stack along the first axis (here: rows)\n", "np.concatenate((ary, ary), axis=0) " ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1, 2, 3, 1, 2, 3]])" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# stack along the second axis (here: column)\n", "np.concatenate((ary, ary), axis=1) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Comparison Operators and Masks" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the previous section, we already briefly introduced the concept of Boolean masks in NumPy. Boolean masks are `bool`-type arrays (storing `True` and `False` values) that have the same shape as a certain target array. 
For example, consider the 4-element array below. Using comparison operators (such as `<`, `>`, `<=`, and `>=`), we can create a Boolean mask of that array which consists of `True` and `False` elements depending on whether a condition is met in the target array (here: `ary`):" ] }, { "cell_type": "code", "execution_count": 59, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([False, False, True, True])" ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary = np.array([1, 2, 3, 4])\n", "mask = ary > 2\n", "mask" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once we have created such a Boolean mask, we can use it to select certain entries from the target array -- those entries that match the condition upon which the mask was created:" ] }, { "cell_type": "code", "execution_count": 60, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([3, 4])" ] }, "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary[mask]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Beyond the selection of elements from an array, Boolean masks can also come in handy when we want to count how many elements in an array meet a certain condition:" ] }, { "cell_type": "code", "execution_count": 61, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([False, False, True, True])" ] }, "execution_count": 61, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mask" ] }, { "cell_type": "code", "execution_count": 62, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 62, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mask.sum()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A related, useful function to assign values to specific elements in an array is the `np.where` function. 
In the example below, we assign 1 to all elements of the array that are greater than 2, and 0 otherwise:" ] }, { "cell_type": "code", "execution_count": 63, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0, 0, 1, 1])" ] }, "execution_count": 63, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.where(ary > 2, 1, 0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are also so-called bit-wise operators that we can use to specify more complex selection criteria:" ] }, { "cell_type": "code", "execution_count": 64, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0, 0, 1, 1])" ] }, "execution_count": 64, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary = np.array([1, 2, 3, 4])\n", "mask = ary > 2\n", "ary[mask] = 1\n", "ary[~mask] = 0\n", "ary" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `~` operator in the example above is one of NumPy's bit-wise operators, which act as element-wise logical operators on Boolean arrays:\n", " \n", "- And: `&` or `np.bitwise_and`\n", "- Or: `|` or `np.bitwise_or`\n", "- Xor: `^` or `np.bitwise_xor`\n", "- Not: `~` or `np.bitwise_not`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "These logical operators allow us to chain an arbitrary number of conditions to create even more \"complex\" Boolean masks. 
For example, using the \"Or\" operator, we can select all elements that are greater than 3 or smaller than 2 as follows:" ] }, { "cell_type": "code", "execution_count": 65, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ True, False, False, True])" ] }, "execution_count": 65, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ary = np.array([1, 2, 3, 4])\n", "\n", "(ary > 3) | (ary < 2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And, for example, to negate the condition, we can use the `~` operator:" ] }, { "cell_type": "code", "execution_count": 66, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([False, True, True, False])" ] }, "execution_count": 66, "metadata": {}, "output_type": "execute_result" } ], "source": [ "~((ary > 3) | (ary < 2))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Linear Algebra with NumPy Arrays" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Intuitively, we can think of one-dimensional NumPy arrays as data structures that represent row vectors:" ] }, { "cell_type": "code", "execution_count": 67, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([1, 2, 3])" ] }, "execution_count": 67, "metadata": {}, "output_type": "execute_result" } ], "source": [ "row_vector = np.array([1, 2, 3])\n", "row_vector" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Similarly, we can use two-dimensional arrays to create column vectors:" ] }, { "cell_type": "code", "execution_count": 68, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1],\n", " [2],\n", " [3]])" ] }, "execution_count": 68, "metadata": {}, "output_type": "execute_result" } ], "source": [ "column_vector = np.array([[1, 2, 3]]).reshape(-1, 1)\n", "column_vector" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Instead of reshaping a one-dimensional array into a two-dimensional one, we can simply add a new axis as shown below:" ] }, { 
"cell_type": "code", "execution_count": 69, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1],\n", " [2],\n", " [3]])" ] }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "row_vector[:, np.newaxis]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that in this context, `np.newaxis` behaves like `None`:" ] }, { "cell_type": "code", "execution_count": 70, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1],\n", " [2],\n", " [3]])" ] }, "execution_count": 70, "metadata": {}, "output_type": "execute_result" } ], "source": [ "row_vector[:, None]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "All three approaches listed above, using `reshape(-1, 1)`, `np.newaxis`, or `None` yield the same results -- all three approaches create views not copies of the `row_vector` array.\n", "\n", "As we remember from the Linear Algebra appendix, we can think of a column vector as a matrix consisting only of one column. To perform matrix multiplication between matrices, we learned that number of columns of the left matrix must match the number of rows of the matrix to the right. 
In NumPy, we can perform matrix multiplication via the `matmul` function:" ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, "outputs": [], "source": [ "matrix = np.array([[1, 2, 3], \n", " [4, 5, 6]])" ] }, { "cell_type": "code", "execution_count": 72, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[14],\n", " [32]])" ] }, "execution_count": 72, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.matmul(matrix, column_vector)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![](./images/numpy-intro/matmul.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, if we are working with matrices and vectors, NumPy can be quite forgiving if the dimensions of matrices and one-dimensional arrays do not match exactly: `matmul` treats a one-dimensional array on the right-hand side as a column vector and drops the extra dimension from the result. The following example therefore yields the same values as the matrix-column vector multiplication, except that it returns a one-dimensional array instead of a two-dimensional one:" ] }, { "cell_type": "code", "execution_count": 73, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([14, 32])" ] }, "execution_count": 73, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.matmul(matrix, row_vector)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Similarly, we can compute the dot product between two vectors (here: the squared L2 norm of the vector):" ] }, { "cell_type": "code", "execution_count": 74, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "14" ] }, "execution_count": 74, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.matmul(row_vector, row_vector)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "NumPy has a special `dot` function that behaves similarly to `matmul` on pairs of one- or two-dimensional arrays -- its underlying implementation is different though, and one or the other can be slightly faster on specific machines and versions of 
[BLAS](https://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms):" ] }, { "cell_type": "code", "execution_count": 75, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "14" ] }, "execution_count": 75, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.dot(row_vector, row_vector)" ] }, { "cell_type": "code", "execution_count": 76, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([14, 32])" ] }, "execution_count": 76, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.dot(matrix, row_vector)" ] }, { "cell_type": "code", "execution_count": 77, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[14],\n", " [32]])" ] }, "execution_count": 77, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.dot(matrix, column_vector)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Similar to the examples above we can use `matmul` or `dot` to multiply two matrices (here: two-dimensional arrays). In this context, NumPy arrays have a handy `transpose` method to transpose matrices if necessary:" ] }, { "cell_type": "code", "execution_count": 78, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1, 4],\n", " [2, 5],\n", " [3, 6]])" ] }, "execution_count": 78, "metadata": {}, "output_type": "execute_result" } ], "source": [ "matrix = np.array([[1, 2, 3], \n", " [4, 5, 6]])\n", "\n", "matrix.transpose()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![](./images/numpy-intro/transpose.png)" ] }, { "cell_type": "code", "execution_count": 79, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[14, 32],\n", " [32, 77]])" ] }, "execution_count": 79, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.matmul(matrix, matrix.transpose())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![](./images/numpy-intro/matmatmul.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "While `transpose` can be 
annoyingly verbose for implementing linear algebra operations -- think of [PEP8's](https://www.python.org/dev/peps/pep-0008/) *79 characters per line* recommendation -- NumPy has a shorthand for that, the `T` attribute:" ] }, { "cell_type": "code", "execution_count": 80, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1, 4],\n", " [2, 5],\n", " [3, 6]])" ] }, "execution_count": 80, "metadata": {}, "output_type": "execute_result" } ], "source": [ "matrix.T" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "While this section demonstrates some of the basic linear algebra operations carried out on NumPy arrays that we use in practice, you can find additional functions in the documentation of NumPy's submodule for linear algebra: [`numpy.linalg`](https://docs.scipy.org/doc/numpy/reference/routines.linalg.html). If you want to perform a particular linear algebra routine that is not implemented in NumPy, it is also worth consulting the [`scipy.linalg` documentation](https://docs.scipy.org/doc/scipy/reference/linalg.html) -- SciPy is a library for scientific computing built on top of NumPy." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "I want to mention that there is also a special [`matrix`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.matrix.html) type in NumPy. NumPy `matrix` objects are analogous to NumPy arrays but are restricted to two dimensions. Also, matrices define certain operations differently than arrays; for instance, the `*` operator performs matrix multiplication instead of element-wise multiplication. However, NumPy `matrix` is less popular in the science community compared to the more general array data structure. \n", "\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## SciPy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "SciPy is another open-source library from Python's scientific computing stack. 
SciPy includes submodules for integration, optimization, and many other kinds of computations that are out of the scope of NumPy itself. We will not cover SciPy as a library here, since it can be considered more of an \"add-on\" library on top of NumPy. Rather, we may discuss individual SciPy functions in the context of homework exercises if the need arises.\n", "\n", "In any case, I recommend taking a look at the SciPy documentation to get a brief overview of the different functions that exist within this library: [https://docs.scipy.org/doc/scipy/reference/](https://docs.scipy.org/doc/scipy/reference/)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Matplotlib" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Lastly, we will briefly cover Matplotlib in this lecture. Matplotlib is a plotting library for Python created by John D. Hunter in 2003. Unfortunately, John D. Hunter became ill and passed away in 2012. However, Matplotlib is still the most mature plotting library, and it is actively maintained to this day (in fact, version 3.1.1 was released just a few months ago, in July 2019).\n", "\n", "In general, Matplotlib is a rather \"low-level\" plotting library, which means that it has a lot of room for customization. The advantage of Matplotlib is that it is so customizable; the disadvantage of Matplotlib is that it is so customizable -- some people find it a little bit too verbose due to all the different options.\n", "\n", "In any case, Matplotlib is among the most widely used plotting libraries and the go-to choice for many data scientists and machine learning researchers and practitioners.\n", "\n", "In my opinion, the best way to work with Matplotlib is to use the Matplotlib gallery on the official website at [https://matplotlib.org/gallery/index.html](https://matplotlib.org/gallery/index.html) often. 
It contains code examples for creating various kinds of plots, which are useful as templates for creating your own plots. Also, if you are completely new to Matplotlib, I recommend the tutorials at [https://matplotlib.org/tutorials/index.html](https://matplotlib.org/tutorials/index.html).\n", "\n", "In this section, we will look at a few very simple examples, which should be very intuitive and shouldn't require much explanation." ] }, { "cell_type": "code", "execution_count": 81, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The main plotting functions of Matplotlib are contained in the `pyplot` module, which we imported above. Note that the `%matplotlib inline` command is an \"IPython magic\" command -- we discussed these in the last lecture. This particular command is specific to Jupyter notebooks (which, in our case, use an IPython kernel) and shows the plots \"inline,\" that is, within the notebook itself." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plotting Functions and Lines" ] }, { "cell_type": "code", "execution_count": 82, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "x = np.linspace(0, 10, 100)\n", "plt.plot(x, np.sin(x))\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Add axis ranges and labels:" ] }, { "cell_type": "code", "execution_count": 83, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "x = np.linspace(0, 10, 100)\n", "plt.plot(x, np.sin(x))\n", "\n", "plt.xlim([2, 8])\n", "plt.ylim([0, 0.75])\n", "\n", "plt.xlabel('x-axis')\n", "plt.ylabel('y-axis')\n", "\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": 84, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "x = np.linspace(0, 10, 100)\n", "\n", "plt.plot(x, np.sin(x), label=('sin(x)'))\n", "plt.plot(x, np.cos(x), label=('cos(x)'))\n", "\n", "plt.ylabel('f(x)')\n", "plt.xlabel('x')\n", "\n", "plt.legend(loc='lower left')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Scatter Plots" ] }, { "cell_type": "code", "execution_count": 85, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "rng = np.random.RandomState(123)\n", "x = rng.normal(size=500)\n", "y = rng.normal(size=500)\n", "\n", "\n", "plt.scatter(x, y)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Bar Plots" ] }, { "cell_type": "code", "execution_count": 86, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXcAAAD4CAYAAAAXUaZHAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAP80lEQVR4nO3db4hdd53H8ffHJP4BxaqZ3YY001Hsg1VRW4dYEZZBXahV2gUrREGNWIKuRQX3QepCxT6K+0BBK3YjLaYitVJFo6ZIRUV90Og0pH/S6G4s7ja0bmOqqUG3Eve7D+ZWZm/v9J47cyc39+f7BZc5f7733O+PUz75zZlzT1NVSJLa8oxJNyBJGj/DXZIaZLhLUoMMd0lqkOEuSQ3aOKkP3rx5c83NzU3q4yVpKt19992/qaqZYXUTC/e5uTkWFxcn9fGSNJWS/GeXOi/LSFKDDHdJapDhLkkNMtwlqUGGuyQ1yHCXpAYNDfckz07y0yT3JDmS5BMDanYmOZHkcO919fq0K0nqost97k8Ab6iq00k2AT9JckdV3dVXd1tVXTP+FiVJoxoa7rX0wPfTvdVNvZcPgZekc1ina+5JNiQ5DDwK3FlVBweUvS3JvUluT7JthePsSrKYZPHEiRNraFuSliwsLLCwsDDpNs45ncK9qv5cVa8GLgC2J3lFX8m3gLmqeiXwPWDfCsfZW1XzVTU/MzP00QiSpFUa6W6Zqvod8EPgsr7tJ6vqid7qF4DXjKU7SdKqdLlbZibJeb3l5wBvAn7eV7Nl2eoVwNFxNilJGk2Xu2W2APuSbGDpH4OvVtW3k1wPLFbVfuBDSa4AzgCPATvXq2FJ0nBd7pa5F7h4wPbrli1fC1w73tYkSavlN1QlqUGGuyQ1yHCXpAYZ7pLUIMNdkhpkuEtSgwx3SWqQ4S5JDTLcJalBhrskNchwl6QGGe6S1CDDXZIaZLhLUoMMd0lqkOEuSQ0y3CWpQYa7JDXIcJekBhnuktSgoeGe5NlJfprkniRHknxiQM2zktyW5FiSg0nm1qNZSVI3XWbuTwBvqKpXAa8GLktyaV/N+4DfVtVLgU8Dnxxvm5KkUQwN91pyure6qfeqvrIrgX295duBNybJ2LqUJI1kY5eiJBuAu4GXAp+rqoN9JVuBhwCq6kySU8CLgN/0HWcXsAtgdnZ2bZ1LWldzu78z6RY6+fWDJ4Hp6RfgV3vesu6f0ekPqlX156p6NXABsD3JK/pKBs3S+2f3VNXeqpqvqvmZmZnRu5UkdTLS3TJV9Tvgh8BlfbuOA9sAkmwEng88Nob+JEmr0OVumZkk5/WWnwO8Cfh5X9l+4D295auA71fVU2bukqSzo8s19y3Avt5192cAX62qbye5Hlisqv3ATcCXkhxjaca+Y906liQNNTTcq+pe4OIB269btvw/wNvH25okabX8hqokNchwl6QGGe6S1CDDXZIaZLhLUoMMd0lqkOEuSQ0y3CWpQYa7JDXIcJekBhnuktQgw12SGmS4S1KDDHdJapDhLkkNMtwlqUGGuyQ1yHCXpAYZ7pLUI
MNdkhpkuEtSg4aGe5JtSX6Q5GiSI0k+PKBmIcmpJId7r+vWp11JUhcbO9ScAT5aVYeSPA+4O8mdVfVAX92Pq+qt429RkjSqoTP3qnqkqg71ln8PHAW2rndjkqTV6zJz/4skc8DFwMEBu1+X5B7gYeCfq+rIgPfvAnYBzM7OjtqrJD3F+e/cM+kWzkmd/6Ca5LnA14CPVNXjfbsPARdW1auAzwLfGHSMqtpbVfNVNT8zM7PaniVJQ3QK9ySbWAr2L1fV1/v3V9XjVXW6t3wA2JRk81g7lSR11uVumQA3AUer6lMr1JzfqyPJ9t5xT46zUUlSd12uub8eeBdwX5LDvW0fA2YBqupG4CrgA0nOAH8EdlRVrUO/kqQOhoZ7Vf0EyJCaG4AbxtWUJGlt/Iaq1GdhYYGFhYVJtyGtieEuSQ0y3CWpQYa7JDXIcJekBhnuktQgw12SGmS4S1KDDHdJapDhLkkNMtwlqUGGuyQ1yHCXpAYZ7pLUIMNdkhpkuEtSgwx3SWqQ4S5JDTLcJalBhrskNWhouCfZluQHSY4mOZLkwwNqkuQzSY4luTfJJevTriSpi40das4AH62qQ0meB9yd5M6qemBZzZuBi3qv1wKf7/2UJE3A0Jl7VT1SVYd6y78HjgJb+8quBG6pJXcB5yXZMvZuJUmddJm5/0WSOeBi4GDfrq3AQ8vWj/e2PdL3/l3ALoDZ2dnROtVUm9v9nUm30NmvHzwJTE/Pv9rzlkm3oHNQ5z+oJnku8DXgI1X1eP/uAW+pp2yo2ltV81U1PzMzM1qnkqTOOoV7kk0sBfuXq+rrA0qOA9uWrV8APLz29iRJq9HlbpkANwFHq+pTK5TtB97du2vmUuBUVT2yQq0kaZ11ueb+euBdwH1JDve2fQyYBaiqG4EDwOXAMeAPwHvH36okqauh4V5VP2HwNfXlNQV8cFxNSZLWxm+oSlKDDHdJapDhLkkNMtwlqUGGuyQ1yHCXpAYZ7pLUIMNdkhpkuEtSgwx3SWqQ4S5JDTLcJalBhrskNchwl6QGGe6S1CDDXZIaZLhLUoMMd0lqUJf/h6r0V+X8d+6ZdAvSmjlzl6QGGe6S1KCh4Z7k5iSPJrl/hf0LSU4lOdx7XTf+NiVJo+hyzf2LwA3ALU9T8+OqeutYOpIkrdnQmXtV/Qh47Cz0Ikkak3Fdc39dknuS3JHk5SsVJdmVZDHJ4okTJ8b00ZKkfuMI90PAhVX1KuCzwDdWKqyqvVU1X1XzMzMzY/hoSdIgaw73qnq8qk73lg8Am5JsXnNnkqRVW3O4Jzk/SXrL23vHPLnW40qSVm/o3TJJbgUWgM1JjgMfBzYBVNWNwFXAB5KcAf4I7KiqWreOJUlDDQ33qnrHkP03sHSrpCTpHOE3VCWpQYa7JDXIcJekBhnuktQgw12SGmS4S1KDDHdJapDhLkkNMtwlqUGGuyQ1yHCXpAYZ7pLUIMN9nS0sLLCwsDDpNiT9lTHcJalBhrskNchwl6QGGe6S1CDDXZIaZLhLUoMMd0lq0NBwT3JzkkeT3L/C/iT5TJJjSe5Ncsn425QkjaLLzP2LwGVPs//NwEW91y7g82tvS5K0FkPDvap+BDz2NCVXArfUkruA85JsGVeDkqTRjeOa+1bgoWXrx3vbJEkTsnEMx8iAbTWwMNnF0qUbZmdnV/2Bc7u/s+r3nm2/fvAkMD09/2rPWybdgqQxGMfM/Tiwbdn6BcDDgwqram9VzVfV/MzMzBg+WpI0yDjCfT/w7t5dM5cCp6rqkTEcV5K0SkMvyyS5FVgANic5Dnwc2ARQVTcCB4DLgWPAH4D3rlezkqRuhoZ7Vb1jyP4CPji2jiRJa+Y3VCWpQYa7JDXIcJekBhnuktQgw12SGmS4S1KDxvH4AT2N89+5Z9ItSPor5MxdkhpkuEtSgwx3SWqQ4S5JDTLcJalBhrskNchwl6QGGe6S1CDDXZIaZLhLUoMMd0lqkOEuSQ0y3CWpQYa7JDWoU7gnu
SzJL5IcS7J7wP6dSU4kOdx7XT3+ViVJXQ19nnuSDcDngH8AjgM/S7K/qh7oK72tqq5Zhx4lSSPqMnPfDhyrqger6k/AV4Ar17ctSdJadAn3rcBDy9aP97b1e1uSe5PcnmTboAMl2ZVkMcniiRMnVtGuJKmLLuGeAduqb/1bwFxVvRL4HrBv0IGqam9VzVfV/MzMzGidSpI66xLux4HlM/ELgIeXF1TVyap6orf6BeA142lPkrQaXcL9Z8BFSV6c5JnADmD/8oIkW5atXgEcHV+LkqRRDb1bpqrOJLkG+C6wAbi5qo4kuR5YrKr9wIeSXAGcAR4Ddq5jz5KkIYaGO0BVHQAO9G27btnytcC1421NkrRafkNVkhpkuEtSgwx3SWqQ4S5JDTLcJalBhrskNchwl6QGGe6S1CDDXZIaZLhLUoMMd0lqkOEuSQ0y3CWpQYa7JDXIcJekBhnuktQgw12SGmS4S1KDDHdJapDhLkkNMtwlqUGdwj3JZUl+keRYkt0D9j8ryW29/QeTzI27UUlSd0PDPckG4HPAm4GXAe9I8rK+svcBv62qlwKfBj457kYlSd11mblvB45V1YNV9SfgK8CVfTVXAvt6y7cDb0yS8bUpSRrFxg41W4GHlq0fB167Uk1VnUlyCngR8JvlRUl2Abt6q6eT/KLvOJv739OIqRlXRvuda2rGtQpTMzbPGTBl41rjObuwy5u6hPugGXitooaq2gvsXfGDksWqmu/Q01RxXNOn1bE5rumz2rF1uSxzHNi2bP0C4OGVapJsBJ4PPDZqM5Kk8egS7j8DLkry4iTPBHYA+/tq9gPv6S1fBXy/qp4yc5cknR1DL8v0rqFfA3wX2ADcXFVHklwPLFbVfuAm4EtJjrE0Y9+xyn5WvGQz5RzX9Gl1bI5r+qxqbHGCLUnt8RuqktQgw12SGjTRcE/ywiR3JvmP3s8XrFD35ySHe6/+P+aeM1p9TEOHce1McmLZObp6En2OKsnNSR5Ncv8K+5PkM71x35vkkrPd42p0GNdCklPLztd1Z7vH1UiyLckPkhxNciTJhwfUTN056ziu0c9ZVU3sBfwrsLu3vBv45Ap1pyfZZ8exbAB+CbwEeCZwD/Cyvpp/Am7sLe8Abpt032Ma107ghkn3uoqx/T1wCXD/CvsvB+5g6XsclwIHJ93zmMa1AHx70n2uYlxbgEt6y88D/n3Af4tTd846jmvkczbpyzLLH1uwD/jHCfayVq0+pqHLuKZSVf2Ip/8+xpXALbXkLuC8JFvOTner12FcU6mqHqmqQ73l3wNHWfp2/HJTd846jmtkkw73v62qR2BpgMDfrFD37CSLSe5Kcq7+AzDoMQ39J+j/PaYBePIxDeeyLuMCeFvv1+Dbk2wbsH8adR37NHpdknuS3JHk5ZNuZlS9S5oXAwf7dk31OXuaccGI56zL4wfWJMn3gPMH7PqXEQ4zW1UPJ3kJ8P0k91XVL8fT4diM7TEN55guPX8LuLWqnkjyfpZ+O3nDune2/qbxfHVxCLiwqk4nuRz4BnDRhHvqLMlzga8BH6mqx/t3D3jLVJyzIeMa+Zyt+8y9qt5UVa8Y8Pom8N9P/srU+/noCsd4uPfzQeCHLP3Ldq5p9TENQ8dVVSer6one6heA15yl3tZbl3M6darq8ao63Vs+AGxKsnnCbXWSZBNLAfjlqvr6gJKpPGfDxrWaczbpyzLLH1vwHuCb/QVJXpDkWb3lzcDrgQfOWofdtfqYhqHj6rumeQVL1wxbsB94d+8OjEuBU09eRpxmSc5/8m89SbazlAMnJ9vVcL2ebwKOVtWnViibunPWZVyrOWfrfllmiD3AV5O8D/gv4O0ASeaB91fV1cDfAf+W5H9ZGtCeqjrnwr3O7mMazpqO4/pQkiuAMyyNa+fEGh5BkltZugthc5LjwMeBTQBVdSNwgKW7L44BfwDeO5lOR9NhXFcBH0hyBvgjsGMKJhmwNLF7F3BfksO9bR8DZmGqz1mXc
Y18znz8gCQ1aNKXZSRJ68Bwl6QGGe6S1CDDXZIaZLhLUoMMd0lqkOEuSQ36PzrfkgdeZIaDAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# input data\n", "means = [1, 2, 3]\n", "stddevs = [0.2, 0.4, 0.5]\n", "bar_labels = ['bar 1', 'bar 2', 'bar 3']\n", "\n", "\n", "# plot bars\n", "x_pos = list(range(len(bar_labels)))\n", "plt.bar(x_pos, means, yerr=stddevs)\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Histograms" ] }, { "cell_type": "code", "execution_count": 87, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXkAAAD4CAYAAAAJmJb0AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAPdklEQVR4nO3df6xkZ13H8ffHLgX5ZVt626zb6m7NihATbb3BKkIMi0q7yBalpsTIBptsTEBBNLLYRPhzqwJKQiArrSym8sMC6QZUqGuR+Acrd0vpD5a627KUpcvuRX4qptDw9Y85a6bL3Ht358ydufv0/Upu5pxnzsz55pm5n/vcc848k6pCktSmH5p1AZKk1WPIS1LDDHlJapghL0kNM+QlqWHrZl0AwIUXXlgbN26cdRmSdFY5cODAV6tqbrlt1kTIb9y4kYWFhVmXIUlnlSRfXGkbD9dIUsMMeUlqmCEvSQ0z5CWpYYa8JDXMkJekhhnyktQwQ16SGmbIS1LD1sQnXqW1ZOPOjy5535FdW6dYidSfI3lJapghL0kNM+QlqWGGvCQ1zJCXpIYZ8pLUMENekhpmyEtSwwx5SWqYIS9JDTPkJalhhrwkNcwJyqQzsNzkZeAEZlp7HMlLUsNWDPkkNyc5keTeobYLktye5FB3e37XniRvS3I4yd1JrljN4iVJyzudkfy7gRed0rYT2FdVm4F93TrAVcDm7mcH8I7JlClJGseKIV9VnwS+dkrzNmBPt7wHuGao/T018CngvCTrJ1WsJOnMjHtM/uKqOgbQ3V7UtW8AvjS03dGuTZI0A5M+8ZoRbTVyw2RHkoUkC4uLixMuQ5IE44f88ZOHYbrbE137UeDSoe0uAR4e9QRVtbuq5qtqfm5ubswyJEnLGTfk9wLbu+XtwG1D7a/orrK5EvjmycM6kqTpW/HDUEneC/wycGGSo8AbgV3AB5JcDzwEXNtt/o/A1cBh4DvAK1ehZknSaVox5Kvq5UvctWXEtgW8qm9RkqTJcFoDPe6sNDWB1BKnNZCkhhnyktQwQ16SGmbIS1LDDHlJapghL0kNM+QlqWGGvCQ1zJCXpIYZ8pLUMENekhpmyEtSwwx5SWqYIS9JDTPkJalhhrwkNcyQl6SGGfKS1DBDXpIaZshLUsMMeUlqmCEvSQ1bN+sCpJZs3PnRZe8/smvrlCqRBhzJS1LDDHlJapghL0kNM+QlqWGeeFVzVjr5KT2e9BrJJ/nDJPcluTfJe5M8KcmmJPuTHEry/iTnTqpYSdKZGTvkk2wA/gCYr6qfBs4BrgNuBN5aVZuBrwPXT6JQSdKZ63tMfh3ww0nWAU8GjgEvAG7t7t8DXNNzH5KkMY0d8lX1ZeAvgYcYhPs3gQPAN6rq0W6zo8CGUY9PsiPJQpKFxcXFccuQJC2jz+Ga84FtwCbgR4GnAFeN2LRGPb6qdlfVfFXNz
83NjVuGJGkZfQ7XvBD4QlUtVtX3gA8Bvwic1x2+AbgEeLhnjZKkMfUJ+YeAK5M8OUmALcDngDuAl3XbbAdu61eiJGlcfY7J72dwgvVO4J7uuXYDrwdel+Qw8AzgpgnUKUkaQ68PQ1XVG4E3ntL8IPCcPs8rSZoMpzWQpIYZ8pLUMENekhpmyEtSwwx5SWqYIS9JDTPkJalhhrwkNcyQl6SGGfKS1DBDXpIaZshLUsMMeUlqmCEvSQ0z5CWpYb3mk5c0WRt3fnTZ+4/s2jqlStQKR/KS1DBH8tIUrTRSlybNkbwkNcyQl6SGGfKS1DBDXpIaZshLUsMMeUlqmCEvSQ0z5CWpYYa8JDXMkJekhhnyktSwXiGf5Lwktyb5fJKDSX4hyQVJbk9yqLs9f1LFSpLOTN+R/F8D/1xVPwX8DHAQ2Ansq6rNwL5uXZI0A2OHfJKnA88HbgKoqu9W1TeAbcCebrM9wDV9i5QkjafPSP4yYBH42ySfSfKuJE8BLq6qYwDd7UWjHpxkR5KFJAuLi4s9ypAkLaVPyK8DrgDeUVWXA//DGRyaqardVTVfVfNzc3M9ypAkLaVPyB8FjlbV/m79VgahfzzJeoDu9kS/EiVJ4xo75KvqK8CXkjyza9oCfA7YC2zv2rYDt/WqUJI0tr5f//f7wC1JzgUeBF7J4A/HB5JcDzwEXNtzH9Jj+BV60unrFfJVdRcwP+KuLX2eV5I0GX7iVZIaZshLUsMMeUlqmCEvSQ0z5CWpYX0voZQmzkskpclxJC9JDXMkL51Flvsv58iurVOsRGcLR/KS1DBDXpIaZshLUsMMeUlqmCEvSQ0z5CWpYYa8JDXMkJekhhnyktQwQ16SGmbIS1LDDHlJapghL0kNM+QlqWFONayZ8ItBpOlwJC9JDTPkJalhhrwkNcyQl6SGGfKS1LDeIZ/knCSfSfKRbn1Tkv1JDiV5f5Jz+5cpSRrHJEbyrwEODq3fCLy1qjYDXweun8A+JElj6BXySS4BtgLv6tYDvAC4tdtkD3BNn31IksbXdyT/V8CfAN/v1p8BfKOqHu3WjwIbRj0wyY4kC0kWFhcXe5YhSRpl7JBP8mLgRFUdGG4esWmNenxV7a6q+aqan5ubG7cMSdIy+kxr8FzgJUmuBp4EPJ3ByP68JOu60fwlwMP9y5QkjWPskXxVvaGqLqmqjcB1wL9W1W8DdwAv6zbbDtzWu0pJ0lhW4zr51wOvS3KYwTH6m1ZhH5Kk0zCRWSir6hPAJ7rlB4HnTOJ5JUn9+IlXSWqYIS9JDfNLQ6RGrPRFLEd2bZ1SJVpLHMlLUsMMeUlqmCEvSQ0z5CWpYYa8JDXMq2ukxwmvvnl8ciQvSQ0z5CWpYYa8JDXMkJekhhnyktQwQ16SGmbIS1LDDHlJapghL0kNM+QlqWGGvCQ1zJCXpIY5QZlWxUqTYUmaDkfyktQwQ16SGmbIS1LDDHlJapghL0kNM+QlqWFeQqmxeImkdHYYeySf5NIkdyQ5mOS+JK/p2i9IcnuSQ93t+ZMrV5J0JvocrnkU+KOqehZwJfCqJM8GdgL7qmozsK9blyTNwNghX1XHqurObvnbwEFgA7AN2NNttge4pm+RkqTxTOTEa5KNwOXAfuDiqjoGgz8EwEVLPGZHkoUkC4uLi5MoQ5J0it4hn+SpwAeB11bVt073cVW1u6rmq2p+bm6ubxmSpBF6XV2T5AkMAv6WqvpQ13w8yfqqOpZkPXCib5GSVt9KV0wd2bV1SpVokvpcXRPgJuBgVb1l6K69wPZueTtw2/jlSZL66DOSfy7wO8A9Se7q2v4U2AV8IMn1wEPAtf1KlCSNa+yQr6p/B7LE3VvGfV5J0uQ4rYEkNcyQl6SGGfKS1DBDXpIaZshLUsMMeUlqmCEvSQ3zS0M0kl8KIrXBkbwkNcyQl6SGGfKS1DBDXpIaZshLUsMMeUlqmCEvSQ0z5CWpYYa8JDXMkJekhhnyktQwQ16SGmbIS1LDnIXyc
cpZJnWmVnrPHNm1dUqV6Ew4kpekhhnyktQwQ16SGmbIS1LDPPF6FvNEmNaS5d6Pvhdnx5G8JDUsVTXrGpifn6+FhYVZl7HmeJmjNOB/AqMlOVBV88ttsyoj+SQvSnJ/ksNJdq7GPiRJK5v4Mfkk5wBvB34FOAp8OsneqvrcpPclSafj8Xy+YDVG8s8BDlfVg1X1XeB9wLZV2I8kaQWrcXXNBuBLQ+tHgZ8/daMkO4Ad3ep/J7l/zP1dCHx1zMeuNmsbj7WNp9nacuMEK/nB5z6b++3HV3qC1Qj5jGj7gbO7VbUb2N17Z8nCSiceZsXaxmNt47G28bRe22ocrjkKXDq0fgnw8CrsR5K0gtUI+U8Dm5NsSnIucB2wdxX2I0lawcQP11TVo0leDXwMOAe4uarum/R+hvQ+5LOKrG081jYeaxtP07WtiQ9DSZJWh9MaSFLDDHlJathZFfJJrk1yX5LvJ5k/5b43dNMo3J/k14bapz7FQpL3J7mr+zmS5K6ufWOS/x26753TqOeU2t6U5MtDNVw9dN/IPpxibX+R5PNJ7k7y4STnde0z77eujjUzXUeSS5PckeRg9zvxmq59ydd3yvUdSXJPV8NC13ZBktuTHOpuz59BXc8c6pu7knwryWtn1W9Jbk5yIsm9Q20j+ykDb+vef3cnueK0dlJVZ80P8CzgmcAngPmh9mcDnwWeCGwCHmBw0vecbvky4Nxum2dPueY3A3/WLW8E7p1xH74J+OMR7SP7cMq1/Sqwrlu+EbhxDfXbzN9Lp9SzHriiW34a8J/dazjy9Z1BfUeAC09p+3NgZ7e88+TrO+PX9CsMPlA0k34Dng9cMfz+XqqfgKuBf2LwWaQrgf2ns4+zaiRfVQeratQnY7cB76uqR6rqC8BhBtMrzHSKhSQBfgt477T22cNSfTg1VfXxqnq0W/0Ug89YrBVrarqOqjpWVXd2y98GDjL4tPlatg3Y0y3vAa6ZYS0AW4AHquqLsyqgqj4JfO2U5qX6aRvwnhr4FHBekvUr7eOsCvlljJpKYcMy7dPyPOB4VR0aatuU5DNJ/i3J86ZYy7BXd//u3Tz0L/Os++pUv8tg1HLSrPttrfXP/0uyEbgc2N81jXp9p62Ajyc5kMEUJgAXV9UxGPyRAi6aUW0nXcdjB2Brod9g6X4a6z245kI+yb8kuXfEz3KjpqWmUjitKRZWsc6X89g30THgx6rqcuB1wN8nefok6jmD2t4B/ATws109bz75sBFPNfHra0+n35LcADwK3NI1TaXfVip9RNvMrz9O8lTgg8Brq+pbLP36Tttzq+oK4CrgVUmeP6M6Rsrgg5ovAf6ha1or/bacsd6Da+7r/6rqhWM8bLmpFFZlioWV6kyyDvgN4OeGHvMI8Ei3fCDJA8BPAhP9xpTT7cMkfwN8pFudynQUp9Fv24EXA1uqOxA5rX5bwZqbriPJExgE/C1V9SGAqjo+dP/w6ztVVfVwd3siyYcZHO46nmR9VR3rDjOcmEVtnauAO0/211rpt85S/TTWe3DNjeTHtBe4LskTk2wCNgP/wWynWHgh8PmqOnqyIclcBvPtk+Syrs4Hp1TPyRqGj+G9FDh5Vn+pPpxmbS8CXg+8pKq+M9Q+835jjU3X0Z3vuQk4WFVvGWpf6vWdZm1PSfK0k8sMTqjfy6C/tnebbQdum3ZtQx7zX/Za6LchS/XTXuAV3VU2VwLfPHlYZ1mzPLs9xpnolzL4a/YIcBz42NB9NzC4+uF+4Kqh9qsZXHnwAHDDFGt9N/B7p7T9JnAfgysz7gR+fQZ9+HfAPcDd3Ztm/Up9OMXaDjM45nhX9/POtdJvs3wvLVHLLzH4V/3uof66ernXd4q1Xda9Vp/tXrcbuvZnAPuAQ93tBTPquycD/wX8yFDbTPqNwR+aY8D3umy7fql+YnC45u3d++8ehq4wXO7HaQ0kqWGtHK6RJI1gyEtSwwx5SWqYIS9JDTPkJalhhrwkNcyQl6SG/R/mp
xw0VoFgEQAAAABJRU5ErkJggg==\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "rng = np.random.RandomState(123)\n", "x = rng.normal(0, 20, 1000) \n", "\n", "# fixed bin size\n", "bins = np.arange(-100, 100, 5) # fixed bin size\n", "\n", "plt.hist(x, bins=bins)\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": 88, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXkAAAD4CAYAAAAJmJb0AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAATHklEQVR4nO3df6xcZ33n8fdnww+phW4SchNZSVwnyKClqOukVyErNohsoJtELCbdhY1V0dBma5CIFmhXIiFSQStV0B8harW7QaaxElYhhG7IEq3SBTdKiSo1lOtgHKcOxE5NMPHat6EiSEHZdfjuH3NuO7nM/eGZuTPXj98vaTRnnnPOzNfPjD9z7jPnR6oKSVKb/sm0C5AkrR1DXpIaZshLUsMMeUlqmCEvSQ172bQLADjrrLNq06ZN0y5Dkk4qu3fv/ruqmllumXUR8ps2bWJubm7aZUjSSSXJd1daxuEaSWqYIS9JDTPkJalhhrwkNcyQl6SGGfKS1DBDXpIaZshLUsNWDPkk5yd5KMn+JI8n+VDXfmaSXUme7O7P6NqT5I+THEiyN8nFa/2PkCQNtpojXo8Dv11VjyZ5NbA7yS7gfcCDVfWpJDcCNwIfBa4CNne3NwG3dfeSVvLQJ5eff/lNk6lDzVhxS76qjlTVo930j4D9wLnAVuDObrE7gXd101uBz1XPI8DpSTaMvXJJ0opOaEw+ySbgIuDrwDlVdQR6XwTA2d1i5wLf61vtcNe2+Lm2J5lLMjc/P3/ilUuSVrTqkE/yKuBe4MNV9dxyiw5o+6kLyVbVjqqararZmZllT6ImSRrSqkI+ycvpBfxdVfWlrvnowjBMd3+saz8MnN+3+nnAM+MpV5J0Ilazd02A24H9VfXpvln3A9d109cBX+5r/7VuL5tLgR8uDOtIkiZrNXvXvBl4L/BYkj1d28eATwFfTHI98DTw7m7eA8DVwAHgeeDXx1qxJGnVVgz5qvpLBo+zA1wxYPkCPjhiXZKkMfCIV0lqmCEvSQ1bF9d4lU4ZKx3RKo2ZW/KS1DBDXpIaZshLUsMMeUlqmCEvSQ0z5CWpYYa8JDXMkJekhhnyktQwQ16SGmbIS1LDDHlJapghL0kNW83l/3YmOZZkX1/bPUn2dLdDC1eMSrIpyY/75n1mLYuXJC1vNacavgP4L8DnFhqq6t8vTCe5Bfhh3/IHq2rLuAqUJA1vNZf/ezjJpkHzuot8vwf4V+MtS5I0DqOOyV8GHK2qJ/vaLkjyzSRfS3LZUism2Z5kLsnc/Pz8iGVIkgYZNeS3AXf3PT4CbKyqi4DfAj6f5OcGrVhVO6pqtqpmZ2ZmRixDkjTI0CGf5GXArwD3LLRV1QtV9Ww3vRs4CLxu1CIlScMZZUv+bcATVXV4oSHJTJLTuukLgc3AU6OVKEka1mp2obwb+Cvg9UkOJ7m+m3UtLx2qAXgLsDfJt4D/AXygqn4wzoIlSau3mr1rti3R/r4BbfcC945eliRpHDziVZIaZshLUsMMeUlqmCEvSQ0z5CWpYYa8JDXMkJekhhnyktQwQ16SGraai4ZIWi8e+uTy8y+/aTJ16KT
hlrwkNcyQl6SGGfKS1DBDXpIaZshLUsMMeUlqmCEvSQ1bcT/5JDuBdwDHquqNXdsngN8E5rvFPlZVD3TzbgKuB14E/mNVfWUN6pbWp5X2Y5cmbDVb8ncAVw5ov7WqtnS3hYB/A71rv/5Ct85/W7iwtyRp8lYM+ap6GFjtxbi3Al+oqheq6m+BA8AlI9QnSRrBKGPyNyTZm2RnkjO6tnOB7/Utc7hr+ylJtieZSzI3Pz8/aBFJ0oiGDfnbgNcCW4AjwC1dewYsW4OeoKp2VNVsVc3OzMwMWYYkaTlDhXxVHa2qF6vqJ8Bn+cchmcPA+X2Lngc8M1qJkqRhDRXySTb0PbwG2NdN3w9cm+SVSS4ANgN/PVqJkqRhrWYXyruBtwJnJTkMfBx4a5It9IZiDgHvB6iqx5N8Efgb4Djwwap6cW1KlyStZMWQr6ptA5pvX2b53wV+d5SiJEnj4RGvktQwQ16SGmbIS1LDDHlJapghL0kNM+QlqWGGvCQ1zJCXpIYZ8pLUMENekhpmyEtSwwx5SWqYIS9JDTPkJalhhrwkNcyQl6SGrRjySXYmOZZkX1/bHyR5IsneJPclOb1r35Tkx0n2dLfPrGXxkqTlrWZL/g7gykVtu4A3VtUvAt8Bbuqbd7CqtnS3D4ynTEnSMFYM+ap6GPjBoravVtXx7uEjwHlrUJskaUTjGJP/DeDP+h5fkOSbSb6W5LIxPL8kaUgrXsh7OUluBo4Dd3VNR4CNVfVskl8C/meSX6iq5wasux3YDrBx48ZRypAkLWHoLfkk1wHvAH61qgqgql6oqme76d3AQeB1g9avqh1VNVtVszMzM8OWIUlaxlAhn+RK4KPAO6vq+b72mSSnddMXApuBp8ZRqCTpxK04XJPkbuCtwFlJDgMfp7c3zSuBXUkAHun2pHkL8J+THAdeBD5QVT8Y+MSSpDW3YshX1bYBzbcvsey9wL2jFiVJGg+PeJWkhhnyktQwQ16SGmbIS1LDDHlJapghL0kNM+QlqWGGvCQ1zJCXpIYZ8pLUsJFONSy16NZd31l2/kfePvDEqtK65Ja8JDXMkJekhhnyktQwQ16SGmbIS1LDDHlJatiqdqFMspPeRbuPVdUbu7YzgXuATcAh4D1V9ffpXQ/wj4CrgeeB91XVo+MvXZqO5Xax/Ig7JWudWe2W/B3AlYvabgQerKrNwIPdY4Cr6F3AezOwHbht9DIlScNYVchX1cPA4gtybwXu7KbvBN7V1/656nkEOD3JhnEUK0k6MaOMyZ9TVUcAuvuzu/Zzge/1LXe4a3uJJNuTzCWZm5+fH6EMSdJS1uKH1wxoq59qqNpRVbNVNTszM7MGZUiSRgn5owvDMN39sa79MHB+33LnAc+M8DqSpCGNsi/A/cB1wKe6+y/3td+Q5AvAm4AfLgzrSOvBSicgk1qy2l0o7wbeCpyV5DDwcXrh/sUk1wNPA+/uFn+A3u6TB+jtQvnrY65ZkrRKqwr5qtq2xKwrBixbwAdHKUqSNB4e8SpJDTPkJalhHoQtteShTy497/KbJleH1g235CWpYYa8JDXMkJekhhnyktQwQ16SGmbIS1LDDHlJapj7yUtj9FdPPbvs/H9x4WsmVInU45a8JDXMkJekhjlco+Z4vnjpHxny0gm69Okd0y5BWjWHaySpYYa8JDVs6OGaJK8H7ulruhD4HeB04DeB+a79Y1X1wNAVSpKGNnTIV9W3gS0ASU4Dvg/cR++arrdW1R+OpUJJ0tDGNVxzBXCwqr47pueTJI3BuEL+WuDuvsc3JNmbZGeSMwatkGR7krkkc/Pz84MWkSSNaOSQT/IK4J3An3ZNtwGvpTeUcwS4ZdB6VbWjqmaranZmZmbUMiRJA4xjS/4q4NGqOgpQVUer6sWq+gnwWeCSMbyGJGkI4wj5bfQN1STZ0DfvGmDfGF5DkjSEkY54TfIzwNuB9/c1/36SLUABhxbNkyRN0EghX1XPA69Z1PbekSqSJI2N566R1hH
PR69xM+SlCVopxKVx89w1ktQwQ16SGmbIS1LDDHlJapghL0kNM+QlqWGGvCQ1zJCXpIYZ8pLUMI941Unn1l3fmXYJ0knDLXlJapghL0kNM+QlqWGOyWtdctxdGo+RQz7JIeBHwIvA8aqaTXImcA+wid7Vod5TVX8/6mtJpzrPN68TNa7hmsuraktVzXaPbwQerKrNwIPdY0nShK3VmPxW4M5u+k7gXWv0OpKkZYwj5Av4apLdSbZ3bedU1RGA7v7sMbyOJOkEjeOH1zdX1TNJzgZ2JXliNSt1XwjbATZu3DiGMiRJi428JV9Vz3T3x4D7gEuAo0k2AHT3xwast6OqZqtqdmZmZtQyJEkDjBTySX42yasXpoFfBvYB9wPXdYtdB3x5lNeRJA1n1OGac4D7kiw81+er6n8n+QbwxSTXA08D7x7xdSRJQxgp5KvqKeCfD2h/FrhilOeWJI3OI141FR7RKk2GIS8tcunTO6ZdgjQ2nqBMkhrmlrx0qnjok8vPv/ymydShiXJLXpIaZshLUsMMeUlqmCEvSQ0z5CWpYe5dIzVkuStHedWoU5Nb8pLUMENekhrmcI10iljxIuCXT6gQTZRb8pLUMENekhpmyEtSwwx5SWrY0CGf5PwkDyXZn+TxJB/q2j+R5PtJ9nS3q8dXriTpRIyyd81x4Ler6tHuYt67k+zq5t1aVX84enmSpFEMHfJVdQQ40k3/KMl+4NxxFaaTm5f3k9aHsYzJJ9kEXAR8vWu6IcneJDuTnLHEOtuTzCWZm5+fH0cZkqRFRg75JK8C7gU+XFXPAbcBrwW20NvSv2XQelW1o6pmq2p2ZmZm1DIkSQOMFPJJXk4v4O+qqi8BVNXRqnqxqn4CfBa4ZPQyJUnDGHpMPkmA24H9VfXpvvYN3Xg9wDXAvtFK1HrkmLt0chhl75o3A+8FHkuyp2v7GLAtyRaggEPA+0eqUJI0tFH2rvlLIANmPTB8OZKkcfIslJKAlYfgPvL2102oEo2TpzWQpIYZ8pLUMENekhpmyEtSwwx5SWqYe9doIA92OvVc+vSOFZbwxLInI7fkJalhhrwkNcyQl6SGOSavU87KY89SO9ySl6SGGfKS1DBDXpIa5pj8KazVfeEdc18bnqXy5OSWvCQ1zJCXpIat2XBNkiuBPwJOA/6kqj61Vq91qvLPZ60nfh7Xp1TV+J80OQ34DvB24DDwDWBbVf3NoOVnZ2drbm5u7HWc7FodM1+N5cbVH9m4feh1tXZWel9G4RfEYEl2V9Xscsus1Zb8JcCBqnqqK+QLwFZgYMhL0lo6lf/KWKst+X8HXFlV/6F7/F7gTVV1Q98y24GFr/7XA98e4SXPAv5uhPXXkrUNx9qGY23DOVlr+/mqmllu5bXaks+Atpd8m1TVDmAsf1cnmVvpT5ZpsbbhWNtwrG04Lde2VnvXHAbO73t8HvDMGr2WJGkJaxXy3wA2J7kgySuAa4H71+i1JElLWJPhmqo6nuQG4Cv0dqHcWVWPr8Vrddbz7hTWNhxrG461DafZ2tbkh1dJ0vrgEa+S1DBDXpIadlKFfJJ3J3k8yU+SzC6ad1OSA0m+neRf97Vf2bUdSHLjhOq8J8me7nYoyZ6ufVOSH/fN+8wk6llU2yeSfL+vhqv75g3swwnW9gdJnkiyN8l9SU7v2qfeb10dE/8sLVPL+UkeSrK/+z/xoa59yfd3wvUdSvJYV8Nc13Zmkl1Jnuzuz5hCXa/v65s9SZ5L8uFp9VuSnUmOJdnX1zawn9Lzx93nb2+Si1f1IlV10tyAf0bvwKm/AGb72t8AfAt4JXABcJDeD76nddMXAq/olnnDhGu+BfidbnoTsG/KffgJ4D8NaB/YhxOu7ZeBl3XTvwf83jrqt6l/lhbVswG4uJt+Nb3TiLxhqfd3CvUdAs5a1Pb7wI3d9I0L7++U39P/A/z8tPoNeAtwcf/ne6l+Aq4G/ozecUiXAl9fzWucVFvyVbW/qgYdGbsV+EJVvVB
VfwscoHdqhX84vUJV/V9g4fQKE5EkwHuAuyf1miNYqg8npqq+WlXHu4eP0Du+Yr2Y6mdpsao6UlWPdtM/AvYD506rnlXaCtzZTd8JvGuKtQBcARysqu9Oq4Cqehj4waLmpfppK/C56nkEOD3JhpVe46QK+WWcC3yv7/Hhrm2p9km5DDhaVU/2tV2Q5JtJvpbksgnW0u+G7s+9nX1/Mk+7rxb7DXpbLQum3W/rrX/+QZJNwEXA17umQe/vpBXw1SS7u1OYAJxTVUeg9yUFnD2l2hZcy0s3wNZDv8HS/TTUZ3DdhXySP0+yb8Btua2mpU6jsOLpFda4zm289EN0BNhYVRcBvwV8PsnPjaOeE6jtNuC1wJaunlsWVhvwVGPfv3Y1/ZbkZuA4cFfXNJF+W6n0AW1T3/84yauAe4EPV9VzLP3+Ttqbq+pi4Crgg0neMqU6BkrvIM13An/aNa2XflvOUJ/BdXf5v6p62xCrLXcahTU5vcJKdSZ5GfArwC/1rfMC8EI3vTvJQeB1wFjPs7zaPkzyWeB/dQ8nciqKVfTbdcA7gCuqG4icVL+tYN2dqiPJy+kF/F1V9SWAqjraN7///Z2oqnqmuz+W5D56w11Hk2yoqiPdMMOxadTWuQp4dKG/1ku/dZbqp6E+g+tuS35I9wPXJnllkguAzcBfM93TK7wNeKKqDi80JJlJ71z7JLmwq/OpCdWzUEP/GN41wMKv+kv14SRruxL4KPDOqnq+r33q/cY6O1VH93vP7cD+qvp0X/tS7+8ka/vZJK9emKb3g/o+ev11XbfYdcCXJ11bn5f8lb0e+q3PUv10P/Br3V42lwI/XBjWWdY0f90e4pfoa+h9m70AHAW+0jfvZnp7P3wbuKqv/Wp6ex4cBG6eYK13AB9Y1PZvgcfp7ZnxKPBvptCH/x14DNjbfWg2rNSHE6ztAL0xxz3d7TPrpd+m+VlaopZ/Se9P9b19/XX1cu/vBGu7sHuvvtW9bzd37a8BHgSe7O7PnFLf/QzwLPBP+9qm0m/0vmiOAP+vy7brl+onesM1/7X7/D1G3x6Gy908rYEkNayV4RpJ0gCGvCQ1zJCXpIYZ8pLUMENekhpmyEtSwwx5SWrY/wcrJbhllHmqjgAAAABJRU5ErkJggg==\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "rng = np.random.RandomState(123)\n", "x1 = rng.normal(0, 20, 1000) \n", "x2 = rng.normal(15, 10, 1000)\n", "\n", "# fixed bin size\n", "bins = np.arange(-100, 100, 5) # fixed bin size\n", "\n", "plt.hist(x1, bins=bins, alpha=0.5)\n", "plt.hist(x2, bins=bins, alpha=0.5)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Subplots" ] }, { "cell_type": "code", "execution_count": 89, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "\n", "import matplotlib.pyplot as plt\n", "\n", "x = range(11)\n", "y = range(11)\n", "\n", "fig, ax = plt.subplots(nrows=2, ncols=3,\n", " sharex=True, sharey=True)\n", "\n", "for row in ax:\n", " for col in row:\n", " col.plot(x, y)\n", " \n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Colors and Markers" ] }, { "cell_type": "code", "execution_count": 90, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "x = np.linspace(0, 10, 100)\n", "plt.plot(x, np.sin(x),\n", " color='orange',\n", " marker='^',\n", " linestyle='')\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Saving Plots" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The file format for saving plots can be conveniently specified via the file suffix (.eps, .svg, .jpg, .png, .pdf, .tiff, etc.). Personally, I recommend using a vector graphics format (.eps, .svg, .pdf) whenever you can, which usually results in smaller file sizes than bitmap graphics (.jpg, .png, .bmp, tiff) and does not have a limited resolution." ] }, { "cell_type": "code", "execution_count": 91, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "x = np.linspace(0, 10, 100)\n", "plt.plot(x, np.sin(x))\n", "\n", "plt.savefig('myplot.png', dpi=300)\n", "plt.savefig('myplot.pdf')\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Resources" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are no reading assignments for this lecture. However, you should run this notebook on your computer from top to bottom at your own pace and make sure that you are comfortable with the different commands, most of which you may need for the problem sets." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "NumPy and Matplotlib reference material:\n", "\n", "- [The official NumPy documentation](https://docs.scipy.org/doc/numpy/reference/index.html)\n", "- [The official Matplotlib Gallery](https://matplotlib.org/gallery/index.html)\n", "- [The official Matplotlib Tutorials](https://matplotlib.org/tutorials/index.html)\n", "\n", "\n", "Optional references books for using NumPy and SciPy. You are not expected to read this for this class, but I am listing it in case you are interested in learning NumPy for your projects.\n", "\n", "- Rougier, N.P., 2016. [From Python to NumPy](http://www.labri.fr/perso/nrougier/from-python-to-numpy/).\n", "- Oliphant, T.E., 2015. [A Guide to NumPy: 2nd Edition](https://www.amazon.com/Guide-NumPy-Travis-Oliphant-PhD/dp/151730007X). USA: Travis Oliphant, independent publishing.\n", "- Varoquaux, G., Gouillart, E., Vahtras, O., Haenel, V., Rougier, N.P., Gommers, R., Pedregosa, F., Jędrzejewski-Szmek, Z., Virtanen, P., Combelles, C. and Pinte, D., 2015. 
[SciPy Lecture Notes](http://www.scipy-lectures.org/intro/numpy/index.html).\n" ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" }, "toc": { "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 4 }