{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Hands-on: Numerical Python -- Introduction to NumPy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Objectives:**\n", "\n", "Upon completion of this lesson, you should be able to:\n", "\n", "1. Create N-dimensional arrays\n", "\n", "2. Index values in those arrays\n", "\n", "3. Perform operations on those arrays\n", "\n", "4. Know what \"broadcasting\" is and how it works" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## What is NumPy?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> **Python** has strings, integers, floating point types for numerics, and containers (*which ones?*) for storing (heterogeneous) collections of those (convenience) --- but more is needed (efficiency)\n", "\n", "> **NumPy** is a Python package for multidimensional arrays, designed for scientific computation (efficiency and convenience)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## For example" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An array containing...\n", "\n", "* discretized time of an experiment/simulation\n", "\n", "* signal recorded by a measurement device\n", "\n", "* pixels of an image\n", "\n", "* voxels of a volume\n", "\n", "* ..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The Basics of NumPy:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* NumPy's main object is the homogeneous multidimensional **array**\n", "* An array is a *table* of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers\n", "* In NumPy, dimensions are called `axes`. 
The number of axes is the *rank* of the array\n", "\n", "**For example**\n", "\n", "\n", "* Coordinates of a point in 3D space: ```[1, 2, 1]```\n", "\n", "* This is an array of rank 1, because it has one axis\n", "* That axis has a length of 3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Little helper" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "np.lookfor(\"holy\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The ndarray" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "NumPy's array class is called `ndarray`. (It is also known by the alias `array`)\n", "\n", "* **ndarray.ndim**: the number of axes (dimensions) of the array. In\n", "the Python world, the number of dimensions is referred to as `rank`\n", "\n", "* **ndarray.shape**: the dimensions of the array. This is a tuple of\n", "integers indicating the size of the array in each dimension\n", "\n", "* **ndarray.size**: the total number of elements of the array. 
This is\n", "equal to the product of the elements of `shape`\n", "\n", "* **ndarray.dtype**: an object describing the type of the elements in the\n", "array\n", "\n", "* **ndarray.itemsize**: the size in bytes of each element of the array" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating arrays" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### **1-D**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = np.array([0, 1, 2, 3])\n", "print(repr(a), a)\n", "print(a.ndim)\n", "print(a.shape)\n", "print(len(a))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### **2-D, 3-D, ...**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b = np.array([[0, 1, 2], [3, 4, 5]]) # 2 x 3 array\n", "print(repr(b))\n", "print(b)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(b.ndim)\n", "print(b.shape)\n", "print(len(b)) # returns the size of the first dimension" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "c = np.array([[[1], [2]], [[3], [4]]])\n", "print(c)\n", "print(c.shape)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## In practice" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "We rarely enter items one by one...\n", "\n", "* Evenly spaced:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "np.arange(10) # 0 .. 
n-1 (!)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.arange(1, 9, 2) # start, end (exclusive), step" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "or by number of points using `np.linspace()`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.linspace(0, 1, 6) # start, end (inclusive by default), num-points" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.linspace(0, 1, 5, endpoint=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Common arrays" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.ones((3, 3)) # reminder: (3, 3) is a tuple" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.zeros((2, 2))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.eye(3)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.diag(np.array([1, 2, 3, 4, 5]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Random numbers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[numpy.random](http://docs.scipy.org/doc/numpy/reference/routines.random.html) provides many routines for generating pseudo-random numbers using a PRNG (pseudo-random number generator)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.random.rand(4) # uniform in [0, 1)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.random.rand(3, 3)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.random.randn(4) # Gaussian (standard normal)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Set the seed" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you would like 
\"reproducible\" random numbers, you can seed the PRNG globally:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.random.seed(1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "or create an instance of the PRNG to be used in your particular code (preferred, so there are no side effects):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "prng = np.random.RandomState(1)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "prng.rand(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Basic data types" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "You probably noticed the `1` and `1.` above. These are different\n", "data types:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = np.array([1, 2, 3])\n", "a.dtype" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b = np.array([1., 2., 3.])\n", "b.dtype" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- - -\n", "**Warning**\n", "Much of the time you don't need to care, but remember the same gotcha as with the regular Python 2 \"int\": under Python 2, dividing an integer array with `/` performs integer division (under Python 3, `/` is always true division and `//` is floor division):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a / 3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "so under Python 2 you would need to cast one of the arguments to float:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a/3." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a.astype(float)/3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Choose your own dtype adventure" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "You can control your data type destiny:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "c = np.array([1, 2, 3], dtype=float)\n", "c.dtype" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "The **default** data type is floating point:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = np.ones((3, 3))\n", "a.dtype" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The many choices..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "There are also other types:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.array([1+2j, 3+4j, 5+6*1j]).dtype" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.array([True, False, False, True]).dtype" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.array(['Bonjour', 'Hello', 'Hallo', 'Terve', 'Hej']).dtype" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Main lecturer today: Jake VanderPlas" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.display import YouTubeVideo\n", "YouTubeVideo('EEUXKG97YRw')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": 
"python", "pygments_lexer": "ipython3", "version": "3.7.1" } }, "nbformat": 4, "nbformat_minor": 1 }