{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# SciPy and NumPy Cheat Sheet\n", "The core scientific libraries in Python are NumPy, SciPy, Matplotlib and Pandas. NumPy provides homogeneous multidimensional arrays, and SciPy provides functions and operators for mathematical computation.\n", "Matplotlib can be used to plot functions, and Pandas is used for manipulating heterogeneous data and for reading and writing files." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Python built-in data structures\n", "Python offers the following built-in data structures: tuple, list, dictionary and set. We will discuss the array data structure in a following section about the NumPy numerical package. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Tuple\n", "A tuple is an immutable sequence of items that can be of different types, e.g. int, float, bool and str. A tuple provides only a few methods, such as count() and index()." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Number of occurrences: 2\n", "Index of 1st occurrence: 2\n", "\n" ] } ], "source": [ "t = 1, 2.6, 'New York', 'New York'\n", "c = t.count('New York') \n", "i = t.index('New York')\n", "print(\"Number of occurrences: {0:d}\\nIndex of 1st occurrence: {1:d}\\n\".format(c, i))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### List\n", "A Python list is a mutable sequence of items that can be of different types. In other words, a list can be expanded and reduced, and its items can be replaced. The methods used to expand a list are append(), extend() and insert(). 
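As a quick sketch of how these three differ (the variable names below are made up for illustration): append() adds its argument as a single item, extend() adds every item of an iterable, and insert() places an item before a given index.

```python
# Hypothetical example list, not part of the original notebook
numbers = [1, 2, 5]

appended = list(numbers)
appended.append([8, 3])          # append() adds its argument as ONE item

extended = list(numbers)
extended.extend([8, 3])          # extend() adds each item of the iterable

inserted = list(numbers)
inserted.insert(1, 'New York')   # insert() places the item before index 1

print(appended)   # [1, 2, 5, [8, 3]]
print(extended)   # [1, 2, 5, 8, 3]
print(inserted)   # [1, 'New York', 2, 5]
```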
" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1, 2, 5]\n", "1\n" ] } ], "source": [ "alist = [1,2,5]\n", "print(alist) # prints the list\n", "print(alist[0]) # prints the 1st element of the list" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can add one item at a time to the end of the list using append()" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1, 2, 5, 7]\n" ] } ], "source": [ "alist.append(7)\n", "print(alist)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can use extend() to add more items" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1, 2, 5, 7, 8, 3, 2]\n" ] } ], "source": [ "alist.extend([8, 3, 2])\n", "print(alist)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we can add an element at a certain index in the list" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1, 'New York', 2, 5, 7, 8, 3, 2]\n" ] } ], "source": [ "alist.insert(1, 'New York')\n", "print(alist)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can remove the 1st occurrence of an item that contains a particular value from the list " ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1, 2, 5, 7, 8, 3, 2]\n" ] } ], "source": [ "alist.remove('New York')\n", "print(alist)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also return an item with a certain index value and remove it from the list" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "7" ] }, "execution_count": 8, "metadata": {}, 
"output_type": "execute_result" } ], "source": [ "v = alist.pop(3)\n", "v" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1, 2, 5, 8, 3, 2]\n" ] } ], "source": [ "print(alist)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can slice a tuple or a list to select subsets of those collections" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[5, 8]" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "alist[2:4]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can look for an item in a list of strings" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "cities = ['Rome', 'Paris', 'London', 'Berlin', 'Madrid', 'Athens']" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "def find_item(list, item):\n", " if (item in list):\n", " return 1\n", " else:\n", " return 0" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "find_item(cities, 'Madrid')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Multi-dimensional list" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[0.1, 2], [3, 4]]\n" ] }, { "data": { "text/plain": [ "0.1" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "alist2d = [[0.1, 2], [3, 4]]\n", "print(alist2d)\n", "alist2d[0][0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### List comprehension\n", "List comprehension can be used to apply a function to a list of objects, as with a for loop, like a map() function or a lambda function." 
] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "families = [\"Pippo/Pluto\", \"Qui/Quo/Qua\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's define a function that splits a string into tokens and returns the last one" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "def leaf(family):\n", "    tokens = family.split(\"/\")  # split the string only once\n", "    return tokens[-1]  # the last token" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we apply the function to a list of strings using a list comprehension" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['Pluto', 'Qua']" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[leaf(family) for family in families]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can achieve the same result by writing the expression directly inside the comprehension, without a helper function" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['Pluto', 'Qua']" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[family.split(\"/\")[-1] for family in families]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can easily create nested list comprehensions that work similarly to nested for loops" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[[(1, 1), (1, 2), (1, 3), (1, 4)],\n", " [(2, 1), (2, 2), (2, 3), (2, 4)],\n", " [(3, 1), (3, 2), (3, 3), (3, 4)],\n", " [(4, 1), (4, 2), (4, 3), (4, 4)]]" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[[(i,j) for j in range(1,5)] for i in range(1,5)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Functional programming\n", "We 
can get the same result by applying the leaf() function to each object in the families list using the map() function." ] }, { "cell_type": "code", "execution_count": 70, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['Pluto', 'Qua']" ] }, "execution_count": 70, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(map(leaf, families))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Dictionary\n", "A dictionary is a symbol table: a collection of objects in which each object is associated with a key. The collection can be expanded and reduced." ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [], "source": [ "cartoon_characters = {\"Pippo\": 5,\n", " \"Pluto\": 4,\n", " \"Topolino\": 2}" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Name: Pippo, score: 5\n", "Name: Pluto, score: 4\n", "Name: Topolino, score: 2\n" ] } ], "source": [ "for name, score in cartoon_characters.items():\n", "    print(\"Name: \" + name + \", score: \" + str(score))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can print the objects' keys" ] }, { "cell_type": "code", "execution_count": 81, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['Pippo', 'Pluto', 'Topolino']" ] }, "execution_count": 81, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(cartoon_characters.keys())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "and the values" ] }, { "cell_type": "code", "execution_count": 83, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[5, 4, 2]" ] }, "execution_count": 83, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(cartoon_characters.values())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## NumPy \n", 
"A list can be used as an array, but NumPy provides a dedicated array type: a collection of mutable elements that all have the same type, e.g. characters, integers, floats or booleans. Let's import the NumPy library" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A NumPy array can be created from a list" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1 2 4]\n" ] }, { "data": { "text/plain": [ "numpy.ndarray" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "l = [1, 2, 4]\n", "array = np.array(l)\n", "print(array) # prints the full array\n", "type(array)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "or from a range of numbers with start, end (excluded) and step" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0 1 2 3 4 5 6 7 8]\n" ] } ], "source": [ "arange = np.arange(0, 9, 1) # an array from 0 up to, but not including, 9 with increment 1\n", "print(arange)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A NumPy array is an object, an instance of the ndarray class, that contains data of the same type and offers many methods to transform the data. 
We can write the raw binary data into a file" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "path = 'physics/data/myarray.txt'\n", "with open(path, 'wb') as f:\n", " arange.tofile(f)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then we can read the file and put the data into a new array. The file stores no type or shape information, so the dtype must match the one that was written" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0 1 2 3 4 5 6 7 8]\n" ] } ], "source": [ "with open(path, 'rb') as f:\n", " c = np.fromfile(f, dtype=arange.dtype)\n", "print(c)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can compute the sum, the mean, and the standard deviation of the elements in the array" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Sum: 36.0\n", "Mean: 4.0\n", "Standard deviation: 2.582\n" ] } ], "source": [ "s = c.sum()\n", "m = c.mean()\n", "e = c.std()\n", "print(\"Sum: {0:.1f}\\nMean: {1:.1f}\\nStandard deviation: {2:.3f}\".format(s, m, e))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Bi-dimensional array\n" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[1 2 3]\n", " [4 5 6]\n", " [7 8 9]]\n" ] } ], "source": [ "array2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])\n", "print(array2d) # prints the 2D array" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's print the element in the 3rd row and 2nd column" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "8\n" ] } ], "source": [ "print(array2d[2, 1]) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A two-dimensional array can represent a matrix, e.g. a linear transformation, and in general an array can have any number of dimensions (axes). 
" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Array dimension: 2\n" ] } ], "source": [ "d = array2d.ndim # number of dimensions (axes) of the array \n", "print(\"Array dimension: {0:d}\".format(d))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The shape of an array is a tuple that contains the length of the array along each dimension." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(3, 3)" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "array2d.shape # shape of the multidimensional array" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can compute the sum of the values of the array along one dimension or axis, for example down each column (axis=0) " ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([12, 15, 18])" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "array2d.sum(axis=0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "or along each row (axis=1)" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 6, 15, 24])" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "array2d.sum(axis=1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We may want to initialize an array with zeros" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[0. 0. 0.]\n", " [0. 0. 0.]\n", " [0. 0. 0.]]\n" ] } ], "source": [ "zeros2d = np.zeros((3,3))\n", "print(zeros2d)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "or with ones" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[1. 1. 
1.]\n", " [1. 1. 1.]\n", " [1. 1. 1.]]\n" ] } ], "source": [ "ones2d = np.ones((3, 3))\n", "print(ones2d)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "or we might need an identity matrix, a square matrix whose diagonal values are ones and all the rest zeros" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1., 0., 0.],\n", " [0., 1., 0.],\n", " [0., 0., 1.]])" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.eye(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Sometimes we need a sample of equally spaced real numbers from an interval. The linspace() function can be used to create such samples. It takes as input the two endpoints of the interval and the number of equally spaced sample points within that interval; by default both endpoints are included. It is similar to arange(), but instead of setting the step we set the number of data points in the sample. " ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ])" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = np.linspace(0., 1., 11)\n", "x" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Reshaping\n", "The data points to be represented in a matrix, or in another multi-dimensional array, may come from a sequence. Once they are in a one-dimensional array we have to change its shape, adding the missing dimensions so that each data point ends up in the right place. After a reshaping operation the number of elements stays the same but the shape of the array is different."
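, "

A minimal sketch of this rule, assuming a made-up array of 12 elements: every target shape must account for exactly the same number of elements, and a length of -1 asks NumPy to infer the missing dimension.

```python
import numpy as np

flat = np.arange(1, 13)        # 12 elements in a one-dimensional array
grid = flat.reshape(3, 4)      # 3 rows, 4 columns
auto = flat.reshape(2, -1)     # -1 lets NumPy infer the missing length (6)

print(flat.shape, grid.shape, auto.shape)    # (12,) (3, 4) (2, 6)
print(flat.size == grid.size == auto.size)   # True

# A shape that does not match the number of elements raises an error
try:
    flat.reshape(5, 5)
except ValueError as err:
    print('cannot reshape:', err)
```
"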
] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[1 2 3]\n", " [4 5 6]\n", " [7 8 9]]\n" ] } ], "source": [ "l = 1, 2, 3, 4, 5, 6, 7, 8, 9\n", "m1 = np.array(l)\n", "m2 = m1.reshape(3,3) # this is the same array but represented as a 3x3 matrix\n", "print(m2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also flatten the data back to its original shape" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([1, 2, 3, 4, 5, 6, 7, 8, 9])" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m2.flatten()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As said at the beginning of this section, an array contains mutable elements, that is, each element can change its value (but not its type)" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[1 2 3]\n", " [4 0 6]\n", " [7 8 9]]\n" ] } ], "source": [ "m2[1, 1] = 0\n", "print(m2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Resizing\n", "We may need to create a new array, for example a column vector taken from a matrix, by resizing an array. After the resizing the shape is different but the number of dimensions stays the same, so to obtain a one-dimensional vector we have to create a new array and copy into it the values of the resized array (or of the original matrix)."
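, "

As an alternative sketch (not the approach used in the next cell, and with made-up variable names), a column of a matrix can be selected by slicing and turned into an explicit column vector with reshape(). The last line shows that np.resize() instead takes the elements in row-major order.

```python
import numpy as np

m = np.arange(1, 10).reshape(3, 3)   # the 3x3 matrix [[1 2 3] [4 5 6] [7 8 9]]

col = m[:, 0]                  # first column as a 1-D array, shape (3,)
col_vec = col.reshape(-1, 1)   # same values as a 3x1 column vector

print(col)             # [1 4 7]
print(col_vec.shape)   # (3, 1)

# np.resize, by contrast, takes elements in row-major order
print(np.resize(m, (3, 1)).ravel())   # [1 2 3]
```
"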
] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([1., 2., 3.])" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "v_shape = 3\n", "v1 = np.resize(m2, (v_shape, 1)) # 3x1 array with the first 3 elements of m2\n", "v2 = np.zeros(3)\n", "for i in range(0, 3):\n", "    v2[i] = v1[i, 0] # copy each scalar into the 1-D array\n", "v2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Boolean arrays\n", "We can apply comparison and logical operators to arrays as well as to numbers. Here we apply a comparison to a random matrix, a matrix whose elements are a sample of pseudo-random numbers from a uniform distribution in the interval [0, 1)" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[False, True, True],\n", " [ True, True, True],\n", " [False, False, False]])" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "R = np.random.rand(3, 3)\n", "A = R > 0.5\n", "A" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Instead of representing the logical values with True or False we can represent them with the integers 1 and 0 respectively" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[0, 1, 1],\n", " [1, 1, 1],\n", " [0, 0, 0]])" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A.astype(int)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can filter the elements of an array according to the logical rule defined before" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0.66377115, 0.70955479, 0.96854003, 0.76779624, 0.77286988])" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "R[A]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can apply different rules on the array 
elements depending on the result of the logical operator" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ True, False, False],\n", " [False, False, False],\n", " [ True, True, True]])" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.where(R > 0.5, R - 0.5, R + 0.5) > 0.5" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: total: 1.11 s\n", "Wall time: 1.13 s\n" ] } ], "source": [ "I = 5000\n", "%time mat = np.random.standard_normal((I, I))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Concatenation\n", "Let's say we have four 2x2 matrices A, B, C and D and we want to concatenate them in one 4x4 matrix M\n", "\n", "$$ M = \\begin{bmatrix} A & B \\\\ C & D \\end{bmatrix} $$" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(array([[1, 2],\n", " [3, 4]]),\n", " array([[5, 6],\n", " [7, 8]]),\n", " array([[0., 0.],\n", " [0., 0.]]),\n", " array([[1., 1.],\n", " [1., 1.]]))" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A = np.array([[1, 2], [3, 4]])\n", "B = np.array([[5, 6], [7, 8]])\n", "C = np.zeros([2, 2])\n", "D = np.ones([2, 2])\n", "A, B, C, D" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1., 2., 5., 6.],\n", " [3., 4., 7., 8.],\n", " [0., 0., 1., 1.],\n", " [0., 0., 1., 1.]])" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "M_up = np.concatenate([A, B], axis=1)\n", "M_down = np.concatenate([C, D], axis=1)\n", "M = np.concatenate([M_up, M_down], axis=0)\n", "M" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Vectorized operators\n", "An important feature of Python arrays is that mathematical 
operators on arrays do not need to use loops, an operator is applied to all members of an operand array" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 5., 8., 11., 14., 17., 20., 23., 26., 29.])" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = np.arange(0., 9., 1)\n", "y = 3 * x + 5\n", "y" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "vectorization is even more useful for multi-dimensional arrays " ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[6, 3, 3],\n", " [3, 1, 0],\n", " [8, 5, 0]])" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A = np.random.randint(0, 10, (3, 3))\n", "A" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[9, 1, 9],\n", " [0, 5, 2],\n", " [8, 5, 1]])" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "B = np.random.randint(0, 10, (3, 3))\n", "B" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can compute the sum of two matrices with one single statement without loops. 
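As a small sketch of what this means in practice (the matrices P and Q are made-up examples), the explicit loops and the single vectorized expression below produce the same result:

```python
import numpy as np

P = np.array([[1, 2], [3, 4]])
Q = np.array([[10, 20], [30, 40]])

# Explicit loops: add the matrices one element at a time
looped = np.zeros_like(P)
for i in range(P.shape[0]):
    for j in range(P.shape[1]):
        looped[i, j] = P[i, j] + Q[i, j]

# Vectorized: a single expression does the same work
vectorized = P + Q

print(vectorized)
print(np.array_equal(looped, vectorized))   # True
```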
The operator is applied to each element of the first matrix and the corresponding element of the second matrix" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[15, 4, 12],\n", " [ 3, 6, 2],\n", " [16, 10, 1]])" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A + B" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Broadcasting\n", "Broadcasting allows us to use an operator with operands of different shapes, for example a matrix and a scalar" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[12, 4, 12],\n", " [ 3, 8, 5],\n", " [11, 8, 4]])" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "B + 3" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[18, 2, 18],\n", " [ 0, 10, 4],\n", " [16, 10, 2]])" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "2 * B" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Two arrays A and B can be used as operands of an operator if the one with the smaller shape can be broadcast over the bigger one by repeating it along one index or both."
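, "

A rough sketch of when two shapes are compatible, using made-up arrays: NumPy compares the shapes from the last axis backwards, and each pair of lengths must either be equal or contain a 1.

```python
import numpy as np

M = np.ones((3, 3))
row = np.array([1, 2, 3])            # shape (3,)
col = np.array([[10], [20], [30]])   # shape (3, 1)

print((M + row).shape)     # (3, 3): row is repeated down the rows
print((M + col).shape)     # (3, 3): col is repeated across the columns
print((row + col).shape)   # (3, 3): both operands are stretched

# Lengths 3 and 2 on the last axis are neither equal nor 1
try:
    M + np.array([1, 2])
except ValueError as err:
    print('not broadcastable:', err)
```
"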
] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1, 2, 3],\n", " [4, 5, 6],\n", " [7, 8, 9]])" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])\n", "A" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([1, 2, 3])" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "B = np.array((1, 2, 3))\n", "B" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "B has the same length as the rows of A, so B can be applied to A one row at a time. " ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 1, 4, 9],\n", " [ 4, 10, 18],\n", " [ 7, 16, 27]])" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A * B" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Broadcasting doesn't change the commutative property of the operator: A * B and B * A give the same result."
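, "

A short check of this property, as a sketch with the same A and B as above: swapping the operands of a broadcast product or sum gives the same array, while subtraction, which is not commutative for plain numbers either, does not.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
B = np.array([1, 2, 3])

print(np.array_equal(A * B, B * A))   # True: multiplication commutes
print(np.array_equal(A + B, B + A))   # True: addition commutes
print(np.array_equal(A - B, B - A))   # False: subtraction does not
```
"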
] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 1, 4, 9],\n", " [ 4, 10, 18],\n", " [ 7, 16, 27]])" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "B * A" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Matrix multiplication" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 1, -2],\n", " [ 5, -9]])" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "C = np.array([[1, -2], [5, -9]])\n", "C" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[-9, 2],\n", " [-5, 1]])" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "D = np.array([[-9, 2], [-5, 1]])\n", "D" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1, 0],\n", " [0, 1]])" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "C @ D" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since D is the inverse of C, that is $D = C^{-1}$ then CD = DC = I, where I is the identity matrix." 
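, "

The inverse does not have to be typed in by hand. As a minimal sketch, np.linalg.inv() computes it numerically; the result is a float array, so the comparison with D uses np.allclose() rather than exact equality.

```python
import numpy as np

C = np.array([[1, -2], [5, -9]])
D = np.array([[-9, 2], [-5, 1]])

C_inv = np.linalg.inv(C)                   # inverse computed numerically (floats)
print(np.allclose(C_inv, D))               # True, up to rounding error
print(np.allclose(C @ C_inv, np.eye(2)))   # True
```
"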
] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1, 0],\n", " [0, 1]])" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "D @ C" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can apply an operator C to a vector x in two ways: directly, or through the transposes, since $(\\hat{C}\\vec{x})^T = \\vec{x}^T\\hat{C}^T$. For a one-dimensional NumPy array the transpose is the array itself, so both expressions return the same result." ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([-1, -4])" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = np.array([1, 1])\n", "C @ x" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([-1, -4])" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x.T @ C.T" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Matrix multiplication for collaborative filtering\n", "A use case of matrix multiplication is the computation of item similarities. Given an interaction matrix A in which the rows represent users and the columns represent\n", "items, a 1 represents an interaction between a user and an item. A similarity measure between items can be computed from the interaction matrix as $S = A^TA$, as shown below." ] }, { "cell_type": "code", "execution_count": 122, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[3, 2, 1, 2],\n", " [2, 2, 1, 1],\n", " [1, 1, 1, 0],\n", " [2, 1, 0, 2]])" ] }, "execution_count": 122, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A = np.array([[1,1,0,1], [1,1,1,0], [1,0,0,1]]) # 3 users and 4 items\n", "AT = A.transpose()\n", "S = AT.dot(A) # S is symmetric\n", "S" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also consider the ratings of the users on the items, e.g. 
on a scale from 1 to 5 (0 means not rated) " ] }, { "cell_type": "code", "execution_count": 123, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[50, 15, 12, 17],\n", " [15, 9, 0, 6],\n", " [12, 0, 9, 3],\n", " [17, 6, 3, 6]])" ] }, "execution_count": 123, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A = np.array([[5,3,0,2], [4,0,3,1], [3,0,0,1]])\n", "AT = A.transpose()\n", "S = AT.dot(A)\n", "S" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can now recommend items to a user by multiplying the similarity matrix by the user's ratings. The output is a vector of scores, one per item, that indicate how strongly each item is recommended to the user. We have to ignore the items that the user has already rated in order to recommend previously unseen items." ] }, { "cell_type": "code", "execution_count": 124, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[329 114 66 115]\n" ] } ], "source": [ "u1 = A[0] # user 1 ratings\n", "r1 = S.dot(u1)\n", "print(r1)" ] }, { "cell_type": "code", "execution_count": 125, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "66 (array([2], dtype=int64),)\n" ] } ], "source": [ "u1_nr = (u1 == 0) # mask of the items not rated by user 1\n", "item_i = np.where(u1_nr) # indices of the items not rated\n", "print(r1.dot(u1_nr), item_i) # score of the only unrated item and its index" ] }, { "cell_type": "code", "execution_count": 126, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[253 66 78 83]\n" ] } ], "source": [ "u2 = A[1]\n", "r2 = S.dot(u2)\n", "print(r2)" ] }, { "cell_type": "code", "execution_count": 127, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "66 (array([1], dtype=int64),)\n" ] } ], "source": [ "u2_nr = (u2 == 0)\n", "item_i = np.where(u2_nr)\n", "print(r2.dot(u2_nr), item_i)" ] }, { "cell_type": "code", "execution_count": 128, "metadata": {}, "outputs": [ { "data": 
{ "text/plain": [ "array([167, 51, 39, 57])" ] }, "execution_count": 128, "metadata": {}, "output_type": "execute_result" } ], "source": [ "u3 = A[2]\n", "S.dot(u3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Record array\n", "A record array can contain objects of different types in different colums, e.g. all integers in a column and all strings in another column" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[(0, 0., b'') (0, 0., b'')]\n" ] } ], "source": [ "recarray = np.zeros((2,), dtype=[('Integers','i4'),('Float','f4'),('Strings','a10')]) # a record array of two empty records (dtype is optional)\n", "print(recarray)" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[array([1, 2]), array([1.5, 2.5]), array(['Rome', 'Paris'], dtype='\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
 open high low close volume
0 29.01 29.29 28.70 29.00 422606
1 28.81 29.15 28.71 28.93 265548
\n", "" ], "text/plain": [ " open high low close volume\n", "0 29.01 29.29 28.70 29.00 422606\n", "1 28.81 29.15 28.71 28.93 265548" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "path = 'covid19/data/tui-dat.csv'\n", "df = pd.read_csv(path)\n", "df[0:2]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Linear Algebra" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[1.75]\n", " [1.75]\n", " [0.75]]\n", "[[12.]\n", " [-2.]\n", " [10.]]\n" ] } ], "source": [ "# Solve the linear system AX = B\n", "A = np.array([[3, 6, -5], \n", "[1, -3, 2],\n", "[5, -1, 4]])\n", "\n", "B = np.array([[12],\n", "[-2],\n", "[10]])\n", "\n", "Ainv = np.linalg.inv(A) # inverse of A\n", "\n", "X = Ainv.dot(B) # X = A^(-1)B\n", "print(X)\n", "\n", "print(A.dot(X)) # returns B" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-2.0\n", "[ 1. 2. 
-1.]\n" ] } ], "source": [ "# Determinant of a squared matrix\n", "# 1 2 3\n", "# S = 0 2 1\n", "# 0 0 -1\n", "\n", "S = np.array([[1,2,3],\n", " [0,2,1],\n", " [0,0,-1]])\n", "det = np.linalg.det(S)\n", "print(det)\n", "l = np.linalg.eigvals(S) # eigenvalues\n", "print(l)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Histograms" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "scrolled": true }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAT8AAAEvCAYAAAAzcMYwAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8qNh9FAAAACXBIWXMAAAsTAAALEwEAmpwYAAAR0UlEQVR4nO3df6xfdX3H8efLytBMFyFcWG3rSlxNBDdLctO58MeYOOnAWP2DpSwjTUZS/ygJJJpZ9A/xj2YsU9iSiUtVYrOhrIkaGsHNSjTGRMFbBkgpzEY6KO3oVWeUf1ha3vvjHuYXdnvvt98f/d57P89HcnPP9/M953zfR7mvfj7nc875pqqQpNa8ZtIFSNIkGH6SmmT4SWqS4SepSYafpCYZfpKa9NpJFwBwwQUX1Pr16yddhqQV5sCBAz+tqqn53lsS4bd+/XpmZmYmXYakFSbJf57uPYe9kppk+ElqkuEnqUmGn6QmGX6SmmT4SWqS4SepSYafpCYZfpKaZPhJapLhJ6lJS+LeXrVr/c77/m/5yG3XTLAStcaen6QmGX5aMtbvvO8VPUFpnAw/SU0y/CQ1yfCT1CTDT1KTDD9JTTL8JDXJ8JPUpEXDL8nrkjyU5NEkB5N8smu/NclzSR7pfq7u2eaWJIeTPJXkqnEegCQNop/b214E3l1VLyQ5B/hekm90791RVZ/qXTnJJcBW4FLgzcC3krytqk6NsnBJGsaiPb+a80L38pzupxbYZAtwT1W9WFVPA4eBTUNXKkkj1Nc5vySrkjwCnAD2V9WD3Vs3JnksyV1Jzuva1gDP9mx+tGuTpCWjr/CrqlNVtRFYC2xK8g7gs8BbgY3AceDT3eqZbxevbkiyPclMkpnZ2dkBSpekwZ3RbG9V/QL4DrC5qp7vQvEl4HP8emh7FFjXs9la4Ng8+9pdVdNVNT01NTVI7ZI0sH5me6eSvKlbfj3wHuDJJKt7Vvsg8Hi3vA/YmuTcJBcDG4CHRlq1JA2pn9ne1cCeJKuYC8u9VfX1JP+UZCNzQ9ojwIcAqupgkr3AE8BJYIczvXo1H12lSVs0/KrqMeCyedqvX2CbXcCu4UrTcuYTmrXUeYeHpCYZfpKaZPhJapLhJ6lJhp+kJhl+kppk+ElqkuEnqUmGn6QmGX6SmmT4SWpSPw82kEbChxloKTH8NHaGnpYih72SmmT4SWqS4SepSZ7z00B8WKmWO8NPS47BqrPBYa+kJhl+kprksFcj4/V8Wk7s+Ulqkj0/LVtOjGgY9vwkNcnwk9Qkw09SkxYNvySvS/JQkkeTHEzyya79/CT7k/y4+31ezza3JDmc5KkkV43zACRpEP30/F4E3l1V7wQ2ApuTvAvYCTxQVRuAB7rXJLkE2ApcCmwG7kyyagy1S9LAFp3traoCXuhentP9FLAFuKJr3wN8B/ho135PVb0IPJ3kMLAJ+P4oC9fS4fV9Wo76utSl67kdAH4X+ExVPZjkoqo6DlBVx5Nc2K2+BvhBz+ZHuzZpYF7WolHra8Kjqk5V1
UZgLbApyTsWWD3z7eL/rZRsTzKTZGZ2dravYiVpVM5otreqfsHc8HYz8HyS1QDd7xPdakeBdT2brQWOzbOv3VU1XVXTU1NTZ165JA2hn9neqSRv6pZfD7wHeBLYB2zrVtsG3Nst7wO2Jjk3ycXABuChEdctSUPp55zfamBPd97vNcDeqvp6ku8De5PcADwDXAtQVQeT7AWeAE4CO6rq1HjK10rnZIrGpZ/Z3seAy+Zp/xlw5Wm22QXsGro6SRoT7/CQ1CSf6qIz4jBUK4U9P0lNMvwkNcnwk9Qkw0+ntX7nfZ7j04pl+ElqkuEnqUmGn6QmGX6SmmT4SWqS4SepSd7epmXHy280Cvb8JDXJ8JPUJMNPUpMMP0lNMvwkNcnZXi3K2VWtRPb8JDXJ8JPUJMNPUpMMP60IPnhVZ8oJD60ovQF45LZrJliJljrDTyuWQaiFOOyV1KRFwy/JuiTfTnIoycEkN3XttyZ5Lskj3c/VPdvckuRwkqeSXDXOA5CkQfQz7D0JfLiqHk7yRuBAkv3de3dU1ad6V05yCbAVuBR4M/CtJG+rqlOjLFyShrFoz6+qjlfVw93yr4BDwJoFNtkC3FNVL1bV08BhYNMoipWkUTmjc35J1gOXAQ92TTcmeSzJXUnO69rWAM/2bHaUhcNSks66vsMvyRuArwA3V9Uvgc8CbwU2AseBT7+86jyb1zz7255kJsnM7OzsmdYtSUPpK/ySnMNc8N1dVV8FqKrnq+pUVb0EfI5fD22PAut6Nl8LHHv1Pqtqd1VNV9X01NTUMMcgSWesn9neAF8ADlXV7T3tq3tW+yDweLe8D9ia5NwkFwMbgIdGV7IkDa+f2d7LgeuBHyV5pGv7GHBdko3MDWmPAB8CqKqDSfYCTzA3U7zDmV5JS82i4VdV32P+83j3L7DNLmDXEHVpQrw/Vq3wDg9JTTL8JDXJ8JPUJJ/qIsBzfWqPPT9JTTL8JDXJ8JPUJMNPUpMMP0lNMvwkNcnwUxP8aku9muEnqUmGn6QmGX6SmmT4SWqS4SepSYafpCYZfpKaZPhJapLhJ6lJhp+kJhl+kppk+ElqkuEnqUmGn6QmGX6SmrRo+CVZl+TbSQ4lOZjkpq79/CT7k/y4+31ezza3JDmc5KkkV43zACRpEP30/E4CH66qtwPvAnYkuQTYCTxQVRuAB7rXdO9tBS4FNgN3Jlk1juIlaVCLhl9VHa+qh7vlXwGHgDXAFmBPt9oe4APd8hbgnqp6saqeBg4Dm0ZctyQN5YzO+SVZD1wGPAhcVFXHYS4ggQu71dYAz/ZsdrRrk6Qlo+/wS/IG4CvAzVX1y4VWnaet5tnf9iQzSWZmZ2f7LUOSRqKv8EtyDnPBd3dVfbVrfj7J6u791cCJrv0osK5n87XAsVfvs6p2V9V0VU1PTU0NWr8kDaSf2d4AXwAOVdXtPW/tA7Z1y9uAe3vatyY5N8nFwAbgodGVLEnDe20f61wOXA/8KMkjXdvHgNuAvUluAJ4BrgWoqoNJ9gJPMDdTvKOqTo26cGkQvV9feeS2ayZYiSZt0fCrqu8x/3k8gCtPs80uYNcQdeks8Hts1TLv8JDUJMNPUpMMP0lNMvwkNcnwk9Qkw09Skww/SU0y/CQ1yfCT1CTDT1KT+rm3V1rRvN+3Tfb8JDXJ8JPUJMNPUpM859cgH2Ul2fOT1CjDT1KTDD9JTTL8JDXJ8JPUJMNPUpMMP0lNMvwkNcnwk9Qk7/BQs7zTpW32/CQ1adHwS3JXkhNJHu9puzXJc0ke6X6u7nnvliSHkzyV5KpxFS5Jw+in5/dFYPM87XdU1cbu536AJJcAW4FLu23uTLJqVMVK0qgses6vqr6bZH2f+9sC3FNVLwJPJzkMbAK+P3iJGgXPb0mvNMw5vxuTPNYNi8/r2tYAz/asc7Rrk6QlZdDw+yzwVmAjcBz4dNeeedat+XaQZHuSmSQzs7OzA5YhSYMZKPyq6vmqOlVVLwGfY25oC3M9vXU9q
64Fjp1mH7urarqqpqempgYpQ5IGNtB1fklWV9Xx7uUHgZdngvcBX0pyO/BmYAPw0NBVamCe65Pmt2j4JfkycAVwQZKjwCeAK5JsZG5IewT4EEBVHUyyF3gCOAnsqKpTY6lckobQz2zvdfM0f2GB9XcBu4YpSpLGzTs8JDXJ8JPUJMNPUpN8qssK5Azv4F7+3+7IbddMuBKNmz0/SU0y/CQ1yfCT1CTDT1KTnPCQ5tE7aeTkx8pkz09Skww/SU0y/CQ1yfCT1CTDT1KTDD9JTTL8JDXJ8JPUJC9yXiF8kot0Zuz5SWqS4SepSYafpCYZfpKaZPhJapLhJ6lJhp+kJhl+0iLW77zP6yhXoEXDL8ldSU4kebyn7fwk+5P8uPt9Xs97tyQ5nOSpJFeNq3DN8Q9TGkw/Pb8vAptf1bYTeKCqNgAPdK9JcgmwFbi02+bOJKtGVq0kjcii4VdV3wV+/qrmLcCebnkP8IGe9nuq6sWqeho4DGwaTamSNDqD3tt7UVUdB6iq40ku7NrXAD/oWe9o1yYte36p0coy6gmPzNNW866YbE8yk2RmdnZ2xGVI0sIGDb/nk6wG6H6f6NqPAut61lsLHJtvB1W1u6qmq2p6ampqwDIkaTCDht8+YFu3vA24t6d9a5Jzk1wMbAAeGq5ESRq9Rc/5JfkycAVwQZKjwCeA24C9SW4AngGuBaiqg0n2Ak8AJ4EdVXVqTLVL0sAWDb+quu40b115mvV3AbuGKUqSxs07PCQ1yfCT1CS/w2MZ8nY2aXj2/CQ1yfCT1CSHvcuIw11pdOz5SWqS4ScNwOcoLn+Gn6QmGX6SmuSEhzQEn/G3fNnzk9Qkw09Skww/SU0y/CQ1yfCT1CTDTxoRL3xeXgw/SU0y/CQ1yYuclziHUdJ42POT1CTDT1KTDD9JTTL8JDXJ8JPUJGd7lyhneaXxGir8khwBfgWcAk5W1XSS84F/AdYDR4A/q6r/Hq5MSRqtUQx7/7iqNlbVdPd6J/BAVW0AHuheS9KSMo5h7xbgim55D/Ad4KNj+JwVx6GudPYM2/Mr4JtJDiTZ3rVdVFXHAbrfFw75GZI0csP2/C6vqmNJLgT2J3my3w27sNwO8Ja3vGXIMiTpzAwVflV1rPt9IsnXgE3A80lWV9XxJKuBE6fZdjewG2B6erqGqWO5c7grnX0DD3uT/GaSN768DLwXeBzYB2zrVtsG3DtskZI0asP0/C4Cvpbk5f18qar+NckPgb1JbgCeAa4dvkxp+ZivJ+/XWi49A4dfVf0EeOc87T8DrhymKEkaN+/wkM4Cv9x86fHeXklNMvwkNcnwk9Qkw09Skww/SU0y/M4yv9haWhq81GVCDEBpsuz5SWqS4SepSYafpCYZfpKa5ITHGL08qeG9nOrlfb5Lg+F3FjizKy09DnulCfK6z8mx5yctUQ6Px8vwk5YAg+7sc9grqUmGn6QmOewdgkMVjYMTIGeH4Tdi/ocrLQ+G34gYetLyYvgtYL5hrSEnrQyGn7TMee55MIaftAzMN+Iw6IbjpS6SmjS2nl+SzcDfA6uAz1fVbeP6rFHzvJ5WIofHrzSW8EuyCvgM8CfAUeCHSfZV1RPj+DypRQv9I23QLS5VNfqdJn8I3FpVV3WvbwGoqr+eb/3p6emamZkZeR1nwt6eWrWSwzHJgaqanu+9cQ171wDP9rw+CvzBKD+g33/ZDDVpYQs9dHcp9CDHVcO4wi/ztL2ii5lkO7C9e/lCkqcW2N8FwE9P+2F/c8b1LWULHusK0spxwjI51sX+jvr8OxvrsQ7wt/47p3tjXOF3FFjX83otcKx3haraDezuZ2dJZk7XdV1pWjnWVo4TPNalalyXuvwQ2JDk4iS/AWwF9o3psyTpjI2l51dVJ5PcCPwbc5e63FVVB8fxWZI0iLFd51dV9wP3j2h3fQ2PV4hWjrWV4wSPdUkay6UukrTUeXubpCYtq/BL8pEkleSCSdcyLkn+NsmTSR5L8rUkb5p0T
aOWZHOSp5IcTrJz0vWMS5J1Sb6d5FCSg0lumnRN45RkVZJ/T/L1SdfSj2UTfknWMXe73DOTrmXM9gPvqKrfB/4DuGXC9YxUz62PfwpcAlyX5JLJVjU2J4EPV9XbgXcBO1bwsQLcBByadBH9WjbhB9wB/BWvulh6pamqb1bVye7lD5i7RnIl2QQcrqqfVNX/APcAWyZc01hU1fGqerhb/hVzwbBmslWNR5K1wDXA5yddS7+WRfgleT/wXFU9OulazrK/BL4x6SJGbL5bH1dkIPRKsh64DHhwwqWMy98x1zl5acJ19G3JPMw0ybeA357nrY8DHwPee3YrGp+FjrWq7u3W+Thzw6a7z2ZtZ8Gitz6uNEneAHwFuLmqfjnpekYtyfuAE1V1IMkVEy6nb0sm/KrqPfO1J/k94GLg0SQwNwx8OMmmqvqvs1jiyJzuWF+WZBvwPuDKWnnXIi166+NKkuQc5oLv7qr66qTrGZPLgfcnuRp4HfBbSf65qv5iwnUtaNld55fkCDBdVUv+RvFBdA+BvR34o6qanXQ9o5bktcxN5FwJPMfcrZB/vhLvAMrcv9Z7gJ9X1c0TLues6Hp+H6mq9024lEUti3N+jfkH4I3A/iSPJPnHSRc0St1kzsu3Ph4C9q7E4OtcDlwPvLv7//KRrnekJWDZ9fwkaRTs+UlqkuEnqUmGn6QmGX6SmmT4SWqS4SepSYafpCYZfpKa9L8Yl8gnwPCxEwAAAABJRU5ErkJggg==\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "import matplotlib.pyplot as plt\n", "mu, sigma = 0.0, 1.0 # mean value and standard deviation\n", "sample_size = 10000\n", "v = np.random.normal(mu,sigma, sample_size) # Normal distribution\n", "fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(5, 5))\n", "n = ax.hist(v, bins=100) # n is a tuple: the number of samples in each bin, the bin edges and the plotted patches" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# SciPy\n", "Let's import the SciPy function used for curve fitting. Matplotlib is used for visualization." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "from scipy.optimize import curve_fit" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Linear fitting" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0.99569401 2.03516151]\n" ] }, { "data": { "text/plain": [ "[]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Define a linear model, used both to generate the data and to fit it\n", "def linearfunc(x, a, b):\n", " return a * x + b\n", "\n", "x = np.linspace(0, 10, 100) # start = 0, stop = 10, samples = 100\n", "y = linearfunc(x, 1, 2) # linear function defined in [0,10]\n", "plt.figure()\n", "plt.plot(x, y)\n", "# Adding noise to the data\n", "yn = y + 0.9 * np.random.normal(size=len(x))\n", "plt.plot(x,yn)\n", "# Executing curve_fit on noisy data\n", "popt, pcov = curve_fit(linearfunc, x, yn) # estimates the parameters a, b of the linear function\n", "print(popt)\n", "yfit = linearfunc(x,popt[0],popt[1]) # the fitted line nearly overlaps the original one\n", "plt.plot(x,yfit)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Gaussian fitting" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 0.97061079 5.08490729 -2.02934528]\n" ] }, { "data": { "text/plain": [ "[]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Define a Gaussian model, used both to generate the data and to fit it\n", "def gaussfunc(x, a, b, c):\n", " return a*np.exp(-(x-b)**2/(2*c**2))\n", "\n", "# Generating clean data\n", "x = np.linspace(0, 10, 100)\n", "y = gaussfunc(x, 1, 5, 2)\n", "\n", "# Adding noise (gaussian) to the data (also gaussian)\n", "yn = y + 0.2 * np.random.normal(size=len(x))\n", "plt.figure()\n", "plt.plot(x,yn) # plot the gaussian function with random noise\n", "# Executing curve_fit on noisy data\n", "popt, pcov = curve_fit(gaussfunc, x, yn) # estimates the parameters a, b, c of the gaussian function\n", "print(popt) # the sign of c is arbitrary because only c**2 enters the model\n", "yfit = gaussfunc(x,popt[0],popt[1],popt[2]) # evaluate the fitted gaussian\n", "plt.plot(x,yfit)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Solve equations" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1.]\n", "[2.]\n" ] } ], "source": [ "from scipy.optimize import fsolve\n", "curve = lambda x: (x - 1)*(x - 2)\n", "solution1 = fsolve(curve, 0)\n", "print(solution1)\n", "solution2 = fsolve(curve,3)\n", "print(solution2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Univariate interpolation" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "from scipy.interpolate import interp1d\n", "x = np.linspace(0, 3*np.pi, 10)\n", "y = np.sin(x)\n", "\n", "# create a linear interpolation function\n", "linearfunc = interp1d(x, y, kind='linear')\n", "# create a quadratic interpolation function\n", "quadraticfunc = interp1d(x, y, kind='quadratic')\n", "# interpolate on a grid of 100 points\n", "x_interp = np.linspace(0, 3*np.pi, 100)\n", "linear_interp = linearfunc(x_interp)\n", "quadratic_interp = quadraticfunc(x_interp)\n", "# plot the results\n", "plt.figure() # new figure\n", "plt.plot(x, y,'o') # plot the data points\n", "plt.plot(x_interp, linear_interp, x_interp, quadratic_interp); # plot the linear and quadratic interpolations\n", "plt.legend(['data', 'linear', 'quadratic'], loc='best')" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Interpolation of noisy data\n", "from scipy.interpolate import UnivariateSpline\n", "sample = 30\n", "x = np.linspace(0.5, 10*np.pi, sample)\n", "y = np.cos(x) + np.log10(x) + np.random.randn(sample) / 10\n", "linearfunc = interp1d(x, y, kind='linear')\n", "splinefunc = UnivariateSpline(x, y, s=1)\n", "x_interp = np.linspace(0.5, 10*np.pi, 1000)\n", "linear_interp = linearfunc(x_interp)\n", "spline_interp = splinefunc(x_interp)\n", "plt.figure()\n", "plt.plot(x,y,'o')\n", "plt.plot(x_interp, linear_interp, x_interp, spline_interp)\n", "plt.legend(['data', 'linear', 'spline'], loc='best')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Multivariate Interpolation" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "def func(x, y):\n", " return np.sqrt(x**2 + y**2)+np.sin(x**2 + y**2)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# creates a 2D grid of 100x100 points with coordinate values from 0 to 5 for both x and y\n", "grid_x, grid_y = np.mgrid[0:5:100j, 0:5:100j] \n", "\n", "# sample data points\n", "xy = np.random.rand(1000, 2)\n", "z = func(xy[:,0]*5, xy[:,1]*5)" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "from scipy.interpolate import griddata\n", "# interpolating data\n", "grid_z0 = griddata(xy*5, z, (grid_x, grid_y), method='cubic')" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Text(0.5, 1.0, 'Interpolated')" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.subplot(121)\n", "plt.imshow(func(grid_x, grid_y).T, extent=(0,1,0,1), origin='lower') # shows the image generated on the grid points, with both axes rescaled to [0,1] \n", "plt.plot(xy[:,0], xy[:,1], 'k.', ms=1) # plot the random sample points\n", "plt.title('Original')\n", "\n", "plt.subplot(122)\n", "plt.imshow(grid_z0.T, extent=(0,1,0,1), origin='lower') # shows the interpolated image\n", "plt.title('Interpolated')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3D Plotting" ] }, { "cell_type": "code", "execution_count": 72, "metadata": {}, "outputs": [], "source": [ "def f(x, y):\n", " return np.exp(-(x*x + y*y) / 1.0)\n", " #return 1 / (1 + np.exp(-5*x - 4*y))" ] }, { "cell_type": "code", "execution_count": 73, "metadata": {}, "outputs": [], "source": [ "x = np.linspace(-1, 1, 100)\n", "y = np.linspace(-1, 1, 100)\n", "X, Y = np.meshgrid(x, y) # generates a grid from one-dimensional arrays\n", "z = f(X, Y)" ] }, { "cell_type": "code", "execution_count": 74, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 74, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "fig = plt.figure()\n", "ax = plt.axes(projection='3d')\n", "ax.contour3D(X, Y, z, 100, cmap='binary')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.5" } }, "nbformat": 4, "nbformat_minor": 4 }