{ "cells": [ { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "# Week 05 - Numpy II and Scipy\n", "\n", "## Today's Agenda\n", "- Numpy II\n", "- Scipy" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "# Numpy II\n", "\n", "Last time in [Week 05](https://github.com/VandyAstroML/Vanderbilt_Computational_Bootcamp/blob/master/notebooks/Week_05/05_Numpy_Matplotlib.ipynb), we covered `Numpy` and `Matplotlib`. This time we will be focusing on more advanced concepts of `Numpy`." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "deletable": true, "editable": true }, "outputs": [], "source": [ "# Loading modules\n", "%matplotlib inline\n", "import numpy as np\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "## Review\n", "As a review, let's explore some of the concepts that were introduced last time, in `Numpy I`.\n", "\n", "### Create 1D-arrays\n", "We introduced how to create a 1D-array" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([ 1., 2., 3., 5., 6., 7., 8., 10.])" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = np.array([1,2,3,5,6,7,8,10],dtype=float)\n", "x" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "y = np.arange(10)\n", "y" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": 
false } }, "outputs": [ { "data": { "text/plain": [ "array([ 0. , 2.04081633, 4.08163265, 6.12244898,\n", " 8.16326531, 10.20408163, 12.24489796, 14.28571429,\n", " 16.32653061, 18.36734694, 20.40816327, 22.44897959,\n", " 24.48979592, 26.53061224, 28.57142857, 30.6122449 ,\n", " 32.65306122, 34.69387755, 36.73469388, 38.7755102 ,\n", " 40.81632653, 42.85714286, 44.89795918, 46.93877551,\n", " 48.97959184, 51.02040816, 53.06122449, 55.10204082,\n", " 57.14285714, 59.18367347, 61.2244898 , 63.26530612,\n", " 65.30612245, 67.34693878, 69.3877551 , 71.42857143,\n", " 73.46938776, 75.51020408, 77.55102041, 79.59183673,\n", " 81.63265306, 83.67346939, 85.71428571, 87.75510204,\n", " 89.79591837, 91.83673469, 93.87755102, 95.91836735,\n", " 97.95918367, 100. ])" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "z = np.linspace(0,100,50)\n", "z" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([ 1.2125995 , 1.83486385, 0.63821859, -0.20766847, 0.16394987,\n", " 0.29579982, -1.16177048, -0.83906042, 1.05600292, -0.96406777,\n", " -1.83799388, -2.2114805 , -0.46763226, -0.28014607, -0.82671363,\n", " -0.29653933, -0.99461879, -1.05692908, -0.63431348, 0.58229083,\n", " -0.06923924, 2.56103329, 0.14558599, 0.06025412, 1.29401031,\n", " -0.9230366 , 1.01506698, 0.3772861 , -1.20513242, 1.49001138,\n", " -0.8995781 , 1.3497741 , 1.408165 , 1.23759972, 0.61076751,\n", " -0.4094068 , 0.67698797, -0.77231489, -1.56098342, -0.99862791,\n", " 1.21275181, 0.00340132, 0.77635256, 0.03975821, 1.06701466,\n", " 0.05538845, 1.00814729, -0.79890085, 0.01000337, 0.36924362,\n", " -0.79997093, -0.83036405, -0.741903 , 1.48206007, 0.3152332 ,\n", " -0.17685307, -0.17511851, 0.74958486, -1.16929455, 1.03634486,\n", " 1.71691808, 0.84479828, -0.65825241, -1.49280007, -0.90322557,\n", " 
0.33369591, -0.65891918, 1.81929896, 0.22827773, 2.24103316,\n", " -0.50474291, 0.67246374, 0.57899788, 0.14338281, -0.80948219,\n", " -0.09728413, 0.8249388 , 1.44920434, -1.61984601, -0.23985288,\n", " 0.12277596, -0.73190566, 0.63827243, -0.13229476, 0.3812185 ,\n", " -1.58845796, -1.52718767, -0.10934944, -1.35275154, 0.79154503,\n", " 1.76049644, -0.26587198, -0.0231885 , -0.8432236 , 0.68359977,\n", " 2.46355285, 1.85368996, -1.49378643, -0.49819556, 0.0810053 ])" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "h = np.random.randn(100)\n", "h" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "### Handling arrays\n", "\n", "These are just a few of the different ways to __create__ numpy arrays.\n", "\n", "You can also use functions like __np.max()__ and __np.min()__ to get the maximum and minimum values, respectively." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Min X: 1.000 \t Max X: 10.000\n" ] } ], "source": [ "print('Min X: {0:.3f} \\t Max X: {1:.3f}'.format(np.min(x), np.max(x)) )" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "#### Apply mathematical functions" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([ 4., 28., 90., 400., 684., 1078., 1600., 3100.])" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "zz = x**2 + 3*x**3\n", "zz" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "#### Conditionals\n", "Find the indices of the elements in an array that meet some criteria.\n", 
"In this example, we're finding all the elements that are within 100 and 500 in array ___\"```zz```\"___." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "zz_idx: [3]\n" ] }, { "data": { "text/plain": [ "array([400.])" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "zz_idx = np.where((zz>= 100)&(zz <= 500))[0]\n", "print('zz_idx: {0}'.format(zz_idx))\n", "\n", "zz[zz_idx]" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "## Manipulating Arrays\n", "There are a lot of things we can do to a _numpy array_. " ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([25, 26, 24, 15, 35, 24, 21, 47, 31, 22, 17, 35, 47, 30, 14, 29, 17,\n", " 17, 22, 27, 46, 26, 49, 40, 27, 30, 20, 26, 36, 38, 49, 15, 16, 29,\n", " 24, 21, 26, 47, 41, 14, 20, 37, 40, 18, 31, 29, 34, 31, 41, 42])" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "h1 = np.random.randint(10, 50, 50)\n", "h1" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "We can get the overall __size__ and __shape__ of the array.\n", "\n", "We cane use the functions `numpy.size` and `numpy.shape` to get the total number of elements in an array and the shape of the array, respectively." 
] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "50" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.size(h1)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "(50,)" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "h1.shape" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([[ 1, 2, 3, 4, 5],\n", " [ 6, 7, 8, 9, 10],\n", " [12, 13, 14, 16, 17],\n", " [13, 45, 67, 89, 90]])" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A = np.array([[1,2,3,4,5],\n", " [6,7,8,9,10],\n", " [12,13,14,16,17],\n", " [13,45,67,89,90] ])\n", "A" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "(4, 5)" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.shape(A)" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "You can also __transpose__ array `A`." 
] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([[ 1, 6, 12, 13],\n", " [ 2, 7, 13, 45],\n", " [ 3, 8, 14, 67],\n", " [ 4, 9, 16, 89],\n", " [ 5, 10, 17, 90]])" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A_t = np.transpose(A)\n", "A_t" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "Why `Numpy` arrays are better than lists for numerical work:\n", "- Python lists are very _general_: each element can be any object, which costs speed and memory.\n", "- Lists __do not__ support matrix and dot products, etc.\n", "- `Numpy` arrays are memory _efficient_.\n", "- `Numpy` arrays are __statically typed__ and __homogeneous__.\n", "- They are fast at mathematical functions.\n", "- Their data can be passed to code written in compiled languages, e.g. _C_ and _Fortran_." ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "### Array-generating functions\n", "For large arrays it is impractical to initialize the data manually, using normal Python lists. Instead, we can use many of the Numpy functions to generate arrays of different forms." 
] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "#### __numpy.arange__\n", "We use this one to create a sequence of evenly spaced values. The stop value is not included." ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.arange(0,10,1)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([ 0, 5, 10, 15])" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.arange(0,20,5)" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([-40, -30, -20, -10, 0, 10, 20])" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.arange(-40,21,10)" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "#### linspace and logspace\n", "We use these functions to create ordered arrays whose elements are evenly spaced in _linear_- and _log_-space, respectively. By default, both return 50 elements and include the endpoint." ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([ 0. 
, 1.02040816, 2.04081633, 3.06122449, 4.08163265,\n", " 5.10204082, 6.12244898, 7.14285714, 8.16326531, 9.18367347,\n", " 10.20408163, 11.2244898 , 12.24489796, 13.26530612, 14.28571429,\n", " 15.30612245, 16.32653061, 17.34693878, 18.36734694, 19.3877551 ,\n", " 20.40816327, 21.42857143, 22.44897959, 23.46938776, 24.48979592,\n", " 25.51020408, 26.53061224, 27.55102041, 28.57142857, 29.59183673,\n", " 30.6122449 , 31.63265306, 32.65306122, 33.67346939, 34.69387755,\n", " 35.71428571, 36.73469388, 37.75510204, 38.7755102 , 39.79591837,\n", " 40.81632653, 41.83673469, 42.85714286, 43.87755102, 44.89795918,\n", " 45.91836735, 46.93877551, 47.95918367, 48.97959184, 50. ])" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "B = np.linspace(0,50)\n", "B" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([ 0. , 5.26315789, 10.52631579, 15.78947368,\n", " 21.05263158, 26.31578947, 31.57894737, 36.84210526,\n", " 42.10526316, 47.36842105, 52.63157895, 57.89473684,\n", " 63.15789474, 68.42105263, 73.68421053, 78.94736842,\n", " 84.21052632, 89.47368421, 94.73684211, 100. ])" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "B = np.linspace(0,100, 20)\n", "B" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "Array of __25 elements__ from $10^{0}$ to $10^{3}$, with __base of 10__." ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([ 1. , 1.33352143, 1.77827941, 2.37137371,\n", " 3.16227766, 4.21696503, 5.62341325, 7.49894209,\n", " 10. 
, 13.33521432, 17.7827941 , 23.71373706,\n", " 31.6227766 , 42.16965034, 56.23413252, 74.98942093,\n", " 100. , 133.35214322, 177.827941 , 237.13737057,\n", " 316.22776602, 421.69650343, 562.34132519, 749.89420933,\n", " 1000. ])" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "B = np.logspace(0,3,25)\n", "B" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "Creating an array of 11 elements from $e^{0}$ to $e^{10}$, with the ```base == numpy.e```" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([1.00000000e+00, 2.71828183e+00, 7.38905610e+00, 2.00855369e+01,\n", " 5.45981500e+01, 1.48413159e+02, 4.03428793e+02, 1.09663316e+03,\n", " 2.98095799e+03, 8.10308393e+03, 2.20264658e+04])" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "B = np.logspace(0,10,11, base=np.e)\n", "B" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "#### Random Data" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "deletable": true, "editable": true }, "outputs": [], "source": [ "from numpy import random" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([[0.60426703, 0.09522098, 0.73603191, 0.41629722, 0.03830843],\n", " [0.16121388, 0.42582919, 0.97209131, 0.9586233 , 0.61308885],\n", " [0.63238237, 0.15224875, 0.0961162 , 0.91785892, 0.1275659 ],\n", " [0.9448008 , 0.01132601, 0.656943 , 0.03904403, 0.28083426],\n", " [0.80007388, 0.05710783, 0.23060248, 0.75706392, 0.14909109]])" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": 
[ "# Uniform random numbers in [0,1]\n", "random.rand(5,5)" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([24, 15, 18, 26, 19, 19, 17, 27, 22, 23, 17, 10, 28, 19, 22, 22, 18,\n", " 11, 13, 15])" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# 20 Random integers from 10 to 30\n", "random.randint(10,30,20)" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "#### Arrays of `zeros` and `ones`." ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,\n", " 0., 0., 0.])" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.zeros(20)" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "You can use these to populate other arrays" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nelem = 10\n", "C = np.ones(10)\n", "C" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([0.93317549, 0.14262875, 0.6902437 , 0.87186553, 0.53890869,\n", " 0.71245388, 0.15132051, 0.20685907, 0.39431495, 0.89131517])" ] }, "execution_count": 27, "metadata": {}, "output_type": 
"execute_result" } ], "source": [ "for ii in range(C.size):\n", " C[ii] = random.rand()\n", "C" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "### Diagonals\n", "You can also construct an array with another array as the diagonal" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([[16, 0, 0, 0, 0],\n", " [ 0, 13, 0, 0, 0],\n", " [ 0, 0, 11, 0, 0],\n", " [ 0, 0, 0, 15, 0],\n", " [ 0, 0, 0, 0, 14]])" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.diag(random.randint(10,20,5))" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "### Indexing" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "You can choose which values to select.\n", "Normally, you select the `rows` first, and then the `cols` of a `numpy.ndarray`." 
] }, { "cell_type": "code", "execution_count": 29, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([[0.68677283, 0.38902045, 0.42871994, 0.81523068, 0.64483989],\n", " [0.51884508, 0.49264815, 0.99477923, 0.18455273, 0.51262224],\n", " [0.45850799, 0.66295697, 0.81406189, 0.66847916, 0.62882748],\n", " [0.652714 , 0.24415316, 0.47897214, 0.06621372, 0.27317223],\n", " [0.04435413, 0.49195189, 0.10773559, 0.85004019, 0.29236086],\n", " [0.96106373, 0.35939549, 0.75456427, 0.00273902, 0.14031911],\n", " [0.69224339, 0.46379945, 0.59234115, 0.19991018, 0.49572112],\n", " [0.30785617, 0.97237966, 0.47133784, 0.79830205, 0.54963373],\n", " [0.30952122, 0.50024043, 0.41479166, 0.51712861, 0.57509521],\n", " [0.99702942, 0.41362465, 0.44884597, 0.8832873 , 0.95066126]])" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "M = random.rand(10,5)\n", "M" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "Selecting the 2nd row (index `1`, since indexing starts at 0)" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([0.51884508, 0.49264815, 0.99477923, 0.18455273, 0.51262224])" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "M[1,:]" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "The 2nd column (index `1`)" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([0.38902045, 0.49264815, 0.66295697, 0.24415316, 0.49195189,\n", " 0.35939549, 0.46379945, 0.97237966, 
0.50024043, 0.41362465])" ] }, "execution_count": 31, 
"metadata": {}, "output_type": "execute_result" } ], "source": [ "M[:,1]" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "Select a range of columns and rows" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([[0.99477923, 0.18455273],\n", " [0.81406189, 0.66847916]])" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "M[1:3, 2:4]" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "You can easily use this to create a __mask__, for when you are _cleaning_ your data." ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([[ nan, 0.22709903, 0.34331486],\n", " [0.91062782, nan, 0.89461017],\n", " [0.54718005, 0.09103976, nan]])" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A = random.rand(3,3)\n", "np.fill_diagonal(A, np.nan)\n", "A" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([[0, 1, 2],\n", " [3, 4, 5],\n", " [6, 7, 8]])" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "B = np.arange(0,9).reshape((3,3))\n", "B" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "Appying the __mask__ from $A \\to B$" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ 
"array([[False, True, True],\n", " [ True, False, True],\n", " [ True, True, False]])" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A_mask = np.isfinite(A)\n", "A_mask" ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([1, 2, 3, 5, 6, 7])" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "B[A_mask]" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "### Binning you data\n", "This is probably one of the best functions of `Numpy`.\n", "You can use this to bin you data, and calculate __means__, __standard deviations__, etc." ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "#### numpy.digitize" ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Creating my bin edges\n", "bins = np.arange(0,13)\n", "bins" ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([4.99232218, 5.53702794, 5.929478 , 0.83550921, 9.96708561,\n", " 2.96031408, 2.00377163, 0.31854306, 0.48134572, 8.66035854,\n", " 9.95827216, 6.93070449, 4.47507617, 1.07392493, 8.15819131,\n", " 5.44619864, 2.53174022, 0.16471857, 8.35843109, 3.12430454,\n", " 4.99680908, 2.18259352, 9.21925301, 1.61365526, 7.83888345,\n", " 7.83650219, 8.6382215 , 8.30377063, 3.94558256, 5.31388424,\n", " 7.40166355, 2.91812888, 
8.57977213, 0.4082193 , 1.4190216 ,\n", " 0.8316332 , 1.35202807, 8.28293158, 3.60098678, 6.28447875,\n", " 7.12344948, 1.13588079, 5.05358434, 0.10274816, 1.71728426,\n", " 3.30533502, 3.41941008, 9.99568425, 5.70585609, 3.95266365,\n", " 4.94809192, 9.25017982, 1.76354834, 0.16091627, 7.69585176,\n", " 5.37392259, 5.66522166, 6.62362653, 7.96742888, 2.46147816,\n", " 7.87554625, 9.54633578, 4.74731321, 5.30576571, 1.20723003,\n", " 0.03960206, 1.01701994, 1.35132563, 4.98330919, 3.84840303,\n", " 6.82422195, 0.81665793, 4.60547073, 5.82589065, 9.54330536,\n", " 6.28828892, 6.57514647, 2.24173202, 4.48154443, 9.28000871,\n", " 9.97469965, 5.7771928 , 5.75892113, 5.58329275, 7.99999175,\n", " 2.57523747, 4.55828035, 4.25558514, 6.56544173, 1.60343203,\n", " 2.30881588, 5.87116817, 7.95971268, 7.6376034 , 7.12862717,\n", " 5.01166971, 6.17339498, 6.55501683, 6.06717032, 3.18076008])" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Generating Data\n", "data = 10*random.rand(100)\n", "data" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "Now I want to bin my data and calculate the mean for each bin" ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([ 5, 6, 6, 1, 10, 3, 3, 1, 1, 9, 10, 7, 5, 2, 9, 6, 3,\n", " 1, 9, 4, 5, 3, 10, 2, 8, 8, 9, 9, 4, 6, 8, 3, 9, 1,\n", " 2, 1, 2, 9, 4, 7, 8, 2, 6, 1, 2, 4, 4, 10, 6, 4, 5,\n", " 10, 2, 1, 8, 6, 6, 7, 8, 3, 8, 10, 5, 6, 2, 1, 2, 2,\n", " 5, 4, 7, 1, 5, 6, 10, 7, 7, 3, 5, 10, 10, 6, 6, 6, 8,\n", " 3, 5, 5, 7, 2, 3, 6, 8, 8, 8, 6, 7, 7, 7, 4])" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Defining statistical function to use\n", "stat_func = np.nanmean\n", "# Binning the data\n", "data_bins = np.digitize(data, bins)\n", 
"data_bins" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "Calculating the __mean__ for each of the bins" ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([ 0.41598935, 1.38675917, 2.46486798, 3.54718072,\n", " 4.70438024, 5.54393829, 6.4887491 , 7.67866005,\n", " 8.42595382, 9.63720271, -10. , -10. ])" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "failval = -10\n", "bins_stat = np.array([stat_func(data[data_bins == ii]) \\\n", " if len(data[data_bins == ii]) > 0 \\\n", " else failval \\\n", " for ii in range(1,len(bins))])\n", "bins_stat = np.asarray(bins_stat)\n", "bins_stat" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can put all of this into a function that estimates errors and more..." ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [], "source": [ "import math\n", "\n", "def myceil(x, base=10):\n", " \"\"\"\n", " Returns the upper-bound integer of 'x' in base 'base'.\n", "\n", " Parameters\n", " ----------\n", " x: float\n", " number to be approximated to closest number to 'base'\n", "\n", " base: float\n", " base used to calculate the closest 'largest' number\n", "\n", " Returns\n", " -------\n", " n_high: float\n", " Closest float number to 'x', i.e. 
upper-bound float.\n", "\n", " Example\n", " -------\n", " >>> myceil(12, 10)\n", " 20.0\n", " >>> myceil(12.05, 0.1)  # up to floating-point rounding\n", " 12.1\n", " \"\"\"\n", " n_high = float(base*math.ceil(float(x)/base))\n", "\n", " return n_high\n", "\n", "def myfloor(x, base=10):\n", " \"\"\"\n", " Returns the largest multiple of 'base' that is less than or equal to 'x'.\n", "\n", " Parameters\n", " ----------\n", " x: float\n", " number to be approximated to closest number of 'base'\n", "\n", " base: float\n", " base used to calculate the closest 'smallest' number\n", "\n", " Returns\n", " -------\n", " n_low: float\n", " Closest float number to 'x', i.e. lower-bound float.\n", "\n", " Example\n", " -------\n", " >>> myfloor(12, 5)\n", " 10.0\n", " \"\"\"\n", " n_low = float(base*math.floor(float(x)/base))\n", "\n", " return n_low\n", "\n", "def Bins_array_create(arr, base=10):\n", " \"\"\"\n", " Generates bin edges spanning [arr.min(), arr.max()] in steps of `base`.\n", "\n", " Parameters\n", " ----------\n", " arr: array_like, shape (N,), one-dimensional\n", " Array of numerical elements\n", "\n", " base: float, optional (default=10)\n", " Interval between bins\n", "\n", " Returns\n", " -------\n", " bins_arr: array_like\n", " Array of bin edges for given arr\n", "\n", " \"\"\"\n", " base = float(base)\n", " arr = np.array(arr)\n", " assert(arr.ndim==1)\n", " arr_min = myfloor(arr.min(), base=base)\n", " arr_max = myceil( arr.max(), base=base)\n", " bins_arr = np.arange(arr_min, arr_max+0.5*base, base)\n", "\n", " return bins_arr" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [], "source": [ "def Mean_std_calc_one_array(x1, y1, arr_len=0, statfunc=np.nanmean,\n", " failval=np.nan, error='std',\n", " base=10.):\n", " \"\"\"\n", " Calculates statistics of two arrays, e.g. 
scatter,\n", " error in `statfunc`, etc.\n", " \n", " Parameters\n", " ----------\n", " x1: array-like, shape (N,)\n", " array of x-values\n", " \n", " y1: array-like, shape (N,)\n", " array of y-values\n", " \n", " arr_len: int, optional (default = 0)\n", " a bin is only evaluated if it holds more than `arr_len` elements\n", " \n", " statfunc: numpy function, optional (default = numpy.nanmean)\n", " statistical function used to evaluate the bins\n", " \n", " failval: int or float, optional (default = numpy.nan)\n", " Number to use to replace when the number of elements in the \n", " bin is not larger than `arr_len`\n", " \n", " error: string, optional (default = 'std')\n", " type of error to evaluate\n", " Options:\n", " - 'std': Evaluates the standard deviation of the bin\n", " - 'stat': Evaluates the error in the mean/median of each bin\n", " - 'none': Does not calculate the error in `y1`\n", " \n", " base: float, optional (default = 10.)\n", " Value of bin width in units of that of `x1`\n", " \n", " Returns\n", " --------\n", " x1_stat: array-like, shape (n_bins,)\n", " `statfunc` of each bin in `base` spacings for x1\n", "\n", " y1_stat: array-like, shape (n_bins,)\n", " `statfunc` of each bin in `base` spacings for y1\n", "\n", " y1_err: array-like, shape (n_bins,)\n", " error of `y1` in each bin, as selected by `error`\n", " \"\"\"\n", " x1 = np.asarray(x1)\n", " y1 = np.asarray(y1)\n", " assert((x1.ndim==1) & (y1.ndim==1))\n", " assert((x1.size >0) & (y1.size>0))\n", " n_elem = len(x1)\n", " ## Computing Bins\n", " x1_bins = Bins_array_create(x1, base=base)\n", " x1_digit = np.digitize(x1, x1_bins)\n", " ## Computing Statistics in bins\n", " x1_stat = np.array([statfunc(x1[x1_digit==ii]) \n", " if len(x1[x1_digit==ii])>arr_len \n", " else failval\n", " for ii in range(1,x1_bins.size)])\n", " y1_stat = np.array([statfunc(y1[x1_digit==ii])\n", " if len(y1[x1_digit==ii])>arr_len \n", " else failval\n", " for ii in range(1,x1_bins.size)])\n", " ## Computing error in the 
data\n", " if error=='std':\n", "     stat_err = np.nanstd\n", "     y1_err = np.array([stat_err(y1[x1_digit==ii]) \n", "         if len(y1[x1_digit==ii])>arr_len \n", "         else 
failval\n", " for ii in range(1,x1_bins.size)])\n", " if error!='none':\n", " y1_err = np.array([stat_err(y1[x1_digit==ii])/np.sqrt(len(y1[x1_digit==ii]))\n", " if len(y1[x1_digit==ii])>arr_len\n", " else failval\n", " for ii in range(1,x1_bins.size)])\n", " if (stat_func==np.median) or (stat_func==np.nanmedian):\n", " y1_err *= 1.253\n", " else:\n", " y1_err = np.zeros(y1.stat.size)\n", " \n", " return x1_stat, y1_stat, y1_err" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__Example of using these function__:" ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "import numpy as np\n", "\n", "# Defining arrays\n", "x_arr = np.arange(100)\n", "y_arr = 50*np.random.randn(x_arr.size)\n", "\n", "# Computing mean and error in the mean for `x_arr` and `y_arr`\n", "x_stat, y_stat, y_err = Mean_std_calc_one_array(x_arr, y_arr,\n", " statfunc=np.nanmean,\n", " failval=np.nan,\n", " base=10)\n", "x_stat2, y_stat2, y_err2 = Mean_std_calc_one_array(x_arr, y_arr,\n", " statfunc=np.nanmedian,\n", " failval=np.nan,\n", " base=10)" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "collapsed": false, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.style.use('seaborn-notebook')\n", "plt.clf()\n", "plt.close()\n", "fig = plt.figure(figsize=(12,8))\n", "ax = fig.add_subplot(111,facecolor='white')\n", "ax.plot(x_arr, y_arr, 'ro', label='Data')\n", "ax.errorbar(x_stat, y_stat, yerr=y_err, color='blue', marker='o',\n", " linestyle='--',label='Mean')\n", "ax.errorbar(x_stat2, y_stat2, yerr=y_err2, color='green', marker='o',\n", " linestyle='--',label='Median')\n", "ax.set_xlabel('X axis', fontsize=20)\n", "ax.set_ylabel('Y axis', fontsize=20)\n", "ax.set_title('Data and the Binned Data', fontsize=24)\n", "plt.legend(fontsize=20)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "With this function, it is really easy to apply statistics on binned data, as well as to ___estimate errors___ on the data." ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "### Reshaping, resizing and stacking arrays\n", "One can always modify the shape of a `numpy.ndarray`, as well as append it to a pre-existing array." 
] }, { "cell_type": "code", "execution_count": 45, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([[ 0, 1, 2, 3, 4],\n", " [10, 11, 12, 13, 14],\n", " [20, 21, 22, 23, 24],\n", " [30, 31, 32, 33, 34],\n", " [40, 41, 42, 43, 44]])" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A = np.array([[n+m*10 for n in range(5)] for m in range(5)])\n", "A" ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "deletable": true, "editable": true }, "outputs": [], "source": [ "n, m = A.shape" ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([[ 0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 20, 21, 22, 23, 24, 30,\n", " 31, 32, 33, 34, 40, 41, 42, 43, 44]])" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "B = A.reshape((1,n*m))\n", "B" ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([ 0, 1, 2, 3, 4, 10, 11, 12, 13, 14, 20, 21, 22, 23, 24, 30, 31,\n", " 32, 33, 34, 40, 41, 42, 43, 44])" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A_f = A.flatten()\n", "A_f" ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([0.22546599, 0.97540312, 0.13059234, 0.60889311, 0.30114622,\n", " 0.35999722, 0.87544608, 0.22143994, 0.02629711, 0.5020551 ,\n", " 0.85802423, 0.20244432, 0.4825194 , 0.38529469, 0.79174434,\n", " 0.55720808, 0.18167695, 0.58105877, 0.41776584, 
0.80677464,\n", " 0.2633001 , 0.79435708, 0.87975669, 0.94088208, 0.89088446])" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "C = np.random.rand(A.size)\n", "C" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "(25,)" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "C.shape" ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([[0.00000000e+00, 2.25465989e-01],\n", " [1.00000000e+00, 9.75403121e-01],\n", " [2.00000000e+00, 1.30592337e-01],\n", " [3.00000000e+00, 6.08893110e-01],\n", " [4.00000000e+00, 3.01146219e-01],\n", " [1.00000000e+01, 3.59997225e-01],\n", " [1.10000000e+01, 8.75446079e-01],\n", " [1.20000000e+01, 2.21439941e-01],\n", " [1.30000000e+01, 2.62971117e-02],\n", " [1.40000000e+01, 5.02055104e-01],\n", " [2.00000000e+01, 8.58024231e-01],\n", " [2.10000000e+01, 2.02444322e-01],\n", " [2.20000000e+01, 4.82519404e-01],\n", " [2.30000000e+01, 3.85294690e-01],\n", " [2.40000000e+01, 7.91744345e-01],\n", " [3.00000000e+01, 5.57208079e-01],\n", " [3.10000000e+01, 1.81676951e-01],\n", " [3.20000000e+01, 5.81058773e-01],\n", " [3.30000000e+01, 4.17765837e-01],\n", " [3.40000000e+01, 8.06774642e-01],\n", " [4.00000000e+01, 2.63300096e-01],\n", " [4.10000000e+01, 7.94357085e-01],\n", " [4.20000000e+01, 8.79756686e-01],\n", " [4.30000000e+01, 9.40882076e-01],\n", " [4.40000000e+01, 8.90884459e-01]])" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Stacking the two 1D arrays as the columns of a 2D array\n", "D = np.column_stack((A_f,C))\n", "D" ] }, { "cell_type": "code", "execution_count": 52, "metadata": { "collapsed": false, "deletable": true, 
"editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([[ 2. , 0.13059234],\n", " [ 3. , 0.60889311],\n", " [ 4. , 0.30114622],\n", " [10. , 0.35999722],\n", " [11. , 0.87544608],\n", " [12. , 0.22143994],\n", " [13. , 0.02629711],\n", " [14. , 0.5020551 ]])" ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Selecting the 3rd through 10th rows (indices 2 to 9)\n", "D[2:10]" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "#### np.concatenate\n", "You can also concatenate different arrays, either along rows (the default, `axis=0`) or along columns (`axis=1`)" ] }, { "cell_type": "code", "execution_count": 53, "metadata": { "deletable": true, "editable": true }, "outputs": [], "source": [ "a = np.array([[1, 2], [3, 4]])\n", "b = np.array([[5,6]])" ] }, { "cell_type": "code", "execution_count": 54, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([[1, 2],\n", " [3, 4],\n", " [5, 6]])" ] }, "execution_count": 54, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.concatenate((a,b))" ] }, { "cell_type": "code", "execution_count": 55, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([[1, 2, 5],\n", " [3, 4, 6]])" ] }, "execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.concatenate((a,b.T), axis=1)" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "### Copy and \"Deep Copy\"\n", "Sometimes it is important to create new _copies_ of arrays and other objects. 
For this reason, one uses __numpy.copy__ to create _new_, independent copies of arrays. Plain assignment, in contrast, does not copy anything:" ] }, { "cell_type": "code", "execution_count": 56, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([[1, 2],\n", " [3, 4]])" ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A = np.array([[1, 2], [3, 4]])\n", "A" ] }, { "cell_type": "code", "execution_count": 57, "metadata": { "deletable": true, "editable": true }, "outputs": [], "source": [ "# `B` now refers to the same array object as `A`; no data is copied\n", "B = A" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "If we make any changes to `B`, __`A` will also be affected by this change__." ] }, { "cell_type": "code", "execution_count": 58, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([[10, 2],\n", " [ 3, 4]])" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "B[0,0] = 10\n", "B" ] }, { "cell_type": "code", "execution_count": 59, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([[10, 2],\n", " [ 3, 4]])" ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "To get a __completely independent, new object__, you would use:" ] }, { "cell_type": "code", "execution_count": 60, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([[-5, 2],\n", " [ 3, 4]])" ] }, "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": 
[ "B = np.copy(A)\n", "\n", "# Modifying `B`\n", "B[0,0] = -5\n", "B" ] }, { "cell_type": "code", "execution_count": 61, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([[10, 2],\n", " [ 3, 4]])" ] }, "execution_count": 61, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "The array `A` was not affected by this change. This is important when you repeatedly derive new arrays from existing ones." ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "# Scipy - Library of Scientific Algorithms for Python" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "`SciPy` provides a large number of higher-level scientific algorithms.\n", "It includes:\n", "- Special Functions\n", "- Integration\n", "- Optimization\n", "- Interpolation\n", "- Fourier Transforms\n", "- Signal Processing\n", "- Linear Algebra\n", "- Statistics\n", "- Multi-dimensional image processing" ] }, { "cell_type": "code", "execution_count": 62, "metadata": { "deletable": true, "editable": true }, "outputs": [], "source": [ "import scipy as sc" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "## Interpolation\n", "You can use `SciPy` to interpolate your data.\n", "The `interp1d` function takes the sampled points and returns a callable that evaluates the interpolation at new positions." 
] }, { "cell_type": "code", "execution_count": 63, "metadata": { "deletable": true, "editable": true }, "outputs": [], "source": [ "from scipy.interpolate import interp1d" ] }, { "cell_type": "code", "execution_count": 64, "metadata": { "deletable": true, "editable": true }, "outputs": [], "source": [ "def f(x):\n", " return np.sin(x)" ] }, { "cell_type": "code", "execution_count": 65, "metadata": { "deletable": true, "editable": true }, "outputs": [], "source": [ "n = np.arange(0, 10) \n", "x = np.linspace(0, 9, 100)\n", "\n", "y_meas = f(n) + 0.1 * np.random.randn(len(n)) # simulate measurement with noise\n", "y_real = f(x)\n", "\n", "linear_interpolation = interp1d(n, y_meas)\n", "y_interp1 = linear_interpolation(x)\n", "\n", "cubic_interpolation = interp1d(n, y_meas, kind='cubic')\n", "y_interp2 = cubic_interpolation(x)" ] }, { "cell_type": "code", "execution_count": 66, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "fig, ax = plt.subplots(figsize=(15,6))\n", "ax.set_facecolor('white')\n", "ax.plot(n, y_meas, 'bs', label='noisy data')\n", "ax.plot(x, y_real, 'k', lw=2, label='true function')\n", "ax.plot(x, y_interp1, 'r', label='linear interp')\n", "ax.plot(x, y_interp2, 'g', label='cubic interp')\n", "ax.legend(loc=3, prop={'size':20});\n", "ax.tick_params(axis='both', which='major', labelsize=20)\n", "ax.tick_params(axis='both', which='minor', labelsize=15)" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "### KD-Trees\n", "You can also use `SciPy` to calculate __KD-Trees__ for a set of points" ] }, { "cell_type": "code", "execution_count": 67, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([[ 4.91677843, 203.53330277, 100.85634333],\n", " [ 42.6545178 , 82.63566177, 228.99066071],\n", " [232.5314 , 180.79608431, 95.0066412 ],\n", " ...,\n", " [ 76.65226267, 207.95240739, 30.11950574],\n", " [224.58498597, 221.71205368, 242.40314917],\n", " [ 30.36547633, 148.94184928, 144.29214812]])" ] }, "execution_count": 67, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Lbox = 250.\n", "Npts = 1000\n", "# Creating cartesian coordinates\n", "x = np.random.uniform(0, Lbox, Npts)\n", "y = np.random.uniform(0, Lbox, Npts)\n", "z = np.random.uniform(0, Lbox, Npts)\n", "sample1 = np.vstack([x, y, z]).T\n", "sample1" ] }, { "cell_type": "code", "execution_count": 68, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "(1000, 3)" ] }, "execution_count": 68, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sample1.shape" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": 
true }, "source": [ "Let's say we want to count the pairs of points whose separation is between 30 and 50. To do this, you first construct a __KD-Tree__" ] }, { "cell_type": "code", "execution_count": 69, "metadata": { "deletable": true, "editable": true }, "outputs": [], "source": [ "from scipy.spatial import cKDTree" ] }, { "cell_type": "code", "execution_count": 70, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [], "source": [ "# Initializing KDTree\n", "KD_obj = cKDTree(sample1)" ] }, { "cell_type": "code", "execution_count": 71, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Number of Neighbours: 20536\n" ] } ], "source": [ "# Pairs with separation in (30, 50]; each pair is counted twice, once per ordering\n", "N_neighbours = KD_obj.count_neighbors(KD_obj, 50) - \\\n", " KD_obj.count_neighbors(KD_obj, 30)\n", "print(\"Number of Neighbours: {0}\".format(N_neighbours))" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "Let's say you want the distances from each point to its __k nearest__ neighbors." ] }, { "cell_type": "code", "execution_count": 72, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([[ 0. , 14.1577149 , 20.28848924, 22.1680788 ],\n", " [ 0. , 12.88549073, 17.92327564, 18.42130936],\n", " [ 0. , 17.05682145, 18.37226707, 25.92539045],\n", " ...,\n", " [ 0. , 17.95350521, 20.41909125, 22.79189467],\n", " [ 0. , 18.57945774, 22.80480635, 29.61321934],\n", " [ 0. 
, 12.66943781, 21.47467022, 27.36415359]])" ] }, "execution_count": 72, "metadata": {}, "output_type": "execute_result" } ], "source": [ "k_nearest = 4\n", "# `query` returns the distances to, and the indices of, the k nearest points\n", "dist_k, dist_k_idx = KD_obj.query(sample1, k_nearest)\n", "dist_k" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "You can also get the indices of those neighbors" ] }, { "cell_type": "code", "execution_count": 73, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "array([[ 0, 38, 553, 113],\n", " [ 1, 151, 855, 541],\n", " [ 2, 212, 37, 40],\n", " ...,\n", " [997, 317, 886, 688],\n", " [998, 357, 844, 163],\n", " [999, 691, 928, 497]])" ] }, "execution_count": 73, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dist_k_idx" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "The first column corresponds to the point _itself_, because every point in `sample1` is also in the tree." ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "You can also find, for each point, all the points that lie within a distance _r_ of it" ] }, { "cell_type": "code", "execution_count": 74, "metadata": { "collapsed": false, "deletable": true, "editable": true, "jupyter": { "outputs_hidden": false } }, "outputs": [ { "data": { "text/plain": [ "[[0, 38, 113, 306, 526, 553, 762, 856, 946],\n", " [1, 151, 541, 552, 565, 583, 855],\n", " [2, 37, 40, 212, 561, 567, 653, 742],\n", " [3, 5, 18, 102, 342, 480, 513, 570, 618],\n", " [4, 75, 109, 207, 311, 864],\n", " [3, 5, 18, 342, 480, 513, 570, 922],\n", " [6, 323, 543, 617, 638, 655, 675, 686, 842, 873, 905, 964],\n", " [7, 112, 229, 248, 440, 463, 506, 615, 865, 927],\n", " [8, 205, 275, 296, 420, 486, 589, 636, 737, 839],\n", " [9, 719, 949]]" ] }, "execution_count": 74, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pairs = KD_obj.query_ball_tree(KD_obj, 30)\n", "pairs[0:10]" ] }, { "cell_type": 
"markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "__So that's it for today's lesson__" ] }, { "cell_type": "markdown", "metadata": { "deletable": true, "editable": true }, "source": [ "# Next week:\n", "- Advanced Visualization\n", "- Seaborn" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 4 }