{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true, "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import numpy, scipy, matplotlib.pyplot as plt, pandas, librosa" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "[← Back to Index](index.html)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# NumPy and SciPy" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "The quartet of NumPy, SciPy, Matplotlib, and IPython is a popular combination in the Python world. We will use each of these libraries in this workshop." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "## Tutorial" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "[NumPy](http://www.numpy.org) is one of the most popular libraries for numerical computing in the world. It is used across many disciplines, including image processing, finance, and bioinformatics. This entire workshop is built on NumPy and its derivatives.\n", "\n", "If you are new to NumPy, follow this [NumPy Tutorial](http://wiki.scipy.org/Tentative_NumPy_Tutorial).\n", "\n", "[SciPy](http://docs.scipy.org/doc/scipy/reference/) is a Python library for scientific computing that builds on NumPy. If NumPy is like the MATLAB core, then SciPy is like the MATLAB toolboxes. It includes support for linear algebra, sparse matrices, spatial data structures, statistics, and more.\n", "\n", "While there is a [SciPy Tutorial](http://docs.scipy.org/doc/scipy/reference/tutorial/index.html), it isn't critical that you follow it for this workshop."
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Special Arrays" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0 1 2 3 4]\n" ] } ], "source": [ "print(numpy.arange(5))" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 0. 0.5 1. 1.5 2. 2.5 3. 3.5 4. 4.5]\n" ] } ], "source": [ "print(numpy.linspace(0, 5, 10, endpoint=False))" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 0. 0. 0. 0. 0.]\n" ] } ], "source": [ "print(numpy.zeros(5))" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 1. 1. 1. 1. 1.]\n" ] } ], "source": [ "print(numpy.ones(5))" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 1. 1.]\n", " [ 1. 1.]\n", " [ 1. 1.]\n", " [ 1. 1.]\n", " [ 1. 
1.]]\n" ] } ], "source": [ "print(numpy.ones((5, 2)))" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ -1.12009510e+00 2.15875646e-03 -7.93208376e-01 -1.02710782e+00\n", " 2.37388108e+00]\n" ] } ], "source": [ "print(numpy.random.randn(5))  # random Gaussian, zero mean, unit variance" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[-0.60527349 -0.82200312]\n", " [-0.67330474 -0.12914043]\n", " [-0.71574719 -0.5962005 ]\n", " [-1.03690426 0.59078457]\n", " [-2.22983691 -1.70858604]]\n" ] } ], "source": [ "print(numpy.random.randn(5, 2))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Slicing Arrays" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[2 3]\n" ] } ], "source": [ "x = numpy.arange(10)\n", "print(x[2:4])" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "9\n" ] } ], "source": [ "print(x[-1])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "The optional third slice parameter is the step, or increment:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0 2 4 6]\n" ] } ], "source": [ "print(x[0:8:2])" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, 
"outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[4 3]\n" ] } ], "source": [ "print(x[4:2:-1])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "If you omit the start index, the slice implicitly starts from zero:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0 1 2 3]\n" ] } ], "source": [ "print(x[:4])" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0 1 2 3 4 5 6 7 8 9]\n" ] } ], "source": [ "print(x[:999])" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[9 8 7 6 5 4 3 2 1 0]\n" ] } ], "source": [ "print(x[::-1])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Array Arithmetic" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 2. 3. 4. 5. 6.]\n" ] } ], "source": [ "x = numpy.arange(5)\n", "y = numpy.ones(5)\n", "print(x + 2*y)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "`numpy.dot` computes the inner product of two 1-D arrays, or the matrix product of two 2-D arrays."
] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-4.36027379404\n" ] } ], "source": [ "x = numpy.random.randn(5)\n", "y = numpy.ones(5)\n", "print(numpy.dot(x, y))" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 0.9351335 0.9351335 ]\n", " [-4.22851009 -4.22851009]\n", " [-2.66983557 -2.66983557]\n", " [ 3.18545804 3.18545804]\n", " [ 1.82532797 1.82532797]]\n" ] } ], "source": [ "x = numpy.random.randn(5, 3)\n", "y = numpy.ones((3, 2))\n", "print(numpy.dot(x, y))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Boolean Operations" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ True True True True True False False False False False]\n" ] } ], "source": [ "x = numpy.arange(10)\n", "print(x < 5)" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ True False False False False False False False False False]\n" ] } ], "source": [ "y = numpy.ones(10)\n", "print(x < y)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Distance Metrics" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "5.0\n", "25.0\n", "7\n", "4\n" ] } ], "source": [ "from scipy.spatial import distance\n", "print(distance.euclidean([0, 0], [3, 4]))\n", "print(distance.sqeuclidean([0, 0], [3, 4]))\n", "print(distance.cityblock([0, 0], [3, 4]))\n", "print(distance.chebyshev([0, 0], [3, 4]))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "The cosine distance is one minus the cosine of the angle between two vectors, so it is 0 for vectors pointing in the same direction and 1 for orthogonal vectors:" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.0\n", "1.0\n" ] } ], "source": [ "print(distance.cosine([67, 0], [89, 0]))\n", "print(distance.cosine([67, 0], [0, 89]))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Sorting" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "NumPy arrays have a method, `sort`, which sorts the array *in-place*." ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 0.70589021 0.14767722 0.06884379 0.37189002 0.43313129]\n", "[ 0.06884379 0.14767722 0.37189002 0.43313129 0.70589021]\n" ] } ], "source": [ "x = numpy.random.randn(5)\n", "print(x)\n", "x.sort()\n", "print(x)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "notes" } }, "source": [ "`numpy.argsort` returns an array of indices, `ind`, such that `x[ind]` is a sorted version of `x`."
] }, { "cell_type": "code", "execution_count": 24, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 0.9443719 0.2831604 0.85627 0.22827583 -0.03939166]\n", "[4 3 1 2 0]\n", "[-0.03939166 0.22827583 0.2831604 0.85627 0.9443719 ]\n" ] } ], "source": [ "x = numpy.random.randn(5)\n", "print(x)\n", "ind = numpy.argsort(x)\n", "print(ind)\n", "print(x[ind])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "[← Back to Index](index.html)" ] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.6" } }, "nbformat": 4, "nbformat_minor": 0 }