{
 "metadata": {
  "name": "",
  "signature": "sha256:857e057dbd74b9baeb082c07cb00cc4732f4a26e74816babca64ca6cfdba487f"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "**Task:**\n",
      " \n",
      "     Make your python scripts run faster\n",
      "     \n",
      "**Solution:**\n",
      "    \n",
      "    multiprocessor, cython, numba"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "[Notebook file](http://nbviewer.ipython.org/urls/raw.github.com/koldunovn/earthpy.org/master/content/notebooks/speed.ipynb)"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "One of the counterarguments that you constantly hear about using python is that it is slow. This is somehow true for many cases, while most of the tools that scientist mainly use, like `numpy`, `scipy` and `pandas` have big chunks written in `C`, so they are very fast. For most of the geoscientific applications main advice would be to use vectorisation whenever possible, and avoid loops. However sometimes loops are unavoidable, and then python speed can get on to your nerves. Fortunately there are several easy ways to make your python loops faster."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "##Multiprocessor"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Let's first download some data to work with (NCEP reanalysis air temperature):"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "#variabs = ['air']\n",
      "#for vvv in variabs:\n",
      "#    for i in range(2000,2010):\n",
      "#        !wget ftp://ftp.cdc.noaa.gov/Datasets/ncep.reanalysis.dailyavgs/surface/{vvv}.sig995.{i}.nc"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 2
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ls *nc"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "air.sig995.2000.nc  air.sig995.2002.nc  air.sig995.2004.nc  air.sig995.2006.nc  air.sig995.2008.nc  air.sig995.2012.nc\r\n",
        "air.sig995.2001.nc  air.sig995.2003.nc  air.sig995.2005.nc  air.sig995.2007.nc  air.sig995.2009.nc\r\n"
       ]
      }
     ],
     "prompt_number": 3
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "This is netCDF files, so we need `netCDF4` library:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "from netCDF4 import Dataset"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 38
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Now we create useless but time consuming function, that have a lot of loops in it. It takes year as an input and then just summs up all the numbers from the file one by one."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "def useless(year):\n",
      "    from netCDF4 import Dataset\n",
      "    f = Dataset('air.sig995.'+year+'.nc')\n",
      "    a = f.variables['air'][:]\n",
      "    a_cum = 0\n",
      "    for i in range(a.shape[0]):\n",
      "        for j in range(a.shape[1]):\n",
      "            for n in range(a.shape[2]):\n",
      "                a_cum = a_cum+a[i,j,n]\n",
      "                \n",
      "    a_cum.tofile(year+'.bin')\n",
      "    print(year)\n",
      "    return a_cum"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 39
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "It works slow enough:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%time\n",
      "useless('2000')"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "2000\n",
        "CPU times: user 3.49 s, sys: 24 ms, total: 3.52 s\n",
        "Wall time: 3.49 s\n"
       ]
      },
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 40,
       "text": [
        "1068708186.2315979"
       ]
      }
     ],
     "prompt_number": 40
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "We can create a loop that will process several files one by one. First make a list of years:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "years = [str(x) for x in range(2000,2008)]\n",
      "years"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 41,
       "text": [
        "['2000', '2001', '2002', '2003', '2004', '2005', '2006', '2007']"
       ]
      }
     ],
     "prompt_number": 41
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "And now the loop:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%time\n",
      "for yy in years:\n",
      "    useless(yy)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "2000\n",
        "2001"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2002"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2003"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2004"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2005"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2006"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2007"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "CPU times: user 27.2 s, sys: 288 ms, total: 27.5 s\n",
        "Wall time: 28 s\n"
       ]
      }
     ],
     "prompt_number": 42
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Processing of each file is independent from others. This is \"embarrassingly parallel\" problem and can be done very easily in parallel with `multiprocessing` module of the standard library."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Most of the modern computers have more than one processor. Even the 5 year old machine, where I now edit this notebook have two. The way to check how many processors you have is to run:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "!nproc"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "2\r\n"
       ]
      }
     ],
     "prompt_number": 43
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Now let's import `multiprocessing`:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "import multiprocessing"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 44
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "And create a `pool` with 2 processors (you can use your number of processors):"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "pool = multiprocessing.Pool(processes=2)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 45
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Let's have a look how fast do we get:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%time\n",
      "r = pool.map(useless, years)\n",
      "pool.close()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "CPU times: user 4 ms, sys: 0 ns, total: 4 ms\n",
        "Wall time: 9.69 s\n",
        "2000\n",
        "2001\n",
        "20022003\n",
        "\n",
        "20042005\n",
        "\n",
        "20072006\n",
        "\n"
       ]
      }
     ],
     "prompt_number": 46
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "More than two times faster - not bad! But we can do better!"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "##Cython"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "[Cython](http://cython.org/) is an optimising static compiler for Python. Cython magic is one of the default extensions, and we can just load it (you have to have cython already installed):"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%load_ext cythonmagic"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "The cythonmagic extension is already loaded. To reload it, use:\n",
        "  %reload_ext cythonmagic\n"
       ]
      }
     ],
     "prompt_number": 47
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "The only thing we do here - add `%%cython` magic at the top of the cell. This function will be compiled with cython. "
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%cython\n",
      "def useless_cython(year):\n",
      "    from netCDF4 import Dataset\n",
      "    f = Dataset('air.sig995.'+year+'.nc')\n",
      "    a = f.variables['air'][:]\n",
      "    a_cum = 0\n",
      "    for i in range(a.shape[0]):\n",
      "        for j in range(a.shape[1]):\n",
      "            for n in range(a.shape[2]):\n",
      "                a_cum = a_cum+a[i,j,n]\n",
      "                \n",
      "    a_cum.tofile(year+'.bin')\n",
      "    print(year)\n",
      "    return a_cum"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 48
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Only this give us a good boost:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%time\n",
      "useless_cython('2000')"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "2000\n",
        "CPU times: user 2.57 s, sys: 40 ms, total: 2.61 s\n",
        "Wall time: 2.62 s\n"
       ]
      },
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 49,
       "text": [
        "1068708186.2315979"
       ]
      }
     ],
     "prompt_number": 49
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "One processor:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%time\n",
      "for yy in years:\n",
      "    useless_cython(yy)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "2000\n",
        "2001"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2002"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2003"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2004"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2005"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2006"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2007"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "CPU times: user 19.7 s, sys: 200 ms, total: 19.9 s\n",
        "Wall time: 19.9 s\n"
       ]
      }
     ],
     "prompt_number": 50
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Two processors:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%time\n",
      "pool = multiprocessing.Pool(processes=2)\n",
      "r = pool.map(useless_cython, years)\n",
      "pool.close()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "CPU times: user 0 ns, sys: 16 ms, total: 16 ms\n",
        "Wall time: 7.27 s\n",
        "2000\n",
        "2001\n",
        "20022003\n",
        "\n",
        "20052004\n",
        "\n",
        "20072006\n",
        "\n"
       ]
      }
     ],
     "prompt_number": 51
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "But the true power of cython revealed only when you provide types of your variables. You have to use `cdef` keyword in the function definition to do so. There are also couple other modifications to the function"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%cython\n",
      "import numpy as np\n",
      "\n",
      "def useless_cython(year):\n",
      "    \n",
      "    # define types of variables\n",
      "    cdef int i, j, n\n",
      "    cdef double a_cum\n",
      "    \n",
      "    from netCDF4 import Dataset\n",
      "    f = Dataset('air.sig995.'+year+'.nc')\n",
      "    a = f.variables['air'][:]\n",
      "    \n",
      "    a_cum = 0.\n",
      "    for i in range(a.shape[0]):\n",
      "        for j in range(a.shape[1]):\n",
      "            for n in range(a.shape[2]):\n",
      "                #here we have to convert numpy value to simple float\n",
      "                a_cum = a_cum+float(a[i,j,n])\n",
      "    \n",
      "    # since a_cum is not numpy variable anymore,\n",
      "    # we introduce new variable d in order to save\n",
      "    # data to the file easily\n",
      "    d = np.array(a_cum)\n",
      "    d.tofile(year+'.bin')\n",
      "    print(year)\n",
      "    return d"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 52
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%time\n",
      "useless_cython('2000')"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "2000\n",
        "CPU times: user 1.16 s, sys: 20 ms, total: 1.18 s\n",
        "Wall time: 1.19 s\n"
       ]
      },
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 53,
       "text": [
        "array(1068708186.2315979)"
       ]
      }
     ],
     "prompt_number": 53
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "One processor:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%time\n",
      "for yy in years:\n",
      "    useless_cython(yy)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "2000\n",
        "2001"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2002"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2003"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2004"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2005"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2006"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2007"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "CPU times: user 11.1 s, sys: 180 ms, total: 11.3 s\n",
        "Wall time: 11.3 s\n"
       ]
      }
     ],
     "prompt_number": 54
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Multiprocessing:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%time\n",
      "pool = multiprocessing.Pool(processes=2)\n",
      "r = pool.map(useless_cython, years)\n",
      "pool.close()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "CPU times: user 0 ns, sys: 12 ms, total: 12 ms\n",
        "Wall time: 4.3 s\n",
        "2000\n",
        "2001\n",
        "20022003\n",
        "\n",
        "20052004\n",
        "\n",
        "20072006\n",
        "\n"
       ]
      }
     ],
     "prompt_number": 55
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "From 9 seconds to 4 on two processors - not bad! But we can do better! :)"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "##Numba"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "[Numba](http://numba.pydata.org/) is an just-in-time specializing compiler which compiles annotated Python and NumPy code to LLVM (through decorators). The easiest way to install it is to use Anaconda distribution."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "from numba import jit, autojit"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 56
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "We now have to split our function in two (that would be a good idea from the beggining). One is just number crunching part, and another responsible for IO. The only thing that we have to do afterwards is to put `jit` decorator in front of the  first function."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "@autojit\n",
      "def calc_sum(a):\n",
      "    a_cum = 0.\n",
      "    for i in range(a.shape[0]):\n",
      "        for j in range(a.shape[1]):\n",
      "            for n in range(a.shape[2]):\n",
      "                a_cum = a_cum+a[i,j,n]\n",
      "    return a_cum\n",
      "\n",
      "def useless_numba(year):\n",
      "    #from netCDF4 import Dataset\n",
      "    f = Dataset('air.sig995.'+year+'.nc')\n",
      "    a = f.variables['air'][:]\n",
      "    a_cum = calc_sum(a)\n",
      "    \n",
      "    d = np.array(a_cum)\n",
      "    d.tofile(year+'.bin')\n",
      "    print(year)\n",
      "    return d"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 57
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%time\n",
      "useless_numba('2000')"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "2000\n",
        "CPU times: user 464 ms, sys: 12 ms, total: 476 ms\n",
        "Wall time: 483 ms\n"
       ]
      },
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 58,
       "text": [
        "array(1068708186.2315979)"
       ]
      }
     ],
     "prompt_number": 58
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "One processor:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%time\n",
      "for yy in years:\n",
      "    useless_numba(yy)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "2000\n",
        "2001"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2002"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2003"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2004"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2005"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2006"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2007"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "CPU times: user 1.53 s, sys: 152 ms, total: 1.68 s\n",
        "Wall time: 1.7 s\n"
       ]
      }
     ],
     "prompt_number": 59
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Two processors:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%time\n",
      "pool = multiprocessing.Pool(processes=2)\n",
      "r = pool.map(useless_numba, years)\n",
      "pool.close()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "CPU times: user 4 ms, sys: 8 ms, total: 12 ms\n",
        "Wall time: 912 ms\n",
        "2000\n",
        "2001\n",
        "20022003\n",
        "\n",
        "20042005\n",
        "\n",
        "20062007\n",
        "\n"
       ]
      }
     ],
     "prompt_number": 60
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Nice! Maybe we can speed up a bit more?"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "You can also provide type for the input and output:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "@jit('f8(f4[:,:,:])')\n",
      "def calc_sum(a):\n",
      "    a_cum = 0.\n",
      "    for i in range(a.shape[0]):\n",
      "        for j in range(a.shape[1]):\n",
      "            for n in range(a.shape[2]):\n",
      "                a_cum = a_cum+a[i,j,n]\n",
      "    return a_cum\n",
      "\n",
      "def useless_numba2(year):\n",
      "    #from netCDF4 import Dataset\n",
      "    f = Dataset('air.sig995.'+year+'.nc')\n",
      "    a = f.variables['air'][:]\n",
      "    a_cum = calc_sum(a)\n",
      "    \n",
      "    d = np.array(a_cum)\n",
      "    d.tofile(year+'.bin')\n",
      "    print(year)\n",
      "    return d"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 61
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%time\n",
      "useless_numba2('2000')"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "2000\n",
        "CPU times: user 216 ms, sys: 16 ms, total: 232 ms\n",
        "Wall time: 244 ms\n"
       ]
      },
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 62,
       "text": [
        "array(1068708186.2315979)"
       ]
      }
     ],
     "prompt_number": 62
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%time\n",
      "pool = multiprocessing.Pool(processes=2)\n",
      "r = pool.map(useless_numba2, years)\n",
      "pool.close()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "CPU times: user 8 ms, sys: 4 ms, total: 12 ms\n",
        "Wall time: 884 ms\n",
        "2000\n",
        "2001\n",
        "20022003\n",
        "\n",
        "20042005\n",
        "\n",
        "20062007\n",
        "\n"
       ]
      }
     ],
     "prompt_number": 65
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "just a tiny bit..."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "##Native numpy"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "This is how you really should solve this problem using numpy.sum(). Note, that the result will be different compared to previous examples. Only if you first convert to `float64` it becomes the same. Be careful when dealing with huge numbers!"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "import numpy as np\n",
      "def calc_sum(a):\n",
      "    a = np.float64(a)\n",
      "    return a.sum()\n",
      "\n",
      "def useless_good(year):\n",
      "    from netCDF4 import Dataset\n",
      "    f = Dataset('air.sig995.'+year+'.nc')\n",
      "    a = f.variables['air'][:]\n",
      "    a_cum = calc_sum(a)\n",
      "    \n",
      "    d = np.array(a_cum)\n",
      "    d.tofile(year+'.bin')\n",
      "    print(year)\n",
      "    return d"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 66
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%time\n",
      "useless_good('2000')"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "2000\n",
        "CPU times: user 172 ms, sys: 44 ms, total: 216 ms\n",
        "Wall time: 224 ms\n"
       ]
      },
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 67,
       "text": [
        "array(1068708186.2315979)"
       ]
      }
     ],
     "prompt_number": 67
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%time\n",
      "for yy in years:\n",
      "    useless_good(yy)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "2000\n",
        "2001"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2002"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2003"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2004"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2005"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2006"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2007"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "CPU times: user 1.76 s, sys: 296 ms, total: 2.06 s\n",
        "Wall time: 2.07 s\n"
       ]
      }
     ],
     "prompt_number": 68
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%time\n",
      "pool = multiprocessing.Pool(processes=2)\n",
      "r = pool.map(useless_good, years)\n",
      "pool.close()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "CPU times: user 4 ms, sys: 8 ms, total: 12 ms\n",
        "Wall time: 1.04 s\n",
        "2001\n",
        "2000\n",
        "20032002\n",
        "\n",
        "20052004\n",
        "\n",
        "20072006\n",
        "\n"
       ]
      }
     ],
     "prompt_number": 69
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Actually `numpy` version is a bit slower than `numba` version, but it's not clear to me how much I can trust this result."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "##Cython 2nd try"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "I was surprised to see Cython results so much different from Numba, and decide to make a second try. Here I  make cython to be aware of numpy arrays. "
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%cython\n",
      "import numpy as np\n",
      "cimport numpy as np\n",
      "cimport cython\n",
      "\n",
      "def calc_sum(np.ndarray[float, ndim=3] a):\n",
      "        \n",
      "    cdef int i, j, n\n",
      "    cdef float a_cum\n",
      "    \n",
      "    a_cum = 0\n",
      "    for i in range(a.shape[0]):\n",
      "        for j in range(a.shape[1]):\n",
      "            for n in range(a.shape[2]):\n",
      "                a_cum = a_cum+(a[i,j,n])\n",
      "    return a_cum"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 76
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "def useless_cython2(year):\n",
      "    from netCDF4 import Dataset\n",
      "    f = Dataset('air.sig995.'+year+'.nc')\n",
      "    a = f.variables['air'][:]\n",
      "    a_cum = calc_sum(a)\n",
      "    \n",
      "    d = np.array(a_cum)\n",
      "    d.tofile(year+'.bin')\n",
      "    print(year)\n",
      "    return d"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 77
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%time\n",
      "useless_cython2('2000')"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "2000\n",
        "CPU times: user 184 ms, sys: 36 ms, total: 220 ms\n",
        "Wall time: 225 ms\n"
       ]
      },
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 78,
       "text": [
        "array(1068708186.2315979)"
       ]
      }
     ],
     "prompt_number": 78
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%time\n",
      "for yy in years:\n",
      "    useless_cython2(yy)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "2000\n",
        "2001"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2002"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2003"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2004"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2005"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2006"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "2007"
       ]
      },
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "\n",
        "CPU times: user 1.44 s, sys: 164 ms, total: 1.6 s\n",
        "Wall time: 1.62 s\n"
       ]
      }
     ],
     "prompt_number": 79
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%time\n",
      "pool = multiprocessing.Pool(processes=2)\n",
      "r = pool.map(useless_cython2, years)\n",
      "pool.close()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "CPU times: user 4 ms, sys: 8 ms, total: 12 ms\n",
        "Wall time: 866 ms\n",
        "2000\n",
        "2001\n",
        "20022003\n",
        "\n",
        "20042005\n",
        "\n",
        "20062007\n",
        "\n"
       ]
      }
     ],
     "prompt_number": 80
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Now it's comparable to other competitors :)"
     ]
    },
    {
     "cell_type": "heading",
     "level": 2,
     "metadata": {},
     "source": [
      "Further reading and watching"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "[VIDEO: Cython: Speed up Python and NumPy, Pythonize C, C++, and Fortran, SciPy2013 Tutorial](http://pyvideo.org/video/2162/cython-speed-up-python-and-numpy-pythonize-c-c-7)\n",
      "\n",
      "[Numba vs. Cython: Take 2](http://nbviewer.ipython.org/url/jakevdp.github.io/downloads/notebooks/NumbaCython.ipynb)\n",
      "\n",
      "[Numexpr is a fast numerical expression evaluator for NumPy](https://github.com/pydata/numexpr)\n",
      "\n",
      "[Pythran is a python to c++ compiler for a subset of the python language](http://pythonhosted.org/pythran/)\n",
      "\n",
      "[Theano allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently](http://deeplearning.net/software/theano/)\n",
      "\n"
     ]
    }
   ],
   "metadata": {}
  }
 ]
}