{ "metadata": { "name": "9932_05_04" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Chapter 5, example 4\n", "====================\n", "\n", "In this example, we show how to accelerate a pure Python algorithm with Cython, just by giving some type information about the local variables." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will implement the Eratosthenes Sieve algorithm to find all prime numbers smaller than a given number. The first version is coded in pure Python." ] }, { "cell_type": "code", "collapsed": false, "input": [ "def primes1(n):\n", " primes = [False, False] + [True] * (n - 2)\n", " i = 2\n", " while i < n:\n", " # we do not deal with composite numbers\n", " if not primes[i]:\n", " i += 1\n", " continue\n", " k = i * i\n", " # mark multiples of i as composite numbers\n", " while k < n:\n", " primes[k] = False\n", " k += i\n", " i += 1\n", " return [i for i in xrange(2, n) if primes[i]]" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's evaluate the performance of this first version." ] }, { "cell_type": "code", "collapsed": false, "input": [ "n = 10000" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 2 }, { "cell_type": "code", "collapsed": false, "input": [ "primes1(20)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 3, "text": [ "[2, 3, 5, 7, 11, 13, 17, 19]" ] } ], "prompt_number": 3 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit primes1(n)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "100 loops, best of 3: 10.8 ms per loop\n" ] } ], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, we load the Cython magic extension to write Cython code right in the notebook." ] }, { "cell_type": "code", "collapsed": false, "input": [ "%load_ext cythonmagic" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 5 }, { "cell_type": "markdown", "metadata": {}, "source": [ "All we do is to add `%%cython` in the first line of the cell." ] }, { "cell_type": "code", "collapsed": false, "input": [ "%%cython\n", "def primes2(n):\n", " primes = [False, False] + [True] * (n - 2)\n", " i = 2\n", " while i < n:\n", " if not primes[i]:\n", " i += 1\n", " continue\n", " k = i * i\n", " while k < n:\n", " primes[k] = False\n", " k += i\n", " i += 1\n", " return [i for i in xrange(2, n) if primes[i]]" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 6 }, { "cell_type": "code", "collapsed": false, "input": [ "primes2(20)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 7, "text": [ "[2, 3, 5, 7, 11, 13, 17, 19]" ] } ], "prompt_number": 7 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The speed up improvement is modest." ] }, { "cell_type": "code", "collapsed": false, "input": [ "timeit primes2(n)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "100 loops, best of 3: 6.97 ms per loop\n" ] } ], "prompt_number": 8 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, we will precise the type of each local variable so that Cython can considerably optimize the code." ] }, { "cell_type": "code", "collapsed": false, "input": [ "%%cython\n", "def primes3(int n):\n", " primes = [False, False] + [True] * (n - 2)\n", " cdef int i = 2\n", " cdef int k = 0\n", " while i < n:\n", " if not primes[i]:\n", " i += 1\n", " continue\n", " k = i * i\n", " while k < n:\n", " primes[k] = False\n", " k += i\n", " i += 1\n", " return [i for i in xrange(2, n) if primes[i]]" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 9 }, { "cell_type": "code", "collapsed": false, "input": [ "timeit primes3(n)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "1000 loops, best of 3: 1.36 ms per loop\n" ] } ], "prompt_number": 10 }, { "cell_type": "markdown", "metadata": {}, "source": [ "We achieve a 8x speed improvement just by specifying the variable types." ] } ], "metadata": {} } ] }