{ "cells": [ { "cell_type": "markdown", "metadata": { "collapsed": true, "slideshow": { "slide_type": "slide" } }, "source": [ "# Generators: x = yield 42 \n", "## And other applications" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### David Stuebe\n", "\n", "www.swipely.com\n", "\n", "Get the presentation\n", "\n", "On Github IO: http://dstuebe.github.io/generators/\n", "\n", "On NbViewer: http://nbviewer.ipython.org/github/dstuebe/generators/blob/gh-pages/presentation.ipynb\n", "\n", "March 2, 2015\n", "\n", "Copyright(C) 2015, David Stuebe " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## What is a generator?\n", "___\n", "\n", "```\n", " A kind of function that can return an intermediate result (\"the next\n", " value\") to its caller, but maintaining the function's local state so\n", " that the function can be resumed again right where it left off.\n", "```\n", "PEP 255 introduced the generator object and the *yield* statement in version 2.2 of Python." ] }, { "cell_type": "code", "execution_count": 138, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def fib():\n", "    a, b = 0, 1\n", "    while True:\n", "        yield b\n", "        a, b = b, a+b" ] }, { "cell_type": "code", "execution_count": 139, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "function_result = fib()\n", "print function_result" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "The result of calling the generator function is a generator object.\n", "The generator object can be used as an iterator."
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The generator is an iterator\n", "---" ] }, { "cell_type": "code", "execution_count": 140, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 140, "metadata": {}, "output_type": "execute_result" } ], "source": [ "function_result.__iter__() is function_result" ] }, { "cell_type": "code", "execution_count": 141, "metadata": { "collapsed": false, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "[(0, 1),\n", " (1, 1),\n", " (2, 2),\n", " (3, 3),\n", " (4, 5),\n", " (5, 8),\n", " (6, 13),\n", " (7, 21),\n", " (8, 34),\n", " (9, 55)]" ] }, "execution_count": 141, "metadata": {}, "output_type": "execute_result" } ], "source": [ "zip(xrange(10),function_result)" ] }, { "cell_type": "code", "execution_count": 142, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "(10, 89)" ] }, "execution_count": 142, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(10, function_result.next())" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Inside a generator\n", "---\n", "Inside the generator we can see that execution is paused after the *yield* and state is maintained between calls to next\n" ] }, { "cell_type": "code", "execution_count": 117, "metadata": { "collapsed": false }, "outputs": [], "source": [ "def noisy_generator():\n", " print ' Initializing'\n", " print ' first yield'\n", " yield \"A\"\n", " print ' generator running...'\n", " print ' second yield'\n", " yield \"B\"\n", " print ' generator running...'\n", " print ' now what?'" ] }, { "cell_type": "code", "execution_count": 118, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Calling next...\n", " Initializing\n", " first yield\n", 
"Result 1: A\n", " generator running...\n", " second yield\n", "Result 2: B\n", " generator running...\n", " now what?\n" ] }, { "ename": "StopIteration", "evalue": "", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mStopIteration\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 3\u001b[0m \u001b[0;32mprint\u001b[0m \u001b[0;34m'Result 1: %s'\u001b[0m \u001b[0;34m%\u001b[0m \u001b[0mg\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mnext\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0;32mprint\u001b[0m \u001b[0;34m'Result 2: %s'\u001b[0m \u001b[0;34m%\u001b[0m \u001b[0mg\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mnext\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 5\u001b[0;31m \u001b[0;32mprint\u001b[0m \u001b[0;34m'Result 3: %s'\u001b[0m \u001b[0;34m%\u001b[0m \u001b[0mg\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mnext\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mStopIteration\u001b[0m: " ] } ], "source": [ "g = noisy_generator()\n", "print 'Calling next...'\n", "print 'Result 1: %s' % g.next()\n", "print 'Result 2: %s' % g.next()\n", "print 'Result 3: %s' % g.next()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Generator interface\n", "---" ] }, { "cell_type": "code", "execution_count": 181, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Help on generator object:\n", "\n", "foo = class generator(object)\n", " | Methods defined here:\n", " | \n", " | __getattribute__(...)\n", " | x.__getattribute__('name') <==> x.name\n", " | \n", " | __iter__(...)\n", " | x.__iter__() <==> iter(x)\n", " | 
\n", " | __repr__(...)\n", " | x.__repr__() <==> repr(x)\n", " | \n", " | close(...)\n", " | close() -> raise GeneratorExit inside generator.\n", " | \n", " | next(...)\n", " | x.next() -> the next value, or raise StopIteration\n", " | \n", " | send(...)\n", " | send(arg) -> send 'arg' into generator,\n", " | return next yielded value or raise StopIteration.\n", " | \n", " | throw(...)\n", " | throw(typ[,val[,tb]]) -> raise exception in generator,\n", " | return next yielded value or raise StopIteration.\n", " | \n", " | ----------------------------------------------------------------------\n", " | Data descriptors defined here:\n", " | \n", " | gi_code\n", " | \n", " | gi_frame\n", " | \n", " | gi_running\n", "\n" ] } ], "source": [ "help(g)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Generator interface: throw\n", "---" ] }, { "cell_type": "code", "execution_count": 182, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Help on built-in function throw:\n", "\n", "throw(...)\n", " throw(typ[,val[,tb]]) -> raise exception in generator,\n", " return next yielded value or raise StopIteration.\n", "\n" ] } ], "source": [ "help(g.throw)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Looks fun - lets try it..." 
] }, { "cell_type": "code", "execution_count": 183, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Initializing\n" ] }, { "data": { "text/plain": [ "'Hello'" ] }, "execution_count": 183, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def accepting_generator():\n", " print 'Initializing'\n", " try:\n", " yield 'Hello'\n", " print 'generator running...'\n", " except StandardError as e:\n", " print \"Error Message: %s\" % e\n", " yield 'World'\n", "\n", "g = accepting_generator()\n", "g.next()" ] }, { "cell_type": "code", "execution_count": 184, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Error Message: Foo bar baz\n" ] }, { "data": { "text/plain": [ "'World'" ] }, "execution_count": 184, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g.throw(StandardError, 'Foo bar baz')" ] }, { "cell_type": "code", "execution_count": 185, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "ename": "StopIteration", "evalue": "", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mStopIteration\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mg\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mnext\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mStopIteration\u001b[0m: " ] } ], "source": [ "g.next()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Generator interface: close\n", "---" ] }, { "cell_type": "code", "execution_count": 447, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Help 
on built-in function close:\n", "\n", "close(...)\n", " close() -> raise GeneratorExit inside generator.\n", "\n" ] } ], "source": [ "help(g.close)" ] }, { "cell_type": "code", "execution_count": 448, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "1" ] }, "execution_count": 448, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def closeable():\n", " try:\n", " yield 1\n", " yield 2\n", " except GeneratorExit:\n", " print 'closing'\n", "g = closeable()\n", "g.next()" ] }, { "cell_type": "code", "execution_count": 449, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "closing\n" ] } ], "source": [ "g.close()" ] }, { "cell_type": "code", "execution_count": 450, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "ename": "StopIteration", "evalue": "", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mStopIteration\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mg\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mnext\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mStopIteration\u001b[0m: " ] } ], "source": [ "g.next()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Generator interface: send\n", "---" ] }, { "cell_type": "code", "execution_count": 451, "metadata": { "collapsed": false, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Help on built-in function send:\n", "\n", "send(...)\n", " send(arg) -> send 'arg' into generator,\n", " return next yielded value or raise StopIteration.\n", "\n" ] } ], "source": [ 
"help(g.send)" ] }, { "cell_type": "code", "execution_count": 452, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "ename": "TypeError", "evalue": "can't send non-None value to a just-started generator", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0mg\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mfib\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mg\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msend\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'foo'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mTypeError\u001b[0m: can't send non-None value to a just-started generator" ] } ], "source": [ "g = fib()\n", "g.send('foo')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "so *next()* is really just *send(None)* " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Lets try that again...\n", "___" ] }, { "cell_type": "code", "execution_count": 453, "metadata": { "collapsed": false, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "1" ] }, "execution_count": 453, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g = fib()\n", "g.send(None)" ] }, { "cell_type": "code", "execution_count": 454, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "1" ] }, "execution_count": 454, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g.send(None)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ 
"What is *send('foo')* ?" ] }, { "cell_type": "code", "execution_count": 455, "metadata": { "collapsed": false, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 455, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g.send('foo')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Where did it go?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## What is generator.send?\n", "___\n", "```\n", " Coroutines are a natural way of expressing many algorithms, such as\n", " simulations, games, asynchronous I/O, and other forms of event-\n", " driven programming or co-operative multitasking. Python's generator\n", " functions are almost coroutines -- but not quite -- in that they\n", " allow pausing execution to produce a value, but do not provide for\n", " values or exceptions to be passed in when execution resumes. They\n", " also do not allow execution to be paused within the \"try\" portion of\n", " try/finally blocks, and therefore make it difficult for an aborted\n", " coroutine to clean up after itself.\n", "```\n", "PEP 342 added *close*, *send* and *throw* to the generator in version 2.5 of python and made *yield* an expresion rather than a statement.\n" ] }, { "cell_type": "code", "execution_count": 143, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "5" ] }, "execution_count": 143, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def adder(val):\n", " x = 0\n", " while True:\n", " x = yield val+x\n", " \n", "g = adder(5)\n", "g.send(None)" ] }, { "cell_type": "code", "execution_count": 144, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "7" ] }, "execution_count": 144, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g.send(2)" ] }, { "cell_type": 
"code", "execution_count": 145, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "11" ] }, "execution_count": 145, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g.send(6)\n", " " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Inside a generator part 2\n", "---" ] }, { "cell_type": "code", "execution_count": 146, "metadata": { "collapsed": true, "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "def noisy_coroutine():\n", " print ' Initializing'\n", " print ' first yield'\n", " received = yield \"A\"\n", " print ' generator running after receiving: %s' % received\n", " print ' second yield'\n", " received = yield \"B\"\n", " print ' generator running after receiving: %s' % received\n", " print ' now what?'" ] }, { "cell_type": "code", "execution_count": 147, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Calling send...\n", " Initializing\n", " first yield\n" ] }, { "data": { "text/plain": [ "'A'" ] }, "execution_count": 147, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g = noisy_coroutine()\n", "print 'Calling send...'\n", "g.send(None)" ] }, { "cell_type": "code", "execution_count": 148, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " generator running after receiving: 1\n", " second yield\n" ] }, { "data": { "text/plain": [ "'B'" ] }, "execution_count": 148, "metadata": {}, "output_type": "execute_result" } ], "source": [ "g.send(1)\n" ] }, { "cell_type": "code", "execution_count": 149, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " generator running after receiving: 2\n", " now 
what?\n" ] }, { "ename": "StopIteration", "evalue": "", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mStopIteration\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mg\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msend\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mStopIteration\u001b[0m: " ] } ], "source": [ "g.send(2)" ] }, { "cell_type": "code", "execution_count": 150, "metadata": { "collapsed": false, "slideshow": { "slide_type": "skip" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(1, 2, 3)\n" ] } ], "source": [ "def foo():\n", " x = yield\n", " y = yield\n", " z = yield\n", " yield x, y, z\n", " \n", "g = foo()\n", "g.next()\n", "g.send(1)\n", "g.send(2)\n", "print g.send(3)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Calling next to start the generator is a pain\n", "___\n", "Lets fix that with a decorator..." 
] }, { "cell_type": "code", "execution_count": 81, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def consumer(func):\n", "    def wrapper(*args,**kw):\n", "        gen = func(*args, **kw)\n", "        gen.next()\n", "        return gen\n", "    wrapper.__name__ = func.__name__\n", "    wrapper.__doc__ = func.__doc__\n", "    return wrapper" ] }, { "cell_type": "code", "execution_count": 82, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "15\n", "6\n" ] } ], "source": [ "g = consumer(adder)(4)\n", "print g.send(11)\n", "print g.send(2)" ] }, { "cell_type": "code", "execution_count": 83, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "7" ] }, "execution_count": 83, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@consumer\n", "def subtractor(val):\n", "    '''A generator that subtracts numbers from a value'''\n", "    x = 0\n", "    while True:\n", "        x = yield x - val\n", "\n", "g = subtractor(8)\n", "g.send(15)\n" ] }, { "cell_type": "code", "execution_count": 84, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Help on function subtractor in module __main__:\n", "\n", "subtractor(*args, **kw)\n", " A generator that subtracts numbers from a value\n", "\n" ] } ], "source": [ "help(subtractor)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Let's try to do something interesting\n", "___\n", "So far we have looked at simple examples, but we can use generators for:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Iteration" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Data flow" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Concurrency" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* 
Reformulate control flow" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Iteration (a la David Beazley)\n", "___\n", "An example of setting up a processing pipeline with generators" ] }, { "cell_type": "code", "execution_count": 418, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "217.237.150.206 - - [02/Apr/2015:00:47:42 -0600] \"GET /python/tutorial/beazley_advanced_python/Slides/SLIDE060.HTM HTTP/1.0\" 200 1686\n", "129.106.32.126 - - [02/Apr/2015:00:47:43 -0600] \"GET /python/tutorial/beazley_advanced_python/Slides/SLIDE006.HTM HTTP/1.0\" 200 1254\n", "66.91.239.214 - - [02/Apr/2015:00:47:44 -0600] \"GET /python/tutorial/beazley_advanced_python/Slides/SLIDE014.HTM HTTP/1.0\" 200 1232\n", "217.219.18.80 - - [02/Apr/2015:00:47:46 -0600] \"GET /python/tutorial/beazley_advanced_python/Slides/SLIDE090.HTM HTTP/1.0\" 200 2001\n" ] }, { "ename": "KeyboardInterrupt", "evalue": "", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mKeyboardInterrupt\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 21\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 22\u001b[0m \u001b[0;31m# Pull results out of the processing pipeline\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 23\u001b[0;31m \u001b[0;32mfor\u001b[0m \u001b[0mline\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mpylines\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 24\u001b[0m \u001b[0;32mprint\u001b[0m \u001b[0mline\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m\u001b[0m in \u001b[0;36mgrep\u001b[0;34m(pattern, lines)\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;31m# From 
http://www.dabeaz.com/coroutines/pipeline.py\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mgrep\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mpattern\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0mlines\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0;32mfor\u001b[0m \u001b[0mline\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mlines\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 4\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mpattern\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mline\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0;32myield\u001b[0m \u001b[0mline\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m\u001b[0m in \u001b[0;36mfollow\u001b[0;34m(thefile)\u001b[0m\n\u001b[1;32m 11\u001b[0m \u001b[0mline\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mthefile\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mreadline\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 12\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0mline\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 13\u001b[0;31m \u001b[0mtime\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msleep\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m0.1\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# Sleep briefly\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 14\u001b[0m \u001b[0;32mcontinue\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 15\u001b[0m \u001b[0;32myield\u001b[0m \u001b[0mline\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mKeyboardInterrupt\u001b[0m: " ] } ], "source": [ "# From http://www.dabeaz.com/coroutines/pipeline.py\n", "def grep(pattern,lines):\n", " for line in lines:\n", " if pattern in line:\n", " yield line\n", "\n", "import time\n", "def follow(thefile):\n", " thefile.seek(0,2) # Go to 
the end of the file\n", " while True:\n", " line = thefile.readline()\n", " if not line:\n", " time.sleep(0.1) # Sleep briefly\n", " continue\n", " yield line\n", "\n", "# Set up a processing pipe : tail -f | grep python\n", "with open(\"access-log\") as logfile:\n", " loglines = follow(logfile)\n", " pylines = grep(\"python\",loglines)\n", "\n", " # Pull results out of the processing pipeline\n", " for line in pylines:\n", " print line," ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Data Flow (a la David Beazley)\n", "___\n", "An example of setting up a similar pipeline with coroutines" ] }, { "cell_type": "code", "execution_count": 423, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Looking for python\n", "66.249.65.37 - - [02/Apr/2015:00:51:57 -0600] \"GET /papers/python.html HTTP/1.1\" 404 133\n", "128.135.125.245 - - [02/Apr/2015:00:51:58 -0600] \"GET /python/tutorial/beazley_intro_python/intropy.pdf HTTP/1.0\" 304 -\n", "194.105.57.11 - - [02/Apr/2015:00:52:01 -0600] \"GET /python/tutorial/beazley_advanced_python/Slides/SLIDE002.HTM HTTP/1.0\" 200 1352\n", "189.141.19.88 - - [02/Apr/2015:00:52:03 -0600] \"GET /python/tutorial/beazley_advanced_python/Slides/SLIDE096.HTM HTTP/1.0\" 200 1671\n", "123.190.193.8 - - [02/Apr/2015:00:52:03 -0600] \"GET /python/tutorial/beazley_advanced_python/Slides/SLIDE059.HTM HTTP/1.0\" 200 1694\n" ] }, { "ename": "KeyboardInterrupt", "evalue": "", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mKeyboardInterrupt\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 24\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 25\u001b[0m \u001b[0;32mwith\u001b[0m 
\u001b[0mopen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"access-log\"\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0mlogfile\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 26\u001b[0;31m \u001b[0mfollow\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlogfile\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0mgrep\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'python'\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0mprinter\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m)\u001b[0m \u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;32m\u001b[0m in \u001b[0;36mfollow\u001b[0;34m(thefile, target)\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0mline\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mthefile\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mreadline\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 6\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0mline\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 7\u001b[0;31m \u001b[0mtime\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msleep\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m0.1\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;31m# Sleep briefly\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 8\u001b[0m \u001b[0;32mcontinue\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 9\u001b[0m \u001b[0mtarget\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msend\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mline\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mKeyboardInterrupt\u001b[0m: " ] } ], "source": [ "import time\n", "def follow(thefile, target):\n", " thefile.seek(0,2) # Go to the end of the file\n", " while True:\n", " line = thefile.readline()\n", " if not line:\n", " time.sleep(0.1) # Sleep briefly\n", " continue\n", " target.send(line)\n", "\n", "@consumer\n", "def grep(pattern, target):\n", " print \"Looking for %s\" % pattern\n", " while True:\n", " line = 
(yield)\n", " if pattern in line:\n", " target.send(line),\n", "\n", "@consumer\n", "def printer():\n", " while True:\n", " line = (yield)\n", " print line,\n", "\n", "with open(\"access-log\") as logfile:\n", " follow(logfile, grep('python',printer() ) )" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Are generators fast for data flow?\n", "___\n", "Let's start with Beazley's benchmark example" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "coroutine: 0.326890945435\n", "object: 0.381680011749\n" ] } ], "source": [ "# An object\n", "class GrepHandler(object):\n", "    def __init__(self,pattern, target):\n", "        self.pattern = pattern\n", "        self.target = target\n", "    def send(self,line):\n", "        if self.pattern in line:\n", "            self.target.send(line)\n", "\n", "# A coroutine\n", "@consumer\n", "def grep(pattern,target):\n", "    while True:\n", "        line = (yield)\n", "        if pattern in line:\n", "            target.send(line)\n", "\n", "# A null-sink to send data\n", "@consumer\n", "def null(): \n", "    while True: item = (yield)\n", "\n", "# A benchmark\n", "line = 'python is nice'\n", "p1 = grep('python',null()) # Coroutine\n", "p2 = GrepHandler('python',null()) # Object\n", "\n", "from timeit import timeit\n", "\n", "print \"coroutine:\", timeit(\"p1.send(line)\", \"from __main__ import line, p1\")\n", "\n", "print \"object:\", timeit(\"p2.send(line)\", \"from __main__ import line, p2\")" ] }, { "cell_type": "code", "execution_count": 114, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The slowest run took 9.04 times longer than the fastest. 
This could mean that an intermediate result is being cached \n", "1000000 loops, best of 3: 317 ns per loop\n" ] } ], "source": [ "%timeit p1.send(line)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Beware: the IPython magic *%timeit* does not play nicely with generator send calls!" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Let's try a problem that calls send a few more times...\n", "___\n", "The idea for this benchmark comes from a Stack Overflow question, a-faster-nested-tuple-to-list-and-back\n", "\n", "```\n", "I'm trying to perform tuple to list and list to tuple conversions on nested sequences of unknown depth and shape. The calls are being made hundreds of thousands of times, which is why I'm trying to squeeze out as much speed as possible.\n", "```\n", "\n", "First, let's define a function to make some test data...\n" ] }, { "cell_type": "code", "execution_count": 153, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "[[[0, 1], [[[0, 1], [[[0, 1], []]]], [[0, 1], [[[0, 1], []]]]]],\n", " [[0, 1], [[[0, 1], [[[0, 1], []]]], [[0, 1], [[[0, 1], []]]]]],\n", " [[0, 1], [[[0, 1], [[[0, 1], []]]], [[0, 1], [[[0, 1], []]]]]]]" ] }, "execution_count": 153, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def make_test(m, n):\n", "    return [[range(m), make_test(m, n-1)] for i in range(n)]\n", "make_test(2,3)" ] }, { "cell_type": "code", "execution_count": 154, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "(((0, 1), (((0, 1), (((0, 1), ()),)), ((0, 1), (((0, 1), ()),)))),\n", " ((0, 1), (((0, 1), (((0, 1), ()),)), ((0, 1), (((0, 1), ()),)))),\n", " ((0, 1), (((0, 1), (((0, 1), ()),)), ((0, 1), (((0, 1), ()),)))))" ] }, "execution_count": 154, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def list2tuple(a):\n", " return tuple((list2tuple(x) if isinstance(x, list) else x for x in a))\n", "\n", "def 
tuple2list(a):\n", " return list((tuple2list(x) if isinstance(x, tuple) else x for x in a))\n", "\n", "list2tuple(make_test(2,3))" ] }, { "cell_type": "code", "execution_count": 155, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 155, "metadata": {}, "output_type": "execute_result" } ], "source": [ "make_test(2,3) == tuple2list(list2tuple(make_test(2,3)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "From HYRY's answer" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "##Now lets try a solution using coroutines\n", "___" ] }, { "cell_type": "code", "execution_count": 156, "metadata": { "collapsed": false }, "outputs": [ { "ename": "ValueError", "evalue": "generator already executing", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 8\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 9\u001b[0m \u001b[0ml2t\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mcolist2tuple\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 10\u001b[0;31m \u001b[0ml2t\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msend\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mmake_test\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m3\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0ml2t\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;32m\u001b[0m in \u001b[0;36mcolist2tuple\u001b[0;34m()\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0;32mwhile\u001b[0m \u001b[0mTrue\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 6\u001b[0m 
\u001b[0mlst\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0ml2t\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32myield\u001b[0m \u001b[0mresult\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 7\u001b[0;31m \u001b[0mresult\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtuple\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ml2t\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msend\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0ml2t\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0mlist\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32melse\u001b[0m \u001b[0mx\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0mx\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mlst\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 8\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 9\u001b[0m \u001b[0ml2t\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mcolist2tuple\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m((x,))\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0;32mwhile\u001b[0m \u001b[0mTrue\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 6\u001b[0m \u001b[0mlst\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0ml2t\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;32myield\u001b[0m \u001b[0mresult\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 7\u001b[0;31m \u001b[0mresult\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtuple\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ml2t\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msend\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0ml2t\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mif\u001b[0m 
\u001b[0misinstance\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0mlist\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32melse\u001b[0m \u001b[0mx\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0mx\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mlst\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 8\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 9\u001b[0m \u001b[0ml2t\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mcolist2tuple\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mValueError\u001b[0m: generator already executing" ] } ], "source": [ "@consumer\n", "def colist2tuple():\n", " \"\"\"Convertes lists to tuples\"\"\"\n", " result = []\n", " while True:\n", " lst, l2t = yield result\n", " result = tuple((l2t.send((x, l2t)) if isinstance(x,list) else x for x in lst) )\n", " \n", "l2t = colist2tuple()\n", "l2t.send((make_test(2,3),l2t))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Generators are not recursive..." 
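, "\n",
"\n",
"A minimal sketch of what goes wrong above, assuming only the standard library: a generator that is currently running cannot be resumed again, so a `send` issued from inside its own body raises `ValueError`. The name `self_sender` is hypothetical and is not part of the talk's code.\n",
"\n",
"```python\n",
"def self_sender():\n",
"    # Hypothetical helper: tries to resume itself from inside its own body.\n",
"    me = yield 'ready'  # the caller sends in a reference to this generator\n",
"    try:\n",
"        me.send('again')  # a running generator cannot be re-entered\n",
"    except ValueError as err:\n",
"        yield 'ValueError: %s' % err\n",
"\n",
"gen = self_sender()\n",
"next(gen)  # prime: run to the first yield\n",
"message = gen.send(gen)  # hand the generator a reference to itself\n",
"print(message)  # ValueError: generator already executing\n",
"```\n",
"\n",
"Each level of nesting therefore needs its own generator object."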
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Try a worker pool of generators\n", "___" ] }, { "cell_type": "code", "execution_count": 157, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 157, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@consumer\n", "def colist2tuple():\n", "    \"\"\"Converts lists to tuples\"\"\"\n", "    result = []\n", "    while True:\n", "        (lst, pool) = yield result\n", "        result = tuple((pool[0].send((x, pool[1:])) if isinstance(x,list) else x for x in lst) )\n", "\n", "@consumer\n", "def cotuple2list():\n", "    \"\"\"Converts tuples to lists\"\"\"\n", "    result = None\n", "    while True:\n", "        (tup, pool) = (yield result)\n", "        result = list((pool[0].send((x, pool[1:])) if isinstance(x, tuple) else x for x in tup))\n", "\n", "class GenPool:\n", "    def __init__(self, gen_func, depth):\n", "        self.pool = [gen_func() for i in xrange(depth) ]\n", "    def send(self,iterable):\n", "        return self.pool[0].send((iterable, self.pool[1:]))\n", "\n", "l2t_pool = GenPool(colist2tuple,5)\n", "t2l_pool = GenPool(cotuple2list,5)\n", "\n", "make_test(3,2) == t2l_pool.send(l2t_pool.send(make_test(3,2)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "From my answer" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## How about doing it in place for the lists?\n", "___" ] }, { "cell_type": "code", "execution_count": 158, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 158, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def inplace_list2tuple(a):\n", "    for (i,x) in enumerate(a):\n", "        if isinstance(x,list):\n", "            a[i] = inplace_list2tuple(x)\n", "    return tuple(a)\n", "\n", "def inplace_tuple2list(a):\n", "    a = list(a) # can't really modify a tuple in place...\n", "    for (i,x) in enumerate(a):\n", "        if isinstance(x,tuple):\n", "            a[i] = inplace_tuple2list(x)\n", "    return a\n", "\n", "make_test(2,3) == inplace_tuple2list(inplace_list2tuple(make_test(2,3)))\n" ] }, { "cell_type": "code", "execution_count": 166, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 166, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@consumer\n", "def inplace_colist2tuple():\n", "    \"\"\"Converts lists to tuples\"\"\"\n", "    result = None\n", "    while True:\n", "        (lst, co_pool) = (yield result)\n", "        for (i,x) in enumerate(lst):\n", "            if isinstance(x,list):\n", "                lst[i] = co_pool[0].send((x, co_pool[1:]))\n", "        result = tuple(lst)\n", "\n", "@consumer\n", "def inplace_cotuple2list():\n", "    \"\"\"Converts tuples to lists\"\"\"\n", "    result = None\n", "    while True:\n", "        (tup, co_pool) = (yield result)\n", "        result = list(tup)\n", "        for (i,x) in enumerate(result):\n", "            if isinstance(x,tuple):\n", "                result[i] = co_pool[0].send((x, co_pool[1:]))\n", "\n", "l2t_pool = GenPool(inplace_colist2tuple,5)\n", "t2l_pool = GenPool(inplace_cotuple2list,5)\n", "\n", "make_test(3,2) == t2l_pool.send(l2t_pool.send(make_test(3,2)))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Let's timeit!\n", "___" ] }, { "cell_type": "code", "execution_count": 168, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import timeit\n", "breadth,depth = 25,8\n", "number,repeat = 2,3\n", "l2t_pool = GenPool(colist2tuple,breadth)\n", "t2l_pool = GenPool(cotuple2list,breadth)\n", "inplace_l2t_pool = GenPool(inplace_colist2tuple,breadth)\n", "inplace_t2l_pool = GenPool(inplace_cotuple2list,breadth) " ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Generator ['3.18855', '3.21685', '2.94067']\n", "Recursive
['2.46753', '2.82640', '3.27728']\n" ] } ], "source": [ "# Compare round trip operations\n", "print \"Generator %s\" % [\"%0.5f\" % (v/number) for v in timeit.repeat('t2l_pool.send(l2t_pool.send(test_data))', setup='from __main__ import t2l_pool, l2t_pool, make_test, depth, breadth; test_data = make_test(breadth, depth)', number=number, repeat=repeat)]\n", "print \"Recursive %s\" % [\"%0.5f\" % (v/number) for v in timeit.repeat('tuple2list(list2tuple(test_data))', setup='from __main__ import tuple2list, list2tuple, make_test, depth, breadth; test_data = make_test(breadth, depth)', number=number, repeat=repeat)]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Lets timeit!\n", "___" ] }, { "cell_type": "code", "execution_count": 41, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Generator ['1.54234', '1.54228', '1.51895']\n", "Recursive ['1.38003', '1.37990', '1.37279']\n" ] } ], "source": [ "# Compare round trip operations using inplace for list to tuple\n", "print \"Generator %s\" % [\"%0.5f\" % (v/number) for v in timeit.repeat('inplace_t2l_pool.send(inplace_l2t_pool.send(test_data))', setup='from __main__ import inplace_l2t_pool, inplace_t2l_pool, make_test, depth, breadth; test_data = make_test(breadth, depth)', number=number, repeat=repeat)]\n", "print \"Recursive %s\" % [\"%0.5f\" % (v/number) for v in timeit.repeat('inplace_tuple2list(inplace_list2tuple(test_data))', setup='from __main__ import inplace_tuple2list, inplace_list2tuple, make_test, depth, breadth; test_data = make_test(breadth, depth)', number=number, repeat=repeat)]\n" ] }, { "cell_type": "code", "execution_count": 42, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Generator ['0.51324', '0.51948', '0.50737']\n", "Recursive ['0.44689', 
'0.45245', '0.44493']\n" ] } ], "source": [ "# Compare just list2tuple using inplace\n", "print \"Generator %s\" % [\"%0.5f\" % (v/number) for v in timeit.repeat('inplace_l2t_pool.send(test_data)', setup='from __main__ import inplace_l2t_pool, make_test, depth, breadth; test_data = make_test(breadth, depth)', number=number, repeat=repeat)]\n", "print \"Recursive %s\" % [\"%0.5f\" % (v/number) for v in timeit.repeat('inplace_list2tuple(test_data)', setup='from __main__ import inplace_list2tuple, make_test, depth, breadth; test_data = make_test(breadth, depth)', number=number, repeat=repeat)]\n" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true, "slideshow": { "slide_type": "fragment" } }, "source": [ "Generators are not magic: a function call is still a function call, and the GIL still serializes execution." ] }, { "cell_type": "code", "execution_count": 146, "metadata": { "collapsed": false, "slideshow": { "slide_type": "skip" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1 loops, best of 3: 305 ms per loop\n", "1 loops, best of 3: 271 ms per loop\n" ] } ], "source": [ "test_data = make_test(breadth,depth)\n", "# Coroutine\n", "%timeit t2l_pool.send(l2t_pool.send(test_data))\n", "# Recursive\n", "%timeit tuple2list(list2tuple(test_data))" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "collapsed": false, "slideshow": { "slide_type": "skip" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "100 loops, best of 3: 15.3 ms per loop\n", "100 loops, best of 3: 13.5 ms per loop\n" ] } ], "source": [ "inplace_l2t_pool = GenPool(inplace_colist2tuple,breadth)\n", "inplace_t2l_pool = GenPool(inplace_cotuple2list,breadth) \n", "test_data = make_test(breadth,depth)\n", "%timeit inplace_t2l_pool.send(inplace_l2t_pool.send(test_data))\n", "test_data = make_test(breadth,depth)\n", "%timeit inplace_tuple2list(inplace_list2tuple(test_data))" ] }, { "cell_type": "code", "execution_count": 156,
"metadata": { "collapsed": false, "slideshow": { "slide_type": "skip" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The slowest run took 51639.38 times longer than the fastest. This could mean that an intermediate result is being cached \n", "100000 loops, best of 3: 2.63 µs per loop\n", "The slowest run took 56308.31 times longer than the fastest. This could mean that an intermediate result is being cached \n", "100000 loops, best of 3: 2.17 µs per loop\n" ] } ], "source": [ "# Generator\n", "test_data = make_test(breadth,depth)\n", "%timeit inplace_l2t_pool.send(test_data)\n", "# Recursive\n", "test_data = make_test(breadth,depth)\n", "%timeit inplace_list2tuple(test_data)\n" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "source": [ "## Lets try HYRY's Cython\n", "___" ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import cython\n", "cython_list2tuple, cython_tuple2list = cython.inline(\n", " \"\"\"\n", " @cython.profile(True)\n", " def cython_list2tuple(a):\n", " return tuple([cython_list2tuple(x) if type(x)==list else x for x in a])\n", "\n", " @cython.profile(True)\n", " def cython_tuple2list(a):\n", " return [cython_tuple2list(x) if type(x)==tuple else x for x in a]\n", " \"\"\"\n", " ).values() # it returns a dict of named functions\n", "\n", "make_test(3,2) == cython_tuple2list(cython_list2tuple(make_test(3,2)))" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Cython: ['0.40505', '0.38909', '0.40008']\n" ] } ], "source": [ "print \"Cython: %s\" % [\"%0.5f\" % (v/number) for v in 
timeit.repeat('cython_tuple2list(cython_list2tuple(t))', setup='from __main__ import cython_list2tuple, cython_tuple2list, make_test, depth, breadth; t = make_test(breadth, depth)', number=number, repeat=repeat)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As expected, cython is blazing fast" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Lets try cProfile\n", "___" ] }, { "cell_type": "code", "execution_count": 163, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import cProfile" ] }, { "cell_type": "code", "execution_count": 164, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 13590406 function calls (6137622 primitive calls) in 5.433 seconds\n", "\n", " Ordered by: standard name\n", "\n", " ncalls tottime percall cumtime percall filename:lineno(function)\n", " 328801/1 0.942 0.000 2.520 2.520 :1(list2tuple)\n", "3397601/9 1.042 0.000 2.520 0.280 :2()\n", " 328801/1 1.159 0.000 2.853 2.853 :4(tuple2list)\n", "3397601/9 1.121 0.000 2.853 0.317 :5()\n", " 1 0.060 0.060 5.433 5.433 :1()\n", " 6137600 1.109 0.000 1.109 0.000 {isinstance}\n", " 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}\n", "\n", "\n" ] } ], "source": [ "test_data = make_test(breadth, depth)\n", "cProfile.run('tuple2list(list2tuple(test_data));')" ] }, { "cell_type": "code", "execution_count": 165, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 6795204 function calls (6137604 primitive calls) in 3.892 seconds\n", "\n", " Ordered by: standard name\n", "\n", " ncalls tottime percall cumtime percall filename:lineno(function)\n", " 328801/1 1.019 0.000 1.614 1.614 :1(inplace_list2tuple)\n", " 328801/1 1.684 0.000 2.228 2.228 :7(inplace_tuple2list)\n", " 1 0.050 0.050 3.892 3.892 :1()\n", " 6137600 1.140 0.000 1.140 0.000 {isinstance}\n", " 1 0.000 0.000 0.000 0.000 
{method 'disable' of '_lsprof.Profiler' objects}\n", "\n", "\n" ] } ], "source": [ "test_data = make_test(breadth, depth)\n", "cProfile.run('inplace_tuple2list(inplace_list2tuple(test_data));')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Lets try cProfile\n", "___" ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 657604 function calls (4 primitive calls) in 1.422 seconds\n", "\n", " Ordered by: standard name\n", "\n", " ncalls tottime percall cumtime percall filename:lineno(function)\n", " 1 0.053 0.053 1.422 1.422 :1()\n", " 328801/1 0.689 0.000 0.689 0.689 _cython_inline_2b6dbefcfcbecd965f1c99dd441375e0.pyx:12(cython_tuple2list)\n", " 328801/1 0.680 0.000 0.680 0.680 _cython_inline_2b6dbefcfcbecd965f1c99dd441375e0.pyx:8(cython_list2tuple)\n", " 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}\n", "\n", "\n" ] } ], "source": [ "test_data = make_test(breadth, depth)\n", "cProfile.run('cython_tuple2list(cython_list2tuple(test_data));')\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Lets try cProfile\n", "___" ] }, { "cell_type": "code", "execution_count": 169, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 14248010 function calls (6137626 primitive calls) in 5.771 seconds\n", "\n", " Ordered by: standard name\n", "\n", " ncalls tottime percall cumtime percall filename:lineno(function)\n", " 328801/1 0.804 0.000 2.583 2.583 :1(colist2tuple)\n", "3397601/9 1.090 0.000 3.188 0.354 :15()\n", " 2 0.000 0.000 5.771 2.886 :20(send)\n", "3397601/9 1.147 0.000 2.583 0.287 :7()\n", " 328801/1 1.505 0.000 3.188 3.188 :9(cotuple2list)\n", " 1 0.000 0.000 5.771 5.771 :1()\n", " 6137600 1.098 0.000 1.098 0.000 {isinstance}\n", " 1 0.000 0.000 0.000 0.000 {method 
'disable' of '_lsprof.Profiler' objects}\n", " 657602/2 0.128 0.000 5.771 2.886 {method 'send' of 'generator' objects}\n", "\n", "\n" ] } ], "source": [ "test_data = make_test(breadth, depth)\n", "cProfile.run('t2l_pool.send(l2t_pool.send(test_data))')" ] }, { "cell_type": "code", "execution_count": 171, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 7452808 function calls (6137608 primitive calls) in 3.864 seconds\n", "\n", " Ordered by: standard name\n", "\n", " ncalls tottime percall cumtime percall filename:lineno(function)\n", " 2 0.000 0.000 3.864 1.932 :20(send)\n", " 328801/1 1.060 0.000 1.665 1.665 :1(inplace_colist2tuple)\n", " 328801/1 1.594 0.000 2.199 2.199 :12(inplace_cotuple2list)\n", " 1 0.000 0.000 3.864 3.864 :1()\n", " 6137600 1.102 0.000 1.102 0.000 {isinstance}\n", " 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}\n", " 657602/2 0.108 0.000 3.864 1.932 {method 'send' of 'generator' objects}\n", "\n", "\n" ] } ], "source": [ "test_data = make_test(breadth, depth)\n", "cProfile.run('inplace_t2l_pool.send(inplace_l2t_pool.send(test_data))')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Lets play with dot product\n", "___\n" ] }, { "cell_type": "code", "execution_count": 174, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import numpy\n", "import itertools\n", "# make some data\n", "n = 1000000\n", "a = numpy.random.randn(n)\n", "b = numpy.random.randn(n)\n", "number,repeat = 2,3" ] }, { "cell_type": "code", "execution_count": 175, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Numpy Dot Product\n", "Value 731.284297837\n", "Time ['0.00087', '0.00082', '0.00081']\n" ] } ], "source": [ "print \"Numpy Dot Product\" \n", "print \"Value %s\" % numpy.dot(a,b)\n", "print \"Time %s\" % [\"%0.5f\" % 
(v/number) for v in timeit.repeat('numpy.dot(a,b)', setup='from __main__ import numpy, a, b', number=number, repeat=repeat)]" ] }, { "cell_type": "code", "execution_count": 177, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Naive Dot Product\n", "Value: 731.284297837\n", "Time: ['0.32970', '0.33121', '0.33032']\n" ] } ], "source": [ "def naive_loop(it):\n", "    result = numpy.float64(0.0)\n", "    for (a_val,b_val) in it:\n", "        result += a_val*b_val\n", "    return result\n", "print \"Naive Dot Product\"\n", "print \"Value: %s\" % naive_loop(numpy.nditer([a, b]))\n", "print \"Time: %s\" % [\"%0.5f\" % (v/number) for v in timeit.repeat('naive_loop(it)', setup='from __main__ import numpy, naive_loop, a, b; it = numpy.nditer([a, b])', number=number, repeat=repeat)]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Coroutine Dot Product\n", "___" ] }, { "cell_type": "code", "execution_count": 176, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Coroutine Dot Product\n", "Value: 731.284297837\n", "Time ['0.49758', '0.49486', '0.50892']\n" ] } ], "source": [ "@consumer\n", "def mult(target=None):\n", "    result = None\n", "    while True:\n", "        (a, b) = (yield result)\n", "        result = target.send(a*b)\n", "\n", "@consumer\n", "def add():\n", "    result = numpy.float64(0.0)\n", "    while True:\n", "        m = (yield result)\n", "        result += m\n", "\n", "dot_product_process = mult(add())\n", "\n", "def gen_loop(it):\n", "    dot_product = None\n", "    for a_val,b_val in it:\n", "        dot_product = dot_product_process.send((a_val, b_val))\n", "    return dot_product\n", "\n", "print \"Coroutine Dot Product\"\n", "print \"Value: %s\" % gen_loop(numpy.nditer([a, b]))\n", "print \"Time %s\" % [\"%0.5f\" % (v/number) for v in timeit.repeat('gen_loop(it)', setup='from __main__ import numpy, gen_loop, a, b; it = numpy.nditer([a, b])',
number=number, repeat=repeat)]" ] }, { "cell_type": "code", "execution_count": 187, "metadata": { "collapsed": false, "slideshow": { "slide_type": "skip" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Coroutine Dot Product:\n", "Value: [ 731.28429784]\n", "Time ['0.59082', '0.57764', '0.61219']\n" ] } ], "source": [ "# Coroutine processes without using return...\n", "@consumer\n", "def mult_allocated_result(target):\n", " while True:\n", " (a, b) = (yield)\n", " result = target.send(a*b)\n", " \n", "@consumer\n", "def add_allocated_result(result):\n", " while True:\n", " m = (yield)\n", " result[0] += m\n", " \n", "result = numpy.zeros((1,))\n", "dot_product_process_noresult = mult_allocated_result(add_allocated_result(result))\n", " \n", "def allocated_result_loop(it):\n", " dot_product = None\n", " for a_val,b_val in it: \n", " dot_product_process_noresult.send((a_val, b_val))\n", "\n", "print \"Coroutine Dot Product:\" \n", "allocated_result_loop(numpy.nditer([a, b]))\n", "print \"Value: %s\" % result\n", "print \"Time %s\" % [\"%0.5f\" % (v/number) for v in timeit.repeat('allocated_result_loop(it)', setup='from __main__ import numpy, allocated_result_loop, a, b; it = numpy.nditer([a, b])', number=number, repeat=repeat)]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## Try cProfile\n", "___" ] }, { "cell_type": "code", "execution_count": 77, "metadata": { "collapsed": true }, "outputs": [], "source": [ "n = 100000\n", "a = numpy.random.randn(n)\n", "b = numpy.random.randn(n)" ] }, { "cell_type": "code", "execution_count": 78, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 400003 function calls (300003 primitive calls) in 0.205 seconds\n", "\n", " Ordered by: standard name\n", "\n", " ncalls tottime percall cumtime percall filename:lineno(function)\n", " 100000 0.091 0.000 0.138 0.000 :1(mult)\n", " 1 0.046 0.046 0.205 
0.205 :17(gen_loop)\n", " 100000 0.026 0.000 0.026 0.000 :8(add)\n", " 1 0.000 0.000 0.205 0.205 :1()\n", " 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}\n", "200000/100000 0.042 0.000 0.159 0.000 {method 'send' of 'generator' objects}\n", "\n", "\n" ] } ], "source": [ "cProfile.run('gen_loop(numpy.nditer([a, b]))')" ] }, { "cell_type": "code", "execution_count": 79, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 3 function calls in 19.887 seconds\n", "\n", " Ordered by: standard name\n", "\n", " ncalls tottime percall cumtime percall filename:lineno(function)\n", " 1 19.887 19.887 19.887 19.887 :1(naive_loop)\n", " 1 0.000 0.000 19.887 19.887 :1()\n", " 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}\n", "\n", "\n" ] } ], "source": [ "cProfile.run('naive_loop(numpy.nditer([a, b]))')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Unfortunately, we don't get much insight into why the naive loop is so slow."
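, "\n",
"\n",
"Part of the reason the report is so short can be sketched as follows, assuming only the standard `cProfile` and `pstats` modules: cProfile attributes time per function call, so a loop that does all of its work inline collapses into a single row. The helper names (`square`, `inline_loop`, `calling_loop`, `profiled_calls`) are hypothetical, and reading `Stats.stats` relies on a long-standing but lightly documented attribute. This shows only why the profile has one row, not why the loop itself is slow.\n",
"\n",
"```python\n",
"import cProfile\n",
"import pstats\n",
"\n",
"def square(v):\n",
"    return v * v\n",
"\n",
"def inline_loop(values):\n",
"    # All work happens inline: no Python-level calls for the profiler to record.\n",
"    total = 0.0\n",
"    for v in values:\n",
"        total += v * v\n",
"    return total\n",
"\n",
"def calling_loop(values):\n",
"    # Same arithmetic, but each multiply is a separate, countable function call.\n",
"    total = 0.0\n",
"    for v in values:\n",
"        total += square(v)\n",
"    return total\n",
"\n",
"def profiled_calls(func, values):\n",
"    # Return {function name: number of calls} as recorded by cProfile.\n",
"    profiler = cProfile.Profile()\n",
"    profiler.runcall(func, values)\n",
"    stats = pstats.Stats(profiler).stats  # keyed by (filename, lineno, name)\n",
"    return dict((name, calls) for (_, _, name), (_, calls, _, _, _) in stats.items())\n",
"\n",
"values = list(range(1000))\n",
"inline = profiled_calls(inline_loop, values)\n",
"calling = profiled_calls(calling_loop, values)\n",
"print('square' in inline)  # False: nothing inside the loop is a recorded call\n",
"print(calling['square'])   # 1000: one recorded call per element\n",
"```"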
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## More Data Flow\n", "___" ] }, { "cell_type": "code", "execution_count": 95, "metadata": { "collapsed": true }, "outputs": [], "source": [ "TARGETS = 'targets'\n", "class AnalysisWindowComplete(Exception):\n", "    \"\"\"Used in coroutine control flow, this is not an error condition.\"\"\"\n", "\n", "@consumer\n", "def myprinter(name):\n", "    while True:\n", "        p = (yield)\n", "        print \" PrinterName: %s; says: %s\" % (name, p)\n", "\n" ] }, { "cell_type": "code", "execution_count": 104, "metadata": { "collapsed": true, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "@consumer\n", "def average(targets):\n", "    try:\n", "        while True:\n", "            cnt, result = 0, 0.0\n", "            try:\n", "\n", "                while True:\n", "                    result += (yield)\n", "                    cnt += 1\n", "\n", "            except AnalysisWindowComplete as wc:\n", "                print ' In complete with:', wc\n", "                for target in targets:\n", "                    target.send(result/cnt)\n", "\n", "    except (ValueError, IndexError) as e:\n", "        raise\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Take the average of a few values\n", "___" ] }, { "cell_type": "code", "execution_count": 105, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Call Complete\n", " In complete with: foobar\n", " printer #1 says: 4.75\n" ] } ], "source": [ "avg_co = average([myprinter('#1'),] )\n", "\n", "avg_co.send(5.)\n", "avg_co.send(5.)\n", "avg_co.send(5.)\n", "avg_co.send(4.)\n", "print \"Call Complete\"\n", "avg_co.throw(AnalysisWindowComplete,'foobar')\n", "\n" ] }, { "cell_type": "code", "execution_count": 106, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def get_targets(co):\n", "    '''Get the targets from the generator's frame locals'''\n", "    try:\n", "        return co.gi_frame.f_locals[TARGETS]\n", "    except KeyError as ke:\n", "        raise KeyError('No target key found?')\n", "\n", "def set_targets(co, targets):\n", "    \"\"\"Set new targets after the generator has started!\"\"\"\n", "    t = get_targets(co)\n", "    while len(t) > 0:\n", "        t.pop()\n", "\n", "    for target in targets:\n", "        t.append(target)\n", "\n" ] }, { "cell_type": "code", "execution_count": 108, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " In complete with: Print twice!\n", " printer #2 says: 6.33333333333\n", " printer #3 says: 6.33333333333\n" ] } ], "source": [ "\n", "# Now continue using the same coroutine workflow for next analysis\n", "avg_co.send(6)\n", "avg_co.send(6)\n", "avg_co.send(7)\n", "\n", "# change where things go...\n", "set_targets(avg_co,(myprinter('#2'),myprinter('#3')) )\n", "\n", "avg_co.throw(AnalysisWindowComplete,'Print twice!')\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Data Processing Chains\n", "___\n", "\n", "Generators can be used to build some really powerful data processing flows:\n", "* The lines before the first yield are valuable\n", "* Reflection and reasoning on the processing chain is possible\n", "* Pass by reference and allocated numpy types are your friends\n", "* Combine with MPI4Py to make parallel applications\n", "\n", "Pitfalls\n", "* Code is generally not pretty\n", "* Beware of tight loops\n", "* Count function calls and object initialization\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Further reading: David M.
Beazley\n", "\n", "With lots more pictures than this talk...\n", "\n", "### Generator Tricks for Systems Programmers\n", "http://www.dabeaz.com/generators-uk/\n", "\n", "\n", "### A Curious Course on Coroutines and Concurrency\n", "http://www.dabeaz.com/coroutines/\n", "\n", " \n", "### Generators: The Final Frontier\n", "http://www.dabeaz.com/finalgenerator/\n", "\n", "Combining Context Managers, Decorators and Generators in Python 3.0 for control flow of inline thread execution.\n" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true, "slideshow": { "slide_type": "slide" } }, "source": [ "### David Stuebe\n", "www.swipely.com\n", "\n", "Get the presentation\n", "On Github IO: http://dstuebe.github.io/generators/ \n", "\n", "On NbViewer: http://nbviewer.ipython.org/github/dstuebe/generators/blob/gh-pages/presentation.ipynb\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.8" } }, "nbformat": 4, "nbformat_minor": 0 }