{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# List Comprehensions, Generators, Generator Expressions\n", "\n", "[back to main page](../index.ipynb)\n", "\n", "These are some of the greatest features of Python, you should definitely know how to use them!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## List Comprehensions\n", "\n", "Python's [list comprehesion](https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions) feature allows us to concisely create new lists from existing sequences.\n", "\n", "This doesn't sound too spectacular, but wait until you see what you can do with them ...\n", "\n", "First of all, we need a sequence of example data that we will use as basis for the following list comprehension examples:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "numbers = 4, 9, 7, 20, 6, 33, 13, 23, 16, 62, 8" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Shallow Copy\n", "\n", "The simplest (and mostly useless) form of a list comprehension is just a dull repetition of all sequence elements into a new list:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[4, 9, 7, 20, 6, 33, 13, 23, 16, 62, 8]" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[x for x in numbers]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "I'm showing this just for didactical purposes, in practice there is a simpler and less confusing way to get the exact same thing:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[4, 9, 7, 20, 6, 33, 13, 23, 16, 62, 8]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(numbers)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The [list()](https://docs.python.org/3/library/stdtypes.html#list) constructor takes any *iterable* and turns it into a list, so there is no need for a list comprehension here." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Map\n", "\n", "Imagine you want to evaluate an expression (e.g. `2 * x`) for every item in a sequence and store all the results in a new list.\n", "\n", "It's as easy as that:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[8, 18, 14, 40, 12, 66, 26, 46, 32, 124, 16]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[2 * x for x in numbers]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alternatively, we could also use the built-in function [map()](https://docs.python.org/3/library/functions.html#map) to map a given function to each of the items in a sequence.\n", "\n", "But to be able to do that, we have to turn our expression into a proper function:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "def two_times(x):\n", " return 2 * x" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can pass this function, together with the initial sequence, to `map()`:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "map(two_times, numbers)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What comes out of this is not a list, but a `map` object.\n", "This is a *lazy sequence*, i.e. an iterator (see [below](#Iterators) for an explanation) that yields its items *on demand*, not all at once.\n", "\n", "If you want to see them at once anyway, you can use the `list()` constructor again, which iterates over all results and puts them into a new list:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[8, 18, 14, 40, 12, 66, 26, 46, 32, 124, 16]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(map(two_times, numbers))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Defining a named function (and especially coming up with a meaningful name) can be cumbersome, especially if the function only returns a single expression.\n", "\n", "In this case you can use Python's [lambda](https://docs.python.org/3/reference/expressions.html#lambda) keyword to define an in-line anonymous function directly within the call to `map()`:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[8, 18, 14, 40, 12, 66, 26, 46, 32, 124, 16]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(map(lambda x: 2 * x, numbers))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is nearly as concise, but the list comprehension above is easier to grasp, isn't it?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Filter\n", "\n", "Imagine you want to create a list with items from another sequence, but only those which satisfy a given condition.\n", "\n", "This can be done by throwing an `if` into the list comprehension:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[20, 33, 13, 23, 16, 62]" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[x for x in numbers if x > 10]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It should be quite clear what the above line does.\n", "\n", "But as before, there is also an alternative way to get the same result, namely using the built-in [filter()](https://docs.python.org/3/library/functions.html#filter) function:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[20, 33, 13, 23, 16, 62]" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(filter(lambda x: x > 10, numbers))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we directly used a lambda function to express our condition.\n", "\n", "If the condition is more complicated, it can of course be encapsulated in a function, e.g.:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "def my_condition(x):\n", " \"\"\"A very strange condition.\"\"\"\n", " if x % 2 == 0:\n", " x //= 2\n", " elif x % 3 == 0:\n", " x += 4\n", " return 3 < x < 10" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Such a conditional function (sometimes called *predicate*) can be used in a list comprehension ..." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[7, 16, 8]" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[x for x in numbers if my_condition(x)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "... and with `filter()`:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[7, 16, 8]" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(filter(my_condition, numbers))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this case it is largely a matter of taste which of the two alternatives you should use." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Map + Filter\n", "\n", "As you can imagine, it's possible to combine the mapping of an expression to a list and the filtering of this list:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[40, 66, 26, 46, 32, 124]" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[2 * x for x in numbers if x > 10]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This can of course also be done with a combination of `map()` and `filter()`, but list comprehensions are often easier to read." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[40, 66, 26, 46, 32, 124]" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(map(lambda x: 2 * x, filter(lambda x: x > 10, numbers)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Map With `if`/`else`-Expression\n", "\n", "The first part of a list comprehension can be an arbitrary *expression*, but *statements* are not allowed.\n", "That's one of the situations where an `if`/`else`-expression (a.k.a. [conditional expression](https://docs.python.org/3/reference/expressions.html#conditional-expressions)) can come in handy:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[400, 90, 70, 200, 60, 330, 130, 230, 160, 620, 80]" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[x * 100 if x < 5 else x * 10 for x in numbers]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Nested Sequences\n", "\n", "A sequence can itself contain sequences, e.g. a sequence of numbers consists of sequences of digits:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['4', '9', '7', '20', '6', '33', '13', '23', '16', '62', '8']" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[str(x) for x in numbers]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If the initial sequence is such a nested sequence, the list comprehension can operate on the flattened sequence by using multiple `for`-parts:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[4, 9, 7, 2, 0, 6, 3, 3, 1, 3, 2, 3, 1, 6, 6, 2, 8]" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[int(c) for x in numbers for c in str(x)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the first `for`-part concerns the outer list and the second one the inner lists, just as in two nested `for` loops:" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[4, 9, 7, 2, 0, 6, 3, 3, 1, 3, 2, 3, 1, 6, 6, 2, 8]" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mylist = []\n", "for x in numbers:\n", " for c in str(x):\n", " mylist.append(int(c))\n", "\n", "mylist" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, if you want the output list also to be nested, you can use a separate (inner) list comprehension as the first part of another (outer) list comprehension:" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[[4], [9], [7], [2, 0], [6], [3, 3], [1, 3], [2, 3], [1, 6], [6, 2], [8]]" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[[int(c) for c in str(x)] for x in numbers]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Filtering Nested Sequences\n", "\n", "Of course, you can also add an `if`-part to a nested list comprehension:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[2, 0, 3, 3, 1, 3, 2, 3, 1, 6, 6, 2]" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[int(c) for x in numbers if x > 10 for c in str(x)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Apparently, the position of the `if`-part doesn't matter, it can be either between the `for`-parts or after them." ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[2, 0, 3, 3, 1, 3, 2, 3, 1, 6, 6, 2]" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[int(c) for x in numbers for c in str(x) if x > 10]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And you can have more than one `if`-parts. You can have one for each `for`-part." ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[2, 0, 1, 2, 1, 2]" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[int(c) for x in numbers if x > 10 for c in str(x) if c not in ('3', '6')]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Replacing `for`-Loops With List Comprehensions\n", "\n", "People who come to Python from other languages tend to create empty lists and then `append()` to them within a `for`-loop like this:" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "newlist = []\n", "for n in numbers:\n", " if n % 2 != 0:\n", " newlist.append(n**2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It's of course *possible* to do this, but for reasonably short expressions it is much *nicer* to replace this with a list comprehension:" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [], "source": [ "newlist = [n**2 for n in numbers if n % 2 != 0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is the way how it's done in Python!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Set Comprehensions, Dictionary Comprehensions\n", "\n", "For completeness' sake, let me mention `set` and `dict` comprehensions here.\n", "\n", "Set comprehensions look very much like `list` comprehensions, except that they are enclosed in a pair of braces instead of brackets.\n", "\n", "And of course they create a `set`, i.e. there are no duplicates and you shouldn't rely on the order of items." ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{0, 1, 2, 3, 4, 6, 7, 8, 9}" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "{int(c) for x in numbers for c in str(x)}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "All the rest works the same as in list comprehensions.\n", "\n", "There is another type of comprehensions that is enclosed in braces: dictionary comprehensions.\n", "The difference is that `dict` comprehensions use a pair of values with a colon inbetween.\n", "The expression in front of the colon is evaluated and specifies a *key*, while the expression after the colon provides the corresponding *value*.\n", "\n", "Let's look at this very silly example for a dict comprehension (which also includes a list comprehension just for the sake of it):" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'4': [4],\n", " '9': [9],\n", " '7': [7],\n", " '20': [2, 0],\n", " '6': [6],\n", " '33': [3, 3],\n", " '13': [1, 3],\n", " '23': [2, 3],\n", " '16': [1, 6],\n", " '62': [6, 2],\n", " '8': [8]}" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "{str(x): [int(c) for c in str(x)] for x in numbers}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Set comprehensions and dictionary comprehensions are by far not as common as list comprehensions, but it's still good to know them.\n", "\n", "But now for something completely different ..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Iterators\n", "\n", "*Iterators* are a completely different topic, but they are necessary to understand [generators](#Generators) which, combined with features from list comprehensions will finally lead us to [generator expressions](#Generator-Expressions) (which are awesome).\n", "\n", "There are many *iterable* built-in objects in Python, e.g. lists, tuples, strings, files. There are many more *iterable* objects in the standard library and in third-party libraries.\n", "The concept of iterable objects (often just called *iterables*) is very important in Python.\n", "\n", "If you want to *iterate* over an *iterable* you can get an *iterator* by calling the built-in [iter()](https://docs.python.org/3/library/functions.html#iter) function." ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [], "source": [ "it = iter(numbers)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Iterators are very simple (but powerful!). The only thing you can do to them, is call the built-in function [next()](https://docs.python.org/3/library/functions.html#next) on them:" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "4" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "next(it)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "... and again ..." ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "9" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "next(it)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "... and again ..." ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "7" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "next(it)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "... and again ..." ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "20" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "next(it)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An iterator object is itself iterable, therefore we can use the `list()` constructor to turn the remaining elements into a list:" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[6, 33, 13, 23, 16, 62, 8]" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(it)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This basically calls `next(it)` repeatedly and collects the results into a list.\n", "\n", "But what happens if the iterator is exhausted?\n", "How does the list constructor know that there are no more items left to iterate over?\n", "\n", "That's easy: if `next()` is called on an exhausted iterator, an exception of the type [StopIteration](https://docs.python.org/3/library/exceptions.html#StopIteration) is raised.\n", "\n", "A very common use of an iterator is in a `for` loop.\n", "In this case the interpreter internally calls `iter()` and then calls `next()` on the resulting iterator assigning the result to the loop variable.\n", "After the loop body, `next()` is called again and the process is repeated until a `StopIteration` is raised.\n", "\n", "When we just use a `for`-loop, we don't have to worry about `iter()` and `next()`, but that's what's going on under the hood." ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "....\n", ".........\n", ".......\n", "....................\n", "......\n", ".................................\n", ".............\n", ".......................\n", "................\n", "..............................................................\n", "........\n" ] } ], "source": [ "for n in numbers:\n", " print(n * '.')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As I mentioned above, an *iterator* is itself *iterable*, i.e. we can again call `iter()` on it.\n", "But don't worry about calling `iter()` too often.\n", "If it is called on something that's already and iterator, it will just return the thing unchanged:" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "it = iter(numbers)\n", "it is iter(it) is iter(iter(it)) is iter(iter(iter(it)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Generators\n", "\n", "Iterating over an existing sequence is very simple and useful, but there are two cases where this may not be feasible anymore:\n", "\n", "* If creating a single list element takes a non-negligible amount of time you might not want to create a whole list of them before you start working with the list elements.\n", "Wouldn't it be nice to create a sequence one item at a time? Then we could start processing the first element immediately and don't have to wait for the whole list to be created.\n", "\n", "* If we are dealing with a very long sequence, building a list of it will need a lot of memory at once.\n", "If we only need to process one item element at a time, wouldn't it be nice to avoid wasting all that memory and just process the sequence elements one by one?\n", "\n", "That's where *generators* come in.\n", "*Generator objects* are *iterable*, but they don't necessarily store all items.\n", "They only produce one item at a time.\n", "\n", "*Generator objects* are created by calling *generator functions*.\n", "A *generator function* looks very much like a normal function, but instead of a `return` statement (or several) it has one or more `yield` statements.\n", "\n", "This is a really silly example of a *generator function* which yields our numbers from above one by one:" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [], "source": [ "def my_generator():\n", " yield 4\n", " yield 9\n", " yield 7\n", " yield 20\n", " yield 6\n", " yield 33\n", " yield 13\n", " yield 23\n", " yield 16\n", " yield 62\n", " yield 8" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When we call this *generator function* we get a *generator object*:" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "my_generator()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's assign it to a variable:" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [], "source": [ "g = my_generator()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that when running a *generator function*, the code *inside* is not (yet) executed.\n", "\n", "Every *generator object* is an *iterator*, so we can call `next()` on it:" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "4" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "next(g)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When we call `next()`, the body of the *generator function* is executed until the first `yield` expression is encountered.\n", "The value on the right side of the `yield` keyword is produced and the execution is stopped.\n", "\n", "When we call `next()` again, the execution of the function continues at the statement where it was stopped before and the following statements are executed until the next `yield` expression is encountered.\n", "At this point the corresponding value is produced and the execution is \"frozen\" again." ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "9" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "next(g)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Viewed from the outside, this just looks like any old iterable.\n", "Let's turn whatever is left in the *generator object* into a list:" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[7, 20, 6, 33, 13, 23, 16, 62, 8]" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(g)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After that the generator is exhausted (a `StopIteration` exception was raised within the `list()` constructor), we cannot really do anything else with it.\n", "\n", "Calling `next()` again would just raise another `StopIteration` exception.\n", "\n", "But of course we can call the *generator function* again to get a completely new *generator object* which can again be iterated over exactly once:" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[4, 9, 7, 20, 6, 33, 13, 23, 16, 62, 8]" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(my_generator())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This example is really silly, the same thing could also be done with just a `list` of numbers.\n", "But imagine producing one of these number would take a long time.\n", "Then it would be perfectly reasonable to compute and yield them one by one in a generator function.\n", "\n", "And there is another use case where a `list` just wouldn't work: generators can be infinite!\n", "\n", "To show this, we can create a generator function with an infinite loop that yields a value at each iteration:" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [], "source": [ "def integers_starting_from(n):\n", " while True:\n", " yield n\n", " n += 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When we call this function we get a generator ..." ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [], "source": [ "i = integers_starting_from(42)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "... on which we can call `next()` ..." ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "42" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "next(i)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "... and again ..." ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "43" ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "next(i)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "... and again ..." ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "44" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "next(i)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "... and we can do this arbitrarily often, until we get bored.\n", "\n", "> Side note: Since Python can handle arbitrarily large integers, we can *really* do this arbitrarily often and we will each time get an even larger number, the numbers will never overflow (unless there is a bug in the Python interpreter).\n", "\n", "This time we cannot create a `list` of the remaining items, because this list would grow infinitely large!\n", "\n", "To avoid an infinitely large list, we can create another generator function that takes a generator, yields a finite number of items and then stops:" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [], "source": [ "def take15(it):\n", " for _ in range(15):\n", " yield next(it)" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [], "source": [ "many_integers = integers_starting_from(42)" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56]" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(take15(many_integers))" ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71]" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(take15(many_integers))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But is wasn't even necessary to write our own generator function for this, the Python standard library already has such a thing: [itertools.islice()](https://docs.python.org/3/library/itertools.html#itertools.islice).\n", "It's only an import away ..." ] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [], "source": [ "import itertools" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With the help of this, let's create a helper function that returns a list with the first 15 elements of a (possibly infinite) generator." ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [], "source": [ "def list15(it):\n", " return list(itertools.islice(it, 15))" ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56]" ] }, "execution_count": 54, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list15(integers_starting_from(42))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will use this below to quickly check the first few values of an infinite generator.\n", "\n", "By the way, our `integers_starting_from()` generator function is also already available in the `itertools` library under the name [itertools.count()](https://docs.python.org/3/library/itertools.html#itertools.count)." ] }, { "cell_type": "code", "execution_count": 55, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "count(42)" ] }, "execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ "itertools.count(42)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This returns a special `count` object, which is iterable and yields increasing values forever, just like the generator returned by `integers_starting_from()`." ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56]" ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list15(itertools.count(42))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Map and Filter for Generators\n", "\n", "Similar to what we saw above, we can also *filter* elements from a generator and we can also *map* a function/expression on the sequence yielded by a generator.\n", "\n", "One way to do this, is to create a new generator function:" ] }, { "cell_type": "code", "execution_count": 57, "metadata": {}, "outputs": [], "source": [ "def square_evens(it):\n", " for x in it:\n", " if x % 2 == 0:\n", " yield x**2" ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[4, 16, 36, 64, 100, 144, 196, 256, 324, 400, 484, 576, 676, 784, 900]" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list15(square_evens(itertools.count(1)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is of course also possible to do this with the functions `map()` and `filter()`:" ] }, { "cell_type": "code", "execution_count": 59, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[4, 16, 36, 64, 100, 144, 196, 256, 324, 400, 484, 576, 676, 784, 900]" ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list15(map(lambda x: x**2, filter(lambda x: x % 2 == 0, itertools.count(1))))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This doesn't really look nice, wouldn't it be great if we could use a similar syntax as with list comprehensions?\n", "\n", "Which leads us to the climax of this page ..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Generator Expressions\n", "\n", "Generator expressions look very similar to list comprehensions, but instead of a finite-size `list`, they return a generator object.\n", "\n", "In fact, they look exactly like list comprehensions, just without the brackets surrounding the expression.\n", "In some cases, we have to surround generator expressions with parentheses, because otherwise the code would be ambiguous:" ] }, { "cell_type": "code", "execution_count": 60, "metadata": {}, "outputs": [], "source": [ "squares_of_evens = (x**2 for x in itertools.count(1) if x % 2 == 0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The whole generator expression evalutes to a *generator object*:" ] }, { "cell_type": "code", "execution_count": 61, "metadata": {}, "outputs": [ { "data": { "text/plain": [ " at 0x7ff76c578f50>" ] }, "execution_count": 61, "metadata": {}, "output_type": "execute_result" } ], "source": [ "squares_of_evens" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this case the generator object is infinite, so let's only look at the first few items:" ] }, { "cell_type": "code", "execution_count": 62, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[4, 16, 36, 64, 100, 144, 196, 256, 324, 400, 484, 576, 676, 784, 900]" ] }, "execution_count": 62, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list15(squares_of_evens)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we only need the generator expression once, we can drop the assignment and use it directly (in any place where an *iterator* is expected)." ] }, { "cell_type": "code", "execution_count": 63, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[4, 16, 36, 64, 100, 144, 196, 256, 324, 400, 484, 576, 676, 784, 900]" ] }, "execution_count": 63, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list15(x**2 for x in itertools.count(1) if x % 2 == 0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note: Here we didn't need an additional pair of parentheses!\n", "\n", "Conceptually, a list comprehension is just a generator expression that's immediately turned into a `list`:" ] }, { "cell_type": "code", "execution_count": 64, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[20, 33, 13, 23, 16, 62]" ] }, "execution_count": 64, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(x for x in numbers if x > 10) # a generator expression as argument to list()" ] }, { "cell_type": "code", "execution_count": 65, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[20, 33, 13, 23, 16, 62]" ] }, "execution_count": 65, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[x for x in numbers if x > 10] # a list comprehension" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Reduce\n", "\n", "TODO: sum, any, all, min, max\n", "\n", "TODO: mention [functools.reduce()](https://docs.python.org/3/library/functools.html#functools.reduce)?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example: The Sieve Of Eratosthenes\n", "\n", "This example is taken from the very famous book \"Structure and Interpretation of Computer Programs\" (SICP).\n", "The example can be found in [chapter 3.5](https://mitpress.mit.edu/sicp/full-text/book/book-Z-H-24.html) and in the [Video Lecture 6B on YouTube](https://youtu.be/DCub3iqteuI).\n", "\n", "Let's look at this video where Hal Abelson explains the [Sieve Of Eratosthenes](https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes):" ] }, { "cell_type": "code", "execution_count": 66, "metadata": {}, "outputs": [ { "data": { "image/jpeg": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wCEABALDA4MChAODQ4SERATGCgaGBYWGDEjJR0oOjM9PDkzODdASFxOQERXRTc4UG1RV19iZ2hnPk1xeXBkeFxlZ2MBERISGBUYLxoaL2NCOEJjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY//AABEIAWgB4AMBIgACEQEDEQH/xAAbAAEAAgMBAQAAAAAAAAAAAAAAAQIDBAUGB//EAEYQAAIBAgMDCAcGBAQFBQEAAAABAgMRBBIhBTFRExQWQVORktEGFSIyUmGBIzNCcaHSJEPh8DRigqJEY2RysRdUssHxB//EABkBAQEBAQEBAAAAAAAAAAAAAAABAgMEBf/EACQRAQADAAICAwEBAQADAAAAAAABAhESEwMhFDFRQWEEIjJC/9oADAMBAAIRAxEAPwD5+AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAD1HRvB9pX8S8iejWDv97X8S8jj30a4y8sD1fRjBdriPEvIdGMF2uI8UfId1DjLygPV9GMF2uI8UfIdGcF2uI8S8h3UOMvKA9X0ZwXa4jxLyHRnBdrX8S8i91ThLygPVdGcF2tfxLyHRnBdrX8UfId1DjLyoPVdGcH2tfxR8h0awfaV/FHyHdQ4y8qD1XRrB9rX8S8h0awfa1/EvId1DhLyoPVdGsH2tfxLyHRrB9rX8S8h3UOMvKg9V0awfa1/EvIdGsH2tfxLyHdQ4y8qD1PRrB9rX8S8h0bwfa1/EvId1TjLywPUdG8H2lfxLyC9G8G/5lfxLyHdQ4y8uD1PRrB9rX8S8h0awfa1/EvId1DhLywPU9GsH2lfxLyHRrB9rX8S8id1DhLywPU9GsH2tfxLyHRvB9rX8S8h3UOEvLA9T0bwfaV/EvIdGsH2lfxLyL3UOMvLA9T0awfaV/EvIdGsH2lfxLyHdQ4y8sD1PRrB9rX8S8h0awfa1/EvId1DjLywPVdGsH2tfxLyI6NYPtK/ij5DuocJeWB6ro1gu1r+JeQ6NYPta/iXkO6icZeVB6ro1gu1r+JeQ6NYLta/iXkO6i8ZeVB6rozgu1r+JeRC9GsHf72v4l5DuocZeWB6ro1gu1r+JeQ6M4Lta/iXkO6hwl5UHqujOC7XEeJeRPRnBdpiPEvId1DhLygPV9GcF2mI8S8h0YwXaYjxLyHdQ4S8oD1fRjB9piPEvIdGMH2mI8UfId1DhLygPWdGMH2mI8S8h0XwnaYjxLyHdQ4S8mD1vRbCdpiPEvIleiuE+PE+KPkO6hwl5EHruiuE+PEeJeQ6KYT48T4o+Q7qHGXkQev6J4T48T4o+RPRLC/HifFHyJ3UOMvHg9j0Rwvx4jxR8jHX9FcNToymp4i6/zR8jUeWspxl2Vhqvwlua1b+6zkxxmI7RmRY3Edqzj8f/AFz+ZR1Xhqnwsjm1X4WcznuI7Rk89xHaMvx/9Pl0dLm9T4WRzep8LOa8biO1Y57iO0Y+OfKo6XN6nwschU+E5nPMR2jHPMR2jHR/p8ujp8hU+Ec3qcDl88xHaMc6r9ox0f6fLo6nN6nAjm9Tgct4qv2jI5zW7Rk6P9X5dHV5vMchP5HJ5xW7Rh16r/mMvRH6nzK/jrchP5EchLiu85PL1e0kRy1T42OiP0+XX8dfkHxXeORfxR7zkOrU+NkcrP433joj9Pl1/HX5H/NHvHJf5495xuUn8b7wpz+N946Y/T5dfx2eR19+PeSsNK12zl4WcnXSu+87VZ/ZyHRDUf8ARE/xhdGK/mR7xyUe1j3nKnOWd6veVzy4sdEMT/1RDrcnDtY95GSn2se85OaXFkZnxL0Qny/8dfLS7aPeRal20e85GZi7HRB8v/HXtS7aPePse2j3nITfEm7HTCfLda9HtY94vQ7WPecm7CbJ01Pluteh2se8KWH66qOVchsvTU+W6+fDdqu8cphe1ORcDpqny5djlcJ2g5bBr+YcdkDpqfKn8drl8F8Y5fBfEcW5JemifLl2uc4JddwsVgUzigvVQ+XZ2ue4H4Qsdgl+A4pKHVRPlWdnn+D7MescJ2X6HGFx1UPlWdj1lhVupD1nh+xRxwOqifKs7D2nR6qC/v6FXtWmt1Bd/wDQ5WbQMvVT8T5V3U9bQ7Bd/wDQj1wv/brv/ocsDrp+JP8A03dT1xwoLv8A6G1gcY8VJ3gkjgo62xVpJkmtfxa/9F5dLETVOm5JLQ5M9q1U7KETo4x/w8jz8tZM6U8dJj6dbeW0Q3fW1f4UV9a4j5GkLDjT8ef5F249p4jiYMRtHESw8k5GFlK6+wkMrH8WPPeUxRdIqkXSM68+e0WFiwsXVxVogsyLDTEAmwaIYgCwsUxBBNiAAYIYAABRkEgaIFgCKzYT/ERO1Wf2cji4V/xMTsVXeEiRLvRx8rlOT4EKLb0ROfLOS4llVknohaZYtEaxOLTtYZXwMnK2nmylXUbne2hOTHFSwtbejLykV+HUiclLcXTiog1YmLs7l5TTktBpxYrPqJRmU6bW6zK1MujViacWMgnqBUlBKIZKLqDBNiGNEFipIEgACCQAgACgBYETAEEgCLghlE3OvsZ+wzj9R19kfdMxLdPtuYx/w8jgS3s7eLf8PI4berO9Pp38n0GWEYy3P6GG5kpTjB6q5yn7eWCSi5W3IxYpKNGVmXnLNK9jFidaEg3CyLl1Qa6y3Iu28ejPbFcm5kdFrrQ5CYaxiBk5KVyHTa32BihBkyP5DI/kDGMuoprQZHwEcy3AVt8irRklKT0ZW3yB6Y7EGRxIylTFAXy6kWBioLxhmZVpXsgmKgvk0Iy2CxC+Gf8AEROvU9yRyMOvt4nVqP2JEd6OTP32ZKdnHR2Zin77HUSYc7z7ZGo3s39SY04Nv29DFFXdizp8CZKatOCXXoW5GOW+azMfJyfVcZGuoez0zOjFRWurIdJJq7MeVtkunPLfqJ7X0ySoU4vWZiqQyNa6dRV0572S4SdroqbC0Iqe9lJLK9S0YTjuTRE4yWskVFSUTGDkm+pEFQIZLIGCGSR1koIkAAQALASAADK3LFba7yokkizZa24Igqy+R3sQ4aAxR7jrbJ+6ZyWdXZmlF2JMN0+2fFP+HkcZ72dfEv8Ah2cd72dY+nbyfSbkAGHlhJjxH+HkXKV/uJEbq2HOcdL6BVZ7iuaUmrIuozZj0vtOeb3kurNvRllCS32KR03oemo1ZVpResSKk1NXtZlbshtdaJhsouLviSsvzHscWb09ou+IUpE2j8QSje7ZPQiUpEZ3xLTUW9GVyfND0IcmISeZEum1vkiIQedaoehNR2mUuXqU25vVEKm72CIjPKzI4Ju66yrpyhrZNEObbuTVhkhG7s9CakEkYsz37jHOvGOlya1Eay0dKyOnP3ZHEWIyzUlFs2I7TUrqpGxqJdYrMQpL32CudTba3EmpcL/aU3F3RLqybvoVIM4msqqze4Ko7a6siOqtexd1IW0RnVidVU2ncOpJ/kSpx4FXLXQAqk7ZS2eppcrn0138SY1WnuuBbPUfEh8pJWabDqvgiHUk1vAmOaF1/wCSkpNvckFN3I3sqMsad6eZshUm/wAiuZ5cvUIycdzHtEulq9StrOwzyTvci7buX2el4JORZNZrSiYlxLNtrUC6SctySElG+jRiuEPZsMjhfcTBRd2zHdkD2mwyWi3wLyhBRTVjALshsMtrO+lmWS1s9z3GBNls7vce19MrjlhfruQ7cpZdZiu+IjLK7osanoqRtJnR2c7UWcyTvdvedHZ7+yYap9suJd6DOU95067/AIdnLb1OsT6dfJ9JBW4Rh5GSFk9VcripWoSWVEXsTidcM2R0p9ppuzLPRlI70ZJ9RnGkZnfeWcm97KE7y4qyIIGowxJDJ14ENPgAFyLPgLPgAbIJyvgSoPgMgRNaImkvaE1oTB2ZFUmrzZCunuJk/aZFyiW2yr3EuRFzNpxukbKrp1KkbmDIov22tDcdaEVZysYZYeNSLnFvU56+l4/HER6YZyll9hmKq3KG6zNzD4SKg5Xd+BeWH5SD9nUsTDpamw0MLVkp5Xqmbxqww1SNe7WhtNNb0dXy/NXJCCdC2TQY4KhF+Sb6yu52IYIkgkqqveSQyVv1IgC8kluKMsCCQ92guUGShJ3IRDFnB2uVRflGo2tcpmJ7MSlrZl5xypGO5ZvMUVAuStSoAE2IYixFi4BiiRNiwKYqVLsqwmIe46GBdqLOe9xvYPSiSW6fa9d/YM5zOhXf2JzzpH06eX6AgEZeVPURV1ws0TcrVf2EjMt0I3vuMuaVtxhjOXFl1OS62F1e8uA9vgVzvi+8jPL4n3lXVnn4Ee3wIc5PrZGZ8WDV7TGWTKZnxIuDYZFCXFDK/iRjuAavlS/Gh7K/EUZAw1duPFjNFcTGyC4clmxci4BupbFyASY1qtsWcqfXG7Lxqpq3UYt7skZObNRc7o4zSX0vD565jLGdjJCavp3HPg6sp5YK643OlgalOlPLUjr8TLXxS9E+WrcoYfNaVRacCMVgoVHeDs+BtOayrLrcnJub3npjx5Dw+TLuLUwdWn+FsxSUlo00ej0tqYqlGlU3xRJo88+OXnm2EnvOlidn2Wenr8jRcnGVmrMxMYxNZhWz4EbjLKre2iMUndk1lBZK7Ki9rWKal6OxMbX1K31uAEvedi0Fm0sivWOsCZWUtCLkAJq+dZLMog78AvyAsTfgQFoACIJiBJYgJkEkBsi5VWuLlMwuE1LZS5NypWZSb+Ff2Jzzew33RJbonEfcmibuIf2RpG4+m/L9JIAJDzQNlaz+wkWsUr/cSJLdZIstcQpVH+Bl+RqfCwzntW4uW5Kp8LK8nU+FkaLkMsqc+uLIlGS3plEAJN9TJcWuoKgkWfAloCjBazFgKMgvYqDEAl7iLgCSOsm4xW5gqSleTRuQwaqJqfukYCGWmsxuZklobrD1UhpypU8P7NGldmxyUNPYRki1vCaeqOjolL2fZ0sOUlF66kOT6ijlPin+YRKrOV0XjL2TTlUy1LWs2bUG3FAZk7o520MMl7aR0IsxYtN4eVjEwxaHFtdbtxW3s3sjHKTzNMhS6jlkvPOMj3X0Edb7jFmDlwJjOsmZ8C0bP8zDctGq4xski4LSlZiPtMo5Nu7IjNxdxgvmYTbZRyu7sZi4Ms4tK6KKQ5V5bFLkwZMwzGNSFxhq9yblLk3LiLXZKkY7hMmJq7k2RcrmGYuGrXFyuYZgLXIK3Fyosjew33ZoXN7DP7Iy6+NFd+xY1bGzXehrG4+mvKACxHnSUxH3EixjxH3EiS3WG+9p05fybfk/6Ec+p/DLvEcNT4F3h6Wnsnk5a+pHihTndN8QsXS4syPC0uBHNKfAa11wo8VSf4/0MU50pv7+S+hsc0p8BzSnwNck6YKOJwcF7XtfQzc9wPwLuMHM6ZHM4F5sdFWxzzA/Au4c8wHwR7jW5nTHMoF7Dohs872f8Ee4jnGz31LuNbmUBzKBe1OiGzy+z31LuKVKuBy+xluYOZR4kcyjxHadEJ5Si+qHeispUn1Q70TzGPEjmUeI7JT49WSisI39q4r8jZhQwNR/ZtP6GlzJcTZwkI0IvrZYmZOisNmcow0RMKmZbjSrYhKV3uRs0KkKsFKD16z1QmYzNvLoY4Ta0dzPG2W7KSSfu2ZoRKooohV4v5iKu7PQpUwzesdAimInG6klqbNGopwVjm16VRR94vgqjScZt6cDOtY6qZadpU2jReMUNFSnL5lfWE3pGjp8yTaISazLUq0YKpLTrIhhac/xW/MrVpVKk3JXVzHzer8T7zlbyQx8fW3zCn2sQtnw6qsTT5Ct8b7xyFb433mOa/Ghu+ro9rEj1Yu1iafI1/jfeTyeI+N947E+NDb9WLtUPVn/ADImpyeI+N95GTEfG+8dh8Zueq32iI9Vy6qi7jUy4ntH3i2KX8x945p8ZtS2bKMbuSt+RrchFuyqK/5GGrUxMdJVXbgYJ1Zxa1Mz5LfxY/5q/wBlnq5aUnGUkWglJXVu80KlVyneTuXpTlf2b2MTa7pHh8Ufbq08DOrHNBot6srf2jRp42vGVqbaubKxeL+N9xuvkn+sX8FZ/wDVl9V1v7RD2bX4foU53jPifciee41fi/RG+xz+Mstm1/h/Qera3w/oQsfjFvf6In1ji+P6IdifGR6vr/C+4lbNxD3QfcPWWL4/oiVtTFr/APESfJJ8ZHq3E9m+4q9nYlb4PuMnrbF/2kQ9q4mWjV/ojMeWf7B8Zi5lW+E2KNKdOFpIxc4q1HrD9S/LySs4G+WtR4JhNSjOa9hXMawtbs5dxeGOqU/dj+hf1pWS1gu412QW8MywPDVr/dy7hzet2cu42PW9TrgT64l2aHOHP48tXm9Xs5dxjxGHqrDyeSXcb/rd9mjHidrZsNJcmhyWPBMLotwMFOvCW5ou60I2u9DxvpQzsGNVqclpNd5bPF7mirqwIuLjVSQGUqTVOm5PqKi4MVGo6sM1rJ7jJcCWQLkASCBfiQAQmuINQJI0aBhn7Es05WXAuqpXpqWlzYpKNGmlHQ01isPKolexsVFnh7MtPkda3YvVtxnGUfaZXVbpo5vOPtOTgpTl8kbMaMpL288GeiLRjhjaU4VNJStIl/Z+62/qaUsO29KjM9KCpxs5N/mzE+SIWKTKzlKpvRMYqK0MdSvCmrtpfU5+I2rThe0rv5M4TaZ+neKxH26bklp1/IOLSvLR8Dg+uKm+nBJ8XqdDAY2WLi3U94zMSTaG4CLi/wAjEyJBAAkEACyBFxcInQx1YqUX1fkXBRzp1503leq+aOfiV7d+J1MXhqtR3g01wObPCVYT9p2Nw42mWtZuRtU5QX3iuYp0Jxd7mKWZPVOxZcnWwUJRV3TzI34NSWsbGhga6hTy5lL/AFHQjJtaqxymXesLWXAWXAEjXRGWPAZI8CQDDJHgMkeAuSBXk48CeTjwJA0Mq4EcnHrKyqxh7zKRxVOTspIaMvJQ4Dko9SEasX1llJdQ1VeSjwHIQ4F0ybjUxj5CD6jFisPBYaehsmPFP+FmarJMOBFtapmeniZLSWqNZEreYbisNp1qe9JxZEsRZaSua0ihYlJo6OHrRk/bqWNy0mr0q6fyZwQrp6Oxr0nGztOtWg/aSZgxGJdTLBppX1OfmnvzMnlZX3j/AMUyzrxxdKKSWiRkhXhPc0zi8s+BkhPW70Ivt2sySuY54iEfmc54pbrmOpX0vewyVb7xV9yMc602nvNFYizvcyLGXVhxk9NzCSk5+1c3bN9Rx44xwehkW0KnUkzE7Cum4y6jA1WhO/J51wNZbTqL8KLx2tNb6cWTZVtqtwwbv9PM2aEZYiElKlyfyNGO3JR05CBmj6QStrRX0NZLMyTpvDyahTsYJYiq9Mku42V6RcaL7y3SGD30H3muM/qRb/GvGtOVkqU/zZStDF1NKeWC+ZudIKX/ALd95Se3Yv3aNh7hvlrk1dlYur71a5g9Q1/jizsS2tCW+DJhtHDfjUhHltH8YmuuP6kxC/FA2tn7OxGHqPM42fA6y2rgUvdfd/Qn1tgupfozU3tLnNYhVYao9Uiea1vhMsds4W1tV9GW9bYT4/0ZOOrya/Nqvwjm9VfhNn1phLfefoyPWmF6p/oxxg5tfm9X4WRKjUjvjY2JbUoW9mRgljKUnd1EYmYhYnVMsuAyy4FucUn+NGfD1aDleVSJItrTXyS4EOEuB1lXoP8AmQ7yeVoy0zw1+Z04yxzhw6laFLWTOZicXCrO6V7Gztim6ddyjaSfBnHcp391osRJOSyzrzlLdZDlFL3o3MOaf5ExUnvHtjjDr4ajQSU8tpG65JK73HKwlVQeeacmtxmlWnVd3ouBzbiIhvxqRnudy5pUKmRGalNzlqGmcEX1JAABFVE5qEbs59fH78g2jXy+ymciUnJmorpLaq4yU+u5hjVlmutDGmkRKd3oa4sSzyxdRaOZsUdoSgveucySzcCYzXu2E0g136W0ov3kzcpVo1FeDPLKq4mxhca6NVa+y2Y4yuvS3MWKf8LMmnUU6cZLc0VxX+Fn+QiTXBixfUiI6zLtC0txQtLcVuWFQSQ2RcotchuxS7IuXE1fMVdYlM1qm81WGLSyyrte6isKut5u5gbCZ0iHPWV1XfTcTyrMVxcYayOo2ZqFS7tc1Mxek7SMzHpYs6VlxJUU+s1eXit5bnEeJyyXWJbGUW+Zr84jxHLx+IZKti3zJua/LR+InlY/ETJTWe5JhVSL/EZIQdR2jqBfQKLluVzoYTZqftVbnRhRo0VaFNfmyGvP5Gt6sRrwPRZqS96CMU8Ph57oFHDtZalGdyeyadSnmhKzNCrs+tS3RzIYNO4UkuomVOUd6sRYYi12QVsyU2TFXRDKp2JuMTBN8WWzyW6T7yE7Et36ipxhDlJ75N/myuj6i30BdlchVxj8K7iVFcEGLjlJkLJF0jHF6mzSjnlYms4Qi3uNulTyL5iFNQXzLkRIAKBJAKribTl9q0c6U7HT2tRalnOS7HWrNkObGcoyLm8c9Xc7kLiiugukMNZVLiTL3bow3LKWliYPSbIrcrhEuuKNvEv+Fmc7YUHGhOT3O1v1Ohif8NM4f/TUODDcWW8yrCyg7PeY2rSaMu1UVHZGG5kq7jEahdGwbdPZuMqwU6eHnKL3NIS2ZjYrXDzX0NDTvYi5mlha696lNfQxulUj70GiotRjndma9ZZZGaLyMitGNTVby1lJhq7yLEyTiVvodHFJDIbG8oI2cPFPeY6FCVXduNqMVDRGLy3WNVdKNyOSiZSDnsuuKcjEcirbzItw6hylMY+RVt45BPrMttC1JKVSKfEnKTGzs/Y8q7zzllhod2hg6GGjaKV+Jjp1VTpQhHTQpUqtdZmZ0xuzlGMbpmlUxDvruKU6z/E7irBVFmi/oFY6la73kc4aklcxujInknvsBtqvOyszPSxM90jQi7GSNT2tQjZr4ehi46ezLijh43B4nDSdnJx6mjs5lHVGRSjXpunPVNbzXIeVdSsutkctW+Zv47Dc3quzujVaLsGSw8vWRPL1i6VybDYTJU5xVJ5zU4ItZcBZW3DYPavPKi/Ciedy+FENLgTkXAej2c7fXFGWnVc+oxZI8DaowSjojMzC5K0d5tYT7wwJWNnCe8zGrjbYBFjTCQCSogkADHXpRrQyyR53G4V4ebPTM5O1uRmve9o6VJcJlC0lZlbnbHGUXFyAVEmfCUJYirGEV16mKnBzdlvPTbNwMcLTTteb6zne2Daw1JUKEaa6kWxP+FmXSsUxX+Fn+Rwj7dYhTGUst5M5zoq97s79WnHEwypq/Eweo69rqcTlES1VxJYdS6zZ2bsvnOIjvyxep0fUeJ4xOhgcHjMLHJGNM6xEtzMN6jFU4Rpw0jFWMqjUW7LYxw5zFaxgZU61tYx7yTWXG2p9t74xsVlDNvpJls+X3mjHUxUYvR3GSzqebUZe9RicXbUMJC1KnRjyj4HRq7SyK0Um/wAzhTV68q055pM3FZWJcfE4KUm3GOppSwVZfgueglUS1lfuMLrw+fcd4rJMw40MBXl+Cxlhsuo3qv1OtHERe5NkrExvbIxxk1pzwkqdJRjE1cj4HaVXMt3ezFUwLqyvTmkYnxtxfHKcGR12OlPZleO53+hansacvalUV+Bia41zctbg9x1vV1SN8yS+hrzo5XaULfQ5y39tHqM2EhnrxtuMuSPBG7hYwpRzWRNMlmm1FpcDnYzF5XaJtVq0Vc5VWLqVHwLEDPhsRUqR1LPGSpSsyi9iForUwyozqu3EuGuhDFOWt0zNCrfrNCFPkI5Wy8aii95mRsVajUi8JXRr5oy1uZI1oJEVlVXNOz3G1RkkjmOur6IzQxNlqUblaCrU3FrXicKrBwk4yVjt06uZI2obKwuMjnqZlL5M3X2kzjy0dLko9PL0aw8vdqyj+pTovDqxL8P9TXCWOcPNEvcehl6LT/DiF9Y/1MM/RnFL3akZfT+pOMryhwyTry9HMcuqL+pR+j+P7OPiJxldhy0bdH3SlbB1sNPLUjZ/IyUotQMTCwlGzhPeZrJWMkXOPuysZhZdAf8Ag0OUq/EzHWxdSEGnLU7UrynHOWfF46NDRas589oVZ6qVjTqSlOV5Mi9j6ni/5a57cLeTPps89r/Gxz2v8bNXMLnf4/j/ABjnLYnja8tHIwSm3v1Kg1HhpH0c5Yp69RjaXA2HYhifDEs6wZEMi4mWyGVHP49V1FNKMk0z0OE2jSnTSnK0jz+VFor5nK//ACRJFserjXpyWk0RipJ4Wep5uNTL1mfntRUHG90eef8AktV07Yej2bnq1FwPQJWSR88w2150PdlJfQ24+kE0/fkeaPX8ae7SLWPDL0ik/wCZLuHrzNvrPuHL/DjL3Ro47FOEskHqeUW1oP8A4iXcbFHFKazKWb5ssTpNZdWpiJ2tfUxNTy5nL9TSeJjFXveRSeNqZbaWOsQzjNUk03dmFxbV7M1qUqmJr5I72dp7IqpK1YbEGOa42W5mN0W90TqPY+JvdVYkPZeLW5p/Ucq/o5sKbj1FWssrnQezMb8H6o162zsYnfk33ouwIp2l1GeKSV2v1MVLC4iK9qlIvKNRRs4S7ho2aFOVSXsysvmb9PCwtrNt/kc/CSnSvJxbX5HSpSzQU1dXJ6VKwkL2nK6+RkngcFyTzRcnYo5tJu70NGW04VJyi4zvHfoZnJWNhb1Xgp+/DKvzZo7Qo08PFqnfKbscZnjanSl9StacK9B06ytLqscprDpW0vMVqt779DAqmup057KnKblntHqL0tmU4yzSd2ZzHT7aVONaovZgZY4XEt3ukdPkWutWGSXVJDG+LlVcDipO6lEwyweLjvjc7ihNdaLpyfUXDi87CFaLtJMtPNDemehnRuvaiatXDp9WhmYTHMoxco57G7hMFUxV8u6JmpUoQjZLedrZOG5OMptWvYkV1i041aGy6mVbjfp4OpTjZM3VFF0dq0cJtrTVKuusPnEd5vIiTS3m8ZaTqVrXUboLEyW+Jlr4unRg22kjhYrbN3JU7P6Mk6Y7CxbelilbFRpxvUnlPPxx2Jl+JRTMNSp+KV5P5meTcVZsfiY162aCutdWTs3CSxWI3LIt5s4PZXOaSq1ZNReqSOtg6NDAxtBPXiT7/izPFaps3CSetFfRswz2LhJLSLj+TZt88op6tr6DneHeiqLuZuKVZ5y42O2PHDUXVhU0XUzy9abnLU9jt3Fwhgsqkm2zxMpLievw+Ou6lrTiJGNpITqWMSm2z37EOP2yXFypN7IRIXJIuhcuoDQi4uNAEkDRKJSIRIBoiT9hk3KTfssxf6R9MhsPZk1rgqPgXkS/RrZE9+Dh9EkeYj6duP8AwX+4zL/+g2X+B/3Hycd+TvP0T2Q/+HSMb9Dtkv8AlyX1ON/6hf8AQf7h/wCoX/Q/7hhrrP0L2U+qa+r8zIvRvZ+Hp5eUkl+f9Tg1fT+c1aODy/6jXfpq5e9hL/6iezZd2WwsPU9nDVGn8zWxHo5VhBt16aVt8tP/ALOVH01nTTdPCpSe55txzq3pRjq83KpK64aeRfY9RsfZfIVZVJzhNLrW469m5e/Gx5LZ3pfSpUcmLo1G/wDlpeaNxemOy7/4fFW+UY/uONqWldejk5JaW7yIzfC7/M4D9Mtj5bLD4zwx/cY36Y7MXuYfFfVL9xz67I9Pra9inKvNZwlbieZ6YYDqhi1/pj+4lemeBy2dLFeCPmXruvp6V1o33PuGei9+U8y/TPBJexQxDfzjHzMXTHDp+zQqpf8AbHzLwubD0ePsqDdNa/JGDCuvyeaq0l1JI4/THBTilUpYn6Qj5jpbsvscX4Y/uNxFv66Rav8AXclVb0jFGB02228sfocrpZsrssZ4I/uKv0p2VLfSxnhj+43kryq60lpaFmVVF39pI5S9J9kLdSxnhj+4dKNl9njPDH9xOMtR5KutKm48LcDFKGbcrHLn6T7Na9ini/rGP7iKfpPs5P2qWKf5Rj+4cZXsq6TpS4FeTa3mn0q2Zf7jE+GP7jHU9J9nS3UsSl/2x/cOMr2w6Sp3M8KaW6xxOk+BSsqWI8K8x0nwVtKWI8K8yZP4nZDuSoqe+RCw0etvuOJH0owi/l1/CvMuvSzCL+VX8MfMuf4dlXWWFhGaeuhvquoU7U966jza9LMH+KliPDHzHS3B9VKv4Y+ZOLEzWf69D6wqwXtUpP8AIstrQ/FFr8zzr9LsHfShX8K8yH6WYB/8PX8MfM37c5iHoa216MVfPFfXU59fbM6jtSg3/mbOFiNubNxLvOhiE+KS8zB6y2d1PGL/AEx8yTpGOpUlWryvVqN/K+hHJxS0RoR2vs1b1i3/AKY+ZljtvZa30sU/ov3GJrd0iatm1iKVPlsXCDajG+rZjW39kLfhsS/ov3EVNv7Knuw2Ii+KS/cZilvxZvX+PVrEUKVOMISi8qtozWq4hTfvxX1PNevNl3+7xfcv3E+vNlPfDGd0f3HT3+Ofp26laCX3if8AqNWU4t3zrvOd652M/wAGN8Mf3EPa+xWtYY7wx/caiE9MW0q7lLLm0XzOZfTedKeM9H6jvKG0PDH9xXl/Ry33e0PDH9x66WrEMWcmbLU91zcqV9jZvsqGJa/zb/8A5ERxey0rc3rd/wDU6dkfrONfKw0Zp4vZ7XsUaq/v8zDzjDfDU/v6mu6qZKLIkq6+H6oT/v6kvEUPgl/f1HfUwIuOcUfgkUdan1KRe6v6YyXCZi5aPBjlo8GXur+nFlRNzFy8ODHLx4Md1f04styk9zK8vHgyHWg1azMW8tZ/qcWAAHjdAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAB//2Q==\n", "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "execution_count": 66, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from IPython.display import YouTubeVideo\n", "YouTubeVideo(\"DCub3iqteuI\", start=495)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To implement the *sieve*, we can use a recursive generator function:" ] }, { "cell_type": "code", "execution_count": 67, "metadata": {}, "outputs": [], "source": [ "def sieve(stream):\n", " x = next(stream)\n", " yield x\n", " yield from sieve(y for y in stream if y % x != 0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In SICP, they are using what they call \"streams\" (implemented in LISP) to represent the infinite sequences of natural numbers and of prime numbers, respectively.\n", "\n", "We can get a similar thing in Python by using generators.\n", "The generator function `sieve()` takes the first element from a \"stream\" and yields it.\n", "Afterwards, it calls itself recursively using a generator expression that filters the remaining sequence.\n", "\n", "The filtering could of course also done with filter(), see [above](#Filter), but the generator expression is more Pythonic.\n", "\n", "To get our infinite sequence of prime numbers, we have to apply the `sieve()` to the natural numbers starting with 2." ] }, { "cell_type": "code", "execution_count": 68, "metadata": {}, "outputs": [], "source": [ "def primes():\n", " return sieve(itertools.count(2))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Calling this function returns an infinite generator object that yields all primes on demand." ] }, { "cell_type": "code", "execution_count": 69, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]" ] }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list15(primes())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Those seem to be the right numbers, but this recursive Python implementation has a problem when we look for larger prime numbers.\n", "\n", "To see that, let's create a helper function that shows only one given item from the sequence instead of the first few:" ] }, { "cell_type": "code", "execution_count": 70, "metadata": {}, "outputs": [], "source": [ "def nth(n, it):\n", " \"\"\"Return n-th element (0-based) from iterable it.\"\"\"\n", " return next(itertools.islice(it, n, None))" ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 71, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nth(0, primes())" ] }, { "cell_type": "code", "execution_count": 72, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "29" ] }, "execution_count": 72, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nth(9, primes())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "While we're at it, let's measure how long the computation takes for a somewhat larger number ..." ] }, { "cell_type": "code", "execution_count": 73, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 30.6 ms, sys: 0 ns, total: 30.6 ms\n", "Wall time: 34.1 ms\n" ] }, { "data": { "text/plain": [ "3457" ] }, "execution_count": 73, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%time nth(482, primes())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This works fine with small numbers, but there is a limit (depending on your Python interpreter settings), you can try it on your own:" ] }, { "cell_type": "code", "execution_count": 74, "metadata": {}, "outputs": [], "source": [ "#nth(490, primes()) # maximum recursion depth exceeded!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The problem is that (C)Python doesn't eliminate tail recursion.\n", "In other languages (e.g. LISP) this would work fine, but not in Python.\n", "It's possible to increase the recursion depth in the Python interpreter, but this would just shift the problem to slightly larger numbers.\n", "\n", "But anyway, why would we solve an iterative problem with a recursive solution?\n", "\n", "We can easily create an iterative implementation by applying a few minor changes:" ] }, { "cell_type": "code", "execution_count": 75, "metadata": {}, "outputs": [], "source": [ "def primes2():\n", " collected_primes = []\n", " for x in itertools.count(2):\n", " if all(x % p != 0 for p in collected_primes):\n", " yield x\n", " collected_primes.append(x)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, instead of storing the intermediate state on the stack (where it's hidden in the recursive implementation), we just explicitly create a list of all so far known primes and append each newly found prime number to this list.\n", "\n", "> Important Note: The function all() is short-circuiting, i.e. the divisibility test is only executed until the first match.\n", "It's important to use a generator expression here, because if we would use a list comprehension instead, the predicate would be calculated for all items and the whole intermediate list would have to be created on each iteration. This would be much slower and use more memory.\n", "\n", "Let's compare this with our previous implementation:" ] }, { "cell_type": "code", "execution_count": 76, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 21.7 ms, sys: 0 ns, total: 21.7 ms\n", "Wall time: 22.1 ms\n" ] }, { "data": { "text/plain": [ "3457" ] }, "execution_count": 76, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%time nth(482, primes2())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The new iterative implementation of course yields the same result as before, but it is also faster!\n", "\n", "And now we don't have this silly recursion limit, it also works with much larger numbers:" ] }, { "cell_type": "code", "execution_count": 77, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 1.16 s, sys: 158 µs, total: 1.16 s\n", "Wall time: 1.17 s\n" ] }, { "data": { "text/plain": [ "48619" ] }, "execution_count": 77, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%time nth(5000, primes2())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Some time after writing the above implementation, I stumbled upon a blog post called [List, Dict and Set Comprehensions By Example](https://www.smallsurething.com/list-dict-and-set-comprehensions-by-example/), which also uses the Sieve of Eratosthenes as an example.\n", "However, it doesn't use generators but set comprehensions (which I only mentioned very shortly [above](#Set-Comprehensions,-Dictionary-Comprehensions)).\n", "It also has a different API, since it doesn't produce the results one by one, but instead calculates all prime numbers up to a given upper limit.\n", "\n", "The original code has a second set comprehension in the last line.\n", "I took the liberty of changing that into a list comprehension, since there are no duplicates anymore at this point and the order is actually relevant." ] }, { "cell_type": "code", "execution_count": 78, "metadata": {}, "outputs": [], "source": [ "def eratosthenes(n):\n", " # Based on: https://www.smallsurething.com/list-dict-and-set-comprehensions-by-example/\n", " not_prime = {j for i in range(2, n) for j in range(i * 2, n, i)}\n", " return [i for i in range(2, n) if i not in not_prime]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As it turns out, this not only works but it is also significantly faster!" ] }, { "cell_type": "code", "execution_count": 79, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 44.3 ms, sys: 4.01 ms, total: 48.3 ms\n", "Wall time: 47.8 ms\n" ] }, { "data": { "text/plain": [ "(5001, 48619)" ] }, "execution_count": 79, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%time result = eratosthenes(48622)\n", "len(result), result[-1]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That's it, have fun generating prime numbers!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "

\n", " \n", " \"CC0\"\n", " \n", "
\n", " To the extent possible under law,\n", " the person who associated CC0\n", " with this work has waived all copyright and related or neighboring\n", " rights to this work.\n", "

" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" } }, "nbformat": 4, "nbformat_minor": 4 }