{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Generating permutations, several approaches with Python\n", "\n", "This small notebook implements, in [Python 3](https://docs.python.org/3/), several algorithms aiming at a simple task:\n", "given a certain list, generate *all* the [permutations](https://en.wikipedia.org/wiki/Permutation) of the list.\n", "\n", "For instance, for `[1, 2]`, it should give `[1, 2]` and `[2, 1]`.\n", "\n", "## References\n", "- [This blog post, doing a similar job but in OCaml](http://typeocaml.com/2015/05/05/permutation/),\n", "- [The documentation for itertools.permutations](https://docs.python.org/3/library/itertools.html#itertools.permutations).\n", "\n", "## About\n", "- *Date:* 06/02/2017.\n", "- *Author:* [Lilian Besson](https://GitHub.com/Naereen), (C) 2017.\n", "- *Licence:* [MIT Licence](http://lbesson.mit-license.org).\n", "\n", "----" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> This notebook should be compatible with both Python versions, [2](https://docs.python.org/2/) and [3](https://docs.python.org/3/)." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from __future__ import print_function, division # Python 2 compatibility if needed" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "----\n", "# 1. Reference implementation: from `itertools`\n", "\n", "The [`itertools`](https://docs.python.org/3/library/itertools.html) module, from the Python standard library, contains a function [`itertools.permutation`](https://docs.python.org/3/library/itertools.html#itertools.permutations):" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Builtin implementation, as a reference\n", "from itertools import permutations as itertools_permutations" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This will obviously be the quickest implementation, and there is no hope of beating it with pure Python (in terms of computational speed), as it is written in C and not in Python.\n", "\n", "Let's check that it works as wanted:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "itertools_permutations([1, 2])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What's that weird result? In fact, `itertools.permutations()` does not return the *list* of all permutations, but rather an *iterator*.\n", "It can be looped on in a similar way, and can be converted to a list easily:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(1, 2)\n", "(2, 1)\n" ] }, { "data": { "text/plain": [ "[(1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1)]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "for p in itertools_permutations([1, 2]):\n", " print(p)\n", "\n", "list(itertools_permutations([1, 2, 3]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So, what's the advantage of returning an *iterator* and not a list of lists? **Memory and time efficiency** !\n", "\n", "The $n!$ permutations (if the list is of size $n$) take both a lot of time to compute and a lot of memory to store, so it's very clever if we can generate one after another, on demand, instead of having to compute all of them and storing them.\n", "\n", "The first two algorithms presented below are not iterators, but the last one will be." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Permutations of the list are given as tuples, but there is no difference." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can check how quick is this first function:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 0 ns, sys: 0 ns, total: 0 ns\n", "Wall time: 29.1 µs\n", "CPU times: user 44 ms, sys: 12 ms, total: 56 ms\n", "Wall time: 114 ms\n", "CPU times: user 172 ms, sys: 32 ms, total: 204 ms\n", "Wall time: 273 ms\n" ] }, { "data": { "text/plain": [ "362880" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%time len(list(itertools_permutations(list(range(4)))))\n", "%time len(list(itertools_permutations(list(range(8)))))\n", "%time len(list(itertools_permutations(list(range(9)))))" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1 loop, best of 3: 1.17 s per loop\n" ] } ], "source": [ "%timeit len(list(itertools_permutations(list(range(10)))))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There is $n!$ permutations to generate, so obviously any algorithm is running in $\\Omega(n!)$ time to generate all of them, and that is approximately the behavior observed above.\n", "\n", "> This claim should need better measurements to be really empirically supported!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "----\n", "# 2. First algorithm : The insert-into-all-positions solution\n", "\n", "The basic idea is to separate the first element $x$ of the list, and the rest $xs$.\n", "\n", "- For instance, for $l = [1, 2, 3]$, $x = 1$ and $xs = [2, 3]$.\n", "- Then the permutations of $l$ are obtained by inserting $x$ in every possible positions of every permutations of $xs$.\n", "- Here, the permutations of $xs$ are $[2, 3]$ and $[3, 2]$. Inserting $x = 1$ in the first one give $[1, 2, 3]$ (first position), $[2, 1, 3]$ and $[2, 3, 1]$. Similarly, we obtain the last permutations: $[1, 3, 2]$, $[3, 1, 2]$ and $[3, 2, 1]$.\n", "\n", "So we first need a function that insert an element $x$ in every possible index of a list $l$." ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def ins_all_positions(x, l):\n", " \"\"\"Return a list of lists obtained from l by inserting x at every possible index.\"\"\"\n", " res = []\n", " for i in range(0, len(l) + 1):\n", " res.append(l[:i] + [x] + l[i:])\n", " return res" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then we write a recursive function, following the description of the algorithm:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from functools import reduce\n", "# reduce(lambda acc, p: acc + f(p), l, []) is the same as the concatenation of list f(p) for p in l" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Now the main permutations generator.\n", "def first_permutations(iterable):\n", " \"\"\"Second algorithm, insert-into-all-positions solution.\"\"\"\n", " if len(iterable) == 0:\n", " return []\n", " # we must specify this edge case\n", " elif len(iterable) == 1:\n", " return [[iterable[0]]]\n", " else:\n", " x, xs = iterable[0], iterable[1:]\n", " # reduce is needed instead of a simple sum(...) as sum() only works for numerical values\n", " return reduce(lambda acc, p: acc + ins_all_positions(x, p), first_permutations(xs), [])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can try it out, but only on small list as it is *not efficient*." ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "[[1, 2, 3], [2, 1, 3], [2, 3, 1], [1, 3, 2], [3, 1, 2], [3, 2, 1]]" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "first_permutations([1, 2, 3])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And let's measure its efficiency on small lists of size $4,5,6,7,8$:" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 0 ns, sys: 0 ns, total: 0 ns\n", "Wall time: 89.2 µs\n", "CPU times: user 0 ns, sys: 0 ns, total: 0 ns\n", "Wall time: 392 µs\n", "CPU times: user 0 ns, sys: 0 ns, total: 0 ns\n", "Wall time: 16.2 ms\n", "CPU times: user 72 ms, sys: 4 ms, total: 76 ms\n", "Wall time: 168 ms\n", "CPU times: user 4.58 s, sys: 96 ms, total: 4.68 s\n", "Wall time: 8.11 s\n" ] }, { "data": { "text/plain": [ "40320" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%time len(list(first_permutations(list(range(4)))))\n", "%time len(list(first_permutations(list(range(5)))))\n", "%time len(list(first_permutations(list(range(6)))))\n", "%time len(list(first_permutations(list(range(7)))))\n", "%time len(list(first_permutations(list(range(8)))))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$\\implies$ This implementation take about $8 s$ for a list with $n = 8$ elements: **it's crazily slow!**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "----\n", "# 3. Second algorithm : The fixed-head solution\n", "\n", "The second algorithm will not be more efficient, but it is different in his design.\n", "\n", "Instead of inserting an element at every possible index, this second algorithm rather generate the permutations by considering that every element of the list will be the head of some of the permutation.\n", "\n", "With a fixed head, ie an element $x$, removed from the list $xs = l \\setminus x$, permutations of $l$ are obtained by simply adding $x$ as the head of every permutation of $xs$.\n", "\n", "As for the first algorithm, this one is also recursive.\n", "\n", "One limitation of its simple implementation below is that it requires all elements in the list to be different, as it will compute the list $xs = l \\setminus x$ with this very simple function `rm(x, l)` :" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def rm(x, l):\n", " \"\"\"List l without element x.\"\"\"\n", " return [y for y in l if x != y]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Note that with comparisons on indexes instead of comparisons on values, we could treat the general case not much harder.\n", "\n", "Then, we need, as before, a function to add $x$ as the head of all lists $p$ in a certain list of lists $l$." ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def head_of_all(x, l):\n", " \"\"\"List of lists from l where x is the head of all the lists.\"\"\"\n", " return [[x] + p for p in l]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And finally, the fixed-head algorithm is easy to implement, as a recursive function.\n", "- The case of en empty list or a list with only one element are easy,\n", "- The recursion case uses, again, a call to `reduce(fun acc, x: acc + f(x), list, [])` to permforms the concatenation of all lists `f(x)` for `x` in `l`." ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def second_permutations(iterable):\n", " \"\"\"Second algorithm, fixed-head solution.\"\"\"\n", " if len(iterable) == 0:\n", " return []\n", " # we must specify this edge case\n", " elif len(iterable) == 1:\n", " return [[iterable[0]]]\n", " else:\n", " return reduce(lambda acc, x: acc + head_of_all(x, second_permutations(rm(x, iterable))), iterable, [])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's try it out:" ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "[[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "second_permutations([1, 2, 3])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And let's measure its efficiency on small lists of size $4,5,6,7,8$:" ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 0 ns, sys: 0 ns, total: 0 ns\n", "Wall time: 180 µs\n", "CPU times: user 0 ns, sys: 0 ns, total: 0 ns\n", "Wall time: 869 µs\n", "CPU times: user 4 ms, sys: 0 ns, total: 4 ms\n", "Wall time: 30.1 ms\n", "CPU times: user 44 ms, sys: 0 ns, total: 44 ms\n", "Wall time: 53 ms\n", "CPU times: user 472 ms, sys: 12 ms, total: 484 ms\n", "Wall time: 582 ms\n", "CPU times: user 6.78 s, sys: 88 ms, total: 6.86 s\n", "Wall time: 7.38 s\n" ] }, { "data": { "text/plain": [ "362880" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%time len(list(second_permutations(list(range(4)))))\n", "%time len(list(second_permutations(list(range(5)))))\n", "%time len(list(second_permutations(list(range(6)))))\n", "%time len(list(second_permutations(list(range(7)))))\n", "%time len(list(second_permutations(list(range(8)))))\n", "%time len(list(second_permutations(list(range(9)))))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$\\implies$ this second algorithm is more efficient, as it requires only $0.6 s$ to generate the $8! = 40320$ different permutations of the list $[0, 1, 2, 3, 4, 5, 6, 7]$." ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "40320" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from math import factorial\n", "factorial(8)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "----\n", "# 3. Third algorithm : the Johnson-Trotter algorithm\n", "\n", "This algorithm is more complicated to explain, I will let you refer to [its Wikipedia page](https://en.wikipedia.org/wiki/Johnson-Trotter), or for more details, [this blog post](http://typeocaml.com/2015/05/05/permutation/).\n", "\n", "We use simple values to denote directions, `left` or `right`:" ] }, { "cell_type": "code", "execution_count": 41, "metadata": { "collapsed": true }, "outputs": [], "source": [ "left = False\n", "right = True" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will need a first function to attach a direction to every element of an array `t`, and then to remove them." ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def attach_direction(t, d=left):\n", " \"\"\"Attach the direction d to all elements of array t.\"\"\"\n", " return [(x, d) for x in t]" ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def remove_direction(t):\n", " \"\"\"Remove the attached direction d to all elements of array t.\"\"\"\n", " return [y for y, _ in t]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This classical function `swap(t, i, j)` exchange the position of the elements `t[i]` and `t[j]`:" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def swap(t, i, j):\n", " \"\"\"Swap t[i] and t[j] in array t.\"\"\"\n", " t[i], t[j] = t[j], t[i]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We first need to know if the element `a[i]` can be moved, according to its attached direction, to the left or right.\n", "The rule is that an element can only be swapped to a **small** element." ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def is_movable(a, i):\n", " \"\"\"Can a[i] be moved?\"\"\"\n", " x, d = a[i]\n", " if d == left:\n", " return i > 0 and x > a[i - 1][0]\n", " elif d == right:\n", " return i < len(a) - 1 and x > a[i + 1][0]\n", " else:\n", " raise ValueError(\"unknown direction d = {}\".format(d))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then the function `move(a, i)` simply swaps `a[i]` to the left or right, if it is possible.\n", "\n", "It raises a `ValueError` exception if it cannot swap, to be general, but of course the algorithm will never be in such a undesirable state." ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def move(a, i):\n", " \"\"\"Move it if possible.\"\"\"\n", " x, d = a[i]\n", " if is_movable(a, i):\n", " if d == left:\n", " swap(a, i, i - 1)\n", " elif d == right:\n", " swap(a, i, i + 1)\n", " else:\n", " raise ValueError(\"unknown direction d = {}\".format(d))\n", " else:\n", " raise ValueError(\"not movable\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then we need a function to scan the array `a`, from its beginning, to find the largest movable element.\n", "This can cost upto a time of $O(n)$ (if $n = \\#a$), but it could hardly be improved." ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def scan_largest_movable(a):\n", " \"\"\"Find the largest movable element.\"\"\"\n", " def aux(acc, i):\n", " if i >= len(a):\n", " return acc\n", " else:\n", " if not is_movable(a, i):\n", " return aux(acc, i + 1)\n", " else:\n", " x, _ = a[i]\n", " if acc is None:\n", " return aux(i, i + 1)\n", " else:\n", " j = acc if x < a[acc][0] else i\n", " return aux(j, i + 1)\n", " return aux(None, 0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Directions will be flipped, alternating `left` and `right`, with `flip(d)`:" ] }, { "cell_type": "code", "execution_count": 52, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def flip(d):\n", " \"\"\"Flip direction d : left -> right, right -> left\"\"\"\n", " return not d" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then the list will need to be scanned, and flip the directions of all elements larger than some `x`:" ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def scan_flip_larger(x, a):\n", " \"\"\"Scan to flip larger.\"\"\"\n", " for i, (y, d) in enumerate(a):\n", " if y > x:\n", " a[i] = y, flip(d)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We finally have all the pieces needed to implement the Johnson-Trotter algorithm:" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def third_permutations(iterable):\n", " \"\"\"Third algorithm, Johnson-Trotter algorithm.\"\"\"\n", " i = sorted(list(iterable)) # Required by the algorithm\n", " # We attach directions, and we will only use the array a\n", " a = attach_direction(i)\n", " # First permutation\n", " r = list(iterable)[:]\n", " while True:\n", " yield r[:] # A copy of the current permutation is yielded\n", " i = scan_largest_movable(a)\n", " if i is None: # No more permutation!\n", " raise StopIteration\n", " else:\n", " x, _ = a[i]\n", " move(a, i)\n", " scan_flip_larger(x, a)\n", " # The next permutation should not have direction information attached to it\n", " r = remove_direction(a)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Yeay, we finally have an **iterator** on permutations of a list, instead of generating all of them.\n", "\n", "Let's try it out:" ] }, { "cell_type": "code", "execution_count": 53, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "third_permutations([1, 2, 3])" ] }, { "cell_type": "code", "execution_count": 54, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "[[1, 2, 3], [1, 3, 2], [3, 1, 2], [3, 2, 1], [2, 3, 1], [2, 1, 3]]" ] }, "execution_count": 54, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(third_permutations([1, 2, 3]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And let's measure its efficiency on small lists of size $4,5,6,7,8$:" ] }, { "cell_type": "code", "execution_count": 55, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 0 ns, sys: 0 ns, total: 0 ns\n", "Wall time: 177 µs\n", "CPU times: user 0 ns, sys: 0 ns, total: 0 ns\n", "Wall time: 904 µs\n", "CPU times: user 4 ms, sys: 0 ns, total: 4 ms\n", "Wall time: 6.25 ms\n", "CPU times: user 44 ms, sys: 0 ns, total: 44 ms\n", "Wall time: 47.9 ms\n", "CPU times: user 484 ms, sys: 0 ns, total: 484 ms\n", "Wall time: 752 ms\n", "CPU times: user 6.18 s, sys: 36 ms, total: 6.21 s\n", "Wall time: 6.61 s\n" ] }, { "data": { "text/plain": [ "362880" ] }, "execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%time len(list(second_permutations(list(range(4)))))\n", "%time len(list(second_permutations(list(range(5)))))\n", "%time len(list(second_permutations(list(range(6)))))\n", "%time len(list(second_permutations(list(range(7)))))\n", "%time len(list(second_permutations(list(range(8)))))\n", "%time len(list(second_permutations(list(range(9)))))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$\\implies$ the Johnson-Trotter algorithm is, as expected, quicker than the previous naive implementations, but it's still pretty slow compared to the reference implementation [`itertools.permutation`](https://docs.python.org/3/library/itertools.html#itertools.permutations)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "----\n", "# 4. Comparing our Johnson-Trotter implementation and `itertools.permutation`\n", "\n", "To compare them fairly, we need to run them several times:" ] }, { "cell_type": "code", "execution_count": 63, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The slowest run took 7.44 times longer than the fastest. This could mean that an intermediate result is being cached.\n", "100000 loops, best of 3: 2.13 µs per loop\n", "10000 loops, best of 3: 51.9 µs per loop\n" ] } ], "source": [ "%timeit len(list(itertools_permutations([1, 2, 3])))\n", "%timeit len(list(third_permutations([1, 2, 3])))" ] }, { "cell_type": "code", "execution_count": 64, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The slowest run took 5.71 times longer than the fastest. This could mean that an intermediate result is being cached.\n", "10000 loops, best of 3: 14.5 µs per loop\n", "1000 loops, best of 3: 1.52 ms per loop\n" ] } ], "source": [ "%timeit len(list(itertools_permutations([1, 2, 3, 4, 5])))\n", "%timeit len(list(third_permutations([1, 2, 3, 4, 5])))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, [IPython](http://ipython.org/)'s `%timeit` function warns us that `itertools.permutation` *could* use caching, and that could bias the result.\n", "\n", "One easy way to remove any caching is to test on different input lists, and that can be done, for instance, with random lists." ] }, { "cell_type": "code", "execution_count": 61, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "[384, 283, 979, 482, 388]" ] }, "execution_count": 61, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from numpy.random import choice\n", "\n", "def random_list_of_size_n(n=5, N=1000):\n", " return list(choice(list(range(1, N + 1)), size=n, replace=False))\n", "\n", "random_list_of_size_n(5)" ] }, { "cell_type": "code", "execution_count": 65, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1000 loops, best of 3: 318 µs per loop\n", "100 loops, best of 3: 2.1 ms per loop\n" ] } ], "source": [ "%timeit len(list(itertools_permutations(random_list_of_size_n(5))))\n", "%timeit len(list(third_permutations(random_list_of_size_n(5))))" ] }, { "cell_type": "code", "execution_count": 66, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1000 loops, best of 3: 313 µs per loop\n", "100 loops, best of 3: 10.6 ms per loop\n" ] } ], "source": [ "%timeit len(list(itertools_permutations(random_list_of_size_n(6))))\n", "%timeit len(list(third_permutations(random_list_of_size_n(6))))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "----\n", "# 5. Testing our $3$ implementations\n", "\n", "Additionnally to comparing the speed efficiency, we would also like to simply check that all the functions we wrote are working as expected!" ] }, { "cell_type": "code", "execution_count": 72, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def test(list_of_f, iterable):\n", " \"\"\" Test that all functions in list_of_f give the same list of permutation on this iterable.\"\"\"\n", " print(\"Testing for the list of functions {} ...\".format([f.__name__ for f in list_of_f])) # DEBUG\n", " result = True\n", " print(\"Testing for the iterable {} ...\".format(iterable)) # DEBUG\n", " i = iterable\n", " allperms = []\n", " for f in list_of_f:\n", " allperms.append(sorted([list(p) for p in f(iterable)]))\n", " for i, pi in enumerate(allperms):\n", " for j in range(i + 1, len(allperms)):\n", " pj = allperms[j]\n", " if pi != pj:\n", " print(\" - Function #{} ({.__name__}) gave a different list of permutations as function #{} ({.__name__}) ...\".format(i, list_of_f[i], j, list_of_f[j])) # DEBUG\n", " result = False\n", " else:\n", " print(\" - Function #{} ({.__name__}) gave the same list of permutations as function #{} ({.__name__}) ...\".format(i, list_of_f[i], j, list_of_f[j])) # DEBUG\n", " return result" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will test and compare the reference implementation, `itertools.permutation`, with the three other implementations given above." ] }, { "cell_type": "code", "execution_count": 73, "metadata": { "collapsed": true }, "outputs": [], "source": [ "list_of_f = [itertools_permutations, first_permutations, second_permutations, third_permutations]\n" ] }, { "cell_type": "code", "execution_count": 74, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Testing for the list of functions ['permutations', 'first_permutations', 'second_permutations', 'third_permutations'] ...\n", "Testing for the iterable [1, 2, 3] ...\n", " - Function #0 (permutations) gave the same list of permutations as function #1 (first_permutations) ...\n", " - Function #0 (permutations) gave the same list of permutations as function #2 (second_permutations) ...\n", " - Function #0 (permutations) gave the same list of permutations as function #3 (third_permutations) ...\n", " - Function #1 (first_permutations) gave the same list of permutations as function #2 (second_permutations) ...\n", " - Function #1 (first_permutations) gave the same list of permutations as function #3 (third_permutations) ...\n", " - Function #2 (second_permutations) gave the same list of permutations as function #3 (third_permutations) ...\n" ] }, { "data": { "text/plain": [ "True" ] }, "execution_count": 74, "metadata": {}, "output_type": "execute_result" } ], "source": [ "iterable = [1, 2, 3]\n", "test(list_of_f, iterable)" ] }, { "cell_type": "code", "execution_count": 75, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Testing for the list of functions ['permutations', 'first_permutations', 'second_permutations', 'third_permutations'] ...\n", "Testing for the iterable [1, 2, 3, 4, 5] ...\n", " - Function #0 (permutations) gave the same list of permutations as function #1 (first_permutations) ...\n", " - Function #0 (permutations) gave the same list of permutations as function #2 (second_permutations) ...\n", " - Function #0 (permutations) gave the same list of permutations as function #3 (third_permutations) ...\n", " - Function #1 (first_permutations) gave the same list of permutations as function #2 (second_permutations) ...\n", " - Function #1 (first_permutations) gave the same list of permutations as function #3 (third_permutations) ...\n", " - Function #2 (second_permutations) gave the same list of permutations as function #3 (third_permutations) ...\n" ] }, { "data": { "text/plain": [ "True" ] }, "execution_count": 75, "metadata": {}, "output_type": "execute_result" } ], "source": [ "iterable = [1, 2, 3, 4, 5]\n", "test(list_of_f, iterable)" ] }, { "cell_type": "code", "execution_count": 76, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Testing for the list of functions ['permutations', 'first_permutations', 'second_permutations', 'third_permutations'] ...\n", "Testing for the iterable [1, 2, 3, 4, 5, 6] ...\n", " - Function #0 (permutations) gave the same list of permutations as function #1 (first_permutations) ...\n", " - Function #0 (permutations) gave the same list of permutations as function #2 (second_permutations) ...\n", " - Function #0 (permutations) gave the same list of permutations as function #3 (third_permutations) ...\n", " - Function #1 (first_permutations) gave the same list of permutations as function #2 (second_permutations) ...\n", " - Function #1 (first_permutations) gave the same list of permutations as function #3 (third_permutations) ...\n", " - Function #2 (second_permutations) gave the same list of permutations as function #3 (third_permutations) ...\n" ] }, { "data": { "text/plain": [ "True" ] }, "execution_count": 76, "metadata": {}, "output_type": "execute_result" } ], "source": [ "iterable = [1, 2, 3, 4, 5, 6]\n", "test(list_of_f, iterable)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "----\n", "> That's it for today, folks!\n", "\n", "- If you want to read more about permutations and algorithms *to generate them all*, [this page is very helpful](https://en.wikipedia.org/wiki/Permutation#Algorithms_to_generate_permutations),\n", "- But if you want to find *one ring to rule them all*, [Bilbo is the guy to ask to](http://www.lmgtfy.com/?q=one%20ring%20to%20rule%20them%20all).\n", "\n", "More notebooks can be found on [my GitHub page](https://GitHub.com/Naereen/notebooks)." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.4.3+" } }, "nbformat": 4, "nbformat_minor": 1 }