{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "Consider a simple, recursive implementation of Fibonacci:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "def fib(n):\n", " '''Regular, recursive Fibonacci.'''\n", " if n < 2:\n", " return n\n", " return fib(n - 1) + fib(n - 2)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[fib(n) for n in range(10)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Problem is...it is slow:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "387 ms ± 864 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)\n" ] } ], "source": [ "%timeit fib(30)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It's slow because it recomputes many smaller Fibonacci numbers over and over again, resulting in [exponential cost](https://en.wikipedia.org/wiki/Dynamic_programming#Fibonacci_sequence). Dynamic programming techniques can be used to speed things up." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### `%timeOnce` testing magic" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The default behaviour of `%timeit` is to re-run a statement 7 times with each run comprising of several loops. Since this notebook will look at memoized functions (which can return repeated function calls in constant time), it is better to have only 1 test run and 1 loop." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Created `%timeOnce` as an alias for `%timeit -n1 -r1`.\n", "Created `%%timeOnce` as an alias for `%%timeit -n1 -r1`.\n" ] } ], "source": [ "%alias_magic timeOnce %timeit -p \"-n1 -r1\"" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "389 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n" ] } ], "source": [ "%timeOnce fib(30)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Dynamic Programming\n", "\n", "Dynamic programming involves breaking a problem up into smaller sub-problems, solving the sub-problems, and then using the solutions to solve the original problem. This strategy can be employed in two ways: top-down and bottom-up.\n", "\n", "## Memoization (top-down)\n", "\n", "Memoization is a technique where the results of a function call are remembered by the function itself. When a memoized function is called with a particular input, the result is computed, stored somewhere (e.g. a hashmap), and the result is returned. Now if that function is called again with the same input, the result can simply be retrieved from the hashmap instead of recomputed.\n", "\n", "Python makes memoization very easy with the [@lru_cache](https://docs.python.org/3/library/functools.html) decorator. As it happens, Fibonacci is used by the Python docs to illustrate how `@lru_cache` works." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "from functools import lru_cache" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "@lru_cache(maxsize=None)\n", "def memoized_fib(n):\n", " '''Memoized Fibonacci.\n", " source: https://docs.python.org/3/library/functools.html'''\n", " if n < 2:\n", " return n\n", " return memoized_fib(n - 1) + memoized_fib(n - 2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The only difference between `memoized_fib()` and `fib()` is the `@lru_cache` decorator, and yet:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "20.3 µs ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n" ] } ], "source": [ "%timeOnce memoized_fib(30)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`fib()` takes ~400 milliseconds to compute the 30th Fibonnaci number while `memoized_fib()` only takes ~20 *micro seconds*. The [cost is O(n)](https://en.wikipedia.org/wiki/Dynamic_programming#Fibonacci_sequence).\n", "\n", "`memoized_fib()` does even better when tested again:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1.4 µs ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n" ] } ], "source": [ "%timeOnce memoized_fib(30)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For the second call, `memoized_fib()` only needs to retrieve the result of `memoized_fib(30)` from the cache so it is significantly faster." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is possible to view the effectiveness of the cache with `.cache_info`:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "CacheInfo(hits=29, misses=31, maxsize=None, currsize=31)" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "memoized_fib.cache_info()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And to clear it:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "memoized_fib.cache_clear()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "While memoization is easy (in Python), it's not a perfect solution. Lots of calls must be made to start building up the cache so it is easy to hit the recursion limit." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "maximum recursion depth exceeded while calling a Python object\n" ] } ], "source": [ "try:\n", " memoized_fib(800)\n", "except RecursionError as e:\n", " print(e)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Tabulation (bottom-up)\n", "\n", "With memoization, the cache is built-up \"automatically\" from function calls. However, there is no guarrentee that the construction of the cache is done optimally. Although it is true that no computation is repeated twice, it can take many recursive calls before a cached solution is found. Can this be avoided?\n", "\n", "What if the cache were built-up intentionally? It can start with some base values:\n", "\n", "\\begin{align}\n", " \\text{fib}(0) &= 0 \\\\\n", " \\text{fib}(1) &= 1\n", "\\end{align}\n", "\n", "and these base values can be used to add larger Fibonacci numbers to the cache:\n", "\n", "\\begin{align}\n", " \\text{fib}(2) &= \\text{fib}(1) + \\text{fib}(0) \\\\\n", " \\text{fib}(2) &= 1\n", "\\end{align}\n", "\n", "This process can be repeated all the way up to the Fibonacci number given to the function.\n", "\n", "General formula:\n", "\n", "$$ \\text{fib}(n) = \\text{fib}(n-1) + \\text{fib}(n-2) $$" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "def tabulated_fib(n):\n", " '''Tabulated Fibonacci.'''\n", " # Create a cache to store computed Fibonacci values.\n", " # Seed it with fib(0) = 0 and fib(1) = 1.\n", " cache = [0, 1]\n", " # Compute fib(2), fib(3), ..., fib(n)\n", " for num in range(2, n+1):\n", " cache.append(cache[num-1] + cache[num-2])\n", " \n", " return cache[n]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "How does it fare against `memoized_fib()`?" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "8 µs ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n" ] } ], "source": [ "%timeOnce tabulated_fib(30)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`tabulated_fib()` is a bit faster than `memoized_fib()`, but it is still in the same complexity class as `memoized_fib`: $O(n)$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another advantage `tabulated_fib()` has over `memoized_fib()` is that it is not recursive, so it will never hit the maximum recursion limit." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "69283081864224717136290077681328518273399124385204820718966040597691435587278383112277161967532530675374170857404743017623467220361778016172106855838975759985190398725" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tabulated_fib(800)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The full cache isn't actually needed to compute Fibonacci, but I included it because dynamic programming often involves creating tables of solutions to small problems and referring to those tables to solve large problems. For Fibonacci, only the last two computations are needed but generally speaking, larger tables may be necessary to solve other problems using a dynamic programming approach." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "def slim_fib(n):\n", " '''Only remember the last two computations (fib(n-1) and fib(n-2)).'''\n", " prev_fib = 1\n", " prev_prev_fib = 0\n", " for num in range(2, n+1):\n", " next_fib = prev_prev_fib + prev_fib\n", " prev_prev_fib = prev_fib\n", " prev_fib = next_fib\n", "\n", " return prev_fib" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`slim_fib()` is fastest of all:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "3.1 µs ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n" ] } ], "source": [ "%timeOnce slim_fib(30)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`slim_fib()` is also $O(n)$, but only takes $O(1)$ space (compared to `tabulated_fib()` which takes $O(n)$ space)." ] } ], "metadata": { "kernelspec": { "display_name": "Python [default]", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.2" } }, "nbformat": 4, "nbformat_minor": 2 }