{ "metadata": { "name": "recitation6" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# CS1001.py\n", "\n", "## Extended Introduction to Computer Science with Python, Tel-Aviv University, Spring 2013\n", "\n", "# Recitation 6 - 18-22.4.2013\n", "\n", "## Last update: 18.4.2013" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# [HW 2](http://tau-cs1001-py.wdfiles.com/local--files/home-assignments-2013b/HW2.pdf)\n", "\n", "## General comments\n", "\n", "* `print` and `return` are different things!\n", " * `print` writes to the screen\n", " * `return` is a mechanism for function output\n", "* `Big O` and `little o` are not the same. For example, $n=O(n)\\;$ but $n \\ne o(n)$. Definitions:\n", " * $f=O(g) \\Rightarrow \\exists M, x_o \\; s.t. \\; |f(x)| < M|g(x)|, \\; \\forall x>x_0$\n", " * $f=o(g) \\Rightarrow \\lim_{x \\to \\infty}{\\frac{f(x)}{g(x)}}=0$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Question 4c - complexity of addition and multiplication\n", "\n", "Consider two binary strings, *A* and *B* with number of bits *n* and *m*, respectively. \n", "\n", "Assume that every bit operation (addition or multiplication of two bits) takes a single time unit to execute. \n", "\n", "### Addition\n", "\n", "The complexity is $O(max(n,m)) \\;\\;$ because in the worst case scenario there is a carry bit for the entire length of the longer string. Therefore, there is a bit addition for every position in the longer bit string.\n", "\n", "### Multiplication\n", "\n", "First we multiply every bit in *B* with every bit in *A*. This takes $n \\cdot m$ bit operations. \n", "\n", "We than have $m$ bit strings which we need to add. The longest of these is of length $n+m-1 \\;$ and the shortest is of length $n \\;$. Therefore the number of bit operations is \n", "$nm + (n+m-1) + (n+m) + ... + n = (n+m-1+n)\\cdot \\frac{m}{2} = \\frac{nm + m^2 -m +nm}{2} = O(nm+m^2)$.\n", "\n", "Since we can swap *A* and *B* before starting we can enforce $m < n$ to get $O(nm)$.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Recursion\n", "\n", "## The recursive bionomial formula\n", "\n", "$$\n", "\\binom{n}{k} = \\binom{n-1}{k-1} + \\binom{n-1}{k} \\\\\\\\\n", "\\binom{n}{0} = \\binom{n}{n} = 1\n", "$$\n", "\n", "### Naive implementation" ] }, { "cell_type": "code", "collapsed": false, "input": [ "def binom_naive(n, k):\n", " if k == n or k == 0:\n", " return 1\n", " if n < k or n < 0 or k < 0:\n", " return 0\n", " return binom_naive(n - 1, k - 1) + binom_naive(n - 1, k)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 1 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit -n 3 binom_naive(6,4)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "3 loops, best of 3: 12.3 us per loop\n" ] } ], "prompt_number": 5 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit -n 3 binom_naive(26,4)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "3 loops, best of 3: 13 ms per loop\n" ] } ], "prompt_number": 6 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit -n 3 binom_naive(26,13)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "3 loops, best of 3: 12.6 s per loop\n" ] } ], "prompt_number": 7 }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Complexity\n", "\n", "The leaves of the recursion tree are those functions that return 1. So the value of $\\binom{n}{k}$ is computed as $1+1+1+...+1$, and in total we use $\\binom{n}{k}-1 \\;$ additions to compute $\\binom{n}{k}$. So the complexity is $O(\\binom{n}{k})$. How does this compare to complexities we are familiar with? It [can be shown](http://www.site.uottawa.ca/~ivan/tr-educ.pdf) that this is $O(e^n)$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Memoization implementation\n", "Since the binomial formula is deterministic we don't want to calculate the same values over and over again, as this creates a large burden on the efficiency of the algorithm.\n", "\n", "Therefore we **memoize** calculated valued." ] }, { "cell_type": "code", "collapsed": false, "input": [ "binom_memory = {}\n", "\n", "def binom_mem(n, k):\n", " if k == n or k == 0:\n", " return 1\n", " if k > n or n < 0 or k < 0:\n", " return 0\n", " key = (n,k)\n", " if key not in binom_memory:\n", " binom_memory[key] = binom_mem(n - 1, k - 1) + binom_mem(n - 1, k)\n", " return binom_memory[key]" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 14 }, { "cell_type": "markdown", "metadata": {}, "source": [ "For a fair comperison we init the dictionary before every call:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "%%timeit -n 3 \n", "binom_memory.clear()\n", "binom_mem(6,4)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "3 loops, best of 3: 13.5 us per loop\n" ] } ], "prompt_number": 32 }, { "cell_type": "code", "collapsed": false, "input": [ "%%timeit -n 3 \n", "binom_memory.clear()\n", "binom_mem(26,4)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "3 loops, best of 3: 161 us per loop\n" ] } ], "prompt_number": 33 }, { "cell_type": "code", "collapsed": false, "input": [ "%%timeit -n 3 \n", "binom_memory.clear()\n", "binom_mem(26,13)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "3 loops, best of 3: 321 us per loop\n" ] } ], "prompt_number": 34 }, { "cell_type": "markdown", "metadata": {}, "source": [ "But if we don't clear the dictionary it is even better:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "%%timeit -n 3\n", "binom_mem(26,13)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "3 loops, best of 3: 1.01 us per loop\n" ] } ], "prompt_number": 36 }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Complexity \n", "The memoization technique greatly improves the complexity to $O(nk)$. This is because we only need, at most, to compute once each combination of $n$ and $k$ and there are $n\\cdot k$ such combinations. Of course we don't calculate all the combinations, only those where $k < n$, but that does not change the complexity." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Recursive factorial implementation\n", "\n", "This implementation is based on the definition of the binomial coefficient: $\\binom{n}{k} = \\frac{n!}{k! (n-k)!}$.\n", "\n", "The factortial $n!$ is calculated recusrively: $n! = n \\cdot (n-1)!$." ] }, { "cell_type": "code", "collapsed": false, "input": [ "def binom_formula(n,k):\n", " return factorial(n) // (factorial(k) * factorial(n - k))\n", "\n", "def factorial(n):\n", " if n == 0: \n", " return 1\n", " return n * factorial(n - 1)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 3 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit -n 3 binom_formula(6,4)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "3 loops, best of 3: 22.1 us per loop\n" ] } ], "prompt_number": 4 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit -n 3 binom_formula(26,4)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "3 loops, best of 3: 49.9 us per loop\n" ] } ], "prompt_number": 5 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit -n 3 binom_formula(26,13)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "3 loops, best of 3: 48.7 us per loop\n" ] } ], "prompt_number": 6 }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Complexity\n", "\n", "To calculate $n!$ we need to do $n$ multiplications. So for $\\binom{n}{k}=\\frac{n!}{k!(n-k)!}\\;$ there are $n + (n-k) + k\\;$ multiplications, which is a total of $2n$ multiplications, plus another multiplication in the denominator and one division. Therefore the complexity is $O(n)$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Running time comparison" ] }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit -n 100 binom_naive(14,5)\n", "%timeit -n 1000 binom_formula(14,5)\n", "%timeit -n 1000 binom_mem(14,5)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "100 loops, best of 3: 1.68 ms per loop\n", "1000 loops, best of 3: 7.52 us per loop\n", "1000 loops, best of 3: 809 ns per loop\n" ] } ], "prompt_number": 48 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit -n 1 binom_naive(24,8)\n", "%timeit -n 10 binom_formula(24,8)\n", "%timeit -n 10 binom_mem(24,8)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "1 loops, best of 3: 602 ms per loop\n", "10 loops, best of 3: 13.4 us per loop\n", "10 loops, best of 3: 966 ns per loop\n" ] } ], "prompt_number": 47 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Pascal's triangle\n" ] }, { "cell_type": "code", "collapsed": false, "input": [ "def pascal(n):\n", " for i in range(n + 1):\n", " for j in range(i + 1):\n", " print(binom_mem(i,j), end=\"\\t\")\n", " print()" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 106 }, { "cell_type": "code", "collapsed": false, "input": [ "pascal(4)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "1\t\n", "1\t1\t\n", "1\t2\t1\t\n", "1\t3\t3\t1\t\n", "1\t4\t6\t4\t1\t\n" ] } ], "prompt_number": 107 }, { "cell_type": "code", "collapsed": false, "input": [ "pascal(7)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "1\t\n", "1\t1\t\n", "1\t2\t1\t\n", "1\t3\t3\t1\t\n", "1\t4\t6\t4\t1\t\n", "1\t5\t10\t10\t5\t1\t\n", "1\t6\t15\t20\t15\t6\t1\t\n", "1\t7\t21\t35\t35\t21\t7\t1\t\n" ] } ], "prompt_number": 108 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Change\n", "\n", "A bus driver needs to give change. Given the amount to change and a list of coin types, how many ways are there to give change?\n", "\n", "Solution:G\n", "Given amount of change $x$ and the coins $a_1, a_2, ..., a_n$, one can either use the largest coin or not use it:\n", "\n", "$$\n", "f(x,n) = f(x - a_n, n) + f(x, n-1)\n", "$$\n", "\n", "with stopping criteria:\n", "\n", "$$\n", "f(0,n) = 1 \\\\\\\\\n", "f(x \\lt 0, n) = 0 \\\\\\\\\n", "f(x, 0) = 0\n", "$$" ] }, { "cell_type": "code", "collapsed": false, "input": [ "def count_change(amount, coins):\n", " if amount == 0:\n", " return 1\n", " if amount < 0 or len(coins) == 0:\n", " return 0\n", " without_large_coint = count_change(amount, coins[:-1]) \n", " with_large_coin = count_change(amount - coins[-1], coins)\n", " return without_large_coint + with_large_coin" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 51 }, { "cell_type": "code", "collapsed": false, "input": [ "count_change(5, [1,2,3])" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 52, "text": [ "5" ] } ], "prompt_number": 52 }, { "cell_type": "markdown", "metadata": {}, "source": [ "An example of a *partial* recursion tree for returning a change of 5 with the coins 1, 2 and 3. Red indicates a stop criteria:\n", "\n", "![Change recursion tree](https://raw.github.com/yoavram/CS1001.py/master/recitation6_change_tree.PNG)\n", "\n", "As can be seen from the above diagram, this algorithm too can benefit from *memoization*. Try it at home!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Quicksort\n" ] }, { "cell_type": "code", "collapsed": false, "input": [ "def slowsort(lst):\n", " \"\"\"quicksort of list lst\"\"\"\n", " if len(lst) <= 1: \n", " return lst\n", " pivot = lst[0] # select first element from list\n", " smaller = [elem for elem in lst if elem < pivot] \n", " equal = [elem for elem in lst if elem == pivot] \n", " greater = [elem for elem in lst if elem > pivot]\n", " return slowsort(smaller) + equal + slowsort(greater)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 4 }, { "cell_type": "code", "collapsed": false, "input": [ "import random\n", "\n", "def quicksort(lst):\n", " \"\"\"quicksort of list lst\"\"\"\n", " if len(lst) <= 1: \n", " return lst\n", " pivot = random.choice(lst) # select a random element from list\n", " smaller = [elem for elem in lst if elem < pivot] \n", " equal = [elem for elem in lst if elem == pivot] \n", " greater = [elem for elem in lst if elem > pivot]\n", " return quicksort(smaller) + equal + quicksort(greater)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 5 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Running time on random input" ] }, { "cell_type": "code", "collapsed": false, "input": [ "lst = [random.randint(0,100) for _ in range(100000)]" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 121 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit -n 10 slowsort(lst)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "10 loops, best of 3: 143 ms per loop\n" ] } ], "prompt_number": 122 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit -n 10 quicksort(lst)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "10 loops, best of 3: 150 ms per loop\n" ] } ], "prompt_number": 123 }, { "cell_type": "code", "collapsed": false, "input": [ "lst = [i for i in range(100)]" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 6 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit -n 10 slowsort(lst)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "10 loops, best of 3: 2.55 ms per loop\n" ] } ], "prompt_number": 7 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit -n 10 quicksort(lst)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "10 loops, best of 3: 816 us per loop\n" ] } ], "prompt_number": 8 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Complexity\n", "\n", "Assume the values are unique (what will happen if all values are the same?, i.e. `lst = [1, 1, 1, 1, ..., 1]`).\n", "\n", "* At the first function call, you go over the entire list and separate the numbers that are smaller and bigger than the pivot. So this is $O(n)$.\n", "* At the next function calls, you go over two sub-lists and do the same for them, and the total of items in the two sub-lists is $n$, so this is also $O(n)$.\n", "* This goes on until we reach lists of length 1\n", "* Denoting the recursion depth by x, the complexity is therefore $O(n x)$.\n", "\n", "For **Slowsort**, the worst case is when the list is already sorted (`[1,2,3,4,5]`) in which case the pivot will always be the minimum and the sub-lists will always be of length *1* and *k-1$. \n", "\n", "The recursion depth will be $n$ and therefore the complexity $O(n^2)$.\n", "\n", "In the base case, however, the pivot is always the median, and therefore the sub-lists are always of equal size, making for a recursion depth of $\\log_2{n}$ and complexity of $O(n \\log{n})$.\n", "\n", "For **Quicksort** the choice of the pivot is random, so the algorithm complexity does not depend on the input *order*. \n", "\n", "On average, the pivot will be the median (which is the best choice), the sub-lists will be of equal size, and the recursion depth will be $\\log_2{n} \\;$ leading to a complexity of $O(n \\log{n})$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fin\n", "This notebook is part of the [Extended introduction to computer science](http://tau-cs1001-py.wikidot.com/) course at Tel-Aviv University.\n", "\n", "The notebook was written using Python 3.2 and IPython 0.13.1.\n", "\n", "The code is available at .\n", "\n", "The notebook can be viewed online at .\n", "\n", "This work is licensed under a [Creative Commons Attribution-ShareAlike 3.0 Unported License](http://creativecommons.org/licenses/by-sa/3.0/)." ] } ], "metadata": {} } ] }