{
 "metadata": {
  "name": "recitation6"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "# CS1001.py\n",
      "\n",
      "## Extended Introduction to Computer Science with Python, Tel-Aviv University, Spring 2013\n",
      "\n",
      "# Recitation 6 - 18-22.4.2013\n",
      "\n",
      "## Last update: 18.4.2013"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "# [HW 2](http://tau-cs1001-py.wdfiles.com/local--files/home-assignments-2013b/HW2.pdf)\n",
      "\n",
      "## General comments\n",
      "\n",
      "* `print` and `return` are different things!\n",
      "  * `print` writes to the screen\n",
      "  * `return` is a mechanism for function output\n",
      "* `Big O` and `little o` are not the same. For example, $n=O(n)\\;$ but $n \\ne o(n)$. Definitions:\n",
      "  * $f=O(g) \\Rightarrow \\exists M, x_o \\; s.t. \\; |f(x)| < M|g(x)|, \\; \\forall x>x_0$\n",
      "  * $f=o(g) \\Rightarrow \\lim_{x \\to \\infty}{\\frac{f(x)}{g(x)}}=0$"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Question 4c - complexity of addition and multiplication\n",
      "\n",
      "Consider two binary strings, *A* and *B* with number of bits *n* and *m*, respectively. \n",
      "\n",
      "Assume that every bit operation (addition or multiplication of two bits) takes a single time unit to execute. \n",
      "\n",
      "### Addition\n",
      "\n",
      "The complexity is $O(max(n,m)) \\;\\;$ because in the worst case scenario there is a carry bit for the entire length of the longer string. Therefore, there is a bit addition for every position in the longer bit string.\n",
      "\n",
      "### Multiplication\n",
      "\n",
      "First we multiply every bit in *B* with every bit in *A*. This takes $n \\cdot m$ bit operations. \n",
      "\n",
      "We than have $m$ bit strings which we need to add. The longest of these is of length $n+m-1 \\;$ and the shortest is of length $n \\;$. Therefore the number of bit operations is \n",
      "$nm + (n+m-1) + (n+m) + ... + n = (n+m-1+n)\\cdot \\frac{m}{2} = \\frac{nm + m^2 -m +nm}{2} = O(nm+m^2)$.\n",
      "\n",
      "Since we can swap *A* and *B* before starting we can enforce $m < n$ to get $O(nm)$.\n"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "# Recursion\n",
      "\n",
      "## The recursive bionomial formula\n",
      "\n",
      "$$\n",
      "\\binom{n}{k} = \\binom{n-1}{k-1} + \\binom{n-1}{k} \\\\\\\\\n",
      "\\binom{n}{0} = \\binom{n}{n} = 1\n",
      "$$\n",
      "\n",
      "### Naive implementation"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "def binom_naive(n, k):\n",
      "    if k == n or k == 0:\n",
      "        return 1\n",
      "    if n < k or n < 0 or k < 0:\n",
      "        return 0\n",
      "    return binom_naive(n - 1, k - 1) + binom_naive(n - 1, k)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 1
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%timeit -n 3 binom_naive(6,4)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "3 loops, best of 3: 12.3 us per loop\n"
       ]
      }
     ],
     "prompt_number": 5
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%timeit -n 3 binom_naive(26,4)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "3 loops, best of 3: 13 ms per loop\n"
       ]
      }
     ],
     "prompt_number": 6
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%timeit -n 3 binom_naive(26,13)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "3 loops, best of 3: 12.6 s per loop\n"
       ]
      }
     ],
     "prompt_number": 7
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "#### Complexity\n",
      "\n",
      "The leaves of the recursion tree are those functions that return 1. So the value of $\\binom{n}{k}$ is computed as $1+1+1+...+1$, and in total we use $\\binom{n}{k}-1 \\;$ additions to compute $\\binom{n}{k}$. So the complexity is $O(\\binom{n}{k})$. How does this compare to complexities we are familiar with? It [can be shown](http://www.site.uottawa.ca/~ivan/tr-educ.pdf) that this is $O(e^n)$."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "### Memoization implementation\n",
      "Since the binomial formula is deterministic we don't want to calculate the same values over and over again, as this creates a large burden on the efficiency of the algorithm.\n",
      "\n",
      "Therefore we **memoize** calculated valued."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "binom_memory = {}\n",
      "\n",
      "def binom_mem(n, k):\n",
      "    if k == n or k == 0:\n",
      "        return 1\n",
      "    if k > n or n < 0 or k < 0:\n",
      "        return 0\n",
      "    key = (n,k)\n",
      "    if key not in binom_memory:\n",
      "        binom_memory[key] =  binom_mem(n - 1, k - 1) + binom_mem(n - 1, k)\n",
      "    return binom_memory[key]"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 14
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "For a fair comperison we init the dictionary before every call:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%timeit -n 3 \n",
      "binom_memory.clear()\n",
      "binom_mem(6,4)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "3 loops, best of 3: 13.5 us per loop\n"
       ]
      }
     ],
     "prompt_number": 32
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%timeit -n 3 \n",
      "binom_memory.clear()\n",
      "binom_mem(26,4)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "3 loops, best of 3: 161 us per loop\n"
       ]
      }
     ],
     "prompt_number": 33
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%timeit -n 3 \n",
      "binom_memory.clear()\n",
      "binom_mem(26,13)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "3 loops, best of 3: 321 us per loop\n"
       ]
      }
     ],
     "prompt_number": 34
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "But if we don't clear the dictionary it is even better:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%%timeit -n 3\n",
      "binom_mem(26,13)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "3 loops, best of 3: 1.01 us per loop\n"
       ]
      }
     ],
     "prompt_number": 36
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "#### Complexity \n",
      "The memoization technique greatly improves the complexity to $O(nk)$. This is because we only need, at most, to compute once each combination of $n$ and $k$ and there are $n\\cdot k$ such combinations. Of course we don't calculate all the combinations, only those where $k < n$, but that does not change the complexity."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "### Recursive factorial implementation\n",
      "\n",
      "This implementation is based on the definition of the binomial coefficient: $\\binom{n}{k} = \\frac{n!}{k! (n-k)!}$.\n",
      "\n",
      "The factortial $n!$ is calculated recusrively: $n! = n \\cdot (n-1)!$."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "def binom_formula(n,k):\n",
      "    return factorial(n) // (factorial(k) * factorial(n - k))\n",
      "\n",
      "def factorial(n):\n",
      "    if n == 0: \n",
      "        return 1\n",
      "    return n * factorial(n - 1)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 3
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%timeit -n 3 binom_formula(6,4)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "3 loops, best of 3: 22.1 us per loop\n"
       ]
      }
     ],
     "prompt_number": 4
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%timeit -n 3 binom_formula(26,4)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "3 loops, best of 3: 49.9 us per loop\n"
       ]
      }
     ],
     "prompt_number": 5
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%timeit -n 3 binom_formula(26,13)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "3 loops, best of 3: 48.7 us per loop\n"
       ]
      }
     ],
     "prompt_number": 6
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "#### Complexity\n",
      "\n",
      "To calculate $n!$ we need to do $n$ multiplications. So for $\\binom{n}{k}=\\frac{n!}{k!(n-k)!}\\;$ there are $n + (n-k) + k\\;$ multiplications, which is a total of $2n$ multiplications, plus another multiplication in the denominator and one division. Therefore the complexity is $O(n)$."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "### Running time comparison"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%timeit -n 100 binom_naive(14,5)\n",
      "%timeit -n 1000 binom_formula(14,5)\n",
      "%timeit -n 1000 binom_mem(14,5)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "100 loops, best of 3: 1.68 ms per loop\n",
        "1000 loops, best of 3: 7.52 us per loop\n",
        "1000 loops, best of 3: 809 ns per loop\n"
       ]
      }
     ],
     "prompt_number": 48
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%timeit -n 1 binom_naive(24,8)\n",
      "%timeit -n 10 binom_formula(24,8)\n",
      "%timeit -n 10 binom_mem(24,8)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "1 loops, best of 3: 602 ms per loop\n",
        "10 loops, best of 3: 13.4 us per loop\n",
        "10 loops, best of 3: 966 ns per loop\n"
       ]
      }
     ],
     "prompt_number": 47
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "### Pascal's triangle\n"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "def pascal(n):\n",
      "    for i in range(n + 1):\n",
      "        for j in range(i + 1):\n",
      "            print(binom_mem(i,j), end=\"\\t\")\n",
      "        print()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 106
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "pascal(4)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "1\t\n",
        "1\t1\t\n",
        "1\t2\t1\t\n",
        "1\t3\t3\t1\t\n",
        "1\t4\t6\t4\t1\t\n"
       ]
      }
     ],
     "prompt_number": 107
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "pascal(7)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "1\t\n",
        "1\t1\t\n",
        "1\t2\t1\t\n",
        "1\t3\t3\t1\t\n",
        "1\t4\t6\t4\t1\t\n",
        "1\t5\t10\t10\t5\t1\t\n",
        "1\t6\t15\t20\t15\t6\t1\t\n",
        "1\t7\t21\t35\t35\t21\t7\t1\t\n"
       ]
      }
     ],
     "prompt_number": 108
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Change\n",
      "\n",
      "A bus driver needs to give change. Given the amount to change and a list of coin types, how many ways are there to give change?\n",
      "\n",
      "Solution:G\n",
      "Given amount of change $x$ and the coins $a_1, a_2, ..., a_n$, one can either use the largest coin or not use it:\n",
      "\n",
      "$$\n",
      "f(x,n) = f(x - a_n, n) + f(x, n-1)\n",
      "$$\n",
      "\n",
      "with stopping criteria:\n",
      "\n",
      "$$\n",
      "f(0,n) = 1 \\\\\\\\\n",
      "f(x \\lt 0, n) = 0 \\\\\\\\\n",
      "f(x, 0) = 0\n",
      "$$"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "def count_change(amount, coins):\n",
      "    if amount == 0:\n",
      "        return 1\n",
      "    if amount < 0 or len(coins) == 0:\n",
      "        return 0\n",
      "    without_large_coint = count_change(amount, coins[:-1]) \n",
      "    with_large_coin = count_change(amount - coins[-1], coins)\n",
      "    return without_large_coint + with_large_coin"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 51
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "count_change(5, [1,2,3])"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "pyout",
       "prompt_number": 52,
       "text": [
        "5"
       ]
      }
     ],
     "prompt_number": 52
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "An example of a *partial* recursion tree for returning a change of 5 with the coins 1, 2 and 3. Red indicates a stop criteria:\n",
      "\n",
      "![Change recursion tree](https://raw.github.com/yoavram/CS1001.py/master/recitation6_change_tree.PNG)\n",
      "\n",
      "As can be seen from the above diagram, this algorithm too can benefit from *memoization*. Try it at home!"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Quicksort\n"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "def slowsort(lst):\n",
      "    \"\"\"quicksort of list lst\"\"\"\n",
      "    if len(lst) <= 1: \n",
      "        return lst\n",
      "    pivot = lst[0]      # select first element from list\n",
      "    smaller = [elem for elem in lst if elem < pivot] \n",
      "    equal = [elem for elem in lst if elem == pivot]      \n",
      "    greater = [elem for elem in lst if elem > pivot]\n",
      "    return slowsort(smaller) + equal + slowsort(greater)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 4
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "import random\n",
      "\n",
      "def quicksort(lst):\n",
      "    \"\"\"quicksort of list lst\"\"\"\n",
      "    if len(lst) <= 1: \n",
      "        return lst\n",
      "    pivot = random.choice(lst) # select a random element from list\n",
      "    smaller = [elem for elem in lst if elem < pivot] \n",
      "    equal = [elem for elem in lst if elem == pivot]      \n",
      "    greater = [elem for elem in lst if elem > pivot]\n",
      "    return quicksort(smaller) + equal + quicksort(greater)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 5
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "### Running time on random input"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "lst = [random.randint(0,100) for _ in range(100000)]"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 121
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%timeit -n 10 slowsort(lst)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "10 loops, best of 3: 143 ms per loop\n"
       ]
      }
     ],
     "prompt_number": 122
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%timeit -n 10 quicksort(lst)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "10 loops, best of 3: 150 ms per loop\n"
       ]
      }
     ],
     "prompt_number": 123
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "lst = [i for i in range(100)]"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 6
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%timeit -n 10 slowsort(lst)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "10 loops, best of 3: 2.55 ms per loop\n"
       ]
      }
     ],
     "prompt_number": 7
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "%timeit -n 10 quicksort(lst)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "output_type": "stream",
       "stream": "stdout",
       "text": [
        "10 loops, best of 3: 816 us per loop\n"
       ]
      }
     ],
     "prompt_number": 8
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "### Complexity\n",
      "\n",
      "Assume the values are unique (what will happen if all values are the same?, i.e. `lst = [1, 1, 1, 1, ..., 1]`).\n",
      "\n",
      "* At the first function call, you go over the entire list and separate the numbers that are smaller and bigger than the pivot. So this is $O(n)$.\n",
      "* At the next function calls, you go over two sub-lists and do the same for them, and the total of items in the two sub-lists is $n$,  so this is also $O(n)$.\n",
      "* This goes on until we reach lists of length 1\n",
      "* Denoting the recursion depth by x, the complexity is therefore $O(n x)$.\n",
      "\n",
      "For **Slowsort**, the worst case is when the list is already sorted (`[1,2,3,4,5]`) in which case the pivot will always be the minimum and the sub-lists will always be of length *1* and *k-1$. \n",
      "\n",
      "The recursion depth will be $n$ and therefore the complexity $O(n^2)$.\n",
      "\n",
      "In the base case, however, the pivot is always the median, and therefore the sub-lists are always of equal size, making for a recursion depth of $\\log_2{n}$ and complexity of $O(n \\log{n})$.\n",
      "\n",
      "For **Quicksort** the choice of the pivot is random, so the algorithm complexity does not depend on the input *order*. \n",
      "\n",
      "On average, the pivot will be the median (which is the best choice), the sub-lists will be of equal size, and the recursion depth will be $\\log_2{n} \\;$ leading to a complexity of $O(n \\log{n})$."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Fin\n",
      "This notebook is part of the [Extended introduction to computer science](http://tau-cs1001-py.wikidot.com/) course at Tel-Aviv University.\n",
      "\n",
      "The notebook was written using Python 3.2 and IPython 0.13.1.\n",
      "\n",
      "The code is available at <https://raw.github.com/yoavram/CS1001.py/master/recitation6.ipynb>.\n",
      "\n",
      "The notebook can be viewed online at <http://nbviewer.ipython.org/urls/raw.github.com/yoavram/CS1001.py/master/recitation6.ipynb>.\n",
      "\n",
      "This work is licensed under a [Creative Commons Attribution-ShareAlike 3.0 Unported License](http://creativecommons.org/licenses/by-sa/3.0/)."
     ]
    }
   ],
   "metadata": {}
  }
 ]
}