{ "metadata": { "name": "" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# CS1001.py\n", "\n", "## Extended Introduction to Computer Science with Python, Tel-Aviv University, Spring 2013\n", "\n", "# Recitation 8 - 2-6.5.2013\n", "\n", "## Last update: 6.5.2013" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Homework 3\n", "\n", "## Question 4b" ] }, { "cell_type": "code", "collapsed": false, "input": [ "def f2(lst):\n", " i = 1\n", " while i < len(lst):\n", " j = 1\n", " while j < i:\n", " print(lst[i], lst[j])\n", " j += 5\n", " i *= 2" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The inner loop does $O(i/5)$ iterations, because *j* is bounded by *i* and does steps of *5*. \n", "\n", "The value of *i* after *k* iterations of the outer loop is $2^k$, and this outer loop goes on untill $2^k=n$, or $k = \\log_{2}{n})$.\n", "\n", "So the total complexity is:\n", "\n", "$$\n", "O(\\sum_{k=1}^{\\log_{2}{n}}{\\frac{2^k}{5}}) = O(\\frac{1}{5}\\sum_{k=1}^{\\log_{2}{n}}{2^k}) = O(\\frac{2^{\\log_{2}{n}}-1}{2-1}) = O(2^{\\log_{2}{n}}) = O(n)\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Question 4c" ] }, { "cell_type": "code", "collapsed": false, "input": [ "def f3(lst):\n", " i = len(lst)\n", " while i > 0:\n", " for j in range(i):\n", " for k in range(j, 10**5):\n", " print(i)\n", " i -= 2" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 2 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The the most inner loop does an $O(1)$ operation and at most $10^5$ iterations, so the inner loop's complexity is $O(1)$. This is because:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "list(range(5,2))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "pyout", "prompt_number": 7, "text": [ "[]" ] } ], "prompt_number": 7 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The middle loop does at most *n* iterations at the first time, then *n-2* etc.:\n", "$$\n", "n + n-2 + n-4 + ... + 1 = O(n^2)\n", "$$\n", "\n", "And therefore the total complexity is $O(n^2)$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Hashing\n", "\n", "We want a dynamic dictionary with insert, remove, search. \n", "\n", "The keys are from a very large set *U* but the number of distinct elements expected is *n*, and $n<<|U|$.\n", "\n", "We will use a table of size *m*, where $m\\approx n$.\n", "\n", "For example, *U* is the set of 8-digit ID numbers, so $|U| = 10^8$. \n", "\n", "*n* is the size of the Israeli population, so $m\\approx n \\approx 10^7$.\n", "\n", "## Hash function\n", "\n", "We need a function $h: U \\to {0,1,...,m-1}$. \n", "\n", "1. Collisons are unavoidable (*pigeon hole principle*), so we want *h* to distribute the values uniformly on ${0,1,...,m-1}$, and we can then avoid collisons using lists, coockoo hashing, etc. \n", "1. We want *h* to be deterministic (*not* random). That means that for a given input the output is always the same - *h(x)=h(x)*.\n", "1. We need *h* to be easily computable for all inputs.\n", "\n", "## Hash table" ] }, { "cell_type": "code", "collapsed": false, "input": [ "class MyHashtable:\n", " def __init__(self, m, hash_func=hash):\n", " \"\"\" initial hash table, m empty entries \"\"\"\n", " self.data = [ [] for i in range(m)]\n", " self.hash_mod = lambda x: hash_func(x) % m\n", " \n", " def find(self, item):\n", " \"\"\" returns True if item in hashtable, False otherwise \"\"\"\n", " i = self.hash_mod(item) \n", " try:\n", " self.data[i].index(item)\n", " return True\n", " except ValueError:\n", " return False\n", "\n", " def insert(self, item):\n", " \"\"\" insert an item into table \"\"\"\n", " i = self.hash_mod(item)\n", " try:\n", " self.data[i].index(item)\n", " except ValueError:\n", " self.data[i].append(item)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 1 }, { "cell_type": "code", "collapsed": false, "input": [ "ht = MyHashtable(100)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 2 }, { "cell_type": "code", "collapsed": false, "input": [ "ht.insert(\"yoav\")\n", "ht.insert(\"amir\")\n", "ht.insert(\"amiram\")\n", "ht.insert(\"haim\")" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 3 }, { "cell_type": "code", "collapsed": false, "input": [ "ht.find(\"yoav\")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 4, "text": [ "True" ] } ], "prompt_number": 4 }, { "cell_type": "code", "collapsed": false, "input": [ "ht.find(\"inon\")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 5, "text": [ "False" ] } ], "prompt_number": 5 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exercise \u2013 repeating substring\n", "\n", "Input: We are given a string *st* of length *n*, and a small integer *k* (*k=O(1)* with respect to *n*).\n", "\n", "Output: Is there a substring of *st* of length *k* which repeats itself (more than once)?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Naive solution" ] }, { "cell_type": "code", "collapsed": false, "input": [ "def repeat_naive(st, k):\n", " for i in range(len(st) - k + 1):\n", " for j in range(i + 1, len(st) - k + 1):\n", " if st[i : i + k] == st[j : j + k]:\n", " return True\n", " return False" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 6 }, { "cell_type": "code", "collapsed": false, "input": [ "song = '''So, so you think you can tell \n", "Heaven from Hell, \n", "Blue sky's from pain. \n", "Can you tell a green field \n", "From a cold steel rail? \n", "A smile from a veil? \n", "Do you think you can tell? \n", "\n", "And did they get you to trade \n", "Your heroes for ghosts? \n", "Hot ashes for trees? \n", "Hot air for a cool breeze? \n", "Cold comfort for change? \n", "And did you exchange \n", "A walk on part in the war \n", "For a lead role in a cage? \n", "\n", "How I wish, how I wish you were here. \n", "We're just two lost souls \n", "Swimming in a fish bowl, \n", "Year after year, \n", "Running over the same old ground. \n", "And how we found\n", "The same old fears. \n", "Wish you were here.'''" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 9 }, { "cell_type": "code", "collapsed": false, "input": [ "print(repeat_naive(song, 5))\n", "%timeit -n 10 repeat_naive(song, 5)\n", "print(repeat_naive(song, 30))\n", "%timeit -n 10 repeat_naive(song, 30)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "True\n", "10 loops, best of 3: 2.07 ms per loop" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "False" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "10 loops, best of 3: 137 ms per loop" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n" ] } ], "prompt_number": 8 }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is fast but it's a short string.\n", "\n", "Let's try it with a larger string - [\"The Adventures of Tom Sawyer by Mark Twain\"](http://www.gutenberg.org/cache/epub/74/pg74.txt) from the [Gutenberg Project](http://www.gutenberg.org/). \n", "\n", "We use the builtin [urllib.request.urlopen](http://docs.python.org/3.0/library/urllib.request.html) to get the book from the web - this function returns a file-like object. The text is read as bytes and we have to tell python to decode it, usually using `utf-8` code which is a type of unicode. \n", "This decoding will allow us to read text in non-latin languages (like Hebrew, Arabic and Russian)." ] }, { "cell_type": "code", "collapsed": false, "input": [ "import urllib.request\n", "with urllib.request.urlopen(\"http://www.gutenberg.org/cache/epub/74/pg74.txt\") as r:\n", " book = r.read().decode('utf-8')" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 16 }, { "cell_type": "code", "collapsed": false, "input": [ "print(book[:book.index('\\n\\r')])" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "\ufeff\r\n", "The Project Gutenberg EBook of The Adventures of Tom Sawyer, Complete by\r\n", "Mark Twain (Samuel Clemens)\r\n" ] } ], "prompt_number": 17 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Looking for a short repeat is quick:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print(repeat_naive(book, 5))\n", "%timeit -n 3 repeat_naive(book, 5)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "True\n", "3 loops, best of 3: 709 ms per loop" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n" ] } ], "prompt_number": 18 }, { "cell_type": "markdown", "metadata": {}, "source": [ "But searching for a longer repeat takes a very long time:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print(repeat_naive(book, 30))\n", "%timeit -n 1 repeat_naive(book, 30)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "True\n", "1 loops, best of 3: 10.9 s per loop" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n" ] } ], "prompt_number": 110 }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Worst case complexity\n", "\n", "The worst case happens when there is no such repeat, which is likely when *k* is large.\n", "\n", "The number of iterations is $O((n-k)^2$. Each iteration we compare a substring of length *k*, which is $O(k)$.\n", "So the total complexity is $O(k(n-k)^2)$. Because we consider $k = O(1)\\;$ then we get $O(n^2)$.\n", "\n", "In *With you were here* we had *n=589*, so the number of character comparisons was 346,921.\n", "\n", "But with *The Adventures of Tom Sawyer*, we had *n=421869* and roghly 177,973,453,161 character comparisons." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Hash solution" ] }, { "cell_type": "code", "collapsed": false, "input": [ "def repeat_hash1(st, k, m=0):\n", " if m == 0: # default hash table size is ~number of substrings to be inserted\n", " m = len(st) - k\n", " htable = MyHashtable(m)\n", " for i in range(len(st) - k + 1):\n", " if htable.find(st[i : i + k]):\n", " return True\n", " else: \n", " htable.insert(st[i : i + k])\n", " return False" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 5 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit -n 10 repeat_hash1(song, 5)\n", "%timeit -n 10 repeat_hash1(song, 30)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "10 loops, best of 3: 226 us per loop\n", "10 loops, best of 3: 7.38 ms per loop" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n" ] } ], "prompt_number": 6 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit -n 3 repeat_hash1(book, 5)\n", "%timeit -n 1 repeat_hash1(book, 30)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "3 loops, best of 3: 102 ms per loop\n", "1 loops, best of 3: 96.3 ms per loop" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n" ] } ], "prompt_number": 7 }, { "cell_type": "code", "collapsed": false, "input": [ "print(repeat_hash1(book, 1000))\n", "%timeit -n 1 repeat_hash1(book, 1000)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "False\n", "1 loops, best of 3: 15.5 s per loop" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n" ] } ], "prompt_number": 8 }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Average case complexity\n", "\n", "Creating hashtable - $O(m)=O(n-k)$.\n", "Number of iterations - $O(n-k)$.\n", "Each iteration we search a list of length $O(1)$ on average, so overall $O(k)$.\n", "Total complexity on average $O(k(n-k))=O(n)$.\n", "\n", "The worst case is not as good - all elements get the same hash and enter the same list, so the complexity is $O(k(n-k)^2)$. \n", "But the probability for this is *very* low." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Python builtin solution\n", "\n", "Which of python's data structures is appropriate?\n", "\n", "- `list` - mutable and ordered\n", "- `string`, `tuple` - immutable and ordered\n", "- `dict` - mutable and unordered with unique immutable keys\n", "- `set` - mutable and unordered with unique immutable values\n", "\n", "`set` is just the right structure - we need a structure we can insert items into and check if items are already in it." ] }, { "cell_type": "code", "collapsed": false, "input": [ "def repeat_hash2(st, k, m=0):\n", " htable = set() # Python sets use hash functions for fast lookup\n", " for i in range(len(st) - k + 1):\n", " if st[i : i + k] not in htable:\n", " htable.add(st[i : i + k])\n", " else: \n", " return True\n", " return False" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 9 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit -n 10 repeat_hash2(song, 5)\n", "%timeit -n 10 repeat_hash2(song, 30)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "10 loops, best of 3: 17.7 us per loop\n", "10 loops, best of 3: 718 us per loop\n" ] } ], "prompt_number": 122 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit -n 10 repeat_hash2(book, 5)\n", "%timeit -n 10 repeat_hash2(book, 30)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "10 loops, best of 3: 49.1 us per loop\n", "10 loops, best of 3: 458 us per loop\n" ] } ], "prompt_number": 123 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit -n 1 repeat_hash2(book, 1000)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "1 loops, best of 3: 3.44 s per loop\n" ] } ], "prompt_number": 125 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### DNA sequences\n", "An interesting use for this kind of a `repeat` function is to find a recurring DNA motif in a DNA string.\n", "In this case the alphabet is much smaller - 4 letters: A, G, C, T - so the chance for repitiotion is higher and the worst case happens less frequently.\n", "But, [DNA strings are very long](https://en.wikipedia.org/wiki/Genome#Genome_size).\n", "\n", "Lets look at the [`lacZ`](https://en.wikipedia.org/wiki/Lacz) gene sequence (from ):" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import urllib" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 10 }, { "cell_type": "code", "collapsed": false, "input": [ "with urllib.request.urlopen(\"http://berry.engin.umich.edu/gene2oligo/lacZ.txt\") as r:\n", " lacZ = r.read().decode('utf-8')" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 11 }, { "cell_type": "code", "collapsed": false, "input": [ "print(lacZ)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Name: LacZ\n", "Sequence:\n", "ACCATGATTACGGATTCACTGGCCGTCGTTTTACAACGTCGTGACTGGGAAAACCCTGGCGTTACCCAAC\n", "TTAATCGCCTTGCAGCACATCCCCCTTTCGCCAGCTGGCGTAATAGCGAAGAGGCCCGCACCGATCGCCC\n", "TTCCCAACAGTTGCGCAGCCTGAATGGCGAATGGCGCTTTGCCTGGTTTCCGGCACCAGAAGCGGTGCCG\n", "GAAAGCTGGCTGGAGTGCGATCTTCCTGAGGCCGATACTGTCGTCGTCCCCTCAAACTGGCAGATGCACG\n", "GTTACGATGCGCCCATCTACACCAACGTAACCTATCCCATTACGGTCAATCCGCCGTTTGTTCCCACGGA\n", "GAATCCGACGGGTTGTTACTCGCTCACATTTAATGTTGATGAAAGCTGGCTACAGGAAGGCCAGACGCGA\n", "ATTATTTTTGATGGCGTTAACTCGGCGTTTCATCTGTGGTGCAACGGGCGCTGGGTCGGTTACGGCCAGG\n", "ACAGTCGTTTGCCGTCTGAATTTGACCTGAGCGCATTTTTACGCGCCGGAGAAAACCGCCTCGCGGTGAT\n", "GGTGCTGCGTTGGAGTGACGGCAGTTATCTGGAAGATCAGGATATGTGGCGGATGAGCGGCATTTTCCGT\n", "GACGTCTCGTTGCTGCATAAACCGACTACACAAATCAGCGATTTCCATGTTGCCACTCGCTTTAATGATG\n", "ATTTCAGCCGCGCTGTACTGGAGGCTGAAGTTCAGATGTGCGGCGAGTTGCGTGACTACCTACGGGTAAC\n", "AGTTTCTTTATGGCAGGGTGAAACGCAGGTCGCCAGCGGCACCGCGCCTTTCGGCGGTGAAATTATCGAT\n", "GAGCGTGGTGGTTATGCCGATCGCGTCACACTACGTCTGAACGTCGAAAACCCGAAACTGTGGAGCGCCG\n", "AAATCCCGAATCTCTATCGTGCGGTGGTTGAACTGCACACCGCCGACGGCACGCTGATTGAAGCAGAAGC\n", "CTGCGATGTCGGTTTCCGCGAGGTGCGGATTGAAAATGGTCTGCTGCTGCTGAACGGCAAGCCGTTGCTG\n", "ATTCGAGGCGTTAACCGTCACGAGCATCATCCTCTGCATGGTCAGGTCATGGATGAGCAGACGATGGTGC\n", "AGGATATCCTGCTGATGAAGCAGAACAACTTTAACGCCGTGCGCTGTTCGCATTATCCGAACCATCCGCT\n", "GTGGTACACGCTGTGCGACCGCTACGGCCTGTATGTGGTGGATGAAGCCAATATTGAAACCCACGGCATG\n", "GTGCCAATGAATCGTCTGACCGATGATCCGCGCTGGCTACCGGCGATGAGCGAACGCGTAACGCGAATGG\n", "TGCAGCGCGATCGTAATCACCCGAGTGTGATCATCTGGTCGCTGGGGAATGAATCAGGCCACGGCGCTAA\n", "TCACGACGCGCTGTATCGCTGGATCAAATCTGTCGATCCTTCCCGCCCGGTGCAGTATGAAGGCGGCGGA\n", "GCCGACACCACGGCCACCGATATTATTTGCCCGATGTACGCGCGCGTGGATGAAGACCAGCCCTTCCCGG\n", "CTGTGCCGAAATGGTCCATCAAAAAATGGCTTTCGCTACCTGGAGAGACGCGCCCGCTGATCCTTTGCGA\n", "ATACGCCCACGCGATGGGTAACAGTCTTGGCGGTTTCGCTAAATACTGGCAGGCGTTTCGTCAGTATCCC\n", "CGTTTACAGGGCGGCTTCGTCTGGGACTGGGTGGATCAGTCGCTGATTAAATATGATGAAAACGGCAACC\n", "CGTGGTCGGCTTACGGCGGTGATTTTGGCGATACGCCGAACGATCGCCAGTTCTGTATGAACGGTCTGGT\n", "CTTTGCCGACCGCACGCCGCATCCAGCGCTGACGGAAGCAAAACACCAGCAGCAGTTTTTCCAGTTCCGT\n", "TTATCCGGGCAAACCATCGAAGTGACCAGCGAATACCTGTTCCGTCATAGCGATAACGAGCTCCTGCACT\n", "GGATGGTGGCGCTGGATGGTAAGCCGCTGGCAAGCGGTGAAGTGCCTCTGGATGTCGCTCCACAAGGTAA\n", "ACAGTTGATTGAACTGCCTGAACTACCGCAGCCGGAGAGCGCCGGGCAACTCTGGCTCACAGTACGCGTA\n", "GTGCAACCGAACGCGACCGCATGGTCAGAAGCCGGGCACATCAGCGCCTGGCAGCAGTGGCGTCTGGCGG\n", "AAAACCTCAGTGTGACGCTCCCCGCCGCGTCCCACGCCATCCCGCATCTGACCACCAGCGAAATGGATTT\n", "TTGCATCGAGCTGGGTAATAAGCGTTGGCAATTTAACCGCCAGTCAGGCTTTCTTTCACAGATGTGGATT\n", "GGCGATAAAAAACAACTGCTGACGCCGCTGCGCGATCAGTTCACCCGTGCACCGCTGGATAACGACATTG\n", "GCGTAAGTGAAGCGACCCGCATTGACCCTAACGCCTGGGTCGAACGCTGGAAGGCGGCGGGCCATTACCA\n", "GGCCGAAGCAGCGTTGTTGCAGTGCACGGCAGATACACTTGCTGATGCGGTGCTGATTACGACCGCTCAC\n", "GCGTGGCAGCATCAGGGGAAAACCTTATTTATCAGCCGGAAAACCTACCGGATTGATGGTAGTGGTCAAA\n", "TGGCGATTACCGTTGATGTTGAAGTGGCGAGCGATACACCGCATCCGGCGCGGATTGGCCTGAACTGCCA\n", "GCTGGCGCAGGTAGCAGAGCGGGTAAACTGGCTCGGATTAGGGCCGCAAGAAAACTATCCCGACCGCCTT\n", "ACTGCCGCCTGTTTTGACCGCTGGGATCTGCCATTGTCAGACATGTATACCCCGTACGTCTTCCCGAGCG\n", "AAAACGGTCTGCGCTGCGGGACGCGCGAATTGAATTATGGCCCACACCAGTGGCGCGGCGACTTCCAGTT\n", "CAACATCAGCCGCTACAGTCAACAGCAACTGATGGAAACCAGCCATCGCCATCTGCTGCACGCGGAAGAA\n", "GGCACATGGCTGAATATCGACGGTTTCCATATGGGGATTGGTGGCGACGACTCCTGGAGCCCGTCAGTAT\n", "CGGCGGAATTCCAGCTGAGCGCCGGTCGCTACCATTACCAGTTGGTCTGGTGTCAAAAATAATAATAAcg\n", "gctgccgt\n", "\n", "\n" ] } ], "prompt_number": 12 }, { "cell_type": "markdown", "metadata": {}, "source": [ "We need to remove the first two lines, remove newlines and change lowercase letters to uppercase: " ] }, { "cell_type": "code", "collapsed": false, "input": [ "lacZ = str.join('',lacZ.split('\\n')[2:]).upper()\n", "print(len(lacZ))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "3088\n" ] } ], "prompt_number": 13 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can use the repeat functions to check for repeating subsequences:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "print(repeat_hash2(lacZ, 10))\n", "print(repeat_hash2(lacZ, 12))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "True\n", "False\n" ] } ], "prompt_number": 149 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit -n 10 repeat_hash1(lacZ,12)\n", "%timeit -n 10 repeat_hash2(lacZ,12)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "10 loops, best of 3: 34.1 ms per loop\n", "10 loops, best of 3: 3.75 ms per loop" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n" ] } ], "prompt_number": 14 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fin\n", "This notebook is part of the [Extended introduction to computer science](http://tau-cs1001-py.wikidot.com/) course at Tel-Aviv University.\n", "\n", "The notebook was written using Python 3.2 and IPython 0.13.1.\n", "\n", "The code is available at .\n", "\n", "The notebook can be viewed online at .\n", "\n", "This work is licensed under a [Creative Commons Attribution-ShareAlike 3.0 Unported License](http://creativecommons.org/licenses/by-sa/3.0/)." ] } ], "metadata": {} } ] }