{ "metadata": { "name": "", "signature": "sha256:886390ed780163e82269498e5839c7d8c26111f5c32d9e040205fad036536b9e" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "code", "collapsed": false, "input": [ "from IPython.display import display\n", "from IPython.display import HTML" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "

Problem 7

\n", "\n", "By listing the first six prime numbers: 2, 3, 5, 7, 11, and 13, we can see that the 6th prime is 13.\n", "
\n", "What is the 10 001st prime number?\n", "\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another prime question, this time asking for the 10,001st prime. We tested for primes before in question 3, so it's time to reuse that code.\n", "\n", "**Method 1: brute force**" ] }, { "cell_type": "code", "collapsed": false, "input": [ "def isPrime(x):\n", "    if (x==1):\n", "        return False\n", "    for i in range(2,x):\n", "        if x%i==0:\n", "            return False\n", "    return True\n", "\n", "def getPrimes(maxValue):\n", "    primes = []\n", "    for i in range(1,maxValue):\n", "        if isPrime(i):\n", "            primes.append(i)\n", "    return primes\n", "\n", "primes = getPrimes(10000)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 84 }, { "cell_type": "code", "collapsed": false, "input": [ "%%timeit\n", "getPrimes(10000)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "1 loops, best of 3: 1.25 s per loop\n" ] } ], "prompt_number": 86 }, { "cell_type": "code", "collapsed": false, "input": [ "len(primes)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 85, "text": [ "1229" ] } ], "prompt_number": 85 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The brute force solution takes more than a second to calculate the primes below 10000. And how many primes did that yield? Only 1229! This doesn't look like a reasonable way to calculate 10001 primes. Luckily, there is a very simple and clever algorithm that can do this job much faster.\n", "\n", "**Method 2: [Sieve of Eratosthenes](http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes)**\n", "\n", "The basic notion of the sieve of Eratosthenes is to pre-allocate a list of numbers up to n, and then, taking a prime (starting with 2), cross out every multiple of that prime, as those multiples clearly can't be primes. The next prime is then the next unmarked value in the list. 
The process repeats until there are no more primes to be found. " ] }, { "cell_type": "code", "collapsed": false, "input": [ "def showState(l, p, nx):\n", "    numbers = ''\n", "    for n in l:\n", "        style=''\n", "        if n<0:\n", "            style+='text-decoration: line-through; background-color: rgb(171, 231, 255);'\n", "        if n==p:\n", "            style+='background-color: rgb(230,255,95);'\n", "        if n==nx:\n", "            style+='background-color: rgb(150, 233, 150);'\n", "        if n==0:\n", "            style+='background-color: rgb(220,220,220); color: rgb(220,220,220);' \n", "        numbers+='{1}'.format(style, abs(n))\n", "    s = \"\"\"\n", " {0}
\"\"\".format(numbers)\n", "    h = HTML(s)\n", "    display(h)\n", " \n", "\n", "def sieve(size, showStates=True):\n", "    l = list(range(2,size+1)) #generate the candidate set\n", "    idx = lambda x: x-2 #just a simple mapping from number in list to list index\n", "    p = 2 #seed with initial prime\n", " \n", "    for iteration in range(len(l)):\n", "        #mark every multiple of p\n", "        for i in range(p*2, size+1, p):\n", "            l[idx(i)] = -i\n", " \n", "        #find the next unmarked value, that's the next p\n", "        nextPrime = 0\n", "        for i in l[idx(p+1):]:\n", "            if i>0:\n", "                nextPrime = i\n", "                break\n", " \n", "        if (showStates):\n", "            showState(l, p, nextPrime)\n", "            for i in range(p*2, size+1, p):\n", "                l[idx(i)] = 0\n", " \n", "        p = nextPrime\n", " \n", "        #if we haven't found any unmarked values, we're done\n", "        if p == 0:\n", "            break\n", " \n", "    #return all unmarked values \n", "    return filter(lambda x: x>0, l)\n", "\n", "sieve(58, True)" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "\n", "
2345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758
" ], "metadata": {}, "output_type": "display_data", "text": [ "" ] }, { "html": [ "\n", "
23056709011121301501718190210232425027029303103303536370390414243045047484905105354550570
" ], "metadata": {}, "output_type": "display_data", "text": [ "" ] }, { "html": [ "\n", "
23050700101101301501701920002302500029303100035037004041043045047049500053055000
" ], "metadata": {}, "output_type": "display_data", "text": [ "" ] }, { "html": [ "\n", "
2305070001101314001701902102300002829031000350370004142430004704900053005600
" ], "metadata": {}, "output_type": "display_data", "text": [ "" ] }, { "html": [ "\n", "
2305070001101300017019002223000002903103300037000410434400470000053055000
" ], "metadata": {}, "output_type": "display_data", "text": [ "" ] }, { "html": [ "\n", "
230507000110130001701900023002600290310000037039041043000470000525300000
" ], "metadata": {}, "output_type": "display_data", "text": [ "" ] }, { "html": [ "\n", "
23050700011013000170190002300000290310034003700041043000470005105300000
" ], "metadata": {}, "output_type": "display_data", "text": [ "" ] }, { "html": [ "\n", "
23050700011013000170190002300000290310000037380041043000470000053000570
" ], "metadata": {}, "output_type": "display_data", "text": [ "" ] }, { "html": [ "\n", "
2305070001101300017019000230000029031000003700041043004647000005300000
" ], "metadata": {}, "output_type": "display_data", "text": [ "" ] }, { "html": [ "\n", "
2305070001101300017019000230000029031000003700041043000470000053000058
" ], "metadata": {}, "output_type": "display_data", "text": [ "" ] }, { "html": [ "\n", "
230507000110130001701900023000002903100000370004104300047000005300000
" ], "metadata": {}, "output_type": "display_data", "text": [ "" ] }, { "html": [ "\n", "
230507000110130001701900023000002903100000370004104300047000005300000
" ], "metadata": {}, "output_type": "display_data", "text": [ "" ] }, { "html": [ "\n", "
230507000110130001701900023000002903100000370004104300047000005300000
" ], "metadata": {}, "output_type": "display_data", "text": [ "" ] }, { "html": [ "\n", "
230507000110130001701900023000002903100000370004104300047000005300000
" ], "metadata": {}, "output_type": "display_data", "text": [ "" ] }, { "html": [ "\n", "
230507000110130001701900023000002903100000370004104300047000005300000
" ], "metadata": {}, "output_type": "display_data", "text": [ "" ] }, { "html": [ "\n", "
230507000110130001701900023000002903100000370004104300047000005300000
" ], "metadata": {}, "output_type": "display_data", "text": [ "" ] }, { "metadata": {}, "output_type": "pyout", "prompt_number": 71, "text": [ "[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53]" ] } ], "prompt_number": 71 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Above is the state of the preallocated list at each iteration of sifting primes up to 58. \n", "\n", "Starting with a fully unmarked list, and the first prime, 2 (shown in yellow), every multiple of 2 is marked off in the list (shown in blue). The next prime (green) is then found by moving up the list until the first unmarked number.\n", "\n", "The next iteration starts at the newly found prime, 3, and proceeds to mark off every multiple of 3 in the list, and so forth.\n", "\n", "Finally, the last iteration attempts to find unmarked values to the right of 53 and finds none. At that point the algorithm can terminate and return the remaining unmarked values in the list." ] }, { "cell_type": "code", "collapsed": false, "input": [ "%%timeit\n", "v = sieve(10000, False)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "10 loops, best of 3: 55.2 ms per loop\n" ] } ], "prompt_number": 58 }, { "cell_type": "code", "collapsed": false, "input": [ "len(sieve(10000, False))" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 72, "text": [ "1229" ] } ], "prompt_number": 72 }, { "cell_type": "markdown", "metadata": {}, "source": [ "At less than 60ms to find all primes less than 10000, this algorithm is orders of magnitude faster.\n", "\n", "It can be further optimized by recognizing that if one divisor or factor of a number (other than a perfect square) is greater than its square root, then the other factor will be less than its square root. Hence all multiples of primes greater than the square root of n need not be considered[1]. 
The sieve function can be trivially modified to use this knowledge by capping the number of marking passes at about $\\sqrt{n}$.\n", "\n", "\n", "[1] http://britton.disted.camosun.bc.ca/jberatosthenes.htm" ] }, { "cell_type": "code", "collapsed": false, "input": [ "#comments removed for brevity\n", "def sieve(size, showStates=True):\n", "    l = list(range(2,size+1)) \n", "    idx = lambda x: x-2 \n", "    p = 2 \n", "    #at most about sqrt(n) passes; conservative, since only primes <= sqrt(n) need marking\n", "    for iteration in range(int(0.5+len(l)**0.5)):\n", "        #mark every multiple of p\n", "        for i in range(p*2, size+1, p):\n", "            l[idx(i)] = -i\n", "        nextPrime = 0\n", "        for i in l[idx(p+1):]:\n", "            if i>0:\n", "                nextPrime = i\n", "                break\n", "        if (showStates):\n", "            showState(l, p, nextPrime)\n", "            for i in range(p*2, size+1, p):\n", "                l[idx(i)] = 0\n", "        p = nextPrime\n", "        if p == 0:\n", "            break\n", "    #list() keeps the result indexable on Python 3 as well\n", "    return list(filter(lambda x: x>0, l))" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 73 }, { "cell_type": "code", "collapsed": false, "input": [ "%%timeit\n", "v = sieve(10000, False)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "100 loops, best of 3: 13.2 ms per loop\n" ] } ], "prompt_number": 74 }, { "cell_type": "markdown", "metadata": {}, "source": [ "So the Eratosthenes sieve is very fast at finding primes up to some limit m. At m=10000, we find n=1229. 
What range do we have to sieve to actually get our n=10001 primes?\n", "\n", "Rosser's theorem[2] provides a useful inequality that establishes bounds on the value of the nth prime number:\n", "\n", "$\\ln n + \\ln\\ln n - 1 < \\frac{p_n}{n} < \\ln n + \\ln \\ln n \\quad\\text{for } n \\ge 6$\n", "\n", "[2] http://en.wikipedia.org/wiki/Prime_number_theorem#Approximations_for_the_nth_prime_number" ] }, { "cell_type": "code", "collapsed": false, "input": [ "#log was presumably provided by pylab in the original session; import it explicitly\n", "from math import log\n", "\n", "def maxPrime(n):\n", "    #upper bound from the inequality above: p_n < n*(ln n + ln ln n)\n", "    return int(0.5+(float(n)*log(n)+ n*log(log(n))))\n", " \n", "limit = maxPrime(10000)\n", "print('The 10000th prime has a value < {0}'.format(limit))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "The 10000th prime has a value < 114307\n" ] } ], "prompt_number": 114 }, { "cell_type": "code", "collapsed": false, "input": [ "primes = sieve(limit, False)\n", "len(primes)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 115, "text": [ "10816" ] } ], "prompt_number": 115 }, { "cell_type": "markdown", "metadata": {}, "source": [ "The upper bound function appears to have done its job: the sieve netted 10816 primes, comfortably more than the 10001 we need (even though the bound was computed for n=10000). We can now obtain the 10001st, which sits at index 10000 of the zero-based list:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "primes[10000]" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 118, "text": [ "104743" ] } ], "prompt_number": 118 }, { "cell_type": "markdown", "metadata": {}, "source": [ "

Problem 8

\n", "\n", "The four adjacent digits in the 1000-digit number that have the greatest product are 9 \u00d7 9 \u00d7 8 \u00d7 9 = 5832.\n", "\n", "
\n",
      "73167176531330624919225119674426574742355349194934\n",
      "96983520312774506326239578318016984801869478851843\n",
      "85861560789112949495459501737958331952853208805511\n",
      "12540698747158523863050715693290963295227443043557\n",
      "66896648950445244523161731856403098711121722383113\n",
      "62229893423380308135336276614282806444486645238749\n",
      "30358907296290491560440772390713810515859307960866\n",
      "70172427121883998797908792274921901699720888093776\n",
      "65727333001053367881220235421809751254540594752243\n",
      "52584907711670556013604839586446706324415722155397\n",
      "53697817977846174064955149290862569321978468622482\n",
      "83972241375657056057490261407972968652414535100474\n",
      "82166370484403199890008895243450658541227588666881\n",
      "16427171479924442928230863465674813919123162824586\n",
      "17866458359124566529476545682848912883142607690042\n",
      "24219022671055626321111109370544217506941658960408\n",
      "07198403850962455444362981230987879927244284909188\n",
      "84580156166097919133875499200524063689912560717606\n",
      "05886116467109405077541002256983155200055935729725\n",
      "71636269561882670428252483600823257530420752963450\n",
      "
\n", "Find the thirteen adjacent digits in the 1000-digit number that have the greatest product. What is the value of this product?\n", "\n", "---" ] }, { "cell_type": "code", "collapsed": false, "input": [ "source = '''\n", "73167176531330624919225119674426574742355349194934\n", "96983520312774506326239578318016984801869478851843\n", "85861560789112949495459501737958331952853208805511\n", "12540698747158523863050715693290963295227443043557\n", "66896648950445244523161731856403098711121722383113\n", "62229893423380308135336276614282806444486645238749\n", "30358907296290491560440772390713810515859307960866\n", "70172427121883998797908792274921901699720888093776\n", "65727333001053367881220235421809751254540594752243\n", "52584907711670556013604839586446706324415722155397\n", "53697817977846174064955149290862569321978468622482\n", "83972241375657056057490261407972968652414535100474\n", "82166370484403199890008895243450658541227588666881\n", "16427171479924442928230863465674813919123162824586\n", "17866458359124566529476545682848912883142607690042\n", "24219022671055626321111109370544217506941658960408\n", "07198403850962455444362981230987879927244284909188\n", "84580156166097919133875499200524063689912560717606\n", "05886116467109405077541002256983155200055935729725\n", "71636269561882670428252483600823257530420752963450\n", "'''.replace('\\n','')\n", "\n", "#break the source string into a series of 13 character long slices at every possible position\n", "window_size = 13\n", "slices = [source[x:x+window_size] for x in range(len(source) - window_size + 1)]\n", "\n", "#compute the product of each slice\n", "products = [product(map(int, row), dtype='int64') for row in slices]\n", "\n", "max(products)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 121, "text": [ "23514624000" ] } ], "prompt_number": 121 }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, 
"outputs": [] } ], "metadata": {} } ] }