{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Memoization, Code Optimization and Time Complexity\n", "- based on the Latin word memorandum, meaning \"to be remembered\" \n", "- similar to word memorization, its a technique used in coding to improve program runtime by memorizing intermediate solutions\n", "- if an intermediate results (from some computations) are used over and again, the results can be remembered/stored to avoid unnecessary repeating calculations\n", "- using dict type datastructure, one can memoize intermediate results from functions esp. in recurisive solutions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## naive recursive fib function" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "fib at 30th position = 1346269\n", "fib function count = 2692537\n" ] } ], "source": [ "count = 0\n", "def fib(n):\n", " global count\n", " count += 1\n", " if n <= 1:\n", " return 1\n", " f = fib(n-1) + fib(n-2)\n", " return f\n", "\n", "n=30 #40, 50? takes a while\n", "print(\"fib at {}th position = {}\".format(n, fib(n)))\n", "print(\"fib function count = {}\".format(count))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## theoritical computational complexity\n", "- Worst case Big-Oh Time Complexity: O(n) = time to calculate Fib(n-1) + Fib(n-2) + time to add them: O(1)\n", " - T(n) = T(n-1) + T(n-2) + O(1)\n", " - T(n) = O(2n-1) + O(2n-2) + O(1)\n", " - T(n) = O(2n)\n", "- Space Complexity = O(n) due to call stack" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## finding actual time - timeit\n", "- timeit - measures execution time of small code snippets\n", "- timeit.timeit(stmt='pass', setup='pass', timer=[default timer], number=1000000, globals=None)\n", "- returns time in seconds" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Help on module timeit:\n", "\n", "NAME\n", " timeit - Tool for measuring execution time of small code snippets.\n", "\n", "MODULE REFERENCE\n", " https://docs.python.org/3.6/library/timeit\n", " \n", " The following documentation is automatically generated from the Python\n", " source files. It may be incomplete, incorrect or include features that\n", " are considered implementation detail and may vary between Python\n", " implementations. When in doubt, consult the module reference at the\n", " location listed above.\n", "\n", "DESCRIPTION\n", " This module avoids a number of common traps for measuring execution\n", " times. See also Tim Peters' introduction to the Algorithms chapter in\n", " the Python Cookbook, published by O'Reilly.\n", " \n", " Library usage: see the Timer class.\n", " \n", " Command line usage:\n", " python timeit.py [-n N] [-r N] [-s S] [-t] [-c] [-p] [-h] [--] [statement]\n", " \n", " Options:\n", " -n/--number N: how many times to execute 'statement' (default: see below)\n", " -r/--repeat N: how many times to repeat the timer (default 3)\n", " -s/--setup S: statement to be executed once initially (default 'pass').\n", " Execution time of this setup statement is NOT timed.\n", " -p/--process: use time.process_time() (default is time.perf_counter())\n", " -t/--time: use time.time() (deprecated)\n", " -c/--clock: use time.clock() (deprecated)\n", " -v/--verbose: print raw timing results; repeat for more digits precision\n", " -u/--unit: set the output time unit (usec, msec, or sec)\n", " -h/--help: print this usage message and exit\n", " --: separate options from statement, use when statement starts with -\n", " statement: statement to be timed (default 'pass')\n", " \n", " A multi-line statement may be given by specifying each line as a\n", " separate argument; indented lines are possible by enclosing an\n", " argument in quotes and using leading spaces. Multiple -s options are\n", " treated similarly.\n", " \n", " If -n is not given, a suitable number of loops is calculated by trying\n", " successive powers of 10 until the total time is at least 0.2 seconds.\n", " \n", " Note: there is a certain baseline overhead associated with executing a\n", " pass statement. It differs between versions. The code here doesn't try\n", " to hide it, but you should be aware of it. The baseline overhead can be\n", " measured by invoking the program without arguments.\n", " \n", " Classes:\n", " \n", " Timer\n", " \n", " Functions:\n", " \n", " timeit(string, string) -> float\n", " repeat(string, string) -> list\n", " default_timer() -> float\n", "\n", "CLASSES\n", " builtins.object\n", " Timer\n", " \n", " class Timer(builtins.object)\n", " | Class for timing execution speed of small code snippets.\n", " | \n", " | The constructor takes a statement to be timed, an additional\n", " | statement used for setup, and a timer function. Both statements\n", " | default to 'pass'; the timer function is platform-dependent (see\n", " | module doc string). If 'globals' is specified, the code will be\n", " | executed within that namespace (as opposed to inside timeit's\n", " | namespace).\n", " | \n", " | To measure the execution time of the first statement, use the\n", " | timeit() method. The repeat() method is a convenience to call\n", " | timeit() multiple times and return a list of results.\n", " | \n", " | The statements may contain newlines, as long as they don't contain\n", " | multi-line string literals.\n", " | \n", " | Methods defined here:\n", " | \n", " | __init__(self, stmt='pass', setup='pass', timer=, globals=None)\n", " | Constructor. See class doc string.\n", " | \n", " | autorange(self, callback=None)\n", " | Return the number of loops and time taken so that total time >= 0.2.\n", " | \n", " | Calls the timeit method with *number* set to successive powers of\n", " | ten (10, 100, 1000, ...) up to a maximum of one billion, until\n", " | the time taken is at least 0.2 second, or the maximum is reached.\n", " | Returns ``(number, time_taken)``.\n", " | \n", " | If *callback* is given and is not None, it will be called after\n", " | each trial with two arguments: ``callback(number, time_taken)``.\n", " | \n", " | print_exc(self, file=None)\n", " | Helper to print a traceback from the timed code.\n", " | \n", " | Typical use:\n", " | \n", " | t = Timer(...) # outside the try/except\n", " | try:\n", " | t.timeit(...) # or t.repeat(...)\n", " | except:\n", " | t.print_exc()\n", " | \n", " | The advantage over the standard traceback is that source lines\n", " | in the compiled template will be displayed.\n", " | \n", " | The optional file argument directs where the traceback is\n", " | sent; it defaults to sys.stderr.\n", " | \n", " | repeat(self, repeat=3, number=1000000)\n", " | Call timeit() a few times.\n", " | \n", " | This is a convenience function that calls the timeit()\n", " | repeatedly, returning a list of results. The first argument\n", " | specifies how many times to call timeit(), defaulting to 3;\n", " | the second argument specifies the timer argument, defaulting\n", " | to one million.\n", " | \n", " | Note: it's tempting to calculate mean and standard deviation\n", " | from the result vector and report these. However, this is not\n", " | very useful. In a typical case, the lowest value gives a\n", " | lower bound for how fast your machine can run the given code\n", " | snippet; higher values in the result vector are typically not\n", " | caused by variability in Python's speed, but by other\n", " | processes interfering with your timing accuracy. So the min()\n", " | of the result is probably the only number you should be\n", " | interested in. After that, you should look at the entire\n", " | vector and apply common sense rather than statistics.\n", " | \n", " | timeit(self, number=1000000)\n", " | Time 'number' executions of the main statement.\n", " | \n", " | To be precise, this executes the setup statement once, and\n", " | then returns the time it takes to execute the main statement\n", " | a number of times, as a float measured in seconds. The\n", " | argument is the number of times through the loop, defaulting\n", " | to one million. The main statement, the setup statement and\n", " | the timer function to be used are passed to the constructor.\n", " | \n", " | ----------------------------------------------------------------------\n", " | Data descriptors defined here:\n", " | \n", " | __dict__\n", " | dictionary for instance variables (if defined)\n", " | \n", " | __weakref__\n", " | list of weak references to the object (if defined)\n", "\n", "FUNCTIONS\n", " default_timer = perf_counter(...)\n", " perf_counter() -> float\n", " \n", " Performance counter for benchmarking.\n", " \n", " repeat(stmt='pass', setup='pass', timer=, repeat=3, number=1000000, globals=None)\n", " Convenience function to create Timer object and call repeat method.\n", " \n", " timeit(stmt='pass', setup='pass', timer=, number=1000000, globals=None)\n", " Convenience function to create Timer object and call timeit method.\n", "\n", "DATA\n", " __all__ = ['Timer', 'timeit', 'repeat', 'default_timer']\n", "\n", "FILE\n", " /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/timeit.py\n", "\n", "\n" ] } ], "source": [ "import timeit\n", "help(timeit)" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.7455664590088418\n" ] } ], "source": [ "#print(globals())\n", "import timeit\n", "print(timeit.timeit('fib(30)', number=1, globals=globals()))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## memoized recursive fib function" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "fib at 60th position = 2504730781961\n", "fib function count = 119\n" ] } ], "source": [ "memo = {}\n", "count = 0\n", "def MemoizedFib(n):\n", " global memo, count\n", " count += 1\n", " if n <= 1:\n", " return 1\n", " if n in memo:\n", " return memo[n]\n", " memo[n] = MemoizedFib(n-1) + MemoizedFib(n-2)\n", " return memo[n]\n", "\n", "n=60 #try 40, 50, 60\n", "print(\"fib at {}th position = {}\".format(n, MemoizedFib(n)))\n", "print(\"fib function count = {}\".format(count)) " ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.002014205005252734\n" ] } ], "source": [ "import timeit\n", "print(timeit.timeit('MemoizedFib(1000)', number=1, globals=globals()))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## computational complexity of memoized fib\n", "- Time Complexity - O(n)\n", "- Space Complexity - O(n)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## exercise\n", "- Write a program that finds factorials of a bunch of positive integer numbers. Would memoization improve time complexity of the program? " ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]\n" ] } ], "source": [ "# find the factorials of the first 20 positive numbers\n", "# optimize the program so it finds the factorials in the \n", "# shortest time possible\n", "nums = list(range(1, 21))\n", "print(nums)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "# Python code optimization\n", "- https://wiki.python.org/moin/PythonSpeed/PerformanceTips" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Summary\n", "\n", "- use functions and local variables instead of globals, eventhough function calls are relatively high\n", "- avoid dot . member access specifier\n", "- user built-in list.sort and sorted with key using itemgetter function if required\n", "- avoid string concatenation; use \"\".join(somelist) instead\n", "- use list comprehension and map() instead of loops\n", "- while working with dict (esp. initializing), use try catch instead of key test\n", "- use defaultdict class from collections\n", "- doing stuff less often - change sys.setswitchinterval(sys.maxsize) to as high as possible if not running threads and catching signals (default is 100?)\n", "- use addition and subtraction instead of multiplicaiton and divisions" ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.005" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import sys\n", "sys.getswitchinterval()" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "9223372036854775807" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sys.maxsize" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [], "source": [ "import timeit" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.046036384999752045" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "code = '''\n", "x = 47\n", "x = x * 2\n", "'''\n", "timeit.timeit(stmt=code)" ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.07047916000010446" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "code = '''\n", "x = 47\n", "x = x << 1\n", "'''\n", "timeit.timeit(stmt=code)" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.04736856800445821" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "code = '''\n", "x = 47\n", "x = x + x\n", "'''\n", "timeit.timeit(stmt=code)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Time complexity\n", "- Time complexity of various Python built-in data sturctures and functions \n", "- https://wiki.python.org/moin/TimeComplexity" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.2" } }, "nbformat": 4, "nbformat_minor": 2 }